NetBSD/othersrc kxs2Syc external/bsd/agcre/dist internal.h

   Just what this world needs - another regexp library. However, for
   something I was doing, I needed a regexp library in C, BSD-licensed,
   and able to be exposed to a wide range of expressions, some better
   controlled than others.

   The resulting library is libagcre, which implements regular expression
   compilation and execution. It uses the Pike Virtual Machine approach,
   and features:

   + standard POSIX features where sane
   + some/most Perl escapes
   + lazy matching via '?'
   + non-capturing parentheses (?:...)
   + in-expression case-insensitivity directives are supported (?i)...(?-i)
   + all case-insensitivity is actioned at expression exec time.
   Case-insensitivity can be specified at expression compile time
   and, if so, it will be remembered.  But the expression itself, once
   compiled, can be used to match in both a case-sensitive and a
   case-insensitive manner
   + utf8 is supported both for expressions and for input text when
   matching
   + unicode escapes (in the Java format of \uABCD) are supported
   + exact multiple repetition specifiers {N} and {N,M} are supported
   + backreferences are supported
   + utf16 (LE and BE) and utf32 (LE and BE) are supported, both for the
   expression and for the input being searched
   + at the most basic level, individual 32bit unicode characters are
   matched
   + an egrep/grep implementation for matching unicode regexps
   is included
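
   To illustrate what matching "individual 32bit unicode characters"
   implies for UTF-8 input, here is a minimal sketch of a decoder that
   turns one UTF-8 sequence into a 32-bit codepoint before any
   comparison is made. The name utf8_decode is hypothetical and is not
   taken from libagcre; the library's own decoding may well differ.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Decode one UTF-8 sequence from s (at most n bytes) into *cp.
 * Returns the number of bytes consumed, or 0 on malformed input.
 * Hypothetical helper for illustration; not part of libagcre.
 */
static size_t
utf8_decode(const unsigned char *s, size_t n, uint32_t *cp)
{
	/* smallest codepoint that legitimately needs 1..4 bytes */
	static const uint32_t min_for_len[] = { 0, 0, 0x80, 0x800, 0x10000 };
	size_t len, i;
	uint32_t c;

	if (n == 0)
		return 0;
	if (s[0] < 0x80) {
		len = 1; c = s[0];
	} else if ((s[0] & 0xe0) == 0xc0) {
		len = 2; c = s[0] & 0x1f;
	} else if ((s[0] & 0xf0) == 0xe0) {
		len = 3; c = s[0] & 0x0f;
	} else if ((s[0] & 0xf8) == 0xf0) {
		len = 4; c = s[0] & 0x07;
	} else
		return 0;	/* stray continuation or invalid lead byte */
	if (len > n)
		return 0;	/* truncated sequence */
	for (i = 1; i < len; i++) {
		if ((s[i] & 0xc0) != 0x80)
			return 0;	/* not a continuation byte */
		c = (c << 6) | (s[i] & 0x3f);
	}
	/* reject overlong forms, surrogates and values past U+10FFFF */
	if (c < min_for_len[len] || (c >= 0xd800 && c <= 0xdfff) ||
	    c > 0x10ffff)
		return 0;
	*cp = c;
	return len;
}
```

   A matcher built this way only ever compares uint32_t values, so the
   same matching loop could serve UTF-16 and UTF-32 input by swapping in
   a different decoder.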

   A simple implementation of sets is used to provide inclusion and
   exclusion information for unicode characters, which is taken directly
   from unicode.org. No bitmasks are used - ranges are specified by
   using an upper and a lower bound for the codepoints. Callbacks can
   also be added to these sets, to provide functionality similar to
   the ctype macros across the whole unicode character set.
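
   The range-based sets described above could be sketched roughly as
   follows. All type and function names here (cprange_t, cpset_t,
   cpset_contains, is_greek) are hypothetical and chosen for
   illustration; they are not libagcre's internal definitions, and the
   binary search is an assumption about how lookups might be done.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical types for illustration; not libagcre's internal layout. */
typedef struct cprange {
	uint32_t lo;	/* lower bound, inclusive */
	uint32_t hi;	/* upper bound, inclusive */
} cprange_t;

typedef int (*cpclass_fn)(uint32_t);	/* ctype-style predicate */

typedef struct cpset {
	const cprange_t *ranges;	/* sorted by lo, non-overlapping */
	size_t nranges;
	cpclass_fn callback;		/* optional, may be NULL */
	int negate;			/* non-zero for an exclusion set */
} cpset_t;

/* Return non-zero if codepoint cp is a member of set. */
static int
cpset_contains(const cpset_t *set, uint32_t cp)
{
	size_t lo = 0, hi = set->nranges;
	int found = 0;

	/* binary search over the sorted ranges; no bitmask is needed */
	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (cp < set->ranges[mid].lo)
			hi = mid;
		else if (cp > set->ranges[mid].hi)
			lo = mid + 1;
		else {
			found = 1;
			break;
		}
	}
	/* fall back to the callback, if one was attached to the set */
	if (!found && set->callback != NULL)
		found = set->callback(cp) != 0;
	return set->negate ? !found : found;
}

/* Example callback: the Greek and Coptic block, U+0370..U+03FF. */
static int
is_greek(uint32_t cp)
{
	return cp >= 0x370 && cp <= 0x3ff;
}
```

   Storing bounds rather than bitmasks keeps a set small even when it
   spans the whole unicode range, at the cost of a search per lookup.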

   The standard regular expression basic3 torture test is passed with
   4 known (and, I'd argue, incorrect) results flagged.  As expected,
   the expression '(a?){9999}aaaaaaaaaaaaaaaaaaaaaaaaaaaaa' matches
   in linear time, as does the expression
   '((((((((((((((((((((((((((((((x))))))))))))))))))))))))))))))'

        % time agcre '(a?){9999}aaaaaaaaaaaaaaaaaaaaaaaaaaaaa' dist/tests/2.in
        aaaaaaaaaaaaaaaaaaaaaaaaaaaaa
        0.063u 0.000s 0:00.06 100.0%    0+0k 0+0io 0pf+0w
        % time egrep '(a?){9999}aaaaaaaaaaaaaaaaaaaaaaaaaaaaa' dist/tests/2.in
        ^C88.462u 0.730s 1:29.21 99.9%  0+0k 0+0io 0pf+0w
        %
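
   The reason a Pike VM avoids the blow-up seen with egrep above can be
   shown with a much-reduced sketch. This is not libagcre's code: the
   instruction set, the names (inst_t, threadlist_t, addthread,
   pikevm_match) and the MAXPROG limit are all assumptions made for
   illustration, and captures, backreferences and unicode are left out.
   The point is that each program counter is added at most once per
   input character, so the work per character is bounded by the program
   length.

```c
#include <stddef.h>

enum { OP_CHAR, OP_SPLIT, OP_JMP, OP_MATCH };

typedef struct inst {
	int op;
	int c;		/* OP_CHAR: character to match */
	int x, y;	/* OP_JMP: target x; OP_SPLIT: targets x and y */
} inst_t;

#define MAXPROG 64	/* arbitrary limit for this sketch */

typedef struct threadlist {
	int pc[MAXPROG];
	int n;
} threadlist_t;

/* Add pc to the list, following JMP and SPLIT, at most once per step. */
static void
addthread(const inst_t *prog, threadlist_t *l, int *mark, int gen, int pc)
{
	if (mark[pc] == gen)
		return;		/* already queued for this input position */
	mark[pc] = gen;
	switch (prog[pc].op) {
	case OP_JMP:
		addthread(prog, l, mark, gen, prog[pc].x);
		break;
	case OP_SPLIT:
		addthread(prog, l, mark, gen, prog[pc].x);
		addthread(prog, l, mark, gen, prog[pc].y);
		break;
	default:	/* OP_CHAR and OP_MATCH wait for the next step */
		l->pc[l->n++] = pc;
		break;
	}
}

/* Return 1 if prog (at most MAXPROG instructions) matches all of s. */
static int
pikevm_match(const inst_t *prog, int proglen, const char *s)
{
	threadlist_t a, b, *clist = &a, *nlist = &b, *tmp;
	int mark[MAXPROG];
	int gen = 1, i;

	if (proglen > MAXPROG)
		return 0;
	for (i = 0; i < proglen; i++)
		mark[i] = 0;
	clist->n = 0;
	addthread(prog, clist, mark, gen, 0);
	for (;; s++) {
		gen++;
		nlist->n = 0;
		/* advance every live thread in lock step over one character */
		for (i = 0; i < clist->n; i++) {
			const inst_t *in = &prog[clist->pc[i]];

			if (in->op == OP_MATCH) {
				if (*s == '\0')
					return 1;
			} else if (*s != '\0' && in->c == (unsigned char)*s) {
				addthread(prog, nlist, mark, gen,
				    clist->pc[i] + 1);
			}
		}
		if (*s == '\0' || nlist->n == 0)
			return 0;
		tmp = clist; clist = nlist; nlist = tmp;
	}
}
```

   As a usage example, the expression 'a?a?a?aaa' (a small cousin of
   the one timed above) can be hand-assembled as three
   "SPLIT next, skip; CHAR a" pairs followed by three "CHAR a"
   instructions and a MATCH, then passed to pikevm_match() together
   with the input string.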

   The library and agcre utility have been run through valgrind to
   confirm no memory leaks.

   In general, the emphasis is on a modern, predictable, VM-style,
   well-featured regexp library, in C, with a BSD license. In
   particular, sljit has not been used to speed up matching on certain
   platforms; most Perl regexp features are supported, as are
   backreferences, and UTF-8, UTF-16 and UTF-32.

   Once again, I wouldn't expect anyone to use this as the main engine
   in egrep. But I am always amazed at the uses for some of the things
   that I write.

   For more information about the Pike VM, and comparison to other
   regexp implementations, please see:

        https://swtch.com/~rsc/regexp/regexp2.html

   Alistair Crooks
   Tue Aug 15 07:43:34 PDT 2017
Version  Delta   File
1.1      +165-0  external/bsd/agcre/dist/internal.h
         +165-0  1 files
