Wed Jan 15 12:00:38 1997  Tom Lord

	* rxsuper.c (rx_superset_cons): reference count tweak.

	* rxnode.c (rx_rexp_equal): fixed test for equality of
	interval expressions.

	* rxgnucomp.h (enum RE_SYNTAX_BITS): turned the syntax bits
	from "#define" into "enum" to ease debugging.

Mon Jan 13 10:07:39 1997  Tom Lord

	* rxsuper.c (rx_superset_cons): While hash_store will protect
	cdr itself, it might first allocate hash tables and stuff, which
	might cause cdr to be garbage collected before it's protected.
	(from Greg Stark)

	* rxgnucomp.c (isa_blank): Test for ==, not != '\t'.
	(from Andreas Schwab)

Tue Dec 3 00:33:27 1996  Tom Lord

	* rxposix.c (regnexec): When testing to consider freeing REGS,
	watch out for PMATCH == NULL.

	* rxspencer.c (rx_next_solution): In case r_parens: Before
	trying to match a parenthesized subexpression, restore the
	corresponding regs to their value prior to attempting the
	match.  If the match finally fails, be sure to restore the old
	values then, too.

Mon Dec 2 00:52:06 1996  Tom Lord

	* rxspencer.c (rx_next_solution): After
	"star_try_next_left_match:"...  Only return yes from a star
	expression whose subexpression fails if the target string has
	0 length.

	* rxposix.c (regnexec, regncomp): reversed the order of the
	string and string-length arguments to be more like other
	functions (e.g. strncmp).  (Suggested by Mike Haertel)

	* inst-rxposix.h, rxgnucomp.h (REG_E*): moved declarations of
	POSIX error codes into the posix header file.

	* rxgnucomp.c (rx_parse): Don't permit a backreference to an
	enclosing subexpression.  This change restores some code that
	was bogusly deleted somewhere along the line.  This fixes a bug
	that causes a pattern such as:

		((.*)\1)x

	to core dump.  (Reported by Mike Haertel)

Sun Nov 24 04:24:13 1996  Tom Lord

	* rxposix.c (rx_regexec): Added a new optimization that
	generalizes the fastmap.  The new optimization is applied if
	the length of the string exceeds RX_MANY_CASES.

Fri Nov 8 09:07:14 1996  Tom Lord

	* rxsuper.h (RX_DEFAULT_DFA_CACHE_SIZE):
	* rxbasic.h (RX_DEFAULT_NFA_DELAY): New macros so these values
	can be set at compile time.

Tue Nov 5 09:37:03 1996  Tom Lord

	* rxspencer.c (rx_make_solutions): watch out for
	solns->exp == NULL.  Eric Johnson (johnsone@uiuc.edu) detected
	this bug and also performed useful testing of Rx memory
	management.

Tue Jun 18 11:44:46 1996  Tom Lord

	* rxanal.c (rx_start_superstate): Don't release an old
	superstate unless it is known that the new superstate has been
	successfully constructed.

Thu Jun 13 11:18:25 1996  Tom Lord

	* rxspencer.c etc. (rx_next_solution et al.): remove all traces
	of rx_maybe.

Wed May 22 12:28:22 1996  Tom Lord

	* rxanal.c (rx_start_superstate): Preserve the invariant that a
	locked superstate is never semifree.

Fri May 17 10:21:26 1996  Tom Lord

	* rgx.c (scm_regexec): added match data support for "#\c" --
	the final_tag of the match (for the cut operator).

	* rxspencer.c (rx_next_solution): propagate is_final data up
	through the tree of solution streams.

	* rxnfa.h (struct rx_nfa_state): unsigned int is_final:1 =>
	int is_final for the cut operator.

	* rxanal.c (rx_match_here_p):
	* rxanal.c (rx_fit_p):
	* rxanal.c (rx_longest): When a final state is detected,
	propagate the value of the is_final flag back to the caller.
	It may contain data generated by a "cut" operator.

	* rxsuper.c (superset_allocator): when marking a superset
	final, mark it with the maximum of the is_final fields of the
	constituent nfa states (for the "cut" operator which allows
	users to set that value).

	* rxgnucomp.c (rx_parse): Replace "[[:set...:]]" with
	"[[:cut n:]]".  cut is regular but set is not, so cut leads to
	much faster running patterns.

	* rxnfa.c (rx_build_nfa): compile r_cut nodes.  r_cut nodes
	match the empty string and nothing more.  A parameter to the
	cut node determines whether the empty match leads to a final
	state, or to a failure.

	* rx.c (rx_free_rx):
	* rxsuper.c (release_superset_low):
	* rxanal.c (rx_start_superstate): fixed the test for a cached
	starting superset to reflect the simplified memory management
	of `struct rx' (they are now explicitly freed using rx_free_rx)
	and `struct rx_superset' (they are now ultimately freed using
	free and not kept on a free-list).  Now the `start_set' field
	of a `struct rx' is only non-0 if it is valid.

Tue May 14 08:56:22 1996  Tom Lord

	* rxspencer.h (typedef rx_contextfn): take an entire expression
	tree instead of just a context type since for some context
	types, parameters in the tree matter ([[:set...:]]).

	* rxstr.c (rx_str_contextfn): handle [[:set...:]] operator.

	* rxgnucomp.c (rx_parse): added the [[:set n = x:]] construct
	to make it easier to lex using regexps.

	* rxposix.c (regnexec): "This pattern (with 10 subexpressions
	and 9 backreferences) made no entries in a match array of size
	5."  (from doug@plan9.att.com)

	* rxgnucomp.c (rx_parse): new compilation state variable:

		last_non_regular_expression

	When compiling, keep track of two, not one point in the tree
	for concatenating new nodes.  The *last_non_regular_expression
	point is always the same as the *last_expression or is a parent
	of that node.  Concatenations of regular constructs happen at
	last_expression, others at last_non_regular_expression.  The
	resulting trees have "observable" constructs clustered near the
	root of the tree which allows those optimizations that apply
	only to regular subtrees to have a greater impact on overall
	performance.

	* rxspencer.c (rx_next_solution): interval satisfaction test
	was wrong.

	* rxanal.c (rx_posix_analyze_rexp): An interval is always
	observed (not truly a regular expression).

	* rxstr.c (rx_str_contextfn): "when you're doing back-reference
	matching case insensitively (with REG_ICASE set), you are
	supposed to also do the BR matching without paying attention to
	case."

Mon May 13 09:59:48 1996  Tom Lord

	* rxspencer.c (rx_next_solution): Don't construct an NFA when
	comparing an r_string to some text -- just do a strcmp-like
	operation.

	* rxgnucomp.c (rx_parse): new variable:

		n_members

	An array keeping track of the size of csets generated by
	inverting the translation table.

	(rx_parse): validate_inv_tr and n_members were way too big --
	each only needs CHAR_SET_SIZE elements.

Mon May 13 09:29:42 1996  Zachary Weinberg

	* rxnode.c (rx_init_string): New data structure for strings --
	part of the overall support for constant string optimization.

	* rxnode.c (rx_mk_r_str etc.): a new type of rexp-node -- an
	abbreviation for a concatenation of characters.

	* rxdbug.c (print_rexp): Added support for printing r_str
	nodes.

	* rxgnucomp.c (rx_parse): initial support for constant strings.

Wed Jan 31 19:59:46 1996  Preston L. Bannister

	Changes to compile clean under MSVC 4.0 (w/o warnings).
	Added makefile for MSVC 4.0 (librx.mak).

	[! Changes marked *** were made differently from the submitted
	patches -- the descriptions may not apply exactly.]

	    hashrexp.c:  Added __STDC__ variant of function definition.
	*** rxall.h:     Pull in standard C header files.
	***              Map bzero() to memset().
	    rxanal.c:    Remove unused variable.
	    rxdbug.c:    Added stdio include.
	    rxhash.c:    Remove unused variable.
	    rxnfa.c:     Remove {re,m}alloc definition.
	    rxposix.c:   Remove unused variable.
	***              Cast parameter nmatch declared as size_t to
	                 int on use.
	***              Perhaps nmatch should be passed as int?
	                 [made related variables size_t]
	    rxspencer.c: Add rxsimp.h include.
	                 Remove unused variables and labels.
	    rxunfa.c:    Remove unused variable.

Tue Jan 30 10:29:16 1996  Tom Lord

	* rxsimp.c (rx_simple_rexp): move assignment out of if.
	("Preston L. Bannister")

	* Makefile.in (CFLAGS, ALL_CFLAGS): rearranged to allow user
	specified CFLAGS.

	* rxposix.h: comment stuff after #endif.
	(reported by Eric Backus)

Mon Jan 1 13:03:28 1996  Jason Molenda (crash@phydeaux.cygnus.com)

	* rxbasic.c (rx_basic_make_solutions): argument called 'rexp'
	is now called 'expression'.  Argument 'str' should be unsigned
	char.

	* rxbasic.h (rx_basic_make_solutions): argument 'str' should be
	unsigned char.

	* rxsuper.h (rx_handle_cache_miss,
	rx_superstate_eclosure_union): syntax error in prototypes.
	[Actually fixed in rxsuper.c, from which that section of
	rxsuper.h is derived.]

	* rxnode.c (rx_mk_r_cset): fix function decl.

Tue Jan 30 09:43:28 1996  Tom Lord

	* rxposix.c (regnexec): pass rx_regexec "regs", not "pmatch".
	"regs" is valid even if "pmatch" is NULL.
	(Fixes testsuite bug "pragma" reported by John.Szetela@amd.com
	(John J. Szetela); also fixes bug reported by Jongki Suwandi)

Fri Jan 26 14:23:20 1996  Tom Lord

	* rxdbug.c (AT): Use the GCC feature only if
	HAVE_POSITIONAL_ARRAY_INITS is defined.

	* Makefile.in: Fixed depends target to not include system
	header files.  Use @exec_prefix@.  (Derek Clegg)

Thu Jan 4 16:13:07 1996  Tom Lord

	* rxposix.c (rx_regexec): Don't bother checking to see if an
	anchored pattern matches other than at the beginning of a
	string.

	(rx_regmatch): Don't bother looking for matches that are the
	wrong length if the overall length of the expression is known.
	This duplicates an optimization already in rx_make_solutions
	and rx_basic_make_solutions, but it's worth it.  The
	make_solutions optimization applies to fixed length
	subexpressions of a variable length expression.  The regmatch
	optimization can avoid (in sed, for example) many, many
	unneeded calls to make_solutions and rx_next_solution.

	* rxspencer.c (rx_make_solutions, rx_basic_make_solutions): If
	the expression is fixed length and that length doesn't match
	the buffer, don't bother constructing a new solution stream --
	just return the canonical "no solution" stream.

Sat Dec 30 21:19:31 1995  Tom Lord

	* *.[ch]: posixification and algorithmic improvement (thanks
	henry!).