regcomp() slowness

Post by Giuliano Pochin » Fri, 24 Nov 2000 04:00:00



I have a performance problem. I have to perform a lot of pattern
matching operations (a few strings checked against many different
patterns) using regcomp()/regexec().
The problem is that regcomp() is damn slow, and since I call it
from inside a cgi-bin program I have to recompile the patterns on
every request.
Does anyone know a way to work around the problem?

Thanks.


Post by Mark Ra » Fri, 24 Nov 2000 04:00:00



Quote:
>I have a performance problem. I have to perform a lot of pattern
>matching operations (a few strings to check against many different
>patterns) using regcomp()/regexec().
>The problem is that regcomp() is damn slow and since I use it
>from inside a cgi-bin I have to re-precompile the patterns each
>time.

Two things to look at:
1) Avoid plain CGI.  Look into FastCGI or some other persistent process that
can handle many requests without starting up each time, so that the patterns
are compiled once rather than on every request.

2) Examine your regexps and see whether different ways of writing the
equivalent match have different performance characteristics.  Post them here
for feedback.
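
A minimal sketch of the first suggestion, assuming a long-running process (for example a FastCGI worker) that calls a setup function once at startup and then serves many requests. The pattern strings and the helper names compile_all(), first_match() and free_all() are hypothetical, since the original post does not show its patterns; only regcomp(), regexec(), regerror() and regfree() are the POSIX calls.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical pattern set, for illustration only. */
static const char *const patterns[] = {
    "^foo[0-9]+$",
    "bar|baz",
    "^[a-z]+@[a-z]+\\.com$"
};
#define NPATTERNS (sizeof patterns / sizeof patterns[0])

static regex_t compiled[NPATTERNS];

/* Compile every pattern once. Returns 0 on success, -1 on failure. */
static int compile_all(void)
{
    for (size_t i = 0; i < NPATTERNS; i++) {
        /* REG_NOSUB: we only need yes/no answers, not match offsets. */
        int rc = regcomp(&compiled[i], patterns[i], REG_EXTENDED | REG_NOSUB);
        if (rc != 0) {
            char msg[128];
            regerror(rc, &compiled[i], msg, sizeof msg);
            fprintf(stderr, "regcomp(%s): %s\n", patterns[i], msg);
            while (i > 0)               /* release what was already compiled */
                regfree(&compiled[--i]);
            return -1;
        }
    }
    return 0;
}

/* Return the index of the first pattern matching s, or -1 if none match. */
static int first_match(const char *s)
{
    assert(s != NULL);
    for (size_t i = 0; i < NPATTERNS; i++)
        if (regexec(&compiled[i], s, 0, NULL, 0) == 0)
            return (int)i;
    return -1;
}

/* Release all compiled patterns, e.g. at process shutdown. */
static void free_all(void)
{
    for (size_t i = 0; i < NPATTERNS; i++)
        regfree(&compiled[i]);
}
```

The idea is to call compile_all() once when the process starts, call first_match() for each incoming string, and call free_all() on shutdown. In a plain CGI program this does not help, because the process exits after every request and the compiled patterns are lost with it.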

1. regcomp(3C) man page errors lasted for three versions?

I am doing some C programming in which I need to use the regexp stuff.  So,
I was consulting man pages, and noticed the following:

Here is a brief section cut and pasted from a SPARC running Solaris 2.5.1:

          regoff_t rm_so      Byte offset from start of string to
                              start of substring.
          regoff_t rm_eo      Byte offset from start of string of
                                                               ^^^ <--- should
                                                                        be "to"
                              the  first  character after the end
                              of substring.

Having been trained to suspect myself first, and not being a native English
speaker, I looked up the same page (man -s 3c regcomp) on a Solaris 2.6
Ultra and a Solaris 2.7 SPARC, and finally:


All the same.  I don't believe this.  After three OS releases the page still
reads the same way.  This can't be correct.  It may sound nit-picky,
but if it's incorrect, it's incorrect.

Does anyone care to confirm my suspicion?

If I (a non-native English speaker) were to write the above, it would be:

          regoff_t rm_so      Byte offset from "the" start of "the parent"
                              string to "the" start of "the" substring.
          regoff_t rm_eo      Byte offset from "the" start of "the parent"
                              string "to" the first  character after the end
                              of "the" substring.
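
The meaning of the two fields can be checked with a short program. This is a minimal sketch; find_span() is a hypothetical helper written for this illustration, while regmatch_t, rm_so and rm_eo come from <regex.h>.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/*
 * Find the first match of pattern in text and store its byte offsets.
 * Returns 0 on a match, REG_NOMATCH if there is none, or the regcomp()
 * error code if the pattern does not compile.
 */
static int find_span(const char *pattern, const char *text,
                     regoff_t *so, regoff_t *eo)
{
    assert(pattern != NULL && text != NULL && so != NULL && eo != NULL);

    regex_t re;
    regmatch_t m[1];            /* m[0] describes the whole match */

    int rc = regcomp(&re, pattern, REG_EXTENDED);
    if (rc != 0)
        return rc;

    rc = regexec(&re, text, 1, m, 0);
    if (rc == 0) {
        *so = m[0].rm_so;       /* offset to the start of the substring */
        *eo = m[0].rm_eo;       /* offset to the first character after it */
    }
    regfree(&re);
    return rc;
}
```

Matching "b+" against "aabbbcc" gives rm_so == 2 and rm_eo == 5: text[5] is the 'c' that follows the matched "bbb", and rm_eo - rm_so is the length of the match. That is consistent with reading the man page sentence as "to the first character after the end".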

Poor grammar and writing style don't inspire confidence.

Thanks,

Chin Fang

2. FTDi USB Serial - dropped LF characters?

3. regexec() and regcomp()

4. Floating point exception with exp()

5. Availability of regcomp?

6. Sound and Video through Xmosaic?

7. Solaris vs. POSIX regcomp()/regexec()

8. KDE mouse problem

9. regexec() and regcomp()

10. getting regcomp() to work

11. fnmatch and regcomp on Solaris 2.4

12. nested subexpressions and regcomp

13. regcomp(), regexec(), regfree(), and regerror() in Solaris 2.4