shell regex matching question


Post by Ron! » Sat, 06 Nov 2004 10:33:02



i have a quick question. it's more shell related than os specific; however,
i administer nothing but Sun boxes.
so...

i have this:

print "* ([Cc][Aa][Nn]-?[Ss][Pp][Aa][Mm]|[Rr]olex|[Pp][Ee][Nn][Ii1l][Ss]|[Vv]?[Aa][Gg][Rr][Aa]|"Save.[0-9]0%.on")" >> $SR

appends a line to a file (procmail recipe snippet), simple enough...

i have literally hundreds of these lines in different files. it's part of my
home-brewed spam filter.

my problem is this: occasionally, a regular expression catches a valid email
and marks it as spam. is it possible for me to determine which specific regular
expression is matching within this ->  "(this|that|theother)" ? i can run
ksh -x, or run the filters and visually look for what is matching; however,
this is an admin-intensive task that i'd rather let the shell (and
alt.solaris) figure out.

am i making sense?

of course the goal is to write a better spam filter...

Ron

 
 
 


Post by all mail refus » Sat, 06 Nov 2004 18:43:50



>print "* ([Cc][Aa][Nn]-?[Ss][Pp][Aa][Mm]|[Rr]olex|[Pp][Ee][Nn][Ii1l][Ss]|
>[Vv]?[Aa][Gg][Rr][Aa]|"Save.[0-9]0%.on")" >> $SR

>appends a line to a file (procmail recipe snippet), simple enough...
>and marks it as spam. is it possible for me determine which specific regular
>expression is matching within this ->  "(this|that|theother)" ? i can
>ksh -x, or run the filters and visually look for what is matching, however,

If you are writing recipes for procmail, remember they are
case-insensitive (by default).

I'd tend to make more procmail recipes, with one rule each, and avoid
the alternation you are using. Then you can also cut out the [Cc] stuff.
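
One way to follow this advice, sketched in shell: emit a separate recipe per pattern, so each rule can be identified and disabled on its own. The function name emit_recipes and the destination mailbox "spam" are hypothetical. The patterns are lowercase versions of the ones quoted above, relying on procmail's default case-insensitive matching.

```shell
#!/bin/sh
# Emit one procmail recipe per pattern instead of one big alternation.
# "spam" is a hypothetical destination mailbox; adjust to taste.
emit_recipes() {
    for pat in "$@"; do
        # ":0:" starts a recipe (with a local lockfile), "* ..." is the
        # condition, and the last line is the delivery target.
        printf ':0:\n* %s\nspam\n\n' "$pat"
    done
}

emit_recipes 'can-?spam' 'rolex' 'save.[0-9]0%.on'
```

The output goes to stdout here; appending it to the recipe file would be done with something like `emit_recipes ... >> "$SR"`, as in the original post.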

Perl has a $+ variable with capability similar to what you ask for.
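
If staying in the shell is preferred, a rough equivalent is to split the alternation on "|" and try each piece separately with grep -E. This is only a sketch: the function name which_alt is made up, it assumes the alternatives contain no nested "|", and grep -E syntax is close to, but not identical with, procmail's own regex dialect. The -i flag mimics procmail's default case-insensitivity.

```shell
#!/bin/sh
# which_alt ALTERNATION TEXT
# Print every alternative of "a|b|c" (outer parentheses removed) that
# matches TEXT. Assumes no alternative contains a nested "|".
which_alt() {
    alternation=$1
    text=$2
    old_ifs=$IFS
    IFS='|'
    set -f                  # keep [..] from being expanded as a file glob
    set -- $alternation     # split into one positional parameter per alternative
    set +f
    IFS=$old_ifs
    for alt in "$@"; do
        if printf '%s\n' "$text" | grep -E -i -q -- "$alt"; then
            printf '%s\n' "$alt"
        fi
    done
}

# Prints the two alternatives that match: [Rr]olex and Save.[0-9]0%.on
which_alt '[Cc][Aa][Nn]-?[Ss][Pp][Aa][Mm]|[Rr]olex|Save.[0-9]0%.on' \
    'Subject: Save 50% on a Rolex'
```

Running this over the headers of a wrongly flagged message shows which alternative is responsible, without reading ksh -x traces by eye.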

--
Elvis Notargiacomo  master AT barefaced DOT cheek
http://www.notatla.org.uk/goen/
    7.031: OnACPower returned value( 0x1 ) which is Equal To 0x1

 
 
 

1. Novice regex matching question

Hi everybody,
   From a command line, I'd like to be able to extract certain parts
of a file matching a regular expression.  So, let's call my file
"example" and say it contains the contents:

weblogic.jdbc.connectionPool.eng=\
    url=jdbc:weblogic:oracle,\
    props=user=SCOTT;password=tiger;server=DEMO,\
    initialCapacity=4,\

I would like to write some kind of Solaris command that extracts the
string "eng" immediately following "connectionPool" and "SCOTT"
immediately following "user=".  I am not very experienced with
Solaris/Unix and it seems "grep" only returns the full line where the
regular expression occurs.  Of course here I'd like my regular
expression to encompass more than one line.

Anyway, does anyone have any suggestions?

Much thanks, Dave A.
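
A possible approach, sketched with sed: use a capture group and print only the captured part of each matching line. The function name extract_fields is hypothetical, and the sketch works line by line rather than across lines, which is enough for the sample file shown above.

```shell
#!/bin/sh
# extract_fields: read a properties file on stdin and print
#   - the text between "connectionPool." and the next "="
#   - the text between "user=" and the next ";"
extract_fields() {
    sed -n \
        -e 's/.*connectionPool\.\([^=]*\)=.*/\1/p' \
        -e 's/.*user=\([^;]*\);.*/\1/p'
}

# Demo with the sample contents from the post; prints "eng" then "SCOTT".
extract_fields <<'EOF'
weblogic.jdbc.connectionPool.eng=\
    url=jdbc:weblogic:oracle,\
    props=user=SCOTT;password=tiger;server=DEMO,\
    initialCapacity=4,\
EOF
```

With a real file this would be run as `extract_fields < example`.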

2. Linux, distributions and what I'm allowed to do

3. Deleting multplie lines after matching a regex in the first line

4. PPC405 Slave

5. (patch for Bash) regex(3) splitting/matching

6. HACKERS UNITE!

7. regex matching empty line

8. Linux and RAID

9. Regex for matching repeated (wild) patterns

10. get matched parts of regex pattern

11. how do I do replace and greedy match using regex

12. need help with a regex match pattern

13. need help with a regex match