Pattern matching and extracting the data which matches the pattern

Post by Mark Hounslow » Wed, 27 Oct 1999 04:00:00



Help,

I am running Solaris 2.6 and have only ever written scripts for the Korn
shell. I have the following query.

I have a very long text file in which each line represents a CAD model name.
Somewhere within the model name (i.e. the line of the text file) is the
drawing reference number. This number always matches a particular
pattern of letters, hyphens and digits, and that pattern is not matched
anywhere else in the line.
I need a mechanism to search each line in turn, match the pattern,
extract purely that portion of the line and write it to a separate file.
My problem is that the pattern can appear anywhere within a line and the
number of fields can differ (if it were always in the same place, or the
lines had a constant number of fields, I would use 'cut').
For example, an extract from the input file could be:

LD100 WEIGHT DAMPER              W7A   22 Z  535210043-A  USE30SE99MVS 99/10/26
LD100 GROMMET  GUIDE  ROD          W5D   12 Z  545180058-B  USE30SE99MAK 99/10/26
LD100 GROMMET  GUIDE  ROD          W5D   22 Z  545180058-B  USE30SE99MAK 99/10/26
LD100 BRACKET-CABLE,SELECT       W7A   12 Z  535210034-C  USE06SE99LJL 99/10/26
LD100 BRACKET-CABLE,SELECT       W7A   22 Z  535210034-C  USE06SE99LJL 99/10/26

I need to extract the data matching the following pattern: nine
digits, a hyphen and a letter, and put it in an output file. The output
I would like to see is:

535210043-A
545180058-B
545180058-B
535210034-C
535210034-C

Many thanks in advance for your help

Mark Hounslow
LDV Limited

 
 
 

Post by Barry Margoli » Wed, 27 Oct 1999 04:00:00




>For example, an extract from the input file could be :-

>LD100 WEIGHT DAMPER              W7A   22 Z  535210043-A  USE30SE99MVS 99/10/26
>LD100 GROMMET  GUIDE  ROD          W5D   12 Z  545180058-B  USE30SE99MAK 99/10/26
>LD100 GROMMET  GUIDE  ROD          W5D   22 Z  545180058-B  USE30SE99MAK 99/10/26
>LD100 BRACKET-CABLE,SELECT       W7A   12 Z  535210034-C  USE06SE99LJL 99/10/26
>LD100 BRACKET-CABLE,SELECT       W7A   22 Z  535210034-C  USE06SE99LJL 99/10/26

>I need to extract the data matching the following pattern: nine
>digits, a hyphen and a letter, and put it in an output file. The output
>I would like to see is:

>535210043-A
>545180058-B
>545180058-B
>535210034-C
>535210034-C

sed -n 's/^.*\([0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]-[A-Z]\).*/\1/p'

--

GTE Internetworking, Powered by BBN, Burlington, MA
*** DON'T SEND TECHNICAL QUESTIONS DIRECTLY TO ME, post them to newsgroups.
Please DON'T copy followups to me -- I'll assume it wasn't posted to the group.

 
 
 

Post by Nick Levert » Wed, 27 Oct 1999 04:00:00




>I need to extract the data matching the following pattern: nine
>digits, a hyphen and a letter, and put it in an output file. The output
>I would like to see is:

>535210043-A
>545180058-B
>545180058-B
>535210034-C
>535210034-C

gawk --posix '{ for (i=1; i<=NF; i++) if ($i ~ /^[0-9]{9}-[A-Z]$/) print $i }'

I expect someone will provide a shorter and even more efficient one in
perl ...

N.
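
Some older awk implementations do not accept the {9} interval syntax. Under that assumption, the same extraction can be sketched with match() and substr(), which also finds the reference when it is not a whitespace-separated field of its own. The function name extract_refs_awk is made up for this illustration and does not come from the thread.

```shell
#!/bin/sh
# extract_refs_awk: hypothetical helper name (not from the thread).
# Reads lines on stdin and prints the first run of nine digits,
# a hyphen and one uppercase letter found on each line.
extract_refs_awk() {
    awk '{
        # Nine explicit [0-9] classes avoid the {9} interval syntax.
        if (match($0, /[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]-[A-Z]/))
            print substr($0, RSTART, RLENGTH)
    }'
}

# One record from the original post, on a single line.
printf '%s\n' \
  'LD100 GROMMET  GUIDE  ROD          W5D   12 Z  545180058-B  USE30SE99MAK 99/10/26' |
  extract_refs_awk
```

Lines without a matching reference produce no output, so the result can be redirected straight to the output file.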

 
 
 

Post by Richard Corfiel » Thu, 28 Oct 1999 04:00:00


An answer in perl (which could be squashed to one line if you need to) is:

#!/usr/bin/perl
while (<>)
{
   if(/(^| +)([0-9]{9}\-[A-Z])( +|$)/){
     print "$2\n";
   }
}

This is case-sensitive on the A-Z. You can change the if statement to
   /(^| +)([0-9]{9}\-[A-Z])( +|$)/i
(the extra i on the end) to make the match case-insensitive.
The expression above also works if the thing you're looking for is the first
or last field.

Things can be made even more generic by matching on character class
(\s for whitespace and \d for digit):

#!/usr/bin/perl
while (<>)
{
   if(/(^|\s+)(\d{9}\-[A-Z])(\s+|$)/){
     print "$2\n";
   }
}

 - Richard.

--

  _/  _/    _/    _/      Web Page, CV:   http://www.littondale.freeserve.co.uk
 _/_/      _/    _/       Dance (Ballroom, RnR), Hiking, SJA, Linux, ... [ENfP]
_/  _/  _/_/    _/_/_/    PGP2.6 Key ID: 0x0FB084B1     PGP5 Key ID: 0xFA139DA7

 
 
 

Post by Colin Smi » Thu, 28 Oct 1999 04:00:00





>>For example, an extract from the input file could be :-

>>LD100 WEIGHT DAMPER              W7A   22 Z  535210043-A  USE30SE99MVS 99/10/26
>>LD100 GROMMET  GUIDE  ROD          W5D   12 Z  545180058-B  USE30SE99MAK 99/10/26
>>LD100 GROMMET  GUIDE  ROD          W5D   22 Z  545180058-B  USE30SE99MAK 99/10/26
>>LD100 BRACKET-CABLE,SELECT       W7A   12 Z  535210034-C  USE06SE99LJL 99/10/26
>>LD100 BRACKET-CABLE,SELECT       W7A   22 Z  535210034-C  USE06SE99LJL 99/10/26

>>I need to extract the data matching the following pattern: nine
>>digits, a hyphen and a letter, and put it in an output file. The output
>>I would like to see is:

>>535210043-A
>>545180058-B
>>545180058-B
>>535210034-C
>>535210034-C

>sed -n 's/^.*\([0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]-[A-Z]\).*/\1/p'

Less typing.
sed -n 's/^.*\([0-9]\{9\}-[A-Z]\).*/\1/p' inputfile > outputfile
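
A quick way to check this command is to feed it a couple of the sample records. The sketch below wraps it in a function whose name, extract_refs, is an assumption made for the example; the sample lines are taken from the original post with each record on a single line.

```shell
#!/bin/sh
# extract_refs: hypothetical helper name (not from the thread).
# -n suppresses normal output; the p flag prints only lines where the
# substitution succeeded, so lines without a reference are dropped.
extract_refs() {
    sed -n 's/^.*\([0-9]\{9\}-[A-Z]\).*/\1/p'
}

sample='LD100 WEIGHT DAMPER              W7A   22 Z  535210043-A  USE30SE99MVS 99/10/26
LD100 BRACKET-CABLE,SELECT       W7A   12 Z  535210034-C  USE06SE99LJL 99/10/26'

# Prints 535210043-A and 535210034-C, one per line.
printf '%s\n' "$sample" | extract_refs
```

Because the leading .* is greedy, a line containing more than one candidate would yield the last one; the original post says the pattern occurs only once per line, so that does not matter here.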

--

|Linux: Delivers on the promises Microsoft make. | The Zeppelin of   |
|             http://www.linux.org/              | operating systems.|

 
 
 

1. Pattern Matching Not Working When Pattern Assign To A Variable

I'm having a problem getting pattern matching to work on Solaris when
I assign the pattern to a variable (I'll be reading them from a file
in my program).  It works fine when I compare the same pattern
directly:

#!/usr/bin/ksh

mask='ORA-?(0|00)600'
error=ORA-00600

print "mask=$mask"
print "error=$error"

#This doesn't work
if [[ $error = $mask ]]
then
  print "match - with variable"
else
  print "no match - with variable"
fi

#This does work
if [[ $error = ORA-?(0|00)600 ]]
then
  print "match - without variable"
else
  print "no match - without variable"
fi

The output is as follows:

mask=ORA-?(0|00)600
error=ORA-00600
no match - with variable
match - without variable

This same test works fine on AIX and DEC (both statements match).
Does anyone know how to assign a pattern to a variable in Solaris and
get them to match?
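
One portable workaround, offered only as a sketch, is to sidestep ksh pattern matching altogether and test the string with grep -E. This swaps in a different technique: it assumes the masks read from the file can be written as extended regular expressions rather than ksh patterns, so ORA-?(0|00)600 becomes ORA-(0|00)?600. The helper name matches_mask is hypothetical.

```shell
#!/bin/sh
# matches_mask: hypothetical helper, not from the post.
# $1 = string to test, $2 = extended regular expression (not a ksh pattern).
# Succeeds only if the whole string matches the expression.
matches_mask() {
    printf '%s\n' "$1" | grep -Eq "^($2)\$"
}

mask='ORA-(0|00)?600'   # ERE rewrite of the ksh pattern ORA-?(0|00)600
error=ORA-00600

if matches_mask "$error" "$mask"; then
    echo "match - with variable"
else
    echo "no match - with variable"
fi
```

This costs one extra process per comparison, which may matter if many masks are checked in a loop.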

2. Gnome panel on RH 7.2

3. ksh pattern matching when pattern is in a variable

4. problems with the kernel mathcode ?

5. Matching Line After Pattern (Pattern Occurs Multiple Times)

6. Problems with 3c905b on Red Hat 7.1

7. Matching a pattern in a file and inserting variable string above the line matched?

8. How do I use diald to connect to AT&T Worldnet

9. Pattern matching.

10. pattern matching help, please

11. ksh - checking for filenames matching a pattern

12. KSH Pattern Match Deletions

13. ksh pattern matching