egrep on rh 7.1 and 7.2 is ...

Post by Farid Hamjav » Sat, 22 Jun 2002 03:02:05



greetings,

Strange behavior of egrep on rh 7.1 and 7.2 has been getting my
attention for some time, but this is very strange.

Any ideas?

I have a file that has some garbage at the end, meaning
strange non-printable characters, so I do this:

   egrep '^[a-z|A-Z|0-9]' d1  > junk

   junk now has the garbage also.
   egrep on rh 7.1 and 7.2 is at version 2.4.2
   (according to egrep --version)

   I do not have access to rh 7.3

   I can NOT reproduce this on the following combinations:

   rh 6.2 whose  egrep is 2.4

   GNU egrep compiled on some AIX 4.3.3 system
   whose version is 2.1

thanks,
farid
unm

 
 
 

Post by Dave Bro » Sat, 22 Jun 2002 05:34:13



> I have a file that at the end has some garbage meaning
> strange non-printable characters. so I do this:

>    egrep '^[a-z|A-Z|0-9]' d1  > junk

>    junk now has the garbage also.

What do you think your regular-expression is selecting?

What it says to me (and egrep) is: pass any line which begins with an
alpha or numeric character. The "|" are superfluous: inside a bracket
expression "|" is a literal character, not alternation, so they simply
say that a line which starts with a "|" also matches. You're not
really using any of egrep's extended syntax.

grep processes strings separated by newline characters,
string by string.

So if the string being processed starts with an alphanumeric, then
regardless of the amount of 'garbage' contained in the string, the entire
string will be passed to the output.
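
A small sketch of the point above, on made-up data. The file names d1.sample and junk.sample are invented for the demo, LC_ALL=C is an assumption added so the ranges are compared byte by byte, and grep -E is used as the equivalent spelling of egrep:

```shell
# Build a hypothetical test file: the bytes \001 and \002 stand in for "garbage".
printf 'abc\001\002xyz\n|pipe line\n\001starts with garbage\n9 lives\n' > d1.sample

# Same pattern as in the original post.
# LC_ALL=C is an assumption: it pins [a-z] etc. to plain ASCII byte ranges.
LC_ALL=C grep -E '^[a-z|A-Z|0-9]' d1.sample > junk.sample

# cat -v makes the non-printable bytes visible.
cat -v junk.sample
# abc^A^Bxyz
# |pipe line
# 9 lives
```

Only the first character of each line is tested. The line beginning with "a" is passed along with its garbage, the line beginning with "|" matches because of the stray "|" in the brackets, and the line that begins with a garbage byte is dropped.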

If you're trying to get rid of "garbage characters", you might try

  tr -dc 'character_list_to_preserve' < in_file > out_file

(and don't forget to include punctuation, <space>, and "\n" in your list).

Eg:

  tr -dc 'A-Za-z0-9 ,.;:\t\n' <in_file.txt >output.txt
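
To see what that tr command does, here is a minimal sketch on made-up input. The file names in_file.sample and output.sample are invented, and LC_ALL=C is an assumption so that tr works on single bytes:

```shell
# Hypothetical input: one clean line, one line with three garbage bytes in it.
printf 'good line 1\nabc\001\002\377xyz\n' > in_file.sample

# -d deletes, -c complements the set: delete every byte NOT in the list.
# LC_ALL=C is an assumption so each byte is treated on its own.
LC_ALL=C tr -dc 'A-Za-z0-9 ,.;:\t\n' < in_file.sample > output.sample

cat output.sample
# good line 1
# abcxyz
```

Because "\n" is in the list, the line structure survives; only the unwanted bytes disappear.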

--
Dave Brown  Austin, TX

 
 
 

Post by Bill Marc » Sat, 22 Jun 2002 07:59:04


On Thu, 20 Jun 2002 18:02:05 +0000 (UTC),

Quote:

>greetings,

>strange behavior of egerp on rh 7.1 and 7.2  have been getting my
>attention for some times but this is very strange.

>Any ideas?

>I have a file that at the end has some garbage meaning
>strange non-printable characters. so I do this:

>   egrep '^[a-z|A-Z|0-9]' d1  > junk

>   junk now has the garbage also.
>   egrep on rh 7.1 and 7.2 is at 2.4.2
>   i.e. egrep --version

Your expression will match any line that begins with a letter, digit or
'|'.

I think the expression you want is
egrep -v '[^a-zA-Z0-9]' d1 >junk

This will filter out any line containing characters other than a-zA-Z0-9
(note that spaces and punctuation count as "other" here too).
If you want to filter out garbage characters without losing entire lines,
use tr.
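
A rough side-by-side of the two approaches on made-up data. All file names here are invented for the demo, LC_ALL=C is an assumption to keep the ranges byte-wise, and grep -E stands in for egrep:

```shell
# Hypothetical input: a clean line, a line with a space, a line with a garbage byte.
printf 'alpha1\nbeta 2\ngam\001ma3\n' > d1.sample2

# Approach 1: drop every line that contains anything outside a-zA-Z0-9.
LC_ALL=C grep -E -v '[^a-zA-Z0-9]' d1.sample2 > whole_lines.sample
cat whole_lines.sample
# alpha1

# Approach 2: keep every line, delete only the unwanted bytes.
LC_ALL=C tr -dc 'a-zA-Z0-9\n' < d1.sample2 > stripped.sample
cat stripped.sample
# alpha1
# beta2
# gamma3
```

The grep version loses "beta 2" as well, since the space is not in the allowed set; the tr version keeps all three lines but also strips the space.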

 
 
 
