replacing high order ascii chars using sed

Post by Josh Buedel » Wed, 16 Apr 2003 06:43:46



I'm trying to write a sed script that will replace certain control
characters and high order ascii chars with a printable character.  I'm using
cygwin on a win2000 machine, from the windows command (cmd.exe) line.

I have a couple of questions:

1.    This command line
            sed "s/[[:cntrl:]]/X/g" b2.bin
        and this command line
            sed "s/[\x00-\x19\x7F]/X/g" b2.bin
        do not produce the same results.  Why is this?

2.    The byte size of the output was 12 bytes bigger than b2.bin's original
size of 851, for both expressions.  Why is this?  I'm replacing a single
character with a single character (I think).

3.    What would the correct regular expression/command line be?  I want to
replace all characters 0x7F and up with an 'X'.

Thanks,
Josh

 
 
 

Post by Stephane CHAZELA » Wed, 16 Apr 2003 18:44:01



> I'm trying to write a sed script that will replace certain control
> characters and high order ascii chars with a printable character.  I'm using
> cygwin on a win2000 machine, from the windows command (cmd.exe) line.

> I have a couple of questions:

> 1.    This command line
>             sed "s/[[:cntrl:]]/X/g" b2.bin
>         and this command line
>             sed "s/[\x00-\x19\x7F]/X/g" b2.bin
>         do not produce the same results.  Why is this?

> 2.    The byte size of the output was 12 bytes bigger than b2.bin's original
> size of 851, for both expressions.  Why is this?  I'm replacing a single
> character with a single character (I think).

It seems to me that MS Windows has two types of file: binary and
text. sed is a text utility, and I don't know how it handles binary
files; it may do some conversions (on NL or CR chars). I don't
know whether CR is in cntrl on Windows or not.

> 3.    What would the correct regular expression/command line be?  I want to
> replace all characters 0x7F and up with an 'X'.

Use "tr" instead of sed (but pay attention to CR and NL chars).
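
A minimal sketch of the tr approach, assuming a POSIX-style tr such as GNU tr. tr takes \NNN octal escapes rather than \xHH, so 0x7F is written \177 and 0xFF is written \377. The function names are made up for illustration. The second variant leaves TAB, NL and CR alone, which is one way to pay attention to the line-ending characters:

```shell
# mask_high_bytes is a hypothetical helper name used only for this sketch.
# Replace every byte from 0x7F (octal 177) through 0xFF (octal 377) with 'X'.
# LC_ALL=C asks tr to treat the input as single bytes, not multibyte characters.
# '[X*]' repeats X as often as needed to match the length of the first set.
mask_high_bytes() {
    LC_ALL=C tr '\177-\377' '[X*]'
}

# Same idea, extended to the control characters below 0x20,
# but leaving TAB (011), NL (012) and CR (015) untouched.
mask_control_and_high_bytes() {
    LC_ALL=C tr '\000-\010\013\014\016-\037\177-\377' '[X*]'
}
```

Usage would look like `mask_high_bytes < b2.bin > b2.out`. Whether line endings are still converted on the way out depends on how the Cygwin setup handles text mode, which this sketch does not address.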

--
Stéphane

 
 
 

Post by Bill Marcum » Wed, 16 Apr 2003 18:44:36


On Mon, 14 Apr 2003 16:43:46 -0500, Josh Buedel wrote:

> I'm trying to write a sed script that will replace certain control
> characters and high order ascii chars with a printable character.  I'm using
> cygwin on a win2000 machine, from the windows command (cmd.exe) line.

> I have a couple of questions:

> 1.    This command line
>             sed "s/[[:cntrl:]]/X/g" b2.bin
>         and this command line
>             sed "s/[\x00-\x19\x7F]/X/g" b2.bin
>         do not produce the same results.  Why is this?

Sed does not recognize the \x00 notation.  I'm not even sure about
[[:cntrl:]], but that might work in GNU sed.
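
Independent of whether a given sed understands \xHH, the two expressions name different sets of bytes. In the C locale the cntrl class covers 0x00-0x1F plus 0x7F, while the range in the second command stops at 0x19, so 0x1A-0x1F are left out. A small sketch using tr, which takes octal escapes (0x19 is octal 031, 0x1A is octal 032), makes the difference visible. The helper names are hypothetical:

```shell
# cntrl_class and hex_range are hypothetical helper names for this sketch.

# Mark bytes in the POSIX cntrl class (C locale: 0x00-0x1F and 0x7F).
cntrl_class() {
    LC_ALL=C tr '[:cntrl:]' '[X*]'
}

# Mark bytes in the set the second expression names: 0x00-0x19 and 0x7F
# (octal 000-031 and 177).
hex_range() {
    LC_ALL=C tr '\000-\031\177' '[X*]'
}

# 0x1A (octal 032) is a control character that falls outside the hex range.
# od -c shows whether the byte was replaced or passed through.
printf 'a\032b' | cntrl_class | od -c
printf 'a\032b' | hex_range | od -c
```

If the intent was "all control characters", the upper bound of the range would need to be 0x1F rather than 0x19.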

> 2.    The byte size of the output was 12 bytes bigger than b2.bin's original
> size of 851, for both expressions.  Why is this?  I'm replacing a single
> character with a single character (I think).

Since you are using Windows, it might be converting the line endings
from \n to \r\n.
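
If the extra 12 bytes do come from NL being rewritten as CR NL, the growth should equal the number of NL bytes in the input. A rough way to check that, assuming standard tr, wc and awk. Both count_nl and add_cr are hypothetical helpers, and add_cr only imitates a text-mode write for line-oriented input:

```shell
# count_nl is a hypothetical helper: it counts NL (0x0A) bytes on stdin
# by deleting everything that is not NL and counting what remains.
count_nl() {
    LC_ALL=C tr -cd '\n' | wc -c | tr -d ' '
}

# add_cr is a hypothetical stand-in for a text-mode write:
# it emits CR NL at the end of every input line.
add_cr() {
    awk '{ printf "%s\r\n", $0 }'
}

# Three lines, 14 bytes in total; after add_cr each line is one byte longer.
printf 'one\ntwo\nthree\n' | count_nl
printf 'one\ntwo\nthree\n' | add_cr | wc -c
```

Running `count_nl < b2.bin` and getting 12 would be consistent with the line-ending explanation; a different count would point to some other cause.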

> 3.    What would the correct regular expression/command line be?  I want to
> replace all characters 0x7F and up with an 'X'.

man tr

--
bill marcum the mushroom-eating laboratory monkey
What kind of monkey are you? http://thesurrealist.co.uk/monkey.cgi

 
 
 

Post by Charles Demas » Wed, 16 Apr 2003 22:34:43





>> I'm trying to write a sed script that will replace certain control
>> characters and high order ascii chars with a printable character.  I'm using
>> cygwin on a win2000 machine, from the windows command (cmd.exe) line.

>> I have a couple of questions:

>> 1.    This command line
>>             sed "s/[[:cntrl:]]/X/g" b2.bin
>>         and this command line
>>             sed "s/[\x00-\x19\x7F]/X/g" b2.bin
>>         do not produce the same results.  Why is this?

>> 2.    The byte size of the output was 12 bytes bigger than b2.bin's original
>> size of 851, for both expressions.  Why is this?  I'm replacing a single
>> character with a single character (I think).

>It seems to me that MS Windows has two types of file: binary and
>text. sed is a text utility, and I don't know how it handles binary
>files; it may do some conversions (on NL or CR chars). I don't
>know whether CR is in cntrl on Windows or not.

>> 3.    What would the correct regular expression/command line be?  I want to
>> replace all characters 0x7F and up with an 'X'.

>Use "tr" instead of sed (but pay attention to CR and NL chars).

Or use Perl, which one person described as sed on steroids.

Chuck Demas

--
  Eat Healthy        |   _ _   | Nothing would be done at all,

  Die Anyway         |    v    | That no one could find fault with it.

 
 
 

1. Using sed to replace extended ASCII characters

I'm writing a script that processes some text files, and I'd like to use
sed to convert certain extended control characters (left over from a Mac
word processor) to their ASCII equivalent.

One of the specific characters I want to convert is ASCII 320 to '-'.
I tried the following command:

   sed 's/\320/-/g' file > output

Obviously, the \320 notation is somehow incorrect.  Just for the record,
I also tried entering the extended character directly using emacs and
Ctl-Q.

Help would be greatly appreciated.  Please reply by email.  Thanks.

--
== Eugene Eric Kim =========================================================

==       "Dangerous stuff, science.  Lots of us not fit for it."          ==
========================================= -H.C. Bailey, "The Long Dinner" ==

2. Using IDE zip drive with solaris 2.6/x86

3. Solved: Q: replacing control chars with sed

4. large bookmarks.xml, speed up in 2.2??

5. Q: replacing control chars with sed

6. isapnp contradiction

7. Replace newline char in sed

8. netscape hangs

9. Help: use SED to move or replace "New-line" & "return" char

10. how to replace a char. with a variable in sed?

11. Using Sed and Shell Variables in Multiple Lines Search and Replace using /c\

12. reversing lines char by char, but not the line order in a file
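
For the first related question above (converting the byte written as \320 to '-'), tr accepts octal escapes directly, whereas not every sed does. The sketch below assumes the unwanted byte really is octal 320 (0xD0), which is the en dash in the Mac OS Roman encoding. mac_dash_to_hyphen is a hypothetical name:

```shell
# mac_dash_to_hyphen is a hypothetical helper for this sketch.
# Replace the single byte octal 320 (0xD0) with an ASCII hyphen.
# LC_ALL=C keeps tr working on bytes rather than multibyte characters.
mac_dash_to_hyphen() {
    LC_ALL=C tr '\320' '-'
}
```

Usage would follow the shape of the original attempt: `mac_dash_to_hyphen < file > output`.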