Strange behaviour of replace-regular-expression (?)

Post by Alois Stein » Sat, 16 May 1992 23:32:49



While trying to recover a damaged archive file I obtained an
`almost' valid Fortran file:
There are a lot of lines starting with single letters.
Now I tried to remove these lines by
(replace-regexp "^[^cC0-9* ].*$" "" nil)
This deleted all lines of the file, including comments and Fortran lines
starting with blanks.
When I did a `re-search-forward' with the same regexp, I found
precisely the lines I wished to remove.

After playing a while on a narrowed region, I succeeded on that region,
but the rest of the file was deleted again.

Is there a different behaviour of these regexps in search and replace?

Any help appreciated!

(Current version: 18.57)

With kind regards

--

___________________________________________________________________________

Alois Steindl,                  Tel.: (0222) 58801 / 5529      
Inst. for Mechanics II,         Fax.: (0222) 5875863
TU Vienna,
A-1040 Wiedner Hauptstr. 8-10  
___________________________________________________________________________

 
 
 

Post by Pete Peters » Thu, 28 May 1992 21:41:39



 >While trying to recover a damaged archive file I obtained an
 >`almost' valid Fortran file:
 >There are a lot of lines starting with single letters.
 >Now I tried to remove these lines by
 >(replace-regexp "^[^cC0-9* ].*$" "" nil)
 >This deleted all lines of the file, including comments and Fortran lines
 >starting with blanks.
 >When I did a `re-search-forward' with the same regexp, I found
 >precisely the lines I wished to remove.
 >
 >After playing a while on a narrowed region, I succeeded on that region,
 >but the rest of the file was deleted again.
 >
 >Is there a different behaviour of these regexps in search and replace?
 >
 >Any help appreciated!
 >
 >(Current version: 18.57)
 >
 >With kind regards
 >  
 >--
 >
 >___________________________________________________________________________
 >
 >Alois Steindl,                     Tel.: (0222) 58801 / 5529      
 >Inst. for Mechanics II,            Fax.: (0222) 5875863
 >TU Vienna,
 >A-1040 Wiedner Hauptstr. 8-10      
 >___________________________________________________________________________

Emacs regular expressions are less "line-oriented" than we are accustomed
to from other Unix tools which operate on single-line buffers and patterns.
One tends to forget that one can match newlines in the pattern.

The problem is that your "(replace-regexp  "^[^cC0-9* ].*$" "" nil)"
replaces the text on the offending lines but leaves the linefeed.

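Point is now at the start of the emptied line, immediately before the
linefeed. A negated character set matches a linefeed unless the linefeed is
listed in it, so the linefeed matches "[^cC0-9* ]", the ".*" then swallows
the whole of the next line, and the process continues.
Everything following the first offending line will be deleted.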
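
To make the runaway deletion concrete, here is a small sketch that runs the
same search-and-replace loop on a throwaway buffer. The function name
`my-demo-runaway-replace' and the four sample lines are made up for
illustration; only the regexp comes from the original post. It is meant for
a reasonably recent Emacs (`with-temp-buffer' may not exist in 18.57).

```elisp
;; Sketch only: the function name and the sample text are invented.
(defun my-demo-runaway-replace ()
  "Return the sample text after running the original replacement."
  (with-temp-buffer
    (insert "      X = 1\nq garbage\nC comment\n      Y = 2\n")
    (goto-char (point-min))
    ;; Search for the regexp and replace each match with the empty
    ;; string, continuing from wherever the last replacement left point.
    (while (re-search-forward "^[^cC0-9* ].*$" nil t)
      (replace-match "" nil nil))
    (buffer-string)))
```

Only "q garbage" is an offending line, yet the comment line and the last
statement disappear as well: once the first match has been emptied, each
later match starts with the linefeed that point is sitting in front of.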

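If you want to replace the offending lines with blank lines, you could, for
instance, do:
        (replace-regexp  "^[^cC0-9* \n].*$" "" nil)

If you want to delete the whole line, including the linefeed, you could do
something like:
        (replace-regexp  "^[^cC0-9* \n].*\n" "" nil)

The "\n" inside the brackets also keeps an empty line from matching together
with the line after it.

Here "\n" is Lisp string syntax for a linefeed character. Emacs regexps have
no "\n" escape of their own, and a backslash inside "[...]" is taken
literally, so writing "\\n" would not match a linefeed. If you're typing the
regexp interactively using "M-x replace-regexp", insert the linefeed with
C-q C-j.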
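
The two variants can be tried out on a scratch string before touching the
real file. The sketch below is only an illustration: the three function
names are hypothetical, and the third uses `flush-lines', a different
command from the one discussed above, on the assumption that it is available
in your Emacs. Note that each regexp lists the linefeed inside the brackets,
written as "\n" in the Lisp string.

```elisp
;; Sketch only: all three function names are invented for this example.
(defun my-demo-blank-bad-lines (text)
  "Return TEXT with each offending line emptied but its linefeed kept."
  (with-temp-buffer
    (insert text)
    (goto-char (point-min))
    ;; The linefeed is in the excluded set, so the emptied line
    ;; cannot start a new match.
    (while (re-search-forward "^[^cC0-9* \n].*$" nil t)
      (replace-match "" nil nil))
    (buffer-string)))

(defun my-demo-delete-bad-lines (text)
  "Return TEXT with each offending line removed, linefeed included."
  (with-temp-buffer
    (insert text)
    (goto-char (point-min))
    ;; The trailing linefeed is part of the match, so the whole line goes.
    (while (re-search-forward "^[^cC0-9* \n].*\n" nil t)
      (replace-match "" nil nil))
    (buffer-string)))

(defun my-demo-flush-bad-lines (text)
  "Like `my-demo-delete-bad-lines', but using `flush-lines'."
  (with-temp-buffer
    (insert text)
    (goto-char (point-min))
    ;; `flush-lines' deletes every line containing a match after point.
    (flush-lines "^[^cC0-9* \n]")
    (buffer-string)))
```

For example, `(my-demo-delete-bad-lines "      X = 1\nq garbage\nC comment\n")'
should leave the statement and the comment line untouched and drop only the
"q garbage" line.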

        pete peterson

        {decvax,mit-eddie}!genrad!rep
        (508)369-4400 x2478; Home: (508)256-5829