sed'ers & awk'ers part II - help!


Post by Peter Samuels » Wed, 03 Jun 1998 04:00:00




> "ABC-12345/X/Y","98-06-02 12:31","983432 99","Some text", "etc..."
> "ABC-54321/Z/Y","98-06-01 10:44","345534 32","Some more text", "etc..."
> The first column always begins with "ABC-" and the X,Y and Z is
> always only one character, but I want to get rid of all but the
> number in the first column.  The number (12345) can be anything
> between 1-99999 (one to five chars).

Simple.  If you're sure of the file format and don't mind making
assumptions that it's _exactly_ as you say:

  sed -e s:ABC-:: -e s:/...::

Possibly a little safer, since it checks for the correct format before
transforming it:

  sed 's:^"ABC-\([0-9][0-9]*\)/[XYZ]/[XYZ]":"\1":'

Both untested so do test them first to make sure.
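
A quick way to test both is to feed them the two sample lines. The sketch below does that; the function names (sample, loose, strict) are made up for illustration, and the stricter pattern is written on the assumption that the closing quote comes directly after the second letter.

```shell
# Hypothetical helper: prints the two sample lines from the question.
sample() {
  printf '%s\n' \
    '"ABC-12345/X/Y","98-06-02 12:31","983432 99","Some text", "etc..."' \
    '"ABC-54321/Z/Y","98-06-01 10:44","345534 32","Some more text", "etc..."'
}

# Trusting version: drop the first ABC- and the first slash plus three characters.
loose() { sed -e 's:ABC-::' -e 's:/...::'; }

# Checking version: rewrite the first field only when it has the expected shape.
strict() { sed 's:^"ABC-\([0-9][0-9]*\)/[XYZ]/[XYZ]":"\1":'; }

sample | loose
sample | strict
```

Both calls should print the two lines with the first field reduced to the quoted number. A line that does not fit the expected shape passes through strict unchanged, which is the point of the safer form.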

--
Peter Samuelson
<sampo.creighton.edu ! psamuels>

 
 
 


Post by Charles Dem » Wed, 03 Jun 1998 04:00:00




>Hi again. Thank you for helping me with my last problem.

>"ABC-12345/X/Y","98-06-02 12:31","983432 99","Some text", "etc..."
>"ABC-54321/Z/Y","98-06-01 10:44","345534 32","Some more text", "etc..."

>The first column always begins with "ABC-" and the X,Y and Z is always only
>one character, but I want to get rid of all but the number in the first
>column.
>The number (12345) can be anything between 1-99999 (one to five chars).

>The example should look like:
>"12345","98-06-02 12:31","983432 99","Some text", "etc..."
>"54321","98-06-01 10:44","345534 32","Some more text", "etc..."

sed -e 's/^"[^0-9"]*\([0-9]*\)[^0-9"]*"/"\1"/'

Chuck Demas
Needham, Mass.

--
  Eat Healthy    |   _ _   | Nothing would be done at all,

  Die Anyway     |    v    | That no one could find fault with it.


 
 
 


Post by Kurt J. Lanz » Wed, 03 Jun 1998 04:00:00



> Hi again. Thank you for helping me with my last problem.

> "ABC-12345/X/Y","98-06-02 12:31","983432 99","Some text", "etc..."
> "ABC-54321/Z/Y","98-06-01 10:44","345534 32","Some more text", "etc..."

> The first column always begins with "ABC-" and the X,Y and Z is always only
> one character, but I want to get rid of all but the number in the first
> column.
> The number (12345) can be anything between 1-99999 (one to five chars).

> The example should look like:
> "12345","98-06-02 12:31","983432 99","Some text", "etc..."
> "54321","98-06-01 10:44","345534 32","Some more text", "etc..."

> (only the first column changed)

> Crack this nut ;)

I think it's high time that you learned enough about the unix tools
to do your own work. After all, we aren't all sitting around waiting
for the opportunity to help you.
 
 
 


Post by Pete Houst » Thu, 04 Jun 1998 04:00:00



|The first column always begins with "ABC-" and the X,Y and Z is always only
|one character, but I want to get rid of all but the number in the first
|column.
|The number (12345) can be anything between 1-99999 (one to five chars).

gawk -F: '{gsub(/[0-9]/, "", $1); print}' text.dat

Untested, but you get the gist.

                        Pete

FUs set.
--

PO Box 220, Whiteknights, Reading, | Phone: +44-118-9875123 ext 7594
Berkshire, RG6 6AF, United Kingdom | Fax:   +44-118-9750203                
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
WWW: http://www.rdg.ac.uk/~spr96phh/pete.html Use lynx - you know you want to!

 
 
 


Post by Peter Swedo » Thu, 04 Jun 1998 04:00:00


: Hi again. Thank you for helping me with my last problem.

: "ABC-12345/X/Y","98-06-02 12:31","983432 99","Some text", "etc..."
: "ABC-54321/Z/Y","98-06-01 10:44","345534 32","Some more text", "etc..."

: The first column always begins with "ABC-" and the X,Y and Z is always only
: one character, but I want to get rid of all but the number in the first
: column.
: The number (12345) can be anything between 1-99999 (one to five chars).

: The example should look like:
: "12345","98-06-02 12:31","983432 99","Some text", "etc..."
: "54321","98-06-01 10:44","345534 32","Some more text", "etc..."

: (only the first column changed)

sed -e 's/^....-//g' oldfile | sed -e 's/\/.\/.//g' > newfile

: Crack this nut ;)

Crunchy...

Petr
--
GTEI, Powered By BBN.

No, no, no... it should read:

"A well regulated militia, being necessary to the welfare of the state,
the right-wing people who keep and bear arms, shall not be unhinged."

 
 
 


Post by Charles Dem » Thu, 04 Jun 1998 04:00:00


[followups partially ignored]




>|The first column always begins with "ABC-" and the X,Y and Z is always only
>|one character, but I want to get rid of all but the number in the first
>|column.
>|The number (12345) can be anything between 1-99999 (one to five chars).

>gawk -F: '{gsub(/[0-9]/, "", $1); print}' text.dat

>Untested, but you get the gist.

Actually I believe that will eliminate the numbers from the first
field, which was:

"ABC-12356/X/Y"

the code should have been:

gawk -F, -v OFS=, '{gsub(/[^"0-9]*/, "", $1); print}' text.dat

which should leave just the numbers in place within double quotes.

This is also untested.  :-)
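
A sketch of the field-based approach run against one sample line, on two assumptions: the fields are separated by commas, and OFS has to be set to a comma as well so that the commas survive when awk rebuilds the line after $1 is changed. Plain awk is enough; the function name first_field_digits is made up for illustration.

```shell
# Assumption: comma-separated fields, so -F, splits them and OFS=, rejoins them.
first_field_digits() {
  awk -F, -v OFS=, '{ gsub(/[^"0-9]/, "", $1); print }'
}

# Strip everything except digits and double quotes from the first field only.
printf '%s\n' '"ABC-12356/X/Y","98-06-02 12:31","983432 99","Some text", "etc..."' |
  first_field_digits
```

Without the OFS setting, awk would rejoin the fields with spaces and the commas between columns would be lost.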

Chuck Demas
Needham, Mass.
[posted and emailed]

--
  Eat Healthy    |   _ _   | Nothing would be done at all,

  Die Anyway     |    v    | That no one could find fault with it.

 
 
 


Post by Ken Pizzi » Thu, 04 Jun 1998 04:00:00




>sed -e 's/^....-//g' oldfile | sed -e 's/\/.\/.//g' > newfile

Does this group have a "useless use of pipes award"?
(Actually, contestants for the "useless use of cat award"
could (almost?) always enter the "useless pipes" contest too...)

  sed -e 's/^....-//g' -e 's,/./.,,g' oldfile > newfile

(I also ameliorated the "leading toothpick syndrome" while I was at it.)

                --Ken Pizzini

 
 
 


Post by Charles Dem » Fri, 05 Jun 1998 04:00:00






>>sed -e 's/^....-//g' oldfile | sed -e 's/\/.\/.//g' > newfile

>Does this group have a "useless use of pipes award"?
>(Actually, contestants for the "useless use of cat award"
>could (almost?) always enter the "useless pipes" contest too...)

>  sed -e 's/^....-//g' -e 's,/./.,,g' oldfile > newfile

>(I also ameliorated the "leading toothpick syndrome" while I was at it.)

Both you guys lost track of the original problem, because these
solutions, while close, don't do the right thing, they eliminate the
leading double quote.  There is no need for the g flags either, as
only one substitution per line is desired/wanted.
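
To see the lost quote concretely, here is a small sketch that runs the quoted command over the first sample line. The function name drop_prefix is made up for illustration.

```shell
# The quoted command, unchanged: ^....- matches the opening quote plus ABC-.
drop_prefix() { sed -e 's/^....-//g' -e 's,/./.,,g'; }

printf '%s\n' '"ABC-12345/X/Y","98-06-02 12:31","983432 99","Some text", "etc..."' |
  drop_prefix
```

The first field comes out as 12345" with no opening quote, because the four dots consume the quote along with ABC.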

Here's part of the original post:

| "ABC-12345/X/Y","98-06-02 12:31","983432 99","Some text", "etc..."
| "ABC-54321/Z/Y","98-06-01 10:44","345534 32","Some more text", "etc..."
|
| The first column always begins with "ABC-" and the X,Y and Z is always only
| one character, but I want to get rid of all but the number in the first
| column.
| The number (12345) can be anything between 1-99999 (one to five chars).
|
| The example should look like:
| "12345","98-06-02 12:31","983432 99","Some text", "etc..."
| "54321","98-06-01 10:44","345534 32","Some more text", "etc..."

BTW, if you want to eliminate things, then your expression could be
written more compactly to properly do the job like this:

  sed  's/...-//;s,/./.,,' oldfile > newfile

Personally, I prefer this, even though it's longer:

  sed -e 's/^"[^"0-9]*\([0-9][0-9]*\)[^"]*"/"\1"/' oldfile > newfile

It takes the field contained by double quotes starting a line and
eliminates everything between the quotes except the first string of
1 or more numbers, when that is possible.
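
A sketch of how that longer expression behaves, including the "when that is possible" case. The second and third input lines are invented test data, not from the original post, and keep_number is a made-up name.

```shell
# The longer expression from above, reading standard input instead of oldfile.
keep_number() {
  sed -e 's/^"[^"0-9]*\([0-9][0-9]*\)[^"]*"/"\1"/'
}

# Line 1 is from the question; lines 2 and 3 are invented to probe edge cases.
printf '%s\n' \
  '"ABC-12345/X/Y","98-06-02 12:31","983432 99","Some text", "etc..."' \
  '"ABC-7/Z/Y","98-06-01 10:44","345534 32","Some more text", "etc..."' \
  '"no digits here","left alone"' |
  keep_number
```

The first two lines come out with "12345" and "7" as the first field. The third has no digits before the closing quote, so the pattern cannot match and the line passes through untouched.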

Chuck Demas
Needham, Mass.

--
  Eat Healthy    |   _ _   | Nothing would be done at all,

  Die Anyway     |    v    | That no one could find fault with it.

 
 
 


Post by Peter Swedo » Fri, 05 Jun 1998 04:00:00




: >sed -e 's/^....-//g' oldfile | sed -e 's/\/.\/.//g' > newfile

: Does this group have a "useless use of pipes award"?
: (Actually, contestants for the "useless use of cat award"
: could (almost?) always enter the "useless pipes" contest too...)

I hardly think it "useless"... I use pipes for good reasons like clarity
and organizational effectiveness; these things are an absolute necessity
to a person with a learning disability.

I don't operate under the principle that the least amount of typing
always produces the most elegant and effective solution, and I'd thank  
you to keep your aesthetics of sed calligraphy out of my face.

Petr
--
GTEI, Powered By BBN.

No, no, no... it should read:

"A well regulated militia, being necessary to the welfare of the state,
the right-wing people who keep and bear arms, shall not be unhinged."

 
 
 


Post by Peter Swedo » Fri, 05 Jun 1998 04:00:00



: >  sed -e 's/^....-//g' -e 's,/./.,,g' oldfile > newfile
: >
: >(I also ameliorated the "leading toothpick syndrome" while I was at it.)

: Both you guys lost track of the original problem, because these
: solutions, while close, don't do the right thing, they eliminate the
: leading double quote.  There is no need for the g flags either, as
: only one substitution per line is desired/wanted.

Well, not quite...

: Here's part of the original post:

: | "ABC-12345/X/Y","98-06-02 12:31","983432 99","Some text", "etc..."
: | "ABC-54321/Z/Y","98-06-01 10:44","345534 32","Some more text", "etc..."
: |
: | The first column always begins with "ABC-" and the X,Y and Z is always only
: | one character, but I want to get rid of all but the number in the first
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
: | column.
   ^^^^^^^

That's what the man said he wanted...

Petr

--
GTEI, Powered By BBN.

No, no, no... it should read:

"A well regulated militia, being necessary to the welfare of the state,
the right-wing people who keep and bear arms, shall not be unhinged."

 
 
 


Post by Kurt J. Lanz » Fri, 05 Jun 1998 04:00:00






> : >sed -e 's/^....-//g' oldfile | sed -e 's/\/.\/.//g' > newfile

> : Does this group have a "useless use of pipes award"?

> I hardly think it "useless"... I use pipes for good reasons like clarity
> and organizational effectiveness; these things are an absolute necessity
> to a person with a learning disability.

But you do realize you pay a price? Having two processes in a pipeline
where one (with two substitutions) will do the job costs you in machine
resources. If you don't care, fine. But just stay aware that it is
not cost-free.


> I don't operate under the principle that the least amount of typing
> always produces the most elegant and effective solution, and I'd thank
> you to keep your aesthetics of sed calligraphy out of my face.

Let's try not to make this personal, shall we?
 
 
 


Post by Charles Dem » Fri, 05 Jun 1998 04:00:00






>:>  sed -e 's/^....-//g' -e 's,/./.,,g' oldfile > newfile
>:>
>:>(I also ameliorated the "leading toothpick syndrome" while I was at it.)

>:Both you guys lost track of the original problem, because these
>:solutions, while close, don't do the right thing, they eliminate the
>:leading double quote.  There is no need for the g flags either, as
>:only one substitution per line is desired/wanted.

>Well, not quite...

>:Here's part of the original post:

>:| "ABC-12345/X/Y","98-06-02 12:31","983432 99","Some text", "etc..."
>:| "ABC-54321/Z/Y","98-06-01 10:44","345534 32","Some more text", "etc..."
>:|
>:| The first column always begins with "ABC-" and the X,Y and Z is always only
>:| one character, but I want to get rid of all but the number in the first
>               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>:| column.
>   ^^^^^^^

>That's what the man said he wanted...

You're being deliberately obtuse and snipping to make your point.

He went on to show the output he wanted (which I included in my
last post, and you snipped when you posted)

Once again, here's part of the original post:

| "ABC-12345/X/Y","98-06-02 12:31","983432 99","Some text", "etc..."
| "ABC-54321/Z/Y","98-06-01 10:44","345534 32","Some more text", "etc..."
| The first column always begins with "ABC-" and the X,Y and Z is always only
| one character, but I want to get rid of all but the number in the first
| column.
| The number (12345) can be anything between 1-99999 (one to five chars).
| The example should look like:
| "12345","98-06-02 12:31","983432 99","Some text", "etc..."
| "54321","98-06-01 10:44","345534 32","Some more text", "etc..."

See the last 2 lines, see the double quotes, see Pete see, run spot run  :-)

Chuck Demas
Needham, Mass.

--
  Eat Healthy    |   _ _   | Nothing would be done at all,

  Die Anyway     |    v    | That no one could find fault with it.

 
 
 


Post by Ken Pizzi » Fri, 05 Jun 1998 04:00:00




>>  sed -e 's/^....-//g' -e 's,/./.,,g' oldfile > newfile
>BTW, if you want to eliminate things, then your expression could be
>written more compactly to properly do the job like this:

>  sed  's/...-//;s,/./.,,' oldfile > newfile

Depends; POSIX.2 does not specify that it is okay to use ";" as a
command separator, so I preferred the only slightly more
verbose version above, just in case someone is using a hobbled
version of sed that does not support the ";" notation.  Also, I wasn't
concerned so much with ultimate brevity of expression as with the
gratuitous use of a pipeline where a single instance of sed
could do the job just as easily, and the version with two "-e" expressions
highlighted the nature of my change more clearly.
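
For comparison, a sketch that runs both spellings over one sample line. It assumes a sed that accepts ";" between commands, which is exactly the thing that may not hold everywhere; the function names are made up for illustration.

```shell
line='"ABC-54321/Z/Y","98-06-01 10:44","345534 32","Some more text", "etc..."'

# Two -e expressions: the more portable spelling.
with_e() { sed -e 's/...-//' -e 's,/./.,,'; }

# One script with ; between the commands: shorter, but relies on sed accepting it.
with_semicolon() { sed 's/...-//;s,/./.,,'; }

printf '%s\n' "$line" | with_e
printf '%s\n' "$line" | with_semicolon
```

Where both forms are accepted they print the same line, with the opening quote kept and "54321" as the first field.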

                --Ken Pizzini

 
 
 


Post by Peter Swedo » Sat, 06 Jun 1998 04:00:00


: >
: >That's what the man said he wanted...

: You're being deliberately obtuse and snipping to make your point.

No, I'm pointing out that the description of what he wanted was right there.

I'm not trying to make points or to prove you wrong... we're both right because
he's unclear.

Petr

--
GTEI, Powered By BBN.

No, no, no... it should read:

"A well regulated militia, being necessary to the welfare of the state,
the right-wing people who keep and bear arms, shall not be unhinged."