Sorting By Second Column With Unique First Column

Sorting By Second Column With Unique First Column

Post by Matt Steinhof » Thu, 03 Dec 1998 04:00:00



I need to sort data by the second column while keeping
the first column unique. There is probably an easy way to
do this that I'm totally missing.
    The first column is an inmate number. The second column
is the bond amount. I'm going to output the criminals based
on the bond amount, with the highest bonds at the top.
    My problem is that if someone has two bond amounts, that
person makes the list twice. I only one folks listed based on
the highest individual bond amount. (A guy with a $10k bond
should be before someone with three $5k bonds.)

Unsorted                          What I Want
98051461~250.00                   98051385~5000.00
98051461~0.00                     98051487~750.00
98051403~500.00                   98051403~500.00
98051487~750.00                   98051461~250.00
98051487~750.00
98051385~1500.00
98051385~1500.00
98051385~0.00
98051385~0.00
98051385~5000.00

    I'm using... sort -rnt "~" -k 2 $INPUT | uniq ...which
solves the problem when someone has multiple bonds of the
same amount but doesn't help when the bonds are varying.
    A perl solution would be a wonderful thing but I'm currently
getting the data externally (open DINNER, "sort -rnt \"~\" -k 2
$Temp | uniq |" or die "Can't fork: $!" ;) so I'll take any
solution that works. (Yes, the above is cheezy.)
    All suggestions (and requisite mocking) welcome.
    Many thanks.

    Matt

 
 
 

Sorting By Second Column With Unique First Column

Post by Jim Mellande » Thu, 03 Dec 1998 04:00:00



> I need to sort data by the second column while keeping
> the first column unique. There is probably an easy way to
> do this that I'm totally missing.
>     The first column is an inmate number. The second column
> is the bond amount. I'm going to output the criminals based
> on the bond amount, with the highest bonds at the top.
>     My problem is that if someone has two bond amounts, that
> person makes the list twice. I only one folks listed based on
> the highest individual bond amount. (A guy with a $10k bond
> should be before someone with three $5k bonds.)

> Unsorted                          What I Want
> 98051461~250.00                   98051385~5000.00
> 98051461~0.00                     98051487~750.00
> 98051403~500.00                   98051403~500.00
> 98051487~750.00                   98051461~250.00
> 98051487~750.00
> 98051385~1500.00
> 98051385~1500.00
> 98051385~0.00
> 98051385~0.00
> 98051385~5000.00

>     I'm using... sort -rnt "~" -k 2 $INPUT | uniq ...which
> solves the problem when someone has multiple bonds of the
> same amount but doesn't help when the bonds are varying.
>     A perl solution would be a wonderful thing but I'm currently
> getting the data externally (open DINNER, "sort -rnt \"~\" -k 2
> $Temp | uniq |" or die "Can't fork: $!" ;) so I'll take any
> solution that works. (Yes, the above is cheezy.)
>     All suggestions (and requisite mocking) welcome.
>     Many thanks.

>     Matt

Well, awk could do it like this:

awk -F\~ '{ if (0+$2 > largest[$1] ) largest[$1]=$2 }        #Keep array of
largest per inmate #
END     { for (i in largest)    print  i "~" largest[i] }
' <$INPUT | sort -rnt "~" -k 2

but I'm sure the perl guys will give you a solution too.

 
 
 

Sorting By Second Column With Unique First Column

Post by J. S. Jense » Thu, 03 Dec 1998 04:00:00



> There is probably an easy way to do this that I'm totally missing.

Well, I was missing this too.  I spent a long time trying to relate
uniq'ed line numbers and eventually sed -n print out those lines,
however, your missing just one part:

Quote:> sort -rnt "~" -k 2 $INPUT | uniq

Just reverse numerically sort the whole damn thing, then uniq it based
on inmate id, then re-reverse numerical sort what's left on the amount:

sort -rn ${INPUT} |uniq -w 8 |sort -rnt '~' -k 2

The only problem is that the uniq is based of the 8 character inmate #.

--
J. S. Jensen

http://www.paramin.com

 
 
 

Sorting By Second Column With Unique First Column

Post by jwpurpl » Fri, 04 Dec 1998 04:00:00


Have you tried sort -rntu ? the -u option does the same as uniq, but if
you're lucky it'll limit itself to the column sorted on.

>I need to sort data by the second column while keeping
>the first column unique. There is probably an easy way to
>do this that I'm totally missing.
>    The first column is an inmate number. The second column
>is the bond amount. I'm going to output the criminals based
>on the bond amount, with the highest bonds at the top.
>    My problem is that if someone has two bond amounts, that
>person makes the list twice. I only one folks listed based on
>the highest individual bond amount. (A guy with a $10k bond
>should be before someone with three $5k bonds.)

>Unsorted                          What I Want
>98051461~250.00                   98051385~5000.00
>98051461~0.00                     98051487~750.00
>98051403~500.00                   98051403~500.00
>98051487~750.00                   98051461~250.00
>98051487~750.00
>98051385~1500.00
>98051385~1500.00
>98051385~0.00
>98051385~0.00
>98051385~5000.00

>    I'm using... sort -rnt "~" -k 2 $INPUT | uniq ...which
>solves the problem when someone has multiple bonds of the
>same amount but doesn't help when the bonds are varying.
>    A perl solution would be a wonderful thing but I'm currently
>getting the data externally (open DINNER, "sort -rnt \"~\" -k 2
>$Temp | uniq |" or die "Can't fork: $!" ;) so I'll take any
>solution that works. (Yes, the above is cheezy.)
>    All suggestions (and requisite mocking) welcome.
>    Many thanks.

>    Matt

 
 
 

Sorting By Second Column With Unique First Column

Post by Matt Steinhof » Sat, 05 Dec 1998 04:00:00


.
    Many thanks. The `uniq -w 8' was exactly what I needed. Here's
what I finally ended up using...

sort -rnt "~" +0 +1 $Temp | uniq -w 8 | sort -rnt "~" -k 2

    Now once I figure out how to do that in perl, my life will
be worth living.

    Matt



> > There is probably an easy way to do this that I'm totally missing.

> Well, I was missing this too.  I spent a long time trying to relate
> uniq'ed line numbers and eventually sed -n print out those lines,
> however, your missing just one part:

> > sort -rnt "~" -k 2 $INPUT | uniq

> Just reverse numerically sort the whole damn thing, then uniq it based
> on inmate id, then re-reverse numerical sort what's left on the amount:

> sort -rn ${INPUT} |uniq -w 8 |sort -rnt '~' -k 2

> The only problem is that the uniq is based of the 8 character inmate #.

> --
> J. S. Jensen

> http://www.paramin.com

 
 
 

Sorting By Second Column With Unique First Column

Post by J. S. Jense » Sat, 05 Dec 1998 04:00:00



> sort -rnt "~" +0 +1 $Temp | uniq -w 8 | sort -rnt "~" -k 2

> Now once I figure out how to do that in perl, my life will be worth living.

Ah, don't tell me that :-) for I still have yet to write a line of perl.

--
J. S. Jensen

http://www.paramin.com

 
 
 

Sorting By Second Column With Unique First Column

Post by Al A » Mon, 07 Dec 1998 04:00:00


        sort has a switch  -u
                            unique


: > sort -rnt "~" +0 +1 $Temp | uniq -w 8 | sort -rnt "~" -k 2
: >
: > Now once I figure out how to do that in perl, my life will be worth living.

: Ah, don't tell me that :-) for I still have yet to write a line of perl.

: --
: J. S. Jensen

: http://www.paramin.com

--
=-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-
al aab, seders moderator                                      sed u soon
               it is not zat we do not see the  s o l u t i o n          
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-+

 
 
 

1. sed/awk : need just the first column in a multi-column file


assuming you will have a space as the delimiter for your data ...
here it is in AWK.

AWK script would be ...
{ print $1 }

and then you would execute this on the command-line ...

AWK -f [script file] [data file] > output.file (if you want to save the
output)

note:  there are other elegent ways of doing the above, but this
        example is probably the easiest way to think about it.

-Bob-

Bob Angell                               | Data Integration (multi-platform)
Principal, Management Systems Engineer   | Health Systems Engineering
Applied Information & Management Systems | Database design/development
1238 Fenway Avenue - SLC, UT 84102-3212  | Simulation/Modeling/Neural Nets

     [Standard Disclaimer: Speaking for the University of Utah, NOT!]

2. How to change my prompt?

3. AIX Script to Summarize By First Column By Adding Values in Numerical Columns

4. e2fsck consistently finds corrupt inodes...

5. sort -n with more than one column

6. sockd

7. Sorting by two columns

8. Sendmail "network unreachable"

9. How to have sort start sorting froma given column?

10. SORT command > sorting decimal numbers in columns on a file not working !

11. Newbie: adding and averaging columns, based on feature in column 2 (awk?)

12. arrange columns and find average of each column

13. setting 100-column & 132-column modes