Newbie: adding and averaging columns, based on feature in column 2 (awk?)

Post by John Larso » Sat, 27 Sep 2003 19:17:22



Hi All,

I have a large number of data sets in which I would like to sum and average
the values in column 3 using a shell script, with the results calculated
per group as determined by the content of column 2:

Input:

q43;kwd1_inq_iso;698

q45;kwd1_inq_iso;817

q36;kwd1_inq_iso;821

q31;references;1430

q42;references;1471

q60;references;1578

q49;tbl_ti1_inq_iso;927

q43;tbl_ti1_inq_iso;956

q31;tbl_ti1_inq_iso;97

q42;trans_intro;1085

q60;trans_intro;1358

q31;trans_intro;1375

Desired output:

kwd1_inq_iso;2336;778.7

references;4479;1493.0

tbl_ti1_inq_iso;1980;660.0

trans_intro;3818;1272.7

Any help is highly appreciated,

- John

-------

RedHat Linux 9.0 on an Intel box

 
 
 

Post by Andreas Kahar » Sat, 27 Sep 2003 19:34:23



> I have a large number of data sets in which I would like to sum and average
> the values in column 3 using a shell script, with the results calculated
> per group as determined by the content of column 2:

Assuming there are no blank lines in the input:

awk 'BEGIN { FS=";" } { sum[$2] += $3; cnt[$2]++ }
     END { for (i in cnt) { print i, sum[i]/cnt[i] } }' indata

This prints only the group name and the average, separated by a
space, and with slightly higher precision than in your example.
If that is a problem, change the print statement to a printf
statement and provide a format string.
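
A minimal sketch of the printf variant suggested above, assuming the
desired layout is name;sum;average with one decimal place. The temporary
file and the `result` variable are illustrative and not part of the
original command. Because `for (i in cnt)` visits keys in an unspecified
order, the output is piped through sort here.

```shell
# Sample records from the original post, written to a temporary file.
indata=$(mktemp)
cat > "$indata" <<'EOF'
q43;kwd1_inq_iso;698
q45;kwd1_inq_iso;817
q36;kwd1_inq_iso;821
q31;references;1430
q42;references;1471
q60;references;1578
q49;tbl_ti1_inq_iso;927
q43;tbl_ti1_inq_iso;956
q31;tbl_ti1_inq_iso;97
q42;trans_intro;1085
q60;trans_intro;1358
q31;trans_intro;1375
EOF

# Accumulate the sum and the record count of column 3 per column-2 key,
# then print key;sum;average with the average rounded to one decimal.
result=$(awk 'BEGIN { FS = ";" }
     { sum[$2] += $3; cnt[$2]++ }
     END { for (i in cnt) printf "%s;%d;%.1f\n", i, sum[i], sum[i] / cnt[i] }' "$indata" |
  LC_ALL=C sort)
rm -f "$indata"
printf '%s\n' "$result"
```

With the sample data this prints the four lines listed under "Desired
output" in the original post.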

--
Andreas Kähäri

 
 
 

Post by Florian Stiass » Sat, 27 Sep 2003 20:05:16



awk -F';' '/^$/ {next}; {x[$2]+=$3;y[$2]+=1};
        END{for(n in x){printf("%s;%d;%-8.1f\n", n, x[n], x[n]/y[n])}}'
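
The command above names no input file, so it reads standard input. A
small sketch of how it behaves on input that contains blank lines,
assuming a shortened sample held in a hypothetical `input` variable: the
`/^$/ { next }` rule skips empty lines, so no group with an empty name is
created. The `%-8.1f` format left-justifies the average in an
eight-character field, which leaves trailing spaces; this sketch uses
`%.1f` instead to match the desired output exactly.

```shell
# Shortened sample with blank lines between records (illustrative data
# taken from the original post).
input='q43;kwd1_inq_iso;698

q45;kwd1_inq_iso;817

q31;references;1430
'

# Skip empty lines, accumulate sum (x) and count (y) per column-2 key,
# and print key;sum;average without padding. Sorting makes the order stable.
result=$(printf '%s' "$input" |
  awk -F';' '/^$/ { next }
       { x[$2] += $3; y[$2] += 1 }
       END { for (n in x) printf("%s;%d;%.1f\n", n, x[n], x[n] / y[n]) }' |
  LC_ALL=C sort)
printf '%s\n' "$result"
```

For this shortened sample the result is kwd1_inq_iso;1515;757.5 followed
by references;1430;1430.0.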


 
 
 

Post by John Larso » Sat, 27 Sep 2003 22:00:09


Thanks - it works beautifully :-)

- John




 
 
 

Post by John Larso » Sat, 27 Sep 2003 22:01:28


Thanks Florian - this also works beautifully :-)

- John


 
 
 
