>Hi,
>My name is Van Nguyen. I'm knowledgeable about the Korn shell but have never
>written a script. Now I have to write one. The purpose of the script is this:
>I have an access log that records every hit to my company's website. What I
>want to do is count the browsers that hit our site every day. The following
>is an example of the access log:
>199.76.206.71 - - [20/May/1998:23:59:20 -0400] "GET
>/banners/4708_17_Jul_28_1997_151916.gif HTTP/1.1" 200 9220
>"http://yp.bellsouth.com/?mkt=MCMS" "Mozilla/4.0 (compatible; MSIE 4.01; MSN
>2.5; Windows 95)" 199.76.206.71 - - [20/May/1998:23:59:21 -0400] "GET
>/GFX/imap.gif HTTP/1.1" 200 1690 "http://yp.bellsouth.com/?mkt=MCMS"
>"Mozilla/4.0 (compatible; MSIE 4.01; MSN 2.5; Windows 95)" 199.76.206.71 - -
>[20/May/1998:23:59:19 -0400] "GET /GFX/nresad.gif HTTP/1.1" 200 1467
>"http://yp.bellsouth.com/?mkt=MCMS" "Mozilla/4.0 (compatible; MSIE 4.01; MSN
>2.5; Windows 95)" 199.76.206.71 - - [20/May/1998:23:59:16 -0400] "GET
>/GFX/nnarrow.gif HTTP/1.1" 200 1765 "http://yp.bellsouth.com/?mkt=MCMS"
>"Mozilla/4.0 (compatible; MSIE 4.01; MSN 2.5; Windows 95)" 199.76.206.71 - -
>[20/May/1998:23:59:28 -0400] "GET /GFX/opinion.gif HTTP/1.1" 200 722
>"http://yp.bellsouth.com/?mkt=MCMS" "Mozilla/4.0 (compatible; MSIE 4.01; MSN
>2.5; Windows 95)" 209.136.1.82 - - [20/May/1998:23:59:37 -0400] "GET
>/GFX/flap2.gif HTTP/1.0" 304 16501
>"http://www.gamesville.com/art_ad/92/bellsouth-go.htm" "Mozilla/4.04 [en]
>(Win95; I)" 152.202.15.144 - - [20/May/1998:23:59:46 -0400] "GET
>/GFX/retry.gif HTTP/1.0" 200 1775 "http://yp.bellsouth.com/?mkt=MBAL"
>"Mozilla/4.04 [en] (Win95; I)" 152.168.133.44 - - [20/May/1998:23:59:55
>-0400] "GET /GFX/anifinal.gif HTTP/1.0" 200 3075
>"http://www.yp.bellsouth.com/" "Mozilla/3.0C-GZone (Win95; I)" 209.136.1.82
>- - [20/May/1998:23:59:49 -0400] "GET /banners/50695_2_Dec_18_1997_083453.gif
>HTTP/1.0" 200 8394 "http://www.gamesville.com/art_ad/92/bellsouth-go.htm"
>"Mozilla/4.04 [en] (Win95; I)" 209.136.1.82 - - [20/May/1998:23:59:49 -0400]
>"GET /GFX/hobsrf.gif HTTP/1.0" 200 2189
>"http://www.gamesville.com/art_ad/92/bellsouth-go.htm" "Mozilla/4.04 [en]
>(Win95; I)" 206.49.117.90 - - [20/May/1998:23:59:32 -0400] "GET
>/GFX/honad.gif HTTP/1.0" 200 1584 "http://www.yp.bellsouth.com/"
>"Mozilla/4.04 [en] (Win95; I)"
>By knowing the kinds of browser, which are located at the end of each
>line, I want to figure out how many times the same type of browser hits us
>daily. The output should be something like:
>
>Browser Type                                      # Times used
>--------------------------------------------------------------------------
>"(compatible; MSIE 4.01; MSN 2.5; Windows 95)"    37,000
>"[en] (Win95; I)"                                 19,999
>. . . . etc.
>
>I know that there are more than one thousand types of browser hitting us
>daily.
>The following command is what I thought might work, but it is still missing
>a lot of pieces to accomplish the result:
>gzcat access.gz | cat access.gz | fgrep "type of browser" | wc -l (this is
>not enough to do what I want).
This Bourne script:
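#!/bin/sh
# 1. sed: put a '#' in front of every IP address to mark where a record starts.
# 2. tr:  join the wrapped lines (newline -> space), then split at each '#'.
# 3. sed: drop the empty first line; strip everything before the last quoted string.
# 4. sort | uniq -c: count how often each browser string occurs.
sed 's/[0-9][0-9]*\.[0-9][0-9]*\.[0-9][0-9]*\.[0-9][0-9]*/#&/g' temp22 |
tr '\012#' ' \012' | sed '1d;s/.*\("[^"]*"\)/\1/' | sort | uniq -c
# end of script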
produced this output from your sample data:
1 "Mozilla/3.0C-GZone (Win95; I)"
5 "Mozilla/4.0 (compatible; MSIE 4.01; MSN 2.5; Windows 95)"
4 "Mozilla/4.04 [en] (Win95; I)"
I'll leave it to you to figure it out and adjust it to fit your needs.
Personally, I thought the other free package was a better long-term
approach. YMMV.
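
If the real log keeps each record on one line (the sample looks as if it was
re-wrapped on its way into the post), a shorter pipeline may be enough. The
following is only a sketch under that assumption: it treats the last quoted
field of every line as the browser string, and count_agents is a made-up
helper name, not part of any package.

```shell
#!/bin/sh
# Sketch: count hits per browser string in an access log.
# Assumes one record per line, with the browser (User-Agent) string
# as the last double-quoted field on the line.

# count_agents is a hypothetical helper; it reads log lines on stdin.
count_agents() {
    # Split each line on double quotes. Because the line ends with a
    # closing quote, the second-to-last field is the browser string.
    awk -F'"' 'NF >= 3 { print $(NF-1) }' |
    sort |
    uniq -c |
    sort -rn    # most frequent browser first
}

# Usage, with the file name taken from the question:
#   gzcat access.gz | count_agents
```

One thing to check when adapting the sed/tr version instead: after tr, the
last record no longer ends in a newline, and some sed implementations skip an
unterminated final line, so the last hit may go uncounted.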
Chuck Demas
Needham, Mass.
--
Eat Healthy | _ _ | Nothing would be done at all,
Die Anyway | v | That no one could find fault with it.