repost: looking for software to purge "robot" hits from logs

Post by Peter » Wed, 15 Sep 1999 04:00:00



I didn't get any good answers last time around... is anyone aware of
reasonably good software for purging Web log files of requests by
robots? We're currently filtering hundreds of user agent strings and
IP addresses, but would like software that can better detect robots
and scrub our log files for more accurate page view stats.

Thanks.

-Peter

 
 
 

Post by Lachlan Cranswi » Thu, 16 Sep 1999 04:00:00



> I didn't get any good answers last time around... is anyone aware of
> reasonably good software for purging Web log files of requests by
> robots? We're currently filtering hundreds of user agent strings and
> IP addresses, but would like software that can better detect robots
> and scrub our log files for more accurate page view stats.

You could try http-analyze to generate your statistics - you can tell it
not to include certain types of robots, including user-defined ones.

http://www.netstore.de/Supply/http-analyze/

Lachlan.

Lachlan M. D. Cranswick

Collaborative Computational Project No 14 (CCP14)
    for Single Crystal and Powder Diffraction
Daresbury Laboratory, Warrington, WA4 4AD U.K
Tel: +44-1925-603703  Fax: +44-1925-603124

                           http://www.ccp14.ac.uk

 
 
 

Post by Jacob Sparre Anders » Thu, 16 Sep 1999 04:00:00



> I didn't get any good answers last time around... is
> anyone aware of reasonably good software for purging Web
> log files of requests by robots?

What about "grep"?

If you make sure to include the user agent identifier in
your log files, you can simply use the command

   grep -v -f robots < access_log > access_log.filtered

where "robots" is the file with your list of user agent
identifiers from robots, "access_log" is your log file, and
"access_log.filtered" is your log file without robots[1].
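
If you would rather do the same filtering in a script, here is a minimal Python sketch of the idea: drop every line whose user agent field contains one of the listed identifiers. It assumes the user agent is the last double-quoted field on each line (as in the common "combined" log layout) and that matching is a case-insensitive substring test. The function names, the sample lines, and "ExampleBot" are made up for illustration.

```python
import re

# Assumption: the user agent is the last double-quoted field on the line.
_UA_RE = re.compile(r'"([^"]*)"\s*$')

def load_patterns(lines):
    """Return lowercase, non-empty robot identifiers, one per input line."""
    return [ln.strip().lower() for ln in lines if ln.strip()]

def is_robot(log_line, patterns):
    """True if the line's user agent field contains any robot identifier."""
    m = _UA_RE.search(log_line)
    agent = m.group(1).lower() if m else ""
    return any(p in agent for p in patterns)

def filter_log(lines, patterns):
    """Yield only the lines that do not match a robot identifier."""
    for line in lines:
        if not is_robot(line, patterns):
            yield line

# Made-up sample data; a real run would read access_log and the robots file.
sample = [
    '192.0.2.1 - - [15/Sep/1999:04:00:00 +0000] "GET / HTTP/1.0" 200 512 '
    '"-" "Mozilla/4.6 [en] (X11; I; Linux)"',
    '192.0.2.2 - - [15/Sep/1999:04:00:01 +0000] "GET /robots.txt HTTP/1.0" 200 64 '
    '"-" "ExampleBot/1.0"',
]
patterns = load_patterns(["ExampleBot", ""])
kept = list(filter_log(sample, patterns))
```

Matching only against the user agent field, rather than the whole line, avoids throwing away ordinary requests whose URL or referrer happens to contain one of the identifiers.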

Good luck,

Jacob
--


----------------------------------------------------------------------------

--  National Laboratory Risø  --  Phone.: (+45) 46 77 51 23               --
--  Systems Analysis          --  Fax...: (+45) 46 77 51 99               --
----------------------------------------------------------------------------

 
 
 
