: Anyone know how I can invoke wget
: to get all urls in, say, a home directory and all
: subdirectories only, with a particular key word in
: the file name?
: Something like:
: wget http://somewhere.com/~joe/*football*.html
: where I'm looking for web pages with the word "football"
: somewhere in them (i.e., somewhere in the web page FILE NAME).
: In this instance I'd like to retrieve pages such as
: scores-football.html, home_football.html, footballquotes.html, etc.
I'm not a wget expert, but as far as I know this is impossible. The
reason is that you can't use wildcards in HTTP URLs at all: HTTP has no
directory-listing operation, so if, say, http://somewhere.com/~joe/index.html
exists, the actual contents of the ~joe directory are completely hidden from
the client. What you probably want is for wget to download only HTML files
whose names match a certain pattern while recursing with the -r option,
which would be feasible in principle. I don't think wget currently supports
that, though (only file extensions and domains are matched against). Why not
join the wget mailing list or contact the author? You might get more
information there; the addresses are in the docs.
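For what it's worth, some wget versions do let you combine recursive
retrieval with an accept pattern. Assuming a build whose -A option accepts
shell-style wildcards (GNU Wget does, when the argument contains wildcard
characters), something along these lines might work; treat it as a sketch,
not gospel:

```shell
# Recurse from ~joe's directory, keeping only files whose names
# contain "football" and end in .html.
#   -r           recursive retrieval
#   -l1          descend at most one level
#   -np          never ascend to the parent directory
#   -A pattern   accept only matching file names (wget still has to
#                fetch the pages it follows links through)
wget -r -l1 -np -A '*football*.html' http://somewhere.com/~joe/
```

Note that wget can only follow links it actually finds in the pages it
downloads, so any matching file that isn't linked from ~joe's pages will
still be missed.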
Hope this helps,