Wget recursive is strange?

Post by Gisbert Berge » Fri, 14 Nov 1997 04:00:00



I've tried to use wget, but it behaves very strangely for me.
I've read the man page, and as far as I can tell I have to give the
command like this (example):

wget -r -l0
ftp://ftp.germany.eu.net/pub/os/Linux/Distributions/Slackware/slakwar...

to get the data from this directory and from all its subdirectories.
But it really does not. It gets only the files (not the directories)
from this directory and then (very strangely) downloads only the files
of all the parent directories, without recursing into any subdirectory.
When it reaches the top level (in the example, ftp://ftp.germany.eu.net/)
it starts to download all the files in that directory. This is really
not what I wanted. Does anybody know what goes wrong here?

------------------------------------------------------------

Post by Gisbert Berge » Fri, 14 Nov 1997 04:00:00



> command like this (example):

> wget -r -l0
> ftp://ftp.germany.eu.net/pub/os/Linux/Distributions/Slackware/slakwar...

> to get the data from this directory and from all its subdirs. But
> it really does not. It get only the files (not dirs) from this

OK, I have it. I tried some switches; here is the result:

wget -r -nH --no-parent -l0 ftp://hostname.ext/dir/subdir/

will do the job (but this is a very strange default behavior...).
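For reference, the flags in that command can be annotated one by one. This is only a sketch: hostname.ext/dir/subdir/ is the placeholder from the post above, not a real host, so the snippet just prints the command instead of running it.

```shell
#!/bin/sh
# Annotated sketch of the working invocation; hostname.ext is a placeholder.
url="ftp://hostname.ext/dir/subdir/"

opts="-r"                 # recurse into subdirectories
opts="$opts -nH"          # no host directory: don't create hostname.ext/ locally
opts="$opts --no-parent"  # never ascend above the starting directory (the key fix)
opts="$opts -l0"          # recursion depth 0 = unlimited

echo "wget $opts $url"    # prints the full command; run it directly to download
```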

------------------------------------------------------------

Post by Caranica A.N.D. Robert Cristia » Sat, 15 Nov 1997 04:00:00



> I'm tried to use wget, but it works very strange for me.
> [...]
> Anybody knows what goes wrong here?

> ------------------------------------------------------------



 Use wget -np <URL>  (-np = no parent, i.e. it won't cd ..)
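One note on this suggestion: -np (--no-parent) only restricts which links wget is allowed to follow, so on its own it does not recurse; it has to be combined with -r. A minimal sketch, with a placeholder URL standing in for the <URL> above:

```shell
#!/bin/sh
# -np is the short form of --no-parent; it only matters together with -r.
# The URL is a placeholder, so the sketch just prints the command.
url="ftp://hostname.ext/dir/subdir/"

cmd="wget -r -np $url"    # -r enables recursion, -np keeps it below $url

echo "$cmd"
```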

Post by Steve Bryan » Tue, 18 Nov 1997 04:00:00




> : I'm tried to use wget, but it works very strange for me.
> : [...]
> : Anybody knows what goes wrong here?

> Or you can try something like this; it works fine for me. :) Edit a file
> with your text editor and put the URL in it, then simply do:
> wget -o log.log -i file.txt -t 0 -r -l <number> -c &
> where file.txt is the file where you put the requested URL, and
> <number> is the recursion depth: 1 will download only one level below
> the requested URL, 2 will do two levels, and 0 will take all of it. :)
> log.log is the file where you can see how much has been downloaded,
> -c will resume a broken download, and the & runs it in the
> background. :) See ya, hope this helps...
> --
> Blue.
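The quoted recipe can be sketched like this. The file names (file.txt, log.log) are the ones used in the quote; the URL written into file.txt is a placeholder, so the snippet only prints the command rather than running it.

```shell
#!/bin/sh
# Sketch of the quoted recipe: read URLs from a file (-i), log to log.log (-o),
# retry forever (-t 0), recurse two levels (-r -l 2), resume partial files (-c).
printf '%s\n' "ftp://hostname.ext/dir/subdir/" > file.txt   # placeholder URL

cmd="wget -o log.log -i file.txt -t 0 -r -l 2 -c"
echo "$cmd &"    # the trailing & would run the transfer in the background

rm -f file.txt   # clean up the sketch's scratch file
```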

Did you try the "--no-parent" option?  It's in the manpage...

FYI:
I've found that using wget recursively on FTP sites through a proxy
produces strange results: the proxy generates HTML listings of the
directories, which confuses wget, so it saves the listing as a file
instead of creating a directory of the same name. If the (local)
directory already exists, it creates "index.html" files which weren't
on the original site!
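One possible workaround for this proxy behavior, assuming direct FTP access is available at all: make wget bypass the proxy for that run, so the FTP server sends real directory listings instead of proxy-generated HTML. wget honors the ftp_proxy/http_proxy environment variables and also accepts a --no-proxy option; the URL below is a placeholder, so the sketch only prints the command.

```shell
#!/bin/sh
# Sketch: bypass the proxy so wget sees real FTP directory listings.
unset ftp_proxy http_proxy     # wget reads these environment variables

cmd="wget --no-proxy -r -nH --no-parent ftp://hostname.ext/dir/subdir/"
echo "$cmd"
```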

                        Steve

--
Steve Bryant
Internet/WWW Solutions, Application Services Europe,
Hewlett-Packard GmbH, Boeblingen, Germany


Post by Nuno Loureiro » Wed, 26 Nov 1997 04:00:00



> I'm tried to use wget, but it works very strange for me.
> [...]

Try:

wget -m -nH -x -o slack.log --no-parent ftp://ftp.germany.eu.net/pub/os/Linux/Distributions/Slackware/slakwar...  &

This will get a1/ and its respective subdirectories.
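A flag-by-flag sketch of that mirroring command. The log name (slack.log) is the one from the post; the URL is shortened to the visible prefix since the original is truncated, so the snippet only prints the command.

```shell
#!/bin/sh
# Annotated sketch of the mirroring command above.
url="ftp://ftp.germany.eu.net/pub/os/Linux/Distributions/Slackware/"

opts="-m"                  # mirror: recursive, unlimited depth, with timestamping
opts="$opts -nH"           # don't create a local directory named after the host
opts="$opts -x"            # force creation of the remote directory hierarchy
opts="$opts -o slack.log"  # write progress messages to slack.log
opts="$opts --no-parent"   # stay below the starting directory

echo "wget $opts $url &"   # & runs the transfer in the background
```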

-----
Nuno Andre Henriques Loureiro

PGP FingerPrint: 85 B2 B7 DA 28 C0 D9 BC  E8 4D DC 23 8E 2B 72 B4


wget and recursive page grabbing

I am trying to get an archive of student work. I am using the
following:

wget -l 4 -r http://literacy.english.louisville.edu/~stacy/english102_Spring2002/1...

Here is the output I get; it only grabs two files:

Can someone show me a sample wget command line that will work in this
instance?

Thanks.

wget -l 3 -r http://literacy.english.louisville.edu/~stacy/english102_Spring2002/1...
--17:49:44--  http://literacy.english.louisville.edu/%7Estacy/english102_Spring2002...
           => `literacy.english.louisville.edu/%7Estacy/english102_Spring2002/102students.htm'
Resolving literacy.english.louisville.edu... done.
Connecting to literacy.english.louisville.edu[136.165.63.101]:80...
connected.
HTTP request sent, awaiting response... 200 OK
Length: 3,763 [text/html]

100%[========================================================================================>]
3,763         52.50K/s    ETA 00:00

17:49:44 (52.50 KB/s) -
`literacy.english.louisville.edu/%7Estacy/english102_Spring2002/102students.htm'
saved [3763/3763]

Loading robots.txt; please ignore errors.
--17:49:44--  http://literacy.english.louisville.edu/robots.txt
           => `literacy.english.louisville.edu/robots.txt'
Reusing connection to literacy.english.louisville.edu:80.
HTTP request sent, awaiting response... 404 Not Found
17:49:44 ERROR 404: Not Found.

--17:49:44--  http://literacy.english.louisville.edu/%7Estacy/english102_Spring2002...
           => `literacy.english.louisville.edu/%7Estacy/english102_Spring2002/smilefacebackground.jpg'
Connecting to literacy.english.louisville.edu[136.165.63.101]:80...
connected.
HTTP request sent, awaiting response... 200 OK
Length: 6,832 [image/jpeg]

100%[========================================================================================>]
6,832         17.42K/s    ETA 00:00

17:49:44 (17.42 KB/s) -
`literacy.english.louisville.edu/%7Estacy/english102_Spring2002/smilefacebackground.jpg'
saved [6832/6832]
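The command in this post is truncated, so it is hard to say definitively why only two files were fetched; one common cause is that the page links to content wget will not follow without further options, or that the missing --no-parent/-p flags matter here. A hedged sketch of a command worth trying, built from the one file name that does appear in the log (102students.htm) and only printed, not run:

```shell
#!/bin/sh
# Hedged sketch for the HTTP case; the URL is reconstructed from the log above
# and may not match the exact page originally requested.
url="http://literacy.english.louisville.edu/~stacy/english102_Spring2002/102students.htm"

opts="-r -l 4"       # recurse up to four levels deep
opts="$opts -np"     # don't ascend above english102_Spring2002/
opts="$opts -p"      # also fetch page requisites (images, stylesheets)
opts="$opts -k"      # rewrite links in saved pages for local browsing

echo "wget $opts $url"
```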
