wget problem, url ending with '?'

Post by Bjoern » Sat, 11 Jan 2003 05:43:02



Hello,

I'd like to mirror a page with wget (version 1.8.2, cygwin, Windows
2000), however it seems it doesn't follow some links whose URL ends with
a '?'.

Is there any way around this? I didn't run it with the -E switch; I wonder
if that would make a difference (although it wouldn't really seem logical)?

Also, can I make it try again? With the -nc option, it stops on the
first page because that file already exists. I would have thought that it
would still recurse over the pages and just not save the files :-(

Any help would be greatly appreciated!

Thanks

Bjoern

 
 
 

Post by Alexander Mau » Sat, 11 Jan 2003 05:54:37


Bjoern wrote:

> I'd like to mirror a page with wget (version 1.8.2, cygwin, Windows
> 2000), however it seems it doesn't follow some links whose URL ends with
> a '?'.

Just try to set the URL in quotation marks:
wget "http://www.xyz.com?parameter"

Greetings
    Alexander
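
One reason quoting can matter: in a Unix-style shell (bash under Cygwin included), an unquoted '?' is a wildcard that matches any single character, so the shell may rewrite the argument before wget ever sees it. The sketch below shows the effect with a hypothetical scratch directory and file name. It only concerns URLs typed on the command line, not links that wget discovers by itself while recursing, and for a full URL an accidental match is unlikely in practice.

```shell
# Hypothetical demo: how the shell treats an unquoted '?'.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
touch pageX                  # a file that the pattern 'page?' happens to match

unquoted=$(echo page?)       # the shell expands the wildcard before echo runs
quoted=$(echo "page?")       # the quotes keep the '?' literal

echo "unquoted: $unquoted"
echo "quoted:   $quoted"
```

With the quotes in place the argument reaches the program unchanged, which is why quoting URLs on the command line is a reasonable habit.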

 
 
 

Post by Bjoern » Sat, 11 Jan 2003 07:00:07



> Bjoern wrote:

>> I'd like to mirror a page with wget (version 1.8.2, cygwin, Windows
>> 2000), however it seems it doesn't follow some links whose URL ends
>> with a '?'.

> Just try to set the URL in quotation marks:
> wget "http://www.xyz.com?parameter"

Thanks, but I think I can't do that, as the URL in question is not the
'base' document, but a link to be followed?

Bjoern

 
 
 

Post by Alexander Mau » Sat, 11 Jan 2003 07:33:37


Bjoern wrote:

> Thanks, but I think I can't do that, as the URL in question is not the
> 'base' document, but a link to be followed?

Why is this a problem?
wget follows links just as well as it fetches normal HTML pages...
(I use it with such a URL, too.)

Greetings
    Alexander

 
 
 

Post by Bjoern » Sat, 11 Jan 2003 09:00:38



> Bjoern wrote:

>> Thanks, but I think I can't do that, as the URL in question is not the
>> 'base' document, but a link to be followed?

> Why is this a problem?
> wget follows links just as well as it fetches normal HTML pages...
> (I use it with such a URL, too.)

I'm not sure, maybe it's a coincidence. But it didn't seem to follow the
links that ended with '?' (there are many of them), while it followed lots
of other links on the same page.

I don't know if it's because of the '?', but I couldn't find any other
distinguishing aspect of the URLs. They all have a similar path and
belong to the same domain.

Bjoern

 
 
 

Post by Erik Ljungstr? » Sat, 11 Jan 2003 09:32:05



> Don't know if it's because of the '?', but I couldn't find any other
> distinguishing aspect of the URLs. They all have a similar path and
> belong to the same domain.

Have you established that it works with a "normal" browser?
What does wget say when attempting to fetch this link?
What wget version?

--

        -> ipv4: http://www.northernmost.org
        -> ipv6: http://freebsd.northernmost.org
        -> Norrköping, Sweden

 
 
 

Post by Raqueeb Hassan » Sun, 12 Jan 2003 01:30:49


you can use some switches to follow the link ...... the keyword is "follow"

raqueeb hassan
augusta, ga

 
 
 

Post by Bjoern » Sun, 12 Jan 2003 02:47:57



> you can use some switches to follow the link ...... the keyword is "follow"

I can't find any references to that. The only things I find in the --help
output are --follow-ftp and --recursive.

Documentation seems to be available only for version 1.5.3, so if it's a
new feature, I don't know.

Bjoern

 
 
 

Post by Bjoern » Thu, 16 Jan 2003 20:19:25




>>Don't know if it's because of the '?', but I couldn't find any other
>>distinguishing aspect of the URLs. They all have a similar path and
>>belong to the same domain.

> Have you established that it works with a "normal" browser?
> What does wget say when attempting to fetch this link?
> What wget version?

Sorry, took me a while to get back to this. I didn't have a log-file in
my first attempt.

Now I tried it again, and I find that wget doesn't even try to download
the respective pages. So there is no error message. I've looked at the
source of the originating page, but as far as I can tell, the links look
like normal 'hrefs', except for the trailing '?' in the URL.

The version of wget is 1.8.2, cygwin, Windows 2000

Bjoern
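
One way to narrow this down further is to list the links ending in '?' in a saved copy of the originating page and check whether each one shows up in the wget log at all. The following is a rough sketch only: the file names page.html and wget.log, the sample contents, and the log line format are all made up for illustration, and it assumes double-quoted href attributes and that the log records URLs exactly as they appear in the page.

```shell
# Hypothetical sample data standing in for the saved page and the wget log.
workdir=$(mktemp -d)
cat > "$workdir/page.html" <<'EOF'
<a href="http://www.xyz.com/a/one?">one</a>
<a href="http://www.xyz.com/a/two">two</a>
<a href="http://www.xyz.com/a/three?">three</a>
EOF
cat > "$workdir/wget.log" <<'EOF'
--12:00:00--  http://www.xyz.com/a/two
EOF

# Extract href values that end in '?' (in a basic regex, '?' is a literal).
links=$(grep -o 'href="[^"]*?"' "$workdir/page.html" | sed 's/^href="//; s/"$//')

# Keep only the links that never appear in the log (fixed-string search).
missing=$(printf '%s\n' "$links" | while IFS= read -r link; do
    grep -qF -- "$link" "$workdir/wget.log" || printf '%s\n' "$link"
done)

printf '%s\n' "$missing"
```

If every link ending in '?' turns up as missing while the others are logged, that supports the suspicion that the trailing '?' is what sets them apart.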

 
 
 

1. Regex does not 'see' ?ABC at end of URL

I need to redirect a URL that has ?ABC at the end, but the regex in my RedirectMatch directive does not see the ?ABC.
I have tried many regexes, including the following:
RedirectMatch (.*)\?ABC$   $1?XYZ
RedirectMatch (.*)ABC$     $1?XYZ
RedirectMatch (.*)?ABC$    $1?XYZ
But no redirection occurs.
Any ideas?

NOTE!
1. I cannot use REWRITE as FrontPage Extensions are affected.
2. $1?XYZ is not important in this example.
3. Though I am a newbie at regexes, my experiments work as intended UNTIL I add ?anything at the end of a URL.
4. The object of the exercise is to know when a URL has certain 'tracking' information on the end.
5. The URL in question appears on many, many sites throughout the web, so I cannot change it.

Regards
Tom
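
A likely explanation, going by Apache's mod_alias documentation, is that RedirectMatch tests its regex against the URL path only, so the query string (everything after the '?') is never part of what the pattern sees. The sketch below imitates that split in plain shell with a hypothetical request string. It illustrates the matching behaviour under that assumption and is not Apache configuration.

```shell
# Hypothetical request target, as a client would send it.
request='/some/page.html?ABC'

# Split at the first '?': the path is what a path-only matcher sees.
path=${request%%\?*}
case $request in
    *\?*) query=${request#*\?} ;;
    *)    query='' ;;
esac

# A pattern anchored on 'ABC$' fails against the path alone...
if printf '%s\n' "$path" | grep -Eq 'ABC$'; then path_match=yes; else path_match=no; fi
# ...but succeeds against the separately held query string.
if printf '%s\n' "$query" | grep -Eq '^ABC$'; then query_match=yes; else query_match=no; fi

echo "path=$path query=$query path_match=$path_match query_match=$query_match"
```

This would also explain why the experiments worked as intended until a ?anything suffix was added: the suffix falls outside the text being matched.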
