wget - Rename the Directory Index of a web page downloaded with wget to index.html

Question

I am currently using a fairly complicated wget command, but the essence of it is the -p and -k flags, which download all the page requisites and convert the links for local viewing. How do I rename the main downloaded file to index.html?

For instance, say I download this web page:

http://myawesomewebsite.com/something/derp.html

This will, for example, download:

derp.html
style.css
firstimage.png
secondimage.jpg

And maybe even an iframe with its own files:

iframe.html
iframe-style.css

So the question is: how do I rename derp.html to index.html without accidentally renaming iframe.html as well, given that I don't know in advance what name the main page will be saved under?

When I tried this on a Tumblr page with the URL http://something.tumblr.com/34324/post, it was saved as page.html.

I've tried the --output-document flag, but that results in nothing being downloaded at all.

Thanks!

score 0 · Accepted Answer

This is what I ended up doing:

If no index.html was found after downloading, I used Ruby to extract the last path segment of the URL (the derp.html part), searched the downloaded files for that name, and renamed the match to index.html.

It's not as elegant as I would like, but it works.
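
For anyone wanting to try the same thing, here is a minimal sketch of the approach described above. It is not the exact script I used: the helper names (main_page_basename, rename_main_page) and the download directory argument are made up for illustration, and the extra check for a ".html" suffix is only an assumption to cover runs that use wget's --adjust-extension (-E) flag.

```ruby
require "uri"
require "find"

# Hypothetical helper: the last path segment of the URL, e.g. "derp.html".
# Returns nil when the URL ends in "/" or has no path, since there is
# no file name to derive in that case.
def main_page_basename(url)
  path = URI.parse(url).path.to_s
  return nil if path.empty? || path.end_with?("/")
  File.basename(path)
end

# Hypothetical helper: if download_dir has no top-level index.html,
# find the file named after the URL's last segment and rename it to
# index.html in the same directory. Returns the index.html path, or
# nil when no matching file was found.
def rename_main_page(url, download_dir)
  index = File.join(download_dir, "index.html")
  return index if File.exist?(index)

  name = main_page_basename(url)
  return nil if name.nil?

  # Assumption: with --adjust-extension, "post" may be saved as "post.html".
  candidates = [name, "#{name}.html"]
  match = Find.find(download_dir).find do |path|
    File.file?(path) && candidates.include?(File.basename(path))
  end
  return nil unless match

  target = File.join(File.dirname(match), "index.html")
  File.rename(match, target)
  target
end
```

Calling rename_main_page("http://myawesomewebsite.com/something/derp.html", "downloads") after wget finishes would rename downloads/derp.html and leave iframe.html alone, because only the name taken from the requested URL is matched. It will not help when wget saves the page under a name unrelated to the URL.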
