How do I resume mirroring a website with wget?

I am using wget to download an entire website.
I used the following command (on Windows 7):

wget ^
     --recursive ^
     -A "*thread*, *label*" ^
     --no-clobber ^
     --page-requisites ^
     --html-extension ^
     --domains example.com ^
     --random-wait ^
     --no-parent ^
     --background ^
     --header="Accept: text/html" --user-agent="Mozilla/5.0 (Macintosh; Intel Mac OS X 10.8; rv:21.0) Gecko/20100101 Firefox/21.0" ^
         http://example.com/

      

After two days, my little brother restarted the PC, so I tried to resume the interrupted download. I added the following option to the command:

--continue ^

so the command now looks like this:

wget ^
     --recursive ^
     -A "*thread*, *label*" ^
     --no-clobber ^
     --page-requisites ^
     --html-extension ^
     --domains example.com ^
     --random-wait ^
     --no-parent ^
     --background ^
     --continue ^
     --header="Accept: text/html" --user-agent="Mozilla/5.0 (Macintosh; Intel Mac OS X 10.8; rv:21.0) Gecko/20100101 Firefox/21.0" ^
         http://example.com/

      

Unfortunately, it started a new job: it downloads the same files again and writes a new log file named

wget-log.1

Is there a way to resume mirroring the site with wget, or do I have to start all over again?

1 answer


Try the -nc parameter. It checks everything once more but does not download files that already exist.

I use this command to download a website:   wget -r -t1 domain.com -o log

I stopped the process and wanted to resume it, so I changed the command to:   wget -nc -r -t1 domain.com -o log



The log then contains lines like this:   File .... already there; not retrieving.

I checked the logs, and it seems that after about 5 minutes of this checking it starts downloading new files.
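As a rough sketch of the resume step, the command can be built and printed first so it can be inspected before running. The function name build_resume_cmd is hypothetical, and switching the log option from -o to -a (append instead of overwrite) is my own assumption, not part of the commands above; domain.com and the log file name are taken from them.

```shell
#!/bin/sh
# Sketch only: prints the resume command instead of running it.
# build_resume_cmd is a hypothetical helper name.

build_resume_cmd() {
    # $1 = site to mirror, $2 = log file
    # -nc  do not download files that already exist locally
    # -r   follow links recursively
    # -t1  try each URL only once
    # -a   append to the existing log (assumption; the original used -o,
    #      which overwrites the log on every run)
    printf 'wget -nc -r -t1 %s -a %s\n' "$1" "$2"
}

resume_cmd=$(build_resume_cmd domain.com log)
echo "$resume_cmd"
```

Run the printed line from the same directory as the first download, since -nc can only skip files it finds at the same local paths.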

I am using this guide for wget: http://www.linux.net.pl/~wkotwica/doc/wget/wget_8.html
