web archiving

The act of creating archives of web content.

Wget and WARC

WARC reader: https://replayweb.page

--span-hosts lets the recursive crawl follow links to other hosts

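--no-clobber skips files that have already been downloaded (not sure if it works here: as far as I can tell, wget warns that WARC output does not work with --no-clobber and disables it)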

--wait=0.1 with --random-wait (a short, randomized pause between requests) can be used instead of --limit-rate=1M (a bandwidth cap) to throttle the crawl
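
The throttling alternative can be sketched as a small shell function. This is only an illustration: `crawl_politely` and the `WGET` variable are hypothetical names, and the option set is trimmed to the minimum needed to show the idea. Setting `WGET=echo` prints the command instead of running it.

```shell
#!/bin/sh
# Hypothetical helper: recursive crawl throttled by pauses between
# requests rather than by a bandwidth cap.
# WGET can be overridden (e.g. WGET=echo) to print the command
# without touching the network.
crawl_politely() {
    url=$1
    "${WGET:-wget}" --recursive \
        --level=inf \
        --page-requisites \
        --wait=0.1 \
        --random-wait \
        --warc-file=test \
        "$url"
}
```

Calling `crawl_politely https://example.org/` (a placeholder URL) would start the crawl. With --random-wait, wget varies each pause around the --wait value instead of using a fixed interval.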

wget --recursive \
      --level=inf \
      --no-clobber \
      --warc-file=test \
      --warc-cdx \
      --execute robots=off \
      --page-requisites \
      --adjust-extension \
      --directory-prefix=. \
      --user-agent=Mozilla \
      --limit-rate=1M \
      --continue \
      URL
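
With --warc-file=test, wget writes a gzip-compressed test.warc.gz, and --warc-cdx adds a test.cdx index next to it. Before loading the archive into a reader, a quick look inside can confirm that something was captured. The sketch below assumes a `grep` that supports `-a` (GNU and BSD grep do); `warc_urls` and `warc_response_count` are hypothetical helper names, not part of wget.

```shell
#!/bin/sh
# Hypothetical helpers for a quick look inside a gzip-compressed WARC file.

# Print the distinct target URIs found in the record headers.
warc_urls() {
    gzip -dc "$1" \
        | grep -a '^WARC-Target-URI:' \
        | tr -d '\r' \
        | sed 's/^WARC-Target-URI: *//' \
        | sort -u
}

# Count the records of type "response" (the captured server replies).
warc_response_count() {
    gzip -dc "$1" | grep -a -c '^WARC-Type: response'
}
```

For example, `warc_response_count test.warc.gz` prints the number of captured responses, and `warc_urls test.warc.gz` lists what was fetched. The match is on header lines only by convention; a payload line that happens to start with the same text would also be counted.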
