
Archiving social media accounts with ArchiveWeb.page

As part of the project Best practices for archiving social media in Flanders and Brussels, various tools were tested to archive social media platforms. This manual describes the tool ArchiveWeb.page for archiving social media.

Disclaimer: This manual was written between February and June 2021. Notice something that no longer works? Email Nastasia.

----

ArchiveWeb.page is a Chrome extension that turns your browser into a web archiving system. It is the successor to Webrecorder. Like its predecessor, it records a browsing session in the standard WARC format. As you interact with a webpage during the session, ArchiveWeb.page saves all content you encounter by capturing network traffic and browser processes. Because only content that is actually loaded in the browser is captured, the tool is time-intensive: if you want to capture all content from a social media account, such as comments, you have to open every post manually and click on the comments.

Requirements

  • Chrome browser;
  • an account on the social media platform.

Advantages

  • can be used on Windows, macOS, and Linux;
  • archives social media in the standard WARC format;
  • archiving is done directly in the browser;
  • can also be used to replay the web archives;
  • extensive documentation available;
  • creates a minimal set of metadata stored in the WARC file, such as used software, timestamp, URL, and page title;
  • includes an autopilot function for Facebook, Twitter, and Instagram.

Disadvantages

  • time-intensive if you want to capture all content from the social media platform;
  • certain elements of Facebook are blocked, such as opening photos, so they cannot be included in the web archive;
  • frequently crashes while scrolling through Facebook. This is likely a limitation imposed by Facebook.

Workflow

Step 1: Install the software

ArchiveWeb.page is a Chrome extension that you install via the Chrome Web Store.

  • Open Chrome.
  • Go to this link to the ArchiveWeb.page extension page and click Add to Chrome.

  • A window will appear asking you to confirm that you want to add the extension. Click on Add Extension.

  • Then pin the extension by clicking the puzzle piece at the top right and then the pin icon next to Webrecorder ArchiveWeb.page.

  • The extension is now in your browser next to the address bar.

Step 2: Capture the social media account

After installing ArchiveWeb.page you can use Chrome as a web archiving tool.

  • Create a collection for the account you want to capture.
    • Click on the ArchiveWeb icon and click the dropdown menu under Record To. Choose Create New Archive…

    • Give the collection a name.

  • Then go to the social media platform of the account you want to archive and log in.

  • Navigate to the social media account you want to archive and start recording. To do this, click the ArchiveWeb icon in the browser again. Tick the option Start With Autopilot and press Start.

  • After you press Start, ArchiveWeb reloads the page and begins saving the content. The autopilot function automatically scrolls down, opens posts, expands comments, and plays videos.

  • To make sure all content is preserved, you must open every post and photo and play all videos. On Facebook, not all comments are shown automatically, so you will also need to expand these if you want to save them. Also check that you see all comments instead of only the relevant ones. While performing these actions, ArchiveWeb will save more and more content.
  • If you want to end the session, click the ArchiveWeb.page icon again and press Stop.

Step 3: Export the web archive as a WARC file

After ArchiveWeb.page has archived the social media account, you can export the web archive in WARC format.

  • Again, click on the ArchiveWeb icon and select in the menu under Record To the collection you created in Step 2.

  • Then click on Browse Archive. You will see a list of pages you have archived.

  • On the left, choose Download and click on Download All as WARC Only.

  • Save the file. Note: ArchiveWeb.page suggests a file name ending in .warc, but the file is actually a gzip-compressed WARC file. Add .gz to the file name so that it ends in .warc.gz. You can also rename the file after downloading.

  • The web archive is saved!
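
The renaming described in the note above can also be scripted. The following is a minimal Python sketch, not part of ArchiveWeb.page itself: the function name fix_warc_extension and the file name in the usage note are hypothetical. It assumes only that a gzip-compressed file starts with the two gzip magic bytes, and it renames a .warc file to .warc.gz when that is the case.

```python
from pathlib import Path

GZIP_MAGIC = b"\x1f\x8b"  # the first two bytes of every gzip stream


def fix_warc_extension(path):
    """Rename a gzip-compressed .warc download to .warc.gz.

    Returns the final path. Files that are not gzip-compressed, or that
    do not end in .warc, are left untouched.
    """
    path = Path(path)
    with path.open("rb") as handle:
        is_gzip = handle.read(2) == GZIP_MAGIC
    if is_gzip and path.suffix == ".warc":
        target = path.with_name(path.name + ".gz")
        path.rename(target)
        return target
    return path
```

For example, calling fix_warc_extension("my-collection.warc") on a compressed download would return the path my-collection.warc.gz, while an uncompressed WARC file would keep its name.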

Result

The web archive can now be opened with WARC players such as ReplayWeb.page. Go to https://replayweb.page and open your WARC file.
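
Besides replaying the archive, it can be useful to check which URLs actually ended up in the file. The sketch below is an illustration that uses only the Python standard library; the function name iter_warc_headers is hypothetical. It assumes a gzip-compressed WARC file in which every record starts with a "WARC/" version line, followed by header lines, an empty line, and a body of Content-Length bytes. It ignores rarely used features such as folded header lines, so a dedicated library such as warcio is likely the more robust choice for real work.

```python
import gzip


def iter_warc_headers(path):
    """Yield one dict of WARC headers per record in a .warc.gz file."""
    with gzip.open(path, "rb") as stream:
        while True:
            line = stream.readline()
            if not line:
                return  # end of file
            if not line.startswith(b"WARC/"):
                continue  # skip the empty lines between records
            headers = {}
            while True:
                raw = stream.readline().rstrip(b"\r\n")
                if not raw:
                    break  # an empty line ends the header section
                name, _, value = raw.partition(b":")
                headers[name.decode("utf-8").strip()] = value.decode("utf-8").strip()
            # Skip the record body so the next loop starts at the next record.
            stream.read(int(headers.get("Content-Length", 0)))
            yield headers
```

Looping over iter_warc_headers("my-collection.warc.gz") (a hypothetical file name) and printing the WARC-Type and WARC-Target-URI of each record gives a quick overview of what was captured, for example whether the posts you opened by hand are present.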

Extension

The autopilot function of the ArchiveWeb.page extension sometimes needs a bit of help, especially with auto-scrolling.

It is also possible to use a JavaScript bookmarklet to expand all comments on a Facebook page or Facebook group.

Simple Auto Scroll extension

Use an extension in Chrome to scroll automatically.

Go to https://chromewebstore.google.com/detail/simple-autoscroll/fgecljolecpahpphjjhfhgiimljpkodo and click "Add to Chrome" to add Simple Auto Scroll to Chrome.

To use Simple Auto Scroll, navigate to the Facebook/web page to be archived and click on the Simple Auto Scroll icon at the top right.

Simple Auto Scroll has three scroll speeds: one click makes the page scroll down slowly, a second click switches to medium speed, and a third click to the fastest setting.

A fourth click stops the scrolling, as does clicking on the page itself.

To adjust the scroll speed, right-click the extension icon to open its context menu and select "Options".

The higher the number, the slower the scroll speed. The options also accept negative numbers.

Expanding Facebook comments

To automatically expand comments on Facebook, the "auto-scroll" bookmarklet by Jens-Ingo Farley can be used.

Bookmarklets are small scripts saved as a bookmark; you run one by clicking its button in the bookmark bar. More information about bookmarklets: https://support.mozilla.org/en-US/kb/bookmarklets-perform-common-web-page-tasks.

Go to http://com.hemiola.com/bookmarklet/ and drag the "Expand-All" button to your browser's bookmark bar.

If the bookmark bar is not activated, use the following shortcut to activate it.

Chrome or Chromium-based browsers: Ctrl+Shift+B

Then, go to the Facebook page or group to be archived and click the "Expand All" button in the bookmark bar.

The process stops automatically when the end of the page is reached, or when you press the Esc key.

The combination of the Expand All bookmarklet and the Simple Auto Scroll extension is a good alternative to the autopilot function of the ArchiveWeb.page extension.

It is recommended to have only the tab with the Facebook page or group to be archived open in the browser.

Do not set the auto-scroll speed too high, so that the ArchiveWeb.page extension has time to archive all links.

Use a computer with enough RAM and a fast internet connection.

This page was last modified on 3 October 2025.
