
wacz-exhibitor 🏛️

Experimental proxy and wrapper boilerplate for safely and efficiently embedding Web Archives (.warc, .warc.gz, .wacz) into web pages.

This implementation:

  • Wraps Webrecorder's client-side playback technology.
  • Serves, proxies and caches web archive files using NGINX.
  • Allows for two-way communication between the embedding website and the embedded archive using post messages.

<!-- Embedding a playback of archive.wacz -->
<iframe
  src="https://wacz.domain.ext/?source=archive.wacz"
  allow="allow-scripts allow-forms allow-same-origin"
>
</iframe>

See also: Live Demo, Blog post

Perma Tools



"It's a wrapper"

wacz-exhibitor serves an HTML document containing a pre-configured instance of <replay-web-page>, Webrecorder's client-side web archive playback system, pointing at a proxied version of the requested WARC/WACZ file.

For security reasons, the playback will only start if this HTML document is embedded in a cross-origin <iframe>. This prevents XSS in the context of an <iframe> that needs both allow-scripts and allow-same-origin.
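One way a page can tell whether it is framed by a document from another origin is sketched below. This is only an illustration of the idea, not wacz-exhibitor's actual check: the helper name isEmbeddedCrossOrigin is hypothetical, and it relies on the browser behavior that reading the location of a cross-origin parent throws a SecurityError.

```javascript
// Hypothetical helper: not taken from wacz-exhibitor's source code.
// Returns true only when `win` is framed by a document from another origin.
function isEmbeddedCrossOrigin(win) {
  // A top-level document is its own parent: it is not embedded at all.
  if (win.parent === win) {
    return false;
  }

  try {
    // Reading the parent's URL only succeeds for same-origin frames.
    // Browsers throw a SecurityError when the parent is cross-origin.
    void win.parent.location.href;
    return false;
  } catch (err) {
    return true;
  }
}

// Browser-only usage: warn when the document is not embedded cross-origin.
if (typeof window !== "undefined" && !isEmbeddedCrossOrigin(window)) {
  console.warn("Playback disabled: not embedded in a cross-origin <iframe>.");
}
```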

We recommend hosting wacz-exhibitor on a subdomain of the embedding website to avoid third-party cookie limitations:

  • www.domain.ext -> Has iframes pointing at wacz.domain.ext
  • wacz.domain.ext -> Hosts wacz-exhibitor

"It's a proxy"

wacz-exhibitor pulls and serves the requested archive file in the format required by <replay-web-page> (correct Content-Type, support for range requests, CORS resolution and Content Security Policy).

The requested web archive file can be sourced from either:

  • The local /archives/ folder. This is where the server will look first.
  • A remote location the server will proxy from, defined in nginx.conf.





Serves an HTML document containing an instance of <replay-web-page>, pointing at a proxied archive file.

Must be embedded in a cross-origin <iframe>, preferably on the same parent domain to avoid third-party cookie limitations.



Query parameters

| Name | Required? | Description |
| --- | --- | --- |
| source | Yes | Filename of the .warc, .warc.gz or .wacz. Can contain a path, but cannot be a URL. The file must either be present in the /archives/ folder or on the remote server defined in nginx.conf. |
| url | No | URL of a page within the archive to display. |
| ts | No | Timestamp of the page to retrieve. Can be either a YYYYMMDDHHMMSS-formatted string or a millisecond timestamp. |
| embed | No | <replay-web-page>'s embed mode. Can be set to replayonly to hide its UI. |
| deepLink | No | <replay-web-page>'s deepLink mode. |
| noSandbox | No | If set, will remove the sandbox from the <replay-web-page> iframe. May be necessary for certain playbacks, e.g. cross-browser compatible playbacks of PDFs. |
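The query parameters above can be assembled with the standard URL API rather than by string concatenation. The sketch below is a hypothetical helper (buildExhibitorUrl is not part of wacz-exhibitor), and the host name used in the usage note is a placeholder. It forwards only the documented parameter names and rejects a source that looks like a URL, since source must be a filename or path.

```javascript
// Parameter names documented for the embed route.
const EXHIBITOR_PARAMS = ["source", "url", "ts", "embed", "deepLink", "noSandbox"];

// Hypothetical helper: builds the `src` of a wacz-exhibitor <iframe>.
function buildExhibitorUrl(base, params = {}) {
  if (!params.source) {
    throw new Error("`source` is required");
  }
  // `source` may contain a path, but it cannot be a URL.
  if (/^[a-z][a-z0-9+.-]*:\/\//i.test(params.source)) {
    throw new Error("`source` must be a filename or path, not a URL");
  }

  const target = new URL(base);
  for (const name of EXHIBITOR_PARAMS) {
    const value = params[name];
    if (value !== undefined && value !== null) {
      // URLSearchParams takes care of percent-encoding each value.
      target.searchParams.set(name, String(value));
    }
  }
  return target.toString();
}
```

For instance, buildExhibitorUrl("https://wacz.domain.ext/", { source: "archive.warc.gz", url: "https://what-was-archived.ext/path" }) returns a URL whose url parameter is percent-encoded and decodes back to the original value when the query string is parsed.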


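<!-- On https://*.domain.ext: -->
<iframe
  src="https://wacz.domain.ext/?source=archive.warc.gz&url=https://what-was-archived.ext/path"
  allow="allow-scripts allow-forms allow-same-origin allow-downloads"
>
</iframe>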



Pulls, caches and serves a given .warc, .warc.gz or .wacz file, with full support for range requests.

Will first look for the given path and file in the local /archives/ folder and, if it is not found there, try to proxy it from the remote server defined in nginx.conf.
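Range support can be checked from a script by requesting the first bytes of an archive and looking for a 206 Partial Content response. The sketch below assumes a runtime with a global fetch (a browser, or Node.js 18 and later). The helper names and the URL in the usage note are hypothetical.

```javascript
// Builds fetch() options asking for an inclusive byte range.
function rangeRequestInit(start, end) {
  if (!Number.isInteger(start) || !Number.isInteger(end) || start < 0 || end < start) {
    throw new RangeError("Expected integers with 0 <= start <= end");
  }
  return { headers: { Range: `bytes=${start}-${end}` } };
}

// Resolves to true when the server honors the Range header for `fileUrl`.
async function supportsRanges(fileUrl) {
  const response = await fetch(fileUrl, rangeRequestInit(0, 1023));
  // 206 Partial Content, with a Content-Range header, signals a partial response.
  return response.status === 206 && response.headers.has("Content-Range");
}
```

Assuming a local instance serving a file named archive.wacz, supportsRanges("http://localhost:8080/archive.wacz").then(console.log) would log whether a partial response was returned.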



This project consists of a single Dockerfile derived from the official NGINX Docker image, which can be deployed on any Docker-compatible machine.


The following example describes the process of deploying wacz-exhibitor on Fly.io, a platform-as-a-service provider.

  1. nginx.conf needs to be edited. See comments starting with EDIT: in the document for instructions.
  2. Install the flyctl client and sign in, if not already done.
  3. Initialize and deploy the project by running the flyctl launch command (use flyctl deploy for subsequent deploys).
  4. wacz-exhibitor is now live and visible on the dashboard.
  5. We highly recommend setting up a custom domain and SSL certificate. This can be done directly from the dashboard. Ideally, the target domain should be a subdomain of the website on which wacz-exhibitor iframes are going to be embedded: for example, www.domain.ext embedding an <iframe> from wacz.domain.ext.


Local development

Example: Running wacz-exhibitor locally using docker

docker build . -t wacz-exhibitor-local
docker run --rm -p 8080:8080 wacz-exhibitor-local
# wacz-exhibitor is now accessible at http://localhost:8080


Development Sandbox

A minimal sandbox is available to test embedding wacz-exhibitor <iframe>s in webpages.

You may edit sandbox/index.html to make it point to a specific web archive file and run the following command to start the sandbox:

# Assuming: wacz-exhibitor is running on port 8080 ...
# The sandbox is now accessible at http://localhost:8000


Communicating with the embedded archive

wacz-exhibitor allows the embedding website to communicate with the embedded archive playback using post messages. All messages coming from a wacz-exhibitor <iframe> come with a waczExhibitorHref property, helping identify the sender.

This feature can be used to build interactive experiences using web archive files.

Messages interpreted by the wacz-exhibitor <iframe>

wacz-exhibitor will look for the following properties in messages coming from the embedding website and react accordingly:

| Property name | Expected value | Description |
| --- | --- | --- |
| updateUrl | String | If provided, will replace the current url parameter of <replay-web-page>. |
| updateTs | Number | If provided, will replace the current ts parameter of <replay-web-page>. |
| getCollInfo | Boolean | If provided, will send a post message back with <replay-web-page>'s collInfo object, containing meta information about the currently-loaded archive. |
| getInited | Boolean | If provided, will send a post message back with the current value of <replay-web-page>'s inited property, indicating whether or not the service worker is ready. |
| overrideElementAttribute | HTMLAttributeOverride | If provided, will look for the element with the specified CSS selector inside <replay-web-page> and, if found, apply the requested HTML attribute to it. If the element is not found, will send a post message back reporting "status": "timed out", along with a copy of the original message's data. |
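Properties such as getInited and getCollInfo follow a request/reply pattern: the embedding page posts a message, then waits for a message coming back from the same <iframe>. A minimal sketch of that pattern follows. The helper askPlayback is hypothetical, and the shape of the reply (here, an inited property on the message data) is an assumption that should be checked against the messages actually received.

```javascript
// Hypothetical helper: posts `message` to a wacz-exhibitor <iframe> and calls
// `onReply` with the first message from that <iframe> accepted by `isReply`.
// Returns a function that cancels the wait.
function askPlayback(hostWindow, iframe, message, isReply, onReply) {
  const targetOrigin = new URL(iframe.src).origin;

  function listener(event) {
    // Ignore messages from other windows, and messages that are not the reply
    // (for example, navigation updates forwarded from the playback).
    if (event.source !== iframe.contentWindow) return;
    if (!event.data || !isReply(event.data)) return;
    hostWindow.removeEventListener("message", listener);
    onReply(event.data);
  }

  // Listen before posting so that a fast reply cannot be missed.
  hostWindow.addEventListener("message", listener);
  iframe.contentWindow.postMessage(message, targetOrigin);

  return () => hostWindow.removeEventListener("message", listener);
}

// Browser-only usage. Assumption: the reply exposes an `inited` property.
if (typeof document !== "undefined") {
  const playback = document.querySelector("iframe.wacz-exhibitor");
  if (playback) {
    askPlayback(
      window,
      playback,
      { getInited: true },
      (data) => "inited" in data,
      (data) => console.log("Service worker ready:", data.inited)
    );
  }
}
```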

Messages hoisted from <replay-web-page>

wacz-exhibitor will forward to the embedding website every post message sent by <replay-web-page>'s service worker.

The most common example is the following, which is sent during navigation within an archive:

{
  "waczExhibitorHref": "https://wacz.domain.ext/?source=archive.warc.gz&url=https://what-was-archived.ext/path",
  "url": "https://what-was-archived.ext/new-path/",
  "view": "pages",
  "ts": "20220816162527"
}

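A navigation message like the one above can be turned into something easier to work with, for example by converting its ts field into a Date. The sketch below uses hypothetical helper names and assumes the 14-digit timestamp is expressed in UTC, as is conventional for web archive timestamps.

```javascript
// Converts a YYYYMMDDHHMMSS string into a Date, or returns null if malformed.
// Assumption: the timestamp is expressed in UTC.
function parseArchiveTimestamp(ts) {
  const match = /^(\d{4})(\d{2})(\d{2})(\d{2})(\d{2})(\d{2})$/.exec(String(ts));
  if (!match) return null;
  const [year, month, day, hour, minute, second] = match.slice(1).map(Number);
  // Date.UTC() expects a zero-based month.
  return new Date(Date.UTC(year, month - 1, day, hour, minute, second));
}

// Extracts the fields of interest from a forwarded navigation message.
// Returns null for messages that do not carry a `url` string.
function describeNavigation(data) {
  if (!data || typeof data.url !== "string") return null;
  return {
    exhibitor: data.waczExhibitorHref ?? null,
    url: data.url,
    view: data.view ?? null,
    capturedAt: data.ts ? parseArchiveTimestamp(data.ts) : null,
  };
}
```

In a message listener, describeNavigation(event.data) can then be used to update the embedding page, for instance to display the URL and capture date of the page currently being replayed.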
Example: Intercepting messages from a wacz-exhibitor <iframe>

// Assuming: there's only 1 <iframe class="wacz-exhibitor">  
const playback = document.querySelector("iframe.wacz-exhibitor");

window.addEventListener("message", (event) => {
  // This message bears data and comes from the `wacz-exhibitor` <iframe>
  if (event?.data && event.source === playback.contentWindow) {
    console.log(event.data);
  }
});

Example: Sending a message to a wacz-exhibitor <iframe>

// Assuming: there's only 1 <iframe class="wacz-exhibitor">  
const playback = document.querySelector("iframe.wacz-exhibitor");
const playbackOrigin = new URL(playback.src).origin;

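// Send the message to the <iframe>, restricted to its own origin
playback.contentWindow.postMessage(
  {"updateUrl": "https://what-was-archived.ext/new-path"},
  playbackOrigin
);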
