Home

openZIM CMS

We distribute thousands of ZIM files, with dozens of new or updated ones each day. All are created by the Zimfarm and published more or less automatically on https://download.kiwix.org. The catalog is automatically updated once a day: the static version at https://download.kiwix.org/library/library_zim.xml, and the dynamic (API/service) version, based on library_zim.xml, at https://library.kiwix.org (OPDS).

This works fine, but the automatic nature of the process limits us in several ways: we cannot easily toggle publication of a ZIM, ensure QA, fine-tune metadata or integrate with partners.

Goal

The Kiwix CMS is the publishing counterpart of the Zimfarm. The Zimfarm allows non-tech people to manage ZIM creation; the CMS is a Web tool allowing non-tech people to handle the publishing of ZIM files.

Publishing is all about managing the central catalog/library: how it is presented and where it is made available. It is neither about creation nor distribution.

Concepts

The core concept of the CMS is the Title, an entry in a Library. The Title describes a piece of content with a Name, Title, Description, Location and a few other metadata fields.

The ZIM Catalog lists Books, which are representations of a Title as ZIM files. Titles can be split across different flavors: with pictures, with videos, introduction only, etc. Each Book can also have multiple dated revisions.

Each ZIM file (Book) should be attached to a Title; this process of matching a Book to a Title is called “reconciliation”. Reconciliation would normally happen automatically via the ZIM Name metadata. Should it fail (there are many reasons why it might), it will be done manually. Each ZIM file (Book/revision) is identified by its UUID.
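The reconciliation step described above could be sketched as follows. This is only an illustration under stated assumptions: the `Title` and `Book` classes, their field names and the `reconcile` function are hypothetical and not part of any existing CMS code. It assumes a Book matches a Title when its ZIM Name metadata equals the Title's Name, and that anything else is left for manual reconciliation.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class Title:
    # Hypothetical model: assumed to equal the ZIM Name metadata of its Books.
    name: str


@dataclass
class Book:
    # Hypothetical model of one ZIM file (Book/revision).
    uuid: str                # UUID identifying the ZIM file
    zim_name: Optional[str]  # ZIM Name metadata, None if missing


def reconcile(books: List[Book], titles: List[Title]) -> Tuple[Dict[str, str], List[str]]:
    """Return ({book uuid: title name}, [uuids left for manual reconciliation])."""
    known_names = {title.name for title in titles}
    matched: Dict[str, str] = {}
    manual: List[str] = []
    for book in books:
        if book.zim_name in known_names:
            # Automatic reconciliation via the ZIM Name metadata.
            matched[book.uuid] = book.zim_name
        else:
            # Missing or unknown Name: a publisher has to attach it by hand.
            manual.append(book.uuid)
    return matched, manual
```

Keying the result on the UUID reflects the statement that each ZIM file is identified by its UUID; how a real implementation would store the link is left open.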

Besides the core Title management, the CMS has two other important concepts:

Ingesters are responsible for bringing Books (and optional associated metadata) into the CMS. The first one will be the Zimfarm ingester, which works off Zimfarm-generated ZIM files. Another one would allow manually adding a ZIM file.
Digesters are responsible for exporting views of the CMS database. The first one will generate the library.xml file for the central Catalog with the latest revisions of published Books. Others could generate an RSS feed or an XLS export of a subset of the data.
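As a rough sketch of what the first digester might do, the example below keeps the latest published revision per Title and flavor and serializes the result as XML. The `Revision` class, its fields, and the element and attribute names are assumptions made for illustration; they are not the actual library_zim.xml schema nor the CMS data model.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass
from typing import Dict, Iterable, List, Tuple


@dataclass
class Revision:
    # Hypothetical record for one dated revision of a Book.
    uuid: str
    name: str        # ZIM Name of the Title
    flavor: str
    date: str        # ISO date (YYYY-MM-DD), so string comparison orders revisions
    published: bool


def latest_published(revisions: Iterable[Revision]) -> List[Revision]:
    """Keep the most recent published revision per (name, flavor)."""
    latest: Dict[Tuple[str, str], Revision] = {}
    for rev in revisions:
        if not rev.published:
            continue
        key = (rev.name, rev.flavor)
        if key not in latest or rev.date > latest[key].date:
            latest[key] = rev
    # Sort by key so the export is stable from one run to the next.
    return [latest[key] for key in sorted(latest)]


def digest_library_xml(revisions: Iterable[Revision]) -> str:
    """Serialize the selected revisions; tag and attribute names are illustrative."""
    root = ET.Element("library")
    for rev in latest_published(revisions):
        ET.SubElement(root, "book", id=rev.uuid, name=rev.name,
                      flavor=rev.flavor, date=rev.date)
    return ET.tostring(root, encoding="unicode")
```

Separating the selection (`latest_published`) from the serialization would let other digesters, such as an RSS feed, reuse the same view of the data.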

Additional modules

A notification handler
A “garbage collector” in charge of removing old revisions and detecting inconsistencies, such as revisions of the same Title in two different directories or other kinds of oddities.
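The two garbage collector duties named above could look roughly like this. The function names, the input shapes and the retention policy (keeping the two newest revisions) are assumptions for the sake of the example; the source does not specify how many revisions are kept.

```python
import posixpath
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple


def titles_in_multiple_directories(files: Iterable[Tuple[str, str]]) -> Dict[str, List[str]]:
    """files holds (title name, ZIM path) pairs.

    Return the Titles whose revisions live in more than one directory.
    """
    dirs: Dict[str, Set[str]] = defaultdict(set)
    for title, path in files:
        dirs[title].add(posixpath.dirname(path))
    return {title: sorted(found) for title, found in dirs.items() if len(found) > 1}


def revisions_to_remove(dated_paths: List[Tuple[str, str]], keep: int = 2) -> List[str]:
    """dated_paths holds (ISO date, path) pairs for one Title and flavor.

    Return the paths of every revision older than the `keep` newest ones.
    The default of 2 is an assumed policy, not one stated in this document.
    """
    newest_first = sorted(dated_paths, reverse=True)
    return [path for _, path in newest_first[keep:]]
```

Both functions only report what they find; whether files are actually deleted, or a publisher is notified first, is a separate decision.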

Features

Minimal user management and authentication, with only two roles: anonymous and publisher.
First-time ingestion of book collection from library.xml
Books CRUD
Metadata updates: Popularity, category, etc.
Manually publish/un-publish a book
Manually publish/un-publish a flavor (per book)
Manually publish/un-publish a ZIM file
Move storage location (only for local files)
Publish a ZIM automatically for files stored in HTTP, LOCAL, S3 (call to API from Zimfarm)
Define Book QA constraints (see zimcheck json output)
Check Book QA against constraints if required
Export catalog in XML (library_zim.xml)
Book Overview with basic search/filter
Logs for publish/un-publish actions with basic search/filter
Corresponding atom feed

Technical Stack

Python 3.8+
MariaDB server
Flask API/backend
Properly unit tested
OpenAPI spec
Vue.js frontend
Common code formatting and QA tools (CodeFactor, Codecov)
Dockerized
CI with GitHub Actions
CD with GitHub Actions and ghcr
