Releases: webplatform/content-converter

Initial release

23 Jul 22:08

With this release, we can migrate the full WebPlatform Docs MediaWiki content into a git repository.

It can process a large XML file that stores many documents and all of their revisions, write each revision to a text file, and commit each edit to git with the author and time of the original contribution.
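The streaming idea described above can be illustrated with a minimal sketch. It is written in Python for brevity and is not the library's own code (the library is PHP). The function names `iter_revisions` and `commit_command`, the sample XML, and the `example.invalid` e-mail domain are all assumptions made for illustration; only the `page`, `title`, `revision`, `timestamp`, `contributor`, `username`, `ip`, and `text` element names come from the MediaWiki export format.

```python
import io
import xml.etree.ElementTree as ET


def local(tag):
    """Drop the XML namespace prefix, e.g. '{ns}page' -> 'page'."""
    return tag.rsplit("}", 1)[-1]


def iter_revisions(source):
    """Stream a MediaWiki export, yielding (title, author, timestamp, text)."""
    title = None
    for _event, elem in ET.iterparse(source, events=("end",)):
        name = local(elem.tag)
        if name == "title":
            title = elem.text
        elif name == "revision":
            fields = {local(child.tag): child for child in elem}
            author = None
            contributor = fields.get("contributor")
            if contributor is not None:
                for child in contributor:
                    # Logged-in edits carry <username>, anonymous ones <ip>.
                    if local(child.tag) in ("username", "ip"):
                        author = child.text
            timestamp = fields["timestamp"].text
            text = fields["text"].text or ""
            yield title, author, timestamp, text
            elem.clear()  # free the revision body once it has been handled
        elif name == "page":
            elem.clear()


def commit_command(author, timestamp, message):
    """Build (but do not run) a git commit preserving author and edit time."""
    # Hypothetical e-mail address; the real author mapping is project-specific.
    ident = f"{author} <{author}@example.invalid>"
    return ["git", "commit", f"--message={message}",
            f"--author={ident}", f"--date={timestamp}"]


# Made-up sample data, shaped like a MediaWiki export.
SAMPLE = """<mediawiki xmlns="http://www.mediawiki.org/xml/export-0.10/">
  <page>
    <title>css/properties/border</title>
    <revision>
      <timestamp>2013-05-01T10:00:00Z</timestamp>
      <contributor><username>Alice</username></contributor>
      <text>first draft</text>
    </revision>
    <revision>
      <timestamp>2013-05-02T11:30:00Z</timestamp>
      <contributor><username>Bob</username></contributor>
      <text>second draft</text>
    </revision>
  </page>
</mediawiki>"""

revisions = list(iter_revisions(io.BytesIO(SAMPLE.encode("utf-8"))))
for title, author, timestamp, text in revisions:
    print(commit_command(author, timestamp, f"Edit {title}"))
```

Because `iterparse` hands over each element as soon as it closes, the whole dump never has to be held in memory, which is what makes a large multi-revision file tractable.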

This release converts a MediaWiki dumpBackup XML file into one AbstractDocument per wiki page. You can loop through each document's revisions (AbstractRevision) and define how the author of each edit is declared (Author).

Features:

  • Ability to define how filenames are rewritten, so that there are no filesystem naming collisions for folders (see the WebPlatform\ContentConverter\Filter\AbstractFilter class)
  • Validation of AbstractFilter passes, to ensure each pass has the same number of matchers and substitution values
  • Abstract classes to start from for handling Documents, Revisions, and Authors, and for storing content in a new format (i.e. "Persistency").
  • Ensure MediaWiki edit timestamp is properly converted. See issue-7.
  • Ability to run a conversion in multiple passes:
    1. Convert every page that has been deleted or moved (so we get the history, but the output folder is still empty)
    2. Convert the full history
    3. Apply filters
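
The filename filter and its validation, as listed in the features above, can be sketched roughly as follows. This is a Python illustration under assumptions, not the actual AbstractFilter API: the class name `TitleFilter`, the methods `add_pass` and `apply`, and the example patterns are all hypothetical.

```python
import re


class TitleFilter:
    """Rewrite page titles into safe filenames using ordered regex passes."""

    def __init__(self):
        self.passes = []

    def add_pass(self, matchers, replacements):
        # Validation: every matcher needs exactly one substitution value.
        if len(matchers) != len(replacements):
            raise ValueError(
                f"{len(matchers)} matchers but {len(replacements)} replacements"
            )
        rules = [(re.compile(m), r) for m, r in zip(matchers, replacements)]
        self.passes.append(rules)
        return self  # allow chaining several passes

    def apply(self, title):
        # Passes run in the order they were added; rules run in order too.
        for rules in self.passes:
            for pattern, replacement in rules:
                title = pattern.sub(replacement, title)
        return title


# Hypothetical rules: whitespace becomes "_", brackets and colons are dropped,
# then a second pass collapses repeated underscores.
name_filter = (
    TitleFilter()
    .add_pass([r"\s+", r"[():]"], ["_", ""])
    .add_pass([r"_+"], ["_"])
)

print(name_filter.apply("CSS Properties (draft): border  width"))
```

Checking the counts when a pass is registered, rather than when it is applied, means a misconfigured pass fails before any file is written.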