Improve specification of the ZIP format and compression specification #39

Closed
bubnikv opened this issue Apr 16, 2021 · 6 comments

Comments

@bubnikv
Contributor

bubnikv commented Apr 16, 2021

The 3MF core spec only vaguely references the ZIP format. OPF does not specify ZIP compression either:

http://idpf.org/epub/20/spec/OPF_2.0.1_draft.htm

Wikipedia is a tiny bit more specific:

https://en.wikipedia.org/wiki/Open_Packaging_Conventions

The ISO/IEC 29500-2:2008 specification and the second edition of ECMA-376 makes a normative reference to PKWARE, Inc.'s .ZIP File Format Specification version 6.2.0 (2004), and supplements it with a normative set of clarifications. Note: The older first edition of ECMA-376 makes an informative (i.e., non-normative) reference to the newer PKWARE Inc's ".ZIP File Format Specification" version 6.2.1 (2005).[1] The ZIP format is not specified by any international standard but has widespread community and developer acceptance.

First, we should standardize on DEFLATE as the compression algorithm. Wikipedia says:

https://en.wikipedia.org/wiki/ZIP_(file_format)

The ZIP file format permits a number of compression algorithms, though DEFLATE is the most common.

WinZip, starting with version 12.1, uses the extension .zipx for ZIP files that use compression methods newer than DEFLATE; specifically, methods BZip, LZMA, PPMd, Jpeg and Wavpack. The last 2 are applied to appropriate file types when "Best method" compression is selected.[27][28]

The same Wikipedia page contains an interesting "Standardization" section, which discusses enforcing the DEFLATE algorithm, disabling multi-part ZIPs, etc.

Second, the definition of the ZIP 64-bit extension is vague, and it is not clear whether 64-bit ZIP 3MFs are allowed. We have found that the 64-bit ZIP 3MFs produced by the miniz library, which PrusaSlicer uses, are not consumed by Microsoft tools. There was a discussion

https://github.com/3MFConsortium/archived_documents/tree/master/MeetingMaterials_44/Zip64_Issues_Materialise.zip

quite some time ago, but I suppose no resolution came of it.

@jordig100
Contributor

This is what is specified in the Core spec:

The ZIP archive MUST follow the .ZIP File Format Specification by PKWARE Inc.
Files within the ZIP archive that represents a 3MF document MUST use the compression method Deflate ("8 - The file is Deflated") or be stored uncompressed ("0 - The file is stored (no compression)") in accordance with the OPC specification ("Annex C, (normative) ZIP Appnote.txt Clarifications").

So, the only compression supported is DEFLATE.
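The rule quoted above (method 8 or method 0 only) is straightforward to check mechanically. The following is a minimal sketch using Python's standard `zipfile` module; the function name `non_conforming_entries` is made up for illustration and is not part of any 3MF tooling.

```python
import zipfile

# Compression methods permitted by the quoted Core spec text:
# 0 = stored (no compression), 8 = Deflate.
ALLOWED_METHODS = {zipfile.ZIP_STORED, zipfile.ZIP_DEFLATED}


def non_conforming_entries(archive):
    """Return (name, method) pairs for entries using any other method.

    `archive` is a path or a seekable binary file-like object.
    Only the central directory is read; no entry is decompressed.
    """
    with zipfile.ZipFile(archive) as zf:
        return [
            (info.filename, info.compress_type)
            for info in zf.infolist()
            if info.compress_type not in ALLOWED_METHODS
        ]
```

An empty result means every entry is either deflated or stored. The check says nothing about other requirements of the OPC clarifications.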

@bubnikv
Contributor Author

bubnikv commented May 20, 2021

Lukas Hejl of Prusa Research conducted an extensive study of both ZIP64 extension support and UTF-8 encoding support for Prusa Research and the 3MF Consortium.

In a nutshell:

  1. The ZIP64 extension, including the streaming extension, is not difficult to implement. It is supported by most ZIP consumers, with the exception of some Microsoft applications and the Craftware FDM slicer. Our writeup describes how we implemented the streaming extension so that the ZIP64 extension is used only when needed, so that Microsoft tools will still consume these ZIPs. We wish Microsoft would implement ZIP64 OPC support. Some interactive tools (for example Windows Explorer) also have issues manipulating ZIP64 archives (adding or removing files).

  2. Files with UTF-8 encoded names are correctly extracted by most ZIP consumers; however, many ZIP producers fail to encode file names in UTF-8 (for example Windows Explorer).
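
To illustrate what "ZIP64 only if needed" in point 1 can mean, here is a minimal sketch of the decision. It is not the implementation from the attached writeup. It relies only on the ZIP format's convention that a 32-bit field set to 0xFFFFFFFF (or a 16-bit field set to 0xFFFF) signals that the real value is stored in a ZIP64 record. The function names are hypothetical.

```python
# Sentinel values from the ZIP format: a field holding its maximum value
# means "look in the ZIP64 record instead", so real values that large
# cannot be stored in the classic fields.
MAX_U32 = 0xFFFFFFFF
MAX_U16 = 0xFFFF


def entry_needs_zip64(compressed_size, uncompressed_size, header_offset):
    """True if any per-entry field does not fit the classic 32-bit header."""
    return max(compressed_size, uncompressed_size, header_offset) >= MAX_U32


def archive_needs_zip64(entry_count, central_dir_size, central_dir_offset):
    """True if the end-of-central-directory record needs the ZIP64 variant."""
    return (entry_count >= MAX_U16
            or max(central_dir_size, central_dir_offset) >= MAX_U32)
```

A writer that knows sizes up front can call `entry_needs_zip64` before emitting each local header. A streaming writer does not know the sizes in advance, which is the harder case the writeup addresses.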

ZIP Report.pdf

@jordig100
Contributor

jordig100 commented May 20, 2021 via email

@bubnikv
Contributor Author

bubnikv commented Jun 3, 2021

We have found out that Microsoft 3D Builder accepts ZIP64 encoded 3MFs if the following conditions are fulfilled:

  • Streaming extension is NOT used together with ZIP64 extension.
  • Size of the ZIP64 archive after compression is not bigger than 4 GiB.
  • ZIP64 extension block has to be filled in both the Local file header and Central directory file header, even though that is not strictly required by the ZIP spec.
  • ZIP64 extension block in Central directory file header must not be longer than 20 bytes. For example, DotNetZip produces a longer record, and such ZIP64 is not accepted by Microsoft 3D Builder.

Microsoft 3D Builder does not refuse these ZIP64 files, but if the uncompressed archive is larger than 4 GiB, it hangs when loading the file.
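
Two of the conditions above (no streaming together with ZIP64, and a short ZIP64 block in the central directory) can be checked from the central directory data of each entry. The sketch below is only an illustration with hypothetical function names. It assumes that the 20-byte limit counts the 4-byte extra-field header plus two 8-byte sizes, which the comment above does not state explicitly, and it assumes that "streaming" corresponds to general purpose bit 3 (data descriptor).

```python
import struct

ZIP64_EXTRA_ID = 0x0001      # header ID of the ZIP64 extended information field
FLAG_DATA_DESCRIPTOR = 0x08  # general purpose bit 3: sizes follow the file data


def zip64_block_length(extra):
    """Total length (4-byte header + data) of the ZIP64 block, or None if absent."""
    pos = 0
    while pos + 4 <= len(extra):
        header_id, data_size = struct.unpack_from("<HH", extra, pos)
        if header_id == ZIP64_EXTRA_ID:
            return 4 + data_size
        pos += 4 + data_size  # skip over unrelated extra-field blocks
    return None


def builder_compat_issues(flag_bits, central_extra, max_block_len=20):
    """List the conditions violated by one central directory entry.

    max_block_len=20 is an assumption about how the limit is counted.
    """
    issues = []
    block_len = zip64_block_length(central_extra)
    if block_len is None:
        return issues  # entry does not use ZIP64, nothing to check
    if flag_bits & FLAG_DATA_DESCRIPTOR:
        issues.append("streaming used together with ZIP64")
    if block_len > max_block_len:
        issues.append("ZIP64 block longer than %d bytes" % max_block_len)
    return issues
```

With Python's `zipfile`, the `flag_bits` and `extra` attributes of each `ZipInfo` returned by `ZipFile.infolist()` can be passed in. The remaining conditions (archive size, and a ZIP64 block present in the local file header as well) are not covered by this sketch.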

@martinweismann
Member

Several parts of this issue are already covered by cfc50ff.

@martinweismann
Member

martinweismann commented Feb 9, 2023

Main issue resolved. See #68 for what's left.
