15. July 2002
Customizing the file access
There were quite some requests from game developers and graphics-apps
developers who wanted various extensions to be included into the
zziplib library, but most of them were
only of specific usage. After some discussions we came up with a
model to customize the zziplib library
calls in a number of ways - and adding two or three arguments to
the zzip_* calls. The standard zziplib library
will actually call these *_ext_io functions with these extra arguments
to be set to zero.
The EXT feature describes a way to customize the extensions used in the magic wrapper to find a .ZIP file. It turned out that there are quite a number of applications that did chose the zip file format as their native document format and where just the file extension had been changed. This includes file types like Quake3 ".PK3" files from ID Software, the Java library files called ".JAR", and lately the OpenOffice-6 (resp. StarOffice-6) documents which carry xml-files along. Just build a zero-termined string-list of file-extensions and submit it to the _ext_io calls to let the zziplib find parts of those zip-documents automagically.
In quite some of these cases, it is very benefical to make use of the
o_modes functionality that allows to submit extra bit-options into
the zziplib - this includes options like
ZZIP_PREFERZIP
or even ZZIP_ONLYZIP
which
modifies the default behaviour of looking for real files first instead
of some within a zipped directory tree. Other bit-options include
ZZIP_CASELESS
to imitate win32-like filematching for a
zipped filetree.
Other wishes on zziplib circulated around obfuscation or access to zip-files wrapped in other data areas including encrpyted resources from other applications. This has been addressed with the IO-handlers that you can explicitly submit to the *_ext_io functions - the default will be posix-IO open/read/write and seek/tell. An application using zziplib can divert these to its own set of these calls - and it only needs to declare them on opening a zipped file.
Declaring an EXT stringlist is very simple as it is simply a
list of strings, the zziplib provides
you with a double-const zzip_strings_t
type to help
you move a global declaration into the writeonly segment of your
app - it turned out that about all developers wanted just some
extensions on the default and they were fine with having them
global-const for their application, nothing like dynamically
modifying them. Well, you are still allowed to make it fully
dynamic... if you find a use case for that.
Extending the magic zip-extensions is just done by adding the additional extensions to be recognized - just remember to add the uppercased variants too since those will be needed on (unx-like) filesystems that are case-sensitive. In the internet age, quite some downloaded will appear in uppercased format since the other side declared it as that and that other end was happy with it as being a (w32-like) case-insensitive server. Therefore, it should look like
static zzip_strings_t my_ext[] = { ".zip", ".ZIP", ".jar", ".JAR", 0 };
There is one frequently asked question in this area - how to open a zipped file as "test.zip/README" instead of "test/README". Other than some people might expect, the library will not find it - if you want to have that one needs a fileext list that contains the empty string - not the zero string, an empty string that is. It looks like
static zzip_strings_t my_ext[] = { ".zip", ".ZIP", "", 0 };
And last not least, people want to tell the library to not try to open a real file that lives side by side with the same path as the file path that can be matched by the zziplib. Actually, the magic wrappers were never meant to be used like - the developer should have used zzip_dir_* functions to open a zip-file and the zzip_file_* functions to read entries from that zip-file. However, the magic-wrappers look rather more familiar, and so you will find now a bit-option ZZIP_ONLYZIP that can be passed down to the _ext_io variants of the magic-wrapper calls, and a real-file will never get tested for existence. Actually, I would rather recommend that for application data the option ZZIP_PREFERZIP, so that one can enter debugging mode by unpacking the zip-file as a real directory tree in the place of the original zip.
While you will find the zzip_plugin_io_t declared in the zziplib headers, you are not advised to make much assumptions about their structure. Still we gone the path of simplicity, so you can use a global static for this struct too just like one can do for the EXT-list. This again mimics the internals of zziplib. There is even a helper function zzip_init_io that will copy the zziplib internal handlers to your own handlers-set. Actually, this is barely needed since the zziplib library will not check for nulls in the plugin_io structure, all handlers must be filled, and the zziplib routines call them unconditionally - that's simply because a conditional-call will be ten times slower than an unconditional call which adds mostly just one or two cpu cycles in the place so you won't ever notice zziplib to be anywhat slower than before adding IO-handlers.
However, you better instantiate your handlers in your application
and call that zzip_init_io on that instance to have everything
filled, only then modify the entry you actually wish to have
modified. For obfuscation this
will mostly be just the read()
routine. But one can
also use IO-handlers to wrap zip-files into another data part
for which one (also) wants to modify the open/close routines
as well.
Therefore, you can modify your normal stdio code to start using zipped files by exchaning the fopen/fread/fclose calls by their magic counterparts, i.e.
// FILE* file = fopen ("test/README", "rb");
ZZIP_FILE* file = zzip_fopen ("test/README", "rb");
// while (0 < fread (buffer, 1, buflen, file)))
while (0 < zzip_fread (buffer, 1, buflen, file)))
{ do something }
// fclose (file);
zzip_fclose (file);
and you then need to prefix this code with some additional code to support your own EXT/IO set, so the code will finally look like
/* use .DAT extension to find some files */
static zzip_strings_t ext[] = { ".dat", ".DAT", "", 0 };
/* add obfuscation routine - see zzxorcat.c examples */
static zzip_plugin_io_t io;
zzip_init_io (& io, 0);
io.read = xor_read;
/* and the rest of the code, just as above, but with ext/io */
ZZIP_FILE* file = zzip_open_ext_io ("test/README", O_RDONLY|O_BINARY,
ZZIP_ONLYZIP|ZZIP_CASELESS, ext, io);
while (0 < zzip_fread (buffer, 1, buflen, file)))
{ do something }
zzip_fclose (file);
What's more to it? Well, if you have some ideas then please mail me
about it - don't worry, I'll probably reject it to be part of the
standard zziplib dll, but perhaps it is worth to be added as a
configure option and can help others later, and even more perhaps
it can be somehow generalized just as the ext/io features have been
generalized now. In most respects, this ext/io did not add much
code to the zziplib - the posix-calls
in the implemenation, like "read(file)"
were simply
exchanged with "zip->io->read(file)"
, and the
old "zzip_open(name,mode)"
call is split up - the old
entry still persists but directly calls
"zzip_open_ext_io(name,mode,0,0,0)"
which has the
old implementation code with just one addition: when the ZIP_FILE
handle is created, it uses the transferred io-handlers (or the
default ones if io==0), and initialized the io-member of that
structure for usage within the zzip_read
calls.
This adds just a few bytes to the libs and just consumes additional cpu cycles that can be rightfully called to be negligible (unlike most commercial vendors will tell you when they indeed want to tell you that for soooo many new features you have to pay a price). It makes for greater variability without adding fatness to the core in the default case, this is truly efficient I'd say. Well, call this a German desease :-)=) ... and again, if you have another idea, write today... or next week.