Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

getid3_quicktime::Analyze atom size/offset/data Discrepency when 64-bit size is specified #382

Open
codeprosoftware opened this issue May 15, 2022 · 1 comment

Comments

@codeprosoftware
Copy link

codeprosoftware commented May 15, 2022

Here's the specific issue I ran into, but it may also lead to more effected areas:
My getID3 version: 1.9.21-202109171300
(I'm currently locked to PHP 7.2, so can't upgrade to a newer version of getID3 due to the min PHP V7.4 restriction)

When trying to analyze the following QuickTime file:
https://www.dropbox.com/s/hbxf1dv9tf1ui3o/Against%20Me-%20We%27re%20Never%20Going%20Home%20-%20Song%20-%20I%20Still%20Love%20You%20Julie.m4v?dl=0

It detects a single chapter, but it's garbage data. In actuality, that file doesn't have any chapter information. I was able to isolate the issue down to the switch case for 'mdat' in the QuicktimeParseAtom method.

The atom for this particular item is found at offset 69589761 ( 0x425DB01 ). Looking at the file with a hex editor shows the full atom data (as far as I understand it looking through this code and several online references) to be:
00 00 00 01 6D 64 61 74 00 00 00 00 00 00 00 10

Then looking back at the Analyze method and referring to the QuickTime docs here: https://developer.apple.com/library/archive/documentation/QuickTime/QTFF/QTFFChap1/qtff1.html#//apple_ref/doc/uid/TP40000939-CH203-39013, it looks like the atom is using the special size value to indicate a 64-bit size is included with the header. Reading the correct atom size is handled currently via the check in Analyze for ($atomsize == 1) and reading the next 8 bytes as the extended header. Which sets $atomsize to 16. This leads to the problem in parsing this particular 'mdat', however, QuicktimeParseAtom is using a hardcoded value of 8 for a number of the length checks and other operations that I believe is meant to represent the length the atom header data is assumed to be. In this case, the header length is actually 16, because of the extra 8-byte size field. In this particular case for the more free-form check for chapters, it actually ends up reading data from the atom following the mdat atom, and trying to interpret that data as chapter information.

Examples of some of the hard-coded length of "8" that I think are intended to represent the header length in the 'mdat' case code, include:
while (($mdat_offset < (strlen($atom_data) - 8))
if (($atomsize > 8) && (!isset($info['avdataend_tmp']) || ($info['quicktime'][$atomname]['size'] > ($info['avdataend_tmp'] - $info['avdataoffset'])))) {

Based on this analysis, the atom data indicates that the inclusive atom size is 16, and the header size is 16, so that means the atom contains no actual data. That could be accounted for in the mdat processing section by either knowing what the header size was, and using it instead of hard-coding to 8, or by having $atom_data only contain the data portion of the atom (and removing any length subtractions of 8), which would be the size of the total ($atomsize) less the header length ultimately read. In either case for the while loop looking for the chapter data in my specific scenario, it would never enter it, since $mdat_offset would be 0 and the "data" length would calculate to 0.

I'm also assuming the second conditional I noted above could be effected by this where it checks for "($atomsize > 8)", since an empty mdat atom could be still be a total size of 16 like my case, but not have data.

This led me back to the Analyze method, and what I think may be a larger issue, not just for an atom that uses the extended size, but for atom parsing in general. QuicktimeParseAtom is called passing the result of "fread' with $atomsize as the amount to read for the $atom_data argument. But from what I can see, '$atomsize - [header size]' should actually be used in this call to fread, otherwise, the result will overrun the actual data content, reading extra data equal to the size of the atom header length.

Sorry for the long post. I wanted to describe the full issue and analysis I went through. As I went through that process, I started to realize that any fix I might try to put in place, even in my local copy could effect numerous other atom processing mechanisms that either use hard-coded values to account for header length or assume $atom_data is padded with extra bytes.

Based on the QuickTime spec I reference above, it does specifically mention 'mdat' as an example when talking about the extended size portion of the header, but it also doesn't say that special case to use the extended size is limited to mdat, either. And I don't know enough about the format for this processing code to determine if it could be an issue with containers (QuicktimeParseContainerAtom doesn't look to have the same check for a special 32-bit size of 1 and the extra read for a 64-bit value if found, but I don't know if that would be a valid atom definition for a container), or if other hard-coded values of "8" used throughout are related to header offsets or something completely different.

JamesHeinrich added a commit that referenced this issue Jun 21, 2022
@JamesHeinrich
Copy link
Owner

Sorry it took me so long to get around to looking at this.
I have made a small change, basically if the extended size field is present then read 8 bytes less into the data being parsed. I think this should fix the problem without requiring changes elsewhere.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants