subtitles parser doesn't handle time codes that include more than 3 digits in the milliseconds field #1997
Comments
[off-topic] quick observation regarding PBS.. I just checked a few previous episodes, and they all include both SRT and VTT captions with 3-digit millisecond fields.. so hopefully PBS will fix this itself.
quick update.. I can confirm that 6-digit millisecond fields in SRT files are also not parsed correctly..
to reproduce the sample SRT files, I performed the following steps to convert the VTT to SRT:
attachments:
code trace for VTT

String firstLine = webvttData.readLine();
Matcher cueHeaderMatcher = WebvttCueParser.CUE_HEADER_PATTERN.matcher(firstLine);
return parseCue(null, cueHeaderMatcher, webvttData, styles);

builder.startTimeUs = WebvttParserUtil.parseTimestampUs(Assertions.checkNotNull(cueHeaderMatcher.group(1)));
builder.endTimeUs = WebvttParserUtil.parseTimestampUs(Assertions.checkNotNull(cueHeaderMatcher.group(2)));

String[] parts = Util.splitAtFirst(timestamp, "\\.");
value += Long.parseLong(parts[1]);
return value * 1000;

fix for VTT
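To illustrate why the traced code shows cues at the wrong time (this is not the fix mentioned above, which is not reproduced on this page), here is a standalone sketch that mirrors the traced lines. The class and method names are hypothetical, and the hours/minutes/seconds handling is an assumption about the surrounding code; only the last three statements come from the trace.

```java
final class FractionAsMillisDemo {
  // Hypothetical stand-in for the traced logic: whatever digits follow the
  // '.' are added to the running total as if they were milliseconds.
  static long naiveTimestampUs(String timestamp) {
    String[] halves = timestamp.split("\\.", 2);
    long valueMs = 0;
    // Assumed handling of the hh:mm:ss part (not shown in the trace).
    for (String part : halves[0].split(":")) {
      valueMs = valueMs * 60 + Long.parseLong(part);
    }
    valueMs *= 1000;
    if (halves.length == 2) {
      valueMs += Long.parseLong(halves[1]); // "100000" is read as 100000 ms
    }
    return valueMs * 1000; // milliseconds to microseconds
  }

  public static void main(String[] args) {
    System.out.println(naiveTimestampUs("00:00:00.100"));    // 100000 (0.1 s)
    System.out.println(naiveTimestampUs("00:00:00.100000")); // 100000000 (100 s)
  }
}
```

With a 6-digit field, a cue meant for 0.1 s lands at 100 s, which matches the "captions don't display at the correct time" symptom.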
code trace for SRT

long startTimeUs;
long endTimeUs;
Matcher matcher = SUBRIP_TIMING_LINE.matcher(currentLine);
if (matcher.matches()) {
  startTimeUs = parseTimecode(matcher, /* groupOffset= */ 1);
  endTimeUs = parseTimecode(matcher, /* groupOffset= */ 6);
}

String millis = matcher.group(groupOffset + 4);
if (millis != null) {
  timestampMs += Long.parseLong(millis);
}
return timestampMs * 1000;

fix for SRT
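The proposed fix itself is not reproduced on this page. As a rough sketch of what lenient handling could look like, the fraction digits can be normalized to microseconds by padding or truncating to 6 digits. `FractionNormalizer` and `fractionToUs` are hypothetical names, not Media3 APIs, and this is not the approach the maintainers ended up taking.

```java
final class FractionNormalizer {
  // Hypothetical helper: interprets the digits after the decimal separator
  // as a fraction of a second and returns it in microseconds.
  // Shorter inputs are right-padded with zeros; digits beyond the sixth
  // (sub-microsecond precision) are dropped.
  static long fractionToUs(String fractionDigits) {
    StringBuilder padded = new StringBuilder(fractionDigits);
    while (padded.length() < 6) {
      padded.append('0');
    }
    return Long.parseLong(padded.substring(0, 6));
  }

  public static void main(String[] args) {
    System.out.println(fractionToUs("100"));    // 100000
    System.out.println(fractionToUs("100000")); // 100000
    System.out.println(fractionToUs("5"));      // 500000
  }
}
```

Under this reading, `,100` and `,100000` both mean 0.1 s, so a caller would add the result to the whole-second part already converted to microseconds instead of multiplying the parsed field by 1000.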
This isn't a bug in the library: the media is invalid (and therefore any player behaviour is undefined). See further reasoning in #1999 (comment).
We previously parsed an arbitrary number of decimal places, but assumed the value was in milliseconds, which doesn't make sense if there is greater or fewer than 3. This change restricts the parsing to match exactly 3, meaning the millisecond assumption is always true.

The WebVTT spec requires there to be exactly 3 decimal places: https://www.w3.org/TR/webvtt1/#webvtt-timestamp

The SubRip spec is less clearly defined, but the Wikipedia article defines it as having exactly 3 decimal places (https://en.wikipedia.org/wiki/SubRip#Format) and ExoPlayer has always assumed 3 decimal places (anything else is already handled incorrectly), so this change just ensures we don't show subtitles at the wrong time.

Issue: #1997
PiperOrigin-RevId: 712885023
The change linked above tightens ExoPlayer's parsing to completely ignore cues with more or fewer than 3 decimal places (rather than render them at an incorrect time).
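For readers who want to see the effect of "exactly 3 decimal places" in isolation, here is a minimal sketch of a strict timestamp parser. The pattern and the `StrictCueTiming` class are assumptions written for illustration, not the actual expressions used in ExoPlayer.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class StrictCueTiming {
  // Assumed pattern (not ExoPlayer's): optional hours, then mm:ss.SSS with
  // exactly three fraction digits.
  private static final Pattern TIMESTAMP =
      Pattern.compile("(?:(\\d+):)?(\\d{2}):(\\d{2})\\.(\\d{3})");

  // Returns the timestamp in microseconds, or -1 if the text does not match,
  // so the caller can skip the cue instead of showing it at a wrong time.
  static long parseStrictUs(String text) {
    Matcher m = TIMESTAMP.matcher(text);
    if (!m.matches()) {
      return -1;
    }
    long hours = m.group(1) != null ? Long.parseLong(m.group(1)) : 0;
    long minutes = Long.parseLong(m.group(2));
    long seconds = Long.parseLong(m.group(3));
    long millis = Long.parseLong(m.group(4));
    return (((hours * 60 + minutes) * 60 + seconds) * 1000 + millis) * 1000;
  }

  public static void main(String[] args) {
    System.out.println(parseStrictUs("00:00:00.100"));    // 100000
    System.out.println(parseStrictUs("00:00:00.100000")); // -1 (rejected)
  }
}
```

Because `matches()` must consume the whole string, a 6-digit field is rejected outright, so the millisecond assumption holds for every timestamp that gets through.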
Version
Media3 main branch
More version details
you ask: why would the milliseconds field ever contain more than 3 digits?
I reply: it shouldn't, but for whatever reason.. captions on PBS do
example:
00:00:00.000000 --> 00:00:00.100000
(\.\d{3})\d{3}
$1
note: I haven't yet tested their SRT captions; I suspect the same time code format. TBD..
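The pattern and replacement shown above read like a find-and-replace that drops the last 3 of 6 fraction digits. Assuming that is the intent, a minimal Java sketch of applying it to caption text before handing it to the player could look like this (`CaptionTimecodeTruncator` is a hypothetical name):

```java
final class CaptionTimecodeTruncator {
  // Applies the find/replace from the report: keep the '.' and the first
  // three digits, drop the following three.
  // Caveat: the pattern is not anchored to timing lines, so a 6-digit
  // decimal inside cue text (e.g. "3.141592") would be shortened as well.
  static String truncateToMillis(String captions) {
    return captions.replaceAll("(\\.\\d{3})\\d{3}", "$1");
  }

  public static void main(String[] args) {
    // prints: 00:00:00.000 --> 00:00:00.100
    System.out.println(truncateToMillis("00:00:00.000000 --> 00:00:00.100000"));
  }
}
```

Text whose fraction fields already have 3 digits passes through unchanged, since the pattern needs six consecutive digits after the dot to match.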
Devices that reproduce the issue
OS: Android
app: ExoAirPlayer v3.7.0
lib: AndroidX Media3 v1.5.0
Devices that do not reproduce the issue
No response
Reproducible in the demo app?
Not tested
Reproduction steps
load captions from: VTT
Expected result
captions display at the correct time
Actual result
captions don't display at the correct time
Media
-or-
Bug Report
adb bugreport
to [email protected] after filing this issue.