Fix occassional DTS overlap

Fix an occassional DTS overlap by closing the filtergraph after each segment and re-creating it at the beginning of each segment, instead of attempting to persist the filtergraph in between segments. This overlap occurred mostly when flip-flopping segments between transcoders, or processing non-consecutive segments within a single transcoder. This was due to drift in adjusting input timestamps to match the fps filter's expectation of mostly consecutive timestamps while adjusting output timestamps to remove accumulated delay from the filter. There is roughly a 1% performance hit on my machine from re-creating the filtergraph. Because we are now resetting the filter after each segment, we can remove a good chunk of the special-cased timestamp handling code before and after the filtergraph since we no longer need to handle discontinuities between segments. However, we do need to keep some filter flushing logic in order to accommodate low-fps or low-frame content. This does change our outputs, usually by one fewer frame. Sometimes we seem to produce an *additional* frame - it is unclear why. However, as the test cases note, this actually clears up a numer of long-standing oddities around the expected frame count, so it should be seen as an improvement. --- It is important to note that while this fixes DTS overlap in a (rather unpredictable) general case, there is another overlap bug in one very specific case. These are the conditions for bug: 1. First and second segments of the stream are being processed. This could be the same transcoder or different ones. 2. The first segment starts at or near zero pts 3. mpegts is the output format 4. B-frames are being used What happens is we may see DTS < PTS for the very first frames in the very first segment, potentially starting with PTS = 0, DTS < 0. This is expected for B-frames. However, if mpegts is in use, it cannot take negative timestamps. To accompdate negative DTS, the muxer will set PTS = -DTS, DTS = 0 and delay (offset) the rest of the packets in the segment accordingly. Unfortunately, subsequent transcodes will not know about this delay! This typically leads to an overlap between the first and second segments (but segments after that would be fine). The normal way to fix this would be to add a constant delay to all segments - ffmpeg adds 1.4s to mpegts by default. However, introducing a delay right now feels a little odd since we don't really offer any other knobs to control the timestamp (re-transcodes would accumulate the delay) and there is some concern about falling out of sync with the source segment since we have historically tried to make timestamps follow the source as closely as possible. So we're leaving this particular bug as-is for now. There is some commented-out code that adds this delay in case we feel that we would need it in the future. Note that FFmpeg CLI also has the exact same problem when the muxer delay is removed, so this is not a LPMS-specific issue. This is exercised in the test cases. Example of non-monotonic DTS after encoding and after muxing: Segment.Frame | Encoder DTS | Encoder PTS | Muxer DTS | Muxer PTS --------------|-------------|-------------|-----------|----------- 1.1 | -20 | 0 | 0 | 20 1.2 | -10 | 10 | 10 | 30 1.3 | 0 | 20 | *20* | 40 1.4 | 10 | 30 | *30* | 50 2.1 | 20 | 40 | *20* | 40 2.2 | 30 | 50 | *30* | 50 2.3 | 40 | 60 | 40 | 60
livepeer · Sep 13, 2024 · 4d44fe0 · 4d44fe0
1 parent fe5aff1
commit 4d44fe0
Show file tree

Hide file tree

Showing 6 changed files with 218 additions and 108 deletions.
diff --git a/ffmpeg/api_test.go b/ffmpeg/api_test.go
@@ -51,7 +51,7 @@ func TestAPI_SkippedSegment(t *testing.T) {
 		if res.Decoded.Frames != 120 {
 			t.Error("Did not get decoded frames", res.Decoded.Frames)
 		}
-		if res.Encoded[1].Frames != 245 {
+		if res.Encoded[1].Frames != 246 {
 			t.Error("Did not get encoded frames ", res.Encoded[1].Frames)
 		}
 	}
@@ -68,7 +68,7 @@ func TestAPI_SkippedSegment(t *testing.T) {
 
       # sanity check ffmpeg frame count against ours
       ffprobe -count_frames -show_streams -select_streams v ffmpeg_sw_$1.ts | grep nb_read_frames=246
-      ffprobe -count_frames -show_streams -select_streams v sw_$1.ts | grep nb_read_frames=245
+      ffprobe -count_frames -show_streams -select_streams v sw_$1.ts | grep nb_read_frames=246
 
     # check image quality
     # TODO really should have frame counts match for ssim
@@ -224,18 +224,14 @@ func countEncodedFrames(t *testing.T, accel Acceleration) {
 		if err != nil {
 			t.Error(err)
 		}
-		expectedFrames := 60
-		if i == 1 || i == 3 {
-			expectedFrames = 61 // TODO figure out why this is!
-		}
-		if res.Encoded[0].Frames != expectedFrames {
-			t.Error(in.Fname, " Mismatched frame count: expected ", expectedFrames, " got ", res.Encoded[0].Frames)
+		if res.Encoded[0].Frames != 60 {
+			t.Error(in.Fname, " Mismatched frame count: expected 60 got ", res.Encoded[0].Frames)
 		}
 		if res.Encoded[1].Frames != 120 {
 			t.Error(in.Fname, " Mismatched frame count: expected 120 got ", res.Encoded[1].Frames)
 		}
-		if res.Encoded[2].Frames != 239 {
-			t.Error(in.Fname, " Mismatched frame count: expected 239 got ", res.Encoded[2].Frames)
+		if res.Encoded[2].Frames != 240 {
+			t.Error(in.Fname, " Mismatched frame count: expected 240 got ", res.Encoded[2].Frames)
 		}
 		if res.Encoded[3].Frames != 120 {
 			t.Error(in.Fname, " Mismatched frame count: expected 120 got ", res.Encoded[3].Frames)
@@ -257,33 +253,33 @@ func countEncodedFrames(t *testing.T, accel Acceleration) {
 pts=129000
 pts=129750
 pts=130500
-pts=306000
 pts=306750
 pts=307500
+pts=308250
 
 ==> out_120fps_1.ts.pts <==
 pts=309000
 pts=309750
 pts=310500
-pts=486000
 pts=486750
 pts=487500
+pts=488250
 
 ==> out_120fps_2.ts.pts <==
 pts=489000
 pts=489750
 pts=490500
-pts=666000
 pts=666750
 pts=667500
+pts=668250
 
 ==> out_120fps_3.ts.pts <==
 pts=669000
 pts=669750
 pts=670500
-pts=846000
 pts=846750
 pts=847500
+pts=848250
 
 ==> out_30fps_0.ts.pts <==
 pts=129000
@@ -297,9 +293,9 @@ pts=306000
 pts=309000
 pts=312000
 pts=315000
+pts=480000
 pts=483000
 pts=486000
-pts=489000
 
 ==> out_30fps_2.ts.pts <==
 pts=489000
@@ -313,9 +309,9 @@ pts=666000
 pts=669000
 pts=672000
 pts=675000
+pts=840000
 pts=843000
 pts=846000
-pts=849000
 
 ==> out_60fps_0.ts.pts <==
 pts=129000
@@ -463,8 +459,8 @@ func TestTranscoder_API_AlternatingTimestamps(t *testing.T) {
 
       # sanity check ffmpeg frame count against ours
       ffprobe -count_frames -show_streams -select_streams v ffmpeg_sw_$1.ts | grep nb_read_frames=246
-      ffprobe -count_frames -show_streams -select_streams v sw_$1.ts | grep nb_read_frames=245
-      ffprobe -count_frames -show_streams -select_streams v sw_audio_encode_$1.ts | grep nb_read_frames=245
+      ffprobe -count_frames -show_streams -select_streams v sw_$1.ts | grep nb_read_frames=246
+      ffprobe -count_frames -show_streams -select_streams v sw_audio_encode_$1.ts | grep nb_read_frames=246
 
     # check image quality
     # TODO frame count should really match for ssim
@@ -476,12 +472,146 @@ func TestTranscoder_API_AlternatingTimestamps(t *testing.T) {
     # Really should check relevant audio as well...
     }
 
-
-    # re-enable for seg 0 and 1 when alternating timestamps can be handled
     check 0
     check 1
     check 2
     check 3
+	`
+	run(cmd)
+
+}
+
+func TestTranscoder_API_DTSOverlap(t *testing.T) {
+	dtsOverlap(t, Software)
+}
+
+func dtsOverlap(t *testing.T, accel Acceleration) {
+	// Non-monotonic DTS timestamps are a major problem.
+	// We have one such case here when:
+	// 1. first segment pts starts near zero
+	// 2. B-frames are in use
+	// 3. mpegts is the output format
+	//
+	// the transcoder can produce DTS < 0, PTS = 0 which gets
+	// offset to DTS = 0, PTS = -DTS in the mpegts muxer
+	//
+	// However, transcodes for other segments will not be aware
+	// of this delay, leading to overlap between the first and
+	// second segments.
+	//
+	// This is not a LPMS specific issue but rather one in the
+	// employment of mpegts (always add an offset!); ffmpeg has
+	// the exact same issue.
+
+	// This test case codifies this behavior for now as a sign
+	// that we are aware of it, and if it ever changes somehow,
+	// this should fail and let us know.
+
+	run, dir := setupTest(t)
+	defer os.RemoveAll(dir)
+
+	err := RTMPToHLS("../transcoder/test.ts", dir+"/out.m3u8", dir+"/out_%d.ts", "2", 0)
+	if err != nil {
+		t.Error(err)
+	}
+
+	profile := P144p30fps16x9
+	profile.Framerate = 15
+	profile.Profile = ProfileH264Main
+	// and check no bframes case (which is ok)
+	profileNoB := profile
+	profileNoB.Profile = ProfileH264ConstrainedHigh
+	tc := NewTranscoder()
+	defer tc.StopTranscoder()
+	idx := []int{1, 0}
+	for _, i := range idx {
+		in := &TranscodeOptionsIn{Fname: fmt.Sprintf("%s/out_%d.ts", dir, i)}
+		out := []TranscodeOptions{{
+			Oname:        fmt.Sprintf("%s/bf_%d.ts", dir, i),
+			Profile:      profile,
+			AudioEncoder: ComponentOptions{Name: "copy"},
+			Accel:        accel,
+		}, {
+			Oname:        fmt.Sprintf("%s/nobf_%d.ts", dir, i),
+			Profile:      profileNoB,
+			AudioEncoder: ComponentOptions{Name: "copy"},
+			Accel:        accel,
+		}}
+		res, err := tc.Transcode(in, out)
+		if err != nil {
+			t.Error(err)
+		}
+		if res != nil {
+			if res.Decoded.Frames != 120 {
+				t.Error("Did not get decoded frames", res.Decoded.Frames)
+			}
+			if res.Encoded[0].Frames != 30 {
+				t.Error("Mismatched frame count for hw/nv")
+			}
+		}
+	}
+
+	cmd := `
+
+		# ffmpeg has the exact same problem so let's demonstrate that too
+		ffmpeg -loglevel warning -i out_0.ts -c:a copy \
+			-vf fps=15,scale=w=256:h=144 -c:v libx264 -muxdelay 0 -muxpreload 0 -copyts \
+			ffmpeg_sw_0.ts
+		ffmpeg -loglevel warning -i out_1.ts -c:a copy \
+			-vf fps=15,scale=w=256:h=144 -c:v libx264 -muxdelay 0 -muxpreload 0 -copyts \
+			ffmpeg_sw_1.ts
+
+
+		# Ensure timestamps are monotonic by checking deltas
+		# do this for the low fps rendition since those are more
+		# likely to have issues while downsampling fps
+		function calc_delta {
+			cat $1_0.ts $1_1.ts > $1_concat.ts
+			mapfile -t dts_times < <(ffprobe -hide_banner -select_streams v -of csv=p=0 -show_entries packet=dts_time $1_concat.ts | awk -F',' '{print $1}')
+			# Loop through the array and calculate the delta
+			for ((i = 1; i < ${#dts_times[@]}; i++)); do
+				delta=$(echo "${dts_times[$i]} - ${dts_times[$i-1]}" | bc -l)
+				echo "$delta" >> $1_concat.delta
+			done
+			sort $1_concat.delta | uniq -c | sed 's/^ *//g' > $2
+		}
+
+		calc_delta bf deltas.out
+		calc_delta nobf deltas_nobf.out
+		calc_delta ffmpeg_sw ffmpeg_deltas.out
+	`
+
+	if accel == Nvidia {
+		cmd = cmd + `
+			cat <<-EOF > expected_deltas.out
+				1 -.133333
+				20 .066666
+				38 .066667
+			EOF
+		`
+	} else {
+		// for sw transcode, ffmpeg and lpms are exactly the same
+		cmd = cmd + `
+			diff -u deltas.out ffmpeg_deltas.out
+
+			cat <<-EOF > expected_deltas.out
+				1 -.066666
+				20 .066666
+				38 .066667
+			EOF
+	`
+	}
+
+	cmd = cmd + `
+		diff -u expected_deltas.out deltas.out
+
+		# no b-frames case
+		cat <<-EOF > expected_deltas_nobf.out
+			20 .066666
+			39 .066667
+		EOF
+		diff -u expected_deltas_nobf.out deltas_nobf.out
+
   `
 	run(cmd)
 }
@@ -795,11 +925,7 @@ func consecutiveMP4s(t *testing.T, accel Acceleration) {
 				t.Error("Unexpected error ", err)
 				continue
 			}
-			expectedFrames := 60
-			if i == 1 || i == 3 {
-				expectedFrames = 61 // TODO figure out why this is!
-			}
-			if res.Decoded.Frames != 120 || res.Encoded[0].Frames != expectedFrames {
+			if res.Decoded.Frames != 120 || res.Encoded[0].Frames != 60 {
 				t.Error("Unexpected results ", i, inExt, outExt, res)
 			}
 		}
@@ -1119,9 +1245,8 @@ func setGops(t *testing.T, accel Acceleration) {
 
         # intra checks with fixed fps.
         # sanity check number of packets vs keyframes
-        # TODO look into why lpms generates 91 frames instead of 100
-        ffprobe -loglevel warning lpms_intra_10fps.ts -select_streams v -show_packets | grep flags= | wc -l | grep 91
-        ffprobe -loglevel warning lpms_intra_10fps.ts -select_streams v -show_packets | grep flags=K | wc -l | grep 91
+        ffprobe -loglevel warning lpms_intra_10fps.ts -select_streams v -show_packets | grep flags= | wc -l | grep 100
+        ffprobe -loglevel warning lpms_intra_10fps.ts -select_streams v -show_packets | grep flags=K | wc -l | grep 100
     `
 	run(cmd)
 

diff --git a/ffmpeg/encoder.c b/ffmpeg/encoder.c
@@ -159,11 +159,9 @@ void close_output(struct output_ctx *octx)
   }
   if (octx->vc && octx->hw_type == AV_HWDEVICE_TYPE_NONE) avcodec_free_context(&octx->vc);
   if (octx->ac) avcodec_free_context(&octx->ac);
+  free_filter(&octx->vf);
   octx->af.flushed = octx->vf.flushed = 0;
   octx->af.flushing = octx->vf.flushing = 0;
-  octx->vf.pts_diff = INT64_MIN;
-  octx->vf.prev_frame_pts = 0;
-  octx->vf.segments_complete++;
 }
 
 void free_output(struct output_ctx *octx)
@@ -314,6 +312,8 @@ int reopen_output(struct output_ctx *octx, struct input_ctx *ictx)
 
   // re-attach video encoder
   if (octx->vc) {
+    ret = init_video_filters(ictx, octx);
+    if (ret < 0) LPMS_ERR(reopen_out_err, "Unable to reopen filter")
     ret = add_video_stream(octx, ictx);
     if (ret < 0) LPMS_ERR(reopen_out_err, "Unable to re-add video stream");
   } else LPMS_INFO("No video stream!?");
@@ -470,6 +470,19 @@ int mux(AVPacket *pkt, AVRational tb, struct output_ctx *octx, AVStream *ost)
     av_packet_rescale_ts(pkt, tb, ost->time_base);
   }
 
+  /* Enable this if it seems we have issues
+     with the first and second segments overlapping due to bframes
+     See TestTranscoder_API_DTSOverlap
+
+  int delay = av_rescale_q(10, (AVRational){1, 1}, ost->time_base);
+  if (pkt->dts != AV_NOPTS_VALUE) {
+    pkt->dts += delay;
+  }
+  if (pkt->pts != AV_NOPTS_VALUE) {
+    pkt->pts += delay;
+  }
+  */
+
   // drop any preroll audio. may need to drop multiple packets for multichannel
   // XXX this breaks if preroll isn't exactly one AVPacket or drop_ts == 0
   //     hasn't been a problem in practice (so far)