Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Prevent repeating frame indices #1265

Closed
wants to merge 1 commit into from

Conversation

JoeSchiff
Copy link
Contributor

@JoeSchiff JoeSchiff commented Jan 21, 2024

Fix for Issue #1158.

Problem

stream.thread_type.FRAME threading causes an incorrect series of repeating frame indices at the end of most videos. This is because a list of frames is returned by CodecContext._send_packet_and_recv . frame.index corresponds to the most recent frame received from the decoder. Therefore all frames in the list are assigned the same index as the final frame in the list.

Solution

I solved this issue by detecting whenever multiple frames are returned and then correcting the frame.index based on the frame's position in the list.

Other changes

AVCodecContext::frame_number is deprecated. I switched to AVCodecContext::frame_num.
EDIT: The deprecation was issued in ffmpeg 6.0, so I reverted back to using AVCodecContext::frame_number.

Testing

The test iterations may be overkill, but it only increased the test run time from 6 seconds to 6.5 seconds on my machine.

Future considerations

It is important to note that frame.index is affected by CodecContext.skip_frame. Skipped frames will not increment the frame index. I don't know if this is intended or desired behavior. Perhaps we should consider two different frame properties.
1: A property for the number of frames returned by the decoder (this is the current behavior).
2: A property for the frame index relative to the total number of frames in the video. I assume this can be determined by a timestamp?

@JoeSchiff JoeSchiff changed the title prevent repeating frame indices (fixes: #1158) Prevent repeating frame indices Jan 22, 2024
@WyattBlue
Copy link
Member

I would like there to be performance benchmarks ran on decode. Eyeballing the code, it looks this costs more cpu time but the PR is still good if the performance difference isn't measurable.

I'm curious why you choose to extract the logic of _correct_frame_indices to it's own function instead of inlining it in decode. This again feels like a performance thing but I would like measurable results. Actually, since _setup_decoded_frame is only one line now, that could be inlined as well.

@jlaine
Copy link
Member

jlaine commented Jan 26, 2024

I'm very doubtful about this PR, do we even know why we have this .index attribute?

As far as I can tell:

  • This is not part of FFmpeg's AVFrame
  • It's not in the pyav documentation at all.

Unless it's actually used somewhere, another approach would be to remove the attribute completely.

@JoeSchiff
Copy link
Contributor Author

Performance Benchmarks

@WyattBlue I ran some rudimentary performance benchmarks using the following code:

import av
from av.datasets import fate as fate_suite
import time

PATHS = (fate_suite("h264/bbc2.sample.h264"), fate_suite("h264/interlaced_crop.mp4"), av.datasets.curated("pexels/time-lapse-video-of-night-sky-857195.mp4"))

def test_decode_frame_indices():
    for path in PATHS:
        for thread_type in ("NONE", "AUTO"):
            for thread_count in range(4):
                for skip_type in ("NONE", "NONKEY"):
                    frame_list = []
                    compare_list = []
                    frame_count = 0
                    #print('\n', path, thread_type, thread_count, skip_type)
                    with av.open(path) as container:
                        stream = container.streams.video[0]
                        stream.thread_type = thread_type
                        stream.thread_count = thread_count
                        stream.codec_context.skip_frame = skip_type
                        for packet in container.demux(stream):
                            for frame in packet.decode():
                                frame_list.append(frame.index)
                                compare_list.append(frame_count)
                                frame_count += 1
                    print(frame_list == compare_list)

start_time = time.perf_counter()
test_decode_frame_indices()
print(time.perf_counter() - start_time)

I did 3 runs on each branch and got the following average durations:

main: 5.705269041995052 seconds
fix_frame_indices: 5.665727951010922 seconds

If you were referring to more sophisticated benchmarking, I'll need some help to get started.

Function vs Inline

The only reason I made a function was to make it slightly more organized. If you prefer that I change to inline, that's perfectly fine with me.

@JoeSchiff
Copy link
Contributor Author

AVFrame

@jlaine I looked through AVFrame Struct Reference and found display_picture_number, which looked promising. However, when I exposed it in pyav it always returned 0.
It is also deprecated in ffmpeg 6.0.

This issue seems to also conclude that display_picture_number is not trustworthy.

Docs

frame.index is found in the pyav.org docs here and here.

@jlaine
Copy link
Member

jlaine commented Jan 27, 2024

Thanks for the documentation pointers. It seems a bit weird that this member is mentioned in examples, but there isn't event a docstring clarifying the fact it's an index generated by pyav, and that it has no influence on encoding.

If the point is purely to have a running index, what is gained over doing:

for (index, frame) in enumerate(container.decode(video=0)):
   ...

@JoeSchiff
Copy link
Contributor Author

@jlaine Yeah, that's a good point. However, since frame.index is on the pyav.org homepage, I'm sure many people expect it to be available. I would really like to provide a replacement before removing it.

It seems these problems arise from the fact that frame.index is not an actual index. enumerate() and frame.index work correctly only when iterating every frame. They will break if using CodecContext.skip_frame or container.seek.

How about this solution:

  1. The current functionality of "number of frames decoded so far" is moved from frame.index to a new property called CodecContext.frame_num. That way it matches ffmpeg and the original functionality is still available to users.

  2. Create frame.guessed_index which is calculated from frame.pts * stream.guessed_rate * frame.time_base or similar (I still struggle with the intricacies of time base).

The problem is frame.pts, stream.guessed_rate, and frame.time_base are not always available in all media. Example: fate_suite("h264/bbc2.sample.h264")

If frame.guessed_index isn't available then the original intent of this PR can be implemented by the user like this:

for packet in container.demux(stream):
    frame_list = packet.decode()
    correction = len(frame_list)
    for frame in frame_list:
        corrected_index = stream.codec_context.frame_num - correction
        correction -= 1
        print(f"{corrected_index=}")

Ideally we would be able to get a frame's index from ffmpeg, but I can't find a way to do it at the moment. I think this new solution solves many problems:

  1. The original functionality of frame.index remains (elsewhere).
  2. The repeating frame indices from #1158 can be solved.
  3. Pyav matches ffmpeg's structure and property names.
  4. Provide a way to get the frame index even when CodecContext.skip_frame = "NONKEY".
  5. The name frame.guessed_index makes it clear that the value is not guaranteed to be accurate.

@WyattBlue
Copy link
Member

I agree with jlaine. We should deprecate and remove frame.index. There is no such concept in ffmpeg. Putting in a guessed_index would only further complicate an already complex project.

@WyattBlue WyattBlue closed this Feb 16, 2024
@JoeSchiff JoeSchiff deleted the fix_frame_indices branch January 6, 2025 00:45
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants