Add support for directly running on segmentation on video files. #46
Thank you for signing our Contributor License Agreement. We can now accept your code for this (and any) Meta Open Source project. Thanks!
Hey @rolson24, this is super interesting! I've actually been attempting to run video segmentation on longer videos (5+ minutes) and I've been running into memory allocation errors. Have you tested longer videos using this method?
@jordan-barrett-jm Also, I have a Colab notebook that demonstrates this change; it is based on the Roboflow one here
Thanks! One solution I've found in the interim is mini-batching the images.
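The mini-batching idea above can be sketched roughly as follows. This is only an illustration of splitting a frame list into fixed-size chunks; the helper name `mini_batches` and the batch size are hypothetical, and the comment does not say how object prompts were carried from one batch to the next, so that part is left out.

```python
from typing import Iterator, List, Sequence


def mini_batches(frame_paths: Sequence[str], batch_size: int) -> Iterator[List[str]]:
    """Yield consecutive chunks of at most `batch_size` frame paths.

    Hypothetical helper: each chunk would be handed to the segmentation
    step separately, so only one chunk has to sit in memory at a time.
    """
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, len(frame_paths), batch_size):
        # Slicing never runs past the end, so the last chunk may be shorter.
        yield list(frame_paths[start:start + batch_size])
```

For example, `list(mini_batches(frames, 100))` on a 250-frame list gives three chunks of 100, 100, and 50 frames.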
Thanks for the great work! Just curious: it seems like we still cannot add a new point during inference, right? What I mean is something like real-time tracking.
I think you could add a new point by calling the add new point function inside the for loop whenever you want to add a new prompt. I have not tested that theory, though.
Also, if you are still running out of GPU memory, try the ‘offload_state_to_cpu’ parameter when you initialize the state, so that the states are stored in system RAM.
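A minimal sketch of the memory suggestion, assuming `predictor` is a SAM 2 video predictor whose `init_state` method accepts `video_path` and `offload_state_to_cpu` keyword arguments. The wrapper name `init_offloaded_state` is hypothetical and exists only to make the call explicit.

```python
def init_offloaded_state(predictor, video_path: str):
    """Initialize the inference state with per-frame states kept on the CPU.

    Assumption: `predictor.init_state` takes `video_path` and
    `offload_state_to_cpu`, as the comment above describes. Keeping the
    states in system RAM trades some speed for lower GPU memory use.
    """
    return predictor.init_state(
        video_path=video_path,
        offload_state_to_cpu=True,
    )
```

The returned state would then be passed to the rest of the video pipeline exactly as a normally initialized state would be.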
I've tested and checked the code: once inference has started, adding new points is prohibited.
@rolson24 How do I run and test it, and how can I find out which label is assigned to each object in the video?
This PR adds support for running segmentation directly on video files instead of on individual image files. It uses torchvision's built-in VideoReader object and adds only one dependency, PyAV, which provides the Python bindings for ffmpeg. Alternatively, users can compile torchvision from source with the video_reader backend if they do not want to install PyAV. I think this could really improve the ease of building demos for SAM-2, because the entire video no longer has to be extracted first and then read into RAM.

I will do some more rigorous testing to make sure it doesn't affect the expected behavior, but it seems to be working for now.
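To illustrate the idea of reading frames straight from the video, here is a rough sketch and not the code from this PR. It assumes `torchvision.io.VideoReader`, which yields one dictionary per frame with `"data"` and `"pts"` keys when iterated; that class depends on the installed torchvision version and video backend. The helper names `open_video` and `iter_frames` are hypothetical.

```python
from typing import Any, Dict, Iterable, Iterator, Tuple


def open_video(path: str):
    """Open a video file for frame-by-frame decoding.

    Assumption: torchvision is installed with a working video backend
    (for example PyAV). The import is done lazily so the rest of this
    sketch can be used without torchvision.
    """
    import torchvision

    return torchvision.io.VideoReader(path, "video")


def iter_frames(reader: Iterable[Dict[str, Any]]) -> Iterator[Tuple[float, Any]]:
    """Yield (timestamp, frame) pairs one at a time.

    Frames are decoded on demand, so the whole video is never extracted
    to image files or held in RAM at once.
    """
    for packet in reader:
        yield packet["pts"], packet["data"]
```

A caller would loop with `for pts, frame in iter_frames(open_video("clip.mp4")):`, where the file name is a placeholder, and feed each frame to the model as it arrives.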