How can we perform segmentation in real-time? #60

Open
CURRY-AND-RICE opened this issue Jul 31, 2024 · 5 comments

Comments

@CURRY-AND-RICE

Currently, I think we can only input video as separate frames stored in a directory.
However, for online applications, we should be able to input frames sequentially as they come in.
Are there any existing solutions to facilitate this?
Additionally, are there plans to add such functionality in the future?

Thank you for the amazing work!
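
To make the request concrete, here is a minimal sketch of what a frame-by-frame interface could look like. It is not part of this repository's API: `StreamingSegmenter`, `segment_fn`, `memory_size`, and `run_stream` are all hypothetical names chosen for illustration, and the bounded buffer of past results is an assumption about how state might be carried between frames.

```python
from collections import deque
from typing import Any, Callable, Deque, Iterable, Iterator, Tuple


class StreamingSegmenter:
    """Hypothetical wrapper that handles one frame per call.

    `segment_fn` stands in for a real model call. It receives the current
    frame plus a tuple of recent (frame, mask) pairs and returns a mask.
    """

    def __init__(
        self,
        segment_fn: Callable[[Any, Tuple[Any, ...]], Any],
        memory_size: int = 4,  # arbitrary illustrative value
    ) -> None:
        self._segment_fn = segment_fn
        # A deque with maxlen silently drops the oldest entry when full,
        # so memory use stays constant however long the stream runs.
        self._memory: Deque[Any] = deque(maxlen=memory_size)

    def step(self, frame: Any) -> Any:
        """Segment one incoming frame using only past frames as context."""
        mask = self._segment_fn(frame, tuple(self._memory))
        self._memory.append((frame, mask))
        return mask


def run_stream(frames: Iterable[Any], segmenter: StreamingSegmenter) -> Iterator[Any]:
    """Yield a mask for each frame as soon as that frame arrives."""
    for frame in frames:
        yield segmenter.step(frame)
```

The point of the design is that `step` never needs to know the total number of frames, which is what an online application requires.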

@rolson24

rolson24 commented Jul 31, 2024

I opened a PR that can run directly on a video file without extracting and loading all of the frames into memory at once, but it doesn't support a video stream. It would most likely require a large refactor of this repository's codebase to support a video stream, but I know Hugging Face is working to add the model to transformers, which may be able to support running on a stream.
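
As a rough illustration of reading frames lazily instead of loading them all at once (a sketch only, not the code in the PR), a generator can pull one frame at a time from any source that exposes OpenCV-style `read()` and `release()` methods, such as `cv2.VideoCapture`. The names `FrameSource`, `iter_frames`, and `FakeCapture` are assumptions made for this example.

```python
from typing import Any, Iterator, Protocol, Tuple


class FrameSource(Protocol):
    """Anything with OpenCV-style read()/release(), e.g. cv2.VideoCapture."""

    def read(self) -> Tuple[bool, Any]: ...

    def release(self) -> None: ...


def iter_frames(source: FrameSource) -> Iterator[Any]:
    """Yield frames one at a time so only the current frame is held."""
    try:
        while True:
            ok, frame = source.read()
            if not ok:  # end of file, or the stream stopped delivering
                break
            yield frame
    finally:
        # Runs when the generator is exhausted or closed early.
        source.release()


class FakeCapture:
    """Stand-in source for demonstration; produces n integer 'frames'."""

    def __init__(self, n: int) -> None:
        self._frames = iter(range(n))
        self.released = False

    def read(self) -> Tuple[bool, Any]:
        try:
            return True, next(self._frames)
        except StopIteration:
            return False, None

    def release(self) -> None:
        self.released = True
```

With OpenCV installed, `iter_frames(cv2.VideoCapture("video.mp4"))` would follow the same pattern for a file. Whether the model itself can consume frames this way is a separate question, as noted above.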

@CURRY-AND-RICE
Author

Thank you for notifying me of such important information!
I found an issue on Hugging Face for adding SAMv2, which is currently in progress.
I will continue to explore ways to achieve stream inference and will keep this issue open.

@Joao-Pimenta

@CURRY-AND-RICE Did you find a good implementation?

@CURRY-AND-RICE
Author

@Joao-Pimenta
I've been unable to find an implementation that matches my needs.
Maybe this will help. #90

@heyoeyo

heyoeyo commented Aug 13, 2024

> Are there any existing solutions to facilitate this?

I have a basic example script that runs off videos (should work with webcams even), though it's not finalized and may be missing some features compared to the original video prediction implementation.

Edit: There's also now a UI version, which can also work on webcam:
[Animation: webcamanim]


4 participants