Add video post duplication detection support with videohash #303
Comments
Hello, I hadn't seen your library before, but it looks like it would work really well. In the past I put together a solution that generated hashes from a set of frames, but it didn't scale well. How does videohash do the comparison for lookup? The database of hashes would likely cover over 100 million videos. I'm sure I could plug it into the solution I use for images, but I would be interested in another approach.
Similar to ImageHash, videohash tells videos apart by the Hamming distance between 64-bit hashes, so querying a videohash should take about as long as querying an imagehash. It should be identical to what you are doing with ImageHash. Two things to check before using it in production are the hashing time (is it too slow for your usage?) and collisions (are there too many?). I am also ready to change the library to make it more suitable for this particular use case. Maybe you could try it out on some sample videos and suggest changes to the library if required.
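To make the comparison step concrete, here is a minimal sketch of a Hamming-distance check on two 64-bit hashes held as plain Python integers. It does not use videohash's own API; the function names and the `max_distance` threshold of 10 are assumptions chosen for illustration, and the right threshold would have to be tuned on real videos.

```python
HASH_BITS = 64
HASH_MASK = (1 << HASH_BITS) - 1


def hamming_distance(a: int, b: int) -> int:
    """Count the bit positions in which two 64-bit hashes differ."""
    # XOR leaves a 1 wherever the two hashes disagree.
    return bin((a ^ b) & HASH_MASK).count("1")


def is_near_duplicate(a: int, b: int, max_distance: int = 10) -> bool:
    """Treat two hashes as near duplicates if few bits differ.

    max_distance is a hypothetical threshold, not a library default.
    """
    return hamming_distance(a, b) <= max_distance


if __name__ == "__main__":
    original = 0x0123456789ABCDEF
    reencoded = original ^ 0b101  # same hash with two bits flipped
    print(hamming_distance(original, reencoded))
    print(is_near_duplicate(original, reencoded))
```

A single comparison like this is cheap; the cost at scale comes from how many stored hashes each query has to be compared against.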
I'm only using imagehash to create the hashes. I'm using a different solution for comparison, since computing the Hamming distance directly didn't scale. However, it looks like I can do exactly the same thing with videohash. I should be able to test it out in the next couple of weeks; I'm pretty limited on time right now. I appreciate the heads up, I had no idea this existed.
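The thread does not say which comparison method is used in place of a direct Hamming-distance scan. As one illustration of how such lookups can avoid comparing against every stored hash, the sketch below uses a pigeonhole (banded) index: a 64-bit hash is split into four 16-bit bands, and any two hashes that differ in at most three bits must agree exactly on at least one band. The `BandIndex` class, the band layout, and the distance limit are all assumptions made for this example, not the commenter's solution or part of videohash.

```python
from collections import defaultdict

BANDS = 4          # number of bands per 64-bit hash (assumed layout)
BAND_BITS = 16     # bits per band
BAND_MASK = (1 << BAND_BITS) - 1


def bands(h: int):
    """Split a 64-bit hash into (band position, band value) keys."""
    return [(i, (h >> (i * BAND_BITS)) & BAND_MASK) for i in range(BANDS)]


class BandIndex:
    """Hypothetical in-memory index keyed on exact band matches."""

    def __init__(self):
        self.buckets = defaultdict(set)

    def add(self, h: int) -> None:
        for key in bands(h):
            self.buckets[key].add(h)

    def query(self, h: int, max_distance: int = 3):
        # With 4 bands, fewer than 4 differing bits leaves at least one
        # band untouched, so candidates share a bucket with the query.
        if max_distance >= BANDS:
            raise ValueError("max_distance must be smaller than BANDS")
        candidates = set()
        for key in bands(h):
            candidates |= self.buckets.get(key, set())
        # Verify candidates with the full Hamming distance.
        return sorted(
            c for c in candidates if bin(c ^ h).count("1") <= max_distance
        )


if __name__ == "__main__":
    index = BandIndex()
    index.add(0x0123456789ABCDEF)
    print(index.query(0x0123456789ABCDEF ^ 0b101))
```

The trade-off is that the guarantee only holds for distances below the number of bands; larger thresholds need more bands, which means more buckets per hash and more candidates to verify.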
Nice bot. I came across your bot's comment on a subreddit and noticed that it lacks video support.
I am @akamhy, the creator of videohash, a near-duplicate video detection Python library. Would you be interested in supporting duplicate detection for video posts with the videohash library?