
To launch the benchmarking #1

Open
Gpoxolcku opened this issue Apr 1, 2024 · 6 comments

Comments

@Gpoxolcku

Hi! Awesome work and dataset collection! Is there a way (or a plan to release such a script) to launch a model's benchmark evaluation on the full set of data and obtain a comprehensive report on all the metrics?

@danielz02
Member

Thanks a lot for your interest! We are working on it :) The end goal would be to support common Hugging Face models. Do you have any model in mind?

@Gpoxolcku
Author

Thank you for the quick answer! Do you know an approximate release date? I'm just developing yet another model and am interested in metrics to track the progress :)

@danielz02
Member

danielz02 commented Apr 1, 2024

I'm thinking of some time around ICLR, which is early May, but I can definitely adjust the priority if there is a need for evaluating new models. What interface does your model use? Is it a Hugging Face pipeline or a Llava-like interface?
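For context on why the interface question matters: a harness can run any model through a uniform evaluation loop once the model is adapted to a single callable. The sketch below is purely illustrative and not this repository's actual API; the `predict` signature and the `(image, question, answer)` sample format are assumptions.

```python
from typing import Callable, Iterable, Tuple

def evaluate(predict: Callable[[str, str], str],
             samples: Iterable[Tuple[str, str, str]]) -> float:
    """Run a model over (image_path, question, answer) samples; return accuracy.

    `predict` is whatever callable the model exposes -- a Hugging Face
    pipeline or a Llava-style generate function -- wrapped so it maps
    (image_path, question) to an answer string. Hypothetical interface,
    not the repo's real API.
    """
    correct = total = 0
    for image_path, question, answer in samples:
        prediction = predict(image_path, question)
        # Simple exact-match scoring after normalization; real benchmarks
        # would use per-task metrics instead.
        correct += int(prediction.strip().lower() == answer.strip().lower())
        total += 1
    return correct / total if total else 0.0
```

With an adapter like this, supporting a new model only requires writing one small `predict` wrapper, which is why the maintainer asks which interface the model uses.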

@Gpoxolcku
Author

That would be very nice of you, thank you! I use a Llava-like interface on a local machine.

@Gpoxolcku
Author

Hi, is there any progress on finalizing the eval scripts? Such a benchmark would be very helpful for my projects, thanks :)

@alievrusik

Hello! I would also greatly appreciate such a script; right now it's not very convenient to compare different EO VLMs on this benchmark. Do you still plan to release it?
