Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support Vision Language Model in GenAI-Perf #756

Merged
merged 6 commits into from
Jul 18, 2024
Merged

Support Vision Language Model in GenAI-Perf #756

merged 6 commits into from
Jul 18, 2024

Conversation

nv-hwoo
Copy link
Contributor

@nv-hwoo nv-hwoo commented Jul 18, 2024

No description provided.

mwawrzos and others added 5 commits July 11, 2024 09:52
* POC for LLaVA support

* non-streaming request in VLM tests

* image component sent in "image_url" field instead of HTML tag

* generate sample image instead of loading from docs

* add vision to endpoint mapping

* fixes for handling OutputFormat

* refactor - extract image preparation to a separate module

* fixes to the refactor

* replace match-case syntax with if-elseif-else

* Update image payload format and fix tests

* Few clean ups and tickets added for follow up tasks

* Fix and add tests for vision format

* Remove output format from profile data parser

* Revert irrelevant code change

* Revert changes

* Remove unused dependency

* Comment test_extra_inputs

---------

Co-authored-by: Hyunjae Woo <[email protected]>
* synthetic image generator

* format randomization

* images should be base64-encoded arbitrarly

* randomized image format

* randomized image shape

* prepare SyntheticImageGenerator to support different image sources

* read from files

* python 3.10 support fixes

* remove unused imports

* skip sampled image sizes with negative values

* formats type fix

* remove unused variable

* synthetic image generator encodes images to base64

* image format not randomized

* sample each dimension independently

Co-authored-by: Hyunjae Woo <[email protected]>

* apply code-review suggestsions

* update class name

* deterministic synthetic image generator

* add typing to SyntheticImageGenerator

* SyntheticImageGenerator doesn't load files

* SyntheticImageGenerator always encodes images to base64

* remove unused imports

* generate gaussian noise instead of blank images

---------

Co-authored-by: Hyunjae Woo <[email protected]>
* Add CLI options for synthetic image generation

* read image format from file when --input-file is used

* move encode_image method to utils

* Lazy import some modules
* support synthetic image generation for VLM model

* add test

* integrate sythetic image generator into LlmInputs

* add source images for synthetic image data

* use abs to get positive int
@nv-hwoo nv-hwoo changed the title Vision language Support Vision Language Model in GenAI-Perf Jul 18, 2024
@debermudez
Copy link
Contributor

Does this feature have new documentation or examples?

@nv-hwoo
Copy link
Contributor Author

nv-hwoo commented Jul 18, 2024

It needs one doc but I was thinking to add them after adding this to main so that NIM team can just pull main branch to use it.

@debermudez
Copy link
Contributor

It needs one doc but I was thinking to add them after adding this to main so that NIM team can just pull main branch to use it.

Thats fine with me. Thanks for being on top of that.

@nv-hwoo nv-hwoo merged commit 30af885 into main Jul 18, 2024
5 checks passed
@nv-hwoo nv-hwoo deleted the vision-language branch July 18, 2024 21:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

3 participants