EETQ not available when using TGI via get_huggingface_llm_image_uri #4194

Open
TRT-BradleyB opened this issue Oct 15, 2023 · 4 comments

@TRT-BradleyB

Describe the bug

Related to this issue:
aws/deep-learning-containers#3377

There are two versions of the TGI 1.1.0 image. One has EETQ pre-installed: https://github.com/NetEase-FuXi/EETQ

py39-cu118-ubuntu20.04 and py39-cu118-ubuntu20.04-v1.0

In the JSON config, only the one without EETQ is specified:

"container_version": {"gpu": "cu118-ubuntu20.04"}

Easy fix, but I'm not sure how you'd like to resolve this given the naming scheme deviates.
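
Until the config is updated, one possible workaround is to build the image URI by hand from an explicit tag instead of relying on `get_huggingface_llm_image_uri`. The sketch below is only an illustration: the registry ID, repository name, and tag are the ones quoted in this thread, while the helper name `build_tgi_image_uri`, the region `us-east-1`, and the standard `aws` partition domain are assumptions.

```python
# Registry ID and repository name as quoted in the image details in this thread.
DLC_ACCOUNT_ID = "763104351884"
TGI_REPOSITORY = "huggingface-pytorch-tgi-inference"


def build_tgi_image_uri(region: str, tag: str) -> str:
    """Build an ECR image URI for a TGI container from an explicit tag.

    Hypothetical helper. Assumes the standard `aws` partition; other
    partitions use a different domain suffix and possibly another account.
    """
    return f"{DLC_ACCOUNT_ID}.dkr.ecr.{region}.amazonaws.com/{TGI_REPOSITORY}:{tag}"


# The tag with the "-v1.0" suffix listed in this issue; the region is an assumption.
image_uri = build_tgi_image_uri(
    "us-east-1", "2.0.1-tgi1.1.0-gpu-py39-cu118-ubuntu20.04-v1.0"
)
print(image_uri)
```

The resulting string could then be passed as the `image_uri` argument when creating the model in the SageMaker Python SDK, which bypasses the tag lookup in the JSON config.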

@Daan-Grashoff

They have multiple versions, but none of them work with AWQ models:

```json
{
    "imageDetails": [
        {
            "registryId": "763104351884",
            "repositoryName": "huggingface-pytorch-tgi-inference",
            "imageDigest": "sha256:2739b630b95d8a95e6b4665e66d8243dd43b99c4fdb865feff13aab9c1da06eb",
            "imageTags": [
                "2.0.1-gpu-py39-cu118-ubuntu20.04",
                "2.0-tgi1.1-gpu-py39-cu118-ubuntu20.04",
                "2.0-gpu-py39-cu118-ubuntu20.04-v1",
                "2.0.1-tgi1.1.0-gpu-py39-cu118-ubuntu20.04-v1.0-2023-10-02-14-29-28",
                "2.0-tgi1.1-gpu-py39-cu118-ubuntu20.04-v1",
                "2.0.1-tgi1.1.0-gpu-py39-cu118-ubuntu20.04",
                "2.0.1-tgi1.1.0-gpu-py39-cu118-ubuntu20.04-v1.0"
            ],
            "imageSizeInBytes": 4576429231,
            "imagePushedAt": "2023-10-02T16:39:34+02:00",
            "imageManifestMediaType": "application/vnd.docker.distribution.manifest.v2+json",
            "artifactMediaType": "application/vnd.docker.container.image.v1+json",
            "lastRecordedPullTime": "2023-10-16T15:46:30.296000+02:00"
        }
    ]
}
```

@Igosuki

Igosuki commented Nov 13, 2023

Why is this still not solved? EETQ slashes inference time by a factor of 2...

@amzn-choeric
Contributor

I might be missing something obvious, but the two tags you listed for 1.1.0 should be pointing to the same image. Please use the latest version, which should be 1.3.3 as of this writing.
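
One way to check whether two tags resolve to the same image is to compare the digests they map to. The sketch below assumes JSON of the shape quoted earlier in this thread (it looks like `aws ecr describe-images` output), trimmed here to the two 1.1.0 tags; `digest_by_tag` is a hypothetical helper name.

```python
import json

# Trimmed copy of the image details quoted earlier in this thread.
DESCRIBE_IMAGES_OUTPUT = """
{
    "imageDetails": [
        {
            "imageDigest": "sha256:2739b630b95d8a95e6b4665e66d8243dd43b99c4fdb865feff13aab9c1da06eb",
            "imageTags": [
                "2.0.1-tgi1.1.0-gpu-py39-cu118-ubuntu20.04",
                "2.0.1-tgi1.1.0-gpu-py39-cu118-ubuntu20.04-v1.0"
            ]
        }
    ]
}
"""


def digest_by_tag(payload: str) -> dict:
    """Map every image tag in the payload to the digest it points at."""
    mapping = {}
    for image in json.loads(payload)["imageDetails"]:
        for tag in image.get("imageTags", []):
            mapping[tag] = image["imageDigest"]
    return mapping


tags = digest_by_tag(DESCRIBE_IMAGES_OUTPUT)
plain = tags["2.0.1-tgi1.1.0-gpu-py39-cu118-ubuntu20.04"]
suffixed = tags["2.0.1-tgi1.1.0-gpu-py39-cu118-ubuntu20.04-v1.0"]

# Two tags refer to the same image exactly when their digests match.
same_image = plain == suffixed
print("same image:", same_image)
```

If the digests match, switching between the two tags changes nothing about the container contents.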

@knikure
Contributor

knikure commented Jan 10, 2024

@TRT-BradleyB can you try with the latest TGI image?
