
Fix (GPxQ): unwrap QuantTensor when dealing with QuantLinear #915

Closed

Conversation

fabianandresgrob
Collaborator

When running GPxQ with quantized activations, once a QuantLinear layer is processed, update_batch tries to unsqueeze (GPFQ) or transpose (GPTQ) the resulting QuantTensor, leading to an error.

This PR solves that issue by unwrapping the QuantTensor before it is processed. It also adds a quantized linear model to the fixtures for reproducing the issue; see the sketch below.
Some of the tests still fail because no ValueError is raised, since the model actually has a quant_input (here). @i-colbert, maybe we need to change the conditions for an expected failure?
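For illustration, a minimal sketch of the unwrapping step described above, assuming Brevitas's QuantTensor with its value field; the helper name _unwrap_quant_tensor and its placement inside update_batch are hypothetical, not the exact code in this PR:

```python
from brevitas.quant_tensor import QuantTensor


def _unwrap_quant_tensor(inp):
    # A quantized activation feeding a QuantLinear layer produces a
    # QuantTensor; the unsqueeze (GPFQ) and transpose (GPTQ) calls in
    # update_batch expect a plain torch.Tensor, so fall back to the
    # wrapped value in that case.
    if isinstance(inp, QuantTensor):
        return inp.value
    return inp
```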

@Giuseppe5
Collaborator

Tests are failing. Is this because of the PR?
Is this PR still needed?

@Giuseppe5
Collaborator

@fabianandresgrob would you mind fixing the error so we can merge this?

@fabianandresgrob
Collaborator Author

No longer needed.
