Fix GPTQ for inputs with batch size != 1 and with seq len == 1 #3002

Contributor

@ljaljushkin ljaljushkin commented Oct 1, 2024

Changes

GPTQ now correctly processes inputs with batch size != 1, as well as inputs where both batch size and sequence length equal 1.
Also replaced the built-in Python errors raised in NNCF with NNCF-specific ones.

Reason for changes

Stable Diffusion models, e.g. runwayml/stable-diffusion-v1-5, feed their linear layers with inputs of the following shape: [2*num_images_in_prompt, text_embedding_size, hidden_dimension].

The tiny_llama example (https://github.com/openvinotoolkit/nncf/blob/develop/examples/llm_compression/openvino/tiny_llama/main.py) uses unfiltered data from wikitext, which leads to the corner case with sequence length == 1.
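
To illustrate why these two input shapes were a problem, here is a minimal NumPy sketch of the pre-fix slicing pattern (`squeeze()` followed by a 2D-style slice). The helper name `old_slice` and the concrete sizes (8, 77, 16) are illustrative assumptions, and NNCF's actual code may operate on a different tensor type than raw NumPy arrays.

```python
import numpy as np


def old_slice(inp, start, stop):
    # Pre-fix pattern: drop all size-1 axes, then slice the second axis,
    # which is only the hidden axis if exactly two axes remain.
    return inp.squeeze()[:, start:stop]


hidden = 16  # hypothetical hidden dimension
shapes = {
    "batch_1": (1, 8, hidden),          # typical LLM input, batch size 1
    "batch_2": (2, 77, hidden),         # batch size != 1 (Stable Diffusion-like)
    "batch_1_seq_1": (1, 1, hidden),    # batch size 1 and sequence length 1
}

results = {}
for name, shape in shapes.items():
    x = np.zeros(shape, dtype=np.float32)
    try:
        results[name] = old_slice(x, 0, 4).shape
    except IndexError:
        # squeeze() left a 1D array, so the 2D-style slice is invalid
        results[name] = "IndexError"

print(results)
```

Only the first case yields a `[sequence, group]` slice. With batch size 2, `squeeze()` removes nothing and the slice is applied to the sequence axis instead of the hidden one; with batch size and sequence length both equal to 1, `squeeze()` collapses the input to 1D and the slice fails.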

Related tickets

150851, 155538

Tests

  • test_compression_with_transposed_activations
  • test_compression_with_different_algo_combinations
  • test_raise_error_with_unsupported_params_for_e2m1
  • test_raise_error_with_unsupported_params_for_empty_dataset

CI

  • weight compression conformance

@ljaljushkin ljaljushkin requested a review from a team as a code owner October 1, 2024 18:44
@github-actions github-actions bot added the NNCF PT, NNCF OpenVINO, and NNCF PTQ labels Oct 1, 2024
@ljaljushkin ljaljushkin force-pushed the nl/3d_activations_weight_compression branch from 91773fd to 9add286 Compare October 1, 2024 20:02
Contributor

@alexsu52 alexsu52 left a comment

LGTM

@ljaljushkin ljaljushkin changed the title from "Support for 3D activations in data-aware weight compression" to "Fix for inputs with batch size != 1 in data-aware weight compression" Oct 8, 2024
@ljaljushkin ljaljushkin force-pushed the nl/3d_activations_weight_compression branch from 9add286 to 1214ce8 Compare October 21, 2024 13:32
Collaborator

kshpv commented Oct 22, 2024

This PR resolves 155538

@ljaljushkin ljaljushkin changed the title from "Fix for inputs with batch size != 1 in data-aware weight compression" to "Fix GPTQ for inputs with batch size != 1 and with seq len == 1" Oct 23, 2024
@@ -264,7 +264,7 @@ def _quantize_weights(
         scales.append(scale)
     else:
         if self._scale_estimation and block_compression_config.num_bits == 4:
-            activations = [inp.squeeze()[:, (i1 + i) : (i1 + i + group_size)] for inp in inputs]
+            activations = [inp[..., (i1 + i) : (i1 + i + group_size)] for inp in inputs]
Contributor Author

@ljaljushkin ljaljushkin Oct 23, 2024

Slice along the last dimension, which is expected to be the hidden one.
This is aligned with how statistics are processed in activations_to_wc_statistics, where the reduction axes are all dimensions except the last one. @nikita-savelyevv
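
A minimal NumPy sketch of that alignment, assuming statistics are reduced over every axis except the last. The helper names `slice_hidden` and `reduce_all_but_last`, the use of a plain mean as the statistic, and the group offsets are illustrative assumptions rather than NNCF code.

```python
import numpy as np


def slice_hidden(inp, start, stop):
    # Post-fix pattern: always slice the last (hidden) axis,
    # regardless of how many leading axes the input has.
    return inp[..., start:stop]


def reduce_all_but_last(x):
    # Stand-in statistic: mean over all axes except the last one.
    axes = tuple(range(x.ndim - 1))
    return x.mean(axis=axes)


rng = np.random.default_rng(0)
start, group_size = 8, 4  # hypothetical group offset and size

sliced_shapes = {}
aligned = {}
for shape in [(1, 8, 16), (3, 5, 16), (1, 1, 16)]:
    x = rng.standard_normal(shape)
    group = slice_hidden(x, start, start + group_size)
    sliced_shapes[shape] = group.shape
    # Reducing the sliced activations matches slicing the reduced statistics.
    aligned[shape] = np.allclose(
        reduce_all_but_last(group),
        reduce_all_but_last(x)[start : start + group_size],
    )

print(sliced_shapes)
print(aligned)
```

Because the slice never touches the leading axes, the same expression works for any batch size and sequence length, and the slice of the activations always corresponds to the same hidden channels as the slice of the per-channel statistics.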

[
LMLinearModel.INPUT_SHAPE,
[3, 5, 16],
[1, 1, 16],
Contributor Author

Added a test case for bug 155538, which was found with the tiny-llama example. The root cause is unfiltered data containing samples with sequence length == 1.
@kshpv

"input_shape",
[
LMLinearModel.INPUT_SHAPE,
[3, 5, 16],
Contributor Author

Test case covering the Stable Diffusion (SD) case with batch size != 1.

Collaborator

@andreyanufr andreyanufr left a comment

Tested locally with different input shapes. It works.

@ljaljushkin ljaljushkin merged commit 57e3891 into openvinotoolkit:develop Oct 24, 2024
14 checks passed