[Torch] INT4 weight compression #3014

alexsu52 · 2024-10-15T06:52:47Z

Changes

Added support for INT4 weight compression in the Torch and Torch.FX backends.
Added INT4SymmetricWeightsDecompressor and INT4AsymmetricWeightsDecompressor.
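
For readers unfamiliar with 4-bit storage, here is a minimal sketch of the general idea behind packed INT4 weights and the symmetric/asymmetric decompression arithmetic. It is plain Python for illustration only and is not the NNCF implementation. All function names, the low-nibble-first layout, and the +8 shift for signed values are assumptions made for this example.

```python
from typing import List


def pack_uint4(values: List[int]) -> bytes:
    """Pack unsigned 4-bit values (0..15) two per byte, low nibble first (assumed layout)."""
    if len(values) % 2 != 0:
        raise ValueError("expected an even number of values")
    if any(not 0 <= v <= 15 for v in values):
        raise ValueError("values must be in [0, 15]")
    return bytes(lo | (hi << 4) for lo, hi in zip(values[0::2], values[1::2]))


def unpack_uint4(packed: bytes) -> List[int]:
    """Inverse of pack_uint4: split every byte into its low and high nibble."""
    out: List[int] = []
    for byte in packed:
        out.append(byte & 0x0F)
        out.append(byte >> 4)
    return out


def pack_int4(values: List[int]) -> bytes:
    """Pack signed 4-bit values (-8..7) by shifting them into 0..15 first (assumed scheme)."""
    if any(not -8 <= v <= 7 for v in values):
        raise ValueError("values must be in [-8, 7]")
    return pack_uint4([v + 8 for v in values])


def unpack_int4(packed: bytes) -> List[int]:
    """Inverse of pack_int4: undo the +8 shift after unpacking."""
    return [v - 8 for v in unpack_uint4(packed)]


def decompress_symmetric(packed: bytes, scale: float) -> List[float]:
    """Symmetric scheme: weight = q * scale, with signed q."""
    return [q * scale for q in unpack_int4(packed)]


def decompress_asymmetric(packed: bytes, scale: float, zero_point: int) -> List[float]:
    """Asymmetric scheme: weight = (q - zero_point) * scale, with unsigned q."""
    return [(q - zero_point) * scale for q in unpack_uint4(packed)]
```

In this sketch, `decompress_symmetric(pack_int4([-8, 7]), 0.5)` and `decompress_asymmetric(pack_uint4([0, 15]), 0.5, 8)` yield the same two weights, which shows how the two schemes differ only in where the offset is stored.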

Reason for changes

Support INT4 weight compression of PyTorch models in NNCF.

Related tickets

#3005

Tests

updated tests

nncf/quantization/algorithms/weight_compression/torch_backend.py

ljaljushkin · 2024-10-23T12:34:53Z

nncf/quantization/quantize_model.py

            )

-        if backup_mode is not None:
-            raise AttributeError("TorchFX backend does not support backup_mode option.")
+        if ratio is not None and ratio != 1:


As far as I can see, Torch and Torch FX have the same processing of parameters. Does it make sense to combine them in a single function?

Thanks for this comment. We are going to refactor it when the torch.fx backend is ready to move out of the experimental module.

cc @daniil-lyakhov, @anzr299
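
The suggestion in this thread, one shared parameter check for both backends, could look roughly like the sketch below. This is a hypothetical helper, not code from the PR: the function name, its signature, and the choice of `ValueError` are assumptions for illustration, and the set of options each backend actually rejects is not shown in the diff above.

```python
from typing import Any, Dict, Optional


def check_unsupported_options(
    backend_name: str,
    options: Dict[str, Optional[Any]],
    allowed_values: Optional[Dict[str, Any]] = None,
) -> None:
    """Raise if an option is set to anything other than None or its single allowed value.

    Hypothetical helper: `options` maps option names to the values the caller passed,
    `allowed_values` maps option names to the one non-None value that is tolerated.
    """
    allowed_values = allowed_values or {}
    for name, value in options.items():
        if value is None:
            continue  # option was not set by the caller
        if name in allowed_values and value == allowed_values[name]:
            continue  # option is set, but only to the tolerated value
        raise ValueError(f"{backend_name} backend does not support the {name} option.")
```

Both backend entry points could then call the same helper with their own backend name, for example `check_unsupported_options("TorchFX", {"ratio": ratio}, {"ratio": 1})`, which mirrors the `ratio is not None and ratio != 1` condition in the hunk.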

nncf/torch/quantization/quantize_functions.py

github-actions bot added the NNCF PT and NNCF PTQ labels Oct 15, 2024

alexsu52 force-pushed the as/int4_torch_fx branch 4 times, most recently from 7906fa8 to c1d9fd7 on October 18, 2024 07:36

alexsu52 requested review from andreyanufr and ljaljushkin October 18, 2024 07:40

alexsu52 marked this pull request as ready for review October 18, 2024 08:57

alexsu52 requested a review from a team as a code owner October 18, 2024 08:57

alexsu52 force-pushed the as/int4_torch_fx branch from d4df25c to b4d13b8 on October 21, 2024 17:18

ljaljushkin reviewed Oct 23, 2024

nncf/quantization/algorithms/weight_compression/torch_backend.py

ljaljushkin reviewed Oct 23, 2024

nncf/torch/quantization/quantize_functions.py

alexsu52 force-pushed the as/int4_torch_fx branch from a6e4a47 to 2632e99 on October 24, 2024 09:41

ljaljushkin requested changes Oct 24, 2024

nncf/torch/quantization/quantize_functions.py (2 outdated comments)

alexsu52 added 9 commits October 24, 2024 14:59

added support for int4 weight compression in torch and torch.fx backends

8a8b558

updated tests

1777236

added tinyllama_int4_data_free_backend_TORCH test

ba76c09

ruff happy

e42804e

Update pipelines.py

7a7ce40

added docstrings

e97550d

updated references

4896ed7

added test_pack_uin4 and test_pack_in4

725600e

replied to comments

9a1432c

alexsu52 force-pushed the as/int4_torch_fx branch from 2632e99 to 9a1432c on October 24, 2024 12:19

alexsu52 requested a review from ljaljushkin October 24, 2024 12:20

tests passed

69b145b

ljaljushkin approved these changes Oct 24, 2024


andreyanufr approved these changes Oct 25, 2024


MaximProshin added the Code Freeze label Oct 25, 2024

alexsu52 merged commit ef9cdd2 into openvinotoolkit:develop Oct 28, 2024
14 checks passed
