Computation of compression parameters via OpenVINO models #2727
Open
nikita-savelyevv wants to merge 77 commits into openvinotoolkit:develop from nikita-savelyevv:compress-via-openvino
+2,052
−263
github-actions bot added the NNCF Common, NNCF OpenVINO and NNCF PTQ labels on Jun 11, 2024
alexsu52 reviewed on Jun 13, 2024
nncf/quantization/algorithms/weight_compression/weight_lowering.py (two outdated review threads)
nikita-savelyevv force-pushed the compress-via-openvino branch 4 times, most recently from 55cafaa to a68a63d on July 3, 2024 at 18:31
nikita-savelyevv force-pushed the compress-via-openvino branch 4 times, most recently from 6b98ddd to 3d9faa4 on July 16, 2024 at 14:19
nikita-savelyevv force-pushed the compress-via-openvino branch 6 times, most recently from 1c85732 to b527cac on September 6, 2024 at 11:11
github-actions bot added the documentation label on Sep 6, 2024
nikita-savelyevv force-pushed the compress-via-openvino branch 2 times, most recently from ac3ea02 to 2a3a63c on September 11, 2024 at 12:59
nikita-savelyevv force-pushed the compress-via-openvino branch from c9569bb to a151d99 on October 11, 2024 at 11:51
nikita-savelyevv force-pushed the compress-via-openvino branch 2 times, most recently from fe30c13 to 19ea412 on October 21, 2024 at 08:52
alexsu52 reviewed on Oct 22, 2024
nncf/quantization/algorithms/weight_compression/weight_lowering/dispatcher.py (two outdated review threads)
nikita-savelyevv force-pushed the compress-via-openvino branch 3 times, most recently from eef34f8 to ca3447c on October 26, 2024 at 13:40
nikita-savelyevv force-pushed the compress-via-openvino branch from ca3447c to f3891cd on October 29, 2024 at 15:19
This reverts commit 9a56fae.
nikita-savelyevv changed the title from "Generalize weight compression via OpenVINO submodels" to "Computation of compression parameters via OpenVINO models" on Dec 12, 2024
Labels: documentation, NNCF Common, NNCF OpenVINO, NNCF PTQ
Changes
- `weight_lowering.py`:
  - `do_int_quantization()` is used for computing a compressed weight. Possible signatures:
    - `weight` -> `compressed_weight`, `scale`, (`zero_point` for asymmetric compression)
    - `weight`, `scale`, (`zero_point`) -> `compressed_weight`, `scale`, (`zero_point`)
  - `calculate_quantized_dequantized_weight()` is used for computing a decompressed weight. Possible signatures:
    - `weight` -> `decompressed_weight`
    - `weight`, `scale`, (`zero_point`) -> `decompressed_weight`
    - `weight` -> `decompressed_weight`, `compressed_weight`, `scale`, (`zero_point`)
    - `weight`, `scale`, (`zero_point`) -> `decompressed_weight`, `compressed_weight`, `scale`, (`zero_point`)
  - The output `scale` and `zero_point` are the same as the ones given as input (if they were given at all).
- An NNCF Tensor backend for `openvino.Tensor`. The implementation for this backend is limited to the required functionality; for example, addition of OV Tensors is not supported because it is not needed.
- Support for the `bf16`, `u4` and `i4` data types. For example, `bf16` constants are read from an OpenVINO LLM and given as inputs to a compressing OpenVINO model. `u4` and `i4` compressed weights are seamlessly inserted into the resulting compressed OpenVINO model.
- A `tensor.to_backend()` method to convert an NNCF Tensor from one backend to another. Currently only OV<->NP conversion is required.

Data-free asymmetric compression: [figure not preserved]
Data-free symmetric compression: [figure not preserved]
Data-aware compression: [figure not preserved]
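To make the listed signatures concrete, here is a minimal NumPy sketch of row-wise integer quantization with optional precomputed parameters. It is not the code from this PR: the helper names `quantize`, `dequantize` and `quantize_dequantize`, the `num_bits`, `asymmetric` and `return_compressed` parameters, and the per-row min/max scheme are all assumptions chosen for illustration.

```python
import numpy as np


def quantize(weight, num_bits=4, asymmetric=True, scale=None, zero_point=None):
    """Row-wise integer quantization sketch (hypothetical helper, not NNCF code)."""
    weight = np.asarray(weight, dtype=np.float32)
    if asymmetric:
        low, high = 0, 2**num_bits - 1
        if scale is None:
            # No parameters given: derive them from the per-row value range.
            w_min = weight.min(axis=-1, keepdims=True)
            w_max = weight.max(axis=-1, keepdims=True)
            scale = (w_max - w_min) / (high - low)
            scale = np.where(scale == 0, 1.0, scale).astype(np.float32)
            zero_point = np.clip(np.round(low - w_min / scale), low, high).astype(np.uint8)
        elif zero_point is None:
            raise ValueError("asymmetric mode needs zero_point together with scale")
        q = np.round(weight / scale + np.asarray(zero_point, dtype=np.float32))
        # Provided scale and zero_point are returned unchanged.
        return np.clip(q, low, high).astype(np.uint8), scale, zero_point
    low, high = -(2 ** (num_bits - 1)), 2 ** (num_bits - 1) - 1
    if scale is None:
        scale = np.abs(weight).max(axis=-1, keepdims=True) / high
        scale = np.where(scale == 0, 1.0, scale).astype(np.float32)
    q = np.clip(np.round(weight / scale), low, high).astype(np.int8)
    return q, scale, None  # symmetric mode has no zero point


def dequantize(q, scale, zero_point=None):
    """Map integer levels back to floating point."""
    q = q.astype(np.float32)
    if zero_point is not None:
        q = q - np.asarray(zero_point, dtype=np.float32)
    return q * scale


def quantize_dequantize(weight, return_compressed=False, **kwargs):
    """Return the decompressed weight, optionally with the intermediate results."""
    q, scale, zero_point = quantize(weight, **kwargs)
    decompressed = dequantize(q, scale, zero_point)
    if return_compressed:
        return decompressed, q, scale, zero_point
    return decompressed
```

Calling `quantize(w)` covers the first signature, while `quantize(w, scale=s, zero_point=zp)` covers the second and hands back the same `s` and `zp` objects it received.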
Reason for changes
Reducing model compression time. Only the OpenVINO model compression backend is affected.
Related tickets
139047
Tests
- `tests/openvino/native/quantization/test_ov_modeling_compression.py::test_quantization_alignment` -- checks alignment with the reference numpy implementation
- `tests/openvino/native/test_openvino_modeling.py` -- checks OV modeling framework hyperparameters
- `tests/openvino/native/test_tensor.py` -- NNCF OV Tensor backend tests

Validation jobs:
- NNCF/job/manual/job/post_training_weight_compression/286/
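As a rough illustration of what an alignment check against a reference implementation can look like, the sketch below quantizes the same weight in `float64` and in `float32` and reports the share of elements that end up on different integer levels. It is not taken from `test_quantization_alignment`: the `quantize_sym` and `mismatch_fraction` helpers are hypothetical, and the two float precisions merely stand in for the reference path and the accelerated path.

```python
import numpy as np


def quantize_sym(weight, dtype, num_bits=4):
    """Row-wise symmetric quantization computed in the given float dtype."""
    w = np.asarray(weight, dtype=dtype)
    high = 2 ** (num_bits - 1) - 1
    scale = np.abs(w).max(axis=-1, keepdims=True) / dtype(high)
    scale = np.where(scale == 0, dtype(1), scale)  # avoid division by zero
    return np.clip(np.round(w / scale), -high - 1, high).astype(np.int8)


def mismatch_fraction(candidate, reference):
    """Share of elements where two quantized tensors disagree."""
    return float(np.mean(candidate != reference))


rng = np.random.default_rng(0)
weight = rng.standard_normal((64, 128)).astype(np.float32)
reference = quantize_sym(weight, np.float64)  # stand-in for the reference path
candidate = quantize_sym(weight, np.float32)  # stand-in for the accelerated path
fraction = mismatch_fraction(candidate, reference)
```

A check of this kind usually tolerates a small nonzero `fraction`, since values that sit almost exactly between two levels can round differently when the arithmetic precision changes.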