forked from openvinotoolkit/nncf
Dl/conv layer attrs update #23
Closed
daniil-lyakhov wants to merge 131 commits into dl/channel_alignment_improvements_full from dl/conv_layer_attrs_update
daniil-lyakhov force-pushed the dl/conv_layer_attrs_update branch 2 times, most recently from 5437420 to bb403da on August 24, 2023 13:44
daniil-lyakhov force-pushed the dl/channel_alignment_improvements_full branch 3 times, most recently from 1eb303f to dd84e18 on August 25, 2023 11:19
daniil-lyakhov force-pushed the dl/conv_layer_attrs_update branch from bb403da to 9da200c on August 25, 2023 12:12
openvinotoolkit#2073) ### Changes * ChannelAlignment algorithm is enabled by default * Biases are added only for operations that are affected by the CA algorithm ### Reason for changes * To improve model metrics by enabling the ChannelAlignment algorithm by default ### Related tickets 114328 114583 ### Tests tests/post_training/test_templates/test_channel_alignment.py is updated
daniil-lyakhov force-pushed the dl/conv_layer_attrs_update branch 2 times, most recently from 300c01a to 669f0a9 on September 8, 2023 12:43
Refactor smooth quant to use weights layout Tests
### Changes - Added new operation - GroupNormalization ### Reason for changes - Performance degradation caused by an incorrect quantization scheme - New operation support ### Related tickets - 119821 - 119335 ### Tests - TBD
### Changes
Disable MaskRCNN and RetinaNet graph tests until ticket 119664 is resolved. The following tests are now excluded:
- test_compressed_graph.py::TestModelsGraph::test_quantize_network[w_sym_t_a_sym_t-retinanet]
- test_compressed_graph.py::TestModelsGraph::test_quantize_network[w_sym_t_a_sym_t-mask_rcnn]
- test_compressed_graph.py::TestModelsGraph::test_quantize_network[w_sym_ch_a_asym_t-retinanet]
- test_compressed_graph.py::TestModelsGraph::test_quantize_network[w_sym_ch_a_asym_t-mask_rcnn]
- test_compressed_graph.py::TestModelsGraph::test_magnitude_sparsity_network[retinanet]
- test_compressed_graph.py::TestModelsGraph::test_magnitude_sparsity_network[mask_rcnn]
- test_compressed_graph.py::TestModelsGraph::test_rb_sparsity_network[retinanet]
- test_compressed_graph.py::TestModelsGraph::test_rb_sparsity_network[mask_rcnn]
- test_compressed_graph.py::TestModelsGraph::test_pruning_network[retinanet]
- test_compressed_graph.py::test_quantize_outputs[w_sym_t_a_sym_t-retinanet]
- test_compressed_graph.py::test_quantize_outputs[w_sym_ch_a_asym_t-retinanet]
…toolkit#2118) ### Changes Add message about deprecation of `export_to_onnx_standard_ops` option in NNCFConfig ### Reason Recommended way to export to onnx with QuantizeLinear-DequantizeLinear node pairs is `nncf.strip(quantized_model)`.
openvinotoolkit#2115) ### Changes NNCF should not quantize GRU ops with linear_before_reset set to true, since oneDNN does not support it yet ### Reason for changes To align with POT ### Related bug openvinotoolkit#2105 ### Tests Added `test_ignore_nodes_by_attribues` for OV backend
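The attribute-based exclusion described in this commit can be pictured with a small standalone sketch. It does not use NNCF itself: the `Node` class, the `IGNORED_BY_ATTRIBUTES` table, and the `GRUSequence` op-type string are hypothetical names chosen for illustration. Only the rule (skip GRU ops whose `linear_before_reset` is true) comes from the commit message.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class Node:
    """Hypothetical stand-in for a graph node: a name, an op type and its attributes."""
    name: str
    op_type: str
    attrs: Dict[str, Any] = field(default_factory=dict)


# Hypothetical rule table: op type -> attribute values that disable quantization.
IGNORED_BY_ATTRIBUTES: Dict[str, Dict[str, Any]] = {
    "GRUSequence": {"linear_before_reset": True},
}


def is_ignored(node: Node) -> bool:
    """Return True when every attribute in the rule for this op type matches the node."""
    rule = IGNORED_BY_ATTRIBUTES.get(node.op_type)
    if rule is None:
        return False
    return all(node.attrs.get(key) == value for key, value in rule.items())


def quantizable_names(nodes: List[Node]) -> List[str]:
    """Names of the nodes that remain candidates for quantization."""
    return [node.name for node in nodes if not is_ignored(node)]


nodes = [
    Node("gru_a", "GRUSequence", {"linear_before_reset": True}),
    Node("gru_b", "GRUSequence", {"linear_before_reset": False}),
    Node("matmul", "MatMul"),
]
print(quantizable_names(nodes))
```

Keeping the rules in a table keyed by op type means a further unsupported attribute combination would only need a new entry, not a new branch.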
…oolkit#2123) ### Changes Added Whisper notebook to the list of quantization samples
### Changes - Add marks `nightly` and `weakly` for tests. - Mark sanity tests as `nightly` - Split `test_functions.TestParametrized` into a fast part for precommit and a long part for nightly - Time of torch precommit reduced from 60 to 40 mins - Set `xfail` for sanity tests with `--mode train` in case of a segmentation fault. A sporadic segmentation fault reproduces on torch>=2.0.0 when calling the `backward` function. ### Related tickets 119128
### Changes Added the link to Quantization with accuracy control using NNCF notebooks. ### Reason for changes Customer adoption ### Related tickets N/A ### Tests N/A
### Changes Fixed problem with shared weights in compression. ### Reason for changes Problem with some LLMs with shared weights. ### Related tickets ### Tests
…t#2086) ### Changes - Add support for the `dump_intermediate_model` parameter to save fully quantized model in the AAQ pipeline ### Reason for changes - Alignment with POT ### Related tickets N/A ### Tests N/A
### Changes
1. Fixed an issue with wrong `tqdm` bar length in the case when the calibration dataset length is less than `subset_size`. Reproducer: nikita-savelyevv@f0951c1
   **Before:** `Statistics collection: 34%|██████ | 101/300 [00:03<00:06, 28.66it/s]`
   **After:**
   When the dataset has `__len__`: `Statistics collection: 100%|██████████████████| 101/101 [00:03<00:00, 28.20it/s]`
   When the dataset doesn't have `__len__`: `Statistics collection: 34%|██████ | 101/300 [00:03<00:06, 29.45it/s]`
2. Improved the progress bar GUI when run from notebooks.
   **Before:** <img width="704" alt="Screenshot 2023-09-06 091857" src="https://github.com/openvinotoolkit/nncf/assets/23343961/9851cb8d-00f1-4297-af50-14697e86e961"> or (in some browsers the progress bar takes up multiple lines): ![image](https://github.com/openvinotoolkit/nncf/assets/23343961/99fa9629-2869-4d8f-872e-97ef59bc092e)
   **After:** <img width="706" alt="Screenshot 2023-09-06 105453" src="https://github.com/openvinotoolkit/nncf/assets/23343961/58e75cc9-2507-4c5b-8c3c-cac44eefcb79">
   In the console the progress bar is the same.
### Reason for changes
User experience improvement.
### Related tickets
112627
### Tests
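The first fix comes down to how the progress-bar total is chosen. Below is a minimal sketch of that choice, assuming the total is the smaller of the dataset length and `subset_size` when the dataset is sized, and `subset_size` otherwise. The helper name `progress_total` is hypothetical and is not taken from the NNCF code base.

```python
from typing import Iterable, Optional


def progress_total(dataset: Iterable, subset_size: int) -> int:
    """Pick the progress-bar length for statistics collection (illustrative helper)."""
    try:
        dataset_len: Optional[int] = len(dataset)  # type: ignore[arg-type]
    except TypeError:
        # Generators and other unsized iterables do not support len().
        dataset_len = None
    if dataset_len is None:
        return subset_size
    return min(dataset_len, subset_size)


sized_dataset = list(range(101))
unsized_dataset = (i for i in range(101))
print(progress_total(sized_dataset, 300))    # sized: bar ends at the dataset length
print(progress_total(unsized_dataset, 300))  # unsized: bar keeps subset_size
```

With an unsized dataset the bar can still stop short of its total, which matches the second "After" line in the commit message.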
### Changes Upgrade ultralytics to 8.0.170 ### Reason for changes For some reason yolo samples started to fail. Upgrading ultralytics solves this issue because the later version contains these changes: ultralytics/ultralytics@a741961 ### Related tickets 120311 ### Tests Build 82 passed
### Changes Removal of upper bounds from `scipy` version. ### Reason for changes - `scipy<1.11.1` has security vulnerability (see ticket) - The upper bound is causing pip conflicts in openvinotoolkit/openvino#19458 ### Related tickets 117438
### Changes - Fixed behaviour in the `calibrate.py` for algos without options ### Reason for changes - Bugfix ### Related tickets - 120295 ### Tests
…m properly by make command (openvinotoolkit#2127) ### Changes All tests from `tests/experimental/{backend}/` are moved to the directories `tests/{backend}/experimental` ### Reason for changes To enable these tests when the make command is called. These tests are not run in precommit on the current develop branch
### Changes Skip cuda test if cuda is not available ### Reason for changes To fix CPU pre-commit ### Tests precommit_torch_cpu/169/ is finished successfully
### Changes ### Reason for changes ### Related tickets 117723 ### Tests
### Changes Extends `ModelInputInfo` mechanism used to specify inputs to `NNCFNetwork` for graph building/exporting - now the input info can be specified either as `FillerInputInfo`, which functions pretty much the same as before and uses NNCF config file as the source of specification for the input tensors, or as `ExactInputInfo`, which allows to specify exact forward arguments for graph building. The latter is used to build the model graph based on outputs of dataloaders attached to `NNCFConfig` in the QAT API if the "input_info" field is not specified in `NNCFConfig`, and also in the PTQ API flow to build the graph based on the output of the calibration dataset. ### Reason for changes Previously the PTQ API had to specify own `wrap_inputs_fn`, `wrap_outputs_fn`, `dummy_forward_fn` to make NNCFNetwork build its graph based on the outputs of the calibration dataloader - these functions had to be mostly copy-pasted from the QAT approach to preserve basic NNCF PT functionality such as traced tensor expiry, same tensor replication etc. The new approach allows code reuse. Also the QAT use cases where the init dataloaders are specified are made easier since "input_info" fields in the NNCFConfig may now be omitted. ### Related tickets N/A ### Tests tests.torch.test_graph_building.test_input_info_args_are_passed_into_forward tests.torch.test_graph_building.test_filler_input_info_arg_generation tests.torch.test_graph_building.test_compressed_model_creation_can_build_exact_input_infos_from_dataloader_in_config tests.torch.ptq.test_quantize_model_helpers.test_create_nncf_network_with_nncf_dataset
### Changes - Updated SmoothQuant algorithm to work with Convolution layers; ### Reason for changes - Better accuracy results in some cases; ### Related tickets - 113591 ### Tests --------- Co-authored-by: Liubov Talamanova <[email protected]>
### Changes Added `Concat` to `MULTIHEAD_ATTENTION_OUTPUT` ignored pattern for OV, ONNX, Torch backends ### Reason for changes To improve accuracy of https://huggingface.co/EleutherAI/gpt-neo-1.3B model ### Related tickets * 117617
### Changes - Added new files for 2023.2 scale references (only layer names were changed) instead of the symlinks; - Changed layer names for existing 2023.2 references; ### Reason for changes - Alignment with the newest OV version
### Changes As stated in the title ### Reason for changes PTQ PT CUDA test cases fail ### Related tickets 124679 ### Tests test_input_infos_respect_device_setting
…t#2250) ### Changes Fixed a regression introduced in openvinotoolkit#2196 for the object detection samples and bumped the `datasets` version for the movement sparsity tests to fix a `Loading a dataset cached in a LocalFileSystem is not supported` error in the associated test cases. ### Reason for changes Torch nightly tests fail otherwise. ### Related tickets N/A ### Tests torch_nightly
### Changes Allow the use of an external weight importance information for reordering weights of the super-network. Adds missing info in experimental schema for previously committed KD. ### Reason for changes Several advanced algorithms can produce weight importance information that outperform L1/L2 weight reordering strategies. This PR allows the use of external weight importance information to reorder the weights in the super-network. ### Related tickets N/A ### Tests Tests have been included. --------- Co-authored-by: Yuan Jinjie <[email protected]>
…vinotoolkit#2246) ### Changes - Do not filter constant nodes for torch backend in the inference graph - Fix version in requirements.txt for the post_training_quantization examples - torch 2.1.0 cannot be used for ssd300_vgg16 (export to ONNX fails with `Unsupported: ONNX export of operator get_pool_ceil_padding`, and tracing is not supported either) - Update metrics - PTEngine now converts inputs to the model's device to match the behavior of `create_compress_model` - The Mobilenet_v2 example converts the PyTorch model to IR by tracing (without ONNX). - nncf.quantize for PyTorch works with a copy of the target model ### Reason for changes To make PTQ work properly with disconnected graphs (like in [example](https://github.com/openvinotoolkit/nncf/blob/develop/examples/post_training_quantization/torch/ssd300_vgg16/main.py)) ### Related tickets 124417 ### Tests test_examples build 128 --------- Co-authored-by: Alexander Dokuchaev <[email protected]>
…inotoolkit#2220) ### Changes As stated in the title ### Reason for changes This doesn't seem obvious to some developers, so will state this in the style guide. ### Related tickets N/A ### Tests N/A
### Changes Introduced `nncf.torch.wrap_model(model: torch.nn.Module, example_input: Any) -> NNCFNetwork` ### Reason for changes Making it easier to obtain `NNCFNetwork`. ### Related tickets N/A ### Tests test_wrap_model.py
### Changes Networkx was updated to allow 3.1, pyparsing limitation was removed. Will now replace the disallowed colon symbols `:` during reads and writes of .dot graphs. ### Reason for changes OV is now at the networkx 3.1, and we should be aligned at least on the major version for better DX. ### Related tickets 69520 ### Tests Existing graph-checking tests
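One way to picture the colon replacement on `.dot` reads and writes is a pair of symmetric helpers. This is only a sketch: the placeholder character `^` and the function names are assumptions for illustration, and the round trip is exact only if original names never contain the placeholder.

```python
DISALLOWED = ":"
PLACEHOLDER = "^"  # assumed replacement character, not confirmed by the commit message


def escape_for_dot(name: str) -> str:
    """Replace colons before a node name is written to a .dot file."""
    return name.replace(DISALLOWED, PLACEHOLDER)


def unescape_from_dot(name: str) -> str:
    """Restore colons after a node name is read back from a .dot file."""
    return name.replace(PLACEHOLDER, DISALLOWED)


original = "model/conv1::weight"
written = escape_for_dot(original)
restored = unescape_from_dot(written)
print(written)
print(restored == original)
```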
…penvinotoolkit#2253) ### Changes Supports multi-device model inference and wrapped forward functions ### Reason for changes Support tracing "bigscience/bloomz-560m" model from HF ### Related tickets N/A ### Tests test_no_self_forward, test_multidevice_model
### Changes Use built-in `tmp_path` for temporary files to fix NAS tests on Windows ### Reason for changes The PR (openvinotoolkit#2234) introduced a new test which fails on Windows with error: `PermissionError: [Errno 13] Permission denied: 'C:\\Users\\SYS_K8~1\\AppData\\Local\\Temp\\tmpmf1i25nd'` ### Related tickets 124904 ### Tests NAS tests on Windows
### Changes Allow torchvision 0.16 in the examples ### Reason for changes Otherwise the installation of the requirements for the torch examples tries to install torchvision 0.16, which pulls the torch 2.0.1 which is different from the BKC torch v2.1 ### Related tickets N/A ### Tests torch_nightly, torch E2E
) ### Changes Exclude from weight compression nodes that have more than one reduction axis ### Reason for changes There's only one model that has multiple reduction axes. It's `chatglm` with one embedding layer having [8132,32,2] shape. It was decided not to quantize this layer, since it would save just 6Mb in a 4Gb model in the case of int8 quantization, with a risk of reducing accuracy, and it can't be quantized group-wise. Support for multiple reduction axes can be added when it is really needed. ### Related tickets n/a ### Tests Tested on 104 models from the share with IRs for LLM models. In all cases except chatglm there's a single reduction axis.
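The exclusion rule can be sketched without NNCF. The assumption here, not stated in the commit message, is that the reduction axes are all weight axes other than the channel axes. The helper names are hypothetical.

```python
from typing import List, Sequence


def reduction_axes(shape: Sequence[int], channel_axes: Sequence[int]) -> List[int]:
    """All axes of the weight that are not channel axes (assumed definition)."""
    return [axis for axis in range(len(shape)) if axis not in channel_axes]


def should_compress(shape: Sequence[int], channel_axes: Sequence[int]) -> bool:
    """Skip weights that would need reduction over more than one axis."""
    return len(reduction_axes(shape, channel_axes)) <= 1


# The chatglm embedding shape from the commit message, with axis 0 assumed as the channel axis.
print(should_compress((8132, 32, 2), channel_axes=(0,)))
# An ordinary 2-D weight has a single reduction axis and stays eligible.
print(should_compress((4096, 1024), channel_axes=(0,)))
```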
### Changes Remove logic to set device in `PTEngine`, to support multi-device model openvinotoolkit#2253
### Changes renamed name to node_name in the warning ### Reason for changes chatglm model support ### Related tickets 125045 ### Tests test_not_quantize_with_multiple_reduction_axes
daniil-lyakhov force-pushed the dl/conv_layer_attrs_update branch 3 times, most recently from 3263f53 to bdeb0c5 on November 15, 2023 12:39
Labels
dependencies
documentation
Improvements or additions to documentation
experimental
NNCF Common
NNCF ONNX
NNCF OpenVINO
NNCF PT
NNCF PTQ
NNCF TF
Changes
Reason for changes
Related tickets
Tests