Dl/conv layer attrs update #23

daniil-lyakhov · 2023-08-22T17:26:22Z

Changes

Reason for changes

Related tickets

Tests

openvinotoolkit#2073)

### Changes
* ChannelAlignment algorithm is enabled by default
* Biases are added only for operations that are affected by the CA algorithm

### Reason for changes
* To improve model metrics by using the ChannelAlignment algorithm by default

### Related tickets
114328
114583

### Tests
tests/post_training/test_templates/test_channel_alignment.py is updated

Refactor smooth quant to use weights layout Tests

### Changes
- Added new operation - GroupNormalization

### Reason for changes
- Performance degradations caused by an incorrect quantization scheme
- New operation support

### Related tickets
- 119821
- 119335

### Tests
- TBD

### Changes
Disable MaskRCNN and RetinaNet graph tests until ticket 119664 is resolved.

The following tests are now excluded:
- test_compressed_graph.py::TestModelsGraph::test_quantize_network[w_sym_t_a_sym_t-retinanet]
- test_compressed_graph.py::TestModelsGraph::test_quantize_network[w_sym_t_a_sym_t-mask_rcnn]
- test_compressed_graph.py::TestModelsGraph::test_quantize_network[w_sym_ch_a_asym_t-retinanet]
- test_compressed_graph.py::TestModelsGraph::test_quantize_network[w_sym_ch_a_asym_t-mask_rcnn]
- test_compressed_graph.py::TestModelsGraph::test_magnitude_sparsity_network[retinanet]
- test_compressed_graph.py::TestModelsGraph::test_magnitude_sparsity_network[mask_rcnn]
- test_compressed_graph.py::TestModelsGraph::test_rb_sparsity_network[retinanet]
- test_compressed_graph.py::TestModelsGraph::test_rb_sparsity_network[mask_rcnn]
- test_compressed_graph.py::TestModelsGraph::test_pruning_network[retinanet]
- test_compressed_graph.py::test_quantize_outputs[w_sym_t_a_sym_t-retinanet]
- test_compressed_graph.py::test_quantize_outputs[w_sym_ch_a_asym_t-retinanet]

…toolkit#2118)

### Changes
Add a message about the deprecation of the `export_to_onnx_standard_ops` option in NNCFConfig

### Reason
The recommended way to export to ONNX with QuantizeLinear-DequantizeLinear node pairs is `nncf.strip(quantized_model)`.

openvinotoolkit#2115)

### Changes
NNCF should not quantize GRU ops with `linear_before_reset` set to true, since oneDNN does not support it yet

### Reason for changes
To align with POT

### Related bug
openvinotoolkit#2105

### Tests
Added `test_ignore_nodes_by_attribues` for OV backend
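The rule in this commit can be pictured as an attribute-based filter over graph nodes. The sketch below is only an illustration: `Node`, `should_skip_quantization`, `quantizable_nodes`, and the `"GRU"` type string are hypothetical names chosen for the example, not NNCF or OpenVINO API.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class Node:
    """Hypothetical graph node used for illustration; not an NNCF class."""

    name: str
    op_type: str
    attributes: Dict[str, Any] = field(default_factory=dict)


def should_skip_quantization(node: Node) -> bool:
    """Skip GRU nodes that have linear_before_reset enabled."""
    return node.op_type == "GRU" and bool(node.attributes.get("linear_before_reset", False))


def quantizable_nodes(nodes: List[Node]) -> List[Node]:
    """Keep every node except the ones matched by the skip rule."""
    return [n for n in nodes if not should_skip_quantization(n)]


nodes = [
    Node("gru_a", "GRU", {"linear_before_reset": True}),
    Node("gru_b", "GRU", {"linear_before_reset": False}),
    Node("conv", "Convolution"),
]
kept = [n.name for n in quantizable_nodes(nodes)]
```

Here only `gru_a` is filtered out; a GRU without the attribute set, and any non-GRU node, stays eligible.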

…oolkit#2123)

### Changes
Added Whisper notebook to the list of quantization samples

### Changes
- Add marks `nightly` and `weekly` for tests.
- Mark sanity tests as `nightly`
- Split `test_functions.TestParametrized` into fast tests for precommit and long tests for nightly
- Time of torch precommit reduced from 60 to 40 mins
- Set `xfail` for sanity tests with `--mode train` in case of a segmentation fault. A sporadic segmentation fault is reproduced on torch>=2.0.0 when calling the `backward` function.

### Related tickets
119128

### Changes
Added the link to Quantization with accuracy control using NNCF notebooks.

### Reason for changes
Customer adoption

### Related tickets
N/A

### Tests
N/A

### Changes
Fixed problem with shared weights in compression.

### Reason for changes
Problem with some LLMs with shared weights.

### Related tickets

### Tests

…t#2086)

### Changes
- Add support for the `dump_intermediate_model` parameter to save the fully quantized model in the AAQ pipeline

### Reason for changes
- Alignment with POT

### Related tickets
N/A

### Tests
N/A

### Changes
1. Fixed an issue with wrong `tqdm` bar length in the case when the calibration dataset length is less than `subset_size`. Reproducer: nikita-savelyevv@f0951c1

   **Before:**
   `Statistics collection: 34%|██████ | 101/300 [00:03<00:06, 28.66it/s]`

   **After:**
   When dataset has `__len__`:
   `Statistics collection: 100%|██████████████████| 101/101 [00:03<00:00, 28.20it/s]`
   When dataset doesn't have `__len__`:
   `Statistics collection: 34%|██████ | 101/300 [00:03<00:06, 29.45it/s]`

2. Improved progress bar GUI when run from notebooks.

   **Before:**
   <img width="704" alt="Screenshot 2023-09-06 091857" src="https://github.com/openvinotoolkit/nncf/assets/23343961/9851cb8d-00f1-4297-af50-14697e86e961">
   or (in some browsers progress bar takes up multiple lines):
   ![image](https://github.com/openvinotoolkit/nncf/assets/23343961/99fa9629-2869-4d8f-872e-97ef59bc092e)

   **After:**
   <img width="706" alt="Screenshot 2023-09-06 105453" src="https://github.com/openvinotoolkit/nncf/assets/23343961/58e75cc9-2507-4c5b-8c3c-cac44eefcb79">

   In console the progress bar is the same.

### Reason for changes
User experience improvement.

### Related tickets
112627

### Tests
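The behaviour described in item 1 comes down to how the progress bar total is chosen. A minimal sketch, assuming the total is computed by a helper like the hypothetical `progress_total` below (this is not the actual NNCF code):

```python
from typing import Iterable


def progress_total(dataset: Iterable, subset_size: int) -> int:
    """Pick the progress bar total for statistics collection."""
    try:
        # Sized datasets: never promise more steps than there are samples.
        return min(len(dataset), subset_size)
    except TypeError:
        # No __len__ (e.g. a generator): the length is unknown up front,
        # so fall back to subset_size.
        return subset_size


sized_total = progress_total(list(range(101)), 300)
unsized_total = progress_total((i for i in range(101)), 300)
```

With 101 samples and `subset_size=300`, the sized dataset yields a total of 101 and the generator yields 300, matching the `101/101` and `101/300` bars quoted above.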

### Changes
Upgrade ultralytics to 8.0.170

### Reason for changes
For some reason yolo samples started to fail. Upgrading ultralytics solves this issue because the later version contains these changes: ultralytics/ultralytics@a741961

### Related tickets
120311

### Tests
Build 82 passed

### Changes
Removal of the upper bound on the `scipy` version.

### Reason for changes
- `scipy<1.11.1` has a security vulnerability (see ticket)
- The upper bound is causing pip conflicts in openvinotoolkit/openvino#19458

### Related tickets
117438

### Changes
- Fixed behaviour in the `calibrate.py` for algos without options

### Reason for changes
- Bugfix

### Related tickets
- 120295

### Tests

…m properly by make command (openvinotoolkit#2127)

### Changes
All tests from `tests/experimental/{backend}/` are moved to directories `tests/{backend}/experimental`

### Reason for changes
To enable these tests when the make command is called. These tests are not running in precommit on the current develop branch

)

### Changes
Make StatisticsAggregator keep the original tensor shape after aggregation.

### Reason for changes
To support correct handling of statistics in case batch_size > 1.

### Related tickets
121650

### Tests
All tests are updated accordingly

### Changes
Skip cuda test if cuda is not available

### Reason for changes
To fix CPU pre-commit

### Tests
precommit_torch_cpu/169/ is finished successfully

### Changes

### Reason for changes

### Related tickets
117723

### Tests

### Changes
Extends the `ModelInputInfo` mechanism used to specify inputs to `NNCFNetwork` for graph building/exporting. The input info can now be specified either as `FillerInputInfo`, which functions pretty much the same as before and uses the NNCF config file as the source of specification for the input tensors, or as `ExactInputInfo`, which allows specifying exact forward arguments for graph building. The latter is used to build the model graph based on outputs of dataloaders attached to `NNCFConfig` in the QAT API if the "input_info" field is not specified in `NNCFConfig`, and also in the PTQ API flow to build the graph based on the output of the calibration dataset.

### Reason for changes
Previously the PTQ API had to specify its own `wrap_inputs_fn`, `wrap_outputs_fn`, `dummy_forward_fn` to make NNCFNetwork build its graph based on the outputs of the calibration dataloader. These functions had to be mostly copy-pasted from the QAT approach to preserve basic NNCF PT functionality such as traced tensor expiry, same tensor replication etc. The new approach allows code reuse. Also, the QAT use cases where the init dataloaders are specified are made easier since "input_info" fields in the NNCFConfig may now be omitted.

### Related tickets
N/A

### Tests
- tests.torch.test_graph_building.test_input_info_args_are_passed_into_forward
- tests.torch.test_graph_building.test_filler_input_info_arg_generation
- tests.torch.test_graph_building.test_compressed_model_creation_can_build_exact_input_infos_from_dataloader_in_config
- tests.torch.ptq.test_quantize_model_helpers.test_create_nncf_network_with_nncf_dataset

### Changes
- Updated SmoothQuant algorithm to work with Convolution layers;

### Reason for changes
- Better accuracy results in some cases;

### Related tickets
- 113591

### Tests

---------

Co-authored-by: Liubov Talamanova <[email protected]>

### Changes
Added `Concat` to the `MULTIHEAD_ATTENTION_OUTPUT` ignored pattern for OV, ONNX, Torch backends

### Reason for changes
To improve accuracy of the https://huggingface.co/EleutherAI/gpt-neo-1.3B model

### Related tickets
* 117617

### Changes
- Added new files for 2023.2 scale references (only layer names were changed) instead of the symlinks;
- Changed layer names for existing 2023.2 references;

### Reason for changes
- Alignment with the newest OV version

### Changes
As stated in the title

### Reason for changes
PTQ PT CUDA test cases fail

### Related tickets
124679

### Tests
test_input_infos_respect_device_setting

…t#2250)

### Changes
Fixed a regression introduced in openvinotoolkit#2196 for the object detection samples and bumped the `datasets` version for the movement sparsity tests to fix a `Loading a dataset cached in a LocalFileSystem is not supported` error in the associated test cases.

### Reason for changes
Torch nightly tests fail otherwise.

### Related tickets
N/A

### Tests
torch_nightly

### Changes
Allow the use of external weight importance information for reordering weights of the super-network. Adds missing info in the experimental schema for previously committed KD.

### Reason for changes
Several advanced algorithms can produce weight importance information that outperforms L1/L2 weight reordering strategies. This PR allows the use of external weight importance information to reorder the weights in the super-network.

### Related tickets
N/A

### Tests
Tests have been included.

---------

Co-authored-by: Yuan Jinjie <[email protected]>
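The idea of swapping an L1 score for externally supplied importance can be sketched in plain Python. This is an illustration under assumptions, not the NNCF implementation: `l1_importance`, `reorder`, and the row-per-channel list layout are hypothetical.

```python
from typing import List, Optional, Sequence


def l1_importance(weights: Sequence[Sequence[float]]) -> List[float]:
    """Default score: L1 norm of each row (one row per output channel)."""
    return [sum(abs(w) for w in row) for row in weights]


def reorder(
    weights: Sequence[Sequence[float]],
    external_importance: Optional[Sequence[float]] = None,
) -> List[List[float]]:
    """Sort rows from most to least important.

    Uses the external scores when given, otherwise falls back to L1.
    """
    if external_importance is not None:
        importance = list(external_importance)
    else:
        importance = l1_importance(weights)
    if len(importance) != len(weights):
        raise ValueError("one importance score is required per row")
    order = sorted(range(len(weights)), key=lambda i: importance[i], reverse=True)
    return [list(weights[i]) for i in order]


weights = [[0.1, 0.1], [2.0, -2.0], [0.5, 0.5]]
by_l1 = reorder(weights)
by_external = reorder(weights, external_importance=[3.0, 1.0, 2.0])
```

The two calls produce different orderings for the same weights, which is the point of accepting an external signal.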

…vinotoolkit#2246)

### Changes
- Do not filter constant nodes for torch backend in the inference graph
- Fix version in requirements.txt for examples of post_training_quantization
- torch 2.1.0 cannot be used for ssd300_vgg16 (export to ONNX fails with `Unsupported: ONNX export of operator get_pool_ceil_padding`, and tracing is not supported either)
- Update metrics
- Make PTEngine convert inputs to the model's device to sync behavior with `create_compress_model`
- Mobilenet_v2 example converts the PyTorch model to IR by tracing (without ONNX).
- nncf.quantize for PyTorch works with a copy of the target model

### Reason for changes
To make PTQ work properly with disconnected graphs (like in [example](https://github.com/openvinotoolkit/nncf/blob/develop/examples/post_training_quantization/torch/ssd300_vgg16/main.py))

### Related tickets
124417

### Tests
test_examples build 128

---------

Co-authored-by: Alexander Dokuchaev <[email protected]>

…inotoolkit#2220)

### Changes
As stated in the title

### Reason for changes
This doesn't seem obvious to some developers, so will state this in the style guide.

### Related tickets
N/A

### Tests
N/A

### Changes
Introduced `nncf.torch.wrap_model(model: torch.nn.Module, example_input: Any) -> NNCFNetwork`

### Reason for changes
Making it easier to obtain `NNCFNetwork`.

### Related tickets
N/A

### Tests
test_wrap_model.py

### Changes
Networkx was updated to allow 3.1, and the pyparsing limitation was removed. The disallowed colon symbols `:` are now replaced during reads and writes of .dot graphs.

### Reason for changes
OV is now at networkx 3.1, and we should be aligned at least on the major version for better DX.

### Related tickets
69520

### Tests
Existing graph-checking tests
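Replacing colons on write and restoring them on read can be sketched as a pair of inverse helpers. The placeholder character and the function names below are assumptions made for this example; the commit message does not say which replacement NNCF uses.

```python
COLON_PLACEHOLDER = "^"  # assumption: any character that never occurs in node names


def encode_node_name(name: str) -> str:
    """Make a node name safe to write into a .dot file."""
    if COLON_PLACEHOLDER in name:
        # Refuse ambiguous input so that decoding stays lossless.
        raise ValueError(f"node name already contains {COLON_PLACEHOLDER!r}: {name}")
    return name.replace(":", COLON_PLACEHOLDER)


def decode_node_name(name: str) -> str:
    """Restore the original node name after reading a .dot file."""
    return name.replace(COLON_PLACEHOLDER, ":")


original = "model/conv1:0"
encoded = encode_node_name(original)
restored = decode_node_name(encoded)
```

Rejecting names that already contain the placeholder keeps the round trip unambiguous, at the cost of an explicit error for such names.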

…penvinotoolkit#2253)

### Changes
Supports multi-device model inference and wrapped forward functions

### Reason for changes
Support tracing "bigscience/bloomz-560m" model from HF

### Related tickets
N/A

### Tests
test_no_self_forward, test_multidevice_model

### Changes
Use built-in `tmp_path` for temporary files to fix NAS tests on Windows

### Reason for changes
The PR (openvinotoolkit#2234) introduced a new test which fails on Windows with error:
`PermissionError: [Errno 13] Permission denied: 'C:\\Users\\SYS_K8~1\\AppData\\Local\\Temp\\tmpmf1i25nd'`

### Related tickets
124904

### Tests
NAS tests on Windows
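A minimal sketch of the pattern, assuming a hypothetical helper `save_and_reload` (the real NAS test is not shown in this PR page). pytest's built-in `tmp_path` fixture hands each test a fresh directory as a `pathlib.Path`, so the test writes to a regular file path instead of reopening a handle created by `tempfile`; reopening a still-open `NamedTemporaryFile` is one common source of `PermissionError` on Windows, though the commit does not state that this was the cause here.

```python
from pathlib import Path


def save_and_reload(directory: Path, payload: str) -> str:
    """Write payload to a file inside directory and read it back."""
    target = directory / "state.txt"
    target.write_text(payload, encoding="utf-8")
    return target.read_text(encoding="utf-8")


def test_save_and_reload(tmp_path: Path) -> None:
    # pytest injects tmp_path: a unique temporary directory for this test.
    assert save_and_reload(tmp_path, "supernet") == "supernet"
```

The helper only needs a directory, so it can also be exercised outside pytest with any writable `Path`.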

### Changes
Allow torchvision 0.16 in the examples

### Reason for changes
Otherwise the installation of the requirements for the torch examples tries to install torchvision 0.16, which pulls in torch 2.0.1, which differs from the BKC torch v2.1

### Related tickets
N/A

### Tests
torch_nightly, torch E2E

…s_update

)

### Changes
Exclude from weight compression nodes that have more than one reduction axis

### Reason for changes
There's only one model that has multiple reduction axes: `chatglm`, with one embedding layer of shape [8132,32,2]. It was decided not to quantize this layer, since int8 quantization would save just 6 MB in a 4 GB model at the risk of reducing accuracy, and the layer can't be quantized group-wise. The idea is to switch to multiple reduction axes when it is really needed.

### Related tickets
n/a

### Tests
Tested on 104 models from share with IRs for LLM models. In all cases except chatglm there's a single reduction axis.
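The exclusion rule can be sketched as a check on the weight shape. This is an illustration under assumptions: `reduction_axes` and `can_compress` are hypothetical helpers, and treating axis 0 as the channel axis is an assumption made for the example rather than something stated in the commit.

```python
from typing import Sequence, Tuple


def reduction_axes(shape: Sequence[int], channel_axes: Sequence[int]) -> Tuple[int, ...]:
    """Every axis that is not a channel axis is reduced over."""
    return tuple(axis for axis in range(len(shape)) if axis not in channel_axes)


def can_compress(shape: Sequence[int], channel_axes: Sequence[int]) -> bool:
    """Only weights with exactly one reduction axis are compressed."""
    return len(reduction_axes(shape, channel_axes)) == 1


# Assumption: axis 0 is the channel axis in both cases.
embedding_2d = can_compress((32000, 4096), channel_axes=(0,))
chatglm_like = can_compress((8132, 32, 2), channel_axes=(0,))
```

Under that assumption, a 2D weight has a single reduction axis and is kept, while the three-dimensional [8132,32,2] weight has two and is skipped.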

…s_update

### Changes
Remove logic to set device in `PTEngine`, to support multi-device model openvinotoolkit#2253

### Changes
Renamed `name` to `node_name` in the warning

### Reason for changes
chatglm model support

### Related tickets
125045

### Tests
test_not_quantize_with_multiple_reduction_axes

…s_update

github-actions bot added NNCF Common NNCF OpenVINO NNCF PTQ labels Aug 22, 2023

daniil-lyakhov force-pushed the dl/conv_layer_attrs_update branch 2 times, most recently from 5437420 to bb403da Compare August 24, 2023 13:44

daniil-lyakhov force-pushed the dl/channel_alignment_improvements_full branch 3 times, most recently from 1eb303f to dd84e18 Compare August 25, 2023 11:19

daniil-lyakhov force-pushed the dl/conv_layer_attrs_update branch from bb403da to 9da200c Compare August 25, 2023 12:12

daniil-lyakhov force-pushed the dl/conv_layer_attrs_update branch 2 times, most recently from 300c01a to 669f0a9 Compare September 8, 2023 12:43

daniil-lyakhov and others added 18 commits September 8, 2023 14:47

Weights layout in conv/matmul layer attributes is introduced

0f5f147

Refactor smooth quant to use weights layout Tests

Fix get_matmul_channel_axes

669f0a9

Fix tests

0982ce4

Added Whisper notebook to the list of quantization samples (openvinot…

05ae916


Update README.md (openvinotoolkit#2125)

d54d47d


Update ReleaseNotes with 2.6.0 (openvinotoolkit#2111)

9d8ed96

Fixed problem with shared weights in compression. (openvinotoolkit#2110)

25968cd


Add support for the dump_intermediate_model parameter (openvinotoolki…

8400793


[PTQ] Fix non-optional algos in calibrate.py (openvinotoolkit#2137)

e46e1c5


kshpv and others added 25 commits November 6, 2023 07:36

[Torch] Skip cuda test if cuda is not available (openvinotoolkit#2242)

18c4471


Updated readme for 4-bit weight compression (openvinotoolkit#2237)

71a60de

Correct device assignment for ExactInputsInfo (openvinotoolkit#2252)

29bfc89


Add a provisional section on pytest code style w.r.t. fixtures (openv…

b2511f1


Introduced nncf.torch.wrap_model(...) (openvinotoolkit#2251)

898faf1


Merge remote-tracking branch 'origin/develop' into dl/conv_layer_attr…

7744955

…s_update

Fix rebase

472007d

Merge remote-tracking branch 'origin/develop' into dl/conv_layer_attr…

b5b023e

…s_update

Remove to_device from PTEngine (openvinotoolkit#2260)

610e800


Fixed typo in the warning from compress_weights (openvinotoolkit#2267)

8efd04a


Merge remote-tracking branch 'origin/develop' into dl/conv_layer_attr…

608b16e

…s_update

daniil-lyakhov force-pushed the dl/conv_layer_attrs_update branch 3 times, most recently from 3263f53 to bdeb0c5 Compare November 15, 2023 12:39

Fix SQ axis for convs in OV backend

bdeb0c5

daniil-lyakhov closed this Nov 15, 2023
