PoC of lowering compilation time using Python threading #3

p-wysocki · 2023-09-11T14:53:04Z

Changes

Implemented Python threading to speed up model compilation.
Removed the need to serialize the model after compilation.
Using https://github.com/openvinotoolkit/openvino_notebooks/blob/main/notebooks/122-quantizing-model-with-accuracy-control/122-speech-recognition-quantization-wav2vec2.ipynb as a benchmark, the step "INFO:nncf:Calculating ranking score for groups of quantizers" has been shortened from ~4 minutes to a little over 2 minutes.
Of the remaining 2 minutes, compilation takes only ~12 seconds on my machine.
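
For readers unfamiliar with the pattern, here is a minimal sketch of compiling several models concurrently with a thread pool. It is not the code from this PR: `compile_all` and `fake_compile` are hypothetical names, and `fake_compile` only sleeps to stand in for a real compile call. Threads help here only under the assumption that the real compile call releases the GIL while it works.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def compile_all(models, compile_fn, max_workers=4):
    """Run compile_fn on every model concurrently; results keep input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map yields results in the order of `models`,
        # regardless of which thread finishes first.
        return list(pool.map(compile_fn, models))


def fake_compile(model):
    # Hypothetical stand-in for a real compile call (assumption, not an NNCF API).
    # time.sleep releases the GIL, as a native compile call would need to.
    time.sleep(0.05)
    return f"compiled:{model}"


compiled = compile_all(["m0", "m1", "m2", "m3"], fake_compile)
```

Because `pool.map` preserves ordering, each compiled model can still be matched to the quantizer group it came from by index.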

Related tickets

119274

alexsu52 · 2023-09-12T06:55:14Z

nncf/quantization/algorithms/accuracy_control/ranker.py

+                current_group.operations, current_group.quantizers, quantized_model, quantized_model_graph
+            )
+
+            modified_models.append(modified_model)


A general comment: the proposed solution is not memory-optimal. It requires holding as many copies of the model in memory as there are groups in groups_to_rank. For some models, the number of groups to rank is in the hundreds, so the algorithm would likely run out of memory and crash on huge models.
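
One possible way to bound the memory use described in this comment is to build and compile the modified models in fixed-size batches rather than all at once. The sketch below is an illustration only, not something this PR implements: `batched`, `rank_in_batches`, `make_model`, `compile_fn`, and `score_fn` are all hypothetical names, and the lambdas in the usage lines are toy stand-ins for the real model operations.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import islice


def batched(items, size):
    """Yield consecutive lists of at most `size` items."""
    it = iter(items)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def rank_in_batches(groups, make_model, compile_fn, score_fn, batch_size=8):
    """Score every group while keeping at most batch_size model copies alive."""
    scores = []
    with ThreadPoolExecutor(max_workers=batch_size) as pool:
        for chunk in batched(groups, batch_size):
            # Only this batch's modified models exist at the same time.
            models = [make_model(group) for group in chunk]
            compiled = list(pool.map(compile_fn, models))
            scores.extend(score_fn(c) for c in compiled)
            # Drop references so the batch can be freed before the next one.
            del models, compiled
    return scores


# Toy stand-ins (assumptions): a "model" is a dict, "compiling" doubles a number.
scores = rank_in_batches(
    range(10),
    make_model=lambda g: {"group": g},
    compile_fn=lambda m: m["group"] * 2,
    score_fn=lambda c: c + 1,
    batch_size=4,
)
```

The trade-off is that a smaller `batch_size` lowers peak memory but also lowers the degree of parallel compilation, so the value would need tuning per model size.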

alexsu52 and others added 11 commits July 17, 2023 21:54

speed-up calculation of FQ important scores

2e987f5

pylint happy

c81790b

replied to comments

ffa15d9

replied to comments

9e4eb58

debug

bf1513b

wip

c8d2b44

cleanup

babd750

Remove unnecessary import

968d2a3

Merge remote-tracking branch 'alexsu52/develop' into debug_async_comp…

5cfdc2b

…ilation

cleanup

8230208

Minor changes

023267b

github-actions bot added NNCF OpenVINO NNCF PTQ labels Sep 11, 2023

alexsu52 reviewed Sep 12, 2023

alexsu52 self-requested a review September 21, 2023 05:59
