Added quantization to LCM notebook #1411
Conversation
@MaximProshin, @eaidova, @nikita-savelyevv please take a look
View / edit / reply to this conversation on ReviewNB

eaidova commented on 2023-10-31T14:11:02Z
Line #1. `import sys`
Import `skip_kernel_extension` as well, otherwise it is not visible.
View / edit / reply to this conversation on ReviewNB

eaidova commented on 2023-10-31T14:11:03Z
Line #13. `def calculate_inference_time(pipeline, calibration_dataset, size=10):`
Do you really need 128 samples to understand model performance? Maybe 10 will be enough.

l-bat commented on 2023-10-31T14:43:12Z
I use 10 samples. The 128 data files relate to internal dataset files.

ngaloppo commented on 2023-11-01T21:40:11Z
the

l-bat commented on 2023-11-02T08:19:03Z
Thanks! Removed.
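A benchmarking helper like the `calculate_inference_time` function under discussion could be sketched as follows. This is a minimal illustration, not the notebook's actual code; the pipeline and dataset objects are hypothetical stand-ins:

```python
import time


def calculate_inference_time(pipeline, calibration_dataset, size=10):
    # Average wall-clock latency over the first `size` samples; as noted
    # in the review, 10 samples are usually enough for a stable estimate.
    durations = []
    for sample in list(calibration_dataset)[:size]:
        start = time.perf_counter()
        pipeline(sample)  # one full generation pass
        durations.append(time.perf_counter() - start)
    return sum(durations) / len(durations)
```

In the notebook, the same helper would be run once with the FP16 pipeline and once with the INT8 pipeline to compute the reported speedup ratio.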
View / edit / reply to this conversation on ReviewNB

nikita-savelyevv commented on 2023-10-31T14:26:08Z
I'm curious what performance boost you got for the quantized model. Could you please add this information to the PR description?

l-bat commented on 2023-10-31T14:45:33Z
Added.
View / edit / reply to this conversation on ReviewNB

nikita-savelyevv commented on 2023-10-31T14:26:10Z
Line #2. `%pip install -q "openvino>=2023.1.0" transformers "diffusers>=0.21.4" pillow gradio`
Add nncf?

l-bat commented on 2023-10-31T14:45:36Z
I use the default nncf version from https://github.com/openvinotoolkit/openvino_notebooks/blob/main/requirements.txt#L4

l-bat commented on 2023-10-31T21:01:52Z
Added.
View / edit / reply to this conversation on ReviewNB

nikita-savelyevv commented on 2023-10-31T14:26:11Z
Line #23. `pipeline = int8_pipe if pipe_precision.value == "INT8" else ov_pipe`
As I understand from previous discussions, we decided to add a comparison between the original and quantized models to the interactive demos. Shouldn't we add this functionality to this notebook too?

l-bat commented on 2023-10-31T21:02:33Z
Added comparison.
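The precision toggle quoted above can be illustrated with a small sketch. The pipeline objects here are hypothetical placeholders; in the notebook, `ov_pipe` and `int8_pipe` are the compiled FP16 pipeline and its quantized copy, and the precision string comes from a widget's `.value`:

```python
# Hypothetical stand-ins for the notebook's compiled pipelines.
ov_pipe = "FP16 pipeline"
int8_pipe = "INT8 pipeline"


def select_pipeline(precision):
    # Mirrors the notebook line:
    # pipeline = int8_pipe if pipe_precision.value == "INT8" else ov_pipe
    return int8_pipe if precision == "INT8" else ov_pipe
```

Wiring this selector into the interactive demo lets the user generate with either precision and compare the outputs directly, as requested in the review.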
Please see #1363 (comment)
LGTM
I was able to run it on my Windows laptop and got a 1.5x speedup vs FP16. The generated pictures also look OK compared to FP16. IMO, it's not easy to quickly compare INT8 and FP16 pictures: is it possible to make the comparison part of the demo and let the user select the precision directly there?
View / edit / reply to this conversation on ReviewNB

AlexKoff88 commented on 2023-11-01T11:21:41Z
I would also add a note that quantizing the rest of the SD pipeline does not significantly improve inference performance but can lead to a substantial degradation of accuracy.

l-bat commented on 2023-11-02T08:18:48Z
Added.
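Quantizing only the UNet with NNCF post-training quantization, as this PR does, might look roughly like the sketch below. This is illustrative pseudocode, not the notebook's code: the variable names are placeholders, and the exact `nncf.quantize` arguments depend on the NNCF and OpenVINO versions in use.

```
import nncf

# Wrap collected UNet inputs as an NNCF calibration dataset.
calibration_dataset = nncf.Dataset(unet_calibration_data)

# Quantize only the UNet; per the note above, the text encoder and VAE
# stay in FP16, since quantizing them adds little speed but can
# substantially degrade accuracy.
quantized_unet = nncf.quantize(unet_model, calibration_dataset)
```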
View / edit / reply to this conversation on ReviewNB

igor-davidyuk commented on 2023-11-05T08:27:53Z
Line #2. `sys.path.append("../utils")`
Instead, fetch the `notebook_utils` module directly. We try to eliminate local dependencies in openvino_notebooks so that all the examples can be used standalone.
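The suggestion above can be sketched as a small download-if-missing helper. The `fetch_module` name and the URL are assumptions for illustration; the actual notebooks fetch `notebook_utils.py` from the openvino_notebooks repository with their own snippet:

```python
from pathlib import Path
import urllib.request


def fetch_module(url, filename):
    # Download a helper module next to the notebook if it is not
    # already present, so the example has no local-path dependency.
    path = Path(filename)
    if not path.exists():
        urllib.request.urlretrieve(url, path)
    return path


# Hypothetical raw-file URL, shown only as an example of usage:
# fetch_module(
#     "https://raw.githubusercontent.com/openvinotoolkit/openvino_notebooks/main/notebooks/utils/notebook_utils.py",
#     "notebook_utils.py",
# )
```

Guarding on `path.exists()` keeps repeated notebook runs from re-downloading the file.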
Add post-training optimization support via NNCF for the UNet model from 263-latent-consistency-models-image-generation.ipynb.
Pipeline performance speedup: 1.307x