change default `use_cache` param to `True`, to align with the former implementation and make CI pass #1343

kaixuanliu · 2024-09-19T08:36:02Z

No description provided.

kaixuanliu · 2024-09-19T08:39:27Z

In #1292, args.use_kv_cache was set to False by default, which greatly slows down performance and causes CI to fail.

kaixuanliu · 2024-09-19T08:41:56Z

@regisss @libinta Pls help review

skaulintel · 2024-09-21T00:16:58Z

@kaixuanliu

Hi kaixuan. With this PR, we still see 3/10 llava tests fail with the messages below.

FAILED tests/test_image_to_text_example.py::test_image_to_text_bf16[token0-llava-hf/llava-1.5-7b-hf-1-87.2901500056982] - assert 79.04646425810226 >= ((2 - 1.05) * 87.2901500056982)
FAILED tests/test_image_to_text_example.py::test_image_to_text_fp8[token0-llava-hf/llava-1.5-7b-hf-1-115.48515989461843] - assert 102.4314524915627 >= ((2 - 1.05) * 115.48515989461843)
FAILED tests/test_image_to_text_example.py::test_image_to_text_fp8[token0-llava-hf/llava-1.5-13b-hf-1-78.2635142547838] - assert 69.15517294520089 >= ((2 - 1.05) * 78.2635142547838)
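For reference, each assertion above requires the measured value to reach `(2 - 1.05)`, i.e. 95%, of the stored baseline. The following is a minimal sketch of that arithmetic using the numbers copied from the three messages; the helper name `meets_baseline` and the constant `TOLERANCE_FACTOR` are hypothetical and are not taken from the test suite.

```python
# Hypothetical helper mirroring the shape of the failing assertions:
#   assert measured >= (2 - 1.05) * baseline
TOLERANCE_FACTOR = 1.05  # value taken from the assertion text


def meets_baseline(measured: float, baseline: float) -> bool:
    """Return True when measured is at least (2 - 1.05) = 95% of baseline."""
    return measured >= (2 - TOLERANCE_FACTOR) * baseline


# (measured, baseline) pairs copied from the three failure messages.
failures = {
    "bf16 llava-1.5-7b-hf": (79.04646425810226, 87.2901500056982),
    "fp8 llava-1.5-7b-hf": (102.4314524915627, 115.48515989461843),
    "fp8 llava-1.5-13b-hf": (69.15517294520089, 78.2635142547838),
}

for name, (measured, baseline) in failures.items():
    ratio = measured / baseline
    passed = meets_baseline(measured, baseline)
    print(f"{name}: {ratio:.1%} of baseline, passes={passed}")
```

All three measured values land at roughly 88% to 91% of their baselines, below the 95% cutoff, which is why the assertions fail.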

tthakkal · 2024-09-26T19:37:48Z

examples/image-to-text/run_pipeline.py

+        type=str2bool,
+        default=True,


Suggested change

-        type=str2bool,
-        default=True,
+        action="store_true",
+        default=None,

This should work; there is no need to set it to True by default.

kaixuanliu (Sep 29, 2024): This does not solve the problem. If we set the default to None, the parameter is still passed down to the modeling code and changes the value of use_cache in generation_config (here: L671), which slows down performance as well.
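To make the two options in this thread concrete, here is a minimal sketch of how each argparse definition behaves when the flag is omitted. It rests on several assumptions: the flag name `--use_kv_cache` is inferred from `args.use_kv_cache`, `str2bool` is a hypothetical stand-in for the script's helper, and `build_generate_kwargs` is an illustrative guard in the spirit of the title of #1366, not the merged code.

```python
import argparse


def str2bool(value: str) -> bool:
    """Hypothetical stand-in for the script's str2bool helper."""
    lowered = value.lower()
    if lowered in ("true", "1", "yes"):
        return True
    if lowered in ("false", "0", "no"):
        return False
    raise argparse.ArgumentTypeError(f"expected a boolean, got {value!r}")


def build_parser(variant: str) -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser()
    if variant == "pr":
        # Variant in this PR: explicit boolean, cache enabled by default.
        parser.add_argument("--use_kv_cache", type=str2bool, default=True)
    else:
        # Variant from the review suggestion: a flag that is None when omitted.
        parser.add_argument("--use_kv_cache", action="store_true", default=None)
    return parser


def build_generate_kwargs(use_kv_cache):
    """Forward use_cache only when explicitly True (assumed policy)."""
    kwargs = {}
    if use_kv_cache:
        kwargs["use_cache"] = True
    return kwargs


pr_args = build_parser("pr").parse_args([])
suggested_args = build_parser("suggested").parse_args([])

# With no flag on the command line, the PR variant yields True and the
# suggested variant yields None.
print(pr_args.use_kv_cache, build_generate_kwargs(pr_args.use_kv_cache))
print(suggested_args.use_kv_cache, build_generate_kwargs(suggested_args.use_kv_cache))
```

With the guard in place, a None value never reaches the generation call, so whatever default the model's own generation config carries is left untouched. Whether that default enables the cache depends on the model and is not something this sketch can confirm.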

imangohari1 · 2024-09-26T20:23:28Z

@skaulintel @vidyasiv
I am not quite sure that the threshold should be adjusted.
These tests actually passed before on 1.17 (CI#160), so we need to at least understand the cause.

vidyasiv · 2024-09-30T16:24:30Z

> @skaulintel @vidyasiv I am not quite sure that the threshold should be adjusted. These tests actually passed before on 1.17 (CI#160), so we need to at least understand the cause.

Updating the padding to 0 (already merged) lowers performance and improves accuracy for llava 1.5 models, per the testing done for #1366, so we have to update the perf baselines.

vidyasiv · 2024-09-30T16:26:30Z

@regisss PR #1343 and #1366 are now identical.

regisss · 2024-09-30T16:29:38Z

> @regisss PR #1343 and #1366 are now identical.

Okay, let's close this one then since #1366 is better documented.

kaixuanliu requested a review from regisss as a code owner September 19, 2024 08:36

change default use_cache param to True, to align with the former implementation and make CI pass

571ca1b

Signed-off-by: kaixuanliu <[email protected]>

adjust performance limit for CI test to match latest driver

bce52d7

Signed-off-by: kaixuanliu <[email protected]>

libinta added the synapse1.18 label Sep 24, 2024

vidyasiv approved these changes Sep 26, 2024

tthakkal approved these changes Sep 26, 2024

libinta added the run-test label (Run CI for PRs from external contributors) Sep 26, 2024

tthakkal suggested changes Sep 26, 2024

yafshar mentioned this pull request Sep 26, 2024

Only pass the use_kv_cache True to generator #1366

Merged

3 tasks

refine code

9f3bc16

Signed-off-by: kaixuanliu <[email protected]>

regisss closed this Sep 30, 2024
