Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

change deafult use_cache param to True, to align with the former implementation and make CI pass #1343

Closed
wants to merge 3 commits into from

Conversation

kaixuanliu
Copy link
Contributor

No description provided.

implementation and make CI pass

Signed-off-by: kaixuanliu <[email protected]>
@kaixuanliu
Copy link
Contributor Author

In #1292, args.use_kv_cache was set to False by default, which will greatly slow down the performance, and cause CI failed.

@kaixuanliu
Copy link
Contributor Author

@regisss @libinta Pls help review

@skaulintel
Copy link
Collaborator

skaulintel commented Sep 21, 2024

@kaixuanliu

Hi kaixun. With this PR, we still see 3/10 lava tests fail with the below messages.

FAILED tests/test_image_to_text_example.py::test_image_to_text_bf16[token0-llava-hf/llava-1.5-7b-hf-1-87.2901500056982] - assert 79.04646425810226 >= ((2 - 1.05) * 87.2901500056982)
FAILED tests/test_image_to_text_example.py::test_image_to_text_fp8[token0-llava-hf/llava-1.5-7b-hf-1-115.48515989461843] - assert 102.4314524915627 >= ((2 - 1.05) * 115.48515989461843)
FAILED tests/test_image_to_text_example.py::test_image_to_text_fp8[token0-llava-hf/llava-1.5-13b-hf-1-78.2635142547838] - assert 69.15517294520089 >= ((2 - 1.05) * 78.2635142547838)

@libinta libinta added the run-test Run CI for PRs from external contributors label Sep 26, 2024
Comment on lines 157 to 158
type=str2bool,
default=True,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
type=str2bool,
default=True,
action="store_true",
default=None,

this should work, don't need to set to True by default

Copy link
Contributor Author

@kaixuanliu kaixuanliu Sep 29, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This cannot solve the problem, as if we set default to None, it will pass down this param to modeling part, and change the value of use_cache in generation_config here: L671, which will slow down the performance as well.

@imangohari1
Copy link
Contributor

@skaulintel @vidyasiv
I am not quite sure that the threshould should be adjusted.
these tests have actually passed before on 1.17(CI#160) so we need to at least understand the cause.

Signed-off-by: kaixuanliu <[email protected]>
@vidyasiv
Copy link
Contributor

@skaulintel @vidyasiv I am not quite sure that the threshould should be adjusted. these tests have actually passed before on 1.17(CI#160) so we need to at least understand the cause.

Updating the padding to 0 (already merged) lowers performance and improves accuracy per testing for #1366 for llava 1.5 models so we have to update perf.

@vidyasiv
Copy link
Contributor

@regisss PR #1343 and #1366 are now identical.

@regisss
Copy link
Collaborator

regisss commented Sep 30, 2024

@regisss PR #1343 and #1366 are now identical.

Okay, let's close this one then since #1366 is better documented.

@regisss regisss closed this Sep 30, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
run-test Run CI for PRs from external contributors synapse1.18
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants