Sarkar/growing bucket for beam #450
Conversation
The documentation is not available anymore as the PR was closed or merged.
LGTM! I just left a few minor comments.
Could you also share a command line that I can quickly run to test it please? @ssarkar2
@regisss: Incorporated your suggestions. Here are commands for testing:

1. Run with a dataset (--dataset_name squad --column_name context) for 10 steps with batch size 1 (--dataset_max_samples 10 --batch 1), without cropping the prompt (--max_input_tokens -1):

   python run_generation.py --model_name_or_path path_to_model --use_hpu_graphs --use_kv_cache --max_new_tokens 256 --batch 1 --num_beams 2 --dataset_name squad --column_name context --max_input_tokens -1 --dataset_max_samples 10

   Adding "--bucket 50" enables the code in this PR; the bucketed run is faster than the base one.

2. Single prompt test:

   python run_generation.py --model_name_or_path path_to_model --use_hpu_graphs --use_kv_cache --max_new_tokens 256 --batch 16 --num_beams 2 --bucket 50

   Here we may see some improvement, but perhaps not a big difference; it can depend on the model, prompt length, max_new_tokens, etc.

This change helps reduce the number of compilations, so it is most useful in test 1 (a dataset with varying input shapes).

Models tested: opt-350m, llama-7b, bloom-7b
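For intuition on why bucketing helps most in test 1, here is a minimal, self-contained sketch (not code from this PR; the bucket size of 50 and the sample prompt lengths are illustrative assumptions). Each distinct padded input length implies a separate graph compilation, so rounding lengths up to the next bucket multiple collapses many shapes into a few.

```python
import math

def bucketed_length(length: int, bucket: int) -> int:
    """Round a sequence length up to the next multiple of `bucket`."""
    return int(math.ceil(length / bucket)) * bucket

# Hypothetical prompt lengths, e.g. from SQuAD contexts of varying size.
prompt_lengths = [37, 41, 58, 63, 77, 90, 102, 118, 131, 140]

bucket = 50  # corresponds to passing --bucket 50 on the command line

shapes_without_bucketing = set(prompt_lengths)
shapes_with_bucketing = {bucketed_length(n, bucket) for n in prompt_lengths}

print(f"distinct shapes without bucketing: {len(shapes_without_bucketing)}")  # 10
print(f"distinct shapes with bucketing:    {len(shapes_with_bucketing)}")     # 3 (50, 100, 150)
```

Fewer distinct shapes means fewer compilations, which is why the benefit is more visible with a dataset of varying input lengths than with a single prompt.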
@regisss, I see some distributed tests fail in "Unit and integration tests / Run tests for optimum.habana.transformers (pull_request_target)". Are these expected failures, or introduced by my PR?
I've seen these errors in the CI of another PR that was not related to distributed training either. That's a bit weird but that's independent of your PR. I'm going to relaunch these tests. |
I just spotted one comment to update. I'm going to test it quickly and then it should be good to merge!
@regisss, can we merge this?
LGTM!
What does this PR do?
Extends the growing-bucket optimization from greedy search to beam search; see the sketch below for the general idea.
The original PR that added the growing-bucket optimization for greedy search is here
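As a rough illustration of what "growing bucket" means during decoding: the allocated (padded) sequence length grows in bucket-sized increments as tokens are generated, so only a handful of shapes need to be compiled instead of one per generated token. This is a hypothetical sketch, not the implementation in this PR; the function names and loop structure are simplified placeholders, and beam-search bookkeeping (beam scores, cache reordering) is omitted.

```python
def grow_if_needed(allocated_len: int, needed_len: int, bucket: int) -> int:
    """Grow the allocated (padded) length in bucket-sized increments."""
    while allocated_len < needed_len:
        allocated_len += bucket
    return allocated_len

def generate_with_growing_bucket(prompt_len: int, max_new_tokens: int, bucket: int):
    # Start from the prompt length rounded up to the first bucket boundary.
    allocated_len = ((prompt_len + bucket - 1) // bucket) * bucket
    compiled_shapes = set()

    cur_len = prompt_len
    for _ in range(max_new_tokens):
        cur_len += 1
        allocated_len = grow_if_needed(allocated_len, cur_len, bucket)
        # In the real code, the forward pass (and its HPU graph) is keyed by
        # the padded shape; each new shape implies one more compilation.
        compiled_shapes.add(allocated_len)

    return compiled_shapes

# Example: generating 256 tokens with bucket=50 touches only a few shapes
# (50, 100, 150, ...) instead of up to 256 distinct ones.
print(sorted(generate_with_growing_bucket(prompt_len=37, max_new_tokens=256, bucket=50)))
```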
Testing in progress
Fixes # (issue)
Before submitting