
Sarkar/growing bucket for beam #450

Merged: 6 commits merged into main from sarkar/growing_bucket_for_beam on Oct 26, 2023

Conversation

@ssarkar2 (Collaborator) commented Oct 4, 2023

What does this PR do?

Extends the growing bucket optimization from greedy search to beam search.
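Roughly, the idea (a minimal illustrative sketch, not the actual implementation in this PR; the helper name and `bucket_size` parameter below are hypothetical) is to round sequence lengths up to the next bucket boundary, so shapes change once per bucket instead of once per token:

```python
# Illustrative sketch only, not the code in this PR.
import math

def bucketed_length(current_len: int, bucket_size: int) -> int:
    """Hypothetical helper: smallest multiple of bucket_size >= current_len."""
    return math.ceil(current_len / bucket_size) * bucket_size

# With a bucket size of 50, lengths 1..50 all map to shape 50, 51..100 map to 100,
# and so on, so far fewer distinct shapes (and graph compilations) are needed.
assert bucketed_length(37, 50) == 50
assert bucketed_length(51, 50) == 100
```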

The original PR that added the growing bucket optimization for greedy search is here.

Testing in progress

Fixes # (issue)

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you make sure to update the documentation with your changes?
  • Did you write any new necessary tests?

@HuggingFaceDocBuilderDev commented Oct 4, 2023

The documentation is not available anymore as the PR was closed or merged.

@ssarkar2 (Collaborator, Author) commented:

@regisss: Can you please take a look at this PR? It ports the growing bucket optimization from this PR (the greedy version) to beam search.
Perf/accuracy tests have been done internally, and they look good.

This is not necessary for 1.12, which I understand is your priority right now.

@regisss (Collaborator) left a comment:


LGTM! I just left a few minor comments.
Could you also share a command line that I can quickly run to test it please? @ssarkar2

Review comments (outdated, resolved) on:
examples/text-generation/README.md
optimum/habana/transformers/generation/utils.py (3 comments)
@ssarkar2 (Collaborator, Author) commented Oct 17, 2023

@regisss: I've incorporated your suggestions. Here are the commands for testing:

1. Run with a dataset (--dataset_name squad --column_name context) for 10 steps with bs=1 (--dataset_max_samples 10 --batch 1), without cropping the prompt (--max_input_tokens -1). Beam search is enabled with --num_beams 2:

python run_generation.py --model_name_or_path path_to_model --use_hpu_graphs --use_kv_cache --max_new_tokens 256 --batch 1 --num_beams 2 --dataset_name squad --column_name context --max_input_tokens -1 --dataset_max_samples 10

Adding "--bucket 50" to enable current PR code
python run_generation.py --model_name_or_path path_to_model --use_hpu_graphs --use_kv_cache --max_new_tokens 256 --batch 1 --num_beams 2 --dataset_name squad --column_name context --max_input_tokens -1 --dataset_max_samples 10 --bucket 50

The bucketed run is faster than the base run.

2. Single prompt test:
python run_generation.py --model_name_or_path path_to_model --use_hpu_graphs --use_kv_cache --max_new_tokens 256 --batch 16 --num_beams 2

python run_generation.py --model_name_or_path path_to_model --use_hpu_graphs --use_kv_cache --max_new_tokens 256 --batch 16 --num_beams 2 --bucket 50

Here we may see some improvement, but perhaps not much of a difference. It can depend on the model, prompt length, max_new_tokens, etc.

This change helps reduce the number of graph compilations, so it is more useful in test 1 (a dataset with varying input shapes); a rough illustration is below.
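As a made-up illustration (the prompt lengths and bucket size below are arbitrary, not from this PR), bucketing collapses many distinct input shapes into a few:

```python
# Made-up prompt lengths standing in for a dataset with varied inputs.
import math

prompt_lengths = [83, 97, 64, 91, 112, 128, 143, 131, 240, 256]
bucket = 50

# Without bucketing, each distinct prompt length is a distinct input shape.
raw_shapes = set(prompt_lengths)

# With bucketing, lengths are padded up to the next multiple of the bucket size.
bucketed_shapes = {math.ceil(length / bucket) * bucket for length in prompt_lengths}

print(len(raw_shapes))       # 10 distinct shapes -> up to 10 compilations
print(len(bucketed_shapes))  # 4 distinct shapes: {100, 150, 250, 300}
```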

Models tested: opt-350m, llama-7b, bloom-7b.

@ssarkar2 requested a review from regisss on October 17, 2023 at 17:49
@ssarkar2 added the run-test label (Run CI for PRs from external contributors) on Oct 17, 2023
@ssarkar2 (Collaborator, Author) commented:

@regisss, I see some distributed tests failing in "Unit and integration tests / Run tests for optimum.habana.transformers (pull_request_target)". Are these expected failures, or were they introduced by my PR?

@regisss (Collaborator) commented Oct 20, 2023

> @regisss, I see some distributed tests failing in "Unit and integration tests / Run tests for optimum.habana.transformers (pull_request_target)". Are these expected failures, or were they introduced by my PR?

I've seen these errors in the CI of another PR that was not related to distributed training either. That's a bit weird, but it's independent of your PR. I'm going to relaunch these tests.

@regisss (Collaborator) left a comment:


I just spotted one comment to update. I'm going to test it quickly and then it should be good to merge!

Review comment (outdated, resolved) on: examples/text-generation/run_generation.py
@ssarkar2 (Collaborator, Author) commented:

@regisss, can we merge this?

@regisss (Collaborator) left a comment:


LGTM!

@regisss merged commit b935935 into main on Oct 26, 2023 (9 checks passed)
@regisss deleted the sarkar/growing_bucket_for_beam branch on October 26, 2023 at 13:07