
Commit

Fix text-generation example README.md (#1081)
shepark authored Jun 17, 2024
1 parent 595cc3e commit 9aa739b
Showing 1 changed file with 8 additions and 5 deletions.
examples/text-generation/README.md (13 changes: 8 additions & 5 deletions)
````diff
@@ -474,13 +474,16 @@ Below example uses `flash_attention_recompute` mode in order to reduce memory consumption
 python ../gaudi_spawn.py --use_deepspeed --world_size 8 run_generation.py \
 --model_name_or_path meta-llama/Llama-2-70b-hf \
 --use_hpu_graphs \
 --limit_hpu_graphs \
 --use_kv_cache \
 --reuse_cache \
 --bf16 \
 --trim_logits \
 --attn_softmax_bf16 \
---max_input_tokens 31744 \
---max_new_tokens 1024 \
---batch_size=12 \
+--bucket_size=128 \
+--bucket_internal \
+--batch_size 10 \
+--max_input_tokens 40960 \
+--max_new_tokens 5120 \
 --use_flash_attention \
 --flash_attention_recompute \
 --flash_attention_causal_mask \
@@ -497,7 +500,7 @@ The evaluation of LLMs can be done using the `lm_eval.py` script. It utilizes the
 
 For a more detailed description of parameters, please see the help message:
 ```
-./run_lm_eval.py -h
+python run_lm_eval.py --help
 ```
 
 
````
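
As a usage note on the evaluation command updated in the second hunk, here is a minimal sketch of a full `run_lm_eval.py` invocation. It reuses the launcher, model, and performance flags from the `run_generation.py` command in the first hunk and assumes `run_lm_eval.py` accepts those same flags plus an `-o` option for the results file; both points are assumptions, so confirm the exact options with `python run_lm_eval.py --help`.

```bash
# Sketch only: assumes run_lm_eval.py accepts the same flags as the
# run_generation.py command shown in this diff, and that -o names the
# output results file. Verify with `python run_lm_eval.py --help`.
python ../gaudi_spawn.py --use_deepspeed --world_size 8 run_lm_eval.py \
--model_name_or_path meta-llama/Llama-2-70b-hf \
--use_hpu_graphs \
--use_kv_cache \
--bf16 \
--batch_size 1 \
-o eval_results.json
```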
