Hello,
I tried to modify the `pipeline_config` of h2o on the `narrativeqa` dataset in LongBench using the `llama3-8b-instruct` model.
In the first experiment I add a `1x_heavy.json` that forces `heavy_ratio = 1.0` and `recent_ratio = 0.0`, as follows:

```json
{
  "pipeline_params": {
    "method": "h2o_longbench",
    "model_name": "./LLMs/Meta-Llama-3-8B-Instruct",
    "tokenizer_name": "./LLMs/Meta-Llama-3-8B-Instruct",
    "chat_template": "llama3",
    "model_max_len": 7500,
    "use_flash_attn": true,
    "truncation_mode": "middle",
    "batch_size": 1,
    "out_of_max_len_allowed": true,
    "rope_theta_factor": 1.0,
    "heavy_ratio": 1.0,
    "recent_ratio": 0.0
  }
}
```
In the second experiment I add a `1x_recent.json` that forces `heavy_ratio = 0.0` and `recent_ratio = 1.0`, as follows:

```json
{
  "pipeline_params": {
    "method": "h2o_longbench",
    "model_name": "./LLMs/Meta-Llama-3-8B-Instruct",
    "tokenizer_name": "./LLMs/Meta-Llama-3-8B-Instruct",
    "chat_template": "llama3",
    "model_max_len": 7500,
    "use_flash_attn": true,
    "truncation_mode": "middle",
    "batch_size": 1,
    "out_of_max_len_allowed": true,
    "rope_theta_factor": 1.0,
    "heavy_ratio": 0.0,
    "recent_ratio": 1.0
  }
}
```
If I understood correctly, the results of these two experiments should be the same, equal to the baseline.
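For reference, here is a minimal sketch of how I believe the two ratios translate into KV-cache budgets; the function name and the rounding are my assumptions, not the actual repo code:

```python
# Hypothetical helper illustrating my understanding of how heavy_ratio /
# recent_ratio become KV-cache budgets; names and rounding are assumptions,
# not the actual h2o implementation.
def cache_budgets(seq_len: int, heavy_ratio: float, recent_ratio: float) -> tuple[int, int]:
    heavy_budget = int(heavy_ratio * seq_len)    # tokens kept by accumulated attention score
    recent_budget = int(recent_ratio * seq_len)  # tokens kept from the recent window
    return heavy_budget, recent_budget

# With either setting the total budget covers the full sequence, so no KV
# entries should be evicted and the output should match the baseline.
print(cache_budgets(7500, 1.0, 0.0))  # (7500, 0)
print(cache_budgets(7500, 0.0, 1.0))  # (0, 7500)
```

If the actual implementation derives the budgets differently (e.g. per layer, or with clamping), this sketch may not apply exactly.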
The baseline gives `"qa_f1_score": 21.71`. The first experiment also gives `"qa_f1_score": 21.71`, while the second gives `"qa_f1_score": 19.6`.
Could you please let me know whether this experimental setup is correct, or what might cause this difference?
Regards! Chao