add end to end llama test, including generating and running vmfb #224
Conversation
python/turbine_models/gen_external_params/gen_external_params.py
def test_export(quantization: Literal["int4", None], precision: Literal["f16", "f32"]):
    llama.export_transformer_model(
        hf_model_name="llSourcell/medllama2_7b",
I think this model is actually slightly different from the meta version. It might be better to use "Trelis/Llama-2-7b-chat-hf-function-calling-v2", or a secret HF token for the actual meta model we use.
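A minimal sketch of what that swap could look like in the test. The keyword names mirror the snippet above and are assumptions, not the confirmed signature of `export_transformer_model`; the `HF_AUTH_TOKEN` secret name is also hypothetical.

```python
import os

# Open alternative suggested above.
hf_model_name = "Trelis/Llama-2-7b-chat-hf-function-calling-v2"

# Or keep the gated meta checkpoint and authenticate with a CI secret
# (secret name is an assumption for illustration):
# hf_model_name = "meta-llama/Llama-2-7b-chat-hf"
hf_auth_token = os.environ.get("HF_AUTH_TOKEN")

llama.export_transformer_model(
    hf_model_name=hf_model_name,
    hf_auth_token=hf_auth_token,
)
```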
python/turbine_models/gen_external_params/gen_external_params.py
args.external_weight_file = "medllama2_7b_f16_int4.safetensors"
args.run_vmfb = True
args.device = "llvm-cpu"
args.precision = precision
Are these flags set from the pytest_generate_tests in conftest.py? If so, this may be a very long test for a CI, as each config will probably take at least 5 min.
Yup, they are. But it only runs one combo unless the --all flag is passed to pytest.
We can do a regular pytest run with only one config for usual tests, and then pass the --all flag for e.g. monthly releases.
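For readers unfamiliar with the pattern, here is a minimal sketch of how a conftest.py can gate the parametrization behind an `--all` option. The option name follows the discussion above; the exact quantization/precision combos and the default are illustrative, not the repo's actual conftest.

```python
# conftest.py
def pytest_addoption(parser):
    parser.addoption(
        "--all", action="store_true", default=False,
        help="run every quantization/precision combination",
    )


def pytest_generate_tests(metafunc):
    if {"quantization", "precision"} <= set(metafunc.fixturenames):
        if metafunc.config.getoption("--all"):
            # Full sweep, e.g. for a nightly or release CI.
            combos = [(q, p) for q in ("int4", None) for p in ("f16", "f32")]
        else:
            # Single default combo keeps the regular CI run short.
            combos = [("int4", "f16")]
        metafunc.parametrize("quantization,precision", combos)
```

With this in place, a plain `pytest` run exercises one config, while `pytest --all` expands to every combination.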
Cool. Yeah, for this CI I think we just want to run one config, and eventually have a nightly CI that runs all configs, or at least a larger subset of them.
I'd like to get the CPU CI in ASAP, and then take some time later to make a more comprehensive CI.
Thanks @IanNod ! I'll make a push later today to:
- Switch Default Model: Set Trelis/Llama-2-7b-chat-hf-function-calling-v2 as the default model.
- Enhance Tests: Develop tests to check for both crashes and functional correctness.
- Update Dependencies: Add fire to requirements.txt if it's used in the project.
- Improve run_vmfb_comparison: Update it to automatically detect and report output discrepancies.
I'll also see if I can get away with fewer default flags.
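On the run_vmfb_comparison point: a minimal sketch of the kind of discrepancy check described above, assuming the comparison code already has the VMFB and torch outputs as arrays. The function name and tolerances are illustrative, not the actual turbine_models API.

```python
import numpy as np


def assert_outputs_close(turbine_output, torch_output, rtol=1e-2, atol=1e-2):
    """Fail loudly when the VMFB and torch results diverge beyond tolerance."""
    turbine = np.asarray(turbine_output, dtype=np.float32)
    torch_ref = np.asarray(torch_output, dtype=np.float32)
    if not np.allclose(turbine, torch_ref, rtol=rtol, atol=atol):
        max_err = np.max(np.abs(turbine - torch_ref))
        raise AssertionError(f"VMFB vs torch mismatch, max abs error {max_err:.4f}")
```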
- Saves weights to a .safetensors file
- Loads weights at runtime with a "stripped" .mlir
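A hedged sketch of the external-weights flow this commit describes: cast the model's parameters, write them to a .safetensors file, and keep the .mlir "stripped" so weights are loaded at runtime rather than embedded. The function name is illustrative, not the actual gen_external_params API, and the int4 quantization step is elided.

```python
import torch
from safetensors.torch import save_file


def save_external_params(model: torch.nn.Module, path: str, dtype=torch.float16):
    # Collect parameters, cast to the target precision, and write them out,
    # e.g. to "medllama2_7b_f16_int4.safetensors".
    tensors = {
        name: p.detach().to(dtype).contiguous()
        for name, p in model.named_parameters()
    }
    save_file(tensors, path)
```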
…y fail tests on comparison fail
…n argparse and function params
You need to rebase. Once you do, we can see how long this takes. I'm concerned that comparing against torch is overkill.
…y fail tests on comparison fail
…n argparse and function params
python/turbine_models/gen_external_params/gen_external_params.py
Requested changes addressed; we just need a bigger CI (we currently have an 86 GB memory requirement, the CI machine has 62 GB, and a 128 GB CI machine would probably be pretty future-proof).
Also: