Add e2e tests that model switching costs and add func arg lowering for the air pipeline #566

nirvedhmeshram · 2024-07-17T17:03:31Z

Another change we are making, and will have in all iree-compile commands that run anything beyond a single dispatch, is the use of --iree-scheduling-optimize-bindings=false. This is needed because there is currently no way to pass runtime constants to the generated kernel. In a typical IREE backend they are used to pass two things:

buffer offsets
dynamic shapes

With the above flag we won't generate offset buffers, and we don't support dynamic shapes yet, so for now this flag allows us to run models where such offsets would otherwise occur.

It was also discovered that the lowering used by the air pipeline does not respect the binding number for the hal.binding ops so we lower them before the AIRRtToNpuPass pass can do it.
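To illustrate the binding-number issue, here is a toy sketch, not the actual pass. It assumes the failure mode is that function arguments get assigned in the order the binding ops appear rather than by their binding number. The `Binding` class and both helper functions are hypothetical names used only for this illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Binding:
    name: str   # hypothetical label for the buffer
    index: int  # binding number carried by the op


def args_in_appearance_order(bindings):
    """Assign argument slots in the order the ops appear (ignores the index)."""
    return [b.name for b in bindings]


def args_in_binding_order(bindings):
    """Assign argument slots by binding number, independent of op order."""
    return [b.name for b in sorted(bindings, key=lambda b: b.index)]


# The ops show up in an order that differs from their binding numbers.
bindings = [Binding("rhs", 1), Binding("lhs", 0), Binding("out", 2)]

print(args_in_appearance_order(bindings))  # ['rhs', 'lhs', 'out']
print(args_in_binding_order(bindings))     # ['lhs', 'rhs', 'out']
```

When the ops happen to appear in binding order the two strategies agree, which is why a mismatch like this can go unnoticed until a model produces the other ordering.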

newling · 2024-07-17T19:27:48Z

There are 2 tests which repeat the same matmul N times. Can you update the comments in the tests specifying what they're testing, so that it's clear they're both useful? Have you run this locally with success? CI has numerical errors which aren't zeros... is the offset thing not working?

FYI I created a task which is slightly related (compiler side though, not runtime) #541

I think these are useful numerical tests, so happy to include them here. But for benchmarking runtime overheads, should we also have some separate code, something along the lines of #415?

nirvedhmeshram · 2024-07-17T19:50:33Z

> There are 2 tests which repeat the same matmul N times. Can you update the comments in the tests specifying what they're testing, so that it's clear they're both useful? Have you run this locally with success? CI has numerical errors which aren't zeros... is the offset thing not working?

Yes, I can update the comments. What they test is what happens if you don't switch and just keep calling the same kernel, so they establish the baseline for the switching case. Yeah, the CI caught an error that I didn't catch locally because I was using uniform inputs. It's a fun one where we are wrongly swapping argument locations on the hal.interface.bindings (the kind of stuff that is tickled with these "serious" models :) )
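A minimal sketch of why uniform inputs can hide a swapped-argument bug, assuming the swap exchanges the two matmul operands. The `matmul` and `swapped_matmul` helpers are hypothetical and written in plain Python for illustration; they are not the test harness used in this PR.

```python
def matmul(a, b):
    """Naive matrix product of two lists-of-lists."""
    rows, inner, cols = len(a), len(b), len(b[0])
    return [
        [sum(a[i][p] * b[p][j] for p in range(inner)) for j in range(cols)]
        for i in range(rows)
    ]


def swapped_matmul(a, b):
    """Models the bug: the two operands arrive in each other's slots."""
    return matmul(b, a)


# Uniform inputs: every element is the same, so the swap is invisible.
uniform_a = [[1, 1], [1, 1]]
uniform_b = [[1, 1], [1, 1]]
print(matmul(uniform_a, uniform_b) == swapped_matmul(uniform_a, uniform_b))  # True

# Varied inputs: the swap changes the result and the check fails.
varied_a = [[1, 2], [3, 4]]
varied_b = [[5, 6], [7, 8]]
print(matmul(varied_a, varied_b))          # [[19, 22], [43, 50]]
print(swapped_matmul(varied_a, varied_b))  # [[23, 34], [31, 46]]
```

Non-uniform (for example random) inputs make the comparison against golden values sensitive to operand order.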

> FYI I created a task which is slightly related (compiler side though, not runtime) #541

> I think these are useful numerical tests, so happy to include them here. But for benchmarking runtime overheads, should we also have some separate code, something along the lines of #415?

The way I am using these tests is with Tracy profiles, which give me ns-precision timing of all runtime functions. Of course #415 is useful for whole-model performance benchmarking though, so we should have that for all these "models" too.

newling · 2024-07-18T16:13:59Z

Just an FYI: I'm working on changing the script to allow golden values from other sources (arbitrary python) which might be useful if we don't fully trust the llvm-cpu backend.

nirvedhmeshram · 2024-07-18T16:34:44Z

> Just an FYI: I'm working on changing the script to allow golden values from other sources (arbitrary python) which might be useful if we don't fully trust the llvm-cpu backend.

So far I haven't seen any issues, but that's a good feature!

newling · 2024-07-18T18:25:11Z

Thanks, looks good to me. I've left only minor suggestions, so accepting in advance.

> It was also discovered that the lowering used by the air pipeline does not respect the binding number for the hal.binding ops so we lower them before the AIRRtToNpuPass pass can do it.

Nice fix! I would probably have done this as a separate PR, but they're both pretty small so fine.

compiler/plugins/target/AMD-AIE/iree-amd-aie/Transforms/AMDAIELowerFuncArgs.cpp

nirvedhmeshram force-pushed the nm_add_pdi_switch_e2e_tests branch 2 times, most recently from 96adfab to 576f8e0 Compare July 17, 2024 17:08

Add e2e tests that model switching costs

c24f9bb

nirvedhmeshram force-pushed the nm_add_pdi_switch_e2e_tests branch from 576f8e0 to c24f9bb Compare July 17, 2024 17:08

disable data tiling to see if cpu is actually wrong

01838fd

Add func arg pass and address reviwer comments

314c3ae

nirvedhmeshram requested review from MaheshRavishankar, yzhang93, Abhishek-Varma and jtuyls as code owners July 18, 2024 16:31

Merge branch 'main' into nm_add_pdi_switch_e2e_tests

9e8e992

nirvedhmeshram changed the title ~~Add e2e tests that model switching costs~~ Add e2e tests that model switching costs and add func arg lowering for the air pipeline Jul 18, 2024

nirvedhmeshram added 2 commits July 18, 2024 10:38

self review

a6e4dac

clang-format

4a87134

nirvedhmeshram requested a review from newling July 18, 2024 16:54

newling approved these changes Jul 18, 2024

nirvedhmeshram added 2 commits July 18, 2024 12:55

address reviwer comments

0cc654e

clang-format

d5849e7

nirvedhmeshram merged commit 2280d55 into main Jul 18, 2024
2 checks passed

nirvedhmeshram deleted the nm_add_pdi_switch_e2e_tests branch July 18, 2024 20:10

nirvedhmeshram mentioned this pull request Jul 19, 2024

multiple calls to same dispatch doesnt work #518

Closed
