Heuristic for picking the chunk/batch size? #1542

gdalle · 2024-06-17T19:21:34Z

ForwardDiff has a heuristic for picking chunk size, with a default threshold of 12 dictated by memory bandwidth:

https://github.com/JuliaDiff/ForwardDiff.jl/blob/ff56092ed2960717ce45f53a90584898c232e74b/src/prelude.jl#L24-L34

https://github.com/JuliaDiff/ForwardDiff.jl/blob/ff56092ed2960717ce45f53a90584898c232e74b/src/prelude.jl#L8

Does Enzyme have something similar I could use? I seem to remember a graph showing performance as a function of chunk size, with a maximum around 8-12 as well, but it disappeared in the Slackhole.
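For readers who don't want to follow the links, here is a minimal sketch of a threshold-based heuristic of that kind. It is a paraphrase, not a copy of the linked source: the names `CHUNK_THRESHOLD` and `pick_chunk` are hypothetical, and the threshold of 12 is simply the default mentioned above.

```julia
# Hypothetical names; a sketch of a threshold-based chunk heuristic,
# not the linked ForwardDiff source itself.
const CHUNK_THRESHOLD = 12  # assumed default, taken from the text above

# Pick a chunk size for an input of length `n`:
# use the whole input if it fits under the threshold, otherwise split it
# into the fewest passes possible and balance the chunk size across them.
function pick_chunk(n::Integer, threshold::Integer = CHUNK_THRESHOLD)
    n <= threshold && return Int(n)
    nchunks = cld(n, threshold)   # number of passes needed
    return Int(cld(n, nchunks))   # balanced chunk size, never above threshold
end
```

With these assumptions, an input of length 13 is split into two passes of at most 7 rather than one pass of 12 and one of 1, which is the point of balancing.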

wsmoses · 2024-06-17T23:10:18Z

Not presently, but contributions welcome!

gdalle · 2024-06-18T04:33:08Z

Do you know which graph I'm talking about? Is it in some publication online?

vchuravy · 2024-06-18T10:46:56Z

I don't think it's in a publication, but @tgymnich shows some graphs around 10 minutes into his talk at EnzymeCon: https://youtu.be/nPN_Z5j6JDM?feature=shared

tgymnich · 2024-06-18T11:45:27Z

vchuravy · 2024-06-18T14:22:02Z

@tgymnich do you remember what machine you used for these measurements?

wsmoses · 2024-06-18T14:23:40Z

This will also now depend a lot more on the program in Julia.

For example, batching something with a linear solve will almost always be faster, since we now do one linear solve that is reused for all chunks.

tgymnich · 2024-06-18T14:25:49Z

@vchuravy this must have been a bare metal AWS machine provided by @wsmoses. I believe it was with AVX512.

gdalle · 2024-06-18T14:31:33Z

I'm asking because I'm including vector mode in DI, so it would be nice to have a function in Enzyme I can call to pick a decent chunk size if the user doesn't provide it. Even if the function is dumb at the moment, I feel like that's definitely something I don't want to decide myself

vchuravy · 2024-06-18T14:38:35Z

8 or 16 should be a safe bet.

wsmoses · 2024-06-18T14:40:18Z

> I'm asking because I'm including vector mode in DI, so it would be nice to have a function in Enzyme I can call to pick a decent chunk size if the user doesn't provide it. Even if the function is dumb at the moment, I feel like that's definitely something I don't want to decide myself

Sure, open a PR to Enzyme adding a function that returns 16 for now, and we can add more complex analysis later.
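A placeholder along those lines could look like the sketch below. The name `pick_batch_size_sketch` and its signature are hypothetical and are not Enzyme's API. Clamping to the input length is an added assumption (so that small inputs fit in a single batch); the suggestion above is just a constant 16.

```julia
# Hypothetical placeholder; the name and signature are assumptions,
# not part of Enzyme.
const DEFAULT_BATCH_SIZE = 16  # the value suggested in this thread

# Return a batch size for an input with `n` elements.
# Never exceeds `n`, so small inputs are handled in a single batch.
function pick_batch_size_sketch(n::Integer; default::Integer = DEFAULT_BATCH_SIZE)
    return Int(min(n, default))
end
```

A caller would use it only as a fallback when the user does not supply a batch size, which leaves room to swap in a smarter analysis later without changing call sites.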

gdalle mentioned this issue Jun 18, 2024

Implement pick_batchsize #1545

Closed
