[Relax] Share storage allocs among functions after cuda graph rewriting #16830

vinx13 · 2024-04-01T17:07:54Z

This PR shares storage among different functions after CUDA graph rewriting. Because CUDA graph rewriting caches storage objects, they are not freed after a function finishes executing, so memory usage grows when there are multiple functions. Sharing the storage objects across functions eliminates this overhead.
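As a rough illustration of why sharing helps, here is a toy model in plain Python (not TVM code). The function names, the `plans` dictionary, and the rule of matching storages by byte size alone are all assumptions made for this sketch; the actual pass may match storages on other criteria. It also assumes the functions never run concurrently.

```python
from collections import Counter


def unshared_bytes(plans):
    """Total bytes held when every function keeps its own cached storages alive."""
    return sum(sum(sizes) for sizes in plans.values())


def shared_bytes(plans):
    """Total bytes held when all functions draw from one shared pool.

    For each storage size, the pool only needs as many storages as the
    most demanding single function requests (hypothetical matching rule).
    """
    pool = Counter()
    for sizes in plans.values():
        for size, count in Counter(sizes).items():
            pool[size] = max(pool[size], count)
    return sum(size * count for size, count in pool.items())


# Hypothetical per-function storage sizes in bytes.
plans = {
    "prefill": [1024, 1024, 4096],
    "decode": [1024, 4096, 4096],
}

print(unshared_bytes(plans))  # 15360
print(shared_bytes(plans))    # 10240
```

In this toy case the shared pool holds two 1024-byte and two 4096-byte storages instead of six separate ones.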

It also updates the rewriting to avoid capturing storages and bindings that are used as function outputs. Previously we relied on the fact that output tensors were allocated with R.builtin.alloc_tensor; however, this changed once storage planning was enabled for output tensors, which may now also use R.memory.alloc_memory.
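The output check can be pictured with a small Python sketch. This is not the pass's implementation: the `(var, op, args)` binding tuples, the op tags `"alloc_storage"` and `"alloc_tensor"`, and the helper `uncapturable` are hypothetical stand-ins chosen for illustration. The idea is to walk backwards from the function outputs and mark every allocation they depend on, so that none of it ends up in the cached CUDA graph storage.

```python
def uncapturable(bindings, outputs):
    """Return variables that must stay out of the captured region.

    bindings: list of (var, op, args) tuples (hypothetical toy IR).
    outputs:  variables returned by the function.
    """
    producers = {var: (op, args) for var, op, args in bindings}
    blocked, stack = set(), list(outputs)
    while stack:
        var = stack.pop()
        if var in blocked or var not in producers:
            continue  # already handled, or a function parameter
        op, args = producers[var]
        if op in ("alloc_storage", "alloc_tensor"):
            blocked.add(var)
            stack.extend(args)  # follow the tensor back to its storage
    return blocked


# t0 is an intermediate tensor; t1 is returned to the caller.
bindings = [
    ("s0", "alloc_storage", []),
    ("t0", "alloc_tensor", ["s0"]),
    ("s1", "alloc_storage", []),
    ("t1", "alloc_tensor", ["s1"]),
]

print(sorted(uncapturable(bindings, ["t1"])))  # ['s1', 't1']
```

Here `s0` and `t0` remain eligible for capture, while the output tensor `t1` and its backing storage `s1` do not.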

cc @tqchen

github-actions bot requested a review from tqchen April 1, 2024 17:08

vinx13 force-pushed the feat/cuda-graph-merge-1 branch 3 times, most recently from 55c8b4e to 97dbcf6 Compare April 1, 2024 18:27

[Relax] Share storage allocs among functions after cuda graph rewriting

045342f

vinx13 force-pushed the feat/cuda-graph-merge-1 branch from 97dbcf6 to 045342f Compare April 1, 2024 19:27

tqchen approved these changes Apr 1, 2024

vinx13 merged commit f83a329 into apache:main Apr 2, 2024
19 checks passed

thaisacs pushed a commit to thaisacs/tvm that referenced this pull request Apr 3, 2024

[Relax] Share storage allocs among functions after cuda graph rewriti…

25b76c0

…ng (apache#16830)

ysh329 mentioned this pull request Apr 21, 2024

[Release] v0.16.0 Release Candidate Notes #16911

Closed
