
[Relax] Share storage allocs among functions after cuda graph rewriting #16830

Merged · 1 commit merged into apache:main on Apr 2, 2024

Conversation

vinx13 (Member) commented on Apr 1, 2024

This PR makes storage shared among different functions after CUDA graph rewriting. Because the CUDA graph caches storage, storage objects are not freed after function execution, which increases memory usage when the module has multiple functions. Sharing the storage objects eliminates this overhead.
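For context, a minimal sketch (not taken from this PR; the helper name and its placement in the pipeline are assumptions) of where the pass applies. `RewriteCUDAGraph` runs on a lowered Relax module; with this change, the storage allocations it lifts out for graph capture are shared across the module's functions instead of being cached once per function:

```python
# A minimal sketch, not from the PR. Assumes `mod` is a lowered Relax
# IRModule (i.e. after memory planning) containing multiple functions.
import tvm
from tvm import relax

def rewrite_for_cuda_graph(mod: tvm.IRModule) -> tvm.IRModule:
    # RewriteCUDAGraph lifts static allocations and captured kernel
    # launches into separate functions executed through CUDA Graph.
    # After this PR, the lifted storage objects are shared among all
    # rewritten functions rather than held separately by each one.
    return relax.transform.RewriteCUDAGraph()(mod)
```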

It also updates the rewriting to avoid capturing storages and bindings that are used as function outputs. Previously we relied on the fact that output tensors are allocated with R.builtin.alloc_tensor; however, this behavior changed after storage planning was enabled for output tensors, which may now also be allocated via R.memory.alloc_memory.
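To illustrate, a hedged sketch of the check described above (the helper and the op-name set are assumptions based on the description, not the PR's actual implementation): the rewriting now has to recognize output allocations in both forms.

```python
# Hypothetical helper sketching the output-allocation check; the op-name
# set is an assumption, not the PR's actual code.
import tvm
from tvm import relax

OUTPUT_ALLOC_OPS = {
    "relax.builtin.alloc_tensor",  # direct output allocation
    "relax.memory.alloc_storage",  # storage introduced by memory planning
    "relax.memory.alloc_tensor",   # tensor carved out of planned storage
}

def is_output_alloc(call: relax.Call) -> bool:
    # Bindings produced by these ops must not be captured into the
    # CUDA graph when their results are used as function outputs.
    return isinstance(call.op, tvm.ir.Op) and call.op.name in OUTPUT_ALLOC_OPS
```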

cc @tqchen

@github-actions bot requested a review from tqchen on April 1, 2024 17:08
@vinx13 force-pushed the feat/cuda-graph-merge-1 branch 3 times, most recently from 55c8b4e to 97dbcf6 on April 1, 2024 18:27
@vinx13 force-pushed the feat/cuda-graph-merge-1 branch from 97dbcf6 to 045342f on April 1, 2024 19:27
@vinx13 merged commit f83a329 into apache:main on Apr 2, 2024
19 checks passed
thaisacs pushed a commit to thaisacs/tvm that referenced this pull request Apr 3, 2024