Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[VM] [Hexagon] Introduce 2D Discontiguous vtcm alloc tensor #16557

Closed

Conversation

quic-sanirudh
Copy link
Contributor

Adds 2D Discontiguous alloc tensor hexagon builtin to support 2D allocations for hexagon at relax level. This is needed when the ops are implemented to take advantage of 2d indirections and enables memory manager optimizations to try utilize VTCM memory efficiently.

This patch also introduces the R.vm.copy_tensor op to support copies between different tensors, specifically planned to be used when copying tensors from one memory scope to another

@quic-sanirudh
Copy link
Contributor Author

Copy link
Contributor

@rasagna-quic rasagna-quic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks Anirudh

Copy link
Contributor

@abhikran-quic abhikran-quic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thanks @quic-sanirudh !

@tqchen
Copy link
Member

tqchen commented Feb 14, 2024

sorry for getting into this late. I see we introduced some metadata by having a map from pointer to metadata, @quic-sanirudh , an alternative would be to introduce a HexagonBuffer object that contains the meta data like OpenCL
https://github.com/apache/tvm/blob/main/src/runtime/opencl/opencl_common.h#L382

@quic-sanirudh
Copy link
Contributor Author

sorry for getting into this late. I see we introduced some metadata by having a map from pointer to metadata, @quic-sanirudh , an alternative would be to introduce a HexagonBuffer object that contains the meta data like OpenCL https://github.com/apache/tvm/blob/main/src/runtime/opencl/opencl_common.h#L382

Thanks for the review @tqchen. I guess you're referring to the pointer table we allocate to store 2D tensors. What we want is to allocate 2D physical tensors where the first dimension is just a list of pointers to blocks of data. This is planned so that we can take full advantage of the very little VTCM memory we have in hexagon. I'll take a look at the OpenCL code mentioned here and see if I can refactor my patch based on the ideas there, thanks.

@quic-sanirudh
Copy link
Contributor Author

Hi @tqchen, I've gone through the OpenCL code a little bit and as far as I understood, that BufferDescriptor metadata stored there is similar to the physical shape we use in this PR. We could refactor the code to follow a similar structure, but the main intention in this PR was to enable this alloc_discontiguous_tensor builtin as a runtime call to support these pointer offsets, because we would like to allocate tensors as a discontiguous chunk in VTCM memory because of its size restrictions.

If it's okay, could you provide feedback on the introduction of copy_tensor part as well, and perhaps we can merge this one if that looks okay and we'll refactor the hexagon_device_api in a follow-up PR so that it's not blocking some related work downstream.

@@ -199,6 +215,9 @@ class HexagonDeviceAPI final : public DeviceAPI {

//! \brief Hexagon power manager
std::unique_ptr<HexagonPowerManager> runtime_power_manager;

//! \brief NDArray base -> Physical Shape map
std::map<void*, PhysicalShape> ndarray_physical_shape;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

usually we prefer unordered_map over map.

my main comment is to embed the metadata like physicalshape into the Buffer data structure in void* data field, so we can remove this map altogether.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the explanation @tqchen. Yes it would be nice to do this change, I wanted to do it in a follow-up PR because I would like to refactor many things in the hexagon_device_api along with this.

I want to do a bigger cleanup because basically this map is populated with 2D tensors that are allocated in 2 different ways (one is when 2D storage and its tensor is directly allocated and second is through the alloc_discontiguous_tensor introduced in this PR).
Both are needed for different use cases and I just did not want to include code refactor in this PR, but I'll definitely introduce a follow-up PR and fix this.

src/runtime/relax_vm/builtin.cc Outdated Show resolved Hide resolved
src/runtime/relax_vm/hexagon/builtin.cc Outdated Show resolved Hide resolved
src/runtime/hexagon/hexagon_device_api.cc Outdated Show resolved Hide resolved
src/runtime/hexagon/hexagon_buffer.h Outdated Show resolved Hide resolved
@tqchen
Copy link
Member

tqchen commented Feb 19, 2024

Thanks @quic-sanirudh . I did another round provided all the feedbacks, let me know if they make sense

@quic-sanirudh quic-sanirudh force-pushed the add_2d_allocations_copying branch from bb61e41 to 6902f5a Compare February 20, 2024 09:21
@quic-sanirudh
Copy link
Contributor Author

Thanks @quic-sanirudh . I did another round provided all the feedbacks, let me know if they make sense

Thanks a lot @tqchen for taking the time to review. I've made all the changes suggested. Let me know if it looks good now.

@quic-sanirudh quic-sanirudh force-pushed the add_2d_allocations_copying branch from 6902f5a to 218fce3 Compare February 20, 2024 11:25
@quic-sanirudh
Copy link
Contributor Author

@tvm-bot rerun

@tqchen
Copy link
Member

tqchen commented Feb 20, 2024

happy to get it in @quic-sanirudh to unblock folks, let us followup and try remove the phyiscal map by embedding them in buffer descriptor

@quic-sanirudh
Copy link
Contributor Author

happy to get it in @quic-sanirudh to unblock folks, let us followup and try remove the phyiscal map by embedding them in buffer descriptor

Thanks a lot for understanding. I'll follow-up as soon as I can with the change to remove the physical map.

@quic-sanirudh
Copy link
Contributor Author

@tvm-bot rerun

@quic-sanirudh quic-sanirudh force-pushed the add_2d_allocations_copying branch from 218fce3 to d4a085a Compare February 21, 2024 15:49
Adds 2D Discontiguous alloc tensor hexagon builtin to support 2D
allocations for hexagon at relax level. This is needed when the ops are
implemented to take advantage of 2d indirections and enables
memory manager optimizations to try utilize VTCM memory efficiently.

This patch also introduces the `R.vm.copy_tensor` op to support copies
between different tensors, specifically planned to be used when copying
tensors from one memory scope to another

Co-authored-by: arangasa <[email protected]>
@quic-sanirudh quic-sanirudh force-pushed the add_2d_allocations_copying branch from d4a085a to 6a836a1 Compare February 29, 2024 14:26
@tqchen
Copy link
Member

tqchen commented Mar 4, 2024

@tvm-bot rerun

@quic-sanirudh
Copy link
Contributor Author

This patch seems to have some issue in the build that I'm unable to reproduce locally in my setup (Those tests seem to pass in my setup).. I'll keep this open a while and try to debug the issue through the CI docker setup locally and fix it before retriggering the build.

@quic-sanirudh
Copy link
Contributor Author

This patch had some data race in the hexagon device api that needs to be investigated. I'm working on a cleaned up version that I'll raise in a separate follow-up PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants