A relatively optimal model automatic partitioning scheme #744
ExcellentHH started this conversation in Ideas
Replies: 1 comment 1 reply
-
Best bet is to call gen-settings on the partitions -- this will give you an exact number of rows used. Ultimately something akin to BFS may work -- it gets a bit complex for larger / branching models (where this would really shine).
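For example, a rough way to do this from a script might look like the sketch below. The CLI flags (`-M`, `--settings-path`) and the settings fields read back (`num_rows`, `run_args.logrows`) are assumptions and may differ between ezkl versions; check them against the actual gen-settings output.

```python
# Sketch: measure the exact row usage of each candidate partition by running
# `ezkl gen-settings` on the split ONNX files, as suggested above.
# Flag names and settings fields below are assumptions -- verify them against
# the gen-settings output of your ezkl version.
import json
import subprocess

def rows_for_partition(model_path: str, settings_path: str) -> dict:
    """Run gen-settings on one sub-model and return its row-related fields."""
    subprocess.run(
        ["ezkl", "gen-settings", "-M", model_path, "--settings-path", settings_path],
        check=True,
    )
    with open(settings_path) as f:
        settings = json.load(f)
    # Field names are illustrative; inspect settings.json to see what your
    # version actually reports (rows used, logrows, lookup ranges, ...).
    return {
        "num_rows": settings.get("num_rows"),
        "logrows": settings.get("run_args", {}).get("logrows"),
    }

# Example: compare two candidate splits of the same network.
for part in ["part_0.onnx", "part_1.onnx"]:
    print(part, rows_for_partition(part, f"settings_{part}.json"))
```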
-
When I first tried ezkl, I greatly admired the efforts of the ezkl project team in making the verification of ONNX models simple and accessible. However, a common issue that arose during usage was the large scale of the models to be verified, leading to constraint counts exceeding 2^26. This resulted in significantly prolonged proving time and even instances of insufficient hardware memory, limiting the utility of ezkl on nodes with weaker computing capabilities. Hence, I considered partitioning a model into multiple sub-models and committing to the values shared among them to ensure consistency. It was delightful to discover that recent ezkl updates adopt a similar approach, employing hash functions or commitment schemes to commit to the model's intermediate feature values, as described in https://blog.ezkl.xyz/post/splitting/.

Personally, I believe there is still room for optimization here, and it does not conflict with the GPU acceleration supported by the current version of ezkl. Clearly, different ways of partitioning the model will result in different numbers of intermediate feature values to be committed. Therefore, when partitioning the model, we need to consider not only the proving and verification costs of the normal model inference process but also the additional costs arising from committing the intermediate feature values, while staying within the upper limit on the number of constraints the hardware can handle during proving and verification.
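For concreteness, cutting a model at an intermediate tensor could look something like the sketch below, using `onnx.utils.extract_model`. The model path and the tensor names (`input`, `relu_3_out`, `output`) are placeholders for whatever boundary one chooses; the commitment to the cut tensor's values is then handled by ezkl's split-proving machinery described in the blog post.

```python
# Sketch: split one ONNX model into two sub-models at a chosen intermediate
# tensor, so the second proof can consume a commitment to that tensor's values.
# Tensor names below are hypothetical placeholders.
import onnx

full_model = "network.onnx"
cut_tensor = "relu_3_out"  # intermediate feature map shared by both halves

# Everything from the graph input up to (and producing) the cut tensor.
onnx.utils.extract_model(full_model, "part_0.onnx",
                         input_names=["input"], output_names=[cut_tensor])

# Everything from the cut tensor to the final output; the cut tensor becomes
# this sub-model's input, whose values are committed/hashed for consistency.
onnx.utils.extract_model(full_model, "part_1.onnx",
                         input_names=[cut_tensor], output_names=["output"])
```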
My initial idea was to traverse the ONNX graph with a DFS or BFS, estimating the number of rows each layer needs during layout from the parameters of its layer type (e.g., input size, kernel size, stride, and padding for Conv layers), plus the additional overhead generated by commitments, in order to determine a relatively optimal partitioning of the model. However, I ran into difficulty calculating the rows: I am currently unsure how to deduce the required rows from the parameters of the different layer types, or how to calculate the rows needed for a commitment from the number of elements being committed. Could you provide some guidance on this? Alternatively, do you have any better insights on partitioning strategies?
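To make the idea concrete, here is a rough sketch of the traversal I have in mind. `estimate_rows` and `commitment_rows` are just placeholder cost models (exactly the part I do not know how to fill in); as noted in the reply, the exact numbers would have to come from running gen-settings on each candidate split, and a real implementation for branching models would need a dependency-aware BFS rather than a linear pass.

```python
# Sketch of a greedy partition search over an ONNX graph under a row budget.
# estimate_rows() and commitment_rows() are placeholders, not real cost models.
from collections import deque
import onnx

def estimate_rows(node: onnx.NodeProto) -> int:
    # Placeholder: a real cost model would use input size, kernel size,
    # stride, padding, etc. per op type.
    return 1

def commitment_rows(num_outputs: int) -> int:
    # Placeholder: crude proxy using the number of output tensors, not the
    # number of committed elements.
    return num_outputs

def greedy_partition(model_path: str, row_budget: int = 2**26):
    graph = onnx.load(model_path).graph
    partitions, current, used = [], [], 0
    # ONNX stores nodes in topological order, so a simple queue over
    # graph.node suffices for chain-like models; branching models would
    # need a proper BFS over data dependencies.
    queue = deque(graph.node)
    while queue:
        node = queue.popleft()
        cost = estimate_rows(node)
        # Overhead of committing the tensors that would cross a cut here.
        cut_cost = commitment_rows(len(node.output))
        if current and used + cost + cut_cost > row_budget:
            partitions.append(current)
            current, used = [], 0
        current.append(node.name or node.op_type)
        used += cost
    if current:
        partitions.append(current)
    return partitions

print(greedy_partition("network.onnx"))
```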
Any insights are appreciated, thanks for the hard work!