
[NNPA] Memory reduction of stickified constant by stickifying at file writing #2917

Open · wants to merge 73 commits into main
Conversation

@imaihal (Collaborator) commented Aug 26, 2024

This PR reduces memory usage during NNPA compilation. The current main branch creates stickified data in ZHighConstPropagationPass and keeps that data until compilation finishes. This PR instead stores the original (non-stickified) data in ZHighConstPropagationPass; then, in KrnlToLLVMPass, the stickified data is created, written into the output file, and deleted immediately after writing.

(figure)
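The write-then-free lifetime described above can be sketched in plain C++ (no MLIR dependencies; `stickify` here is a hypothetical stand-in for the real zDNN layout transformation):

```cpp
#include <cassert>
#include <cstdint>
#include <ostream>
#include <sstream>
#include <vector>

// Hypothetical stand-in for the real stickification; the actual layout
// transformation is performed by the zDNN library. Here we only copy bytes.
static std::vector<uint8_t> stickify(const std::vector<uint8_t> &raw) {
  return raw;
}

// Delayed stickification: the stickified buffer exists only for the
// duration of the file write, so peak memory stays close to the size of
// the original (non-stickified) constant instead of holding both copies
// for the whole compilation.
static void writeStickifiedConstant(
    const std::vector<uint8_t> &raw, std::ostream &out) {
  std::vector<uint8_t> sticked = stickify(raw); // created just before the write
  out.write(reinterpret_cast<const char *>(sticked.data()),
      static_cast<std::streamsize>(sticked.size()));
} // `sticked` is freed here, immediately after writing
```

This only illustrates the buffer lifetime; the actual PR performs this inside the lowering of the constant op.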

@imaihal changed the title from "[NNPA] Memory reduction by running stickification at file writing" to "[NNPA] Memory reduction of stickified constant by stickifying at file writing" on Aug 26, 2024
@imaihal (Collaborator, Author) commented Sep 11, 2024

@jenkins-droid test this please.

@imaihal (Collaborator, Author) commented Oct 11, 2024

Question about the solution:
1. The stickification is delayed until constants are lowered to LLVM. During this lowering, can the space for each krnl.Global be freed one by one as the new LLVM constant is created? I am comparing this with the original code, where onnx.Constant is converted to a stickified krnl.Global in the ONNX-to-ZHigh conversion.
2. Following the previous point, I'd like to know whether the benefit of this PR comes from storing constants in a file instead of in memory, or from delaying the stickification operation.

I updated the figure in this PR since the previous figure might be confusing.
The solution does not use krnl.Global. Instead, zlow.StickifiedConstant is used and lowered to LLVM. When krnl.Global is used, memory usage cannot be reduced. KrnlGlobal.cpp is also modified in this PR, but only to use the ConstantOpInterface.
When zlow.StickifiedConstant is lowered to LLVM, the data in its value attribute is stickified, stored into a file, and then deleted after writing. The original non-stickified data stays alive, but the stickified data does not.
The benefit comes from delaying the stickification, but it only materializes when storing into a file.

@chentong319 (Collaborator) commented:
The previous figure is fine. I knew that this PR replaced some krnl.GlobalOps (the results of stickified constants) with ZLowStickifiedConstantOp. The previous figure showed why the ConstantOpInterface is introduced.

@imaihal (Collaborator, Author) commented Oct 17, 2024

@chentong319 Thanks for the comments. I'm now looking into the RNN group and inheritance in the op definitions.

@chentong319 (Collaborator) commented:
Our RNN group did not use inheritance. We just use a common utility function for the core part of the different RNN operations.

@imaihal (Collaborator, Author) commented Oct 25, 2024

I'm considering using common functions.

I'm investigating how we can commonize extractConstantsToFile() for KrnlGlobalOp and ZLowStickifiedConstantOp. Currently it is called here. In the function, ConstantOpInterface is used for KrnlGlobalOp and ZLowStickifiedConstantOp. I'm considering commonizing it using a template, then calling it as extractConstantsToFile<KrnlGlobalOp> or extractConstantsToFile<ZLowStickifiedConstantOp>, but I think we can't set the operation name there. Do you know a good way?
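For illustration, the template version being asked about could look like the following plain-C++ sketch (the op classes here are hypothetical stand-ins, not the real MLIR-generated ops):

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-ins for the real ops; in MLIR these would be the
// TableGen-generated op classes, each exposing an accessor for its data.
struct KrnlGlobalOp {
  std::vector<char> value;
  const std::vector<char> &getValue() const { return value; }
};
struct ZLowStickifiedConstantOp {
  std::vector<char> value;
  const std::vector<char> &getValue() const { return value; }
};

// One template serves both op kinds as long as they expose the same
// accessor; the caller must instantiate it per op type, e.g.
// extractConstantsToFile<KrnlGlobalOp>(...), which is the limitation
// raised above when the op type is not fixed at the call site.
template <typename OpT>
size_t extractConstantsToFile(const std::vector<OpT> &ops, std::string &file) {
  size_t written = 0;
  for (const OpT &op : ops) {
    file.append(op.getValue().begin(), op.getValue().end());
    written += op.getValue().size();
  }
  return written;
}
```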

@chentong319 (Collaborator) commented:
If you use the op-matching rule for a rewrite, the op is cast automatically.
In this implementation you traverse the ops yourself, so the op has to be cast explicitly.
Then you can call the member functions and pass the results as parameters to the common utility function.
If you feel that is too complicated, it is okay with me to use ConstantOpInterface (perhaps with another, less general name).

From ConstantOpInterface to KrnlGlobalOpInterface

Signed-off-by: Haruki Imai <[email protected]>
@imaihal (Collaborator, Author) commented Oct 29, 2024

@chentong319 Thanks for the investigations and comments!

I tried implementing it in other ways, such as common functions and a Trait, but I couldn't write it cleanly. So I would like to keep using an OpInterface.
How about using KrnlGlobalOpInterface? I used the OpInterface so that ZLowStickifiedConstantOp can share the same functions with KrnlGlobalOp. I can replace it if you have a better candidate.
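As a plain-C++ analogy (MLIR OpInterfaces are generated from TableGen, so this sketch only mirrors the shape of the design, not the real API), the KrnlGlobalOpInterface idea lets one function handle both ops through a common set of accessors, with no op type named at the call site:

```cpp
#include <cassert>
#include <string>
#include <vector>

// Analogy for KrnlGlobalOpInterface: both constant ops implement one
// interface, so file extraction can treat them uniformly. Names mirror
// the PR, but these classes are illustrative only.
struct KrnlGlobalOpInterface {
  virtual ~KrnlGlobalOpInterface() = default;
  virtual const std::vector<char> &getValue() const = 0;
};

struct KrnlGlobalOp final : KrnlGlobalOpInterface {
  std::vector<char> value;
  const std::vector<char> &getValue() const override { return value; }
};

struct ZLowStickifiedConstantOp final : KrnlGlobalOpInterface {
  std::vector<char> value;
  const std::vector<char> &getValue() const override { return value; }
};

// A single, non-templated function: no need to name the op type at the
// call site, which sidesteps the problem raised with the template version.
size_t extractConstantsToFile(
    const std::vector<const KrnlGlobalOpInterface *> &ops, std::string &file) {
  size_t written = 0;
  for (const KrnlGlobalOpInterface *op : ops) {
    file.append(op->getValue().begin(), op->getValue().end());
    written += op->getValue().size();
  }
  return written;
}
```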

"The stickified tensor's buffer size and MemRef's size mismatched");

// Create a KrnlGlobalOp.
KrnlGlobalOp constantGlobal =
@chentong319 (Collaborator) commented Oct 29, 2024:
Keep the previous implementation with KrnlGlobalOp in a comment or in an `if false` branch, if you do not want to create an option to control the choice. Alternatively, you can define an option '--disable-krnl-constant-to-file' with a default value of 'false'.
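The suggested option could gate the two code paths roughly like this (a minimal sketch; the flag name follows the reviewer's suggestion, and the real compiler would register it through its own option machinery, e.g. llvm::cl):

```cpp
#include <cassert>
#include <string>

// Hypothetical flag mirroring the suggested '--disable-krnl-constant-to-file'
// option, default false (i.e. the new file-backed path is the default).
static bool disableKrnlConstantToFile = false;

// Choose the lowering path: the previous KrnlGlobalOp implementation is
// kept reachable behind the flag instead of being deleted.
std::string chooseConstantLowering() {
  if (disableKrnlConstantToFile)
    return "krnl.global";           // previous in-memory implementation
  return "zlow.stickifiedConstant"; // new path: stickify at file writing
}
```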

@imaihal (Collaborator, Author) replied:

OK. Is this because we may reuse the previous implementation in the future?

@imaihal (Collaborator, Author) replied:
I created the directive NNPA_ZHIGH_STICKIFIEDCONST_GEN to keep the original implementation. It is currently commented out, but I confirmed it works when this code is enabled.

@imaihal (Collaborator, Author) commented Oct 31, 2024

Could you review again?

@AlexandreEichenberger (Collaborator) commented:
@chentong319 Do you mind reviewing it? You have been very involved with this PR. Thanks
