-
Notifications
You must be signed in to change notification settings - Fork 326
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Enable Constant Propagation for ReduceLogSum Backend tests #2517
Conversation
Let @gongsu832 make the final decision as the changes are related to dockers. |
@gongsu832 Would you be able to look over this PR please? Thanks |
docker/Dockerfile.onnx-mlir-dev
Outdated
# Enable Constant Propagation | ||
&& TEST_CONSTANT_PROP=${TEST_CONSTANT_PROP:-$([ "$(uname -m)" = "s390x" ] && echo true || \ | ||
([ "$(uname -m)" = "x86_64" ] && echo true || \ | ||
([ "$(uname -m)" = "ppc64le" ] && echo true || echo true)))} \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same comment as above.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Gotcha... makes sense
What does that mean? When we run at Why is then a test failing at |
@AlexandreEichenberger The default behavior is set to false meaning constant propagation is disabled unless I enable it. So if we look at the lit tests from the PR I just merged in. I had to set the Also if we are running at |
This was the failure below that only occurred once I made my code updates:
Just to clarify @gongsu832 is JNI check-onnx-backend-constant was built with O0, O1, O2, or O3? |
I would like to know why that benchmark fails if we don't do constant propagation. Because this may be the tell tale that we have a problem. Also, why was this not caught in our regular CIs? Since your original patch could only have gone through with successful CIs on our key machines. |
I commented the tests out. When the backend tests was enabled before this PR it failed in the CI. It only failed for Jenkins job. When I build on my Mac, the tests had no issues and it passed here too. Like I mentioned earlier, my assumption is JNI is dependent on constant propagation. Only thing I am certain of is these two tests worked before my code changes... I saw the failure once creating constant propagation flag. The same applies for about 5 or 6 lit tests that failed. |
JNI itself doesn't have the notion of O0-3. The native model.so is built according to whatever O level set for the C/C++ tests. However, we run tests with -O0 for the dev image and with -O3 for the user image. If constant propagation requires -O3, you need to turn it off when building the dev image. |
Want to know too. Any test should be passed with all options/combinations of -O{0,1,2,3} and constant propagation on/off. Otherwise, we need to find out the bug to fix. @hamptonm1 could you run the failed test |
Okay let me test that out now and I will post results. |
Thanks @hamptonm1 , I know that disabling is quicker but getting to the bottom of this error now is much easier than having to go on chasing the same error on a very large model. So this is a big help. Thanks for finding the issue and helping understand what might be wrong here, much appreciated. |
@AlexandreEichenberger I re-enabled the tests in the PR that was always the plan.... so it is no longer disabled. The only thing I did was create a flag to enable constant propagation for the backend tests seeing that it passes once I include said flag. However, I am fine with running the tests via the python script to see if I can collect any other data. |
@AlexandreEichenberger @tungld Okay here are the results below for
What stands out to me is this message I also tested using
|
I am going to close tis PR out and hopefully this PR should solve our issues: #2537. Thanks! |
I reenabled two of the ReduceLogSum that were failing because constant propagation needed to be turned on.