
Fix CUDA gradient script #806

Merged · 2 commits merged on Mar 9, 2024

Conversation

kchristin22 (Collaborator)
Fixes #744.

As described here, the gradient computation of the function of interest was moved inside the kernel. The results returned from the GPU were then compared with those obtained from the CPU execution to verify the correctness of the implementation.
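A rough sketch of the pattern described above, computing the gradient on the device inside the kernel rather than on the host. All names and signatures here are illustrative, not the actual test code; it assumes clad's `clad::gradient` API and a CUDA-capable build:

```cpp
#include "clad/Differentiator/Differentiator.h"

// Illustrative stand-in for the function differentiated in the test script.
__device__ __host__ double gauss(double* x, double* p, double sigma, int dim);

// The gradient is now computed inside the kernel, on the GPU, instead of on
// the host after the kernel returns.
__global__ void compute(double* d_x, double* d_p, int n, double* d_result) {
  auto grad = clad::gradient(gauss, "p");
  grad.execute(d_x, d_p, /*sigma=*/2.0, n, d_result);
}

// Host side (not shown): launch compute<<<...>>>(...), copy d_result back,
// and compare against the gradient computed on the CPU.
```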


codecov bot commented Mar 7, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 94.89%. Comparing base (d7e5434) to head (20be1f3).

@@           Coverage Diff           @@
##           master     #806   +/-   ##
=======================================
  Coverage   94.89%   94.89%           
=======================================
  Files          49       49           
  Lines        7478     7478           
=======================================
  Hits         7096     7096           
  Misses        382      382           

kchristin22 (Collaborator, Author)

It would also be safer to add checks for CUDA errors when calling CUDA functions, but the final comparison of the GPU and CPU results will hint at any need for debugging, so I believe there's no need to "clutter" the code further.
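For reference, the kind of check mentioned above is usually a small wrapper around CUDA runtime calls. A minimal sketch using the standard `cudaError_t`/`cudaGetErrorString` API (requires the CUDA toolkit to compile):

```cpp
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Abort with a readable message if a CUDA runtime call fails.
#define CUDA_CHECK(call)                                              \
  do {                                                                \
    cudaError_t err = (call);                                         \
    if (err != cudaSuccess) {                                         \
      std::fprintf(stderr, "CUDA error: %s at %s:%d\n",               \
                   cudaGetErrorString(err), __FILE__, __LINE__);      \
      std::exit(EXIT_FAILURE);                                        \
    }                                                                 \
  } while (0)

// Usage: CUDA_CHECK(cudaMalloc(&d_x, N * sizeof(double)));
```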

@@ -15,6 +15,7 @@
// XFAIL: clang-15
vgvassilev (Owner)

Can we understand why it fails for clang-15?

kchristin22 (Collaborator, Author) commented Mar 7, 2024

My first guess would be that the local CUDA version where this was tested and failed was not compatible with clang-15. I will have a look.

kchristin22 (Collaborator, Author)

Unfortunately, I haven't found a CUDA version old enough (prior to 11.5) that is also compatible with my system (Ubuntu 22.04) to test this with clang-15, but while examining another issue I came across a similar failure with clang-15 and clang-14 (I didn't test with clang-16). Relevant comment here.

vgvassilev (Owner)

This looks good to me. We probably should open a new issue and a separate PR to continue the XFAIL: clang-15 discussion there.

kchristin22 (Collaborator, Author)

I can open one if you'd like.

I was also thinking that if the CI were expanded to include a CUDA setup, we could catch CUDA build errors directly in the PR. This would make development somewhat harder, but for older versions of clang it would benefit developers running the latest OS versions. This could be a separate issue.

vgvassilev (Owner)

Sounds good.

@@ -109,14 +110,24 @@ int main(void) {
cudaMemcpy(d_x, x, N * sizeof(double), cudaMemcpyHostToDevice);
cudaMalloc(&d_p, N * sizeof(double));
cudaMemcpy(d_p, p, N * sizeof(double), cudaMemcpyHostToDevice);
double *result, *d_result;
std::vector<double> result(N, 0);
vgvassilev (Owner)

We probably do not need a std::vector here. Can we use a fixed size array or an std::array?

vgvassilev (Owner) left a comment

LGTM. See my comment.

@vgvassilev vgvassilev merged commit c3a4f8e into vgvassilev:master Mar 9, 2024
81 checks passed
Development

Successfully merging this pull request may close these issues.

derivative value of 0 when running the CUDA test script
2 participants