Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Hari work 2 #11

Open
wants to merge 2 commits into
base: ledac/next_gpu
Choose a base branch
from

Conversation

science-enthusiast
Copy link

Bonjour
I still have doubts that I have marked as "DOUBT_HARI". We will discuss and address the issues.

Merci

Hari

@pledac pledac self-requested a review January 8, 2025 12:54
@pledac
Copy link
Member

pledac commented Jan 8, 2025

Hello Harri, thanks for the PR. Finally, the rebase can't be done for the moment. So just:
a) Fix the GPU build as intended
b) Please can you keep the syntax for the rewritten kernel and avoid using view_tab instead of tab except if necessary ? Example:

         for (int num_face = ndeb; num_face < nfin; num_face++)
            {
              const int elem1 = face_voisins(num_face, 0), elem2 = face_voisins(num_face, 1);
              int elem = elem1 == -1 ? elem2 : elem1;
              const double surf = zp1b.surface(num_face);
              tab_flux_faces(num_face) = val_flux(num_face - ndeb, 0) * surf / zp1b.volumes(elem); // TODO multiple elements!! units val_flux(num_face-ndeb,0) *surf [kg.s-1] => gives [kg.m-3.s-1]
            }

Would be rewritten like that:

          CDoubleArrView volumes_entrelaces = zp1b.volumes_entrelaces().view_ro();
          CIntTabView face_voisins = zp1b.face_voisins().view_ro();
          CIntTabView face_sommets = zp1b.face_sommets().view_ro();
          Kokkos::parallel_for(start_gpu_timer(__KERNEL_NAME__), Kokkos::RangePolicy<>(ndeb, nfin), KOKKOS_LAMBDA (const int num_face)
          {
            const int elem1 = face_voisins(num_face, 0), elem2 = face_voisins(num_face, 1);
            int elem = elem1 == -1 ? elem2 : elem1;
            const double surf = face_surfaces(num_face);
            flux_faces(num_face) = val_flux(num_face - ndeb, 0) * surf / volumes(elem); // TODO multiple elements!! units val_flux(num_face-ndeb,0) *surf [kg.s-1] => gives [kg.m-3.s-1]
          });
          end_gpu_timer(Objet_U::computeOnDevice, __KERNEL_NAME__);

It is for readability and to keep the Kokkos port the same on the whole code. Thanks !

@science-enthusiast
Copy link
Author

Hi Pierre
I agree with you about maintaining consistent style. Thanks for the descriptive example. I will modify my code accordingly.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants