Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Initial MIR component #1438

Open
wants to merge 344 commits into
base: develop
Choose a base branch
from
Open

Initial MIR component #1438

wants to merge 344 commits into from

Conversation

BradWhitlock
Copy link
Member

@BradWhitlock BradWhitlock commented Oct 3, 2024

This PR is adds an initial MIR component to Axom. EquiZ MIR was implemented as the first algorithm because it was more familiar and my first goal was to shake out problems with infrastructure for writing algorithms against Blueprint data. The MIR algorithm takes Blueprint input and generates Blueprint output.

This PR does the following:

  • Supercedes previous MIR work (previous work was either adapted or moved to "reference" directory)
  • Implements EquiZ (without iteration for now) that runs on CPU/GPU and works on 2D/3D meshes with various topology and shape types.
  • Provides views to simplify writing algorithms that use Blueprint data.
  • Provides several utility classes that perform mesh operations (extract zones, merge meshes, etc.)
  • Adds documentation
  • Adds example programs
  • Adds a little core infrastructure

NOTE:

  • The Elvira algorithm will be added in a future PR.
  • The previous serial MIR implementation was moved to a "reference" directory that we can remove once EquiZAlgorithm has iteration support (future work).
  • The component requires Conduit, RAJA, Umpire - I'll probably have to iterate with the CI some to make that work everywhere.
  • There are some files that changed because of "make style"

using MaterialID = int;
using MaterialIDArray = axom::Array<MaterialID>;
using MaterialIDView = axom::ArrayView<MaterialID>;
using MaterialVF = float;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is float a design choice over double?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, but I think I could change it to use the material view's FloatType for MaterialVF so it stays true to the types specified in the material view. I think I had some compiler problems in this code but I could try again.

Copy link
Member

@kennyweiss kennyweiss left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Continuing to make my way through this PR... this review covers the MIR examples.

When you get a chance, could please share some performance results and visualizations?

src/axom/mir/tests/mir_smoke.cpp Outdated Show resolved Hide resolved
src/axom/mir/tests/mir_testing_helpers.hpp Outdated Show resolved Hide resolved
std::cout << "a={";
for(axom::IndexType i = 0; i < a.size(); i++)
{
if(i > 0) std::cout << ", ";
Copy link
Member

@kennyweiss kennyweiss Oct 31, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note:
Our clang-format can't enforce this until we go past clang-10, but we require braces around single line conditionals/loops
Since we're on clang-14 for our clang-format, we should update our rules to automatically add braces around single line conditionals

Comment on lines 41 to 48
# Add MIR if the prerequisites are found.
if(RAJA_FOUND AND UMPIRE_FOUND AND CONDUIT_FOUND)
axom_add_component(COMPONENT_NAME mir
DEFAULT_STATE ${AXOM_ENABLE_ALL_COMPONENTS})
else()
message(STATUS "Axom Component MIR turned off due to missing RAJA, UMPIRE, or Conduit.")
set(AXOM_ENABLE_MIR OFF CACHE BOOL "")
endif()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if we should add lines to our spack package to disable MIR when RAJA and Conduit are not present.
And similarly, to disable Sidre, Inlet and Klee when axom is configured without conduit.

For your specific question, I think we'd need to add -DAXOM_ENABLE_MIR=OFF to the *_noraja* CI configs, e.g. here

CMAKE_EXTRA_FLAGS: '-DBUILD_SHARED_LIBS=ON -DAXOM_QUEST_ENABLE_EXTRA_REGRESSION_TESTS:BOOL=ON -U RAJA_DIR'

src/axom/mir/examples/CMakeLists.txt Show resolved Hide resolved
@BradWhitlock
Copy link
Member Author

BradWhitlock commented Nov 9, 2024

I added a script that will make BASH scripts to run the mir_concentric_circles example program on various platforms and gather the caliper data results. I plotted the MIR runtime and some things look good, some very bad at the moment. I think SEQ mode could be optimized more to bring it down, likely bringing down time in the other modes too. OMP has inexplicably bad performance. I've seen some hints of this before during development but focused more on GPU so far. I'll need to profile OMP and see what's the matter.

image

rzansel42{whitlocb}144: env OMP_NUM_THREADS=44 ./examples/mir_concentric_circles --gridsize 1000 --numcircles 5 --policy omp | grep runMIR
runMIR                                 17.601138     17.601138     17.601138 88.542790 
rzansel42{whitlocb}145: env OMP_NUM_THREADS=88 ./examples/mir_concentric_circles --gridsize 1000 --numcircles 5 --policy omp | grep runMIR
runMIR                                 20.347344     20.347344     20.347344 89.191530 
rzansel42{whitlocb}146: env OMP_NUM_THREADS=32 ./examples/mir_concentric_circles --gridsize 1000 --numcircles 5 --policy omp | grep runMIR
runMIR                                 16.865811     16.865811     16.865811 88.723097 
rzansel42{whitlocb}147: env OMP_NUM_THREADS=22 ./examples/mir_concentric_circles --gridsize 1000 --numcircles 5 --policy omp | grep runMIR
runMIR                                 12.256273     12.256273     12.256273 85.147964 
rzansel42{whitlocb}148: env OMP_NUM_THREADS=2 ./examples/mir_concentric_circles --gridsize 1000 --numcircles 5 --policy omp | grep runMIR
runMIR                                  6.044385      6.044385      6.044385 73.873691 
rzansel42{whitlocb}149: env OMP_NUM_THREADS=1 ./examples/mir_concentric_circles --gridsize 1000 --numcircles 5 --policy omp | grep runMIR
runMIR                                  3.880441      3.880441      3.880441 64.487702 
rzansel42{whitlocb}150: ./examples/mir_concentric_circles --gridsize 1000 --numcircles 5 --policy omp | grep runMIR
runMIR                                 30.048744     30.048744     30.048744 90.253846 

void conduit_debug_err_handler(const std::string &s1, const std::string &s2, int i1)
{
std::cout << "s1=" << s1 << ", s2=" << s2 << ", i1=" << i1 << std::endl;
// This is on purpose.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you: this is a necessary comment.

inline int getAllocatorIDForAddress(void* ptr)
{
umpire::ResourceManager& rm = umpire::ResourceManager::getInstance();
return rm.getAllocator(ptr).getId();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if ptr was not Umpire-allocated? A check that the allocator could be found would be helpful to avoid crashing. Would it make sense to return INVALID_ALLOCATOR_ID in that case?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good suggestion. I added a test that calls getAllocatorIDForAddress() with a pointer not allocated by Umpire, which made Umpire throw an exception. I added exception handling and return INVALID_ALLOCATOR_ID in that case.

*
* \note The coordset view must agree with the coordset in n_input. We pass both
* a view and the coordset node since the view may not be able to contain
* come coordset metadata and remain trivially copyable.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"come coordset metadata" Should this be "some coordset metadata"?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants