2016 07 11 webex

Attendees
- Jeff Squyres
- Dan Holmes
- Keita Teranishi
- Wesley Bland
- Kathryn Mohror
- Howard Pritchard
- Aurelien Bouteiller
- Martin Schulz

Discussion points for today:
- Continue to discuss feedback from the June 2016 MPI Forum meeting (Bellevue, WA, USA)

Hubert: What is the error case in doing things in the left column? Does it have to abort (there's no assigned error handler or scope)?
- 2016-06-27: This covers: info, op, errhandler, datatype, ...and possibly group :-)
- 2016-06-27: Added slide 56 with some possibilities and discussion results
- 2016-06-27: Don't have a good answer for this yet. Need to think more.

Dan: Can we have a function to translate a group from one session to another? (these notes could be wrong here, Wesley didn't follow all of this discussion)
- This would allow them to be used between multiple sessions (e.g., give a group to a library to say "use these processes", and then the library can use that group inside its own session).
- 2016-07-11: This would seem useful.
- 2016-07-11: Jeff proposal: `MPI_Group_translate_session(MPI_Group src_group, MPI_Session dest_session, MPI_Group dest_group);` (see the sketch below)
- 2016-07-11: Aurelien would rather have the session handle explicitly passed as an argument on specific functions (e.g., `MPI_Comm_create_from_group`)
- 2016-07-11: Straw vote between these two options:
  - 5 people in favor of `MPI_Group_translate_session`
  - 1 in favor of a session argument in specific calls
  - 2 abstentions
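
A minimal sketch of how Jeff's proposed call might be used for the library use case above. Both `MPI_Group_translate_session` and `MPI_Comm_create_from_group` are proposals under discussion here, not standardized MPI; the output group is assumed to be returned by pointer, and the function and tag names in the example are made up.

```c
#include <mpi.h>

/* Assumed prototype, following the proposal recorded above (output group
 * returned by pointer): */
int MPI_Group_translate_session(MPI_Group src_group,
                                MPI_Session dest_session,
                                MPI_Group *dest_group);

/* A library is handed a group created in the application's session and
 * re-expresses it in terms of its own private session. */
int lib_init(MPI_Group app_group, MPI_Session lib_session, MPI_Comm *lib_comm)
{
    MPI_Group lib_group;

    /* Translate the application's group into the library's session. */
    MPI_Group_translate_session(app_group, lib_session, &lib_group);

    /* Build the library's communicator from that group using the proposed
     * sessions constructor (Aurelien's alternative would instead pass
     * lib_session directly to a call like this one). */
    MPI_Comm_create_from_group(lib_group, "example.org/lib/v1", MPI_INFO_NULL,
                               MPI_ERRORS_ARE_FATAL, lib_comm);

    MPI_Group_free(&lib_group);
    return MPI_SUCCESS;
}
```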

Pavan: If only part of the application is MPI (uses MPI for 10 hours, does something else for 10 hours), how do we force MPI to clean up its resources so we can have them back later?
- Jeff/Room: Could require the user to keep some object around to keep MPI from cleaning up. This would mean adding the requirement that MPI will clean itself up when everything is freed.
- 2016-07-11: Do we want to force this requirement?
- 2016-07-11: After discussion: yes, we want to go back to forcing this requirement (MPI is finalized when the last user-created handle is destroyed / MPI_FINALIZE -- need to get the precise language about that). The use case of having MPI release all resources so that they can be used by some other part of the app is useful (and we didn't have this use case before).
THIS IS AS FAR AS WE GOT THIS MEETING
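
A sketch of the use case just discussed, assuming the proposed `MPI_Session_init` / `MPI_Session_finalize` calls (argument lists as in the sessions proposal, not MPI-3.1):

```c
#include <mpi.h>

void mpi_phase(void)
{
    MPI_Session session;

    MPI_Session_init(MPI_INFO_NULL, MPI_ERRORS_ARE_FATAL, &session);

    /* ... derive groups from the session, create communicators, do the
     * MPI part of the run, then free every handle that was created ... */

    /* Per the requirement discussed above, once the last user-created
     * handle is destroyed and the session is finalized, the library can
     * release its resources (memory, NIC contexts, progress threads, ...). */
    MPI_Session_finalize(&session);
}

int main(void)
{
    mpi_phase();   /* ~10 hours of MPI work                              */
    /* ... ~10 hours of non-MPI work, with MPI's resources given back ... */
    mpi_phase();   /* MPI can be brought back up later if needed          */
    return 0;
}
```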

Martin: Why can't we extract a session from `MPI_Init`?
- Jeff: It's gross and could require a bunch of existing semantics to change. For example, what if you finalize the implicit session? Can you still call `MPI_Finalize`?
- 2016-07-11: Still good with this answer

Pavan: Calling finalize hooks in threads other than their own could cause problems.
- 2016-07-11: Cite a specific problem...?

Pavan: `MPI_Session_finalize` as presented is kind of collective. We don't want that. Who would it be collective with?
- Tony: If we say that send cancel with sessions is illegal, does that make `MPI_Session_finalize` non-collective?
- Hubert: `MPI_Request_free` (you can free an active send request without ever completing it -- see the sketch below)
- Everyone: Crap...
- Martin: What if we say that all communication taking place in the session must be done?
- Aurelien: What about sends where the data is buffered but not transferred?
- 2016-07-11: What's the problem with saying that all local communication must be completed, and that an application that sends after a peer disappears is erroneous? Compare to current behavior -- MPI implementations usually hang.
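
A sketch of the complication Hubert points out, using only MPI-3 calls (the commented-out `MPI_Session_finalize` is the proposed call under discussion):

```c
#include <mpi.h>

void freed_send(MPI_Comm comm, const double *buf, int count, int peer)
{
    MPI_Request req;

    MPI_Isend(buf, count, MPI_DOUBLE, peer, /*tag=*/0, comm, &req);

    /* Legal in MPI-3: free the request without waiting on it.  The send is
     * still in flight, but the application now has no way to learn when it
     * has locally completed... */
    MPI_Request_free(&req);

    /* ...so if the application tears the session down here, the
     * implementation either has to wait for the send (quietly collective-ish)
     * or cut it off. */
    /* MPI_Session_finalize(&session); */
}
```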

Jeff: How do you abort "all connected processes" when you may not have connected to all processes in `mpi://WORLD`?
- Wesley: This would make the new error handler definitions very gross (leverages "all connected processes" to mean everyone in `MPI_COMM_WORLD` + connected dynamics when defining `MPI_ERRORS_ARE_FATAL`).

Martin/Pavan: If you can't create the global address table at init time, that could make the common case of address tracking expensive because you may have to have per-communicator arrays to track all addressing info.
- Pavan: You may be able to recreate this by allocating the big array to potentially hold all procs at `MPI_Session_create` time.
- Jeff: This already isn't a problem for OMPI because it uses a dynamically growing array of pointers to proc structs.
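
A toy illustration of the growable proc-pointer-array idea Jeff describes; this is sketched from the description above, not Open MPI's actual data structures, and every name in it is made up:

```c
#include <stdlib.h>

typedef struct proc {
    int   global_rank;     /* identity of the peer               */
    void *endpoint_addr;   /* transport-specific addressing info */
} proc_t;

typedef struct proc_table {
    proc_t **procs;        /* grows as new peers are discovered  */
    size_t   used, capacity;
} proc_table_t;

/* Add a peer, growing the pointer array geometrically as needed.  Each
 * communicator can then refer to peers by index instead of carrying its
 * own copy of the full address table. */
proc_t *proc_table_add(proc_table_t *t, int global_rank)
{
    if (t->used == t->capacity) {
        size_t   new_cap = t->capacity ? 2 * t->capacity : 64;
        proc_t **tmp     = realloc(t->procs, new_cap * sizeof(*tmp));
        if (tmp == NULL) return NULL;
        t->procs    = tmp;
        t->capacity = new_cap;
    }
    proc_t *p = calloc(1, sizeof(*p));
    if (p == NULL) return NULL;
    p->global_rank = global_rank;
    t->procs[t->used++] = p;
    return p;
}
```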

Martin: In MPI 3.1, does `MPI_Init` still need to be collective?

Pavan: `MPI_IO` can't be the same on all communicators. In fact, many of the built-in attribute keys may not want to be the same on all communicators.
- All: Should we make the special attributes be allowed to be different per communicator? Probably, especially for `MPI_TAG_UB` and `MPI_IO`.
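
For reference, a sketch of how these values are queried in MPI-3.1, where `MPI_TAG_UB` and `MPI_IO` are predefined attributes attached to `MPI_COMM_WORLD`; the question above is whether the same query should be allowed to return different values on different communicators:

```c
#include <mpi.h>
#include <stdio.h>

void print_limits(MPI_Comm comm, const char *name)
{
    void *val;
    int   flag;

    /* For predefined attributes, the returned value is a pointer to int. */
    MPI_Comm_get_attr(comm, MPI_TAG_UB, &val, &flag);
    if (flag)
        printf("%s: MPI_TAG_UB = %d\n", name, *(int *)val);

    MPI_Comm_get_attr(comm, MPI_IO, &val, &flag);
    if (flag)
        printf("%s: MPI_IO rank = %d\n", name, *(int *)val);
}

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);
    print_limits(MPI_COMM_WORLD, "MPI_COMM_WORLD");
    MPI_Finalize();
    return 0;
}
```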

Aurelien: Instead of using a `parent_comm` for `MPI_Exec`, why not use a group and tag like other communicator creation functions?

Wesley: The new runtime sets from `MPI_Exec` will not be visible everywhere (can only see the sets you're in). Any one involved process will see at most two out of three.
- You can construct the other with group subtraction.
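
A sketch of the group-subtraction idea using standard MPI group arithmetic; the group names stand in for the proposed `MPI_Exec` sets and are illustrative only:

```c
#include <mpi.h>

/* Derive the set we are NOT in: everyone in the union set minus the set
 * containing this process. */
MPI_Group other_side(MPI_Group all_group, MPI_Group my_side_group)
{
    MPI_Group other;
    MPI_Group_difference(all_group, my_side_group, &other);
    return other;
}
```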

All: Is there a good use case for needing all three exec sets anyway? We can derive the set we are in (parent vs. children). We can't get the other one (because we're not in it).
- The only one we need is the new big set that includes all processes in parent and children.

Pavan: How do we know when processes are done so it's safe to spawn again?

Pavan: MPI doesn't need "replace" because it can `MPI_Session_finalize` and `execvp`.
- Anh: That doesn't exist on Windows.
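
A sketch of the finalize-then-exec pattern Pavan describes; `MPI_Session_finalize` is the proposed call under discussion, and `execvp` is POSIX-only, which is Anh's objection:

```c
#include <mpi.h>
#include <unistd.h>

void replace_self(MPI_Session *session, char *const new_argv[])
{
    /* Release all MPI state first so the runtime is not left holding
     * network resources across the exec. */
    MPI_Session_finalize(session);

    /* Replace this process image; only returns on failure. */
    execvp(new_argv[0], new_argv);
}
```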

Jeff: Add thread safety to `MPI_Session_init_comm`.
- Wesley: What about error handler and info?

Pavan: Multithreading may be a problem where the tag isn't enough because the threads can be executed in any order.
- Pavan: However, one MPI call can't block the entire stack so maybe it's ok.

Martin: The wording around `set_name` on `MPI_Session_init_comm` needs to get cleaned up.