2022 04 25 webex joint ftwg
# 04/25/22 meeting notes for joint FT/Sessions WGs meeting
Attending: Howard Pritchard, Trupeshkumar Patel, Aurelien Bouteiller, Dan Holmes, Thomas Hines, Grace Nansamba
- Continue discussion of Aurelien's "revoke if process set versions don't match" concept; see notes from last meeting: https://github.com/mpiwg-sessions/sessions-issues/wiki/2022-04-11-webex-joint-ftwg
- If there's time, discuss items to present at the 5/11/22 virtual meeting
Discussing Aurelien's slide: Sessions: How to work with versioned sets
Get a notification, either by polling or via a callback, then do something about it. MPI_Group_from_session_pset: are there new names? No pushback on the idea of a new way for MPI_Comm_create_from_group to fail, namely when there is a version mismatch for the related pset. Dan suggests modifying the stringtag argument to indicate which version of the pset name was used to create the input group.

Should error codes be uniform across participating callers? Dan thinks this is necessary. Discussed the situation of a fast-changing system. In Aurelien's model, though, return codes don't have to be uniform, and the number of calls to MPI_Comm_create_from_group could differ among participating processes.

Discussed complications of groups derived from a process set. This leads back to the idea of incorporating the version into the tag. Uniform error return for group mismatch. Inheritance complexity. Can we create groups from groups that originated from different versions of a process set? Not within a single process. Discussed local failures if group operations are used with input groups derived from different process set versions. Discussed the possibility of MPI_Comm_create_from_group returning a communicator that has problems: down processes, system changes/race conditions, etc.
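One way to picture Dan's suggestion of carrying the pset version in the stringtag is sketched below. The `base@pset#vN` layout, the helper name `make_versioned_tag`, and the use of an unsigned integer as the version are all assumptions made for illustration; the discussion did not settle on an encoding, and MPI defines no such helper.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not part of MPI): build a stringtag that carries the
 * pset name and the pset version the caller used to create its group.
 * Assumed layout: "<base_tag>@<pset_name>#v<version>".
 * Returns 0 on success, -1 if base_tag already contains the '@' separator
 * or if the result does not fit in buf. */
int make_versioned_tag(char *buf, size_t buflen, const char *base_tag,
                       const char *pset_name, unsigned version)
{
    assert(buf != NULL && base_tag != NULL && pset_name != NULL);

    /* Keep the encoding unambiguous: the separator may not appear in the
     * application's own part of the tag. */
    if (strchr(base_tag, '@') != NULL)
        return -1;

    int n = snprintf(buf, buflen, "%s@%s#v%u", base_tag, pset_name, version);
    return (n < 0 || (size_t)n >= buflen) ? -1 : 0;
}
```

A caller would pass the resulting string as the stringtag argument of MPI_Comm_create_from_group. Processes that built their groups from different pset versions would then present different tags, so their calls would not match each other; how that non-match is reported, and whether uniformly, is the open question discussed above.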
Determining whether groups match (by pset version) is likely to be difficult for an implementation. One example: hashing the members of the local group and exchanging the hashes with other processes. This again runs into the difficulty of not knowing the number of participants. We need the process set and its version as part of communicator construction. Discussed validity of old process set names; we hadn't decided whether these are valid or not. PSCW as an implicit part of MPI_Comm_create_from_group. Discussed more about the PSET sync operation brought up two weeks ago. This functionality is too limited: it doesn't work for the more general group handled case.
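The hashing example can be sketched as follows. This is only an illustration of the local half of the idea: the `uint64_t` member identifiers and the function name `group_fingerprint` are assumptions (MPI exposes no such process identifier), and FNV-1a is just one convenient hash, not something the group chose.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical fingerprint of a group's membership: 64-bit FNV-1a over the
 * member identifiers, taken in group rank order, least significant byte
 * first. The identifiers stand in for whatever an implementation uses
 * internally to name processes. */
uint64_t group_fingerprint(const uint64_t *members, size_t count)
{
    assert(members != NULL || count == 0);

    uint64_t h = 0xcbf29ce484222325ULL;          /* FNV-1a offset basis */
    for (size_t i = 0; i < count; i++) {
        for (int b = 0; b < 8; b++) {
            h ^= (members[i] >> (8 * b)) & 0xffu; /* mix in one byte */
            h *= 0x100000001b3ULL;                /* FNV-1a prime */
        }
    }
    return h;
}
```

Equal fingerprints only suggest, and do not prove, that two processes hold the same group. Comparing them still requires an exchange among the participants, which is where the problem of not knowing the number of participants comes back in.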
Comm create from pset: would use the most recent pset version implicitly. Alternatively, some new variant of MPI_Comm_create_from_group with a version argument, a process set and its version, or a list of version numbers. Or could these be inferred from the group and its provenance?
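If version information were inferred from a group's provenance, an implementation would need some record of where each group came from. A minimal sketch of such a record and of the local check discussed earlier (no mixing of versions of the same pset within a single process) might look like this. The struct, its fields, and the function name are hypothetical, and treating groups from different psets as always compatible is an assumption the notes do not address.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical provenance record an implementation might attach to a group. */
struct group_provenance {
    const char *pset_name;  /* pset the group was derived from */
    unsigned    version;    /* pset version at the time of derivation */
};

/* Returns 1 if two groups may be combined in a local group operation,
 * 0 otherwise. Assumed policy: groups from the same pset must share a
 * version; groups from different psets are not restricted. */
int provenance_compatible(const struct group_provenance *a,
                          const struct group_provenance *b)
{
    assert(a != NULL && b != NULL);

    if (strcmp(a->pset_name, b->pset_name) != 0)
        return 1;                     /* different psets: no version rule */
    return a->version == b->version;  /* same pset: versions must agree */
}
```

A group operation on two inputs that fail this check would be a candidate for the local failure mentioned above. Groups derived from several psets would need a list of such records rather than a single one, which is part of the inheritance complexity.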