2021 06 14
- Discuss slide deck for Sept. 2021 plenary talk
- Discussion/presentation from Tony/Derek on MPI object serialization (aka, stages)
We are working towards a collective slide-deck that expresses the group position on interoperability of FT methods, and ongoing efforts and directions.
https://drive.google.com/file/d/184A-qb9MBt2g8iim49iBgDsaPMMiiLXQ/view
- Aurelien to iterate over the ULFM/fine grain depiction
- Ignacio to iterate over Reinit/coarse grain depiction
- Tony to provide some material on early ideas for object serialization
- All to iterate over slides that describe composition/coexistence of models
- Have some timetable/schedule
Goal is to have some form of advanced draft for July 12 conf-call
The idea is to enable serializing some (most?) MPI objects so that they can be rejuvenated/migrated at another location.
Papers on MPI Stages: (Tony/Derek, please add links)
Derek has been looking at how to serialize Datatypes, Groups, and some properties of Comms.
The communicator serialization model assumes that the Comm has no in-transit messages, so serialization is limited to Group+CID+Info and excludes the matching queues.
Some properties of Comms can be hard to serialize because they contain user-provided state. We believe that INFO objects are currently serializable, but ATTRIBUTES probably are not: they can contain pointers to user memory, and even if the pointer value itself can be serialized, what it points to cannot.
Serializing C pointers is problematic, but with language support (e.g., C++/Kokkos), enough semantic information may be available to provide some service.
This requires more investigation.
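To make the Group+CID+Info model above concrete, here is a minimal sketch in plain C of what packing and unpacking a quiescent communicator description could look like. Everything in it is an assumption made for illustration: `comm_desc`, `comm_desc_pack`, `comm_desc_unpack`, the fixed size limits, and the little-endian wire layout are hypothetical and are not part of MPI or of any proposal discussed here. Attributes and matching queues are deliberately left out, in line with the notes above.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical limits and layout, chosen only to keep the sketch small. */
#define MAX_RANKS 8
#define MAX_INFO  4
#define KV_LEN    32

typedef struct {
    char key[KV_LEN];
    char value[KV_LEN];
} info_entry;

/* Hypothetical description of a quiescent communicator: Group + CID + Info.
 * No attributes (they may hold user pointers) and no matching queues. */
typedef struct {
    uint32_t   cid;               /* context id */
    uint32_t   nranks;            /* group size */
    uint32_t   ranks[MAX_RANKS];  /* group members as global rank ids */
    uint32_t   ninfo;             /* number of info key/value pairs */
    info_entry info[MAX_INFO];
} comm_desc;

/* Fixed little-endian encoding so the buffer can move between hosts. */
static size_t put_u32(unsigned char *buf, size_t off, uint32_t v) {
    for (int i = 0; i < 4; i++) buf[off + i] = (unsigned char)(v >> (8 * i));
    return off + 4;
}

static size_t get_u32(const unsigned char *buf, size_t off, uint32_t *v) {
    *v = 0;
    for (int i = 0; i < 4; i++) *v |= (uint32_t)buf[off + i] << (8 * i);
    return off + 4;
}

size_t comm_desc_packed_size(const comm_desc *d) {
    return 4 * (3 + (size_t)d->nranks) + (size_t)d->ninfo * 2 * KV_LEN;
}

/* Returns the number of bytes written, or 0 on failure. */
size_t comm_desc_pack(const comm_desc *d, unsigned char *buf, size_t cap) {
    if (d->nranks > MAX_RANKS || d->ninfo > MAX_INFO) return 0;
    if (comm_desc_packed_size(d) > cap) return 0;
    size_t off = 0;
    off = put_u32(buf, off, d->cid);
    off = put_u32(buf, off, d->nranks);
    for (uint32_t i = 0; i < d->nranks; i++) off = put_u32(buf, off, d->ranks[i]);
    off = put_u32(buf, off, d->ninfo);
    for (uint32_t i = 0; i < d->ninfo; i++) {
        memcpy(buf + off, d->info[i].key, KV_LEN);   off += KV_LEN;
        memcpy(buf + off, d->info[i].value, KV_LEN); off += KV_LEN;
    }
    return off;
}

/* Returns the number of bytes consumed, or 0 if the buffer is malformed. */
size_t comm_desc_unpack(comm_desc *d, const unsigned char *buf, size_t len) {
    size_t off = 0;
    memset(d, 0, sizeof *d);
    if (len < 8) return 0;
    off = get_u32(buf, off, &d->cid);
    off = get_u32(buf, off, &d->nranks);
    if (d->nranks > MAX_RANKS) return 0;
    if (len < off + 4 * ((size_t)d->nranks + 1)) return 0;
    for (uint32_t i = 0; i < d->nranks; i++) off = get_u32(buf, off, &d->ranks[i]);
    off = get_u32(buf, off, &d->ninfo);
    if (d->ninfo > MAX_INFO) return 0;
    if (len < off + (size_t)d->ninfo * 2 * KV_LEN) return 0;
    for (uint32_t i = 0; i < d->ninfo; i++) {
        memcpy(d->info[i].key, buf + off, KV_LEN);   off += KV_LEN;
        memcpy(d->info[i].value, buf + off, KV_LEN); off += KV_LEN;
        d->info[i].key[KV_LEN - 1] = '\0';    /* never trust the wire */
        d->info[i].value[KV_LEN - 1] = '\0';
    }
    return off;
}
```

A caller would fill a `comm_desc` from a communicator known to be quiescent, pack it, ship the bytes elsewhere, and unpack them before rebuilding the communicator. The rebuild step itself is the open design question and is not sketched here.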
It may be useful in some use-cases to serialize/restore a communicator in which communication is ongoing (i.e., without first quiescing the network with an agreement).
Aurelien had some related use-cases when designing message logging over ULFM http://dx.doi.org/10.1016/j.future.2018.09.041 ; the state of communicators and potential send-recv completion drift/mismatch is re-stabilized during the recovery protocol
Keitah also experienced a similar need when working with task FT: https://ieeexplore.ieee.org/document/9308655/
Reinit case is fully quiescent
We discussed whether the interface to the feature should be core MPI or a tool interface (i.e., MPI_T), on the premise that we are doing some sort of introspection of the MPI state.
After discussion, the group reached consensus that the MPI_T option has issues:
- it would make it incompatible with Fortran
- Unlike other MPI_T interfaces, it is intended to modify the state and behavior of MPI operations (e.g., deserialization will create communicators)
For these reasons, we would prefer the interfaces to be core MPI (MPI_ prefix).
Next we turned to how to make progress. Tony proposes to start a crude draft proposal (a small additional chapter) on top of mpi/5 as a branch in the mpiwg-ft repo.
Tony/Wesley: Add link to the repo when created
The idea is to have some interfaces to look at with definitions so that we can iterate; we prefer this way of working to creating an ever-growing slide-deck.
The feature is envisioned to have uses well beyond FT; we should collect potential and confirmed use-cases for it.
Next meeting is July 12