2017 10 11
Wesley Bland edited this page Oct 11, 2017 · 2 revisions
- Intel - Wesley
- ORNL - Geoffroy
- Argonne - Yanfei, Ken
- Auburn - Nawrin
- Sandia - Keita
- UTC - Tony
- Jeff - RMA is different from communicator-based FT: it is more data focused, and detecting process failure is both more expensive and less likely. We should add more text that focuses on conveying that the data is unavailable.
- Others - This is a bit outside the scope of the initial ULFM proposal but still important. Maybe this should be an accompanying proposal.
- Wesley - After reading through the proposal again, I think it makes sense to bring this into ULFM proper. It completes the picture for RMA because if we can detect process failure, we do, but we can also express failure in other, cheaper ways.
- Keita - The expected recovery model is unclear here.
- Good point: Need to add some advice to say that we expect the user to free the window, fix the data and recreate the window. They may or may not discover a process failure during this procedure.
- Wesley - The advice about `MPI_WIN_FREE` needs to be expanded to cover `MPI_DATA_UNAVAILABLE` around lines 418-419.
Aurelien merged the proposal to detect when a communicator is revoked. This is now part of ULFM proper.
- All - Go over the `MPI_ERR_DATA_UNAVAILABLE` proposal text and leave comments. Specifically look at new text proposals in the comments.