ASC Q1 2023 Meeting
Thomas Naughton edited this page Jan 25, 2023
·
11 revisions
- Date: Jan. 24 & 26, 2023
- Time: 10 am - 1 pm US Central Time each day
- Location: Virtual Meeting. WebEx information (bottom of the page): https://recaptcha.open-mpi.org/pmix-asc-recaptcha/
- Active Notes Link: Google Doc - Please add your name and affiliation.
- Governance Document [latest]
This meeting has a floating agenda with specific synchronization points to keep us on track. Rough time estimates are provided per agenda item, and the co-chairs plan to cover the topics in the order seen below. However, since some agenda items will take longer/shorter than anticipated, an exact start/end timing is not guaranteed, and some items may float to the second day. If you cannot attend the full meeting and are presenting, please let the co-chairs know, and we can plan accordingly.
Start | End | Topic |
---|---|---|
10:00 am | 10:05 am | Gathering (--) |
10:05 am | 10:10 am | Roll Call (We will start roll call promptly at this time) |
10:10 am | 11:30 am | Discussion of agenda items |
11:30 am | 11:45 am | Break |
11:45 am | 1:00 pm | Discussion of agenda items |
Start | End | Topic |
---|---|---|
10:00 am | 10:05 am | Gathering (--) |
10:05 am | 11:30 am | Discussion of agenda items |
11:30 am | 11:50 am | Voting and Break (Voting Link) |
11:50 am | 12:30 pm | Administrative and Working Group agenda items |
12:30 pm | 12:45 pm | Technical and Use Case Presentation(s) |
12:45 pm | 1:00 pm | Closing discussion and wrap up |
- Governance PRs up for a Second Vote:
- None
- Governance PRs up for a Reading and First Vote:
- None
- PMIx Standard PRs up for a Reading (Provisional):
- None
- PMIx Standard PRs up for a Reading (Errata):
- Add const to string parameters (Ken ~10 min)
- PMIx Standard PRs up for a Second Vote:
- None
- PMIx Standard PRs up for a Reading and First Vote:
- None
- Voting Link
- Plenary discussion items
- Publish/Lookup Chapter (Dave ~30 min)
- Revision Exception Votes
- None
- Presentation of the v5.0 Standard Release Candidate for discussion (Ken/Dave)
- Review quarterly meetings dates and plans
- 1Q 2023 - Virtual
- Jan. 24 & 26 (10 am - 1 pm US Central)
- 2Q 2023 - Virtual
- May 9 & 11 (10 am - 1 pm US Central)
- 3Q 2023 - Virtual
- July 18 & 20 (10 am - 1 pm US Central)
- 4Q 2023 - Virtual
- Oct. 17 & 19 (10 am - 1 pm US Central)
- ASC Membership
- Vote on new ASC Members
- Call for new ASC Members
- Release Planning
- PMIx 4.2 Release (Josh/Ralph ~ 5 min)
- v4.2rc1 is available for review. Expected release 1Q 2023
- PMIx 5.0 Release (Ken/Dave ~ 10 min)
- v5 Draft for Approval
- Working Group Updates (~ 10-15 minutes each)
- Client Separation / Implementation Agnostic Document
- Tools & Dynamic Workflows
- Open Call for New Working Groups
- Technical and Use Case presentations
- Josh Hursey (IBM) "A separated model for running rootless, unprivileged PMIx-enabled HPC applications in Kubernetes" (Presented at CANOPIE-HPC)
- Additional discussion items
Person | Institution | Day 1 | Day 2 |
---|---|---|---|
Josh Hursey | IBM | X | |
Michael Karo | Altair | X | |
Ken Raffenetti | ANL | X | |
Isaias Compres | TUM | X | |
Ralph Castain | Nanook | X | |
Norbert Eicker | JSC | X | |
Brice Goglin | INRIA | X | |
Dave Solt | IBM | X | |
Kathryn Mohror | LLNL | X | |
Dominik Huber | | X | |
Thomas Naughton | ORNL | X | |
- Introductions
- Reading: Add const to string parameters (Ken ~10 min)
- https://github.com/pmix/pmix-standard/pull/430
- No comments/concerns mentioned
- Note: this has already been added to OpenPMIx, so it is just a matter of adding it to the Standard text
- Plenary discussion
- Publish/Lookup Chapter (Dave ~30 min)
- https://github.com/pmix/pmix-standard/pull/398
- TODO: add Dave's slides
- Brief summary: The motivation was to separate the things being published from the attributes that influence the publishing, and to resolve some non-deterministic behavior when looking up on ranges. This led to introducing new APIs for Publish/Lookup (PMIx_Publish_datastore/PMIx_Lookup_datastore)
- The new publish datastore returns a unique publish_id, used to unpublish the specific item.
- Q: Who can unpublish?
- Generally, any process with the same userID can unpublish
- So, in theory, the publish_id could be transferred to another process under the same userID, which could then unpublish the item. Attributes could be added in the future to further refine these semantics/limits, but they are not being introduced now.
- Note: Maybe this is getting overly complicated, possibly starting to look more like a database?
- Intent was to keep as simple as possible
- The reason for publish_id was to ensure proper specificity on what should be unpublished.
- Q: How did we get to multiple publishes for the same key?
- Once publishing is on ranges+key, it gets more complicated, so publishing multiple times is simply allowed
- Maybe just remove the “realm” stuff and require every key to be unique, making the key the unique part.
- Trying to get at exact need for this functionality and what’s minimum need to accomplish
- Example: Currently used mainly in Open MPI as the rendezvous for connect/accept. The goal is to remove that method in the future.
- From past notes: Multiple processes need to be able to publish the same key.
- Having ability to publish multiple keys also removes the requirement that the publisher check on “uniqueness” of the key.
- Discussion continued to review other points raised during design/requirements review…
- If adding complexity, do so intentionally to address clear use/need
- Review current APIs and datastructures
- Note: pmix_pdsdata_t now contains the key, so there is symmetry between the blocking and non-blocking paths. The publish_id also contains the epoch.
- In progress: The epoch (pmix_epoch_t) is an increasing number used for comparison and possibly ordering. Ordering constraints are weaker across different processes (i.e., events may carry the same time, so it is only known that they happened at the same time) and stronger within the same process.
- The question of implementation was discussed; generally it seems that an implementation would be needed before voting. It is unclear who has the time/resources for the implementation, which is a point for discussion within the working group. Note: Ralph will not have time to do this implementation.
- See also notes on PR https://github.com/pmix/pmix-standard/pull/398
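The publish/unpublish semantics discussed above can be illustrated with a small toy model. This is only a sketch of the ideas in these notes (a unique publish_id per publish call, multiple publishes allowed under the same key, unpublish restricted to the same user ID). It is not the proposed PMIx API: `ToyDatastore` and its methods are hypothetical names, the publish_id is modeled as a plain integer, and ranges and the epoch are left out.

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass
class ToyDatastore:
    """Hypothetical in-memory stand-in for a publish/lookup service."""
    # publish_id -> (uid, key, value); dicts keep insertion order
    _entries: dict = field(default_factory=dict)
    # source of unique publish ids (assumption: a simple counter)
    _ids: count = field(default_factory=count)

    def publish(self, uid, key, value):
        # The same key may be published many times; each call gets its own id,
        # so the publisher never has to check the key for uniqueness.
        publish_id = next(self._ids)
        self._entries[publish_id] = (uid, key, value)
        return publish_id

    def lookup(self, key):
        # Return every value currently published under the key, in publish order.
        return [v for (_, k, v) in self._entries.values() if k == key]

    def unpublish(self, uid, publish_id):
        # Only a caller with the publisher's user ID may remove the entry,
        # and the id pins down exactly which item is removed.
        entry = self._entries.get(publish_id)
        if entry is None or entry[0] != uid:
            return False
        del self._entries[publish_id]
        return True
```

In this model, two publishes of the same key by user 1000 yield two distinct ids. A lookup returns both values, an unpublish attempt by a different user ID is refused, and unpublishing with one id leaves the other entry in place.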
- PMIx v5.0 presentation
- Note: for voting, will have two votes: v5 release and errata
- TODO: add Ken's slides
- First major release under new governance procedures
- Using a time based release target (in contrast to feature based)
- V5 Additions
- Use-case WG additions (business card, debugging, hybrid prog models, cross-version)
- Implementation Agnostic WG (return codes, rework ch1-2, ch5-8, ABI (ABI partially in pmix-4.2))
- Storage WG (added in pmix-4.2 also)
- See Ken’s slide for detailed changelog
- Note: missed the macros converted to functions in changelog
- Need to double-check for items in the Exception file that may be missing from the standard, namely the macros-to-functions items.
- Procedurally: need to check before Thursday. A revision exception applies if only a changelog bullet is missing. If items (PRs) are missing from the standard, ratification may need to be delayed.
- TODO: check the exception files and checker script before Thursday (Day 2) to see whether items were missed
- Next meetings
- Longer gap until Q2 meeting in May
- Q3 meeting in …
- PMIx v4.2 release
- Plan is to have a Q2 release
- Quite a number of PRs have piled up since the last release
- PMIx v5.0 release
- Identified some missing items (e.g., macros-to-functions) during this meeting and would like to discuss next steps until that list of items is complete.
- Ignore the current v5 voting item (links already posted); the vote will be held on Thursday (Day 2).
- Working Group updates
- Client Separation / Implementation Agnostic Doc WG (Dave)
- See items discussed earlier this meeting
- Other work has been on ABI
- Tools & Dynamic Workflow WG (Isaias)
- Resumed meetings last week
- Will describe a use-case to discuss potential race conditions
- Brainstorming on more but not yet ready for presentation
- Call for New Working Groups
- Drift of the library away from the standard (Ralph)
- more than 200 functions right now
- Should there be a WG keeping an eye on this?
- Might be part of the Implementation Agnostic WG? (Ken)
- Voting link
- https://www.surveymonkey.com/r/6HP2RDC
- Note: Ignoring results from “v5.0” item on today’s survey (defer to Thu/Day#2)
- Technical Presentation
- Josh Hursey (IBM) "A separated model for running rootless, unprivileged PMIx-enabled HPC applications in Kubernetes" (Presented at CANOPIE-HPC)
- Slides
- Video Link
- Demo environment: https://github.com/jjhursey/pub-2022-CANOPIE-HPC
- ...