-
Notifications
You must be signed in to change notification settings - Fork 22
/
Chap_Introduction.tex
319 lines (258 loc) · 23.4 KB
/
Chap_Introduction.tex
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Chapter: Introduction
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\chapter{Introduction}
\label{chap:intro}
\ac{PMIx} is an application programming interface standard that provides
libraries and programming models with portable and well-defined access to commonly
needed services in distributed and parallel computing systems.
A typical example of such a service is the portable and scalable exchange of network
addresses to establish communication channels between the processes of a parallel
application or service.
As such, \ac{PMIx} gives distributed system software providers a better understanding of how
programming models and libraries can interface with and use system-level services.
As a standard, \ac{PMIx} provides \acp{API} that allow for
portable access to these varied system software services and the
functionalities they offer. Although these services can be defined and implemented directly by the
system software components providing them, the
community represented by the \ac{ASC}
feels that the development of a shared standard better serves the
community.
As a result, \ac{PMIx} enables programming languages and libraries to focus on their core
competencies without having to provide their own system-level services.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{Background}
\label{chap:introduction:background}
The \ac{PMI} has been used for quite some time as a means of exchanging wireup information needed for inter-process communication.
Two versions (PMI-1 and PMI-2 \cite{2010-Balaji-EuroMPI}) have been released as part of the MPICH effort, with PMI-2 demonstrating better scaling properties than its PMI-1 predecessor.
PMI-1 and PMI-2 can be implemented using \ac{PMIx} though \ac{PMIx} is not a strict superset of either.
Since its introduction, \ac{PMIx} has expanded
on earlier \ac{PMI} efforts by
providing an extended version of the \ac{PMI} \acp{API} which provide necessary functionality for launching and managing parallel applications and tools at scale.
The increase in adoption has motivated the creation of this document to formally specify the intended behavior of the \ac{PMIx} \acp{API}.
More information about the \ac{PMIx} standard and affiliated projects can be found at the \ac{PMIx} web site: \url{https://pmix.org}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{PMIx Architecture Overview}
\label{chap:intro:arch_overview}
The presentation of the \ac{PMIx} \acp{API} within this document makes some
basic assumptions about how these \acp{API}
are used and implemented. These assumptions are generally made only to simplify
the presentation and explain \ac{PMIx} with the expectation that most readers
have similar concepts on how computing systems are organized today. However, ultimately
this document should only be assumed to define a set of \acp{API}.
A concept that is fundamental to \ac{PMIx} is that a \ac{PMIx} implementation might
operate primarily as a \textit{messenger}, and not a \textit{doer} --- i.e., a \ac{PMIx}
implementation might rely heavily or fully on other software components to provide
functionality~\cite{2017-Castain-EuroMPI}.
Since a \ac{PMIx} implementation might only deliver requests and responses to other
software components, the \ac{API} calls include ways to provide arbitrary information to the
backend components that actually
implement the functionality. Also, because \ac{PMIx} implementations generally rely heavily
on other system software, a PMIx implementation might not be able to guarantee that a feature
is available on all platforms the implementation supports. These aspects are discussed in
detail in the remainder of this chapter.
\begingroup
\begin{figure*}[ht!]
\begin{center}
\includegraphics[clip,width=0.8\textwidth]{figs/PMIxRoles.pdf}
\end{center}
\caption{PMIx-SMS Interactions}
\label{fig:roles}
\end{figure*}
\endgroup
Fig.~\ref{fig:roles} shows a typical \ac{PMIx} implementation in which the application is
built against a \ac{PMIx} client library that contains the client-side \acp{API},
attribute definitions, and communication support for interacting with the local \ac{PMIx} server.
\ac{PMIx} clients are processes which are started through the \ac{PMIx} infrastructure,
either by the PMIx implementation directly or through a \ac{SMS} component, and have registered
as clients. A \ac{PMIx} client
is created in such a way that the \ac{PMIx} client library
will be have sufficient information available to
authenticate with the \ac{PMIx} server.
The \ac{PMIx} server will have sufficient knowledge about the
process which it created, either directly or through other \ac{SMS}, to authenticate the
process and provide information the process requests such as its identity and the
identity of its peers.
As clients invoke \ac{PMIx} \acp{API}, it is possible that some client requests can
be handled at the client level. Other requests might require communication with the
local \ac{PMIx} server, which subsequently might request services from the host \ac{SMS}
(represented here by a \ac{RM} daemon).
The interaction between the \ac{PMIx} server and \ac{SMS} are
achieved using callback functions registered during server initialization.
The host \ac{SMS} can indicate its lack of support for any operation by simply providing a \textit{NULL} for the associated callback function, or can create a function entry that returns \textit{not supported} when called.
Recognizing the burden this places on SMS vendors, the PMIx community has included interfaces by
which the host \ac{SMS} (containing the local PMIx service instance) can request support from local \ac{SMS} elements via the PMIx API. Once the \ac{SMS} has transferred the request to
an appropriate location, a \ac{PMIx} server interface can be used to pass the request
between \ac{SMS} subsystems.
For example, a request for network traffic statistics can utilize the
PMIx networking abstractions to retrieve the information from the Fabric Manager.
This reduces the portability and
interoperability issues between the individual subsystems by transferring the burden of defining the
interoperable interfaces from the \ac{SMS} subsystems to the \ac{PMIx} community, which continues
to work with those providers to develop the necessary support.
Fig.~\ref{fig:roles} shows how tools can interact with the \ac{PMIx} architecture.
Tools, whether standalone or embedded in job scripts, are an exception to
the normal client registration process. A process can register as a tool, provided
the \ac{PMIx} client library has adequate rendezvous information to connect to the appropriate
\ac{PMIx} server (either hosted on the local machine or on a remote machine). This allows processes
which were not created by the PMIx infrastructure to request access to PMIx functionality.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{Portability of Functionality}
\label{chap:intro:not_supported}
It is difficult to define a portable \ac{API} that will provide access to the many
and varied features underlying the operations for which \ac{PMIx} provides access.
For example, the options and features provided to request the creation
of new processes varied dramatically between different systems existing
at the time \ac{PMIx} was introduced. Many \acp{RM} provide rich interfaces
to specify the resources assigned to processes.
As a result, \ac{PMIx} is faced with the challenge
of attempting to meet the seamingly conflicting goals of creating an \ac{API} which allows
access to these diverse features while being portable across a wide range of
existing software environments. In addition, the functionalities required by different
clients vary greatly. Producing a \ac{PMIx} implementation
which can provide the needs of all possible clients on all of its target systems
could be so burdensome as to discourage \ac{PMIx} implementations.
To help address this issue, the \ac{PMIx} \acp{API} are designed to allow resource managers
and other system management stack components to decide on support of a
particular function and allow client applications to query and adjust to the level of support available. \ac{PMIx} clients should be written to account for the possibility that a \ac{PMIx} \ac{API} might return an error code indicating that the call is not supported.
The \ac{PMIx} community continues to look at ways to assist \ac{SMS} implementers in their decisions
on what functionality to support by highlighting functions and attributes that are
critical to basic application execution (e.g., \refapi{PMIx_Get}) for certain classes of applications.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\subsection{Attributes in PMIx}
\label{chap:intro:attributes}
An area where differences between support on different systems can be challenging is regarding the attributes that provide information to the client process and/or control the behavior of a \ac{PMIx} \ac{API}. Most
\ac{PMIx} \ac{API} calls can accept additional information or attributes specified in the form of
key/value pairs. These attributes provide information to the \ac{PMIx} implementation that influence the
behavior of the \ac{API} call. In addition to \ac{API} calls being optional, support for the
individual attributes of an \ac{API} call can vary between systems or implementations.
An application can adapt to the attribute support on a particular system in one of two ways.
\ac{PMIx} provides an \ac{API} to enable an application to query the attributes
supported by a particular \ac{API} (See \ref{chap:api_job_mgmt:queryattrs}).
Through this \ac{API}, the \ac{PMIx} implementation can provide detailed information
about the attributes supported on a system for each \ac{API} call queried.
Alternatively, the application can mark attributes as
required using a flag within the \refstruct{pmix_info_t} (See \ref{chap:struct:info}).
If the required attribute is
not available on the system or the desired value for the attribute is not available, the call will
return the error code for \textit{not supported}.
For example, the \refattr{PMIX_TIMEOUT} attribute can be used to specify the time (in seconds) before the requested operation should time out. The intent of this attribute is to allow the client to avoid ``hanging'' in a request that takes longer than the client wishes to wait, or may never return (e.g., a \refapi{PMIx_Fence} that a blocked participant never enters).
The application can query the attribute support for \refapi{PMIx_Fence}
and search whether \refattr{PMIX_TIMEOUT} is listed as a supported attribute. The application can also
set the required flag in the \refstruct{pmix_info_t} for that attribute when making the \refapi{PMIx_Fence}
call. This will return an error if this attribute is not supported. If the required flag is not set,
the library and \ac{SMS} host are allowed to treat the attribute as optional, ignoring it if support
is not available.
It is therefore critical that users and application implementers:
\begin{compactalphaenum}
\item consider whether or not a given attribute is required, marking it accordingly; and
\item check the return status on all \ac{PMIx} function calls to ensure support was present and that the request was accepted. Note that for non-blocking \acp{API}, a return of \refconst{PMIX_SUCCESS} only indicates that the request had no obvious errors and is being processed – the eventual callback will return the status of the requested operation itself.
\end{compactalphaenum}
\ac{PMIx} clients (e.g., tools, parallel programming libraries) may find that they depend only on a small subset of interfaces and attributes to work correctly.
\ac{PMIx} clients are strongly advised to define a document itemizing the \ac{PMIx} interfaces and associated attributes that are required for correct operation, and are optional but recommended for full functionality.
The \ac{PMIx} standard cannot define this list for all given \ac{PMIx} clients, but such a list is valuable to \acp{RM} desiring to support these clients.
A \ac{PMIx} implementation may be able to support only a subset of the \ac{PMIx} \acs{API} and attributes on
a particular system due to either its own limitations or limitations of the \ac{SMS} with which it interfaces.
A \ac{PMIx} implemenation may also provide additional attributes beyond those defined herein in order to allow
applications to access the full features of the underlying \ac{SMS}.
\ac{PMIx} implementations are strongly advised to document the \ac{PMIx} interfaces and associated attributes they support, with any annotations about behavior limitations.
The \ac{PMIx} standard cannot define this support for implementations, but such documentation is valuable to \ac{PMIx} clients desiring to support a broad range of systems.
While a \ac{PMIx} library implementer, or an \ac{SMS} component server, may choose to support a particular \ac{PMIx} \ac{API}, they are not required to support every attribute that might apply to it. This would pose a significant barrier to entry for an implementer as there can be a broad range of applicable attributes to a given \ac{API}, at least some of which may rarely be used.
Note that an environment that does not include support for a particular attribute/\ac{API} pair is not ``incomplete'' or of lower quality than one that does include that support. Vendors must decide where to invest their time based on the needs of their target markets, and it is perfectly reasonable for them to perform cost/benefit decisions when considering what functions and attributes to support.
Attributes in this document are organized according to their primary usage, either grouped with a specific \ac{API} or included in an appropriate functional chapter. Attributes in the \ac{PMIx} Standard all start with \code{"PMIX"} in their name, and many include a functional description as part of their name (e.g., the use of \code{"PMIX_FABRIC_"} at the beginning of fabric-specific attributes). The \ac{PMIx} Standard also defines an attribute that can be used to indicate that an attribute variable has not yet been set:
\declareAttribute{PMIX_ATTR_UNDEF}{"pmix.undef"}{NULL}{
A default attribute name signifying that the attribute field of a \ac{PMIx} structure (e.g., a \refstruct{pmix_info_t}) has not yet been defined.
}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\subsection{PMIx Roles}
\label{chap:intro:roles}
The role of a \ac{PMIx} process in the \ac{PMIx} universe is grouped into one of three categories based on how it operates in the \ac{PMIx} environment namely as a \emph{client}, \emph{server}, or \emph{tool}.
As a result, there are three corresponding groupings of \acp{API} each with their own initialization and finalization functions.
If a process initializes as either a \emph{server} or a \emph{tool} that process may also access all of the \emph{client} \acp{API}.
A process operating as a \refterm{client} is connected to the \ac{PMIx} server instance within an \ac{RM} when the client calls the client \ac{PMIx} initialization routine.
The \refterm{client} is typically started directly or indirectly (for example, by an intermediate script) by that \ac{RM}.
Additionally, a \refterm{client} may be started directly by the user and then connect to an \ac{RM} which is typically referred to as a \declareterm{singleton} launch.
A process operating as a \declareterm{server} is responsible for starting client processes and coordinating with other server and tool processes in the same \ac{PMIx} universe.
Often processes operating as a \emph{server} are part of the \acf{RM} infrastructure.
A process operating as a \declareterm{tool} is started independently (e.g., via fork/exec) or by the \ac{RM} and will connect to a \ac{PMIx} \emph{server} to interact with the processes in the \ac{PMIx} universe.
An example of a \emph{tool} process is a parallel debugger that will connect to the server to assist with attaching to a set of client processes.
\ac{PMIx} serves as a conduit between processes acting in these three different roles.
As such, an \ac{API} is often described by how it interacts with processes operating in other roles in the \ac{PMIx} universe.
\adviceimplstart
A \ac{PMIx} implementation may support all or a subset of the \ac{API} role groupings defined in the standard.
A common nomenclature is defined here to aid in identifying levels of conformance of an implementation.
Note that it would not make sense for an implementation to exclude the \emph{client} interfaces from their implementation since they are also used by the \emph{server} and \emph{tool} roles.
Therefore the \emph{client} interfaces represent the minimal set of required functionality for \ac{PMIx} compliance.
A \ac{PMIx} implementation that supports only the \emph{client} \acp{API} is said to be \emph{client-role \ac{PMIx} standard compliant}.
Similarly, a \ac{PMIx} implementation that only supports the \emph{client} and \emph{tool} \acp{API} is said to be \emph{client-role and tool-role \ac{PMIx} standard compliant}.
Finally, a \ac{PMIx} implementation that only supports the \emph{client} and \emph{server} \acp{API} is said to be \emph{client-role and server-role \ac{PMIx} standard compliant}.
A \ac{PMIx} implementation that supports all three sets of the \ac{API} role groupings is said to be \emph{client-role, server-role, and tool-role \ac{PMIx} standard compliant}.
These \emph{client-role,server-role, and tool-role \ac{PMIx} standard compliant} implementations have the advantage of being able to support a broad set of \ac{PMIx} consumers in the different roles.
\adviceimplend
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\subsection{Application Binary Interface (ABI)}
\label{chap:intro:abi}
An \acf{API} defines how data types, data structures, and functions are represented in source code.
An \acf{ABI} defines how data types, data structures, and functions are represented in machine code for a given system making it platform-specific.
An important aspect of an \ac{ABI} is the size, layout, and alignment of data structures.
A stable \ac{ABI} may allow a program compiled with one implementation of the PMIx Standard to run with a different implementation of the PMIx Standard as long as both implementations adhere to the same \ac{ABI}.
The PMIx Standard strives to maintain a stable \ac{ABI} to support applications and tools that rely on more than one implementation of the PMIx Standard.
To facilitate such interoperability the PMIx \ac{ASC} maintains a set of standardized headers that are versioned with the PMIx Standard that applications and tools can reference~\footnote{PMIx Headers for ABI Compatibility \url{https://github.com/pmix/pmix-abi}}.
In recognition that there are circumstances where the \ac{ABI} needs to be modified this section defines some guidance for making such modifications.
Additions to the PMIx interface can occur without breaking \ac{ABI} compatibility.
Deprecating portions of the PMIx interface does not break \ac{ABI} compatibility but serves as a warning that the \ac{ABI} may be impacted in the future.
Removing portions of the PMIx interface does break \ac{ABI} compatibility.
Modifications to the existing PMIx interface do break \ac{ABI} compatibility.
The PMIx \ac{ABI} is comprised of the following:
\begin{compactitemize}
\item Function signatures
\item Constants and their values
\item Structures and their membership (including field position)
\item Attributes that are required to be implemented and their string representation
\end{compactitemize}
The \ac{ASC} prohibits data structure and function signature changes that break the \ac{ABI}
(e.g., modifications to data structure member alignment and/or size).
It is preferred that the authors create a new data structure or function and
duplicate all impacted PMIx interfaces.
\rationalestart
Breaking the \ac{ABI} of data structures and function signatures while using the same
name prevents a PMIx implementation from supporting backward compatible and
cross-version compatible \ac{PMIx} libraries.
\rationaleend
The value associated with attributes and constants are part of an \ac{API} and not part of a \refterm{Linker ABI}.
The PMIx Standard defines a \refterm{Build ABI} where the value associated with PMIx attributes and constants can be assumed based on the \ac{ABI} version.
The PMIx Standard \ac{ABI} is defined by a set of \ac{ASC} maintained headers.
The instantiation of the PMIx Standard \ac{ABI} in binary form is platform-specific.
The PMIx Standard and PMIx implementations must take care with data structures (e.g., \refstruct{pmix_value_t}) to ensure that the offset and size of members within the structure along with the size of the structure itself remain stable for a given platform.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\subsubsection{PMIx ABI Versioning}
\label{chap:intro:abi:versioning}
The PMIx Standard ABI is defined in two parts.
The \textit{PMIx Standard Stable ABI} represents the Stable PMIx Standard elements (see the PMIx Governance document).
The \textit{PMIx Standard Provisional ABI} represents the Provisional PMIx Standard elements (see the PMIx Governance document).
Both the Stable ABI and Provisional ABI are versioned with two increasing numbers:
\begin{compactitemize}
\item \texttt{MAJOR} incremented when the ABI changed in a backward-incompatible manner.
\item \texttt{MINOR} incremented when functionality is added to the ABI in a backward-compatible manner.
\end{compactitemize}
The PMIx Standard ABI version numbers largely follow the Semantic Versioning 2.0.0 specification\footnote{Semantic Versioning \url{https://semver.org/spec/v2.0.0.html}}.
However, a PMIx implementation may use a different version numbering technique for the objects associated with that implementation.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\subsubsection{Linker versus Build ABI}
\label{chap:intro:abi:linkervbuild}
The software development community often discusses \ac{ABI} in two different ways, that of a \emph{Linker ABI} and of a \emph{Build ABI}.
A \declareterm{Linker ABI} defines a stable set of symbols (i.e., functions) against which a compiler will attempt to link the binary.
The Linker ABI does not specify constant values or macro definitions leaving those to individual implementations to define.
If Library A version \code{1} and Library A version \code{2} are Linker ABI compatible then a program that is compiled against Library A version \code{1} can link against Library A version \code{2}.
Note that the reverse is not necessarily true as Library A version \code{2} may define additional symbols not included in Library A version \code{1}.
An indication of this compatibility is useful when upgrading a library package on a system.
Often libraries rely on Semantic Versioning~\footnote{Semantic Versioning \url{https://semver.org/}} to signify breaks in the Linker ABI between versions of the library.
A \declareterm{Build ABI} defines the full set of symbols, constants, and macros used by a compiler to generate the resulting binary.
If Library A and Library B are Build ABI compatible then a program compiled against Library A will work when linked with Library B.
The PMIx Standard defines a Build ABI in the PMIx Standard ABI repository~\footnote{PMIx Headers for ABI Compatibility \url{https://github.com/pmix/pmix-abi}}.
Any program built against the headers defined in the PMIx Standard ABI version \code{X.Y} will work with any PMIx implementation that is \ac{ABI} compatible with version \code{X.Y}.
Note that the PMIx implementation may include additional items beyond the PMIX Standard ABI at version \code{X.Y} and still report being PMIx Standard version \code{X.Y} compliant.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%