update bugs: bos_image, basic_setup, shadow
Signed-off-by: tkucherera <[email protected]>
1 parent f3393e1, commit 10fe611. Showing 30 changed files with 1,262 additions and 2 deletions.
104 changes: 104 additions & 0 deletions
docs/recipes/install/common/add_confluent_hosts_intro.tex
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,104 @@ | ||
%\subsubsection{Register nodes for provisioning} | ||
|
||
\noindent Next, we add {\em compute} nodes and define their properties as | ||
attributes in the \Confluent{} database. | ||
These hosts are grouped logically into a group named {\em | ||
compute} to facilitate group-level commands used later in the recipe. The {\em compute} | ||
group must be defined with the \texttt{nodegroupdefine} command before any nodes | ||
can be added to it. Note the | ||
use of variable names for the desired compute hostnames, node IPs, MAC | ||
addresses, and BMC login credentials, which should be modified to accommodate | ||
local settings and hardware. To enable serial console access via \Confluent{}, | ||
the \texttt{console.method} | ||
property is also defined. | ||
|
||
% begin_ohpc_run | ||
% ohpc_validation_newline | ||
% ohpc_validation_comment Add hosts to cluster \ref{sec:confluent_add_nodes} | ||
\begin{lstlisting}[language=bash,keywords={},upquote=true,basicstyle=\footnotesize\ttfamily,] | ||
# Define the compute group | ||
[sms](*\#*) nodegroupdefine compute | ||
|
||
# Define nodes as objects in confluent database | ||
[sms](*\#*) for ((i=0; i<$num_computes; i++)) ; do | ||
nodedefine ${c_name[$i]} groups=everything,compute hardwaremanagement.manager=${c_bmc[$i]} \ | ||
secret.hardwaremanagementuser=${bmc_username} secret.hardwaremanagementpassword=${bmc_password} \ | ||
net.hwaddr=${c_mac[$i]} | ||
done | ||
\end{lstlisting} | ||
% end_ohpc_run | ||
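
\noindent Because the loop above indexes several arrays in parallel, a mismatch in
their lengths silently produces incomplete node definitions. The following is a
minimal sketch, not part of the recipe, that checks the inputs and prints the
commands that would be run. All values shown are hypothetical placeholders, and
\texttt{check\_inputs} and \texttt{preview\_nodedefine} are illustrative helper
names rather than \Confluent{} commands.

```shell
#!/usr/bin/env bash
# Hypothetical example values; replace with local hostnames, BMC addresses and MACs.
num_computes=2
c_name=(c1 c2)
c_bmc=(10.16.1.1 10.16.1.2)
c_mac=(00:1a:2b:3c:4f:56 00:1a:2b:3c:4f:57)
bmc_username=admin        # placeholder credential

# Return non-zero unless every array has exactly num_computes entries.
check_inputs() {
    local n
    for n in "${#c_name[@]}" "${#c_bmc[@]}" "${#c_mac[@]}"; do
        if [ "$n" -ne "$num_computes" ]; then
            echo "error: c_name, c_bmc and c_mac must each have ${num_computes} entries" >&2
            return 1
        fi
    done
}

# Print, rather than run, the per-node command; the password is masked on purpose.
preview_nodedefine() {
    local i
    for ((i=0; i<num_computes; i++)); do
        echo "nodedefine ${c_name[$i]} groups=everything,compute" \
             "hardwaremanagement.manager=${c_bmc[$i]}" \
             "secret.hardwaremanagementuser=${bmc_username}" \
             "secret.hardwaremanagementpassword=****" \
             "net.hwaddr=${c_mac[$i]}"
    done
}

check_inputs && preview_nodedefine
```

Reviewing the printed commands before running the real loop makes it easier to
catch a swapped MAC address or a missing BMC entry.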
|
||
\begin{center} | ||
\begin{tcolorbox}[] | ||
\small | ||
Defining nodes one-by-one, as done above, is only efficient | ||
for a small number of nodes. For larger node counts, | ||
\Confluent{} provides capabilities for automated detection and | ||
configuration. | ||
Consult the | ||
\href{https://hpc.lenovo.com/users/documentation/confluentdisco.html}{\color{blue}\Confluent{} | ||
Hardware Discovery \& Define Node Guide}. | ||
\end{tcolorbox} | ||
\end{center} | ||
|
||
%\clearpage | ||
If enabling {\em optional} IPoIB functionality (e.g. to support Lustre over \InfiniBand{}), additional | ||
settings are required to define the IPoIB network with \Confluent{} and specify | ||
desired IP settings for each compute. This can be accomplished as follows for | ||
the {\em ib0} interface: | ||
|
||
% begin_ohpc_run | ||
% ohpc_validation_newline | ||
% ohpc_validation_comment Setup IPoIB networking | ||
% ohpc_command if [[ ${enable_ipoib} -eq 1 ]];then | ||
% ohpc_indent 5 | ||
\begin{lstlisting}[language=bash,keywords={},upquote=true,basicstyle=\footnotesize\ttfamily] | ||
# Register desired IPoIB IPs per compute | ||
[sms](*\#*) for ((i=0; i<$num_computes; i++)) ; do | ||
nodeattrib ${c_name[i]} net.ib0.ipv4_address=${c_ipoib[i]}/${ipoib_netmask} | ||
done | ||
\end{lstlisting} | ||
% ohpc_indent 0 | ||
% ohpc_command fi | ||
% end_ohpc_run | ||
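
\noindent The address suffix above is built from \texttt{ipoib\_netmask}. This recipe
does not state whether that variable holds a dotted-quad mask or a prefix length.
If it is kept in dotted-quad form locally and a CIDR prefix length is preferred
for the suffix, the conversion can be sketched as follows. This is an
illustration under that assumption; \texttt{mask\_to\_prefix} is a hypothetical
helper, not a \Confluent{} command, and the sample mask is a placeholder.

```shell
#!/usr/bin/env bash
# Convert a dotted-quad netmask (e.g. 255.255.0.0) to a prefix length (e.g. 16).
# mask_to_prefix is a hypothetical helper, not a Confluent command.
mask_to_prefix() {
    local mask=$1 octet n bits=0 partial=0 count=0
    local IFS=.
    for octet in $mask; do
        case $octet in
            255) n=8 ;; 254) n=7 ;; 252) n=6 ;; 248) n=5 ;; 240) n=4 ;;
            224) n=3 ;; 192) n=2 ;; 128) n=1 ;; 0) n=0 ;;
            *) echo "invalid netmask: $mask" >&2; return 1 ;;
        esac
        # After a non-255 octet, every later octet must be 0.
        if (( partial && n != 0 )); then
            echo "non-contiguous netmask: $mask" >&2
            return 1
        fi
        if (( n < 8 )); then partial=1; fi
        bits=$(( bits + n ))
        count=$(( count + 1 ))
    done
    if (( count != 4 )); then
        echo "invalid netmask: $mask" >&2
        return 1
    fi
    echo "$bits"
}

ipoib_netmask=255.255.0.0      # hypothetical local value
ipoib_prefix=$(mask_to_prefix "$ipoib_netmask") && echo "prefix length: ${ipoib_prefix}"
```

The helper rejects malformed or non-contiguous masks instead of guessing, so a
typo in the input file surfaces before any node attribute is written.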
|
||
%\clearpage | ||
With the desired compute nodes and domain identified, the remaining steps in the | ||
provisioning configuration process are to define the provisioning mode and | ||
image for the {\em compute} group and use \Confluent{} commands to complete | ||
configuration for network services like DNS and DHCP. These tasks are | ||
accomplished as follows: | ||
|
||
%\clearpage | ||
% begin_ohpc_run | ||
% ohpc_validation_newline | ||
% ohpc_validation_comment Complete networking setup, associate provisioning image | ||
\begin{lstlisting}[language=bash,keywords={},upquote=true,basicstyle=\footnotesize\ttfamily,literate={BOSSHORT}{\baseosshort{}}1 {IMAGE}{\installimage{}}1] | ||
# Associate desired provisioning image for computes | ||
[sms](*\#*) nodedeploy -n compute -p rocky-9.4-x86_64-default | ||
\end{lstlisting} | ||
% end_ohpc_run | ||
|
||
%%% If the Lustre client was enabled for computes in \S\ref{sec:lustre_client}, you | ||
%%% should be able to mount the file system post-boot using the fstab entry | ||
%%% (e.g. via ``\texttt{mount /mnt/lustre}''). Alternatively, if | ||
%%% you prefer to have the file system mounted automatically at boot time, a simple | ||
%%% postscript can be created and registered with \xCAT{} for this purpose as follows. | ||
%%% | ||
%%% % begin_ohpc_run | ||
%%% % ohpc_validation_newline | ||
%%% % ohpc_validation_comment Optionally create xCAT postscript to mount Lustre client | ||
%%% % ohpc_command if [ ${enable_lustre_client} -eq 1 ];then | ||
%%% % ohpc_indent 5 | ||
%%% \begin{lstlisting}[language=bash,keywords={},upquote=true,basicstyle=\footnotesize\ttfamily,literate={BOSVER}{\baseos{}}1] | ||
%%% # Optionally create postscript to mount Lustre client at boot | ||
%%% [sms](*\#*) echo '#!/bin/bash' > /install/postscripts/lustre-client | ||
%%% [sms](*\#*) echo 'mount /mnt/lustre' >> /install/postscripts/lustre-client | ||
%%% [sms](*\#*) chmod 755 /install/postscripts/lustre-client | ||
%%% # Register script for computes | ||
%%% [sms](*\#*) chdef compute -p postscripts=lustre-client | ||
%%% \end{lstlisting} | ||
%%% % ohpc_indent 0 | ||
%%% % ohpc_command fi | ||
%%% % end_ohpc_run | ||
%%% | ||
|
61 changes: 61 additions & 0 deletions
docs/recipes/install/common/add_to_compute_confluent_intro.tex
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,61 @@ | ||
% -*- mode: latex; fill-column: 120; -*- | ||
|
||
The next step is adding \OHPC{} components to the {\em compute} nodes, which at this | ||
point are running only a basic OS. This process will leverage two \Confluent{}-provided | ||
commands: \texttt{nodeshell} to run the \texttt{\pkgmgr{}} installer on all the | ||
nodes in parallel and \texttt{nodersync} to distribute configuration files from the | ||
SMS to the {\em compute} nodes. | ||
|
||
\noindent To do this, repositories on the {\em compute} nodes need to be configured | ||
properly. | ||
|
||
\Confluent{} has automatically set up an OS repository on the SMS and configured the | ||
nodes to use it, but it has also enabled online OS repositories. | ||
|
||
|
||
\noindent Next, we also add the \OHPC{} repository to the compute nodes (see \S\ref{sec:enable_repo}): | ||
|
||
% begin_ohpc_run | ||
% ohpc_comment_header Setup nodes repositories and Install OHPC components \ref{sec:add_components} | ||
\begin{lstlisting}[language=bash,literate={-}{-}1,keywords={},upquote=true] | ||
# Add OpenHPC repo | ||
[sms](*\#*) (*\chrootinstall*) http://repos.openhpc.community/OpenHPC/3/EL_9/x86_64/ohpc-release-3-1.el9.x86_64.rpm | ||
\end{lstlisting} | ||
% end_ohpc_run | ||
|
||
The {\em compute} nodes also need access to the EPEL repository, a required | ||
dependency for \OHPC{} packages. | ||
|
||
% begin_ohpc_run | ||
% ohpc_comment_header Configure access to EPEL repo | ||
\begin{lstlisting}[language=bash,literate={-}{-}1,keywords={},upquote=true] | ||
# Add epel repo | ||
[sms](*\#*) (*\chrootinstall*) epel-release | ||
|
||
\end{lstlisting} | ||
% end_ohpc_run | ||
|
||
|
||
\noindent Additionally, a workaround is needed for \OHPC{} documentation files, | ||
which are installed into the read-only NFS share \texttt{/opt/ohpc/pub}. Any package | ||
attempting to write to that directory will fail to install. The following | ||
prevents that by directing \texttt{rpm} not to install documentation files on | ||
the {\em compute} nodes: | ||
|
||
% begin_ohpc_run | ||
\begin{lstlisting}[language=bash,literate={-}{-}1,keywords={},upquote=true] | ||
[sms](*\#*) nodeshell compute echo -e %_excludedocs 1 \>\> ~/.rpmmacros | ||
\end{lstlisting} | ||
% end_ohpc_run | ||
|
||
\noindent Now \OHPC{} and other cluster-related software components can be | ||
installed on the nodes. The first step is to install a base compute package: | ||
% begin_ohpc_run | ||
% ohpc_comment_header Add OpenHPC base components to compute image | ||
\begin{lstlisting}[language=bash,literate={-}{-}1,keywords={},upquote=true] | ||
# Install compute node base meta-package | ||
[sms](*\#*) (*\chrootinstall*) ohpc-base-compute | ||
\end{lstlisting} | ||
% end_ohpc_run | ||
|
||
\noindent Next, we can include additional components: |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,4 @@ | ||
Installation is accomplished in two steps: first, a generic OS | ||
image is installed on the {\em compute} nodes; then, once the nodes are up | ||
and running, \OHPC{} components are added to both the SMS and the nodes at the | ||
same time. |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,35 @@ | ||
Here we set up \NFS{} mounting of a | ||
\$HOME file system and the public \OHPC{} install path (\texttt{/opt/ohpc/pub}) | ||
that will be hosted by the {\em master} host in this example configuration. | ||
|
||
\vspace*{0.15cm} | ||
% begin_ohpc_run | ||
% ohpc_comment_header Customize system configuration \ref{sec:master_customization} | ||
\begin{lstlisting}[language=bash,literate={-}{-}1,keywords={},upquote=true] | ||
# Disable /tftpboot and /install export entries | ||
[sms](*\#*) perl -pi -e "s|/tftpboot|#/tftpboot|" /etc/exports | ||
[sms](*\#*) perl -pi -e "s|/install|#/install|" /etc/exports | ||
|
||
# Export /home and OpenHPC public packages from master server | ||
[sms](*\#*) echo "/home *(rw,no_subtree_check,fsid=10,no_root_squash)" >> /etc/exports | ||
[sms](*\#*) echo "/opt/ohpc/pub *(ro,no_subtree_check,fsid=11)" >> /etc/exports | ||
[sms](*\#*) exportfs -a | ||
[sms](*\#*) systemctl restart nfs-server | ||
[sms](*\#*) systemctl enable nfs-server | ||
|
||
|
||
# Create NFS client mounts of /home and /opt/ohpc/pub on compute hosts | ||
[sms](*\#*) nodeshell compute echo \ | ||
"\""${sms_ip}:/home /home nfs nfsvers=3,nodev,nosuid 0 0"\"" \>\> /etc/fstab | ||
[sms](*\#*) nodeshell compute echo \ | ||
"\""${sms_ip}:/opt/ohpc/pub /opt/ohpc/pub nfs nfsvers=3,nodev 0 0"\"" \>\> /etc/fstab | ||
[sms](*\#*) nodeshell compute systemctl restart nfs | ||
|
||
# Mount NFS shares | ||
[sms](*\#*) nodeshell compute mount /home | ||
[sms](*\#*) nodeshell compute mkdir -p /opt/ohpc/pub | ||
[sms](*\#*) nodeshell compute mount /opt/ohpc/pub | ||
|
||
\end{lstlisting} | ||
% end_ohpc_run | ||
|
59 changes: 59 additions & 0 deletions
docs/recipes/install/common/confluent_init_os_images_rocky.tex
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,59 @@ | ||
% -*- mode: latex; fill-column: 120; -*- | ||
|
||
With the provisioning services enabled, the next step is to define | ||
a system image that can subsequently be | ||
used to provision one or more {\em compute} nodes. The following subsections highlight this process. | ||
|
||
\subsubsection{Build initial BOS image} \label{sec:assemble_bos} | ||
The following steps illustrate the process to build a minimal, default image for use with \Confluent{}. To begin, you will | ||
first need to have a local copy of the ISO image available for the underlying OS. In this recipe, the relevant ISO image | ||
is \texttt{Rocky-9.4-x86\_64-dvd.iso} (available from the Rocky | ||
\href{https://rockylinux.org/download/}{\color{blue}download} page). | ||
The \texttt{osdeploy initialize} command prepares a \Confluent{} server to deploy operating systems. | ||
For first-time setup, \texttt{osdeploy initialize} can be run interactively, which walks through the various options: | ||
\texttt{osdeploy initialize -i}. | ||
|
||
We initialize the image | ||
creation process using the \texttt{osdeploy} command, assuming that the necessary ISO image is available locally in | ||
\texttt{\$\{iso\_path\}}, as follows: | ||
|
||
% begin_ohpc_run | ||
% ohpc_comment_header Initialize OS images for use with Confluent \ref{sec:assemble_bos} | ||
\begin{lstlisting}[language=bash,literate={-}{-}1,keywords={},upquote=true,keepspaces,literate={BOSVER}{\baseos{}}1] | ||
[sms](*\#*) osdeploy initialize -${initialize_options} | ||
[sms](*\#*) osdeploy import ${iso_path} | ||
|
||
\end{lstlisting} | ||
% end_ohpc_run | ||
|
||
\noindent Once completed, the OS images should be available for use within \Confluent{}. They can be queried via: | ||
|
||
\begin{lstlisting}[language=bash,literate={-}{-}1,keywords={},upquote=true,keepspaces,literate={BOSVER}{\baseos{}}1] | ||
# Query available images | ||
[sms](*\#*) osdeploy list | ||
Distributions: | ||
rocky-8.5-x86_64 | ||
rocky-9.4-x86_64 | ||
Profiles: | ||
rocky-9.4-x86_64-default | ||
rocky-8.5-x86_64-default | ||
\end{lstlisting} | ||
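
\noindent Since later steps refer to a profile by name, it can be useful to confirm
that the profile is actually listed before deploying. The sketch below parses
text in the format shown above; it is an illustration only.
\texttt{profile\_listed} is a hypothetical helper, and the sample text is a
stand-in for real \texttt{osdeploy list} output.

```shell
#!/usr/bin/env bash
# profile_listed is a hypothetical helper: read `osdeploy list`-style text on
# stdin and succeed only if the named profile appears under "Profiles:".
profile_listed() {
    awk -v want="$1" '
        # A line ending in ":" starts a new section.
        /:[[:space:]]*$/          { in_profiles = ($1 == "Profiles:"); next }
        in_profiles && $1 == want { found = 1 }
        END                       { exit !found }
    '
}

# Stand-in text in the same layout as the listing above (hypothetical content).
sample_output='Distributions:
  rocky-9.4-x86_64
Profiles:
  rocky-9.4-x86_64-default'

if printf '%s\n' "$sample_output" | profile_listed rocky-9.4-x86_64-default; then
    echo "profile available"
fi
```

On the SMS, the same check would read from the real command, for example
\texttt{osdeploy list | profile\_listed rocky-9.4-x86\_64-default}.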
|
||
If files need to be copied from the SMS to the {\em compute} nodes during deployment, this can be done by | ||
modifying the \texttt{syncfiles} file that is created when the \texttt{osdeploy import} command is run. For an environment | ||
that has no DNS server and needs the \texttt{/etc/hosts} file synced among all the nodes, the following command | ||
should be run: | ||
|
||
% begin_ohpc_run | ||
% ohpc_validation_newline | ||
% ohpc_validation_comment Sync the hosts file in cluster | ||
\begin{lstlisting}[language=bash,literate={-}{-}1,keywords={},upquote=true,keepspaces,literate={BOSVER}{\baseos{}}1] | ||
[sms](*\#*) echo "/etc/hosts -> /etc/hosts" >> /var/lib/confluent/public/os/rocky-9.4-x86_64-default/syncfiles | ||
|
||
\end{lstlisting} | ||
% end_ohpc_run | ||
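
\noindent Note that appending with \texttt{>>} adds a duplicate line each time the
command is re-run. A minimal sketch of an idempotent variant is shown below,
demonstrated against a scratch file rather than the real \texttt{syncfiles}
path. \texttt{add\_syncfile\_entry} is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# add_syncfile_entry is a hypothetical helper: append a line to a file only if
# an identical line is not already present.
add_syncfile_entry() {
    local entry=$1 file=$2
    touch "$file" || return 1
    grep -qxF -- "$entry" "$file" || printf '%s\n' "$entry" >> "$file"
}

# Demonstration against a scratch file; on the SMS the target would be the
# profile's syncfiles path used above.
syncfiles=$(mktemp)
add_syncfile_entry "/etc/hosts -> /etc/hosts" "$syncfiles"
add_syncfile_entry "/etc/hosts -> /etc/hosts" "$syncfiles"   # second call is a no-op
cat "$syncfiles"
```

Matching the whole line as a fixed string (\texttt{grep -xF}) avoids treating
characters in the path as a regular expression.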
|
||
|
||
%The \texttt{CHROOT} environment variable highlights the path and is used by | ||
%subsequent commands to augment the basic installation. | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,32 @@ | ||
At this point, all of the packages necessary to use \Confluent{} on the {\em master} | ||
host should be installed. Next, we enable support for local provisioning using | ||
a second private interface (refer to Figure~\ref{fig:physical_arch}): | ||
|
||
% begin_ohpc_run | ||
% ohpc_comment_header Complete basic Confluent setup for master node \ref{sec:setup_confluent} | ||
%\begin{verbatim} | ||
\begin{lstlisting}[language=bash,literate={-}{-}1,keywords={},upquote=true,keepspaces] | ||
# Enable internal interface for provisioning | ||
[sms](*\#*) ip link set dev ${sms_eth_internal} up | ||
[sms](*\#*) ip address add ${sms_ip}/${internal_netmask} broadcast + dev ${sms_eth_internal} | ||
|
||
\end{lstlisting} | ||
%\end{verbatim} | ||
% end_ohpc_run | ||
|
||
|
||
\noindent \Confluent{} requires a network domain name specification for system-wide name | ||
resolution. This value can be set to match your local DNS schema or given a | ||
unique identifier such as \texttt{local}. A default group called {\em everything} is | ||
automatically added to every node and provides a way to apply global settings. | ||
Attributes may all be specified on the command line; an example set could be: | ||
|
||
% begin_ohpc_run | ||
% ohpc_validation_newline | ||
% ohpc_validation_comment Define local domainname, deployment protocol and dns | ||
\begin{lstlisting}[language=bash,keywords={},upquote=true,basicstyle=\footnotesize\ttfamily,literate={BOSVER}{\baseos{}}1] | ||
[sms](*\#*) nodegroupattrib everything deployment.useinsecureprotocols=${deployment_protocols} dns.domain=${dns_domain} | ||
[sms](*\#*) nodegroupattrib everything dns.servers=${dns_servers} net.ipv4_gateway=${sms_ip} | ||
\end{lstlisting} | ||
|
||
\noindent We will also define |