Gitops ( FluxCD ) Installation with template YAMLS #3407

Open
wants to merge 6 commits into
base: main
162 changes: 6 additions & 156 deletions docs/source/jupyterhub/installation.md
Original file line number Diff line number Diff line change
@@ -1,161 +1,11 @@
(quick-install)=

# Installing JupyterHub

With a {doc}`Kubernetes cluster </kubernetes/setup-kubernetes>` available
and {doc}`Helm </kubernetes/setup-helm>` installed, we can install JupyterHub
in the Kubernetes cluster using the JupyterHub Helm chart.

## Initialize a Helm chart configuration file

Helm charts contain {term}`templates <Helm template>` that can be rendered to
the {term}`Kubernetes resources <Kubernetes resource>` to be installed. A user
of a Helm chart can override the chart's default values to influence how the
templates render.
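
As a hedged illustration of overriding chart defaults, the fragment below sets two values that the chart exposes for user pods. The numbers are arbitrary assumptions, not recommendations; check the chart's configuration reference for the chart version you install.

```yaml
# config.yaml (illustrative sketch; the numbers are assumptions, not recommendations)
singleuser:
  memory:
    # Upper bound on memory for each user's pod
    limit: 1G
    # Amount of memory reserved for each user's pod
    guarantee: 512M
```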

In this step we will initialize a chart configuration file that you can use to
adjust your installation of JupyterHub. We will refer to it as `config.yaml`
from here on.
JupyterHub can be installed in two main ways: directly with Helm, or through a GitOps tool such as FluxCD or ArgoCD. Which one fits depends on your team and deployment scenario. For a small number of users or a simple deployment, installing with Helm is usually the easier option, since it needs minimal configuration. For larger teams or more complex environments, a GitOps tool can streamline infrastructure management: FluxCD, for example, deploys and updates JupyterHub from version-controlled manifests, which keeps the deployment consistent and reproducible.

```{admonition} Introduction to YAML
If you haven't worked with YAML before, investing some
minutes [learning about it](https://www.youtube.com/watch?v=cdLNKUoMc6c)
will likely be worth your time.
```

As of version 1.0.0, you don't need any configuration to get started so you can
just create a `config.yaml` file with some helpful comments.
```{toctree}
:maxdepth: 2
:caption: Installing JupyterHub

```yaml
# This file can update the JupyterHub Helm chart's default configuration values.
#
# For reference see the configuration reference and default values, but make
# sure to refer to the Helm chart version of interest to you!
#
# Introduction to YAML: https://www.youtube.com/watch?v=cdLNKUoMc6c
# Chart config reference: https://zero-to-jupyterhub.readthedocs.io/en/stable/resources/reference.html
# Chart default values: https://github.com/jupyterhub/zero-to-jupyterhub-k8s/blob/HEAD/jupyterhub/values.yaml
# Available chart versions: https://hub.jupyter.org/helm-chart/
#
installation/helm
installation/fluxcd
```

If you are working from a terminal and are unsure how to create this file,
you can try `nano config.yaml`.

## Install JupyterHub

1. Make Helm aware of the [JupyterHub Helm chart repository](https://hub.jupyter.org/helm-chart/) so you can install the
JupyterHub chart from it without having to use a long URL name.

```
helm repo add jupyterhub https://hub.jupyter.org/helm-chart/
helm repo update
```

This should show output like:

```
Hang tight while we grab the latest from your chart repositories...
...Skip local chart repository
...Successfully got an update from the "stable" chart repository
...Successfully got an update from the "jupyterhub" chart repository
Update Complete. ⎈ Happy Helming!⎈
```

2. Now install the chart configured by your `config.yaml` by running this
command from the directory that contains your `config.yaml`:

```
helm upgrade --cleanup-on-fail \
--install <helm-release-name> jupyterhub/jupyterhub \
--namespace <k8s-namespace> \
--create-namespace \
--version=<chart-version> \
--values config.yaml
```

where:

- `<helm-release-name>` refers to a [Helm release name](https://helm.sh/docs/glossary/#release), an identifier used to
differentiate chart installations. You need it when you are changing or
deleting the configuration of this chart installation. If your Kubernetes
cluster will contain multiple JupyterHubs make sure to differentiate them.
You can list your Helm releases with `helm list`.
- `<k8s-namespace>` refers to a [Kubernetes namespace](https://kubernetes.io/docs/concepts/overview/working-with-objects/namespaces/),
an identifier used to group Kubernetes resources, in this case all
Kubernetes resources associated with the JupyterHub chart. You'll need the
namespace identifier for performing any commands with `kubectl`.
- This step may take a moment, during which time there will be no output
to your terminal. JupyterHub is being installed in the background.
- If you get a `release named <helm-release-name> already exists` error, then
you should delete the release by running `helm delete <helm-release-name>`.
Then reinstall by repeating this step. If it persists, also do `kubectl delete namespace <k8s-namespace>` and try again.
- In general, if something goes _wrong_ with the install step, delete the
Helm release by running `helm delete <helm-release-name>`
before re-running the install command.
- If you're pulling a large Docker image, you may get an
`Error: timed out waiting for the condition` error. In that case, add a
`--timeout=<number-of-minutes>m` parameter to the `helm` command.
- The `--version` parameter corresponds to the _version of the Helm
chart_, not the version of JupyterHub. Each version of the JupyterHub
Helm chart is paired with a specific version of JupyterHub. E.g.,
`0.11.1` of the Helm chart runs JupyterHub `1.3.0`.
For a list of which JupyterHub version is installed in each version
of the JupyterHub Helm Chart, see the [Helm Chart repository](https://hub.jupyter.org/helm-chart/).

3. While Step 2 is running, you can see the pods being created by entering in
a different terminal:

```
kubectl get pod --namespace <k8s-namespace>
```

To remain sane we recommend that you enable autocompletion for kubectl
(follow [the kubectl installation instructions for your platform](https://kubernetes.io/docs/tasks/tools/#kubectl)
to find the shell autocompletion instructions)
and set a default value for the `--namespace` flag:

```
kubectl config set-context $(kubectl config current-context) --namespace <k8s-namespace>
```

4. Wait for the _hub_ and _proxy_ pods to enter the `Running` state.

```
NAME READY STATUS RESTARTS AGE
hub-5d4ffd57cf-k68z8 1/1 Running 0 37s
proxy-7cb9bc4cc-9bdlp 1/1 Running 0 37s
```

5. Find the IP we can use to access the JupyterHub. Run the following
command until the `EXTERNAL-IP` of the `proxy-public` [service](https://kubernetes.io/docs/concepts/services-networking/service/)
is available like in the example output.

```
kubectl --namespace <k8s-namespace> get service proxy-public
```

```
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
proxy-public LoadBalancer 10.51.248.230 104.196.41.97 80:31916/TCP 1m
```

Or, use the short form:

```
kubectl --namespace <k8s-namespace> get service proxy-public --output jsonpath='{.status.loadBalancer.ingress[].ip}'
```

6. To use JupyterHub, enter the external IP for the `proxy-public` service into
a browser. JupyterHub is running with a default _dummy_ authenticator so
entering any username and password combination will let you enter the hub.

Congratulations! Now that you have basic JupyterHub running, you can {ref}`extend it <extending-jupyterhub>` and {ref}`optimize it <optimization>` in many
ways to meet your needs.

Some examples of customizations are:

- Configure the login to use the account that makes sense to you (Google, GitHub, etc.).
- Use a suitable pre-built image for the user container or build your own.
- Host it on <https://your-domain.com>.
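
As a sketch of the first two customizations, a `config.yaml` along these lines would switch the login to GitHub OAuth and select a pre-built user image. The client ID, client secret, callback domain, and image tag are placeholders you would need to supply, and the keys should be verified against the chart's configuration reference for your chart version.

```yaml
hub:
  config:
    JupyterHub:
      # Replace the default dummy authenticator with GitHub OAuth
      authenticator_class: github
    GitHubOAuthenticator:
      client_id: <your-client-id>          # placeholder
      client_secret: <your-client-secret>  # placeholder
      oauth_callback_url: https://<your-domain>/hub/oauth_callback
singleuser:
  image:
    # Example pre-built image; pin a specific tag
    name: jupyter/datascience-notebook
    tag: <image-tag>  # placeholder
```

Apply a change like this by re-running the `helm upgrade` command from step 2.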
157 changes: 157 additions & 0 deletions docs/source/jupyterhub/installation/fluxcd.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,157 @@
(flux-cd)=

# Installing JupyterHub Using FluxCD

## Why FluxCD for JupyterHub?

Running JupyterHub can often result in significant infrastructure overhead, especially for data science and data engineering teams. Provisioning and managing compute resources, dependencies, and ensuring high availability can be time-consuming and error-prone.

With FluxCD, you can have everything - from configuration to state - maintained in your Git repositories. This means that all configuration changes are version-controlled, providing a single source of truth for your JupyterHub deployment.

FluxCD automates the deployment and management of JupyterHub infrastructure, reducing manual effort and the risk of misconfiguration. Changes such as scaling resources up or down are made by committing to Git, and FluxCD applies them to the cluster.

Implementing continuous delivery practices with FluxCD enables automated deployment pipelines for JupyterHub. Updates and enhancements reach your JupyterHub environment faster and more reliably, so data science and data engineering teams can focus on their work rather than on infrastructure management.

## Prerequisites and Setting Up FluxCD

Before setting up FluxCD and deploying JupyterHub, ensure you have the following prerequisites:

1. **Access to Your Kubernetes Cluster:**
Make sure you have access to your Kubernetes cluster, whether it's on EKS, AKS, GKE, or any other Kubernetes distribution.

2. **Flux CLI:**
Install the Flux CLI on your local machine. You can refer to the [official Flux documentation](https://fluxcd.io/flux/get-started/) for installation instructions.

3. **Repositories Setup:**
Ensure you have repositories set up where you'll maintain the code to be bootstrapped with FluxCD. This could be on GitLab, GitHub, or any other version control platform. You can follow the steps outlined in the [FluxCD documentation](https://fluxcd.io/flux/installation/bootstrap/) to set up your repositories with FluxCD.

4. **Repository Structure:**
Consider how you want to structure your repositories based on your use case and how often you interact with the infrastructure. For a simple structure, you can keep all three required YAML files (Kustomization, HelmRelease, and HelmRepository) in one directory, which simplifies management and organization. If your use case demands more complexity, or if you anticipate frequent changes to the infrastructure, you may opt for a more layered or modular structure. This could involve separate directories (`apps`, `infra`, `clusters`) for different Kustomize overlays of your infrastructure, each containing its own set of configuration files and manifests.

For more guidance on repository structure options, refer to the [FluxCD documentation on repository structure](https://fluxcd.io/flux/guides/repository-structure/).

## Install JupyterHub

For the base installation of JupyterHub, you can refer to the template FluxCD files located in the `jupyterhub/templates/fluxcd/baseinstall` directory of this repository. These template files provide a starting point for deploying JupyterHub using FluxCD.

**Basic Configuration YAMLs for Installing JupyterHub with FluxCD:**

1. **HelmRepository YAML:**

- Defines the Helm chart repository source for JupyterHub.

```yaml
apiVersion: source.toolkit.fluxcd.io/v1beta1
kind: HelmRepository
metadata:
  name: jupyterhub
  namespace: flux-system
spec:
  interval: 1m
  url: "https://jupyterhub.github.io/helm-chart/"
```

**Where:** This YAML defines a HelmRepository named "jupyterhub" in the "flux-system" namespace. It specifies the URL of the Helm chart repository for JupyterHub and sets the interval for checking for updates to 1 minute.

2. **HelmRelease YAML:**

- Deploys JupyterHub using the Helm chart obtained from the HelmRepository.

```yaml
apiVersion: helm.toolkit.fluxcd.io/v2beta1
kind: HelmRelease
metadata:
  name: jupyterhub
  namespace: flux-system
spec:
  interval: 5m
  releaseName: jupyterhub
  targetNamespace: jupyter
  chart:
    spec:
      chart: jupyterhub
      version: "X.X.X"
      sourceRef:
        kind: HelmRepository
        name: jupyterhub
        namespace: flux-system
```

**Where:** This YAML defines a HelmRelease named "jupyterhub" in the "flux-system" namespace. It specifies the Helm chart to be deployed, including its version, obtained from the "jupyterhub" HelmRepository. It sets the interval for checking for updates to 5 minutes and deploys JupyterHub to the "jupyter" namespace.

3. **Kustomization YAML:**

- Lists the manifests that Kustomize assembles for this directory, including any additional configurations or resources.

```yaml
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - jupyterhub-repo.yaml
  - jupyterhub-release.yaml
```

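**Where:** This YAML is a Kustomize configuration file (conventionally named `kustomization.yaml`) that lists the HelmRepository and HelmRelease files as resources. Kustomize uses it to generate the final set of Kubernetes resources. Note that its API group, `kustomize.config.k8s.io`, is different from that of Flux's own `Kustomization` resource, `kustomize.toolkit.fluxcd.io`.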
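
If the directory holding these files is not under a path that Flux already reconciles, a Flux `Kustomization` resource can point Flux at it. The following is a minimal sketch under assumptions: the name `jupyterhub` and the path `./apps/jupyterhub` are hypothetical, the `flux-system` GitRepository is the one `flux bootstrap` creates by default, and the `apiVersion` should match what your Flux installation serves.

```yaml
# Flux Kustomization (a sketch; name and path are hypothetical)
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: jupyterhub
  namespace: flux-system
spec:
  interval: 10m
  # Directory in the Git repository that holds the three files above
  path: ./apps/jupyterhub
  # Remove cluster objects when their manifests are deleted from Git
  prune: true
  sourceRef:
    kind: GitRepository
    # The GitRepository that `flux bootstrap` creates by default
    name: flux-system
```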

These YAML files provide the basic configuration for deploying JupyterHub using FluxCD. Customize them as needed for your specific deployment requirements.
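
One possible customization, shown here as a sketch rather than a required setup, is to let Flux create the target namespace and to pass chart values inline. The `install.createNamespace` and `values` fields belong to the Flux HelmRelease spec; the memory limit is an illustrative assumption, and `X.X.X` remains a placeholder for the chart version you choose.

```yaml
apiVersion: helm.toolkit.fluxcd.io/v2beta1
kind: HelmRelease
metadata:
  name: jupyterhub
  namespace: flux-system
spec:
  interval: 5m
  releaseName: jupyterhub
  targetNamespace: jupyter
  install:
    # Create the target namespace if it does not exist yet
    createNamespace: true
  chart:
    spec:
      chart: jupyterhub
      version: "X.X.X"
      sourceRef:
        kind: HelmRepository
        name: jupyterhub
        namespace: flux-system
  # Chart value overrides, equivalent to the contents of a config.yaml
  values:
    singleuser:
      memory:
        limit: 1G  # illustrative number
```

Without `createNamespace: true`, the `jupyter` namespace has to exist before the release is installed.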

**Push to Git Repository:**
Once you've configured these YAML files, push them to the Git repository that you bootstrapped FluxCD with. FluxCD detects the new commit and applies the changes to your cluster on its next reconciliation.


**Post-Deployment Steps:**

After pushing the YAML files to the Git repository that FluxCD was bootstrapped with, you can check the deployment and access JupyterHub using the following steps:

1. **Check Pod Status:**
Monitor the creation of pods by entering the following command in a separate terminal:

```bash
kubectl get pod --namespace <k8s-namespace>
```

_Replace `<k8s-namespace>` with the namespace you used for the deployment (`jupyter` if you kept the `targetNamespace` from the HelmRelease example)._

2. **Enable kubectl Autocompletion:**
To remain sane, we recommend enabling autocompletion for kubectl. Follow [the kubectl installation instructions for your platform](https://kubernetes.io/docs/tasks/tools/#kubectl) to find the shell autocompletion instructions. Additionally, set a default value for the `--namespace` flag with the following command:

```bash
kubectl config set-context $(kubectl config current-context) --namespace <k8s-namespace>
```

3. **Wait for Pods to Enter Running State:**
Wait for the hub and proxy pods to enter the Running state. You can use the following command to monitor their status:

```bash
kubectl get pod --namespace <k8s-namespace>
```

_Replace `<k8s-namespace>` with the namespace you used for the deployment._

Example output:

```
NAME READY STATUS RESTARTS AGE
hub-5d4ffd57cf-k68z8 1/1 Running 0 37s
proxy-7cb9bc4cc-9bdlp 1/1 Running 0 37s
```

4. **Find External IP:**
Once the pods are running, find the external IP that you can use to access JupyterHub. Run the following command until the `EXTERNAL-IP` of the `proxy-public` service is available:

```bash
kubectl --namespace <k8s-namespace> get service proxy-public
```

_Replace `<k8s-namespace>` with the namespace you used for the deployment._

Or, use the short form:

```bash
kubectl --namespace <k8s-namespace> get service proxy-public --output jsonpath='{.status.loadBalancer.ingress[].ip}'
```

5. **Access JupyterHub:**
To use JupyterHub, enter the external IP for the `proxy-public` service into a browser. JupyterHub is running with a default dummy authenticator, so entering any username and password combination will let you enter the hub.

Congratulations! Now that you have basic JupyterHub running, you can {ref}`extend it <extending-jupyterhub>` and {ref}`optimize it <optimization>` in many
ways to meet your needs.