Dave's instructions (working)
In this mini-course, we're going to take a thorough look at the basics of Kubernetes. You're going to:
- set up a load-balanced web application that mixes stateful and stateless containers
- learn the primitives that you'll use to configure your application on Kubernetes: Containers, Pods, Services, Deployments, ReplicaSets, DaemonSets
- leverage DigitalOcean's Kubernetes service so you can save time and get right to defining and rolling out an application
- learn about Kubernetes' architecture (which will give you the context to understand how all of this works)
- avoid common problems and misunderstandings that can make using Kubernetes painful
- a bunch more stuff
Kubernetes is a lego set that lets you build an infrastructure platform for your application or distributed services. You will never assign services to specific nodes again. The main conceptual thing it does is SCHEDULE WORK on your 'datacenter computer.'
It's about getting away from managing individual, physical servers, and moving towards defining your application in a way that lets it feed on shared resources (CPU time, RAM, persistent storage, cloud infrastructure abstractions like load balancers) in a datacenter somewhere far away.
- It orchestrates and manages the containers that your application is made of
- It takes you from "some software packaged into containers" to "a complex real-life application, highly available, properly abstracted into separately scalable components, and straightforward to operate."
- Pods manage containers
- ReplicaSets manage Pods
- Services expose pod processes to other pods and to the outside world
- ConfigMaps and Secrets configure Pods
- Labels are the plumbing that ties everything together
- Deployments manage the change from one set of containers to another (e.g. for a software release)
- DaemonSets and Jobs give you alternate ways to think about running processes (e.g. sidecar containers and one-offs)
- Provider-specific concepts like LoadBalancers let Kubernetes tie into cloud-specific services and abstractions
- Dynamic Volume Provisioning and StorageClass objects
- Datastore Replication using StatefulSets
Requirements:
- a basic understanding of Linux containers and why you might use them
- a DO account (link in description). You can get away without this, but you'll have to adapt some of the things we do to whatever Kubernetes setup you're using.
A containerized microservice architecture. Separately scalable components. Stateful components decoupled from stateless components.
Getting your application architecture to that point is 90% of the battle. Defining, running, tooling, and maintaining it on Kubernetes is the other 90%. Yay job security!
- see 01-kubectl-setup.md
- see 02-digitalocean-setup.md
Kubernetes is, like so many other things, a CRUD app. You can:
- Create resources (kubectl apply)
- Read resources (kubectl get)
- Update resources (kubectl edit/apply)
- Delete resources (kubectl delete)
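For example, here's that full CRUD cycle as a minimal sketch (the manifest name mypod.yaml is just a placeholder):
kubectl apply -f mypod.yaml    # Create
kubectl get pods               # Read
kubectl edit pods mypod        # Update (interactive)
kubectl delete -f mypod.yaml   # Delete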
Get help with:
kubectl help
kubectl get nodes (you've already seen this)
kubectl get componentstatuses (what the kubernetes master is made of)
kubectl describe nodes node-1
kubectl describe <resource-type> <obj-name> (the general form)
Default output is shortened.
- use -o wide to get full output (like -v for other CLI applications)
- use -o json or -o yaml to get full output in machine-parsable formats
- if you're piping output into other Unix tools, use --no-headers to avoid having to trash the first line
- you can also use jsonpath output and pass in a template, but I prefer working with the 'jq' tool on the command line
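For example, both approaches for pulling out pod names (the jq variant assumes you have jq installed):
# jsonpath template, built into kubectl
kubectl get pods -o jsonpath='{.items[*].metadata.name}'
# JSON output piped through jq
kubectl get pods -o json | jq -r '.items[].metadata.name'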
Kubernetes objects are:
- represented as JSON or YAML files
- you create these and then push them to the kubernetes API with kubectl
- (or receive them as output from kubectl after it hits the kubernetes API for you)
Regardless of the type of object you're dealing with, the basic syntax is:
kubectl apply -f obj.yaml
You can interactively edit using:
kubectl edit <resource-type> <obj-name>
This will:
- download a yaml representation of the object using the kubernetes API
- launch your $EDITOR on that textfile
- re-upload your saved changes using the Kubernetes API, which will re-evaluate that object definition and make the necessary changes.
If you've got a resource file (manifest) handy:
kubectl delete -f obj.yaml
or delete any object in the Kubernetes system by type and name:
kubectl delete <resource-type> <obj-name>
Lots of stuff (e.g. load balancing) is done with labels. Here's how to add a label to a pod object, overwrite that label's value, and then delete the label completely:
kubectl label pods davespod color=red
kubectl label --overwrite pods davespod color=blue
kubectl label pods davespod color-
Namespaces are like folders, used to organize objects. If you don't specify one with --namespace, kubectl uses the 'default' namespace.
For a more permanent namespace setting, use contexts (this modifies your Kubernetes config file at $HOME/.kube/config). Also useful if you're managing multiple Kubernetes clusters:
kubectl config set-context my-context --namespace=mystuff
kubectl config use-context my-context
kubectl get deployments --namespace=kube-system kube-dns
kubectl get deployments --namespace=kube-system kubernetes-dashboard (dashboard)
kubectl get services --namespace=kube-system kubernetes-dashboard (dashboard LB service)
Pods are the smallest deployable unit in Kubernetes. So you're never deploying 'a container.' You're at the very least deploying a POD that only consists of one container.
Containers that belong together should be grouped into a pod. It's tempting to think of your Kubernetes nodes in the way you thought about 'machines' in the past, but that's not correct. Pods are the closest analogue to the traditional machine concept:
Containers in the same pod get scheduled together on the same node and have access to:
- a shared network namespace (one IP address and port space)
- a shared hostname (UTS namespace)
- a shared IPC namespace
- shared storage volumes (when defined in the pod)
Containers in the same pod still get their own cgroups, and each runs its own filesystem from its own image.
The big difference is that different Pods, EVEN THOSE THAT ARE SCHEDULED ON THE SAME NODE, are isolated from each other. They don't share a hostname, IP, or IPC namespace.
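Here's a minimal sketch of that sharing in action: two containers in one pod, where a sidecar reaches the web container over localhost (the pod name and images here are just examples):
apiVersion: v1
kind: Pod
metadata:
  name: shared-pod-demo
spec:
  containers:
  - name: web
    image: nginx
  - name: sidecar
    image: busybox
    # shared network namespace: 'localhost' here is nginx in the other container
    command: ["sh", "-c", "while true; do wget -qO- http://localhost:80 >/dev/null; sleep 10; done"]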
Important note:
This Pod abstraction gives you something really important: it allows you to stick to the one-process-per-container philosophy with a minimum of hackery when using Docker and Kubernetes, because it adapts that one-process-per-container model to how real-life applications are architected. Sticking to one process per container will prevent all kinds of nasty problems that will bite you later on.
"Oh cool, my webapp, nginx container, redis service, and postgres all belong in the same pod!"
WRONG! Harshly criticize yourself and repent.
The question is: which components of your application do you need to scale together? Which processes absolutely have to be colocated on the same machine (shared storage or IPC) to function properly?
Here are some of the problems with the mega-pod above:
- All of these components will need to be scaled separately, and in different ways (a stateless webapp can easily scale horizontally -- run 20 extra copies of it! -- while your database might need to scale vertically -- just give more memory and CPU time to your single copy of the postgres process).
I'm going to show you how to manage pods, BUT: ReplicaSets (which we'll cover later) are a better way of actually managing your pods in production.
That said, all the concepts here (pod management, healthchecks, resource management) are building blocks that you need before you can go to more clever ways to deploy your application.
kubectl apply -f mypod.yaml
kubectl get pods
kubectl describe pods mypod
Delete it by name or from a manifest:
kubectl delete pods/kuard
kubectl delete -f mypod.yaml
This will let the pod shut down gracefully (default grace period is 30 secs). Container state is NOT saved (unless you're using PersistentVolumes).
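If you need a longer or shorter grace period, kubectl delete takes a flag for it:
kubectl delete pods kuard --grace-period=60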
kubectl port-forward kuard 8080:8080
We want to check if our pod services are healthy. Because processes can get into weird states, we want to make sure all of our containers in a pod are actually behaving normally. You get to define 'normally' in your pod manifest, and then that check will run against every container in your pod:
"Is this container alive?"
apiVersion: v1
kind: Pod
metadata:
  name: kuard
spec:
  containers:
  - image: gcr.io/kuar-demo/kuard-amd64:1
    name: kuard
    livenessProbe:
      httpGet:
        path: /healthy
        port: 8080
      initialDelaySeconds: 5
      timeoutSeconds: 1
      periodSeconds: 10
      failureThreshold: 3
    readinessProbe:
      httpGet:
        path: /ready
        port: 8080
    ports:
    - containerPort: 8080
      name: http
      protocol: TCP
kubectl apply -f kuard-pod-health.yaml
kubectl port-forward kuard 8080:8080
Point your browser to http://localhost:8080. Click the “Liveness Probe” tab. You should see a table that lists all of the probes that this instance of kuard has received. If you click the “fail” link on that page, kuard will start to fail health checks.
"Is this container ready to receive traffic?" Good for containers that have some bootstrapping/initialization to do before they're ready to receive traffic or respond to liveness checks.
Other probe types besides httpGet:
- tcpSocket probes (succeed if a TCP connection can be opened to the given port)
- exec probes (run a command inside the container; exit code 0 means healthy)
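As a sketch, the equivalent probe stanzas for those types look like this (the /tmp/healthy path is just an example):
livenessProbe:
  exec:
    command: ["cat", "/tmp/healthy"]
readinessProbe:
  tcpSocket:
    port: 8080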
As part of your pod definition, you'll want to define the resources you want to dedicate to each of your containers.
Add the following to your pod manifest (for each item in the containers list):
resources:
  requests:
    cpu: "500m"
    memory: "128Mi"
This requests a minimum amount of resources for each container, so that the scheduler knows how to pack these together on nodes.
Your containers will be guaranteed to get a minimum of these resources. If the node isn't filled to capacity, your containers can consume more resources (as much as is free on the node).
If you want to set resource limits for a container, add a limits section alongside requests:
resources:
  limits:
    cpu: "1000m"
    memory: "256Mi"
By default, the filesystems mounted to your containers don't survive container or pod restarts. This is fine for stateless web applications, but not exactly ideal for your database.
You'll define two things in the pod manifest:
- The volumes that this pod will have available to it.
- The actual mounts that you want each of your different containers to have.
apiVersion: v1
kind: Pod
metadata:
  name: kuard
spec:
  volumes:
  - name: "kuard-data"
    hostPath:
      path: "/var/lib/kuard"
  containers:
  - image: gcr.io/kuar-demo/kuard-amd64:1
    name: kuard
    volumeMounts:
    - mountPath: "/data"
      name: "kuard-data"
    ports:
    - containerPort: 8080
      name: http
      protocol: TCP
You'll most commonly be using a few types, although many more are available:
- emptyDir: a shared, empty directory mounted on all containers (see the sketch below)
- cloud-provider-specific network storage (EBS, DigitalOcean's block storage, which we'll be using)
- hostPath: mounts from the host filesystem into the containers
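Here's a minimal emptyDir sketch (the volume name and mount path are just examples):
spec:
  volumes:
  - name: scratch
    emptyDir: {}
  containers:
  - image: gcr.io/kuar-demo/kuard-amd64:1
    name: kuard
    volumeMounts:
    - mountPath: "/scratch"
      name: scratch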
Limitations of managing bare pods like this:
- if a node fails after a pod is scheduled, that pod is not rescheduled
- you can't launch multiple running versions of the same pod (you'd have to copy and rename)
- once a pod is scheduled onto a node, it doesn't move
Labels are simple key-value pairs: alphanumeric strings of up to 63 characters. Keys can contain a slash, in which case the part before the slash must be a DNS domain, e.g.
tutorialinux.com/foobar
Annotations are 'hints' that can be used by other tools.
kubectl get pods --selector="ver=2"
kubectl get pods --selector="app=bandicoot,ver=2"
kubectl get pods --selector="app in (alpaca,bandicoot)"
# Is 'canary' set to anything?
kubectl get deployments --selector="canary"
Selector operators: =, !=, in, notin, key existence (foobar), key absence (!foobar)
Selectors are used in manifests:
--selector="app=alpaca,ver=1"
becomes:
selector:
  app: alpaca
  ver: 1
Additional metadata. Annotation keys have the same format as label keys, with more focus on the 'namespace' part (the domain before the slash), e.g. kubernetes.io/change-cause
Usually higher-level semantic stuff, used for integration with tools, 3rd-party libraries, tracking reasons for doing things, adding other information that you don't want to filter by.
If you want to use it to filter, use a label instead.
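Here's what an annotation looks like in a manifest (the value is just an example):
metadata:
  annotations:
    kubernetes.io/change-cause: "update to latest nginx"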
Services are used to expose your pod deployments to other applications in your Kubernetes cluster (or the outside world). They use readiness checks to make sure that traffic is only sent to ready/healthy pods, and let you do all kinds of fancy things.
Create a service with
kubectl expose
e.g.
# Create the deployment
kubectl run alpaca-prod --image=gcr.io/kuar-demo/kuard-amd64:1 --replicas=3 --port=8080 --labels="ver=1,app=alpaca,env=prod"
# Create a service
kubectl expose deployment alpaca-prod
# Check it out
kubectl get services -o wide
Port forward and check out the service on localhost:48858
ALPACA_POD=$(kubectl get pods -l app=alpaca -o jsonpath='{.items[0].metadata.name}')
kubectl port-forward $ALPACA_POD 48858:8080
# DNS is created for you
# <service-name>.<namespace>.svc.<cluster-base-domain>
dig alpaca-prod.default.svc.cluster.local
A NodePort maps a certain port to a service, so that any Kubernetes node receiving traffic there forwards it to your service (which distributes it to a healthy pod).
Set spec.type for your service to NodePort.
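Here's a minimal NodePort sketch, assuming the alpaca-prod labels from the deployment above:
apiVersion: v1
kind: Service
metadata:
  name: alpaca-prod
spec:
  type: NodePort
  selector:
    app: alpaca
    env: prod
  ports:
  - port: 8080
    targetPort: 8080
    protocol: TCP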
Let's do this like professionals, and create a Cloud Load Balancer that points to our service.
Set spec.type for your service to LoadBalancer.
Check your EXTERNAL IP (the public DNS name or IP of your Cloud Load Balancer) and feel free to point DNS at it!
kubectl get services
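Here's a minimal sketch of a service definition using LoadBalancer (again assuming the alpaca-prod labels from above; the only real change from the NodePort version is spec.type and the port mapping):
apiVersion: v1
kind: Service
metadata:
  name: alpaca-prod
spec:
  type: LoadBalancer
  selector:
    app: alpaca
    env: prod
  ports:
  - port: 80
    targetPort: 8080
    protocol: TCP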
Each service also comes with 'endpoint' objects. So if you're writing software that knows it can hook into Kubernetes for service discovery, you can have your application do things like this, instead of hardcoding IPs or using DNS:
kubectl get endpoints alpaca-prod --watch
kubectl describe endpoints alpaca-prod
Great, you want to run a pod. But why might you want to run multiple copies of the same pod?
- redundancy / high availability
- scaling
- sharding (farming out work to multiple nodes)
Keyword: reconciliation loop (Houston, the pinky has left the teacup!) Fancy word for a thing that all infra-as-code systems do -- they look at the current state, compare it to the ideal state, and make necessary changes. Start a container, kill a container, whatever.
ReplicaSets let you do this:
kubectl scale replicasets kuard --replicas=4
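For reference, a minimal ReplicaSet manifest for kuard might look like this (the label choices here are assumptions):
apiVersion: apps/v1
kind: ReplicaSet
metadata:
  name: kuard
spec:
  replicas: 4
  selector:
    matchLabels:
      app: kuard
  template:
    metadata:
      labels:
        app: kuard
    spec:
      containers:
      - name: kuard
        image: gcr.io/kuar-demo/kuard-amd64:1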
Horizontal pod autoscaling is supported. (Vertical is coming)
- Cluster autoscaling (actually adding nodes, i.e. compute resources, to a cluster) is supported via 3rd-party tie-ins (depending on your cloud provider).
Here's some horizontal pod autoscaling:
kubectl autoscale rs kuard --min=2 --max=5 --cpu-percent=80
DaemonSets let you schedule a service across a set of nodes in a cluster (or the whole cluster). This is useful for running sidecar containers, e.g.
- log collectors/streamers
- "one pod per node" use cases
Like everything else in Kubernetes, pods are found/matched via labels.
Use DaemonSets when exactly ONE copy of a pod/service should run on each node. Use ReplicaSets when it's irrelevant where the copies run.
Use a nodeSelector in the pod template spec to limit which nodes you're running on, as in the sketch below.
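A minimal DaemonSet sketch with a nodeSelector (the fluentd image and the disktype label are assumptions):
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: fluentd
spec:
  selector:
    matchLabels:
      app: fluentd
  template:
    metadata:
      labels:
        app: fluentd
    spec:
      # only schedule onto nodes carrying this (hypothetical) label
      nodeSelector:
        disktype: ssd
      containers:
      - name: fluentd
        image: fluent/fluentd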
One-off tasks (as opposed to eternally-running services). A job creates pods that run until they succeed (i.e. exit with 0). Examples:
- database migrations
- batch jobs
- maintenance/cleanup tasks
- queue workers
kubectl run -i oneshot --image=gcr.io/kuar-demo/kuard-amd64:1 --restart=OnFailure \
  -- --keygen-enable --keygen-exit-on-complete --keygen-num-to-gen 10
--restart=OnFailure tells Kubernetes to create a Job. Everything after '--' is a command-line arg that's passed to the container binary.
Jobs automatically generate a unique label for their pods (since there are probably a lot of them, and manual naming would get hard/confusing). Specifying your own labels for job pods might lead to weird outcomes.
ConfigMaps hold configuration data (JSON objects, really). A bunch of different ways to configure things, crammed into one overloaded concept. A ConfigMap can provide:
- Environment variables that are visible to your container process
- Command-line arguments that are appended to your container 'command'
- A filesystem that can be mounted in your container (yeeeah...). Keynames become filenames, values become file contents.
ConfigMaps are UTF-8 plaintext. They can't hold binary data.
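A minimal ConfigMap sketch, plus one way to consume it as an environment variable (all names and values here are made up):
apiVersion: v1
kind: ConfigMap
metadata:
  name: my-config
data:
  extra-param: extra-value

# in a container spec:
env:
- name: EXTRA_PARAM
  valueFrom:
    configMapKeyRef:
      name: my-config
      key: extra-param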
See example manifests in this project.
See your configmaps with:
kubectl get configmaps
For sensitive data -- API keys, passwords, TLS or other encryption keys, etc.
Secrets can have one or more data elements associated with them (e.g. a TLSKEY secret can consist of both a .key and .crt file).
As of Kubernetes 1.7, this handles your secrets in a reasonable way: secrets are encrypted and only visible on nodes that run pods which need those secrets.
Your pods will probably access secrets by using a secrets volume (they can also use the API, of course). These are mounted on tmpfs volumes, and will not sync to disk (they're in-memory only). Each data element will be represented by a separate file in the mounted secrets volume.
Example: see the secret manifest in this project.
Secrets can store binary data. If you're defining secrets in YAML files, the data needs to be base64-encoded.
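A minimal sketch of a secret defined in YAML (the name and value are just examples):
apiVersion: v1
kind: Secret
metadata:
  name: db-credentials
type: Opaque
data:
  # base64-encoded; generate with: echo -n 'supersecret' | base64
  password: c3VwZXJzZWNyZXQ=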
kubectl get secrets
When you update secrets, the appropriate files and env vars are updated on running containers. It's up to your application or container process to re-read files and env vars as needed (if things fail, for example). Yes, this couples your application to kubernetes in a small but real way.
Deployments: rolling out new versions/releases of your code.
You get stuff for free! https://github.com/digitalocean/digitalocean-cloud-controller-manager/blob/master/docs/controllers/node/examples
kubectl get node k8s-worker-02 -o yaml
Make a service manifest with spec.type LoadBalancer and DO creates a load balancer for this app! Woohoo!
In production, you want to be using a log aggregation service for managing container logs. BUT:
Pod logs:
kubectl logs <pod-name>
Specific container:
kubectl logs <pod-name> -c containername
Previous run of a container (if the container process is failing or in a restart loop):
kubectl logs <pod-name> -c containername --previous
Launch an interactive bash shell in a pod:
kubectl exec -it <pod-name> -- bash
The more Kubernetes-friendly way of doing this is to create a job manifest:
- in the spec, use the container you'd like to execute a one-time task on
- use the 'command' feature and pass a list of command/argument strings
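A minimal sketch of such a job manifest (the image and command are hypothetical):
apiVersion: batch/v1
kind: Job
metadata:
  name: db-migrate
spec:
  template:
    spec:
      containers:
      - name: migrate
        image: mycompany/myapp:1.0   # hypothetical image
        command: ["python", "manage.py", "migrate"]
      # job pods must use OnFailure or Never, not Always
      restartPolicy: OnFailure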
Copy a file from/to a pod container (if you find yourself copying to a container, re-examine the life decisions that have led you to this point):
kubectl cp <pod-name>:/path/to/remote/file /path/to/local/file
##########################
vim dcohen-nginx-deployment.yml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment-example
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nginx-deployment-example
  template:
    metadata:
      labels:
        app: nginx-deployment-example
    spec:
      containers:
      - name: nginx
        image: library/nginx
kubectl create -f ./dcohen-nginx-deployment.yml
Are our cluster UI and RESTful resources reachable from the Internet?