Synchronize containerd and docker configuration with machine-controller (#136)

* Synchronize containerd and docker configuration with machine-controller

Signed-off-by: Waleed Malik <[email protected]>

* Bump machine-controller go dep

Signed-off-by: Waleed Malik <[email protected]>

* Bump boilerplate in prow job

Signed-off-by: Waleed Malik <[email protected]>

* Refactored code

Signed-off-by: Waleed Malik <[email protected]>

* Resolve merge conflicts

Signed-off-by: Waleed Malik <[email protected]>

* Resolve issues with using a single client

Signed-off-by: Waleed Malik <[email protected]>

* Refactored code

Signed-off-by: Waleed Malik <[email protected]>

* Update generated name for OSC, include namespace name in pattern

Signed-off-by: Waleed Malik <[email protected]>

* Minor refactor

* Bump machine-controller to v1.42.2

Signed-off-by: Waleed Malik <[email protected]>
ahmedwaleedmalik authored Jan 28, 2022
1 parent f0e0e01 commit 85df96e
Showing 41 changed files with 1,664 additions and 528 deletions.
13 changes: 5 additions & 8 deletions .gimps.yaml
@@ -14,16 +14,13 @@

# This is the configuration for https://github.com/xrstf/gimps.

importOrder: [std, external, envoy, kubermatic, kubernetes]
importOrder: [std, external, kubermatic, kubernetes]
sets:
- name: kubermatic
patterns:
- 'k8c.io/**'
- 'github.com/kubermatic/**'
- "k8c.io/**"
- "github.com/kubermatic/**"
- name: kubernetes
patterns:
- 'k8s.io/**'
- '*.k8s.io/**'
- name: envoy
patterns:
- 'github.com/envoyproxy/**'
- "k8s.io/**"
- "*.k8s.io/**"
1 change: 1 addition & 0 deletions .gitignore
@@ -5,3 +5,4 @@
_build
.vscode/
.local
.DS_Store
4 changes: 0 additions & 4 deletions .golangci.yml
@@ -42,7 +42,3 @@ linters:
linters-settings:
goimports:
local-prefixes: k8c.io/operating-system-manager

issues:
exclude:
- type name will be used as config.ConfigVarResolver by other packages, and that stutters; consider calling this VarResolver
2 changes: 1 addition & 1 deletion .prow.yaml
@@ -24,7 +24,7 @@ presubmits:
preset-goproxy: "true"
spec:
containers:
- image: quay.io/kubermatic-labs/boilerplate:v0.1.1
- image: quay.io/kubermatic-labs/boilerplate:v0.2.0
command:
- make
args:
44 changes: 23 additions & 21 deletions README.md
@@ -14,10 +14,12 @@ This project is experimental and currently a work-in-progress. **This is not sup

Currently this workflow has the following limitations/issues:

- Machine Controller expects **ALL** the supported OS plugins to exist and be ready. User might only be interested in a subset of the available operating systems.
- The `cloud-configs` are generated against pre-defined templates like [this](https://github.com/kubermatic/machine-controller/blob/master/pkg/userdata/ubuntu/provider.go#L133). This is not ideal because code changes are required to update those templates.
- Each cloud provider sets some limit for `user-data` size, machine won't be created in case of non-compliance. For example, at the time of writing this, AWS has set a [hard limit of 16KB](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/instancedata-add-user-data.html) for `user-data` size.
- Machine Controller expects **ALL** the supported user-data plugins to exist and be ready. User might only be interested in a subset of the available operating systems. For example, user might only want to work with `ubuntu`.
- The user-data plugins have templates defined [in-code](https://github.com/kubermatic/machine-controller/blob/master/pkg/userdata/ubuntu/provider.go#L133). Which is not ideal because code changes are required to update those templates.
- Managing configs for multiple cloud providers, OS flavors and OS versions, adds a lot of complexity and redundancy in machine-controller.
- Since the templates are defined in-code, there is no way for an end user to customize them to suit their use-cases.
- Each cloud provider sets some sort of limits for the size of `user-data`, machine won't be created in case of non-compliance. For example, at the time of writing this, AWS has set a [hard limit of 16KB](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/instancedata-add-user-data.html).
- Better support for air-gapped environments is required.

### Solution

@@ -27,33 +29,33 @@ Operating System Manager was created to solve the above mentioned issues. It dec

OSM introduces the following resources:

- OperatingSystemProfile: A resource that represents the details of each operating system.
- OperatingSystemConfig: A resource that contains the `cloud-configs` that are going to be used to bootstrap and provision the worker nodes.

`OperatingSystemConfig` are a subset of `OperatingSystemProfile` and are auto-generated by the `osc-controller` against a certain OSP and MachineDeployment.
For each cluster there are at least two OSC objects:
### OperatingSystemProfile

1. OSC for accessing the cluster; OSC is sent to the worker node via user-data and processed as a cloud-init or ignition config, in order to fetch the second OSC object.
2. OSC for provisioning the machine; OSC represents the actual cloud-config that provision the worker node.
Templatized resource that represents the details of each operating system. OSPs are immutable and default OSPs for supported operating systems are provided/installed automatically by kubermatic. End users can create custom OSPs as well to fit their own use-cases.

The created OSCs are processed by the controllers and they eventually generate a secret inside each user cluster. Which is then consumed by the worker nodes.
Its dedicated controller runs in the **seed** cluster, in user cluster namespace, and operates on the `OperatingSystemProfile` custom resource. It is responsible for installing the default OSPs in user-cluster namespace.

![Architecture](./docs/images/architecture-osm.png)
### OperatingSystemConfig

Immutable resource that contains the actual configurations that are going to be used to bootstrap and provision the worker nodes. It is a subset of OperatingSystemProfile, rendered using OperatingSystemProfile, MachineDeployment and flags

Its dedicated controller runs in the **seed** cluster, in user cluster namespace, and is responsible for generating the OSCs in **seed** and secrets in `cloud-init-settings` namespace in the user cluster.

### OperatingSystemProfile Controller

This controller runs in the `master` cluster and operates on the `OperatingSystemProfile` custom resource. It is responsible for creating the `OperatingSystemConfig` resources.
For each cluster there are at least two OSC objects:

1. **Bootstrap**: OSC used for initial configuration of machine and to fetch the provisioning OSC object.
2. **Provisioning**: OSC with the actual cloud-config that provision the worker node.

### OperatingSystemConfig Controller
OSCs are processed by controllers to eventually generate **secrets inside each user cluster**. These secrets are then consumed by worker nodes.

This controller runs in the `seed` cluster in the namespace of the user cluster and operates on the `OperatingSystemConfig` custom resource. It is responsible for generating `user-data` secret through the OperatingSystemConfig resource.
![Architecture](./docs/images/architecture-osm.png)

### Air-gapped Environment

This controller was designed by keeping air-gapped environments in mind. Customers can use their own VM images by creating custom OSP profiles to provision nodes in a cluster that doesn't have outbound internet access.

![Architecture](./docs/images/architecture-osm-air-gapped.png)

More work is being done to make it even easier to use OSM in air-gapped environments.

## Support
@@ -68,10 +70,6 @@ _The code and sample YAML files in the master branch of the operating-system-man

## Development

### Testing

Simply run `make test`

### Local Development

To run OSM locally:
@@ -81,6 +79,10 @@ To run OSM locally:
- Create relevant OperatingSystemProfile resources. Check [sample](./deploy/osps/default) for reference.
- Run `make run`

### Testing

Simply run `make test`

## Troubleshooting

If you encounter issues [file an issue][1] or talk to us on the [#kubermatic channel][6] on the [Kubermatic Slack][7].
53 changes: 37 additions & 16 deletions cmd/osm-controller/main.go
@@ -21,14 +21,13 @@ import (
"flag"
"fmt"
"net"
"os"
"path"
"strconv"
"strings"

"go.uber.org/zap"

clusterv1alpha1 "github.com/kubermatic/machine-controller/pkg/apis/cluster/v1alpha1"
"github.com/kubermatic/machine-controller/pkg/containerruntime"
"k8c.io/operating-system-manager/pkg/controllers/osc"
"k8c.io/operating-system-manager/pkg/controllers/osp"
osmv1alpha1 "k8c.io/operating-system-manager/pkg/crd/osm/v1alpha1"
@@ -40,7 +39,6 @@ import (
"k8s.io/client-go/kubernetes/scheme"
clientgoscheme "k8s.io/client-go/kubernetes/scheme"
"k8s.io/client-go/tools/clientcmd"
"k8s.io/client-go/util/homedir"
"k8s.io/klog"
ctrl "sigs.k8s.io/controller-runtime"
ctrlruntimeclient "sigs.k8s.io/controller-runtime/pkg/client"
@@ -57,8 +55,6 @@ type options struct {
externalCloudProvider bool
pauseImage string
initialTaints string
nodeHTTPProxy string
nodeNoProxy string
nodePortRange string
podCidr string
enableLeaderElection bool
@@ -71,6 +67,16 @@ type options struct {
metricsAddress string
workerHealthProbeAddress string
workerMetricsAddress string

// Flags for configuring CRI
nodeInsecureRegistries string
nodeRegistryMirrors string
nodeRegistryCredentialsSecret string
nodeContainerdRegistryMirrors containerruntime.RegistryMirrorsFlags

// Flags for proxy
nodeHTTPProxy string
nodeNoProxy string
}

func init() {
@@ -95,12 +101,16 @@ func main() {
flag.StringVar(&opt.clusterDNSIPs, "cluster-dns", "10.10.10.10", "Comma-separated list of DNS server IP address.")
flag.StringVar(&opt.pauseImage, "pause-image", "", "pause image to use in Kubelet.")
flag.StringVar(&opt.initialTaints, "initial-taints", "", "taints to use when creating the node.")
flag.StringVar(&opt.nodeHTTPProxy, "node-http-proxy", "", "If set, it configures the 'HTTP_PROXY' & 'HTTPS_PROXY' environment variable on the nodes.")
flag.StringVar(&opt.nodeNoProxy, "node-no-proxy", ".svc,.cluster.local,localhost,127.0.0.1", "If set, it configures the 'NO_PROXY' environment variable on the nodes.")

flag.StringVar(&opt.podCidr, "pod-cidr", "172.25.0.0/16", "The network ranges from which POD networks are allocated")
flag.StringVar(&opt.nodePortRange, "node-port-range", "30000-32767", "A port range to reserve for services with NodePort visibility")
flag.StringVar(&opt.kubeletFeatureGates, "node-kubelet-feature-gates", "RotateKubeletServerCertificate=true", "Feature gates to set on the kubelet")

flag.StringVar(&opt.nodeHTTPProxy, "node-http-proxy", "", "If set, it configures the 'HTTP_PROXY' & 'HTTPS_PROXY' environment variable on the nodes.")
flag.StringVar(&opt.nodeNoProxy, "node-no-proxy", ".svc,.cluster.local,localhost,127.0.0.1", "If set, it configures the 'NO_PROXY' environment variable on the nodes.")
flag.StringVar(&opt.nodeInsecureRegistries, "node-insecure-registries", "", "Comma separated list of registries which should be configured as insecure on the container runtime")
flag.StringVar(&opt.nodeRegistryMirrors, "node-registry-mirrors", "", "Comma separated list of Docker image mirrors")

flag.StringVar(&opt.healthProbeAddress, "health-probe-address", "127.0.0.1:8085", "The address on which the liveness check on /healthz and readiness check on /readyz will be available")
flag.StringVar(&opt.metricsAddress, "metrics-address", "127.0.0.1:8080", "The address on which Prometheus metrics will be available under /metrics")

@@ -120,6 +130,7 @@ func main() {

opt.kubeconfig = flag.Lookup("kubeconfig").Value.(flag.Getter).Get().(string)

// Parse flags
parsedClusterDNSIPs, err := parseClusterDNSIPs(opt.clusterDNSIPs)
if err != nil {
klog.Fatalf("invalid cluster dns specified: %v", err)
@@ -130,6 +141,19 @@ func main() {
klog.Fatalf("invalid kubelet feature gates specified: %v", err)
}

containerRuntimeOpts := containerruntime.Opts{
ContainerRuntime: opt.containerRuntime,
ContainerdRegistryMirrors: opt.nodeContainerdRegistryMirrors,
InsecureRegistries: opt.nodeInsecureRegistries,
PauseImage: opt.pauseImage,
RegistryMirrors: opt.nodeRegistryMirrors,
RegistryCredentialsSecret: opt.nodeRegistryCredentialsSecret,
}
containerRuntimeConfig, err := containerruntime.BuildConfig(containerRuntimeOpts)
if err != nil {
klog.Fatalf("failed to generate container runtime config: %v", err)
}

logger, err := zap.NewProduction()
if err != nil {
klog.Fatal(err)
@@ -218,6 +242,8 @@ func main() {
opt.nodeNoProxy,
opt.nodePortRange,
opt.podCidr,
containerRuntimeConfig,
opt.nodeRegistryCredentialsSecret,
parsedKubeletFeatureGates,
); err != nil {
klog.Fatal(err)
@@ -238,7 +264,10 @@ func createManager(opt *options) (manager.Manager, error) {
HealthProbeBindAddress: opt.healthProbeAddress,
MetricsBindAddress: opt.metricsAddress,
Port: 9443,
Namespace: opt.namespace,
}

if opt.workerClusterKubeconfig != "" {
options.Namespace = opt.namespace
}

mgr, err := manager.New(config.GetConfigOrDie(), options)
@@ -292,11 +321,3 @@ func parseKubeletFeatureGates(s string) (map[string]bool, error) {
}
return featureGates, nil
}

// getKubeConfigPath returns the path to the kubeconfig file.
func getKubeConfigPath() string {
if os.Getenv("KUBECONFIG") != "" {
return os.Getenv("KUBECONFIG")
}
return path.Join(homedir.HomeDir(), ".kube/config")
}
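
Note on the `createManager` hunk above: the manager's `Namespace` option is now set only when a worker cluster kubeconfig is given, so without one the manager is not restricted to a single namespace. The following is a minimal sketch of that decision in isolation. The trimmed-down `options` struct and the `managerNamespace` helper are hypothetical stand-ins for illustration, not code from this commit.

```go
package main

import "fmt"

// options mirrors only the two fields of the real options struct that
// matter for this decision. Everything else is omitted (assumption).
type options struct {
	namespace               string
	workerClusterKubeconfig string
}

// managerNamespace is a hypothetical helper. It returns the namespace the
// manager should be restricted to. An empty string means the manager is
// not limited to one namespace.
func managerNamespace(opt options) string {
	if opt.workerClusterKubeconfig != "" {
		return opt.namespace
	}
	return ""
}

func main() {
	// No worker cluster kubeconfig: the namespace restriction is skipped.
	fmt.Printf("%q\n", managerNamespace(options{namespace: "cluster-abc"}))

	// Worker cluster kubeconfig set: the manager is scoped to the namespace.
	fmt.Printf("%q\n", managerNamespace(options{
		namespace:               "cluster-abc",
		workerClusterKubeconfig: "/tmp/worker.kubeconfig",
	}))
}
```

In the actual diff the same condition is applied directly to `options.Namespace` before `manager.New` is called.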
37 changes: 17 additions & 20 deletions deploy/osps/default/osp-amzn2.yaml
@@ -16,11 +16,11 @@ apiVersion: operatingsystemmanager.k8c.io/v1alpha1
kind: OperatingSystemProfile
metadata:
name: osp-amzn2
namespace: cloud-init-settings
namespace: kube-system
spec:
osName: "amzn2"
osVersion: "2.0"
version: "v0.1.0"
version: "v0.1.1"
supportedCloudProviders:
- name: "aws"
supportedContainerRuntimes:
@@ -32,23 +32,7 @@ spec:
inline:
encoding: b64
data: |
version = 2
[metrics]
address = "127.0.0.1:1338"
[plugins]
[plugins."io.containerd.grpc.v1.cri"]
[plugins."io.containerd.grpc.v1.cri".containerd]
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes]
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc]
runtime_type = "io.containerd.runc.v2"
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc.options]
SystemdCgroup = true
[plugins."io.containerd.grpc.v1.cri".registry]
[plugins."io.containerd.grpc.v1.cri".registry.mirrors]
[plugins."io.containerd.grpc.v1.cri".registry.mirrors."docker.io"]
endpoint = ["https://registry-1.docker.io"]
{{ .ContainerRuntimeConfig}}
templates:
containerRuntimeInstallation: |-
mkdir -p /etc/systemd/system/containerd.service.d
Expand Down Expand Up @@ -79,7 +63,7 @@ spec:
inline:
encoding: b64
data: |-
{"exec-opts":["native.cgroupdriver=systemd"],"storage-driver":"overlay2","log-driver":"json-file","log-opts":{"max-file":"5","max-size":"100m"}}
{{ .ContainerRuntimeConfig}}
templates:
containerRuntimeInstallation: |-
mkdir -p /etc/systemd/system/containerd.service.d /etc/systemd/system/docker.service.d
@@ -503,6 +487,19 @@ spec:
- "{{ . }}"
{{- end }}
clusterDomain: cluster.local
{{- /* containerLogMaxSize and containerLogMaxFiles have no effect for docker */}}
{{- if ne .ContainerRuntime "docker" }}
{{- if .ContainerLogMaxSize }}
containerLogMaxSize: {{ .ContainerLogMaxSize }}
{{- else }}
containerLogMaxSize: 100Mi
{{- end }}
{{- if .ContainerLogMaxFiles }}
containerLogMaxFiles: {{ .ContainerLogMaxFiles }}
{{- else }}
containerLogMaxFiles: 5
{{- end }}
{{- end }}
featureGates:
{{- if .KubeletFeatureGates -}}
{{ range $key, $val := .KubeletFeatureGates }}
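
Note on the kubelet hunk above: the added template lines emit `containerLogMaxSize` and `containerLogMaxFiles` only for runtimes other than docker, falling back to `100Mi` and `5` when no value is supplied. The sketch below renders just that fragment with Go's `text/template` to show the resulting output. The `kubeletLogData` struct, its string field types, and the `renderKubeletLogSettings` helper are assumptions made for illustration. The real template data type in OSM may differ, and the YAML indentation of the original file is not reproduced here.

```go
package main

import (
	"fmt"
	"os"
	"strings"
	"text/template"
)

// kubeletLogData is a hypothetical stand-in for the data handed to the
// OSP template. Only the fields used by the fragment are included.
type kubeletLogData struct {
	ContainerRuntime     string
	ContainerLogMaxSize  string
	ContainerLogMaxFiles string
}

// kubeletLogTemplate is the fragment added in this commit, without the
// original YAML indentation.
const kubeletLogTemplate = `clusterDomain: cluster.local
{{- /* containerLogMaxSize and containerLogMaxFiles have no effect for docker */}}
{{- if ne .ContainerRuntime "docker" }}
{{- if .ContainerLogMaxSize }}
containerLogMaxSize: {{ .ContainerLogMaxSize }}
{{- else }}
containerLogMaxSize: 100Mi
{{- end }}
{{- if .ContainerLogMaxFiles }}
containerLogMaxFiles: {{ .ContainerLogMaxFiles }}
{{- else }}
containerLogMaxFiles: 5
{{- end }}
{{- end }}
`

// renderKubeletLogSettings parses and executes the fragment for one input.
func renderKubeletLogSettings(data kubeletLogData) (string, error) {
	tmpl, err := template.New("kubelet").Parse(kubeletLogTemplate)
	if err != nil {
		return "", err
	}
	var sb strings.Builder
	if err := tmpl.Execute(&sb, data); err != nil {
		return "", err
	}
	return sb.String(), nil
}

func main() {
	inputs := []kubeletLogData{
		{ContainerRuntime: "docker"},
		{ContainerRuntime: "containerd"},
		{ContainerRuntime: "containerd", ContainerLogMaxSize: "50Mi", ContainerLogMaxFiles: "3"},
	}
	for _, in := range inputs {
		out, err := renderKubeletLogSettings(in)
		if err != nil {
			fmt.Fprintln(os.Stderr, err)
			os.Exit(1)
		}
		fmt.Printf("--- runtime=%s ---\n%s", in.ContainerRuntime, out)
	}
}
```

For docker only the `clusterDomain` line is rendered. For containerd both log settings appear, using the defaults unless values are provided.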
37 changes: 17 additions & 20 deletions deploy/osps/default/osp-centos.yaml
@@ -16,11 +16,11 @@ apiVersion: operatingsystemmanager.k8c.io/v1alpha1
kind: OperatingSystemProfile
metadata:
name: osp-centos
namespace: cloud-init-settings
namespace: kube-system
spec:
osName: "centos"
osVersion: "7.7"
version: "v0.1.0"
version: "v0.1.1"
supportedCloudProviders:
- name: "aws"
- name: "azure"
@@ -39,23 +39,7 @@ spec:
inline:
encoding: b64
data: |
version = 2
[metrics]
address = "127.0.0.1:1338"
[plugins]
[plugins."io.containerd.grpc.v1.cri"]
[plugins."io.containerd.grpc.v1.cri".containerd]
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes]
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc]
runtime_type = "io.containerd.runc.v2"
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc.options]
SystemdCgroup = true
[plugins."io.containerd.grpc.v1.cri".registry]
[plugins."io.containerd.grpc.v1.cri".registry.mirrors]
[plugins."io.containerd.grpc.v1.cri".registry.mirrors."docker.io"]
endpoint = ["https://registry-1.docker.io"]
{{ .ContainerRuntimeConfig}}
templates:
containerRuntimeInstallation: |-
yum install -y yum-utils
Expand Down Expand Up @@ -91,7 +75,7 @@ spec:
inline:
encoding: b64
data: |-
{"exec-opts":["native.cgroupdriver=systemd"],"storage-driver":"overlay2","log-driver":"json-file","log-opts":{"max-file":"5","max-size":"100m"}}
{{ .ContainerRuntimeConfig}}
templates:
containerRuntimeInstallation: |-
yum install -y yum-utils
@@ -532,6 +516,19 @@ spec:
- "{{ . }}"
{{- end }}
clusterDomain: cluster.local
{{- /* containerLogMaxSize and containerLogMaxFiles have no effect for docker */}}
{{- if ne .ContainerRuntime "docker" }}
{{- if .ContainerLogMaxSize }}
containerLogMaxSize: {{ .ContainerLogMaxSize }}
{{- else }}
containerLogMaxSize: 100Mi
{{- end }}
{{- if .ContainerLogMaxFiles }}
containerLogMaxFiles: {{ .ContainerLogMaxFiles }}
{{- else }}
containerLogMaxFiles: 5
{{- end }}
{{- end }}
featureGates:
{{- if .KubeletFeatureGates -}}
{{ range $key, $val := .KubeletFeatureGates }}