
Reviving service deployment docs #631

Draft: wants to merge 2 commits into base: master

BernardZhao
Member

I decided to put some work into updating and reviving what @keur started in #508, especially since @pxhanus could use some documentation on how to deploy the discordbridge. These are preliminary changes, and I still need a lot of help describing the parts of the process I don't understand as well.

  • How should I go about formatting the Markdown? I found the existing file to be well spaced, but make lint didn't seem to do the trick.
  • Are there parts of Keur's original documentation that aren't true anymore? I tried not to cut anything unless I was 100% sure it was no longer relevant, so I ended up cutting nothing.

Comment on lines +62 to +70
Finally, we have to include a `Jenkinsfile` to the repository, so that Jenkins
knows how we want it to go about deploying the service. In this case, it's just
specifying that we want it to go through the pipeline.

```
servicePipeline(
    upstreamProjects: ['ocf/dockers/master'],
)
```
Member Author

This part is a little confusing to me. I described my understanding of it, but in other projects an empty upstreamProjects array also seemed to work fine. Perhaps @jvperrin could chime in on this?

Member

Yeah, that's a great question! All this does is kick off a service's build pipeline whenever the upstream one listed here succeeds. You see ocf/dockers/master in a number of services because they are both (1) built off the OCF base docker images and (2) able to be redeployed smoothly. These redeploys are mainly done so that security updates are pulled into service containers in a timely manner, so it doesn't have to be a daily thing but could probably be weekly or something instead.

For instance, ocfweb has both ocflib and dockers as upstreams because it depends on the latest versions of both of those repos, and because redeploying after either one finishes building successfully can be done smoothly. slackbridge, on the other hand, doesn't have any upstream projects listed despite being built off the OCF base image, because every time it was rebuilt it caused large join/quit spam on IRC, which wasn't a pleasant experience.

Essentially, to figure out what's listed here for a new service, you only need to know what base image the service is using, and how disruptive a deploy would be.
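
To make that rule of thumb concrete, here is a minimal shell sketch that prints a `Jenkinsfile` for a given list of upstream jobs. `render_jenkinsfile` is a hypothetical helper, not part of any OCF tooling, and the only job path taken from this thread is `ocf/dockers/master`.

```shell
#!/bin/sh
# Hypothetical helper: print a Jenkinsfile that calls servicePipeline with
# the given upstream jobs. Passing no arguments yields an empty list.
render_jenkinsfile() {
    list=""
    for job in "$@"; do
        # Join entries as 'a', 'b', ...
        if [ -n "$list" ]; then
            list="$list, "
        fi
        list="$list'$job'"
    done
    printf 'servicePipeline(\n    upstreamProjects: [%s],\n)\n' "$list"
}

# Built off the OCF base image and cheap to redeploy: rebuild after dockers.
render_jenkinsfile 'ocf/dockers/master'

# Disruptive to redeploy (the slackbridge case): no upstreams at all.
render_jenkinsfile
```

A service that also depends on another repo would pass a second job path. The exact path for ocflib is not given in this thread, so it would need to be checked in Jenkins first.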

Comment on lines 319 to 332
## Wrapping up

Now we have all the necessary configuration to deploy our service. To see if
everything works, we will deploy the service manually. On `supernova`, first
run `kinit`. This will obtain a [[kerberos|doc staff/backend/kerberos]] ticket
giving us access to the Kubernetes cluster. Now run

```
kubectl create namespace <myapp>
kubectl apply -n <myapp> -f <myapp>.yaml
```

You can run `kubectl -n <myapp> get all` to watch Kubernetes create your `Service`
and `Deployment` objects.
Member Author

This isn't relevant with Jenkins, right? I thought this might be nice to include as example commands if someone was planning to deploy to dev-kubernetes.

Member

I don't believe it's necessary because Jenkins will create the app namespace and all that when deploying, and I think it also does the service/deployment creation parts too but I'm not too familiar with that side of things.

Maybe just mention before this that it's something you can do with your dev namespace to test out running the service somewhere?
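
A minimal sketch of what that dev-namespace test run could look like, reusing the two `kubectl` commands from the quoted section. The namespace `dev-myuser`, the file name `myapp.yaml`, and the `run`/`deploy_to_namespace` helpers are all assumptions made for illustration. Namespace creation is left out on the assumption that the dev namespace already exists.

```shell
#!/bin/sh
# Echo commands instead of running them when DRY_RUN is set.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Apply a manifest to an existing namespace, then list what was created.
# $1: namespace (hypothetical dev namespace), $2: path to the manifest.
deploy_to_namespace() {
    run kubectl apply -n "$1" -f "$2" &&
        run kubectl -n "$1" get all
}

# Preview only. Unset DRY_RUN (after running kinit) to apply for real.
DRY_RUN=1
deploy_to_namespace dev-myuser myapp.yaml
```

The dry-run default is deliberate: the commands can be read and checked before anything touches the cluster.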

Comment on lines +355 to +373
## Reference Material

The best way to get services up-and-running is to read code for existing services.

[templates][templates-deploy]: The simplest possible deployment. `nginx` server with static content.

[kanboard][kanboard-deploy]: Project management software that makes use of `ldap` and mounts `nfs`.

[mastodon][mastodon-deploy] (Advanced): Applies custom patches, uses `ldap`, mounts `nfs`, has pods for `redis`, `sidekiq`, and `http-streaming`.

[kafka][kafka-deploy] (Advanced): Runs a `kafka` cluster inside of Kubernetes.

[templates]: https://templates.ocf.berkeley.edu
[dockerhub]: https://hub.docker.com
[puppet]: https://github.com/ocf/puppet/tree/master/modules/ocf_kubernetes/files/persistent-volume-nfs
[templates-deploy]: https://github.com/ocf/templates/tree/master/kubernetes
[kanboard-deploy]: https://github.com/ocf/kanboard/tree/master/kubernetes
[mastodon-deploy]: https://github.com/ocf/mastodon/tree/master/kubernetes
[kafka-deploy]: https://github.com/ocf/kafka/tree/master/kubernetes
Member Author

We have a lot more services now, so this may not be that accurate. What are the best examples at the OCF today?

Member

templates is still a really good simple example, and I think mastodon is a good complex example, but I'm not sure about the others. I do know kanboard isn't running any more (gives a 503), and I don't really know the state of kafka.

Member

@Baisang left a comment

Thanks for restarting this, OCF has really needed this for a while. It made sense to me as someone who has deployed services on OCF k8s before.

Only big change I think you should make is regarding the persistent volumes. I would try to direct people toward using the nfs-provisioner when possible. Ideally we should have people use ceph but I don't know if that is fully operational yet.

here, just include whatever your service needs.

[[Jenkins|doc staff/backend/jenkins]] will try to build that container,
and then send the image to the OCF Docker server. This has to be specified in
Member

"Docker repository" is the word you're looking for

```

**NOTE**: While Jenkins is supposed to scan all of the repositories in the organization,
it won't necessarily be triggered if you transfter a repository over. Trigger the job
Member

transfer

permissions, you should be able to transfer, fork, or just create a
repository in the org.

### Docker & Jenkins
Member

It would be better for people who want to deploy services to just clone or copy these files somehow. They're just going to copy paste from this doc anyway. I would wager that the vast majority of people who want to deploy services on kubernetes don't need to know how this works, it just has to "work." (of course, we should provide docs somewhere that explain how it works anyway)

You could have them run some script that takes in their proposed service name and it could even generate a basic template for those kubernetes yaml files too.

Probably out of scope for this PR, but something to consider for the future.

Member

GitHub has template repositories that can help solve the issue of copying files, so we can create one for Kubernetes deployments.

opt for 3 instances to handle failover.

The `containers` resource is where Kubernetes looks to obtain `docker` images
to deploy. For production services this will _always_ be the OCF docker server:
Member

Docker repository

<app-nfs-pv.yaml>. In this example we'll create 30 gigabytes of readable and
writeable storage.

```
Member

Please have people try to use https://github.com/ocf/nfs-provisioner instead. We set it up so you no longer need to make puppet changes to create PVs.

Member

It's a lot easier. All you need to do is make a PVC like in the README with the correct storageClassName.
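
As a rough sketch of that approach, the helper below prints a `PersistentVolumeClaim` manifest. Everything specific in it is an assumption: `render_pvc` is a hypothetical helper, `example-nfs-class` is a placeholder for the real `storageClassName` (which is in the nfs-provisioner README), and `ReadWriteMany` is only a guess at the access mode suggested by "readable and writeable" NFS storage.

```shell
#!/bin/sh
# Hypothetical helper: print a PVC manifest.
# $1: claim name, $2: requested size, $3: storage class name.
render_pvc() {
    cat <<EOF
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: $1
spec:
  accessModes:
    - ReadWriteMany
  storageClassName: $3
  resources:
    requests:
      storage: $2
EOF
}

# 30 gigabytes, as in the quoted doc. The class name is a placeholder.
render_pvc myapp-data 30Gi example-nfs-class
```

The output can be pasted into the service's manifest or piped to `kubectl apply -f -`, with no puppet change involved.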

Now run

```
kubectl create namespace <myapp>
Member

Technically you can include the yaml to create this within the .yaml file
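
One way to picture this: a `Namespace` object is just another YAML document, so it can sit at the top of the same file, separated by `---`. The sketch below builds such a combined stream. `prepend_namespace` is a hypothetical helper, and whether this is preferable to letting Jenkins create the namespace is left open here.

```shell
#!/bin/sh
# Hypothetical helper: print a Namespace object followed by the contents of
# an existing manifest, as one multi-document YAML stream.
# $1: namespace name, $2: path to the existing manifest.
prepend_namespace() {
    cat <<EOF
apiVersion: v1
kind: Namespace
metadata:
  name: $1
---
EOF
    cat "$2"
}

# Example use (not executed here):
#   prepend_namespace myapp myapp.yaml | kubectl apply -n myapp -f -
```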

name: <myapp-data>
```


Member

Eventually you'll want to add a section for adding keycloak auth and handling secrets, but that's out of scope.

- containerPort: 8000
```

The last object we need to create for the Templates service is `Ingress`. We
Member

Does adding the name to https://github.com/ocf/puppet/blob/master/modules/ocf_kubernetes/manifests/master/loadbalancer.pp still apply? If so we need to include that step.


DOCKER_REVISION ?= testing-$(USER)
DOCKER_TAG = docker-push.ocf.berkeley.edu/templates:$(DOCKER_REVISION)
RANDOM_PORT := $(shell expr $$(( 8000 + (`id -u` % 1000) + 1 )))
Member

For what it's worth, this + 1 is to space out ports so you can run multiple services under one user without them colliding. I think we should probably come up with a better port allocation strategy, or just use the same port for the same user each time and forgo this method, as it's pretty confusing if you don't understand it.
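
The arithmetic in `RANDOM_PORT` is easier to follow outside of Make. This sketch reproduces the same expression in plain shell. `dev_port` is a hypothetical name, and treating the trailing `+ 1` as a per-service offset is an interpretation of the comment above, not something the Makefile itself states.

```shell
#!/bin/sh
# Same expression as the Makefile: 8000 + (uid % 1000) + offset.
# $1: numeric user id, $2: per-service offset (defaults to 1).
dev_port() {
    uid=$1
    offset=${2:-1}
    echo $(( 8000 + (uid % 1000) + offset ))
}

# Port for the current user with the default offset of 1.
dev_port "$(id -u)"

# A second service under the same user would need a different offset.
dev_port "$(id -u)" 2
```

For example, a uid of 1234 gives 8000 + 234 + 1 = 8235. Note that two users whose uids are equal modulo 1000 would get the same port, which supports the case for a clearer allocation scheme.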

Comment on lines +58 to +60
The important targets to pay attention here are `cook-image` and `push-image`,
Jenkins will try to run these as part of the deployment pipeline that continuously
deploys to Kubernetes.
Member

`make test` is also a good one to mention. If it exists, it's automatically run as part of the service pipeline, so that's a great place to put any tests that a service may have.

Member

@jvperrin left a comment

Thanks for reviving this, I think this is really great to have and certainly something there should be docs on!

This `HOWTO` uses one of the OCF's simplest services as an example:
[Templates][templates]. Templates is service used internally by OCF staff
serving 'copy-pasteable' email templates. You can use [[git|doc
staff/backend/git]] to `clone` this [repo](https://github.com/ocf/templates/blob/master/kubernetes/templates.yml.erb)
Member

nit: I'd link to the root of the repo instead of just the kubernetes config

Comment on lines +22 to +24
[OCF's Github organization](https://github.com/ocf). If you have the correct
permissions, you should be able to transfer, fork, or just create a
repository in the org.
Member

I'd assume that people reading this guide are less experienced at creating services and likely don't have these permissions. Instead, I'd suggest asking in #rebuild or somewhere similar to have a repo created (or creating it yourself if you already have the permissions).

As a tangent to this, I am aiming to have this be done through https://github.com/ocf/terraform at some point in the future, as it has a GitHub provider and that would be a nice way to not rely on individual permissions but instead define the state of the ocf org in version-controlled code. I think that would make this part nicer, as creating a new repo would then involve creating and merging a PR and essentially nothing else.


Finally, we have to include a `Jenkinsfile` to the repository, so that Jenkins
knows how we want it to go about deploying the service. In this case, it's just
specifying that we want it to go through the pipeline.
Member

For this, you might want to link to https://github.com/ocf/shared-pipeline/blob/4537c1ce537a6beffa1075c719e015cd60cc54eb/vars/servicePipeline.groovy if anyone is curious what the service pipeline actually does. It's not incredibly complicated, but I could also see not wanting to bog this down too much with implementation details like this. I think it could be useful for someone wanting to know what commands are run as part of the build or what runs in parallel though (tests and cooking the docker image for instance).


**NOTE**: While Jenkins is supposed to scan all of the repositories in the organization,
it won't necessarily be triggered if you transfter a repository over. Trigger the job
in the Jenkins UI manually if you don't see it come up.
Member

To do this, you don't build any one job. Instead, go to https://jenkins.ocf.berkeley.edu/job/ocf/, sign in, and click "Scan Organization Now" in the left sidebar. I think it would be good to recommend pushing a new commit to master after transferring the repo, and I believe that should trigger a pipeline creation (needs testing though to be sure).

Comment on lines +107 to +108
application uses _inside_ of the Docker container. In the case of templates we
bind to port `8000`. Here is the `Service` configuration for templates with all
Member

Might be good to briefly mention here that the ports are not the same between the two (80 on both for instance) because stuff running within the container typically can't bind to anything below 1024. It's usually not running as root or a privileged user, as per security best practices.

Not sure that'll be clear to someone setting up a service for the first time otherwise.
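
A small sketch of the distinction being described, assuming the usual Linux default where ports below 1024 need root or an equivalent capability to bind. `is_privileged_port` is a hypothetical helper; 80 and 8000 are the two ports mentioned in this thread.

```shell
#!/bin/sh
# Ports below 1024 are privileged by default on Linux, so a process running
# as a non-root user inside the container normally cannot bind them.
is_privileged_port() {
    [ "$1" -lt 1024 ]
}

for port in 80 8000; do
    if is_privileged_port "$port"; then
        echo "$port: privileged, expose it on the Service side"
    else
        echo "$port: unprivileged, fine for the process inside the container"
    fi
done
```

This is why the service can listen on 8000 inside the container while clients still reach it on port 80.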

```

This section can be a bit daunting, but we'll go through it step-by-step. Fill
in `<app-name>` and `<docker-port>` with the same name you used in your
Member

Is this <myapp> instead of <app-name>?

Comment on lines +177 to +179
millicores for CPU units, so 1 core = 1000m). Do note that every instance of
the application gets these resources, so with _N_ instances you are using _N *
limits_.
Member

++ this is quite good to specify, as is the warning below about being stuck in a pending state
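
A quick worked example of the _N * limits_ point. The numbers are made up for illustration: 3 instances (the replica count suggested elsewhere in the doc), each limited to a hypothetical 500m of CPU and 512Mi of memory.

```shell
#!/bin/sh
# Total resources claimed by a Deployment: replicas * per-instance limit.
# $1: number of replicas, $2: limit per instance (same unit as the result).
total_limit() {
    echo $(( $1 * $2 ))
}

echo "CPU: $(total_limit 3 500)m"      # prints "CPU: 1500m"
echo "Memory: $(total_limit 3 512)Mi"  # prints "Memory: 1536Mi"
```

So three modest-looking instances already reserve one and a half cores, which is worth checking against what the cluster can actually schedule.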


### OCF DNS

If your application at any point uses OCF-specific DNS, like using the hostname
Member

I'd probably mention the term "search domain" somewhere in here as that's the keyword that can help to find more information on the internet about this behavior.
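
For readers new to the term, this sketch shows roughly what a search domain does: the resolver tries a short name with each search domain appended. It is deliberately simplified (it ignores the `ndots` option), `candidates` is a hypothetical helper, and both `somehost` and the use of `ocf.berkeley.edu` as the search domain are assumptions for illustration.

```shell
#!/bin/sh
# List the names a resolver would try for a hostname, given search domains.
# Simplified model: a trailing dot means "already fully qualified", and
# otherwise each search domain is tried before the bare name.
# $1: hostname, remaining arguments: search domains in order.
candidates() {
    name=$1
    shift
    case $name in
        *.) echo "$name"; return ;;
    esac
    for domain in "$@"; do
        echo "$name.$domain"
    done
    echo "$name"
}

candidates somehost ocf.berkeley.edu
```

If the environment a service runs in has a different search list than a regular OCF host, a short name that works on one may not resolve on the other, which is the behavior the keyword helps to look up.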

Comment on lines +373 to +375
[templates]: https://templates.ocf.berkeley.edu
[dockerhub]: https://hub.docker.com
[puppet]: https://github.com/ocf/puppet/tree/master/modules/ocf_kubernetes/files/persistent-volume-nfs
Member

nit: For these 3 links that are referred to earlier on in the doc, I'd put them closer to their respective paragraphs just so that it's easier to see what is actually being linked to closer to the text that's doing the linking. (That or using the []() link syntax instead)

git clone git@github.com:ocf/dns.git
```

Since we are adding DNS for a Kubernetes service, we run `ldapvi
Member

This'll also need a kinit $USER/admin before the ldapvi (either as a separate command or just put as a prefix to the command) to actually be able to edit and submit changes to ldap.

I think these editing DNS instructions should actually almost be their own docs. I looked and there's https://www.ocf.berkeley.edu/docs/staff/procedures/new-host/#h3_step-11-add-the-ldap-entry in the docs for adding a new host and DNS, but this is a bit different as it's just editing the existing lb-kubernetes pseudo-host instead. Anyway, something to think about, but this might be useful as a separate docs page that multiple places can link to.

Member

Actually, lb-kubernetes no longer exists since there's no non-kubernetes version, it's just a cn=lb now (this is something that changed since the previous pull request).

jvperrin referenced this pull request in encadyma/ocf-kube-template Aug 29, 2020
@nikhiljha
Member

The funny thing is we might actually obsolete these docs before they're merged.

@BernardZhao
Member Author

> The funny thing is we might actually obsolete these docs before they're merged.

That's why I paused on updating this.

@nikhiljha
Member

https://notes.ocf.berkeley.edu/jCl7t_M7RYGzcqQpbzWrOQ#

5 participants