Merge pull request #2810 from cal-itp/more-docs
More docs (mostly infra) and fixing CI configuration
atvaccaro authored Jul 24, 2023
2 parents 3aa1270 + 90faee8 commit 00d5729
Showing 13 changed files with 109 additions and 111 deletions.
20 changes: 7 additions & 13 deletions docs/infrastructure/README.md → .github/README.md
@@ -1,16 +1,6 @@
# Infrastructure code
# GitHub Actions

# pyinvoke

[invoke](https://docs.pyinvoke.org/en/stable/) is a Python framework for executing subprocess and building a CLI application.
The tasks are defined in `tasks.py` and configuration in `invoke.yaml`; config values under the top-level `calitp`
are specific to our defined tasks.

Run `poetry run invoke -l` to list the available commands, and `poetry run invoke -h <command>` to get more detailed help for each individual command.

## CI/CD

All CI/CD automation in this project is executed via GitHub Actions, whose workflow files live in the `.github` directory.
All CI/CD automation in this project is executed via GitHub Actions, whose workflow files live in the [./workflows/](./workflows) directory.

## deploy-airflow.yml

@@ -19,6 +9,10 @@ While we're using GCP Composer, "deployment" of Airflow consists of two parts:
1. Calling `gcloud composer environments update ...` to update the Composer environment with new (or specific versions of) packages
2. Copying the `dags` and `plugins` folders to a GCS bucket that Composer reads (this is specified in the Composer Environment)
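
As a rough sketch, the two steps above might look like the following. The environment name, region, bucket, and requirements file are placeholders rather than values from this repository, and the actual workflow may use different flags or copy commands. Commands are echoed instead of executed unless `DRY_RUN=0` is set.

```bash
# Hypothetical placeholders; substitute the real Composer environment values.
COMPOSER_ENV="${COMPOSER_ENV:-example-composer-env}"
COMPOSER_LOCATION="${COMPOSER_LOCATION:-us-west2}"
COMPOSER_BUCKET="${COMPOSER_BUCKET:-gs://example-composer-bucket}"

# Echo the command by default; execute it only when DRY_RUN=0.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

deploy_airflow() {
  # 1. Update the Composer environment with the desired package versions.
  run gcloud composer environments update "$COMPOSER_ENV" \
    --location="$COMPOSER_LOCATION" \
    --update-pypi-packages-from-file=requirements.txt
  # 2. Copy the dags and plugins folders to the bucket that Composer reads.
  run gsutil -m rsync -r dags "$COMPOSER_BUCKET/dags"
  run gsutil -m rsync -r plugins "$COMPOSER_BUCKET/plugins"
}

deploy_airflow
```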

## deploy-apps-maps.yml

This workflow builds a static website from the Svelte app and deploys it to Netlify.

## build-*.yml workflows

Workflows prefixed with `build-` generally lint, test, and (usually) publish either a Python package or a Docker image.
@@ -31,7 +25,7 @@ Workflows prefixed with `service-` deal with Kubernetes deployments.
* `service-release-diff.yml` renders kubectl diffs on PRs targeting release branches
* `service-release-channel.yml` deploys to a given channel (i.e. environment) on updates to a release branch

Some of these workflows use the same `invoke` framework described earlier.
Some of these workflows use hologit or invoke. See the READMEs in [.holo](../.holo) and [ci](../ci) for documentation regarding hologit and invoke, respectively.

## GitOps
The workflows described above also define their triggers. In general, developer workflows should follow these steps.
1 change: 1 addition & 0 deletions .github/workflows/build-calitp-map-utils.yml
@@ -32,3 +32,4 @@ jobs:
- run: poetry run mypy calitp_map_utils
- run: poetry run pytest --spec
- run: poetry run build
# TODO: should we actually publish to pypi?
2 changes: 2 additions & 0 deletions .github/workflows/publish-docs.yml
@@ -2,6 +2,8 @@ name: Build and publish docs

on:
push:
branches:
- main
paths:
- 'docs/**'
pull_request:
22 changes: 22 additions & 0 deletions .holo/README.md
@@ -0,0 +1,22 @@
# hologit

[hologit](https://github.com/JarvusInnovations/hologit) is a tool for manipulating Git branches
in a way that supports an "obvious" GitOps workflow for CI/CD. Specifically,
hologit allows:

1. Building branches containing only a subset of repository contents (for example, a branch only including infra-related code)
   * This action is called "projection"
2. Bringing in contents from another repository without relying on published artifacts such as Helm charts
3. Applying transformations to files as part of step 1
   * These transformations are called "lenses"

In this repository, we declare one holobranch named [release-candidate](../branches/release-candidate).
By projecting this holobranch in GitHub Actions, individual "candidate" branches end up containing
only the code relevant to infra/Kubernetes as well as Kubernetes code from the upstream [cluster-template](https://github.com/JarvusInnovations/cluster-template)
repository. Then, a PR from `candidate/<some-branch>` to `releases/<env>` (such as `releases/test`) shows only changes/content
relevant to infra, and `releases/*` branches only ever contain infra code. For example:

1. Create a [PR making an infra-related change](https://github.com/cal-itp/data-infra/pull/2828)
2. Create and merge a [PR to deploy a candidate branch to test](https://github.com/cal-itp/data-infra/pull/2829)
3. Merge the PR from step 1
4. After merging, open a [PR to deploy the main candidate branch to prod](https://github.com/cal-itp/data-infra/pull/2832)
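
The projection step itself can be sketched as below. This assumes hologit is installed and that its `git holo project` command with the `--commit-to` option is used; the source branch name is a placeholder, and the GitHub Actions workflow in this repository may invoke hologit differently. The command is echoed rather than executed unless `DRY_RUN=0` is set.

```bash
# Hypothetical branch name; candidate branches follow candidate/<some-branch>.
SOURCE_BRANCH="${SOURCE_BRANCH:-my-infra-change}"

# Echo the command by default; execute it only when DRY_RUN=0.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

project_candidate() {
  # Project the release-candidate holobranch and commit the result
  # to the matching candidate/<branch> ref.
  run git holo project release-candidate --commit-to="candidate/$SOURCE_BRANCH"
}

project_candidate
```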
19 changes: 19 additions & 0 deletions CONTRIBUTING.md
@@ -0,0 +1,19 @@
# Contribution guidelines

These guidelines are meant to provide a foundation for collaboration in Cal-ITP's data services repos,
primarily [#data-infra](https://github.com/cal-itp/data-infra).

## Issues
* When submitting an issue, please try to use an existing template if one is appropriate
* Provide enough information and context; try to do one or more of the following:
  * Include links to specific lines of code, error logs, Slack context, etc.
  * Include error messages or tracebacks if relevant and short
  * Connect issues to Sentry issues

## Pull Requests
* We generally use merge commits as we think they provide clarity in a PR-based workflow
* PRs should be linked to any issues that they close. [Keywords](https://docs.github.com/en/issues/tracking-your-work-with-issues/linking-a-pull-request-to-an-issue#linking-a-pull-request-to-an-issue-using-a-keyword) are one good way to do this
* Google provides a [How to do a code review reference](https://google.github.io/eng-practices/review/reviewer/) that reviewers may find helpful
* Use draft PRs to keep track of work without notifying reviewers, and avoid giving pre-emptive feedback on draft PRs
* Reviewers should not generally merge PRs themselves and should instead let the author merge, since authors will have the most context about merge considerations (for example, whether additional reviews are still needed, or whether any communication is needed about the impacts of the PR when it merges)
* After a PR is merged, the author is responsible for monitoring any subsequent CI actions for successful completion
25 changes: 22 additions & 3 deletions apps/maps/README.md
@@ -8,7 +8,17 @@ that store their data in the static HTML of a rendered JupyterBook webpage. The
defines the contract between data producers (i.e. notebooks) and the data consumer (i.e. the Svelte app) as well
as provides some utilities for validating the GeoJSON of specific analysis types.

## Developing
## calitp_map_utils

As mentioned above, `calitp_map_utils` is a small utility library that defines Pydantic types that can be used for
data validation as well as TypeScript types for type-hinting in the Svelte app. The library also contains
a few CLI commands to validate pre-existing data against those types and/or generate a quick URL
with state for testing.

TODO: a GitHub Actions workflow exists to build the package, but it does not currently publish to PyPI; right now
that is done manually with `poetry publish`.

## Developing the maps app

You can run a development server locally; use the calitp-map-utils CLI to generate a valid state URL for testing.

@@ -19,13 +29,22 @@ echo '{ "legend_url": "https://storage.googleapis.com/calitp-map-tiles/legend_te
URL: localhost:5173?state=H4sIAO38fWQC_6WSMU_DMBCF_8opM03GSt0KDAwUgcqGkHVNr47B8RnfhZJW_e8kbVE6FAY6WSef3_fek7dZwJqyCYTG-yvIPLaUpJtftj832e0Y5opKcOdstcYWHkjXnN6zbr9Jvl-pVKNMikKUE1rKLbP1hNFJXnJdlOidxlGNcaTOkxTLsZEq5Jb4Tbg7N72WtnHPkx5mqgPMhAEWE0dK6kiOhndXMNi86SArTsHh3ig8NT21heeEQZzCNBEKjLosHOUC7yWa6sPoQdVIr3Y-yLHSv13fT2FGmhiuG4F5JFrCDKPAdAaPhJd0XPeyBuvfWu5Z3fb5Yl_3f0GN5zAk2TDXw7RY8NfJhEKdmik5rJw9-VBkKSzNPzMcXyuJ5vJps9039EE5DrACAAA%3D
```
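
The `state` value above begins with `H4sI`, which is what gzip data looks like once base64-encoded, so it appears to be gzipped JSON in URL-safe base64 with URL-encoded padding. A minimal sketch for inspecting such a value, assuming that encoding; the helper names are illustrative and not part of `calitp_map_utils`, and the legend URL is a made-up example:

```bash
# Decode a state query parameter back to JSON, assuming gzip + URL-safe base64.
decode_state() {
  printf '%s' "$1" \
    | sed 's/%3D/=/g' \
    | tr '_-' '/+' \
    | base64 -d \
    | gzip -dc
}

# Inverse helper, used here only to demonstrate a round trip.
encode_state() {
  printf '%s' "$1" | gzip -c | base64 | tr -d '\n' | tr '/+' '_-'
}

# Round trip with a hypothetical payload.
decode_state "$(encode_state '{"legend_url": "https://example.com/legend.svg"}')"
```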

You can point the `--host` parameter at a Netlify URL to test against an already-published version of the app. As of 2023-07-21,
the production URL is [https://embeddable-maps.calitp.org](https://embeddable-maps.calitp.org), but you can also use preview
Netlify sites deployed via `netlify deploy ...` with `--alias=some-alias` and/or without the `--prod` flag (see below).

## Build and deploy to Netlify

To create a production version of your app:
The site is deployed to production on merges to main, as defined in [../../.github/workflows/deploy-apps-maps.yml](../../.github/workflows/deploy-apps-maps.yml).

You may also deploy manually with the following:
```bash
# Run from the apps/maps folder
npm run build
netlify deploy --site=cal-itp-data-analyses --dir=build --alias=leaflet-speedmaps
netlify deploy --site=embeddable-maps-calitp-org --dir=build
```

By default, this deploys a preview site with a generated alias prefix. You may pass an explicit alias with `--alias=<some-alias>`
or deploy to production with `--prod`.
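
The three deploy modes described above can be sketched as a small wrapper. The function name and the alias value are illustrative only; the flags are the ones already shown in this README. Run `npm run build` first so that the `build` directory exists. Commands are echoed rather than executed unless `DRY_RUN=0` is set.

```bash
SITE="embeddable-maps-calitp-org"

# Echo the command by default; execute it only when DRY_RUN=0.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

deploy_maps() {
  # $1 is "preview" (the default), "prod", or an explicit alias name.
  case "${1:-preview}" in
    preview) run netlify deploy --site="$SITE" --dir=build ;;
    prod)    run netlify deploy --site="$SITE" --dir=build --prod ;;
    *)       run netlify deploy --site="$SITE" --dir=build --alias="$1" ;;
  esac
}

deploy_maps                # preview site with a generated alias prefix
deploy_maps my-test-alias  # preview site at an explicit (hypothetical) alias
deploy_maps prod           # production deploy
```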

We could look into using the [Netlify adapter](https://kit.svelte.dev/docs/adapter-netlify) at some point.
17 changes: 17 additions & 0 deletions ci/README.md
@@ -0,0 +1,17 @@
# CI - deploys via pyinvoke

This folder contains code and YAML that drive the deployments of Kubernetes-based applications and services. For example,
a deployment named `archiver` is configured in [the prod channel](./channels/prod.yaml) and is ultimately deployed
by `invoke` (see below) calling `kubectl` commands.

## invoke (aka pyinvoke)
[invoke](https://docs.pyinvoke.org/en/stable/) is a Python framework for executing subprocesses and building CLI applications.
The tasks are defined in `tasks.py` and configuration in `invoke.yaml`; config values under the top-level `calitp`
are specific to our defined tasks.

Run `poetry run invoke -l` to list the available commands, and `poetry run invoke -h <command>` to get more detailed help for each individual command.
Individual release channels/environments are config files that are passed to invoke. For example, to deploy to test:

```bash
poetry run invoke release -f channels/test.yaml
```
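
Because each channel is just a config file, the same command generalizes across environments. A small sketch, assuming channel files are named `channels/<env>.yaml` as in the `test` and `prod` examples in this README; the wrapper function is illustrative. Commands are echoed rather than executed unless `DRY_RUN=0` is set.

```bash
# Echo the command by default; execute it only when DRY_RUN=0.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

release_channel() {
  # $1 is the channel/environment name, e.g. "test" or "prod".
  run poetry run invoke release -f "channels/$1.yaml"
}

release_channel test
release_channel prod
```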
5 changes: 0 additions & 5 deletions docs/_toc.yml
@@ -50,11 +50,6 @@ parts:
- file: kubernetes/JupyterHub
- file: kubernetes/architecture
- file: kubernetes/deployment
- file: infrastructure/README
- file: github/contribute_to_repos.md
- file: services/overview
sections:
- file: services/gtfs-ckan-uploader
- file: backups/metabase
- caption: Contribute to the Docs!
chapters:
62 changes: 0 additions & 62 deletions docs/github/contribute_to_repos.md

This file was deleted.

3 changes: 0 additions & 3 deletions docs/services/gtfs-ckan-uploader.md

This file was deleted.

4 changes: 0 additions & 4 deletions docs/services/overview.md

This file was deleted.

2 changes: 1 addition & 1 deletion docs/warehouse/developing_dbt_models.md
@@ -28,7 +28,7 @@ When you run dbt commands locally on JupyterHub, your models will be created in

Once your models are working the way you want, please make sure to update the associated YAML files (there will generally be one or two YAML files per folder with model tests, documentation, and additional configuration.) Especially if you created a brand-new model, you will want to add tests for things like unique, non-null primary keys and valid foreign keys. The YAML is also where table- and column-level documentation is populated. [Here is an example YAML file from our project](https://github.com/cal-itp/data-infra/blob/main/warehouse/models/mart/gtfs/_mart_gtfs_dims.yml), and [here is an example PR that created a new mart table with accompanying documentation](https://github.com/cal-itp/data-infra/pull/2097).

Because the warehouse is collectively maintained and changes can affect a variety of users, please open PRs against `main` when work is ready to merge and keep an eye out for comments and questions from reviewers, who might require tweaks before merging. ([See our repo contribution page](contribute-to-repos) for more information on GitHub practices.)
Because the warehouse is collectively maintained and changes can affect a variety of users, please open PRs against `main` when work is ready to merge and keep an eye out for comments and questions from reviewers, who might require tweaks before merging. See CONTRIBUTING.md in the repo for more information on GitHub practices.

## Modeling considerations
