Merge pull request #2810 from cal-itp/more-docs

More docs (mostly infra) and fixing CI configuration
cal-itp · Jul 24, 2023 · 00d5729
2 parents 3aa1270 + 90faee8
commit 00d5729

Showing 13 changed files with 109 additions and 111 deletions.
diff --git a/docs/infrastructure/README.md b/.github/README.md
@@ -1,16 +1,6 @@
-# Infrastructure code
+# GitHub Actions
 
-# pyinvoke
-
-[invoke](https://docs.pyinvoke.org/en/stable/) is a Python framework for executing subprocess and building a CLI application.
-The tasks are defined in `tasks.py` and configuration in `invoke.yaml`; config values under the top-level `calitp`
-are specific to our defined tasks.
-
-Run `poetry run invoke -l` to list the available commands, and `poetry run invoke -h <command>` to get more detailed help for each individual command.
-
-## CI/CD
-
-All CI/CD automation in this project is executed via GitHub Actions, whose workflow files live in the `.github` directory.
+All CI/CD automation in this project is executed via GitHub Actions, whose workflow files live in the [./workflows/](./workflows) directory.
 
 ## deploy-airflow.yml
 
@@ -19,6 +9,10 @@ While we're using GCP Composer, "deployment" of Airflow consists of two parts:
 1. Calling `gcloud composer environments update ...` to update the Composer environment with new (or specific versions of) packages
 2. Copying the `dags` and `plugins` folders to a GCS bucket that Composer reads (this is specified in the Composer Environment)
 
+## deploy-apps-maps.yml
+
+This workflow builds a static website from the Svelte app and deploys it to Netlify.
+
 ## build-*.yml workflows
 
 Workflows prefixed with `build-` generally lint, test, and (usually) publish either a Python package or a Docker image.
@@ -31,7 +25,7 @@ Workflows prefixed with `service-` deal with Kubernetes deployments.
 * `service-release-diff.yml` renders kubectl diffs on PRs targeting release branches
 * `service-release-channel.yml` deploys to a given channel (i.e. environment) on updates to a release branch
 
-Some of these workflows use the same `invoke` framework described earlier.
+Some of these workflows use hologit or invoke. See the READMEs in [.holo](../.holo) and [ci](../ci) for documentation regarding hologit and invoke, respectively.
 
 ## GitOps
 The workflows described above also define their triggers. In general, developer workflows should follow these steps.

diff --git a/.github/workflows/build-calitp-map-utils.yml b/.github/workflows/build-calitp-map-utils.yml
@@ -32,3 +32,4 @@ jobs:
       - run: poetry run mypy calitp_map_utils
       - run: poetry run pytest --spec
       - run: poetry run build
+      # TODO: should we actually publish to pypi?
diff --git a/.github/workflows/publish-docs.yml b/.github/workflows/publish-docs.yml
@@ -2,6 +2,8 @@ name: Build and publish docs
 
 on:
   push:
+    branches:
+      - main
     paths:
       - 'docs/**'
   pull_request:
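Reviewer note on the `push` trigger above: when a GitHub Actions trigger declares both `branches` and `paths`, the workflow runs only if both filters match. The sketch below approximates that rule in Python for the two filters shown in this hunk. The function name is hypothetical, and `fnmatchcase` is only a stand-in for GitHub's own glob matching (it agrees for a simple pattern like `docs/**`, but not in general). The `pull_request` trigger is not modeled because its filters are cut off in the hunk.

```python
from fnmatch import fnmatchcase

# Filters copied from the `push` trigger in the hunk above.
PUSH_BRANCHES = ["main"]
PUSH_PATHS = ["docs/**"]


def push_triggers_workflow(branch: str, changed_files: list[str]) -> bool:
    """Approximate the push trigger: branch AND paths filters must both match."""
    branch_ok = any(fnmatchcase(branch, pattern) for pattern in PUSH_BRANCHES)
    # At least one changed file has to match one of the path patterns.
    paths_ok = any(
        fnmatchcase(path, pattern)
        for path in changed_files
        for pattern in PUSH_PATHS
    )
    return branch_ok and paths_ok


if __name__ == "__main__":
    print(push_triggers_workflow("main", ["docs/index.md"]))
    print(push_triggers_workflow("some-feature", ["docs/index.md"]))
```

With the added `branches` filter, a push to a feature branch that touches `docs/` no longer triggers a publish on its own.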

diff --git a/.holo/README.md b/.holo/README.md
@@ -0,0 +1,22 @@
+# hologit
+
+[hologit](https://github.com/JarvusInnovations/hologit) is a tool for manipulating
+GitHub branches in a way that supports an "obvious" GitOps workflow for CI/CD. Specifically,
+hologit allows:
+
+1. Building branches containing only a subset of repository contents (for example, a branch only including infra-related code)
+    * This action is called "projection"
+2. Bringing in contents from another repository without relying on published artifacts such as Helm charts
+3. Applying transformations to files as part of #1
+    * These transformations are called "lenses"
+
+In this repository, we declare one holobranch named [release-candidate](../branches/release-candidate).
+By projecting this holobranch in GitHub Actions, individual "candidate" branches end up containing
+only the code relevant to infra/Kubernetes as well as Kubernetes code from the upstream [cluster-template](https://github.com/JarvusInnovations/cluster-template)
+repository. Then, a PR from a `candidate/<some-branch>` to `releases/<env>` (such as `releases/test`) will only show changes/content
+relevant to infra, and `releases/*` branches only ever contain infra code. For example:
+
+1. Create a [PR making an infra-related change](https://github.com/cal-itp/data-infra/pull/2828)
+2. Create and merge a [PR to deploy a candidate branch to test](https://github.com/cal-itp/data-infra/pull/2829)
+3. Merge the PR from #1
+4. After merge, [PR to deploy the main candidate branch to prod](https://github.com/cal-itp/data-infra/pull/2832)
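Reviewer note: the "projection" and "lens" ideas described in the README above can be illustrated with a small conceptual sketch. This is not hologit's implementation or API. The function names, the file paths, and the dictionary standing in for a repository tree are all hypothetical, chosen only to show how projecting a subset of files and then transforming them fit together.

```python
from fnmatch import fnmatchcase
from typing import Callable


def project(files: dict[str, str], include: list[str]) -> dict[str, str]:
    """Keep only the files whose path matches one of the include patterns."""
    return {
        path: content
        for path, content in files.items()
        if any(fnmatchcase(path, pattern) for pattern in include)
    }


def apply_lens(files: dict[str, str], transform: Callable[[str], str]) -> dict[str, str]:
    """Apply a content transformation to every projected file."""
    return {path: transform(content) for path, content in files.items()}


# Hypothetical repository contents, keyed by path.
repo = {
    "kubernetes/apps/archiver.yaml": "image: archiver:dev\n",
    "ci/tasks.py": "# invoke tasks\n",
    "warehouse/models/example.sql": "select 1\n",
}

# Projection: an infra-only view of the repository.
infra_only = project(repo, ["kubernetes/*", "ci/*"])

# Lens: a transformation applied as part of the projection.
released = apply_lens(infra_only, lambda text: text.replace(":dev", ":release"))

if __name__ == "__main__":
    for path in sorted(released):
        print(path)
```

In this toy model, the warehouse file never appears in the projected result, which mirrors why PRs into `releases/*` only show infra changes.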
diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md
@@ -0,0 +1,19 @@
+# Contribution guidelines
+
+These guidelines are meant to provide a foundation for collaboration in Cal-ITP's data services repos,
+primarily [#data-infra](https://github.com/cal-itp/data-infra).
+
+## Issues
+* When submitting an issue, please try to use an existing template if one is appropriate
+* Provide enough information and context; try to do one or more of the following:
+    * Include links to specific lines of code, error logs, Slack context, etc.
+    * Include error messages or tracebacks if relevant and short
+    * Connect issues to Sentry issues
+
+## Pull Requests
+* We generally use merge commits as we think they provide clarity in a PR-based workflow
+* PRs should be linked to any issues that they close. [Keywords](https://docs.github.com/en/issues/tracking-your-work-with-issues/linking-a-pull-request-to-an-issue#linking-a-pull-request-to-an-issue-using-a-keyword) are one good way to do this
+* Google provides a [How to do a code review reference](https://google.github.io/eng-practices/review/reviewer/) that reviewers may find helpful
+* Use draft PRs to keep track of work without notifying reviewers, and avoid giving pre-emptive feedback on draft PRs
+* Reviewers should not generally merge PRs themselves and should instead let the author merge, since authors will have the most context about merge considerations (for example, whether additional reviews are still needed, or whether any communication is needed about the impacts of the PR when it merges)
+* After a PR is merged, the author has the responsibility of monitoring any subsequent CI actions for successful completions
diff --git a/apps/maps/README.md b/apps/maps/README.md
@@ -8,7 +8,17 @@ that store their data in the static HTML of a rendered JupyterBook webpage. The
 defines the contract between data producers (i.e. notebooks) and the data consumer (i.e. the Svelte app) as well
 as provides some utilities for validating the GeoJSON of specific analysis types.
 
-## Developing
+## calitp_map_utils
+
+As mentioned above, `calitp_map_utils` is a small utility library that defines Pydantic types that can be used for
+data validation as well as TypeScript types for type-hinting in the Svelte app. Also, the library contains
+a few CLI commands to facilitate validating pre-existing data with those types and/or generate a quick URL
+with state for testing.
+
+TODO: a GitHub Action workflow exists to build the package, but it does not currently publish to PyPI; right now
+that is done manually with `poetry publish`.
+
+## Developing the maps app
 
 You can run a development server locally; use the calitp-map-utils CLI to generate a valid state URL for testing.
 
@@ -19,13 +29,22 @@ echo '{ "legend_url": "https://storage.googleapis.com/calitp-map-tiles/legend_te
 URL: localhost:5173?state=H4sIAO38fWQC_6WSMU_DMBCF_8opM03GSt0KDAwUgcqGkHVNr47B8RnfhZJW_e8kbVE6FAY6WSef3_fek7dZwJqyCYTG-yvIPLaUpJtftj832e0Y5opKcOdstcYWHkjXnN6zbr9Jvl-pVKNMikKUE1rKLbP1hNFJXnJdlOidxlGNcaTOkxTLsZEq5Jb4Tbg7N72WtnHPkx5mqgPMhAEWE0dK6kiOhndXMNi86SArTsHh3ig8NT21heeEQZzCNBEKjLosHOUC7yWa6sPoQdVIr3Y-yLHSv13fT2FGmhiuG4F5JFrCDKPAdAaPhJd0XPeyBuvfWu5Z3fb5Yl_3f0GN5zAk2TDXw7RY8NfJhEKdmik5rJw9-VBkKSzNPzMcXyuJ5vJps9039EE5DrACAAA%3D
 ```
 
+You can point the `--host` parameter at a Netlify URL to provide an easy way to test against an already-published version of the app. As of 2023-07-21
+the production URL is [https://embeddable-maps.calitp.org](https://embeddable-maps.calitp.org) but you can also use preview
+Netlify sites deployed via `netlify deploy ...` with `--alias=some-alias` and/or without the `--prod` flag (see below).
+
 ## Build and deploy to Netlify
 
-To create a production version of your app:
+The site is deployed to production on merges to main, as defined in [../../.github/workflows/deploy-apps-maps.yml](../../.github/workflows/deploy-apps-maps.yml).
 
+You may also deploy manually with the following:
 ```bash
+# Run from the apps/maps folder
 npm run build
-netlify deploy --site=cal-itp-data-analyses --dir=build --alias=leaflet-speedmaps
+netlify deploy --site=embeddable-maps-calitp-org --dir=build
 ```
 
+By default, this deploys a preview site with a generated alias prefix. You may pass an explicit alias with `--alias=<some-alias>`
+or deploy to production with `--prod`.
+
 We could look into using the [Netlify adapter](https://kit.svelte.dev/docs/adapter-netlify) at some point.
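Reviewer note on the `?state=` value in the example URL above: it starts with `H4sI`, which is what a gzip header typically looks like once base64-encoded, and it uses the URL-safe base64 alphabet. That suggests the state is gzip-compressed JSON, though the README does not say so. The sketch below round-trips a state object under that assumption. The function names and the sample payload are hypothetical, and the calitp-map-utils CLI remains the supported way to generate a valid state URL.

```python
import base64
import gzip
import json
from urllib.parse import unquote


def encode_state(state: dict) -> str:
    """Serialize to JSON, gzip it, and encode as URL-safe base64 (assumed format)."""
    raw = json.dumps(state).encode("utf-8")
    return base64.urlsafe_b64encode(gzip.compress(raw)).decode("ascii")


def decode_state(value: str) -> dict:
    """Reverse encode_state; tolerates percent-encoded or missing '=' padding."""
    value = unquote(value)  # turns a trailing %3D back into '='
    padded = value + "=" * (-len(value) % 4)
    return json.loads(gzip.decompress(base64.urlsafe_b64decode(padded)))


if __name__ == "__main__":
    # Hypothetical payload; the real schema is defined by calitp_map_utils.
    sample = {"name": "example"}
    encoded = encode_state(sample)
    print(encoded[:4])
    print(decode_state(encoded))
```

If the assumption holds, `decode_state` offers a quick way to inspect the state carried by an existing URL while debugging.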
diff --git a/ci/README.md b/ci/README.md
@@ -0,0 +1,17 @@
+# CI - deploys via pyinvoke
+
+This folder contains code and YAML that drives the deployments of Kubernetes-based applications and services. For example,
+a deployment named `archiver` is configured in [the prod channel](./channels/prod.yaml) and is ultimately deployed
+by `invoke` (see below) calling `kubectl` commands.
+
+## invoke (aka pyinvoke)
+[invoke](https://docs.pyinvoke.org/en/stable/) is a Python framework for executing subprocesses and building CLI applications.
+The tasks are defined in `tasks.py` and configuration in `invoke.yaml`; config values under the top-level `calitp`
+are specific to our defined tasks.
+
+Run `poetry run invoke -l` to list the available commands, and `poetry run invoke -h <command>` to get more detailed help for each individual command.
+Individual release channels/environments are config files that are passed to invoke. For example, to deploy to test:
+
+```bash
+poetry run invoke release -f channels/test.yaml
+```
diff --git a/docs/_toc.yml b/docs/_toc.yml
@@ -50,11 +50,6 @@ parts:
         - file: kubernetes/JupyterHub
         - file: kubernetes/architecture
         - file: kubernetes/deployment
-      - file: infrastructure/README
-      - file: github/contribute_to_repos.md
-      - file: services/overview
-        sections:
-        - file: services/gtfs-ckan-uploader
       - file: backups/metabase
   - caption: Contribute to the Docs!
     chapters:

diff --git a/docs/github/contribute_to_repos.md b/docs/github/contribute_to_repos.md
diff --git a/docs/services/gtfs-ckan-uploader.md b/docs/services/gtfs-ckan-uploader.md
diff --git a/docs/services/overview.md b/docs/services/overview.md
diff --git a/docs/warehouse/developing_dbt_models.md b/docs/warehouse/developing_dbt_models.md
@@ -28,7 +28,7 @@ When you run dbt commands locally on JupyterHub, your models will be created in
 
 Once your models are working the way you want, please make sure to update the associated YAML files (there will generally be one or two YAML files per folder with model tests, documentation, and additional configuration.) Especially if you created a brand-new model, you will want to add tests for things like unique, non-null primary keys and valid foreign keys. The YAML is also where table- and column-level documentation is populated. [Here is an example YAML file from our project](https://github.com/cal-itp/data-infra/blob/main/warehouse/models/mart/gtfs/_mart_gtfs_dims.yml), and [here is an example PR that created a new mart table with accompanying documentation](https://github.com/cal-itp/data-infra/pull/2097).
 
-Because the warehouse is collectively maintained and changes can affect a variety of users, please open PRs against `main` when work is ready to merge and keep an eye out for comments and questions from reviewers, who might require tweaks before merging. ([See our repo contribution page](contribute-to-repos) for more information on GitHub practices.)
+Because the warehouse is collectively maintained and changes can affect a variety of users, please open PRs against `main` when work is ready to merge and keep an eye out for comments and questions from reviewers, who might require tweaks before merging. (See CONTRIBUTING.md in the repo for more information on GitHub practices.)
 
 ## Modeling considerations