From d81d53bbcdcce6bf24d992d83ffb1f4fc1879fd1 Mon Sep 17 00:00:00 2001
From: Laurie <55149902+lauriemerrell@users.noreply.github.com>
Date: Thu, 10 Aug 2023 11:29:01 -0500
Subject: [PATCH] Airtable docs updates (#2868)
* wip updates
* foreign key docs and other updates
* rearrange navigation
* remove legacy docs section
* address failures in docs build - remove unused airflow page and fix toc
* rename airtable page
* remove references to contacting charlie
* update link to refactored architecture data page
* phrasing update per pr review and add link to the google sheet
---
docs/_toc.yml | 9 +-
docs/airflow/dags-maintenance.md | 4 +-
docs/airflow/overview.md | 6 -
docs/analytics_onboarding/overview.md | 2 +-
docs/analytics_tools/jupyterhub.md | 2 +-
docs/contribute/contribute-best-practices.md | 2 +-
docs/datasets_and_tables/transitdatabase.md | 131 ------------------
docs/transit_database/transitdatabase.md | 80 +++++++++++
.../navigating_dbt_docs.md} | 7 -
9 files changed, 88 insertions(+), 155 deletions(-)
delete mode 100644 docs/airflow/overview.md
delete mode 100644 docs/datasets_and_tables/transitdatabase.md
create mode 100644 docs/transit_database/transitdatabase.md
rename docs/{datasets_and_tables/overview.md => warehouse/navigating_dbt_docs.md} (81%)
diff --git a/docs/_toc.yml b/docs/_toc.yml
index 9e741068bf..c5e4a6dc0c 100644
--- a/docs/_toc.yml
+++ b/docs/_toc.yml
@@ -31,13 +31,11 @@ parts:
- file: warehouse/overview
sections:
- file: warehouse/warehouse_starter_kit
+ - file: warehouse/navigating_dbt_docs
- file: warehouse/what_is_agency
- file: warehouse/developing_dbt_models
- file: warehouse/adding_oneoff_data
- file: warehouse/what_is_gtfs
- - file: datasets_and_tables/overview
- sections:
- - file: datasets_and_tables/transitdatabase
- file: publishing/overview
sections:
- glob: publishing/sections/*
@@ -47,9 +45,8 @@ parts:
sections:
- file: architecture/services
- file: architecture/data
- - file: airflow/overview
- sections:
- - file: airflow/dags-maintenance
+ - file: airflow/dags-maintenance
+ - file: transit_database/transitdatabase
- file: kubernetes/README
sections:
- file: kubernetes/JupyterHub
diff --git a/docs/airflow/dags-maintenance.md b/docs/airflow/dags-maintenance.md
index 5303e934c4..b2a4549ef6 100644
--- a/docs/airflow/dags-maintenance.md
+++ b/docs/airflow/dags-maintenance.md
@@ -1,7 +1,7 @@
(dags-maintenance)=
-# Production DAGs Maintenance
+# Airflow Operational Considerations
-We use [Airflow](https://airflow.apache.org/) to orchestrate our data ingest processes. This page describes how to handle cases where an Airflow [DAG task](https://airflow.apache.org/docs/apache-airflow/stable/core-concepts/tasks.html) fails.
+We use [Airflow](https://airflow.apache.org/) to orchestrate our data ingest processes. This page describes how to handle cases where an Airflow [DAG task](https://airflow.apache.org/docs/apache-airflow/stable/core-concepts/tasks.html) fails. For general information about Airflow development, see the [Airflow README in the data-infra GitHub repo](https://github.com/cal-itp/data-infra/blob/main/airflow/README.md).
## Monitoring DAGs
diff --git a/docs/airflow/overview.md b/docs/airflow/overview.md
deleted file mode 100644
index 553a8a0207..0000000000
--- a/docs/airflow/overview.md
+++ /dev/null
@@ -1,6 +0,0 @@
-(airflow)=
-# Airflow
-
-Cal-ITP has a managed instance of Airflow (called Google Cloud Composer) used to orchestrate various pieces of our data pipeline.
-
-See the [Airflow README](https://github.com/cal-itp/data-infra/blob/main/airflow/README.md) for local development and testing instructions.
diff --git a/docs/analytics_onboarding/overview.md b/docs/analytics_onboarding/overview.md
index 178b76ba5e..4cb510c2fb 100644
--- a/docs/analytics_onboarding/overview.md
+++ b/docs/analytics_onboarding/overview.md
@@ -42,7 +42,7 @@
(get-help)=
```{admonition} Still need access to a non-Caltrans tool above?
-DM Charlie on Cal-ITP Slack using this link, or by email.
+Ask on the `#services-team` channel in the Cal-ITP Slack.
```
## New Analyst Training Curriculum
diff --git a/docs/analytics_tools/jupyterhub.md b/docs/analytics_tools/jupyterhub.md
index 0e230c060d..601b9c712f 100644
--- a/docs/analytics_tools/jupyterhub.md
+++ b/docs/analytics_tools/jupyterhub.md
@@ -28,7 +28,7 @@ This avoids the need to set up a local environment, provides dedicated storage,
JupyterHub currently lives at [notebooks.calitp.org](https://notebooks.calitp.org/).
-Note: you will need to have been added to the Cal-ITP organization on GitHub to obtain access. If you have yet to be added to the organization and need to be, DM Charlie on Cal-ITP Slack using this link.
+Note: you will need to have been added to the Cal-ITP organization on GitHub to obtain access. If you have yet to be added to the organization and need to be, ask in the `#services-team` channel in Slack.
(connecting-to-warehouse)=
### Connecting to the Warehouse
diff --git a/docs/contribute/contribute-best-practices.md b/docs/contribute/contribute-best-practices.md
index 921c827067..9aa63913a2 100644
--- a/docs/contribute/contribute-best-practices.md
+++ b/docs/contribute/contribute-best-practices.md
@@ -33,7 +33,7 @@ If you feel a new section is warranted, make sure you follow Jupyter Book's guid
(new-pages)=
### New Pages and Chapters
-Add new pages and chapters only as truly needed. If you're unsure of whether a new page or chapter is necessary, reach out to `@Charlie Costanzo` on `Cal-ITP Slack`.
+Add new pages and chapters only as truly needed.
If you are adding new pages or chapters, you will need to also update the `_toc.yml` file. You can find more information at Jupyter Book's resource [Structure and organize content](https://jupyterbook.org/basics/organize.html).
diff --git a/docs/datasets_and_tables/transitdatabase.md b/docs/datasets_and_tables/transitdatabase.md
deleted file mode 100644
index 2dd6080122..0000000000
--- a/docs/datasets_and_tables/transitdatabase.md
+++ /dev/null
@@ -1,131 +0,0 @@
-# Transit Database
-
-The Cal-ITP Airtable Transit Database stores key relationships about how transit services are organized and operated in California as well as how well they are performing. See Evan or Hunter to get a link and gain access.
-
-We have chosen to group and maintain the tables into the following Airtable bases as follows:
-
-| **Table Set** | **Description** | **Data Maintainer** |
-| :------------ | :-------------- | :------------------ |
-| [**California Transit**](#california-transit) | Defines key organizational relationships and properties. Organizations, geography, funding programs, transit services, service characteristics, transit datasets such as GTFS, and the intersection between transit datasets and services. | *Elizabeth*
Evan handling uptake to warehouse |
-| [**Transit Data Assessments**](#transit-data-assessments) | Articulates data performance metrics and assessments.| *Elizabeth*
*Evan* handling uptake to warehouse
*Olivia* a key User Advocate. |
-| [**Transit Technology Stacks**](#transit-technology-stacks) | Defines operational setups at transit provider organizations. Defines relationships between vendor organizations, transit provider and operator organizations, products, contracts to provide products, transit stack components, and how they relate to one-another. Structure still somewhat a `WIP`. | *Elizabeth*
No warehouse uptake for time being. |
-
-While `organizations` and `services` are central to many of the tables, we have chosen to maintain them as part of the California Transit Base which will be referenced by the other two.
-
-## Airtable things
-
-### Primary Keys
-
-Airtable forces the use of the left-most field as the primary key of the database: the field that must be referenced in other tables, similar to a VLOOKUP in a spreadsheet. Unlike many databases, Airtable doesn't enforce uniqueness in the values of the primary key field. Instead, it assigns it an underlying and mostly hidden unique [`RECORD ID`](https://support.airtable.com/hc/en-us/articles/360051564873-Record-ID), which can be exposed by creating a formula field to reference it.
-
-For the sake of this documentation, we've noted the [`Primary Field`](https://support.airtable.com/hc/en-us/articles/202624179-The-primary-field), which is not guaranteed to be unique. Some tables additionally expose the unique [`RECORD ID`](https://support.airtable.com/hc/en-us/articles/360051564873-Record-ID) as well.
-
-### Full Documentation of Fields
-
-AirTable does not currently have an effective mechanism to programmatically download your data schema (they have currently paused issuing keys to their metadata API). Rather than manually type-out and export each individual field definition from AirTable, please see the [AirTable-based documentation of fields](https://airtable.com/appPnJWrQ7ui4UmIl/api/docs) which is produced as a part of their API documentation. Note that you must be authenticated with access to the base to reach this link.
-
-## California Transit
-
-| **Name**
*Key(s)*| **Description** |
-| :------------- | :-------------- |
-| `organizations`
*Primary Field*: `Name` | Records are legal organizations, including companies, governmental bodies, or non-profits.
Table includes information on organizational properties (i.e. locale, type) as well as summarizations of its various relationships (e.g. `services` for a transit provider, or `products` for a vendor).
An organization MAY:
- *manage* one more more `services`
- *operate* one more `services`
- *own* one more `contracts`
- *hold* one more `contracts`
- *sell* one more `products`
- *consume* one more `gtfs datasets`
- *produce* one more `gtfs datasets`
-| `services`
*Primary Field*: `Name` | Each record defines a transit service and its properties.
While there are a small number of exceptions (e.g. Solano Express, which is jointly managed by Solano and Napa), generally each transit service is managed by a single organization. Transit services are differentiated from each other by variation (or the potentiality of variation) in one or more of the following:- operator `organization` such as City Staff vs a contracted service,
- `rider requirements` such as ADA Paratransit eligibility, senior status, school trips, etc.
- Operational characteristics such as reservations, on-demand vs fixed-route, service frequencies, and mode
Business processes such as transit technology stacks or personnel- Funding type, which can effect longevity of the service and how well integrated it is with other services managed by the same organization.
- Rider-facing branding, which is often an indicator of one of the above.
Services MAY:- be *reflected by* a `gtfs service data` record
- be *subject to* one or more `rider requirements`
- be *funded by* one or more `funding programs`
- *operated by* one or more `organizations`
- *managed by* one or more organizations
- be *operated in* one or more `place geography`
- *use* one or more `products`. |
-| `tasks`
*Primary Field*: `Name` | Each record defines an action we are either undertaking or tracking to meet our OKRs and quarterly milestones. These tasks are helpful for crosswalking "what we are doing" with each GTFS-entity. |
-| `gtfs datasets`
*Primary Field*: `gtfs_dataset_id` | Each record represents a gtfs dataset (feed) that is either a type of GTFS Schedule, Trip Updates, Vehicle Locations or Alerts. A gtfs dataset MAY:- be *disaggregated into* one or more `gtfs service data` records.
- be *produced* by one or more `organizations`
- be *published* by an `organizations`. |
-| `gtfs service data`
*Primary Field*: `Name`, a combination a single `Services` record and a single `gtfs datasets` record. | Each record links together a single `gtfs dataset` and one `services`. Additional fields define how to isolate the service within the `gtfs dataset`.
Many services have more than one GTFS dataset which describes their service. Often these are either precursors to *final* datasets (e.g. AC Transit's GTFS dataset is a precursor to the Bay Area 511 dataset) or artifacts produced in other processes such as creating GTFS Realtime (e.g. [the VCTC GTFS produced by GMV](https://airtable.com/appPnJWrQ7ui4UmIl/tblnVt5FZ2FZmDjDx/viw3NtcDP3Qm0BYyG/recJhrNj21mYVETJG?blocks=hide)). The property `Category` indicates if this is the `primary` dataset, a `precursor` or `unknown` in order to distinguish which should be used.|
-| `place geography`
*Primary Field*: `Name` | Each place is a Census recognized Place with a FIPS code. Additional properties include County and Caltrans District. |
-| `county geography`
*Primary Field*: `Name` | Each record is a county and has fields to lookup key information about that county such as Caltrans District. |
-| `fare systems`
*Primary Field*: `Fare System` | `WIP` A list of fare systems and their properties. Fare systems apply to one or more Organization which manages a service. |
-| `funding programs`
*Primary Field*: `Program` | `WIP` A very broad list of funding programs for `services` such that we are able to identify which `services` may be subject to various requirements and/or to better classify `services`. |
-| `rider requirements`
*Primary Field*: `Requirement` | A very broad list of rider requirements to categorize and analyze `services`. |
-| `eligibility programs`
*Primary Field*: `Program` | `WIP` Each record is a program/process which riders must use to become eligible to ride one or more `services`. Each program is operated by an `organization` and evaluates one or more `rider requirements`. |
-| `service-component` | Imported table from [Transit Technology Stacks](#transit-technology-stacks) which allows us to look at what products services use. |
-| `feed metrics` | Metrics (i.e. service area, population, etc. ) that are calculated by Data Analysts using data warehouse that are useful for planning and strategy. **Not necessarily up-to-date; should be used as back of the envelope numbers only; not for presentation.** |
-| `NTD Agency Info` | 2018 NTD Agency Info Table imported 10/6/2021 from https://www7.fta.dot.gov/ntd/data-product/2018-annual-database-agency-information |
-| `API Keys` | Storage of API keys for accessing `gtfs datasets` via a keyed API |
-
-### Additional Field Documentation
-
-Because fields are documented in the Airtable GUI itself and its associated API documentation, this section only contains additional information that cannot be appropriately documented in those places or is specific to the needs of the Data Services team.
-
-#### `gtfs service data` notation for isolating GTFS Services within GTFS Datasets
-
-Context: `gtfs service data` is an association table between `services` and subsets (or entire) `GTFS Datasets`.
-
-Summary:
-
-- Selection levels are in following order `agency_id`, `network_id`, `route_id`
-- `BLANK` indicates ALL records
-- Comma-separated list for values that should be selected at that selection level
-- `*` indicates remaining records after other selections at that selection level
-
-Relevant Fields:
-
-- `gtfs service data.agency_id`: if only a selection of `agency.agency_id` within the GTFS Dataset should be selected to represent a specific `services` record, list them here. If all `agency_id` should be selected, leave blank. Indicate if the "leftover" `agency_id` from other `agency_id` selections for the same `GTFS dataset` should be selected with `*`.
-- `gtfs service data.network_id`: if only a selection of `routes.network_id` within the GTFS Dataset should be selected to represent a specific `services` record, list them here. If all `network_id` within the `agency_id` selection should be selected, leave blank. Indicate if the "leftover" `network_id` from other `network_id` selections for the same `GTFS dataset` should be selected with `*`.
-- `gtfs service data.route_id`: if only a selection of `routes.route_id` within the GTFS Dataset should be selected to represent a specific `services` record, list them here. If all `route_id` within the `agency_id` and `network_id` selection should be selected, leave blank. Indicate if the "leftover" `route_id` from other `route_id` selections for the same `GTFS dataset` should be selected with `*`.
-
-### Common How Tos
-
-#### Update a GTFS Dataset's URL
-
-Navigate to the record for the dataset in `California Transit.gtfs datasets` and update the field `URI`. If there is an interesting story or issue behind the update, you can add a comment to the record by [expanding the record](https://support.airtable.com/hc/en-us/articles/202576579-Expanding-records).
-
-#### Add an additional GTFS Dataset
-
-- Add a record to the `California Transit.gtfs datasets` sheet.
-- Name the record something that can be easily recognized when referencing records, i.e. `AC Transit TripUpdates`
-- Add information for the `URI`, `dataset producer`, `dataset publisher`, and `type` (of dataset)
-- Click on the `+` in the `gtfs service mapping` and create a new record, linking a single `service` and the `gtfs dataset`. Be sure and also note the `category` of the mapping, which should be `primary` unless it is not the primarily used dataset.
-
-### California Transit: Entity Relationship Diagram
-
-[![](https://mermaid.ink/img/pako:eNqVVEtv4jAQ_iuWz0W9c1stbbWHbhFw5DLEEzJax07HDqss4b_vOCQQXlLLBSX6XuP5nL3OvEE91cgzgi1DuXZKfh-8BUf_IJJ36tBOJn6vlsg7ynCq1roEB1sMa_0ltK-QIT6C-w7-FvMwgwgBY6Jk3oW6_BalYm_q7HuUemMpFEfOY9b4XaJRUBUwuqj8GO3zu9btQxEwJTkKEZnc9sta7a3W-_zjebGa_1C5ZxULVJIO0sN3RCRQolYWnEt5oI6FZ4rNQ9VHw7bqp69dbN7QS6OqounlCwTzWQPLwGgUuUHnCq3atgs4t5DhhYbBkDFtMKiso0ws7tCKkoQqjwFG8V4l77KR4y0HxeuRoaosiVr05wL0vQ3D8kc9Pu4dIoMLFAer3qx2Rk5tzilu2V2C9oKcC-DUzQUZ5AV-1sRYpiLdml1mS6QXS1uSwsbm5HLDsvLvgtCwA5Pt93czX3ck_Y3oX6WLkTQYc4tZlBVtmsF7dHGG2e4wKS2mrJiCkM8VXkH4c_c8Bep3GJ7_ehaAPxXiFdG8Y2TKwtCoq5t7bkIqJqOVne5QiaLnCE7mO9v_Xs1-SUMGpXEJQtIqvDXinueUEdgEV0acujz6SZco3SIj38h90ltrcSxxrY8xcqhtTE4HgdaVEPHFUPSsp5FrfNJyjfyycdnwfMT0H1s9zcEGPPwHJNjt_A)](https://mermaid-js.github.io/mermaid-live-editor/edit/#pako:eNqVVEtv4jAQ_iuWz0W9c1stbbWHbhFw5DLEEzJax07HDqss4b_vOCQQXlLLBSX6XuP5nL3OvEE91cgzgi1DuXZKfh-8BUf_IJJ36tBOJn6vlsg7ynCq1roEB1sMa_0ltK-QIT6C-w7-FvMwgwgBY6Jk3oW6_BalYm_q7HuUemMpFEfOY9b4XaJRUBUwuqj8GO3zu9btQxEwJTkKEZnc9sta7a3W-_zjebGa_1C5ZxULVJIO0sN3RCRQolYWnEt5oI6FZ4rNQ9VHw7bqp69dbN7QS6OqounlCwTzWQPLwGgUuUHnCq3atgs4t5DhhYbBkDFtMKiso0ws7tCKkoQqjwFG8V4l77KR4y0HxeuRoaosiVr05wL0vQ3D8kc9Pu4dIoMLFAer3qx2Rk5tzilu2V2C9oKcC-DUzQUZ5AV-1sRYpiLdml1mS6QXS1uSwsbm5HLDsvLvgtCwA5Pt93czX3ck_Y3oX6WLkTQYc4tZlBVtmsF7dHGG2e4wKS2mrJiCkM8VXkH4c_c8Bep3GJ7_ehaAPxXiFdG8Y2TKwtCoq5t7bkIqJqOVne5QiaLnCE7mO9v_Xs1-SUMGpXEJQtIqvDXinueUEdgEV0acujz6SZco3SIj38h90ltrcSxxrY8xcqhtTE4HgdaVEPHFUPSsp5FrfNJyjfyycdnwfMT0H1s9zcEGPPwHJNjt_A)
-
-[editable source](https://mermaid-js.github.io/mermaid-live-editor/edit/#pako:eNqVVEtv4jAQ_iuWz0W9c1stbbWHbhFw5DLEEzJax07HDqss4b_vOCQQXlLLBSX6XuP5nL3OvEE91cgzgi1DuXZKfh-8BUf_IJJ36tBOJn6vlsg7ynCq1roEB1sMa_0ltK-QIT6C-w7-FvMwgwgBY6Jk3oW6_BalYm_q7HuUemMpFEfOY9b4XaJRUBUwuqj8GO3zu9btQxEwJTkKEZnc9sta7a3W-_zjebGa_1C5ZxULVJIO0sN3RCRQolYWnEt5oI6FZ4rNQ9VHw7bqp69dbN7QS6OqounlCwTzWQPLwGgUuUHnCq3atgs4t5DhhYbBkDFtMKiso0ws7tCKkoQqjwFG8V4l77KR4y0HxeuRoaosiVr05wL0vQ3D8kc9Pu4dIoMLFAer3qx2Rk5tzilu2V2C9oKcC-DUzQUZ5AV-1sRYpiLdml1mS6QXS1uSwsbm5HLDsvLvgtCwA5Pt93czX3ck_Y3oX6WLkTQYc4tZlBVtmsF7dHGG2e4wKS2mrJiCkM8VXkH4c_c8Bep3GJ7_ehaAPxXiFdG8Y2TKwtCoq5t7bkIqJqOVne5QiaLnCE7mO9v_Xs1-SUMGpXEJQtIqvDXinueUEdgEV0acujz6SZco3SIj38h90ltrcSxxrY8xcqhtTE4HgdaVEPHFUPSsp5FrfNJyjfyycdnwfMT0H1s9zcEGPPwHJNjt_A)
-
-## Transit Data Assessments
-
-| **Name**
*Key(s)*| **Description** |
-| :------------- | :-------------- |
-| `Provider Assessments` | Each record is an aggregated assessment for a single `organizations` which manages one or more `services` conducted using a specific set of `gtfs checks` at a specific point in time. Each record has a `Reviewer`, `Status`, and a link to the document sent to the `organizations` (if applicable) |
-| `gtfs checks` | Each record represents a check that can be performed on either a(n):- `organization` (who manages a transit service)
- `gtfs dataset` as a whole, or
- a `gtfs service data` record within that `gtfs dataset`. Each check has a score type of either:
- `Boolean`: yes or no
- `Nominal`: with defined scores for specific criteria levels as defined in the `Scoring Criteria` field, or
- `Continuous`: as defined by the `Scoring Criteria` column and the `Max Score` field.
Further, scores for each check can be sourced by one of - `human`: manual work
- `auto`: programmable
- `gtfs-trained-human`: a huma with necessary technical training, or
- `combo`: of these.
These sources are indicated in their current, near-future, and goal states in the columns `Source`, `Source: medium-term`, and `Source: goal`. |
-| `gtfs-service check data` | Each record is a specific assessment of a single `gtfs checks` for a single `gtfs service data` record at a specific point in time. |
-| `gtfs-dataset check data` | Each record is a specific assessment of a single `gtfs checks` for a single `gtfs dataset` record at a specific point in time. |
-| `provider check data` | Each record is a specific assessment of a single `gtfs checks` for a single `organizations` which *manages* one or more `services` at a specific point in time. |
-| `assessors` | List of people that can be assigned to Transit Data Assessment completion and review. Now that these people are collaborators in AirTable we can update this to just flag the person directly. |
-| `assessed transit providers` | Imported view from [Transit Service Base](#transit-service-base) which selects for organizations which `category` is either `core` or `other public transit` and `service type` is eitehr `fixed-route` or `deviated fixed-route`. |
-| `WIP gtfs grading scheme` | `WIP` Each record is a specific check that can be done on a `gtfs service data` for a specific GTFS Grading Scheme version. Scoring for the check should be based on the field `Scoring Criteria`. |
-| `WIP gtfs grades` | `WIP` Each record is a specific assessment of a single datapoint within a `gtfs service data` for a specific `gtfs grading scheme` check at a specific point in time. The `Score` field is backed up by `comments supporting score` and `gtfs dataset text`, `official reference` and `official reference attach.` (for screen shots) so that identified issues can be more easily remedied by transit providers. |
-| `WIP data improvement strategy` | `WIP` Each record is a single data improvement strategy for a single `Organizations` which responds to a single `provider gtfs assessments`. Fields includes links to the documents, `Status`, and dates. |
-
-### Transit Data Assessments: Entity Relationship Diagram
-
-[![](https://mermaid.ink/img/pako:eNqtlMty2zAMRX8Fw7XdD9Cuk6RZtjPOUhtIhCxMKdLDh1vH8r8X1MNRYsdJO9VKA4IHF5cEj6p2mlShyN8zbj12pS0tyPfDuz1r8l9DoBA6sjFA79brvocxRPrJow0c58xQQKmmtQCxJXh8-rYB10Agv-dagh1a3JKG6lCqG2X6ocy8dNdS_fMeI2b-nbMR2QZg2zjfYWRnofGuuwU8CdAd4TE2IXMCxf_D7EfmZmzv88zX1PM-OIm_rh-YQ3BwNGU3a8-RPOOAAJyFXevo3yhve_g7yrsNjSbdujC0R5MwChcjIOwc2yimQeSOPmxzwZ7WCzgDP2husXeRE14BRsScNqd8pi30dOW2X5DyVXf9O0JK5akxVMeBATor1xTE_-oqdbl71NgvrcvAioyzW5lO93Jsl6LGMV8Kye0ghFQJRyZ6Bb9arlvI4YZ_k16D86Jtz-KbXo8h71KkL7BLleHaHNa4RzZYGSFZDU2yWrrIvy8-AU5aYDeb-bbJs8uzzO9-i5afhwm7pTW0LhkpQucD1jcNnPCL2DzbFwU85VNiux1evZ3nDr2cjlqpjmT2WcsDe8ylSiUJ-V5njKYGk4lZxElS004OmB40R-dV0aAJtFKYotscbK2K6BPNSdNTPWWd_gBy8g8g)](https://mermaid-js.github.io/mermaid-live-editor/edit/#pako:eNqtlMty2zAMRX8Fw7XdD9Cuk6RZtjPOUhtIhCxMKdLDh1vH8r8X1MNRYsdJO9VKA4IHF5cEj6p2mlShyN8zbj12pS0tyPfDuz1r8l9DoBA6sjFA79brvocxRPrJow0c58xQQKmmtQCxJXh8-rYB10Agv-dagh1a3JKG6lCqG2X6ocy8dNdS_fMeI2b-nbMR2QZg2zjfYWRnofGuuwU8CdAd4TE2IXMCxf_D7EfmZmzv88zX1PM-OIm_rh-YQ3BwNGU3a8-RPOOAAJyFXevo3yhve_g7yrsNjSbdujC0R5MwChcjIOwc2yimQeSOPmxzwZ7WCzgDP2husXeRE14BRsScNqd8pi30dOW2X5DyVXf9O0JK5akxVMeBATor1xTE_-oqdbl71NgvrcvAioyzW5lO93Jsl6LGMV8Kye0ghFQJRyZ6Bb9arlvI4YZ_k16D86Jtz-KbXo8h71KkL7BLleHaHNa4RzZYGSFZDU2yWrrIvy8-AU5aYDeb-bbJs8uzzO9-i5afhwm7pTW0LhkpQucD1jcNnPCL2DzbFwU85VNiux1evZ3nDr2cjlqpjmT2WcsDe8ylSiUJ-V5njKYGk4lZxElS004OmB40R-dV0aAJtFKYotscbK2K6BPNSdNTPWWd_gBy8g8g)
-
-[editable source](https://mermaid-js.github.io/mermaid-live-editor/edit/#pako:eNqtlMty2zAMRX8Fw7XdD9Cuk6RZtjPOUhtIhCxMKdLDh1vH8r8X1MNRYsdJO9VKA4IHF5cEj6p2mlShyN8zbj12pS0tyPfDuz1r8l9DoBA6sjFA79brvocxRPrJow0c58xQQKmmtQCxJXh8-rYB10Agv-dagh1a3JKG6lCqG2X6ocy8dNdS_fMeI2b-nbMR2QZg2zjfYWRnofGuuwU8CdAd4TE2IXMCxf_D7EfmZmzv88zX1PM-OIm_rh-YQ3BwNGU3a8-RPOOAAJyFXevo3yhve_g7yrsNjSbdujC0R5MwChcjIOwc2yimQeSOPmxzwZ7WCzgDP2husXeRE14BRsScNqd8pi30dOW2X5DyVXf9O0JK5akxVMeBATor1xTE_-oqdbl71NgvrcvAioyzW5lO93Jsl6LGMV8Kye0ghFQJRyZ6Bb9arlvI4YZ_k16D86Jtz-KbXo8h71KkL7BLleHaHNa4RzZYGSFZDU2yWrrIvy8-AU5aYDeb-bbJs8uzzO9-i5afhwm7pTW0LhkpQucD1jcNnPCL2DzbFwU85VNiux1evZ3nDr2cjlqpjmT2WcsDe8ylSiUJ-V5njKYGk4lZxElS004OmB40R-dV0aAJtFKYotscbK2K6BPNSdNTPWWd_gBy8g8g)
-
-## Transit Technology Stacks
-
-| **Name**
*Key(s)*| **Description** |
-| :------------- | :-------------- |
-| `products`
*Primary Field*: `Name` | Each record is a product used in a transit technology stack at another organization (e.g. fixed-route scheduling software) or by riders (e.g. Google Maps). Products have properties such as input/output capabilities.
Products MAY:- *function as* one or more `stack components` |
-| `contracts`
*Primary Field*: `Name` | Each record is a contract between `organizations` to either provide one or more `products` or operate one or more `services`. Each contract has properties such as execution date, expiration, and renewals. |
-| `components` | Each component represents a single *lego* in a transit technology stack. It is part of a single `Component Group` (e.g. CAD/AVL) which are often bundled together, which in turn is part of a `Function Group` (e.g. Fare collection ,scheduling, etc) and is operated in physical space defined in `Location` (e.g. Cloud, Vehicle, etc.) |
-| `service-components` | Each record is an association between one or more `services`, a `product`, and one or more `components` which that product is serving as. |
-| `relationships service-components` | Each record is an one-way association between two `organization stack components` (`Component A` and `Component B`) using a `data schemas` and `Mechanism` (e.g. `auto-triggered pull`, `intra-product`, `human transaction`, etc.) |
-| `data schemas` | Each record indicates a data schema which can be used in one or more `relationships service-components`. |
-| `organizations` | Imported from [California Transit Base](#california-transit).|
-| `services` | Imported from [California Transit Base](#california-transit).|
-
-### Transit Stacks: Entity Relationship Diagram
-
-[![](https://mermaid.ink/img/pako:eNqdk7tuwzAMRX9F0JzH7jXp0ClF09ELITGyAFs0KClFG-ffS7_6SNu0iEbp3MtLSjppQxZ1oZG3HhxDUwYla8cOgn-F5ClEde6WSzqpPfLRGyxUqRsI4DCW-kecBnxDITGY1PP0HP4PV1Tb6_QDk80jHLGuB3jEp4wbaloKGJIoVqvuS3YfVY5oVSLV1hDWYy9rapEh4Vz3N6NPpcUoSlQF8bqoU-8bk0wa9cH9JbwYi-hapqO3ffaKKbvqo-9HrMcRVb79ZtZ1Q4rL_d50x975AEOcOJ4rMwNzulvNn4Adpht9pwlsIcHeVNhA73gfxSUENEmGkKOkLrVe6Aa5AW_lHZ9651InEchV9hKLB8j1UPMsaG6t3PKd9YlYFweoIy405ET7l2B0kTjjDE0_YqLOb7JEHuQ)](https://mermaid-js.github.io/mermaid-live-editor/edit/#pako:eNqdk7tuwzAMRX9F0JzH7jXp0ClF09ELITGyAFs0KClFG-ffS7_6SNu0iEbp3MtLSjppQxZ1oZG3HhxDUwYla8cOgn-F5ClEde6WSzqpPfLRGyxUqRsI4DCW-kecBnxDITGY1PP0HP4PV1Tb6_QDk80jHLGuB3jEp4wbaloKGJIoVqvuS3YfVY5oVSLV1hDWYy9rapEh4Vz3N6NPpcUoSlQF8bqoU-8bk0wa9cH9JbwYi-hapqO3ffaKKbvqo-9HrMcRVb79ZtZ1Q4rL_d50x975AEOcOJ4rMwNzulvNn4Adpht9pwlsIcHeVNhA73gfxSUENEmGkKOkLrVe6Aa5AW_lHZ9651InEchV9hKLB8j1UPMsaG6t3PKd9YlYFweoIy405ET7l2B0kTjjDE0_YqLOb7JEHuQ)
-
-[editable source](https://mermaid-js.github.io/mermaid-live-editor/edit/#pako:eNqdk7tuwzAMRX9F0JzH7jXp0ClF09ELITGyAFs0KClFG-ffS7_6SNu0iEbp3MtLSjppQxZ1oZG3HhxDUwYla8cOgn-F5ClEde6WSzqpPfLRGyxUqRsI4DCW-kecBnxDITGY1PP0HP4PV1Tb6_QDk80jHLGuB3jEp4wbaloKGJIoVqvuS3YfVY5oVSLV1hDWYy9rapEh4Vz3N6NPpcUoSlQF8bqoU-8bk0wa9cH9JbwYi-hapqO3ffaKKbvqo-9HrMcRVb79ZtZ1Q4rL_d50x975AEOcOJ4rMwNzulvNn4Adpht9pwlsIcHeVNhA73gfxSUENEmGkKOkLrVe6Aa5AW_lHZ9651InEchV9hKLB8j1UPMsaG6t3PKd9YlYFweoIy405ET7l2B0kTjjDE0_YqLOb7JEHuQ)
-
-## Dashboards
-
-## DAGs Maintenance
-
-You can find further information on DAGs maintenance for Transit Database data [on this page](dags-maintenance).
diff --git a/docs/transit_database/transitdatabase.md b/docs/transit_database/transitdatabase.md
new file mode 100644
index 0000000000..d004461d2f
--- /dev/null
+++ b/docs/transit_database/transitdatabase.md
@@ -0,0 +1,80 @@
+# Transit Database (Airtable)
+
+The Cal-ITP Airtable Transit Database stores key relationships about how transit services are organized and operated in California as well as how well they are performing. See Evan or post in the `#airtable-data` Slack channel to get a link and gain access.
+
+Important Airtable documentation is maintained elsewhere:
+
+* [Airtable Data Documentation Google Doc](https://docs.google.com/document/d/1KvlYRYB8cnyTOkT1Q0BbBmdQNguK_AMzhSV5ELXiZR4/edit#heading=h.u7y2eosf0i1d) - documentation of specific fields in Airtable
+* [California Transit Data - Operating Procedures Google Doc](https://docs.google.com/document/d/1IO8x9-31LjwmlBDH0Jri-uWI7Zygi_IPc9nqd7FPEQM/edit#) - outlines the processes by which Airtable data is maintained
+
+In addition, some documentation is available automatically within Airtable (these require Airtable authentication to access):
+* Airtable creates an API documentation page for each base (for example, [here is the page for California Transit](https://airtable.com/appPnJWrQ7ui4UmIl/api/docs)). This page provides technical information about field types and relationships. Airtable does not currently have an effective mechanism to programmatically download your data schema (they have paused issuing keys to their metadata API).
+* When looking at a base, there is an `Extensions` tab at the far upper right corner (below the share, notifications, and user icons). Clicking it opens an extensions sidebar, which contains an extension called `Base schema` (you may have to open it fullscreen to see it). This extension shows an auto-generated visualization of the technical relationships among fields in the base.
+
+Cal-ITP uses two main Airtable bases:
+
+| **Base** | **Description** |
+| :------------ | :-------------- |
+| [**California Transit**](#california-transit) | Defines key organizational relationships and properties. Covers organizations, geography, funding programs, transit services, service characteristics, transit datasets such as GTFS, and the intersection between transit datasets and services. |
+| [**Transit Technology Stacks**](#transit-technology-stacks) | Defines operational setups at transit provider organizations. Defines relationships between vendor organizations, transit provider and operator organizations, products, contracts to provide products, transit stack components, and how they relate to one another. |
+
+The rest of this page outlines stray technical considerations associated with Airtable and its ingestion into the data warehouse.
+
+## Primary Keys
+
+Airtable forces the use of the left-most field as the primary key of each table: the field that must be referenced in other tables, similar to a VLOOKUP in a spreadsheet. Unlike many databases, Airtable doesn't enforce uniqueness in the values of the primary key field. Instead, it assigns each record an underlying and mostly hidden unique [`RECORD ID`](https://support.airtable.com/hc/en-us/articles/360051564873-Record-ID), which can be exposed by creating a formula field to reference it.
+
+## Importing Airtable data into the Cal-ITP data warehouse
+
+We ingest data from Airtable into the Cal-ITP data warehouse. For an overview of the data ingest process/architecture, see [the pipeline architecture documentation](architecture-data). For pointers to where Airtable-specific code and artifacts live, see [the pipeline reference Google Sheet](https://docs.google.com/spreadsheets/d/1bv1K5lZMnq1eCSZRy3sPd3MgbdyghrMl4u8HvjNjWPw/edit#gid=0).
+
+To ingest a new Airtable table or base and make it available in the warehouse, you need to make updates throughout the data ingest flow, from the Airtable scraper Airflow DAG all the way to dbt mart tables. See [data infra PR #2781](https://github.com/cal-itp/data-infra/pull/2781) for an example of what this can look like. Ingesting new columns in an existing table is similar; see [data infra PR #2383](https://github.com/cal-itp/data-infra/pull/2383) for an example.
+
+### Gotchas
+Bringing Airtable data into the warehouse can involve a few tricky situations. Here are a few we've encountered so far, with suggested resolutions.
+
+#### Foreign keys and bridge tables
+Airtable allows users to define links between tables, to create relationships between records of different types. In the Airtable UI, these links display the primary field for the linked record in the relevant column (so, for example, the `Services.provider` column contains an organization's name like `City of Anaheim`). However, the Airtable API exports these foreign key links as an array of the back-end record IDs (so, instead of a single organization name like `City of Anaheim`, that `Services.provider` field will appear as an array containing a record ID, like `[rec0123asdf]`). The API returns an array even if the field only ever contains exactly one foreign key per record.
+
+This means:
+* All foreign keys need to be unpacked from arrays in the warehouse to become useful for joins. See below for more on this.
+* If a linked field is severed in Airtable (if the foreign key relationship is removed, but the columns that contained the links are not deleted), it can break our data ingest, because these array-type fields will become string-type fields. Ideally, delete any associated columns when a foreign key relationship/link is ended. If this is not done and the data ingest does break, the solution is to suppress the broken column from the associated table by removing it from the external table schema. If the external table uses schema auto-detect, you may have to define a schema for the table that does not include the broken column. See [data infra PR #2441](https://github.com/cal-itp/data-infra/pull/2441) for an example of this process (though it addresses a different issue).
+
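As an illustration of what "unpacking" a foreign key array means, here is a small Python sketch. It is not the warehouse implementation (which lives in dbt); the function name `unnest_foreign_keys`, the `id` and `provider` field names, and the record IDs are hypothetical, and it assumes that a record with no link simply lacks the field.

```python
def unnest_foreign_keys(rows, key_field, fk_field):
    """Turn one row per record (with an array of linked IDs) into one row per link."""
    pairs = []
    for row in rows:
        # Treat a missing or empty link field as "no linked records".
        linked_ids = row.get(fk_field) or []
        for linked_id in linked_ids:
            pairs.append({key_field: row[key_field], fk_field: linked_id})
    return pairs


# Hypothetical service records as they might arrive from the API.
services = [
    {"id": "recSVC1", "provider": ["recORG1"]},             # array with one entry
    {"id": "recSVC2", "provider": ["recORG1", "recORG2"]},  # many-to-many link
    {"id": "recSVC3"},                                      # no link at all
]

pairs = unnest_foreign_keys(services, "id", "provider")
for pair in pairs:
    print(pair)
```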
+Airtable foreign keys in the warehouse also require some special handling because:
+
+* Most Airtable data is treated as dimensions (i.e., entities that we version over time)
+* Some Airtable data contains many-to-many relationships
+
+The mechanism that we have used to deal with both of these is the **bridge table**, [described in our dbt docs](https://dbt-docs.calitp.org/#!/overview). The bridge table stores the foreign key pairs to allow you to traverse a relationship, instead of trying to store these on each of the tables in the relationship itself. Trying to store the foreign keys on the tables directly opens you up to issues:
+
+* You have to either store the foreign keys as an array or change the cardinality of the table (to account for the fact that one record may need to store multiple foreign keys, either to capture versioning on the foreign table or to capture relationships with multiple records). Metabase does not natively allow unnesting arrays to do joins in the GUI query editor, so we try to have non-array foreign keys in mart tables.
+* You risk infinite loops if you try to version a record that includes a versioned foreign key on both sides of the relationship (which is how Airtable stores these relationships). For example, you have an organization and a service that are linked, with both containing a foreign key to the other. An attribute is changed on the service, creating a new versioned key. You need to add that new versioned service key to the organization record. But now that has triggered a change on the organization record, which makes a new versioned key on the organization record. So now you have to update the organization versioned key on the service record. And thus to infinity. Another solution here is to only store the relationship on one side, but then you still have the first problem of arrays and cardinality.
+
+Bridge tables do introduce some complexity in handling fanout from joins, but they remove that complexity from the dimension tables themselves. Another solution would be to only store the unversioned natural key for the foreign key, in which case you would only need bridge tables for true many-to-many relationships (to handle the array/cardinality issue), but that would still create fanout without the explicit artifact of the bridge table to help troubleshoot.
+
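The idea behind a bridge table can be sketched in Python as follows. This is an illustration under stated assumptions, not the dbt logic used in the warehouse: the column names (`key`, `source_id`, `valid_from`, `valid_to`), the half-open validity ranges, and the function name `build_bridge` are all hypothetical. The point it shows is that a new version of a service adds a bridge row without forcing a new version of the organization.

```python
from datetime import date


def build_bridge(links, org_versions, service_versions):
    """Pair versioned keys for linked records wherever their validity ranges overlap."""
    bridge = []
    for org_id, service_id in links:
        for org in org_versions:
            if org["source_id"] != org_id:
                continue
            for svc in service_versions:
                if svc["source_id"] != service_id:
                    continue
                # The bridge row is valid only while both versions are valid.
                valid_from = max(org["valid_from"], svc["valid_from"])
                valid_to = min(org["valid_to"], svc["valid_to"])
                if valid_from < valid_to:
                    bridge.append({
                        "organization_key": org["key"],
                        "service_key": svc["key"],
                        "valid_from": valid_from,
                        "valid_to": valid_to,
                    })
    return bridge


FAR_FUTURE = date(2099, 1, 1)

# One organization version; the service changed on 2023-06-01, creating a second version.
org_versions = [
    {"key": "org-v1", "source_id": "recORG1",
     "valid_from": date(2023, 1, 1), "valid_to": FAR_FUTURE},
]
service_versions = [
    {"key": "svc-v1", "source_id": "recSVC1",
     "valid_from": date(2023, 1, 1), "valid_to": date(2023, 6, 1)},
    {"key": "svc-v2", "source_id": "recSVC1",
     "valid_from": date(2023, 6, 1), "valid_to": FAR_FUTURE},
]
links = [("recORG1", "recSVC1")]  # unversioned relationship from Airtable

bridge = build_bridge(links, org_versions, service_versions)
for row in bridge:
    print(row)
```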
+#### Synced tables
+Airtable allows you to "sync" a table from one base to another, where it appears with all the data from its source location and can be linked to records in the second base. An example in our Airtable is the `California Transit.organizations` table, which is synced to `Transit Technology Stacks.organizations`; you will see a little lightning icon to show that it is a synced table.
+
+This requires special handling when importing to the warehouse, because Airtable assigns new back-end record IDs in the synced table, which means that foreign keys to the synced table in the second base will not match record IDs in the source table. We resolve this by mapping all foreign keys to point to the source table in a base layer in dbt. See [data infra PR #2781](https://github.com/cal-itp/data-infra/pull/2781) for an example.
+
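The remapping step can be sketched in Python as follows. This is only an illustration: it assumes the synced and source records can be matched on a shared field (here a hypothetical `name` field with unique values), which may differ from how the dbt base layer actually matches them, and the function names and record IDs are made up for the example.

```python
def map_synced_ids_to_source(synced_records, source_records, match_field):
    """Build a lookup from synced-table record IDs to source-table record IDs."""
    source_id_by_value = {rec[match_field]: rec["id"] for rec in source_records}
    return {
        rec["id"]: source_id_by_value[rec[match_field]]
        for rec in synced_records
        if rec[match_field] in source_id_by_value
    }


def remap_foreign_keys(rows, fk_field, id_map):
    """Rewrite each foreign key array so it points at source-table record IDs."""
    return [
        # IDs without a known mapping are left unchanged.
        {**row, fk_field: [id_map.get(fk, fk) for fk in row.get(fk_field, [])]}
        for row in rows
    ]


# Hypothetical data: the same organization has different record IDs in each base.
source_orgs = [{"id": "recSRC1", "name": "City of Anaheim"}]
synced_orgs = [{"id": "recSYN9", "name": "City of Anaheim"}]
contracts = [{"id": "recCON1", "organization": ["recSYN9"]}]

id_map = map_synced_ids_to_source(synced_orgs, source_orgs, "name")
remapped = remap_foreign_keys(contracts, "organization", id_map)
print(remapped)
```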
+## Entity Relationship Diagrams
+
+The following entity relationship diagrams were last updated in 2022 but are preserved for general reference purposes.
+
+### California Transit
+
+[![](https://mermaid.ink/img/pako:eNqVVEtv4jAQ_iuWz0W9c1stbbWHbhFw5DLEEzJax07HDqss4b_vOCQQXlLLBSX6XuP5nL3OvEE91cgzgi1DuXZKfh-8BUf_IJJ36tBOJn6vlsg7ynCq1roEB1sMa_0ltK-QIT6C-w7-FvMwgwgBY6Jk3oW6_BalYm_q7HuUemMpFEfOY9b4XaJRUBUwuqj8GO3zu9btQxEwJTkKEZnc9sta7a3W-_zjebGa_1C5ZxULVJIO0sN3RCRQolYWnEt5oI6FZ4rNQ9VHw7bqp69dbN7QS6OqounlCwTzWQPLwGgUuUHnCq3atgs4t5DhhYbBkDFtMKiso0ws7tCKkoQqjwFG8V4l77KR4y0HxeuRoaosiVr05wL0vQ3D8kc9Pu4dIoMLFAer3qx2Rk5tzilu2V2C9oKcC-DUzQUZ5AV-1sRYpiLdml1mS6QXS1uSwsbm5HLDsvLvgtCwA5Pt93czX3ck_Y3oX6WLkTQYc4tZlBVtmsF7dHGG2e4wKS2mrJiCkM8VXkH4c_c8Bep3GJ7_ehaAPxXiFdG8Y2TKwtCoq5t7bkIqJqOVne5QiaLnCE7mO9v_Xs1-SUMGpXEJQtIqvDXinueUEdgEV0acujz6SZco3SIj38h90ltrcSxxrY8xcqhtTE4HgdaVEPHFUPSsp5FrfNJyjfyycdnwfMT0H1s9zcEGPPwHJNjt_A)](https://mermaid-js.github.io/mermaid-live-editor/edit/#pako:eNqVVEtv4jAQ_iuWz0W9c1stbbWHbhFw5DLEEzJax07HDqss4b_vOCQQXlLLBSX6XuP5nL3OvEE91cgzgi1DuXZKfh-8BUf_IJJ36tBOJn6vlsg7ynCq1roEB1sMa_0ltK-QIT6C-w7-FvMwgwgBY6Jk3oW6_BalYm_q7HuUemMpFEfOY9b4XaJRUBUwuqj8GO3zu9btQxEwJTkKEZnc9sta7a3W-_zjebGa_1C5ZxULVJIO0sN3RCRQolYWnEt5oI6FZ4rNQ9VHw7bqp69dbN7QS6OqounlCwTzWQPLwGgUuUHnCq3atgs4t5DhhYbBkDFtMKiso0ws7tCKkoQqjwFG8V4l77KR4y0HxeuRoaosiVr05wL0vQ3D8kc9Pu4dIoMLFAer3qx2Rk5tzilu2V2C9oKcC-DUzQUZ5AV-1sRYpiLdml1mS6QXS1uSwsbm5HLDsvLvgtCwA5Pt93czX3ck_Y3oX6WLkTQYc4tZlBVtmsF7dHGG2e4wKS2mrJiCkM8VXkH4c_c8Bep3GJ7_ehaAPxXiFdG8Y2TKwtCoq5t7bkIqJqOVne5QiaLnCE7mO9v_Xs1-SUMGpXEJQtIqvDXinueUEdgEV0acujz6SZco3SIj38h90ltrcSxxrY8xcqhtTE4HgdaVEPHFUPSsp5FrfNJyjfyycdnwfMT0H1s9zcEGPPwHJNjt_A)
+
+[editable source](https://mermaid-js.github.io/mermaid-live-editor/edit/#pako:eNqVVEtv4jAQ_iuWz0W9c1stbbWHbhFw5DLEEzJax07HDqss4b_vOCQQXlLLBSX6XuP5nL3OvEE91cgzgi1DuXZKfh-8BUf_IJJ36tBOJn6vlsg7ynCq1roEB1sMa_0ltK-QIT6C-w7-FvMwgwgBY6Jk3oW6_BalYm_q7HuUemMpFEfOY9b4XaJRUBUwuqj8GO3zu9btQxEwJTkKEZnc9sta7a3W-_zjebGa_1C5ZxULVJIO0sN3RCRQolYWnEt5oI6FZ4rNQ9VHw7bqp69dbN7QS6OqounlCwTzWQPLwGgUuUHnCq3atgs4t5DhhYbBkDFtMKiso0ws7tCKkoQqjwFG8V4l77KR4y0HxeuRoaosiVr05wL0vQ3D8kc9Pu4dIoMLFAer3qx2Rk5tzilu2V2C9oKcC-DUzQUZ5AV-1sRYpiLdml1mS6QXS1uSwsbm5HLDsvLvgtCwA5Pt93czX3ck_Y3oX6WLkTQYc4tZlBVtmsF7dHGG2e4wKS2mrJiCkM8VXkH4c_c8Bep3GJ7_ehaAPxXiFdG8Y2TKwtCoq5t7bkIqJqOVne5QiaLnCE7mO9v_Xs1-SUMGpXEJQtIqvDXinueUEdgEV0acujz6SZco3SIj38h90ltrcSxxrY8xcqhtTE4HgdaVEPHFUPSsp5FrfNJyjfyycdnwfMT0H1s9zcEGPPwHJNjt_A)
+
+### Transit Stacks
+
+[![](https://mermaid.ink/img/pako:eNqdk7tuwzAMRX9F0JzH7jXp0ClF09ELITGyAFs0KClFG-ffS7_6SNu0iEbp3MtLSjppQxZ1oZG3HhxDUwYla8cOgn-F5ClEde6WSzqpPfLRGyxUqRsI4DCW-kecBnxDITGY1PP0HP4PV1Tb6_QDk80jHLGuB3jEp4wbaloKGJIoVqvuS3YfVY5oVSLV1hDWYy9rapEh4Vz3N6NPpcUoSlQF8bqoU-8bk0wa9cH9JbwYi-hapqO3ffaKKbvqo-9HrMcRVb79ZtZ1Q4rL_d50x975AEOcOJ4rMwNzulvNn4Adpht9pwlsIcHeVNhA73gfxSUENEmGkKOkLrVe6Aa5AW_lHZ9651InEchV9hKLB8j1UPMsaG6t3PKd9YlYFweoIy405ET7l2B0kTjjDE0_YqLOb7JEHuQ)](https://mermaid-js.github.io/mermaid-live-editor/edit/#pako:eNqdk7tuwzAMRX9F0JzH7jXp0ClF09ELITGyAFs0KClFG-ffS7_6SNu0iEbp3MtLSjppQxZ1oZG3HhxDUwYla8cOgn-F5ClEde6WSzqpPfLRGyxUqRsI4DCW-kecBnxDITGY1PP0HP4PV1Tb6_QDk80jHLGuB3jEp4wbaloKGJIoVqvuS3YfVY5oVSLV1hDWYy9rapEh4Vz3N6NPpcUoSlQF8bqoU-8bk0wa9cH9JbwYi-hapqO3ffaKKbvqo-9HrMcRVb79ZtZ1Q4rL_d50x975AEOcOJ4rMwNzulvNn4Adpht9pwlsIcHeVNhA73gfxSUENEmGkKOkLrVe6Aa5AW_lHZ9651InEchV9hKLB8j1UPMsaG6t3PKd9YlYFweoIy405ET7l2B0kTjjDE0_YqLOb7JEHuQ)
+
+[editable source](https://mermaid-js.github.io/mermaid-live-editor/edit/#pako:eNqdk7tuwzAMRX9F0JzH7jXp0ClF09ELITGyAFs0KClFG-ffS7_6SNu0iEbp3MtLSjppQxZ1oZG3HhxDUwYla8cOgn-F5ClEde6WSzqpPfLRGyxUqRsI4DCW-kecBnxDITGY1PP0HP4PV1Tb6_QDk80jHLGuB3jEp4wbaloKGJIoVqvuS3YfVY5oVSLV1hDWYy9rapEh4Vz3N6NPpcUoSlQF8bqoU-8bk0wa9cH9JbwYi-hapqO3ffaKKbvqo-9HrMcRVb79ZtZ1Q4rL_d50x975AEOcOJ4rMwNzulvNn4Adpht9pwlsIcHeVNhA73gfxSUENEmGkKOkLrVe6Aa5AW_lHZ9651InEchV9hKLB8j1UPMsaG6t3PKd9YlYFweoIy405ET7l2B0kTjjDE0_YqLOb7JEHuQ)
+
+## Dashboards
+
+## DAGs Maintenance
+
+You can find further information on DAGs maintenance for Transit Database data [on this page](dags-maintenance).
diff --git a/docs/datasets_and_tables/overview.md b/docs/warehouse/navigating_dbt_docs.md
similarity index 81%
rename from docs/datasets_and_tables/overview.md
rename to docs/warehouse/navigating_dbt_docs.md
index 91a30f412a..bfd67d92bf 100644
--- a/docs/datasets_and_tables/overview.md
+++ b/docs/warehouse/navigating_dbt_docs.md
@@ -48,10 +48,3 @@ To examine the documentation for our tables from the `Project` perspective:
2. Within that list, select `models`
3. From here, file directories will appear below.
4. Select the directory of your choice. A dropdown list of tables will appear and you can select a table to view its documentation
-
-# Legacy documentation
-In general, the dbt docs should be the main source of all documentation for warehouse entities (sources, views, tables, "models", etc.) but the following pages contain some not-yet-migrated documentation.
-
-| page | description | datasets |
-| ---- | ----------- | -------- |
-| [Transit Database](./transitdatabase.md) | A representation of Cal-ITP's internal knowledge about our Transit Operators in CA and various pieces of National Transit Database statistics for ease of use | `airtable.*`, `staging.transit_database__*`, `transitstacks.*` |