Adds a column for the 2022 ntd ids #3364

Merged
merged 6 commits into from
Jun 4, 2024

Conversation

vevetron
Contributor

Description

Trying to make it easier to get NTD ids.

Based on cal-itp/data-analyses#1121: the only difference between our Airtable IDs and the NTD IDs is that the NTD ID is the last 5 digits, so I extract those algorithmically.

There are 27 orgs that are not on that list; I derive results for them regardless.

Type of change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Documentation

How has this been tested?

poetry run dbt run -s dim_organizations

SELECT
  *,
  IF(LENGTH(ntd_id) >= 10, SUBSTR(ntd_id, -5), ntd_id) AS ntd_id_2022
FROM cal-itp-data-infra.mart_transit_database.dim_organizations
WHERE _is_current IS TRUE
  AND ntd_id IS NOT NULL

Both ran fine. There is one org, basically CA DOT, whose NTD ID is 9R02; since it is shorter than 10 characters, the query leaves it unchanged.
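
For anyone who wants to sanity-check the rule outside BigQuery, here is a minimal Python sketch of the same logic as the `IF(LENGTH(ntd_id) >= 10, SUBSTR(ntd_id, -5), ntd_id)` expression above. The function name `derive_ntd_id_2022` is made up for illustration, and every sample value except 9R02 is a placeholder rather than a real NTD ID.

```python
from typing import Optional


def derive_ntd_id_2022(ntd_id: Optional[str]) -> Optional[str]:
    """Mirror the SQL rule: keep the last 5 characters of IDs with 10+ characters.

    Hypothetical helper; not part of the dbt model in this PR.
    """
    if ntd_id is None:
        # The test query filters these out with `ntd_id IS NOT NULL`.
        return None
    if len(ntd_id) >= 10:
        # Same as SUBSTR(ntd_id, -5): the final five characters.
        return ntd_id[-5:]
    # Short IDs such as 9R02 pass through untouched.
    return ntd_id


if __name__ == "__main__":
    # Placeholder inputs, except 9R02 which is mentioned in the PR description.
    for value in ["9R02", "0000012345", "12345"]:
        print(value, "->", derive_ntd_id_2022(value))
```

Note that this assumes, as the SQL does, that any ID with 10 or more characters follows the longer format; it does not validate that the result is numeric.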

Post-merge follow-ups

  • No action required

@vevetron vevetron marked this pull request as ready for review May 31, 2024 23:20
Member

@evansiroky evansiroky left a comment

Approving for now to get us going, with the understanding that we may end up needing a longer-term data modeling task to join across multiple years of NTD data some other way.

@vevetron vevetron merged commit 5c61b0f into main Jun 4, 2024
3 of 4 checks passed
@vevetron vevetron deleted the add_2022_ntd_id branch June 4, 2024 03:58
@tiffanychu90
Member

Follow-up notes and references:

  • add a seed table as a CSV
  • poetry run dbt seed --select "ntd_id_to_source_record_id"
  • would adding a state and/or UZA column to the table be useful for filtering?
  • https://docs.getdbt.com/docs/build/seeds
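
To make the seed idea above concrete, here is a rough Python sketch of what a CSV-backed lookup with an optional state filter could look like. Everything in it is an assumption for illustration: the column names (`ntd_id`, `source_record_id`, `state`), the rows, and the helper names are placeholders, not the contents of the actual `ntd_id_to_source_record_id` seed.

```python
import csv
import io
from typing import Dict, List, Optional

# Placeholder seed contents; columns and rows are hypothetical, not real data.
SEED_CSV = """ntd_id,source_record_id,state
00001,recAAAAAAAAAAAAAA,CA
00002,recBBBBBBBBBBBBBB,CA
00003,recCCCCCCCCCCCCCC,NV
"""


def load_seed(text: str) -> List[Dict[str, str]]:
    """Parse the seed CSV into a list of row dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def ntd_ids_by_record(
    rows: List[Dict[str, str]], state: Optional[str] = None
) -> Dict[str, str]:
    """Map source record id -> NTD id, optionally keeping only one state."""
    return {
        row["source_record_id"]: row["ntd_id"]
        for row in rows
        if state is None or row["state"] == state
    }


if __name__ == "__main__":
    rows = load_seed(SEED_CSV)
    print(ntd_ids_by_record(rows, state="CA"))
```

Compared with deriving the ID from string length, an explicit lookup like this keeps each mapping reviewable row by row, at the cost of maintaining the CSV.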

@vevetron
Contributor Author

Basically, this code was removed, since I felt it would be more useful to have the NTD IDs in the source Airtable, where they are more available throughout the pipelines and analysis tasks.
