Exclude private datasets from 6 geoportal datasets, GTFS digest #1223

Merged
merged 6 commits into main from private-datasets
Sep 11, 2024

Conversation

tiffanychu90
Member

@tiffanychu90 commented Sep 11, 2024

  • Add a function in gtfs_utils_v2 that returns the public datasets as either a list or a df. The list is usually preferred, because it gives us the schedule_gtfs_dataset_keys we can keep.
    • Add a function in publish_utils to filter a df down to public datasets. GTFS digest sometimes lacks schedule_gtfs_dataset_key, so there we filter on schedule_gtfs_dataset_name instead.
  • high quality transit areas and open data: filter in the last steps before exporting
  • speeds: filter in publish_open_data via parquet filtering and publish_public_gcs
  • GTFS digest: filter in the merge_* scripts, so that all the dates are concatenated first and private datasets are dropped just before exporting. This should allow the portfolio yaml to build and the report to run as usual.
  • Run Sep monthly publishing and double check that results are as expected (exclude Big Blue Bus Swiftly Schedule)
  • Research Request - Suppress private datasets from being published #1220
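
A minimal sketch of the filtering pattern described in the list above, not the actual code from this PR. The function names `get_public_datasets` and `exclude_private_datasets`, the boolean `is_private` column, and the sample operators are all hypothetical. Only `schedule_gtfs_dataset_key` and `schedule_gtfs_dataset_name` come from the description. It shows the GTFS digest ordering: concatenate all dates first, then drop private datasets right before export, filtering on the name because the key is absent.

```python
import pandas as pd

KEY_COL = "schedule_gtfs_dataset_key"
NAME_COL = "schedule_gtfs_dataset_name"


def get_public_datasets(datasets: pd.DataFrame, get_df: bool = False):
    """Return public datasets as a df, or only their keys as a list.

    Assumes a boolean `is_private` column (hypothetical name).
    """
    public = datasets.loc[~datasets["is_private"], [KEY_COL, NAME_COL]]
    public = public.drop_duplicates().reset_index(drop=True)
    return public if get_df else public[KEY_COL].tolist()


def exclude_private_datasets(
    df: pd.DataFrame, public: pd.DataFrame, col: str = KEY_COL
) -> pd.DataFrame:
    """Keep only the rows whose `col` value belongs to a public dataset."""
    return df[df[col].isin(public[col])].reset_index(drop=True)


# Made-up dataset table: Operator B is private.
datasets = pd.DataFrame({
    KEY_COL: ["k1", "k2", "k3"],
    NAME_COL: ["Operator A Schedule", "Operator B Schedule", "Operator C Schedule"],
    "is_private": [False, True, False],
})
public = get_public_datasets(datasets, get_df=True)
public_keys = get_public_datasets(datasets)

# Digest-style tables, one per date, that only carry the dataset name.
by_date = [
    pd.DataFrame({
        NAME_COL: ["Operator A Schedule", "Operator B Schedule"],
        "service_date": "2024-08-14",
        "n_trips": [10, 20],
    }),
    pd.DataFrame({
        NAME_COL: ["Operator B Schedule", "Operator C Schedule"],
        "service_date": "2024-09-18",
        "n_trips": [30, 40],
    }),
]

# Concatenate every date first, then drop private rows just before export.
digest = pd.concat(by_date, ignore_index=True)
digest_public = exclude_private_datasets(digest, public, col=NAME_COL)
```

Filtering once after concatenation keeps the per-date intermediate tables untouched, so only the exported table depends on the current list of public datasets.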

TODO related to GTFS Digest:
Some redundancies were noticed while updating the merge scripts. Wherever references are shared, functions were adapted to take the additional uses.

  • This affects service hours. Having weekday / weekend columns in addition to rows holding those values is confusing, since the column values mix normalized (per day) and non-normalized (sums) figures. Also, month was changed to month_year, because month typically holds month values (1-12 or Jan-Dec).
  • Need to rerun all the merge_* scripts to create new concatenated digest tables.

@tiffanychu90 merged commit aaa8816 into main Sep 11, 2024
2 checks passed
@tiffanychu90 deleted the private-datasets branch September 11, 2024 21:53