jwalsh28 edited this page Dec 9, 2024 · 26 revisions

Welcome to the mobility-from-poverty wiki!

This wiki provides resources and instructions for metric leads and technical reviewers on the mobility from poverty data project. See the menu to the right to navigate to different topics.

Tracker of all metric updates

For an overview of our planned updates for this upcoming round (version2025), see our predictor/metric-specific tracker HERE. This file tracks upcoming data releases, subgroup and methodological updates, metric lead assignments, and the priority tasks for each metric.

2025 Metric Update Milestones

| Task milestone | Deadline |
| --- | --- |
| New branch created, titled to match the corresponding issue | Mid/late Oct |
| “Final data expectations form” started (admin will fill in and confirm with metric leads) | Late Oct |
| If programming in R: R Quarto document created and labeled appropriately at the top (should include metric reviewer name, date, and a note on the purpose of the update). If programming in Stata or SAS: do-file or SAS program file created with a notation at the top giving the author's name, the date, and a note explaining the purpose of the update | Early Nov |
| Program updates started for all assigned metrics, with any new code written. If bringing in new or updated data: code is written and executed that reads the data from an API. If an API is not available, raw data is downloaded to the 2025 data folder in Box, in a folder with the metric name | Mid Nov |
| Initial pull request (set to draft) created for all assigned metrics, with requests for specific review from the technical reviewer (if needed) | Mid Dec |
| Code completed and all final metric data files output (e.g., county file, subgroups, city file, all years) | Early/mid Jan |
| “Final data expectations” function test run | Mid Feb |
| All tasks in the Git issue template checklist completed and “final data expectations” test passed. Completed metric update committed and review requested via the previously opened PR | End of Feb |
| Final pull request approved and merged into version2025 | End of Feb |

Mobility Metric Update Checklist

The checklist below outlines some of the key points every metric lead should finalize before completing a metric update. We encourage metric leads to follow this list as they work through their update.

Setup

  • Metric lead has cloned the Version2025 repository into a folder accessible to them (we recommend the C: drive)
  • Metric lead has read through their assigned issue and has a firm understanding of what the expectation is for this update
  • Metric lead has checked out a new branch from the Version2025 repo that is named after the relevant issue, i.e. iss###
  • Metric lead has filled out the final data expectations form located in the functions folder of the repo and saved this form in the metric's data folder for all relevant final output files
  • Metric lead has read through the existing version of the program and has located and reviewed the existing output files

Program Documentation

  • The update program includes a description at the start with the date, the latest changes made, and the name of the metric lead who made them
  • If the program reads in raw data that is not available through an API, then the code includes a note on where this data is in Box (including the title of relevant files)
  • Each step taken in the calculation is clearly documented in the code using comments
  • The program is broken out into manageable steps, and the code avoids excessively long chains connected via pipes (or pipe equivalents if not using R)
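
As a rough illustration of the documentation items above, the R sketch below opens with a header block and then breaks a calculation into short, commented steps instead of one long pipe chain. The file name, column names, and toy data are hypothetical placeholders and are not taken from any actual metric program.

```r
# Program:  example_metric.R (hypothetical file name)
# Author:   <metric lead name>
# Date:     <date of latest change>
# Purpose:  <latest changes made in this update>
# Raw data: <Box folder and file title, if the data is not read from an API>

# Step 1: start from the raw counts (toy data for illustration only).
raw <- data.frame(
  year        = c(2020, 2020, 2021, 2021),
  state       = c("01", "02", "01", "02"),
  numerator   = c(5, 8, 7, 3),
  denominator = c(100, 200, 140, NA)
)

# Step 2: keep only rows with a usable (non-missing, positive) denominator.
valid <- raw[!is.na(raw$denominator) & raw$denominator > 0, ]

# Step 3: compute the metric as a share of the denominator.
valid$rate <- valid$numerator / valid$denominator
```

Keeping each step as its own named object makes it easier for a technical reviewer to inspect intermediate results.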

Quality Control

  • The program includes visuals of the distribution of key analysis variables throughout the calculation steps
  • The program includes visualizations of the final data (histograms, scatter plots, etc.)
  • The program includes summary statistics and a selection of assumption tests of the final data (including count of rows by year, missing values, etc.)
  • The program includes the creation of a quality variable for the metric and documents the method for assigning quality grades
  • NA values are consistently applied such that there are no cases where a metric value is missing but the quality flag is filled in and vice versa
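
A minimal base R sketch of two of the checks above: counting rows by year and missing values, and confirming that NA values line up between the metric and its quality flag. The column names (metric_value, metric_quality) and the toy data are assumptions for illustration; actual metric files will use their own names.

```r
# Toy final data; column names are hypothetical.
final <- data.frame(
  year           = c(2020, 2020, 2021, 2021),
  metric_value   = c(0.12, NA, 0.15, 0.20),
  metric_quality = c(1, NA, 2, NA)
)

# Summary checks: number of rows per year and missing values per column.
rows_by_year   <- table(final$year)
missing_by_col <- colSums(is.na(final))

# A row is inconsistent when exactly one of value / quality is NA.
inconsistent <- xor(is.na(final$metric_value), is.na(final$metric_quality))

# Show the offending rows so they can be fixed before review.
final[inconsistent, ]
```

In this toy data the last row is flagged because it has a metric value but no quality grade. A program could also end with `stopifnot(!any(inconsistent))` so that the run halts if any mismatch remains.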

Reproducibility

  • The program runs from start to finish without stopping due to errors or incompleteness
  • The program avoids hardcoding local file paths and instead uses project-relative paths that will work regardless of where the program is being run (e.g., here::here() for R users)
  • The program includes a “House Keeping” section which loads all necessary packages at the top of the program
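
For R users, the housekeeping and path items above might look like the sketch below. It assumes the here package is installed; the folder and file names are hypothetical and only show the pattern.

```r
# ---- Housekeeping ----
# Load every package the program needs, once, at the top.
library(here)

# Build paths from the project root instead of hardcoding a local path
# such as "C:/Users/<name>/...". Folder and file names are hypothetical.
out_dir  <- here::here("example_metric", "data")
out_file <- file.path(out_dir, "example_metric_county_2020.csv")
```

Because here::here() resolves paths from the project root, the same program can run unchanged from another clone of the repository.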

Final Data

  • The program writes out its final data as one or more CSV files into a data folder in the relevant metric folder
  • Final file names include the relevant year if the metric has multiple files separated by year (example: neonatal_health_subgroup_2020.csv)
  • All final files written out by the update program are put through the evaluate final data function
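
One way to write year-specific final files, sketched in base R. The metric name, columns, and data are hypothetical, and the demo writes to a temporary folder; an update program would instead write to the data folder inside the relevant metric folder. The evaluate final data function is not called here because its interface is not described on this page.

```r
# Toy subgroup data; the metric name and columns are hypothetical.
final <- data.frame(
  year         = c(2020, 2020, 2021),
  state        = c("01", "02", "01"),
  metric_value = c(0.12, 0.10, 0.15)
)

# Demo only: write to a temporary folder.
out_dir <- file.path(tempdir(), "example_metric", "data")
dir.create(out_dir, recursive = TRUE, showWarnings = FALSE)

# Write one CSV per year, with the year in the file name.
written <- character(0)
for (yr in sort(unique(final$year))) {
  out_file <- file.path(out_dir, paste0("example_metric_subgroup_", yr, ".csv"))
  write.csv(final[final$year == yr, ], out_file, row.names = FALSE)
  written <- c(written, out_file)
}

# List the files that were written.
basename(written)
```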

Review

  • When ready for review, the metric lead has submitted a PR to Version2025 using the PR template