2023 county-level delinquent debt overall and by auto loan, medical debt, and student loan debt created #434

Open
wants to merge 2 commits into
base: version2025


minglizhong
Collaborator

All 2023 results are in mobility-from-poverty\01_financial-well-being\output. The previously created 2021 city-level data and 2022 county-level data are in the same \output subfolder. The do file to review is mobility-from-poverty\01_financial-well-being\county-debt-coll-shares-2023.do, which is mostly copied from the previous do file (county-debt-coll-shares-2022.do) with some revisions.
Note that in the previous 2022 county-level data, mobility-from-poverty\01_financial-well-being\output\county-debt-coll-shares-2022.csv, the state and county variables are numeric (1, 2, 3...), and I don't know what these numeric values stand for. In the new 2023 county-level data, I temporarily kept the actual state and county names, so they are string variables. I need further guidance on whether to use numeric or string values and, if we'd like to merge the 2022 and 2023 county-level data, on how the numeric values correspond to the names.
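
If the numeric values in the 2022 file turn out to be FIPS codes (an assumption; they could instead be arbitrary category codes, in which case a lookup table from the original source would be needed), one way to make the two years mergeable is to store state and county as zero-padded strings: two digits for state and three for county. A minimal sketch of that normalization, in Python purely for illustration (the do file itself is Stata, and the function names here are hypothetical):

```python
def pad_fips(value, width):
    """Return a numeric code as a zero-padded string of the given width."""
    text = str(value).strip()
    if not text.isdigit():
        raise ValueError(f"not a numeric code: {value!r}")
    if len(text) > width:
        raise ValueError(f"code {value!r} is longer than {width} digits")
    return text.zfill(width)


def county_key(state, county):
    """Build a 5-character key: 2-digit state code + 3-digit county code."""
    return pad_fips(state, 2) + pad_fips(county, 3)


# Example: state 1, county 1 becomes the string key "01001".
print(county_key(1, 1))
```

Keeping the codes as strings preserves leading zeros, which are otherwise lost when a CSV column is read as numeric.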

… debt, and student loan debt were added to subfolder output. 2022 county-level and 2021 city-level data were created previously and in the same output subfolder. Note that county and state names in 2022 county-level csv file are numeric but are string variables with the actual state and county names in 2023 files. Further guidance is needed on whether actual state and county names are needed. Code to generate 2023 county-level data is county-debt-coll-shares-2023.do
@kmartinchek kmartinchek modified the milestone: z Dec 31, 2024
@kmartinchek
Collaborator

Great work! Some comments from my review of the draft PR (focusing just on the 2023 file). The file ran without errors, but I have several suggestions:

  1. IIRC, only final data will be stored in the repo, and directions for download from Box should be incorporated (see https://github.com/UI-Research/mobility-from-poverty/wiki/6_Raw-Data for details). I think you’ll need to edit this file to reflect this guidance and add details on the data. Alternatively, I also think you will be able to download the DiA data from the Data Catalog directly, but I would reach out to Sam Cressman to confirm.
  2. If you keep the direct file paths on lines 23, 72, and 112, add a note telling users to replace them. Alternatively, I recommend setting a macro for the first part of the file path so that it is relative, which can be done in a “file setup” section at the top of the do file.
  3. I would delete lines 25-31 and 149-160 if not needed (they are from prior do file).
  4. Add comments explaining the reason for the drops on lines 40, 81, and 121.
  5. I would confirm with Claudia/JR, but I was expecting each debt measure to be a separate variable (with a distinct name and its own quality flag) in one output file, e.g., share_debt_coll, share_debt_med, share_debt_auto, and share_debt_stud. This matters because debt in collections is a specific designation, and the other forms of debt don’t meet the same criterion: they are XX # of days past due rather than reflecting lenders acting on derogatory status. As of now, there is one output file per debt type (all using the same variable name), which doesn’t match the data expectations guidelines on the Wiki.
  6. I would look closely at the Wiki on data structure: the state/county variables in the current output are names rather than FIPS codes, though maybe this is okay with Claudia/JR. An overall file, separate from the subgroup file, should also be generated; currently it is not.
  7. The file has a lot of repetitive code. I think it would make sense to restructure it so that you first pull in the files and make the rename adjustments (which would be easier with distinct variable names for each form of debt), then run the updated files through a loop that does the remaining formatting and variable generation.
  8. The data quality flag needs a detailed explanation of the 3 flags in the DiA data (the n/a’s) so folks know why these are flagged as data quality 3. It is possible, though, that these should be flagged as missing because the main metric is NA (see: https://github.com/UI-Research/mobility-from-poverty/wiki/2_Data-Structure-and-Expectations%E2%80%AF#missing-values), so I would confirm this specifically with Claudia/JR.
  9. Also, should the final output have NA values as blank or as “NA”? I would confirm and adjust accordingly.
  10. IIRC, the final output should include all years of data (confirm with Claudia/JR), so you might need to append the 2018, 2022, and 2023 data in this do file or another.
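
To illustrate suggestions 5 and 7 together, here is a rough sketch of the "rename first, then loop" structure: each per-debt-type file is read in a loop, its generic share and quality columns are mapped to distinct variable names, and everything lands in one row per county. It is written in Python only to show the logic, not as a replacement for the Stata do file. The input column names (`year`, `state`, `county`, `share`, `quality`), the `_quality` suffix, and the debt-type keys are assumptions for the example; only the four `share_debt_*` names come from the review above.

```python
import csv
import io

# Distinct output variable names, one per debt type (names from the review).
# The dictionary keys are hypothetical labels for the four input files.
DEBT_VARS = {
    "overall": "share_debt_coll",
    "medical": "share_debt_med",
    "auto": "share_debt_auto",
    "student": "share_debt_stud",
}


def combine(files_by_type):
    """Merge per-debt-type CSV files into one row per (year, state, county)."""
    combined = {}
    for debt_type, handle in files_by_type.items():
        var = DEBT_VARS[debt_type]
        for row in csv.DictReader(handle):
            key = (row["year"], row["state"], row["county"])
            out = combined.setdefault(
                key,
                {"year": row["year"], "state": row["state"], "county": row["county"]},
            )
            # Each debt type gets its own metric column and its own quality flag.
            out[var] = row["share"]
            out[f"{var}_quality"] = row["quality"]
    return [combined[key] for key in sorted(combined)]


# Usage with two tiny in-memory files standing in for the real CSVs.
header = "year,state,county,share,quality\n"
files = {
    "overall": io.StringIO(header + "2023,01,001,0.25,1\n"),
    "medical": io.StringIO(header + "2023,01,001,0.10,2\n"),
}
for record in combine(files):
    print(record)
```

Because the per-type handling lives in one loop body, adding a debt type or another year means adding an entry or an input file rather than copying a block of code.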
