2023 county-level delinquent debt overall and by auto loan, medical debt, and student loan debt created #434

minglizhong · 2024-12-11T14:43:59Z

All 2023 results are in mobility-from-poverty\01_financial-well-being\output. The previously created 2021 city-level data and 2022 county-level data are in the same \output subfolder. The do file that needs review is mobility-from-poverty\01_financial-well-being\county-debt-coll-shares-2023.do, which was mostly copied from the previous do file (county-debt-coll-shares-2022.do) with some revisions.
Note that in the previous 2022 county-level data (mobility-from-poverty\01_financial-well-being\output\county-debt-coll-shares-2022.csv), the state and county variables are numeric (1, 2, 3...), and I don't know which names those numeric codes correspond to. In the new 2023 county-level data, I have kept the actual state and county names for now, so they are string variables. I need guidance on whether to use numeric or string values and, if we want to merge the 2022 and 2023 county-level data, on how the numeric codes map to the names.

… debt, and student loan debt were added to subfolder output. 2022 county-level and 2021 city-level data were created previously and in the same output subfolder. Note that county and state names in 2022 county-level csv file are numeric but are string variables with the actual state and county names in 2023 files. Further guidance is needed on whether actual state and county names are needed. Code to generate 2023 county-level data is county-debt-coll-shares-2023.do

…do file that needs to be reviewed.

kmartinchek · 2024-12-31T15:24:44Z

Great work! Some comments from my review of the draft PR (focusing just on the 2023 file). The file ran without errors, but I have several suggestions:

IIRC, only final data will be stored in the repo, and directions for download from Box should be incorporated (see https://github.com/UI-Research/mobility-from-poverty/wiki/6_Raw-Data for details). I think you’ll need to edit this file to reflect this guidance and add details on the data. Alternatively, I also think you will be able to download the DiA data from the Data Catalog directly, but I would reach out to Sam Cressman to confirm.
If you keep the direct file paths on lines 23, 72, and 112, you’ll need to add a note telling users to replace them. Alternatively, I recommend setting a macro for the first part of the file path so that the rest is relative, which can be done in a “file setup” section at the top of the do file.
I would delete lines 25-31 and 149-160 if they are not needed (they are from the prior do file).
Add a comment explaining the reason for the drop on lines 40, 81, and 121.
I would confirm with Claudia/JR, but I was expecting each debt type to be a separate variable (with a distinct name and its own quality flag) in one output file, e.g., share_debt_coll, share_debt_med, share_debt_auto, and share_debt_stud. The reason is that debt in collections is a specific designation, and the other forms of debt don’t meet the same criterion: they are XX # of days past due rather than reflecting lenders acting on derogatory status. As of now, the output is set up as one file per debt type (each using the same variable name), which doesn’t match the data expectations guidelines on the Wiki.
I would look closely at the Wiki on data structure, as the state/county variables in the current output are names rather than FIPS codes, though maybe this is okay with Claudia/JR. There should also be an overall file separate from the subgroup file; it has not been generated yet.
The file has a lot of repetitive code. I think it could make sense to make changes to the code so that you are pulling in the files and making the rename adjustments first (which could be made easier if you are using distinct variable names for each form of debt), then pulling the updated files into a loop where you do the other formatting and variable generation.
The data quality flag needs a detailed explanation of the 3 flags in DiA data (the n/a’s) so folks know why these are flagged as data quality 3 – but it is possible that these should be flagged as missing because the main metric is NA (see: https://github.com/UI-Research/mobility-from-poverty/wiki/2_Data-Structure-and-Expectations%E2%80%AF#missing-values) , although I would confirm this specifically with Claudia/JR.
Also, should the final output have NA values as blank or as “NA”? I would confirm and adjust accordingly.
IIRC, the final output should include all years of data (confirm with Claudia/JR)—so you might need to append the 2018, 2022, and 2023 data in this do file or another.
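
To make the file-setup macro and the "import and rename first, then loop" suggestions above more concrete, here is a rough Stata sketch. It is only an illustration under stated assumptions: the root path, the raw file names (county_coll_2023.csv and so on), the raw column names (state, county, share), and the "n/a" marker are hypothetical placeholders, not taken from the actual do file. The quality values 3 and 1 simply mirror the current flagging and would change if Claudia/JR prefer that these rows be treated as missing.

```stata
* Sketch only: paths, file names, and column names are hypothetical placeholders.

* ---- File setup: users edit these two lines, everything below is relative ----
global raw "C:/Users/username/Box/raw-data"          // hypothetical Box folder
global out "01_financial-well-being/output"          // relative to the repo root

* ---- Step 1: import each debt file and give its metric a distinct name ----
local types coll med auto stud
foreach t of local types {
    import delimited "${raw}/county_`t'_2023.csv", clear varnames(1) stringcols(_all)
    rename share share_debt_`t'
    keep state county share_debt_`t'
    tempfile f_`t'
    save "`f_`t''"
}

* ---- Step 2: combine the four files into one county-level file ----
use "`f_coll'", clear
foreach t in med auto stud {
    merge 1:1 state county using "`f_`t''", nogenerate
}

* ---- Step 3: shared formatting, done once per metric inside a loop ----
foreach t of local types {
    * Flag values first, while "n/a" is still visible as text
    * (3 and 1 are placeholders pending guidance on the quality flag)
    generate byte share_debt_`t'_quality = cond(share_debt_`t' == "n/a", 3, 1)
    replace share_debt_`t' = "" if share_debt_`t' == "n/a"
    destring share_debt_`t', replace
}

generate int year = 2023
order year state county
export delimited using "${out}/county-debt-shares-2023.csv", replace
```

With this layout, only the two global lines change from user to user, and adding another debt type means adding one entry to the types list rather than copying a block of code.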

minglizhong added 2 commits December 11, 2024 09:33

Changed the do file name to -2023 to minimize confusion. This is the …

c996320

…do file that needs to be reviewed.

awunderground requested a review from kmartinchek December 12, 2024 20:28

kmartinchek modified the milestone: z Dec 31, 2024

cdsolari assigned kmartinchek Jan 16, 2025
