Data Quality Comparison #152

gwenbeebe · 2021-04-08T01:57:22Z

At 585, we're calculating the data quality flags with this chunk:

data_quality_flags_detail <- pe_validation_summary %>%
  left_join(dq_flags_staging, by = "AltProjectName") %>%
  mutate(General_DQ = if_else(GeneralFlagTotal/ClientsServed >= .02, 1, 0),
         Benefits_DQ = if_else(BenefitsFlagTotal/AdultsEntered >= .02, 1, 0),
         Income_DQ = if_else(IncomeFlagTotal/AdultsEntered >= .02, 1, 0),
         LoTH_DQ = if_else(LoTHFlagTotal/HoHsServed >= .02, 1, 0))

but the dq_flags_staging dataframe doesn't have the flags filtered by when folks entered. If we create dq_flags_staging but restrict the benefits and income flags to clients entering the program in that time period, we end up with fewer programs in the detail dataframe.

I think that means we can have entries in the numerator that aren't necessarily included in the denominator. Is that what we want? It seems like we might want to restrict each flag type to the entries we're treating as relevant to the time period for that flag type.
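To make the concern concrete, here is a small sketch with made-up data. The data frames `entries` and `income_flags`, the `EntryDate` column, and the `period_start` date are all hypothetical and not taken from the script; only `AltProjectName`, `IncomeFlagTotal`, and `AdultsEntered` mirror names used in the chunk above.

```r
library(dplyr)

# Hypothetical reporting period start; the real script defines its own dates.
period_start <- as.Date("2020-01-01")

# Toy enrollments: project A has one adult who entered before the period
# and one who entered during it.
entries <- data.frame(
  AltProjectName = c("A", "A"),
  EnrollmentID   = c(1, 2),
  EntryDate      = as.Date(c("2019-06-01", "2020-03-01"))
)

# Toy income flags: both enrollments are flagged.
income_flags <- data.frame(
  AltProjectName = c("A", "A"),
  EnrollmentID   = c(1, 2)
)

# Denominator: only adults who entered during the period.
adults_entered <- entries %>%
  filter(EntryDate >= period_start) %>%
  count(AltProjectName, name = "AdultsEntered")

# Numerator as described above: flags are not filtered by entry date.
unfiltered <- income_flags %>%
  count(AltProjectName, name = "IncomeFlagTotal") %>%
  left_join(adults_entered, by = "AltProjectName") %>%
  mutate(IncomeRate = IncomeFlagTotal / AdultsEntered)

# IncomeRate is 2 / 1 = 2, because the numerator counts an enrollment
# that the denominator excludes.
```

In this toy case the rate exceeds 1, which can only happen when flagged entries fall outside the population being counted.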


kiadso · 2021-04-08T15:22:23Z

I agree with you. I think I realized this last year but didn't have time to correct it, and it felt like a low priority. I'm going to leave this open: if we're able to fix it this year, good; otherwise it's something we can fix for next year.

gwenbeebe · 2021-04-11T23:24:21Z

Would it cause problems elsewhere if we added entry exit IDs to the list of variables we keep in the data quality script? If not, this should be a relatively easy fix and I can knock it out!

kiadso · 2021-07-16T12:51:23Z

I think it should be ok to add EnrollmentID to that. Sorry I did not see this question until today!
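One way the fix could look once EnrollmentID is kept, as a rough sketch only: the data frames `entries_in_period` (the enrollments counted in the denominator) and `income_flags` (one row per flagged enrollment) are hypothetical names with toy data, not objects from the script.

```r
library(dplyr)

# Hypothetical: enrollments that count toward AdultsEntered for the period.
entries_in_period <- data.frame(
  AltProjectName = c("A", "B"),
  EnrollmentID   = c(2, 3)
)

# Hypothetical: one row per income flag, with EnrollmentID retained.
income_flags <- data.frame(
  AltProjectName = c("A", "A", "B"),
  EnrollmentID   = c(1, 2, 3)
)

# Keep only flags whose enrollment is also counted in the denominator.
income_flags_in_period <- income_flags %>%
  semi_join(entries_in_period, by = c("AltProjectName", "EnrollmentID"))

# Recompute the per-project numerator from the restricted flags.
income_flag_totals <- income_flags_in_period %>%
  count(AltProjectName, name = "IncomeFlagTotal")
```

A project whose flags all fall outside the period will not appear in `income_flag_totals`, so a later join may need to treat a missing total as zero.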
