Schisto SchistoInfectionWormBurdenEvent pandas style
Asif Tamuri edited this page Mar 10, 2020 · 7 revisions
Here I try to rewrite the code for SchistoInfectionWormBurdenEvent using more conventional pandas operations. The idea is to work on the entire dataset (or large parts of it) in one go, rather than looping over the data and handling small subsets one at a time.
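As a small illustration of the difference, the sketch below computes a per-district mean worm burden twice: once by filtering inside a loop, and once with a single groupby. The toy data is invented for the example; only the column names come from this page.

```python
import pandas as pd

# Toy population: column names mirror those used on this page,
# but the values are made up for illustration.
df = pd.DataFrame({
    'district_of_residence': ['Balaka', 'Balaka', 'Zomba', 'Zomba', 'Zomba'],
    'sh_aggregate_worm_burden': [0, 4, 2, 6, 10],
})

# Looping style: filter the frame once per district.
looped = {}
for district in df['district_of_residence'].unique():
    subset = df.loc[df['district_of_residence'] == district]
    looped[district] = subset['sh_aggregate_worm_burden'].mean()
looped = pd.Series(looped)

# Pandas style: one groupby over the whole frame.
grouped = df.groupby('district_of_residence')['sh_aggregate_worm_burden'].mean()

print(grouped)
```

Both versions give the same numbers, but the groupby makes a single pass over the data and leaves the looping to pandas.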
I'm assuming the following parameters are set up somewhere:
import numpy as np
import pandas as pd

beta_by_age_group = pd.Series([0.3, 1.0, 0.05], index=['PSAC', 'SAC', 'Adults'])
beta_by_age_group.index.name = 'age_group'
R0 = pd.Series({'Balaka': 1.124254835,
'Blantyre': 1.129557175,
'Blantyre City': 1.125299162,
'Chikwawa': 1.124802165,
'Chiradzulu': 1.141412855,
'Chitipa': 1.125768303,
'Dedza': 1.124234464,
'Dowa': 0.0,
'Karonga': 1.124209139,
'Kasungu': 1.12470922,
'Likoma': 1.131439355,
'Lilongwe': 1.127042051,
'Lilongwe City': 1.1243717,
'Machinga': 1.124287956,
'Mangochi': 1.12420876,
'Mchinji': 1.124478478,
'Mulanje': 1.144959393,
'Mwanza': 1.125061605,
'Mzimba': 1.12535666,
'Mzuzu City': 1.130178098,
'Neno': 1.126314283,
'Nkhata Bay': 1.13232802,
'Nkhotakota': 1.133205764,
'Nsanje': 1.13558784,
'Ntcheu': 1.124211331,
'Ntchisi': 1.124331529,
'Phalombe': 1.184100577,
'Rumphi': 0.0,
'Salima': 1.12872694,
'Thyolo': 1.124214465,
'Zomba': 1.126197481,
'Zomba City': 1.126197481})
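The index labels of beta_by_age_group match the age-band labels that the code on this page builds with pd.cut. As a quick check of how the two fit together, here is a minimal sketch with invented ages. The cast to float after the map is my own precaution: mapping a categorical Series can return another categorical, which does not support arithmetic.

```python
import pandas as pd

beta_by_age_group = pd.Series([0.3, 1.0, 0.05], index=['PSAC', 'SAC', 'Adults'])
beta_by_age_group.index.name = 'age_group'

# Invented ages chosen to sit on either side of each bin edge.
ages = pd.Series([0, 4, 5, 14, 15, 60], name='age_years')

# Same bins and labels as used on this page; include_lowest=True keeps age 0 in PSAC.
age_group = pd.cut(ages, [0, 4, 14, 120], labels=['PSAC', 'SAC', 'Adults'], include_lowest=True)

# Look up each person's contact rate and make sure the result is plain float.
contact_rates = age_group.map(beta_by_age_group).astype(float)

print(pd.DataFrame({'age_years': ages, 'age_group': age_group, 'beta': contact_rates}))
```

Ages 0 to 4 fall into PSAC, 5 to 14 into SAC, and 15 to 120 into Adults, because pd.cut intervals are closed on the right by default.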
The following code then does the work of the event's apply method:
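df = population.props
# Restrict every calculation to people who are alive
where = df.is_alive
# Age bands by age in years: PSAC 0-4, SAC 5-14, Adults 15-120
age_group = pd.cut(df.loc[where, 'age_years'], [0, 4, 14, 120], labels=['PSAC', 'SAC', 'Adults'], include_lowest=True)
age_group.name = 'age_group'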
# Mean worm burden and head count for each (district, age group) pair
mean_count_burden_district_age_group = df.loc[where].groupby(['district_of_residence', age_group])['sh_aggregate_worm_burden'].agg(['mean', 'size'])
# Number of living people in each district
district_count = df.loc[where].groupby('district_of_residence')['district_of_residence'].count()
# Weight each age group's mean burden by its contact rate and by its share of the district population
beta_contribution_to_reservoir = mean_count_burden_district_age_group['mean'] * beta_by_age_group
to_get_weighted_mean = mean_count_burden_district_age_group['size'] / district_count
age_worm_burden = beta_contribution_to_reservoir * to_get_weighted_mean
# Sum over age groups to get one reservoir value per district
reservoir = age_worm_burden.groupby(['district_of_residence']).sum()
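# Per-person contact rate; mapping a categorical can return a categorical, so cast to float before doing arithmetic
contact_rates = age_group.map(beta_by_age_group).astype(float)
harbouring_rates = df.loc[where, 'sh_harbouring_rate']
rates = harbouring_rates * contact_rates
worms_total = reservoir * R0
# Draw the number of new worms for each person from a Poisson distribution
draw_worms = pd.Series(np.random.poisson(df.loc[where, 'district_of_residence'].map(worms_total).astype(float) * rates), index=df.index[where])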
param_worm_fecundity = 0.005  # params['worms_fecundity']
# New worms establish with probability exp(-param_worm_fecundity * current worm burden)
established = np.random.random(size=where.sum()) < np.exp(df.loc[where, 'sh_aggregate_worm_burden'] * -param_worm_fecundity)
to_establish = pd.DataFrame({'new_worms': draw_worms[(draw_worms > 0) & established]})
sim_date = pd.Timestamp.now()  # <-- a dummy bit of code for testing
# Maturation date is 30 to 54 days ahead (np.random.randint excludes the upper bound)
to_establish['date_maturation'] = sim_date + pd.to_timedelta(np.random.randint(30, 55, size=len(to_establish)), unit='D')
for index, row in to_establish.iterrows():
    self.sim.schedule(SchistoMatureWorms(self.module, person_id=index, new_worms=row.new_worms), row.date_maturation)
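To make the reservoir calculation easier to check by hand, here is a minimal sketch on an invented five-person population. It is not the model code. The data is made up, the age groups are plain strings rather than the output of pd.cut, and the level-wise alignment is spelled out with the `level=` argument of Series.mul and Series.div instead of relying on implicit alignment. It also compares the result against a shortcut that I believe is equivalent when every person belongs to exactly one age group: the per-district mean of (worm burden × contact rate).

```python
import numpy as np
import pandas as pd

beta_by_age_group = pd.Series(
    [0.3, 1.0, 0.05],
    index=pd.Index(['PSAC', 'SAC', 'Adults'], name='age_group'),
)

# Invented toy population (assumption: everyone here is alive).
df = pd.DataFrame({
    'district_of_residence': ['Balaka', 'Balaka', 'Balaka', 'Zomba', 'Zomba'],
    'age_group': ['PSAC', 'SAC', 'SAC', 'SAC', 'Adults'],
    'sh_aggregate_worm_burden': [10, 20, 40, 5, 100],
})

# Mean burden and head count per (district, age group).
stats = (df.groupby(['district_of_residence', 'age_group'])['sh_aggregate_worm_burden']
           .agg(['mean', 'size']))
district_count = df.groupby('district_of_residence')['age_group'].count()

# Broadcast beta across the age_group level and the district
# head count across the district_of_residence level.
beta_contribution = stats['mean'].mul(beta_by_age_group, level='age_group')
group_share = stats['size'].div(district_count, level='district_of_residence')
reservoir = (beta_contribution * group_share).groupby('district_of_residence').sum()

# Cross-check: per-district mean of burden * contact rate.
per_person = df['sh_aggregate_worm_burden'] * df['age_group'].map(beta_by_age_group)
shortcut = per_person.groupby(df['district_of_residence']).mean()

print(reservoir)
print(shortcut)
```

For Balaka the weighted form gives 10 × 0.3 × 1/3 + 30 × 1.0 × 2/3 = 21, and the shortcut gives (3 + 20 + 40) / 3 = 21. For Zomba both give 5. If the shortcut holds for the real data too, it could replace the two intermediate tables, but I have not checked that against the model.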