Skip to content

Commit 068ee0c

Browse files
tbhalletttdm32marghe-molaro
authored
Initial Version of Scenario definition and other associated amendments needed for the "Horizontal vs Vertical"-type Analysis, (#1288)
* updates from paper analyses for HIV, TB and malaria * remove unused import statements * fix imports * update filepath for malaria resource file * remove test_hiv_tb_scenarios.py * updated test_healthsystem.py: test_manipulation_of_service_availability as small population size over 7 days will not schedule VMMC through HIV module. Remove HIV_Prevention_Circumcision in assert statement for HIV services delivered in one week * change ipt_coverage in TB logger as conflicts with existing parameter * updated ResourceFile_Improved_Healthsystem_And_Healthcare_Seeking.xlsx with updated NTP2019 TB data * remove test code * remove test code * delete tmp resourcefiles * malaria code use person_id consistently instead of individual_id * use individual_id for demography.do_death() * style change to avoid conflicts with master * style change to avoid conflicts with master * fix conflicts with master * fix conflicts with master * merge in master * check property hv_date_last_ART correctly set * Manually add PostnatalCare_Comprehensive to policy priorities * edit fix * add schisto high infection as conditional predictor for bladder cancer * fix conditional predictor for active TB - should check presence of CardioMetabolicDisorders * add 'ss' prefix to properties from schisto module referenced in bladder_cancer.py * edit praziquantel code in schisto.py to use value from CMST in place of donated * add parameter rr_depr_hiv for risk of depression with HIV infection * tidy up linear models in depression, include conditional predictors for hiv infection and add comments * move hv_inf into conditional predictor for depression in initial population * convert lm for incident cancer (site_confined) to model with conditional predictors. Include HIV as risk factor. * add parameter rr_site_confined_hiv to other_adult_cancers.py * update other_adult_cancers write-up to include HIV as risk factor * update Depression.docx to include HIV as risk factor for depression * edit HIV in depression to include only HIV cases not virally suppressed * update other_adult_cancers.py linear model to include HIV as risk factor only if not virally suppressed * edit: HIV remains as risk factor for depression independent of treatment status * include HIV as risk factor for low grade dysplasia (oesophageal cancer). Update ResourceFile_Oesophageal_Cancer.xlsx * update linear model for low grade dysplasia to include HIV as conditional risk factor * update OesophagealCancer.docx write-up to include HIV risk * add condition hiv diagnosed for increased risk of depression * remove hiv as risk factor for oesophageal cancer * remove parameter for hiv as risk factor for oesophageal cancer * update OesophagealCancer.docx to remove hiv as risk factor * update value for weighted risk of other_adult_cancers with unsuppressed HIV * add rr_hiv to linear model. update ResourceFile_cmd_condition_onset.xlsx with rr_hiv, leave value=1.0 if no effect of hiv * update resourcefiles for CMD include rr_hiv for all, no effect of majority of processes * refactoring: * put the helper function for switching scenario into same file a ScenarioSwitcher class * put tests for class and helper function together (next step will be to rename and mock-up extended functionality) * add diabetes as risk for active TB and relapse. add params to ResourceFile_TB.xlsx replace linearmodels dict which was accidentally removed in depression.py * add diabetes as risk factor for tb death * add diabetes as risk factor for PLHIV with active TB and on TB treatment * add diabetes as risk factor for PLHIV with active TB and on TB treatment * set up run to check calibration of deaths and disability * add predictor high-intensity S. haematobium infection to risk of bladder cancer in initial population * add predictor high-intensity S. haematobium infection to risk of HIV acquisition * fix indenting in HSI_Hiv_StartOrContinueTreatment * add hv_date_treated abd hv_date_last_ART to baseline_art * convert linear model in CMD to include conditional predictors * delete resourcefile created in error * comment out path-specific changes to analysis_cause_of_death_and_disability_calibrations.py * fix CMD error if Hiv not registered * initial sketch of structure * tidy-up and fix tests * commments * remove parameter rr_bcg_inf from tb.py * edit comment in initialise_simulation * fix parameter name error * update parameters * test runs * edit and fix flake8 errors * fix failing test * update ResourceFile_Improved_Healthsystem_And_Healthcare_Seeking.xlsx * update ResourceFile_PriorityRanking_ALLPOLICIES.xlsx * updated ResourceFile_Improved_Healthsystem_And_Healthcare_Seeking.xlsx * scenario file * starting work on the scenario file * design and HR scenarios * scenario for HRH * roll back changes to ScenarioSwitcher * Revert "roll back changes to ScenarioSwitcher" This reverts commit 49d14e7. * sketch out of prposed changed for scenario_switcher * sketch out of prposed changed for scenario_switcher * sketch out of prposed changed for scenario_switcher * sketch out of prposed changed for scenario_switcher * sketch out of prposed changed for scenario_switcher * Add ResourceFile_Consumables_Item_Designations.csv * create special scenarios for consumables availability based on the designation of the consumable item * comment and fix imports * draft of scenario file * linting * increase length of scenarios and reduce reps * rename * correct error in specifiction of dynamic scaling scenario * remove trailing commas that casts return as tuple * Initially only consider baseline and perfect healthcare seeking scenarios, to get upper limit on RAM requirements * Submit remaining scenarios * direct to self.module (rather than self) * edit check for last ART when new dispensation occurs * update schisto risk on HIV to only include women * add scale-up parameter for htm programs add event to hiv module to switch parameters include new keywords in hiv module for scale-up separately for hiv, tb and malaria add new sheet in ResourceFile_HIV.xlsx containing new scale-up parameter values * add catch for malaria rdt_testing_rates post-2024 * malaria parameters scale-up included * restore `scenario_comparison_of_horizontal_and_vertical_programs` (removing comments used to select certain scenarios to be run). * first draft of figures * set up test for HTM scenario scale-up * add tests for HTM scale-up * check resourcefiles updated * check the usage of '' versus "" * reset to single quotes to match PR #1273 * remove unneeded resource files * remove unneeded resource files * edit filepaths * isort for imports * edit ResourceFile_Improved_Healthsystem_And_Healthcare_Seeking.xlsx to remove 2024 missing value in malaria Rate_rdt_testing * fix filename for test_HTMscaleup.py * update scenario file * update scenario file * include "scale_to_effective_capabilities": True * linting * linting * reduce number of draws * set scaleup parameters separately in each module use parameters to select scaleup instead of module arguments update resourcefiles * set scaleup parameters separately in each module use parameters to select scaleup instead of module arguments update resourcefiles * update resourcefiles * update resourcefiles * set up script to test scenarios - scale-up of HTM programs * test runs * add scenario switch to malaria.py * cherry-pick inadvertently updated files and revert * isort fixes * Delete resources/~$ResourceFile_HIV.xlsx * Delete resources/~$ResourceFile_TB.xlsx * Delete resources/malaria/~$ResourceFile_malaria.xlsx * fix failing tests * rollback changes to calibration_analyses/scenarios/long_run_all_diseases.py * fix error in filename * revert timedelta format to 'M' * add todos * merge in master * set up test runs for scale-up * edit scenario start date * change parameter scaleup_start_date to an integer value used in DateOffset, as timestamp is not json serializable for analysis runs * change scale-up event scheduling to use new DateOffset parameter * test runs * set up test runs * set up test runs * fix failing tests * Update tests/test_HTMscaleup.py Co-authored-by: Tim Hallett <[email protected]> * address comments on PR review * isort fixes * isort fixes * change np.timedelta in enhanced_lifestyle.py back to original * remove json file * call it 'htm_scenario_analysis' rather than just 'scenario_analysis' * update comment * roll back change in test_tb.py * remove .py extension for clarity * roll back incidental change * linting and editing string for clarity * roll back incidental changes * defaults for healthsystem ok -- no need to step through each option * remove inadvertent duplication in code * remove comment * use dict for ease of accessing * parameter to be the YEAR (int) of the change to fit with the convention used in other modules (instead of years since the beginning of the simulation) * remove comment * refactor module method for clarity * refactor to prevent same name being used for events specific to different modules * specify year for scale-up in analysis file * rename; add note; remove comments for HTM scenarios * refacotring * rename scenario class for hss elements * new scenario for combinations of vertical and horizontal programs * minor refactor * update docstring * renaming * add @Property decorator * plot DALYS averted relative to Baseline - broken down by major cause (HIV, TB, MALARIA) * define mini run scenario * rename scenario * lining isort * linting ruff * delete not-yet-used file * isort again! --------- Co-authored-by: tdm32 <[email protected]> Co-authored-by: Tara <[email protected]> Co-authored-by: Margherita Molaro <[email protected]>
1 parent c9e65ca commit 068ee0c

File tree

7 files changed

+1222
-2
lines changed

7 files changed

+1222
-2
lines changed
Original file line numberDiff line numberDiff line change
@@ -1,3 +1,3 @@
11
version https://git-lfs.github.com/spec/v1
2-
oid sha256:af86c2c2af5c291c18c5d481681d6d316526b81806c8c8e898517e850160e6fd
3-
size 12465
2+
oid sha256:80651d157772a292bf9617c86e2616d8165a20385ada6d85a5244aca9c55aa0c
3+
size 21938
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,272 @@
1+
"""Produce plots to show the impact each the healthcare system (overall health impact) when running under different
2+
scenarios (scenario_impact_of_healthsystem.py)"""
3+
4+
import argparse
5+
import textwrap
6+
from pathlib import Path
7+
from typing import Tuple
8+
9+
import numpy as np
10+
import pandas as pd
11+
from matplotlib import pyplot as plt
12+
13+
from tlo import Date
14+
from tlo.analysis.utils import extract_results, make_age_grp_lookup, summarize
15+
16+
17+
def apply(results_folder: Path, output_folder: Path, resourcefilepath: Path = None):
18+
"""Produce standard set of plots describing the effect of each TREATMENT_ID.
19+
- We estimate the epidemiological impact as the EXTRA deaths that would occur if that treatment did not occur.
20+
- We estimate the draw on healthcare system resources as the FEWER appointments when that treatment does not occur.
21+
"""
22+
23+
TARGET_PERIOD = (Date(2020, 1, 1), Date(2030, 12, 31))
24+
25+
# Definitions of general helper functions
26+
make_graph_file_name = lambda stub: output_folder / f"{stub.replace('*', '_star_')}.png" # noqa: E731
27+
28+
_, age_grp_lookup = make_age_grp_lookup()
29+
30+
def target_period() -> str:
31+
"""Returns the target period as a string of the form YYYY-YYYY"""
32+
return "-".join(str(t.year) for t in TARGET_PERIOD)
33+
34+
def get_parameter_names_from_scenario_file() -> Tuple[str]:
35+
"""Get the tuple of names of the scenarios from `Scenario` class used to create the results."""
36+
from scripts.comparison_of_horizontal_and_vertical_programs.scenario_hss_elements import (
37+
HSSElements,
38+
)
39+
e = HSSElements()
40+
return tuple(e._scenarios.keys())
41+
42+
def get_num_deaths(_df):
43+
"""Return total number of Deaths (total within the TARGET_PERIOD)"""
44+
return pd.Series(data=len(_df.loc[pd.to_datetime(_df.date).between(*TARGET_PERIOD)]))
45+
46+
def get_num_dalys(_df):
47+
"""Return total number of DALYS (Stacked) by label (total within the TARGET_PERIOD).
48+
Throw error if not a record for every year in the TARGET PERIOD (to guard against inadvertently using
49+
results from runs that crashed mid-way through the simulation.
50+
"""
51+
years_needed = [i.year for i in TARGET_PERIOD]
52+
assert set(_df.year.unique()).issuperset(years_needed), "Some years are not recorded."
53+
return pd.Series(
54+
data=_df
55+
.loc[_df.year.between(*years_needed)]
56+
.drop(columns=['date', 'sex', 'age_range', 'year'])
57+
.sum().sum()
58+
)
59+
60+
def set_param_names_as_column_index_level_0(_df):
61+
"""Set the columns index (level 0) as the param_names."""
62+
ordered_param_names_no_prefix = {i: x for i, x in enumerate(param_names)}
63+
names_of_cols_level0 = [ordered_param_names_no_prefix.get(col) for col in _df.columns.levels[0]]
64+
assert len(names_of_cols_level0) == len(_df.columns.levels[0])
65+
_df.columns = _df.columns.set_levels(names_of_cols_level0, level=0)
66+
return _df
67+
68+
def find_difference_relative_to_comparison(_ser: pd.Series,
69+
comparison: str,
70+
scaled: bool = False,
71+
drop_comparison: bool = True,
72+
):
73+
"""Find the difference in the values in a pd.Series with a multi-index, between the draws (level 0)
74+
within the runs (level 1), relative to where draw = `comparison`.
75+
The comparison is `X - COMPARISON`."""
76+
return _ser \
77+
.unstack(level=0) \
78+
.apply(lambda x: (x - x[comparison]) / (x[comparison] if scaled else 1.0), axis=1) \
79+
.drop(columns=([comparison] if drop_comparison else [])) \
80+
.stack()
81+
82+
def do_bar_plot_with_ci(_df, annotations=None, xticklabels_horizontal_and_wrapped=False, put_labels_in_legend=True):
83+
"""Make a vertical bar plot for each row of _df, using the columns to identify the height of the bar and the
84+
extent of the error bar."""
85+
86+
substitute_labels = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ'
87+
88+
yerr = np.array([
89+
(_df['mean'] - _df['lower']).values,
90+
(_df['upper'] - _df['mean']).values,
91+
])
92+
93+
xticks = {(i + 0.5): k for i, k in enumerate(_df.index)}
94+
95+
# Define colormap (used only with option `put_labels_in_legend=True`)
96+
cmap = plt.get_cmap("tab20")
97+
rescale = lambda y: (y - np.min(y)) / (np.max(y) - np.min(y)) # noqa: E731
98+
colors = list(map(cmap, rescale(np.array(list(xticks.keys()))))) if put_labels_in_legend else None
99+
100+
fig, ax = plt.subplots(figsize=(10, 5))
101+
ax.bar(
102+
xticks.keys(),
103+
_df['mean'].values,
104+
yerr=yerr,
105+
alpha=0.8,
106+
ecolor='black',
107+
color=colors,
108+
capsize=10,
109+
label=xticks.values()
110+
)
111+
if annotations:
112+
for xpos, ypos, text in zip(xticks.keys(), _df['upper'].values, annotations):
113+
ax.text(xpos, ypos*1.15, text, horizontalalignment='center', rotation='vertical', fontsize='x-small')
114+
ax.set_xticks(list(xticks.keys()))
115+
116+
if put_labels_in_legend:
117+
# Update xticks label with substitute labels
118+
# Insert legend with updated labels that shows correspondence between substitute label and original label
119+
xtick_values = [letter for letter, label in zip(substitute_labels, xticks.values())]
120+
xtick_legend = [f'{letter}: {label}' for letter, label in zip(substitute_labels, xticks.values())]
121+
h, legs = ax.get_legend_handles_labels()
122+
ax.legend(h, xtick_legend, loc='center left', fontsize='small', bbox_to_anchor=(1, 0.5))
123+
ax.set_xticklabels(list(xtick_values))
124+
else:
125+
if not xticklabels_horizontal_and_wrapped:
126+
# xticklabels will be vertical and not wrapped
127+
ax.set_xticklabels(list(xticks.values()), rotation=90)
128+
else:
129+
wrapped_labs = ["\n".join(textwrap.wrap(_lab, 20)) for _lab in xticks.values()]
130+
ax.set_xticklabels(wrapped_labs)
131+
132+
ax.grid(axis="y")
133+
ax.spines['top'].set_visible(False)
134+
ax.spines['right'].set_visible(False)
135+
fig.tight_layout()
136+
137+
return fig, ax
138+
139+
# %% Define parameter names
140+
param_names = get_parameter_names_from_scenario_file()
141+
142+
# %% Quantify the health gains associated with all interventions combined.
143+
144+
# Absolute Number of Deaths and DALYs
145+
num_deaths = extract_results(
146+
results_folder,
147+
module='tlo.methods.demography',
148+
key='death',
149+
custom_generate_series=get_num_deaths,
150+
do_scaling=True
151+
).pipe(set_param_names_as_column_index_level_0)
152+
153+
num_dalys = extract_results(
154+
results_folder,
155+
module='tlo.methods.healthburden',
156+
key='dalys_stacked',
157+
custom_generate_series=get_num_dalys,
158+
do_scaling=True
159+
).pipe(set_param_names_as_column_index_level_0)
160+
161+
# %% Charts of total numbers of deaths / DALYS
162+
num_dalys_summarized = summarize(num_dalys).loc[0].unstack().reindex(param_names)
163+
num_deaths_summarized = summarize(num_deaths).loc[0].unstack().reindex(param_names)
164+
165+
name_of_plot = f'Deaths, {target_period()}'
166+
fig, ax = do_bar_plot_with_ci(num_deaths_summarized / 1e6)
167+
ax.set_title(name_of_plot)
168+
ax.set_ylabel('(Millions)')
169+
fig.tight_layout()
170+
ax.axhline(num_deaths_summarized.loc['Baseline', 'mean']/1e6, color='black', alpha=0.5)
171+
fig.savefig(make_graph_file_name(name_of_plot.replace(' ', '_').replace(',', '')))
172+
fig.show()
173+
plt.close(fig)
174+
175+
name_of_plot = f'All Scenarios: DALYs, {target_period()}'
176+
fig, ax = do_bar_plot_with_ci(num_dalys_summarized / 1e6)
177+
ax.set_title(name_of_plot)
178+
ax.set_ylabel('(Millions)')
179+
ax.axhline(num_dalys_summarized.loc['Baseline', 'mean']/1e6, color='black', alpha=0.5)
180+
fig.tight_layout()
181+
fig.savefig(make_graph_file_name(name_of_plot.replace(' ', '_').replace(',', '')))
182+
fig.show()
183+
plt.close(fig)
184+
185+
186+
# %% Deaths and DALYS averted relative to Status Quo
187+
num_deaths_averted = summarize(
188+
-1.0 *
189+
pd.DataFrame(
190+
find_difference_relative_to_comparison(
191+
num_deaths.loc[0],
192+
comparison='Baseline')
193+
).T
194+
).iloc[0].unstack().reindex(param_names).drop(['Baseline'])
195+
196+
pc_deaths_averted = 100.0 * summarize(
197+
-1.0 *
198+
pd.DataFrame(
199+
find_difference_relative_to_comparison(
200+
num_deaths.loc[0],
201+
comparison='Baseline',
202+
scaled=True)
203+
).T
204+
).iloc[0].unstack().reindex(param_names).drop(['Baseline'])
205+
206+
num_dalys_averted = summarize(
207+
-1.0 *
208+
pd.DataFrame(
209+
find_difference_relative_to_comparison(
210+
num_dalys.loc[0],
211+
comparison='Baseline')
212+
).T
213+
).iloc[0].unstack().reindex(param_names).drop(['Baseline'])
214+
215+
pc_dalys_averted = 100.0 * summarize(
216+
-1.0 *
217+
pd.DataFrame(
218+
find_difference_relative_to_comparison(
219+
num_dalys.loc[0],
220+
comparison='Baseline',
221+
scaled=True)
222+
).T
223+
).iloc[0].unstack().reindex(param_names).drop(['Baseline'])
224+
225+
# DEATHS
226+
name_of_plot = f'Additional Deaths Averted vs Baseline, {target_period()}'
227+
fig, ax = do_bar_plot_with_ci(
228+
num_deaths_averted.clip(lower=0.0),
229+
annotations=[
230+
f"{round(row['mean'], 0)} ({round(row['lower'], 1)}-{round(row['upper'], 1)}) %"
231+
for _, row in pc_deaths_averted.clip(lower=0.0).iterrows()
232+
]
233+
)
234+
ax.set_title(name_of_plot)
235+
ax.set_ylabel('Additional Deaths Averted')
236+
fig.tight_layout()
237+
fig.savefig(make_graph_file_name(name_of_plot.replace(' ', '_').replace(',', '')))
238+
fig.show()
239+
plt.close(fig)
240+
241+
# DALYS
242+
name_of_plot = f'Additional DALYs Averted vs Baseline, {target_period()}'
243+
fig, ax = do_bar_plot_with_ci(
244+
(num_dalys_averted / 1e6).clip(lower=0.0),
245+
annotations=[
246+
f"{round(row['mean'])} ({round(row['lower'], 1)}-{round(row['upper'], 1)}) %"
247+
for _, row in pc_dalys_averted.clip(lower=0.0).iterrows()
248+
]
249+
)
250+
ax.set_title(name_of_plot)
251+
ax.set_ylabel('Additional DALYS Averted \n(Millions)')
252+
fig.tight_layout()
253+
fig.savefig(make_graph_file_name(name_of_plot.replace(' ', '_').replace(',', '')))
254+
fig.show()
255+
plt.close(fig)
256+
257+
# todo: Neaten graphs
258+
# todo: Graph showing difference broken down by disease (this can be cribbed from the calcs about wealth from the
259+
# third set of analyses in the overview paper).
260+
# todo: other metrics of health
261+
# todo: other graphs, broken down by age/sex (this can also be cribbed from overview paper stuff)
262+
263+
if __name__ == "__main__":
264+
parser = argparse.ArgumentParser()
265+
parser.add_argument("results_folder", type=Path) # outputs/horizontal_and_vertical_programs-2024-05-16
266+
args = parser.parse_args()
267+
268+
apply(
269+
results_folder=args.results_folder,
270+
output_folder=args.results_folder,
271+
resourcefilepath=Path('./resources')
272+
)

0 commit comments

Comments
 (0)