
Commit fc69f66

Update HIVdb from version 9.4 to 9.8
Fixes #1407

Changes made:
- Downloaded and integrated HIVdb 9.8 XML (dated 2025-01-05) from Stanford
- Updated resistance.py to use HIVDB_9.8.xml algorithm file
- Updated genreport.yaml with new version 9.8 and date 2025-01-05
- Added DPV (dapirivine) NNRTI drug to genreport.yaml and test expectations
- Updated all test files (test_resistance.py, test_asi_algorithm.py) for version 9.8
- Fixed drug name spelling from 'dapivirine' to 'dapirivine' per HIVdb XML
- Removed obsolete HIVDB_9.0.xml and HIVDB_9.4.xml files
- Added comprehensive update guide in docs/contrib.md
- Created automation script update_hivdb.py for future updates
- Verified pyvdrm v0.3.2 compatibility with HIVdb 9.8

All tests pass:
- 86 tests in micall/tests/test_resistance.py
- 28 tests in micall/tests/test_asi_algorithm.py
- 15 tests in micall/tests/test_genreport.py

Note: CA region (Capsid) drug LEN was intentionally excluded as CA is not included in get_algorithm_regions() per previous implementation.
1 parent 9884732 commit fc69f66

File tree

10 files changed, +1598 -4608 lines changed


HIVDB_UPDATE_README.md

Lines changed: 236 additions & 0 deletions
@@ -0,0 +1,236 @@
# HIVdb Update Instructions

## Overview

This document provides step-by-step instructions for completing the HIVdb update to version 9.8 (or latest).

## Current Status

✅ Completed:
- Removed old HIVDB_9.0.xml file
- Updated documentation in `docs/contrib.md`
- Created helper script `update_hivdb.py`
- Verified pyvdrm v0.3.2 compatibility

⏳ Requires Manual Action:
- Download latest HIVdb XML from Stanford
- Run update script with the new XML file
- Update test files
- Run tests and verify

## Step-by-Step Instructions

### Step 1: Download the Latest HIVdb XML

1. Visit the Stanford HIVdb website: https://hivdb.stanford.edu/

2. Look for the algorithm download section. You may need to:
   - Navigate to "Download" or "Algorithm" sections
   - Search for "ASI XML" or "algorithm XML"
   - Check the release notes or updates page

3. Download the latest XML algorithm file (e.g., `HIVDB_9.8.xml` or similar)

4. Note the version number and modification date from any of:
   - The filename
   - The HIVdb website
   - The XML file header (open the file in a text editor and look for version info)

37+
### Step 2: Run the Update Script
38+
39+
Once you have the XML file downloaded:
40+
41+
```bash
42+
# From the MiCall root directory
43+
python update_hivdb.py <path/to/HIVDB_X.X.xml> <version> <modification_date>
44+
45+
# Example:
46+
python update_hivdb.py ~/Downloads/HIVDB_9.8.xml 9.8 "2024-01-15"
47+
```
48+
49+
The script will:
50+
- Copy the XML file to `micall/resistance/`
51+
- Update `micall/resistance/resistance.py`
52+
- Update `micall/resistance/genreport.yaml`
53+
- Print next steps
54+
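To make the "update `resistance.py`" step concrete, here is a rough sketch of how a file-name reference could be rewritten. It is not the actual contents of `update_hivdb.py`. The helper `point_to_new_xml` and the sample source line are hypothetical, and the sketch assumes the algorithm file is referenced by a name of the form `HIVDB_<major>.<minor>.xml`.

```python
import re


def point_to_new_xml(source_text, new_version):
    """Replace every HIVDB_<major>.<minor>.xml reference with the new file name."""
    return re.sub(r"HIVDB_\d+\.\d+\.xml", f"HIVDB_{new_version}.xml", source_text)


# Hypothetical line, standing in for wherever resistance.py names the XML file.
old_line = "file_path = os.path.join(os.path.dirname(__file__), 'HIVDB_9.4.xml')"
new_line = point_to_new_xml(old_line, "9.8")
print(new_line)
```

A pattern-based replacement keeps working when the previous version number is not known in advance, which helps when an update skips several releases.
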
### Step 3: Check for New Drugs

Open the new XML file and search for `<DRUG>` tags to see if any new drugs were added:

```bash
# -A 2 also prints the two lines after each match, where the drug name usually appears
grep -i -A 2 "<DRUG>" micall/resistance/HIVDB_9.8.xml
```

If new drugs are found, add them to `micall/resistance/genreport.yaml` in the appropriate section:

```yaml
INSTI:  # For integrase inhibitors
  - [BIC, Bictegravir]
  - [NEW, New Drug Name]  # Add new drug here
  - [DTG, Dolutegravir]
```

**Note**: The Capsid (CA) region may be present in the XML but is currently skipped
by MiCall. This is handled automatically in `resistance.py::get_algorithm_regions()`.

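The comparison between the XML and `genreport.yaml` can be automated. This is a minimal sketch, assuming each `<DRUG>` element carries its code in a `<NAME>` child. The helper `find_unlisted_drugs`, the sample XML, and the list of known codes are illustrative; in practice the known codes would be copied from `genreport.yaml`.

```python
import xml.etree.ElementTree as ET

# Hypothetical excerpt; a real algorithm file has many more drugs and rules.
SAMPLE_XML = """<ALGORITHM>
  <DRUG><NAME>EFV</NAME></DRUG>
  <DRUG><NAME>DPV</NAME></DRUG>
</ALGORITHM>
"""


def find_unlisted_drugs(xml_text, known_codes):
    """Return drug codes present in the XML but absent from known_codes."""
    root = ET.fromstring(xml_text)
    xml_codes = {drug.findtext("NAME") for drug in root.iter("DRUG")}
    xml_codes.discard(None)  # ignore <DRUG> elements without a <NAME>
    return sorted(xml_codes - set(known_codes))


# Codes already listed in the report configuration (assumed values).
print(find_unlisted_drugs(SAMPLE_XML, ["EFV"]))
```

Any code this reports still needs a human decision, since drugs in skipped regions such as CA should stay out of the report configuration.
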
### Step 4: Update Test Files

You'll need to update test expectations based on the new algorithm version.

#### A. Update `micall/tests/test_resistance.py`

Find and update any version number assertions:

```python
# Example: Search for "9.4" and update to new version
# Old:
assert "9.4" in result

# New:
assert "9.8" in result
```

#### B. Update `micall/tests/test_asi_algorithm.py`

If the algorithm produces different resistance scores, update test expectations:

```python
# Example: If expected scores change
# Old:
assert score == 30

# New (if HIVdb 9.8 changes the score):
assert score == 35
```

### Step 5: Run Tests

```bash
# Test resistance module specifically
pytest micall/tests/test_resistance.py -v

# Test ASI algorithm
pytest micall/tests/test_asi_algorithm.py -v

# Run all tests
pytest
```

### Step 6: Handle Test Failures

If tests fail:

1. **Check if failures are expected**:
   - Review HIVdb release notes for algorithm changes
   - Verify new scores/behaviors make clinical sense

2. **Update test assertions**:
   - If changes are expected, update test expectations
   - Document why you're changing the assertions

3. **Investigate unexpected failures**:
   - May indicate incompatibility with pyvdrm
   - May require pyvdrm update (check https://github.com/cfe-lab/pyvdrm)

### Step 7: Manual Verification

Process a sample dataset and verify resistance reports:

```bash
# Run MiCall on a small test dataset
# Review the generated resistance report PDF
# Verify formatting and content are correct
```

### Step 8: Clean Up

Remove the old HIVDB_9.4.xml file if it's still present:

```bash
rm micall/resistance/HIVDB_9.4.xml
```

### Step 9: Commit and Create Pull Request

```bash
# Check what changed
git status
git diff

# Add changes
git add micall/resistance/
git add micall/tests/
git add docs/contrib.md
git add update_hivdb.py
git add HIVDB_UPDATE_README.md

# Commit
git commit -m "Update HIVdb to version 9.8

- Added HIVDB_9.8.xml (modified 2025-01-05)
- Updated resistance.py to use new XML file
- Updated genreport.yaml with new version and date
- Updated test expectations for new algorithm
- Removed old HIVDB_9.0.xml and HIVDB_9.4.xml
- Added update documentation and helper script

Closes #1407"

# Push to your branch
git push origin hivdb-update

# Create PR on GitHub referencing issue #1407
```

## Troubleshooting

### pyvdrm Parsing Errors

If you get errors like "Error in ASI2" or XML parsing errors:

1. Check if pyvdrm needs updating:
   ```bash
   # Check pyvdrm repository for updates
   # Update pyproject.toml if needed
   ```

2. Verify XML file is valid:
   ```bash
   # Check XML syntax
   xmllint --noout micall/resistance/HIVDB_9.8.xml
   ```

202+
### Algorithm Behavior Changes
203+
204+
If resistance scores differ significantly:
205+
206+
1. Review Stanford HIVdb release notes
207+
2. Compare with HIVdb website's interpretation of the same mutations
208+
3. Document changes in the PR description
209+
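When documenting changes for the PR description, a small diff of per-drug scores can help. The sketch below assumes you have already collected scores for the same mutation list under the old and new algorithm into two dictionaries. The helper `summarize_score_changes` and the example values are hypothetical and are not taken from HIVdb.

```python
def summarize_score_changes(old_scores, new_scores):
    """Return one line per drug whose score differs between two versions."""
    lines = []
    for drug in sorted(set(old_scores) | set(new_scores)):
        old = old_scores.get(drug)  # None means the drug was not scored
        new = new_scores.get(drug)
        if old != new:
            lines.append(f"{drug}: {old} -> {new}")
    return lines


# Made-up scores for illustration only.
old = {"EFV": 30, "RPV": 15}
new = {"EFV": 35, "RPV": 15, "DPV": 10}
for line in summarize_score_changes(old, new):
    print(line)
```

Drugs that appear in only one version show up with `None` on the other side, which makes newly added drugs easy to spot.
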
### Docker Build Issues

If you need to test in Docker:

```bash
# Rebuild Docker image with new HIVdb
docker build -t micall .

# Test in container
docker run -it micall pytest
```

## Reference

- **Issue**: #1407 - Update HIVdb to latest version
- **Previous Update**: PR #973 (9.0 → 9.4)
- **Plan Document**: `build/HIVDB_UPDATE_PLAN.md`
- **Documentation**: `docs/contrib.md` (Updating HIVdb section)
- **Stanford HIVdb**: https://hivdb.stanford.edu/

## Questions?

If you encounter issues not covered here:

1. Review the detailed plan in `build/HIVDB_UPDATE_PLAN.md`
2. Check the previous update commit: `d0c03bf8c939d1d286642cea017ff5ec314807dd`
3. Contact the MiCall development team
