@@ -17,8 +17,6 @@ The package includes a C extension (`_tsinfer`) built from `lib/` sources via se
 uv run pytest tests/ -v # Run all tests
 uv run pytest tests/test_matching.py # Run a single test file
 uv run pytest tests/test_matching.py::TestFoo::test_bar -v # Run a single test
-uv run pytest --skip-slow # Skip slow tests
-
 uv run ruff check --fix # Lint Python code (auto-fix)
 uv run ruff format # Format Python code
 ```
@@ -81,7 +79,7 @@ Source in `lib/`. Three main classes exposed to Python:
 - `AncestorBuilder` — builds inferred ancestors from genotype data
 - `AncestorMatcher` — Li & Stephens HMM matching algorithm
 
-When changes are made to the C library, ensure that the ``_tskit`` module is rebuilt
+When changes are made to the C library, ensure that the ``_tsinfer`` module is rebuilt
 before running Python tests.
 
 Vendored dependencies in `lib/subprojects/`: tskit C library and kastore.
@@ -100,10 +98,20 @@ Sample VCZ → `infer_ancestors` → Ancestor VCZ → `match` → raw `tskit.Tre
   occur within the current codebase.
 - Do not make production code more complex for the sake of minimising
   changes to the test suite. Simplicity and clarity of the production code
-  is imperitive.
+  is imperative.
+- Do not combine multiple complex operations in a single statement. Prefer
+  to keep a single operation per statement, and use intermediate variables
+  as a form of documentation. For example:
+  ```python
+  # Bad — multiple operations in one expression
+  result = sorted(k for k, v in mapping.items() if v in set(x.name for x in sources))
+
+  # Good — intermediate variable makes intent clear
+  source_names = {x.name for x in sources}
+  result = sorted(k for k, v in mapping.items() if v in source_names)
+  ```
 - Prefer dataclasses over tuples when returning multiple values.
 - Use explicit `None` comparisons: `if x is not None` not `if x`.
-- Zarr v3 is now used (dependency: `zarr>=3`).
 - Import all modules at the top of the file, not inside functions or methods.
 - Prefer importing a module and using module.function instead of
   using ``from module import function``. This applies to intra-package
@@ -115,6 +123,7 @@ Sample VCZ → `infer_ancestors` → Ancestor VCZ → `match` → raw `tskit.Tre
 - When a parameter has a computed default derived from another parameter,
   compute it once at the point of use (the leaf function), not at every
   layer in the call chain. Pass `None` through intermediate layers.
+- Zarr v3 is used (dependency: `zarr>=3`). Do not use Zarr v2 APIs.
 - Use PEP 604 union syntax: `int | None`, not `Optional[int]`.
 - One `logger = logging.getLogger(__name__)` per module at top level.
 
@@ -128,5 +137,5 @@ Sample VCZ → `infer_ancestors` → Ancestor VCZ → `match` → raw `tskit.Tre
 - Test helpers are in `tests/helpers.py` (e.g., `make_sample_vcz`, `make_ancestor_vcz`)
 - `tests/algorithm.py` contains Python reference implementations used to verify C code
 - `msprime` is used to simulate test data
-- Run the test suite with coverage after each change to ensure that new code is fully
-  covered by tests.
+- Run the test suite with coverage before committing to ensure that new code is
+  fully covered by tests.