Commit ec0ebb4

Bumps
1 parent 122a3af commit ec0ebb4

5 files changed: +611 -647 lines changed

README.md

Lines changed: 0 additions & 4 deletions
@@ -171,7 +171,6 @@ Class of each CRAM record returned by this API.
 ##### Parameters
 
 - `$0` **any**
-
   - `$0.flags`
   - `$0.cramFlags`
   - `$0.readLength`
@@ -313,7 +312,6 @@ the actual substituted and reference base pairs, and will make the
 
 - `refRegion`
   **[object](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Object)**
-
   - `refRegion.start`
     **[number](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Number)**
   - `refRegion.end`
@@ -360,7 +358,6 @@ that show insertions, deletions, substitutions, etc.
 
 - `args`
   **[object](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Object)**
-
   - `args.cram` **CramFile**
   - `args.index` **Index-like** object that supports
     getEntriesForRange(seqId,start,end) -> Promise\[Array\[index entries]]
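
The "Index-like" contract named in the hunk above can be illustrated with a minimal sketch. Only the method name `getEntriesForRange(seqId, start, end)` and its Promise-of-array return come from the README text; the entry fields (`start`, `span`, `containerStart`), the closed-interval overlap rule, and the `InMemoryIndex` class are assumptions made for illustration, not the library's actual types.

```typescript
// Hypothetical entry shape; the README only says "index entries".
interface IndexEntry {
  start: number // assumed first reference position covered
  span: number // assumed number of reference positions covered
  containerStart: number // assumed byte offset of the container
}

// The contract described in the README: one async range lookup.
interface IndexLike<E> {
  getEntriesForRange(seqId: number, start: number, end: number): Promise<E[]>
}

// Illustrative in-memory implementation (not part of the library).
class InMemoryIndex implements IndexLike<IndexEntry> {
  private entriesBySeqId: Map<number, IndexEntry[]>

  constructor(entriesBySeqId: Map<number, IndexEntry[]>) {
    this.entriesBySeqId = entriesBySeqId
  }

  async getEntriesForRange(
    seqId: number,
    start: number,
    end: number,
  ): Promise<IndexEntry[]> {
    const entries = this.entriesBySeqId.get(seqId) || []
    // Keep entries whose assumed closed interval overlaps [start, end].
    return entries.filter(
      e => e.start <= end && e.start + e.span - 1 >= start,
    )
  }
}
```

Any object with a compatible `getEntriesForRange` method would satisfy the same structural type, which is presumably why the README says "Index-like" rather than naming a class.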
@@ -440,7 +437,6 @@ Returns
 
 - `args`
   **[object](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Object)**
-
   - `args.path`
     **[string](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/String)?**
   - `args.url`

package.json

Lines changed: 1 addition & 1 deletion
@@ -65,7 +65,7 @@
     "documentation": "^14.0.3",
     "eslint": "^9.29.0",
     "eslint-plugin-import": "^2.31.0",
-    "eslint-plugin-unicorn": "^59.0.0",
+    "eslint-plugin-unicorn": "^60.0.0",
     "mock-fs": "^5.2.0",
     "prettier": "^3.2.5",
     "rimraf": "^6.0.1",

src/cramFile/file.ts

Lines changed: 5 additions & 2 deletions
@@ -110,12 +110,15 @@ export default class CramFile {
     const { maxLength, parser } = cramFileDefinition()
     const headbytes = await this.file.read(maxLength, 0)
     const definition = parser(headbytes).value
-    if (definition.majorVersion !== 2 && definition.majorVersion !== 3) {
+    if (definition.magic !== 'CRAM') {
+      throw new Error('Not a CRAM file, does not match magic string')
+    } else if (definition.majorVersion !== 2 && definition.majorVersion !== 3) {
       throw new CramUnimplementedError(
         `CRAM version ${definition.majorVersion} not supported`,
       )
+    } else {
+      return definition
     }
-    return definition
   }
 
   // memoize
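
The check added in this hunk can be sketched in isolation. This is a simplified, standalone illustration and not the library's parser: `parseFileDefinition`, `checkFileDefinition`, and `UnsupportedVersionError` are hypothetical names (the last one stands in for `CramUnimplementedError`), and the sketch assumes only that a CRAM file starts with the 4-byte magic `CRAM` followed by one major-version byte and one minor-version byte.

```typescript
// Hypothetical, simplified shape of a parsed file definition.
interface FileDefinition {
  magic: string
  majorVersion: number
  minorVersion: number
}

// Stand-in for the library's CramUnimplementedError so the sketch is self-contained.
class UnsupportedVersionError extends Error {}

// Read the 4 magic bytes and the two version bytes from the start of the file.
function parseFileDefinition(bytes: Uint8Array): FileDefinition {
  if (bytes.length < 6) {
    throw new Error('Header too short to contain a CRAM file definition')
  }
  let magic = ''
  for (let i = 0; i < 4; i++) {
    magic += String.fromCharCode(bytes[i])
  }
  return { magic, majorVersion: bytes[4], minorVersion: bytes[5] }
}

// Mirrors the order of checks in the diff: magic string first, then version.
function checkFileDefinition(bytes: Uint8Array): FileDefinition {
  const definition = parseFileDefinition(bytes)
  if (definition.magic !== 'CRAM') {
    throw new Error('Not a CRAM file, does not match magic string')
  }
  if (definition.majorVersion !== 2 && definition.majorVersion !== 3) {
    throw new UnsupportedVersionError(
      `CRAM version ${definition.majorVersion} not supported`,
    )
  }
  return definition
}
```

Checking the magic string before the version means a non-CRAM input (for example a BAM file) fails with a "not a CRAM file" message rather than a misleading "version not supported" one.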

test/data/hts-specs/cram/3.0/README.md

Lines changed: 0 additions & 27 deletions
@@ -4,7 +4,6 @@ Try to test in order, so early tests don't require correct interpretation of
 later tests. This gives an ordering for software development and testing.
 
 - Data types
-
   - ITF8
   - Strings
   - Arrays
@@ -65,7 +64,6 @@ An empty file to check the file definition can be read. We require a SAM header
 too, but this is also empty (using one block, see below).
 
 - Empty CRAM file (failed/0000_empty_noref.cram)
-
   - File definition
   - SAM header container (zero content)
   - [End of file; no EOF block so may emit warning]
@@ -79,7 +77,6 @@ too, but this is also empty (using one block, see below).
   warning or a hard error.)
 
 - Empty CRAM file with EOF block (0001_empty_eof.cram)
-
   - As above, but with official EOF block.
 
 This EOF block can be decoded either by checking for a specific series of
@@ -127,23 +124,19 @@ c->num_landmarks=0 set c->curr_slice=0 set c->length=181 c ]
 Files with 1 or more sequences. These are all unmapped with no auxiliary tags.
 
 - Single read (0300_unmapped.cram)
-
   - Tests decoded data via EXTERNAL, HUFFMAN, BYTE_ARRAY_STOP and BYTE_ARRAY_LEN
     encodings.
   - 4 blocks in slice (CORE - empty, RN, QS, BA).
 
 - Two unpaired reads, of differing length (0301_unmapped_cram)
-
   - As above, but RL is no longer a constant and is in its own block.
 
 - Three reads, including a pair (0302_unmapped_cram)
-
   - Also contains BF and MF blocks. All still CF "detached".
   - BF 77 & 141 match the input SAM, but this is redundant as it's
   - also set in MF bit 2.
 
 - Three reads, including a pair (0303_unmapped_cram)
-
   - As above, but the SAM FLAGs of 77 and 141 are stored as 69 and 133 (clearing
     mate unmapped flag). BF + MF are sufficient to regenerate the correct FLAG
     field.
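
The flag arithmetic in the 0303_unmapped entry above can be checked with a short sketch. It assumes the standard SAM FLAG bits (0x8 mate unmapped, 0x20 mate reverse strand) and the CRAM MF bits (0x1 mate negative strand, 0x2 mate unmapped); `regenerateFlag` is a hypothetical helper written for this illustration, not a function from the library or from htslib.

```typescript
// SAM FLAG bits that describe the mate.
const SAM_MATE_UNMAPPED = 0x8
const SAM_MATE_REVERSE = 0x20

// CRAM MF (mate flags) bits, as assumed from the CRAM specification.
const MF_MATE_REVERSE = 0x1
const MF_MATE_UNMAPPED = 0x2

// Rebuild the SAM FLAG from a stored BF value plus the MF mate bits.
function regenerateFlag(bf: number, mf: number): number {
  let flag = bf
  if (mf & MF_MATE_UNMAPPED) {
    flag |= SAM_MATE_UNMAPPED
  }
  if (mf & MF_MATE_REVERSE) {
    flag |= SAM_MATE_REVERSE
  }
  return flag
}

// 77 = 0x1 + 0x4 + 0x8 + 0x40; clearing 0x8 gives 69.
// 141 = 0x1 + 0x4 + 0x8 + 0x80; clearing 0x8 gives 133.
console.log(regenerateFlag(69, MF_MATE_UNMAPPED)) // 77
console.log(regenerateFlag(133, MF_MATE_UNMAPPED)) // 141
```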
@@ -154,21 +147,18 @@ Files with 1 or more sequences. These are all unmapped with no auxiliary tags.
 ## Slice basics, mapped reads, no reference
 
 - Single read (0400_mapped.cram)
-
   - Container ref id, pos and span, number of records and number of bases fields
     are changed.
   - Checks that mapped data can process MD5 0, provided container RR=0.
   - Additional data series in use: FN, FP, FC, MQ.
   - One feature of type 'b', with sequence stored in BB.
 
 - Paired reads, but detached (0401_mapped.cram)
-
   - RNEXT/PNEXT/TLEN of \*/0/0
   - Explicit TS, NP, NS with constant values as they would disagree with
     auto-computed values.
 
 - Paired reads, but detached (0402_mapped.cram)
-
   - RNEXT/PNEXT/TLEN filled out.
   - Explicit TS, NP, NS, with non-constant values [ Edit htslib to force
     bam_ins_size check to fail and hence "goto detached". ]
@@ -183,23 +173,20 @@ Testing of the FC (Feature Codes) data series types and their associated
 type-specific data series.
 
 - External reference, CIGAR ops (0500_mapped.cram)
-
   - No edits: entirely match reference
   - No FP/FC needed (and FN=0).
   - Sequence is implicitly assumed to entirely match reference
   - Header gains an @SQ UR: tag, although note this pathname is local and not
     transferable to other systems.
 
 - External reference, CIGAR ops (0501_mapped.cram)
-
   - Mismatching first and last base on first seq and first / last 3 bases on
     second seq.
   - Adds use of FC "X" and the BS (base substitution) data series. This tests
     the compression header "SM" preservation map. Note BB data series has an
     encoding in the compression header, but is not used here.
 
 - As above, but R/Y bases (0502_mapped.cram).
-
   - Test of the BA data series and FC "B" code. BS (base substitution) only
     applies for A, C, G, T, N.

@@ -213,19 +200,16 @@ type-specific data series.
   s->block[12]->data[0].]
 
 - As above with R/Y bases, using using "b" FC (0503_mapped.cram).
-
   - Unlike FC "B", "b" is a string instead of a single character and doesn't
     require storing quality data.
 
   [ Produced by changing the "if (0 && CRAM_MAJOR_VERS...)" line in
   process_one_read().]
 
 - Soft/hard clips (0504_mapped.cram)
-
   - FC codes S and H, with associated SC and HC data series.
 
 - Indels (0505_mapped.cram)
-
   - Tests FC codes and data-series: D (DL), I (IN) and i (BA). The table below
     shows cigar ops, with "m" being lowercase as it's not explicitly stored in
     CRAM. The FC row shows the associated CRAM feature code. REF
@@ -237,7 +221,6 @@ type-specific data series.
   FC D D I i
 
 - As above, but explicit padding in the 5bp indel (0506_mapped.cram)
-
   - Tests FC code P and data series PD REF
     ATTTTTCGGGTTTTTTGAAATGAATATCGTAGCTACAGAAACGGTTGTGCACTCATCTGAAAGTTTGTTT T
     TCTTGTTTTCTTGCACTTTGTGCAGAATT SEQ ATTTTTCGGGTTTTTTGAAA AT
@@ -405,15 +388,12 @@ examples produced by some current implementations.
 - BETA (already tested in 1101_BETA.cram)
 
 - SUBEXPONENTIAL
-
   - I have no code to write this data format. Exists in htsjdk though?
 
 - GAMMA
-
   - I have no code to write this data format. Exists in htsjdk though?
 
 - GOLOMB (deprecated)
-
   - I have no code to read nor write this data format
 
 - GOLOMB-RICE (deprecated)
@@ -422,19 +402,16 @@ examples produced by some current implementations.
 ## Index
 
 - Simple mapped case (1400_index_simple.cram)
-
   - 10bp reads starting one per base. Read name indicates bases covered.
   - 77 reads per container
   - Index query CHROMOSOME_I:333-444 should return 121 records, from s324-333 to
     s444-453
 
 - Unmapped data (1401_index_unmapped.cram)
-
   - As above, but all data is unmapped
   - Index query for unmapped (eg ref `*`) should return all 1000 records.
 
 - Multiple references + unmapped (1402_index_3ref.cram)
-
   - 300 for first ref, 10 for second, 300 for third, and 300 unmapped.
   - Only one reference per slice.
   - CHROMOSOME_I:100-200 returns 110 records
@@ -445,7 +422,6 @@ examples produced by some current implementations.
   - `*` (unmapped) returns 300 records
 
 - Multi-ref mode (1403_index_multiref.cram)
-
   - As above, but containers / slices use the RI data series with multiple
     references per container. The same queries will work as above.
   - Hence index reports reference IDs, but multiple references can occur at the
@@ -459,17 +435,14 @@ examples produced by some current implementations.
     container as the last reads in ref 2.
 
 - Multi-slice containers (1404_index_multislice.cram)
-
   - As 1402_index_3ref.cram, but 3 slices per container.
   - Same queries will work as above.
 
 - Multi-slice multi-ref containers (1405_index_multisliceref.cram)
-
   - As above, but with multiple references permitted per slice.
   - Same queries will work as above.
 
 - Mix of long and short reads
-
   - 10bp reads starting every position
   - 350bp reads starting every 300 positions
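
The expected count for the 1400_index_simple.cram query (CHROMOSOME_I:333-444 returning 121 records, s324-333 through s444-453) follows from interval overlap and can be reproduced with a small sketch. It assumes 1000 reads of 10bp with one read starting at each position from 1, named `s<start>-<end>` with inclusive coordinates; the read count and the `makeReads` and `overlapping` helpers are assumptions for illustration, not part of the test suite.

```typescript
interface Read {
  name: string
  start: number // inclusive, 1-based
  end: number // inclusive
}

// Build `count` reads of `length` bp, one starting at each position from 1.
function makeReads(count: number, length: number): Read[] {
  const reads: Read[] = []
  for (let start = 1; start <= count; start++) {
    const end = start + length - 1
    reads.push({ name: `s${start}-${end}`, start, end })
  }
  return reads
}

// Keep reads whose closed interval overlaps the closed query interval.
function overlapping(reads: Read[], qStart: number, qEnd: number): Read[] {
  return reads.filter(r => r.start <= qEnd && r.end >= qStart)
}

const hits = overlapping(makeReads(1000, 10), 333, 444)
console.log(hits.length) // 121
console.log(hits[0].name, hits[hits.length - 1].name) // s324-333 s444-453
```

A 10bp read overlaps position 333 when it starts at 324 or later, and overlaps the range at all only if it starts at 444 or earlier, so the count is 444 - 324 + 1 = 121.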
