The incoming data must be a root folder containing a file named `metadata` that follows the [upload metadata schema](upload-metadata.schema.json). In addition, the folder must contain files of type `fastq`/`fastq.gz`, `vcf`/`vcf.gz`, and/or `GSvar`/`GSvar.gz`.

Incoming structure overview:

```
|-QTEST001AE (top level folder name)
    |
    |- file1.fastq.gz
    |- file2.fastq.gz
    |- metadata
    |- ...
```
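As a rough illustration of the layout above, the following Python sketch checks an incoming folder before registration. The function name `check_incoming_folder` and the case-sensitive suffix list are assumptions made for this example; they are not taken from the actual dropbox code.

```python
from pathlib import Path

# Suffixes named in the text above; case-sensitive matching is an assumption.
ALLOWED_SUFFIXES = (".fastq", ".fastq.gz", ".vcf", ".vcf.gz", ".GSvar", ".GSvar.gz")


def check_incoming_folder(root: Path) -> list:
    """Return a list of problems found in the incoming folder (empty if none)."""
    if not root.is_dir():
        return [f"{root} is not a folder"]
    problems = []
    # The root folder must contain a file called `metadata`.
    if not (root / "metadata").is_file():
        problems.append("missing 'metadata' file")
    # At least one data file with an allowed suffix is expected.
    data_files = [
        p.name for p in root.iterdir()
        if p.is_file() and p.name.endswith(ALLOWED_SUFFIXES)
    ]
    if not data_files:
        problems.append("no fastq, vcf or GSvar files found")
    return problems
```

An empty list means the folder matches the structure sketched above; the check does not look inside the `metadata` file.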

openBIS structure overview:

TODO: ER model.

## Expected metadata
Metadata must be provided as JSON that follows the [upload metadata schema](upload-metadata.schema.json). An example JSON entry can look like this:

```
{
    "files": [
        "reads.1.fastq.gz",
        "reads.2.fastq.gz"
    ],
    "type": "dna_seq",
    "sample1": {
        "genome": "GRCh37",
        "id_genetics": "GS000000_01",
        "id_qbic": "QTEST002AE",
        "processing_system": "Test system",
        "tumor": "no"
    }
}
```
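A minimal sketch of how such an entry could be checked against the `id_qbic` pattern from the schema, using only the Python standard library. The function name `find_invalid_sample_ids` is hypothetical, and treating every object-valued top-level entry as a sample is an assumption of this example, not something the schema excerpt confirms.

```python
import json
import re

# Pattern copied from the `id_qbic` property of the schema. JSON Schema patterns
# are not implicitly anchored, so re.search is used; re.ASCII keeps \w and \d
# restricted to ASCII characters.
ID_QBIC_PATTERN = re.compile(r"Q\w{4}\d{3}[A-X][A-X0-9]", re.ASCII)


def find_invalid_sample_ids(metadata_text: str) -> list:
    """Return the keys of sample entries whose `id_qbic` does not match the pattern."""
    metadata = json.loads(metadata_text)
    invalid = []
    for key, entry in metadata.items():
        # Assumption: every object-valued top-level entry describes a sample.
        if isinstance(entry, dict):
            sample_id = entry.get("id_qbic")
            if not isinstance(sample_id, str) or not ID_QBIC_PATTERN.search(sample_id):
                invalid.append(key)
    return invalid


example = """
{
    "files": ["reads.1.fastq.gz", "reads.2.fastq.gz"],
    "type": "dna_seq",
    "sample1": {"genome": "GRCh37", "id_qbic": "QTEST002AE", "tumor": "no"}
}
"""
print(find_invalid_sample_ids(example))  # prints []
```

A full validation would cover the remaining schema properties as well; this sketch only looks at the sample code.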

The sample referenced by `id_qbic` can be of type `Q_TEST_SAMPLE` or `Q_BIOLOGICAL_SAMPLE`. In the latter case, a new sample of type `Q_TEST_SAMPLE` is created and attached as a child to the biological sample, and the dataset is then registered under this new test sample.

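The rule for choosing the registration target can be sketched as a small pure function. This is only an illustration of the decision logic: `registration_target` and the `create_test_sample` callback are hypothetical names, and no openBIS API calls are shown or implied.

```python
from typing import Callable


def registration_target(sample_code: str, sample_type: str,
                        create_test_sample: Callable[[str], str]) -> str:
    """Return the code of the Q_TEST_SAMPLE that the dataset is registered under.

    `create_test_sample` is a hypothetical callback that creates a new
    Q_TEST_SAMPLE as a child of the given biological sample and returns its code.
    """
    if sample_type == "Q_TEST_SAMPLE":
        # The referenced sample is already a test sample: register directly.
        return sample_code
    if sample_type == "Q_BIOLOGICAL_SAMPLE":
        # A new test sample is created as a child of the biological sample.
        return create_test_sample(sample_code)
    raise ValueError(f"unsupported sample type for {sample_code}: {sample_type}")
```

Passing the creation step in as a callback keeps the decision testable without a running openBIS instance.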
"title": "Upload metadata for data registration at QBiC",
"description": "A full description of mandatory and optional metadata properties that need to/can be included for data registration via QBiC dropboxes.",
"type": "object",
"definitions": {
    "qc": {
        "type": "object",
        "properties": {
            "qcml_id": {
                "type": "string",
                "description": "A qcml id following the qcML specification",
                "pattern": "^QC:[0-9]{7}$"
            },
            "name": {
                "type": "string",
                "description": "Name of the quality control",
                "examples": ["read count", "target region read depth", "Q20 read percentage"]
            },
            "value": {
                "type": "string",
                "description": "The actual qc value"
            }
        }
    },
    "sample": {
        "type": "object",
        "properties": {
            "genome": {
                "type": "string",
                "examples": ["GRCh37"]
            },
            "id_genetics": {
                "type": "string",
                "description": "A sample URI provided by the human genetics department",
                "examples": ["GS000000_01"]
            },
            "id_qbic": {
                "type": "string",
                "pattern": "Q\\w{4}\\d{3}[A-X][A-X0-9]",
                "description": "QBIC sample code of the analysed biological specimen",
        # for each file in our dictionary that starts with the currently handled path, we add the known checksums and the paths, along with the asterisk we removed earlier
        if key.startswith(relativePath):
            f.write(value+' *'+key+'\n')
    return checksumFilePath

# moves a subset of nanopore data to a new target path, needed to add fastq and fast5 subfolders to the same dataset
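
The checksum-writing fragment above can be sketched in isolation as follows. It assumes, as the fragment suggests, a dictionary that maps file paths to checksum strings; the function names and the sorted output order are choices made for this example and are not taken from the original script.

```python
def checksum_lines(checksums: dict, relative_path: str) -> list:
    """Build one '<checksum> *<path>' line per file whose path starts with relative_path."""
    return [
        f"{checksum} *{path}\n"
        for path, checksum in sorted(checksums.items())  # sorted for stable output
        if path.startswith(relative_path)
    ]


def write_checksum_file(checksums: dict, relative_path: str, checksum_file_path: str) -> str:
    """Write the selected lines to checksum_file_path and return that path."""
    with open(checksum_file_path, "w") as f:
        f.writelines(checksum_lines(checksums, relative_path))
    return checksum_file_path
```

The ` *` separator matches the binary-mode line format used by tools such as `sha256sum` and `md5sum`, so a file written this way should be checkable with their `-c` option, provided the checksums were produced by the matching algorithm.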