Commit 414073b

wow-such-code, sven1103, jenniferboedker authored
Documentation/convert ms (#72)
* add readme for conversion and registration of mass spec data

Co-authored-by: jnnfr <[email protected]>

* move readme to the top folder
* fix formatting

Co-authored-by: Sven F <[email protected]>
Co-authored-by: jnnfr <[email protected]>
1 parent f0ba260 commit 414073b

1 file changed: +68 -2 lines changed

README.md

Lines changed: 68 additions & 2 deletions
@@ -40,7 +40,9 @@ openBIS.
4040
Formats:
4141

4242
- [NGS single-end / paired-end data](#ngs-single-end--paired-end-data)
43-
- [NGS single-end / paired-end data with metadata (deprecated)](#ngs-single-end--paired-end-data-with-metadata-(deprecated))
43+
- [NGS single-end / paired-end data with metadata (deprecated)](#ngs-single-end--paired-end-data-with-metadata)
44+
- [Attachment Data](#attachment-data)
45+
- [Mass Spectrometry mzML conversion and registration](#mass-spectrometry-mzml-conversion-and-registration)
4446

4547
### NGS single-end / paired-end data
4648

@@ -91,7 +93,9 @@ look like this:
9193
|-- <QBIC sample code>.fastq.gz.sha256sum
9294
```
9395

94-
### NGS single-end / paired-end data with metadata (deprecated)
96+
97+
### NGS single-end / paired-end data with metadata
98+
(deprecated)
9599

96100
**Disclaimer!**
97101
This data format is targeted for a single use case and should not be
@@ -183,3 +187,65 @@ See code examples:
183187
https://github.com/qbicsoftware/attachi-cli/blob/master/attachi/attachi.py#L63
184188
https://github.com/qbicsoftware/projectwizard-portlet/blob/9c86f500b26af4cf2613cfae32e470bf5d50bf78/src/main/java/life/qbic/projectwizard/io/AttachmentMover.java#L145
185189

190+
191+
### Mass Spectrometry mzML conversion and registration
192+
193+
**Responsible dropbox:**
194+
[QBiC-convert-register-ms-vendor-format](drop-boxes/register-convert-ms-vendor-format)
195+
196+
**Resulting data model in openBIS**
197+
...Q_TEST_SAMPLE (-> Q_MHC_LIGAND_EXTRACT (Immunomics case)) -> Q_MS_RUN per data file -> 2 DataSets per data file: one for the raw data, one for the data converted to mzML
198+
199+
**Expected data structure**
200+
In every use case, the data structure needs to contain a top folder around the respective data in order to accommodate metadata files.
201+
202+
The sample code found in the top folder can be of type `Q_TEST_SAMPLE` or `Q_MS_RUN`. In the former case, a new sample of type `Q_MS_RUN` is created and attached as a child of the test sample.
203+
204+
**Valid folder/file types**:
205+
- Thermo Fisher Raw file format
206+
- Waters Raw folder
207+
- Bruker .d folder
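
As an illustration of the three valid types listed above, here is a minimal Python sketch that tells them apart by file system entry type and suffix. The function name `vendor_format` and the returned labels are hypothetical, and the assumption that Waters folders carry a `.raw` suffix is not stated in this README. This is not the dropbox's actual detection logic.

```python
from pathlib import Path
from typing import Optional


def vendor_format(path: Path) -> Optional[str]:
    """Guess the vendor format of an incoming file or folder.

    Hypothetical helper: the labels and the '.raw' suffix for Waters
    folders are assumptions made for this sketch.
    """
    suffix = path.suffix.lower()
    if suffix == ".raw" and path.is_file():
        return "thermo"  # Thermo Fisher Raw file
    if suffix == ".raw" and path.is_dir():
        return "waters"  # Waters Raw folder (assumed to end in .raw)
    if suffix == ".d" and path.is_dir():
        return "bruker"  # Bruker .d folder
    return None  # not one of the listed types
```

Anything that returns `None` here would not match one of the listed folder/file types.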
208+
209+
**Incoming structure overview for standard case without additional metadata file:**
210+
```
211+
QABCD102A5_20201229145526_20201014_CO_0976StSi_R05_.raw
212+
|-- QABCD102A5_20201229145526_20201014_CO_0976StSi_R05_.raw
213+
|-- QABCD102A5_20201229145526_20201014_CO_0976StSi_R05_.raw.sha256sum
214+
```
215+
In this case, existing mass spectrometry metadata is expected to be already stored and the dataset will be attached.
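
The standard-case layout above can be checked before upload with a short Python sketch like the following. It only assumes what the overview shows: the top folder contains a data file or folder with the same name, plus a `<name>.sha256sum` file. The function name `check_standard_structure` is hypothetical and not part of the dropbox code.

```python
from pathlib import Path
from typing import List


def check_standard_structure(top_folder: Path) -> List[str]:
    """Return a list of problems found in a standard-case incoming folder.

    Hypothetical helper: it checks only the layout shown in the
    overview, not the checksum value itself.
    """
    problems = []
    name = top_folder.name
    # The data file or folder is expected to share the top folder's name.
    if not (top_folder / name).exists():
        problems.append(f"missing data file or folder: {name}")
    # The checksum file sits next to the data inside the top folder.
    if not (top_folder / f"{name}.sha256sum").is_file():
        problems.append(f"missing checksum file: {name}.sha256sum")
    return problems
```

An empty list means the folder matches the layout shown above.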
216+
217+
218+
**Incoming structure overview for the use case of Immunomics data with metadata file:**
219+
```
220+
QABCD090B7
221+
|-- QABCD090B7
222+
| |-- file1.raw
223+
| |-- file2.raw
224+
| |-- file3.raw
225+
| `-- metadata.tsv
226+
|-- QABCD090B7.sha256sum
227+
`-- source_dropbox.txt
228+
```
229+
The `source_dropbox.txt` file currently has to name one of the Immunomics data sources as the source.
230+
231+
The `metadata.tsv` columns for the Immunomics case are tab-separated:
232+
```
233+
Filename Q_MS_DEVICE Q_MEASUREMENT_FINISH_DATE Q_EXTRACT_SHARE Q_ADDITIONAL_INFO Q_MS_LCMS_METHODS technical_replicate workflow_type
234+
file1.raw THERMO_QEXACTIVE 171010 10 QEX_TOP07_470MIN DDA_Rep1 DDA
235+
```
236+
237+
Filename - one of the (e.g. raw) file names found in the incoming structure
238+
239+
Q_MS_DEVICE - openBIS code from the vocabulary of Mass Spectrometry devices
240+
241+
Q_MEASUREMENT_FINISH_DATE - Date in YYMMDD format (ISO 8601:2000)
242+
243+
Q_EXTRACT_SHARE - the extract share
244+
245+
Q_ADDITIONAL_INFO - any optional comments
246+
247+
Q_MS_LCMS_METHODS - openBIS code from the vocabulary of LCMS methods
248+
249+
technical_replicate - free text to denote replicates
250+
251+
workflow_type - DDA or DIA
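
The column rules above can be checked with a small Python sketch such as the one below. It assumes that all eight columns must be present, that the date is exactly six digits in YYMMDD order, and that `workflow_type` is exactly `DDA` or `DIA`. The function name `read_metadata` is hypothetical, and the openBIS vocabulary codes are not validated here because that would require a lookup against openBIS.

```python
import csv
from datetime import datetime

EXPECTED_COLUMNS = [
    "Filename", "Q_MS_DEVICE", "Q_MEASUREMENT_FINISH_DATE", "Q_EXTRACT_SHARE",
    "Q_ADDITIONAL_INFO", "Q_MS_LCMS_METHODS", "technical_replicate",
    "workflow_type",
]
WORKFLOW_TYPES = {"DDA", "DIA"}


def read_metadata(handle):
    """Parse a tab-separated metadata stream and return its rows as dicts.

    Hypothetical helper: raises ValueError on a missing column, a date
    that is not YYMMDD, or a workflow_type other than DDA or DIA.
    """
    reader = csv.DictReader(handle, delimiter="\t")
    missing = set(EXPECTED_COLUMNS) - set(reader.fieldnames or [])
    if missing:
        raise ValueError(f"missing columns: {sorted(missing)}")
    rows = []
    for row in reader:
        date = row["Q_MEASUREMENT_FINISH_DATE"] or ""
        if len(date) != 6 or not date.isdigit():
            raise ValueError(f"date is not YYMMDD: {date!r}")
        # strptime raises ValueError for impossible dates such as 171340.
        datetime.strptime(date, "%y%m%d")
        if row["workflow_type"] not in WORKFLOW_TYPES:
            raise ValueError(f"workflow_type must be DDA or DIA: {row['workflow_type']!r}")
        rows.append(row)
    return rows
```

For a file on disk, the function can be called as `read_metadata(open("metadata.tsv", newline=""))`. An empty `Q_ADDITIONAL_INFO` cell is accepted, since that column holds optional comments.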
