docs/formats.rst
Lines changed: 36 additions & 5 deletions
@@ -14,7 +14,42 @@ Apart of this three main file formats, additionally, multiple file formats are u
14
14
Input formats
15
15
---------------------------
16
16
17
-
The quantms should receive three main inputs: Spectra data files (RAW or mzML); Protein database (Fasta); Experimental design (SDRF).
17
+
quantms requires three main inputs: the experimental design (SDRF); the spectra data files (RAW or mzML); and the protein database (FASTA).
18
+
19
+
SDRF: experimental design
20
+
~~~~~~~~~~~~~~~~~~~~~~~~~~
21
+
22
+
HUPO-PSI and ProteomeXchange recently developed MAGE-TAB, a standard file format for representing experimental designs. Within MAGE-TAB, the Sample and Data Relationship Format (SDRF) is a lightweight tab-delimited format that represents the sample metadata and its relation to the data files (RAW or mzML files).
Several SDRF concepts are **relevant and important** for the quantms pipeline:
30
+
31
+
**Peptide Search Parameters**:
32
+
33
+
- comment[cleavage agent details]: Enzyme used in the experiment, including its cleavage sites and positions.
34
+
- comment[modification parameters]: Post-translational modifications that will be considered in the peptide/protein search.
35
+
- comment[precursor mass tolerance], comment[fragment mass tolerance]: Mass tolerances used for the peptide search. Both search engines, Comet and MSGF+, use the precursor mass tolerance.
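
The search-parameter columns listed above can be read from an SDRF with a few lines of Python. This is a minimal sketch, not the parser quantms itself uses: the sample row, its values and the helper names ``read_search_parameters`` and ``split_tolerance`` are hypothetical, and it assumes each column name appears only once in the header.

```python
import csv
import io

# Hypothetical SDRF excerpt (tab-delimited); the values are illustrative only.
SDRF_TEXT = (
    "source name\tcomment[cleavage agent details]\tcomment[modification parameters]\t"
    "comment[precursor mass tolerance]\tcomment[fragment mass tolerance]\n"
    "sample 1\tNT=Trypsin\tNT=Oxidation;MT=Variable;TA=M\t10 ppm\t0.05 Da\n"
)

# Column names taken from the list above.
SEARCH_COLUMNS = [
    "comment[cleavage agent details]",
    "comment[modification parameters]",
    "comment[precursor mass tolerance]",
    "comment[fragment mass tolerance]",
]


def read_search_parameters(sdrf_text):
    """Return one dict of search parameters per SDRF row."""
    reader = csv.DictReader(io.StringIO(sdrf_text), delimiter="\t")
    rows = []
    for row in reader:
        # Normalise header case and surrounding whitespace before the lookup.
        lowered = {
            key.strip().lower(): (value or "").strip()
            for key, value in row.items()
            if key is not None
        }
        rows.append({column: lowered.get(column, "") for column in SEARCH_COLUMNS})
    return rows


def split_tolerance(text):
    """Split a tolerance such as '10 ppm' into a number and a unit."""
    value, unit = text.split()
    return float(value), unit


if __name__ == "__main__":
    for parameters in read_search_parameters(SDRF_TEXT):
        print(parameters)
```

If an SDRF repeats a column (for example, one ``comment[modification parameters]`` column per modification), ``csv.DictReader`` keeps only the last one, so a fuller parser would need to read the header positionally.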
36
+
37
+
**Experimental Design**:
38
+
39
+
- factor value[disease]: The factor value is the variable under study. In a proteomics study it can be the disease, organism part, tumor location, etc. The study variable takes multiple values depending on the samples and conditions. For example, in the SDRF above, the variable under study **factor value[phenotype]** has two values (one for each sample): control (sample 1) and primary tumor (sample 2).
40
+
41
+
.. important:: When multiple conditions are under study, the user can create multiple SDRFs (one for each variable under study). This is needed because, in the LFQ data analysis with match between runs (MBR) enabled, the proteomicsLFQ quantification step needs to match samples that belong to the same condition value.
42
+
43
+
- characteristics[biological replicate]: Biological replicates are samples that belong to the same condition value and material source.
44
+
- comment[technical replicate]: Technical replicates are repeated measurements of the same sample.
45
+
- comment[fraction identifier]: Fraction identifiers are used to number and identify each fraction (for any fractionation method).
46
+
- comment[label]: Label is used by quantms to associate samples to labels/channels in the experiment (e.g. TMT127).
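
The experimental-design columns above determine which runs belong to the same condition. The sketch below groups data files by their factor value. It is only an illustration under stated assumptions: the two-row SDRF is made up, the ``comment[data file]`` column is assumed to hold the file name of each run, and ``runs_by_condition`` is a hypothetical helper, not part of quantms.

```python
import csv
import io
from collections import defaultdict

# Hypothetical SDRF excerpt (tab-delimited); file names and values are illustrative.
SDRF_TEXT = (
    "source name\tcharacteristics[biological replicate]\tcomment[data file]\t"
    "comment[technical replicate]\tcomment[fraction identifier]\tcomment[label]\t"
    "factor value[phenotype]\n"
    "sample 1\t1\trun_01.raw\t1\t1\tlabel free sample\tcontrol\n"
    "sample 2\t1\trun_02.raw\t1\t1\tlabel free sample\tprimary tumor\n"
)


def runs_by_condition(sdrf_text):
    """Map each combination of factor values to the data files measured for it."""
    reader = csv.DictReader(io.StringIO(sdrf_text), delimiter="\t")
    # Every column that starts with "factor value[" is a variable under study.
    factor_columns = [
        name for name in reader.fieldnames
        if name.strip().lower().startswith("factor value[")
    ]
    groups = defaultdict(list)
    for row in reader:
        condition = tuple(row[column] for column in factor_columns)
        groups[condition].append(row["comment[data file]"])
    return dict(groups)


if __name__ == "__main__":
    for condition, files in runs_by_condition(SDRF_TEXT).items():
        print(condition, files)
```

Keying the groups on a tuple of factor values keeps the sketch usable when an SDRF declares more than one factor column.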
47
+
48
+
Spectra Data
49
+
~~~~~~~~~~~~~~~~~~~~~~~~~~
50
+
51
+
The spectra data can be provided as RAW files (Thermo instruments) or, preferably, as mzML. If RAW files are provided, the first step of the identification pipeline `converts them into mzML <https://quantms.readthedocs.io/en/latest/identification.html#mass-spectra-processing-raw-conversion>`_.
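
As a rough illustration of the rule above, the following sketch decides from the file extension whether a spectra file would need the RAW-to-mzML conversion step. The function name ``needs_conversion`` and the extension-based check are assumptions made for this example; the pipeline may detect file types differently.

```python
from pathlib import Path


def needs_conversion(file_name):
    """Return True for RAW input and False for mzML input."""
    suffix = Path(file_name).suffix.lower()
    if suffix == ".raw":
        return True
    if suffix == ".mzml":
        return False
    # Any other extension is outside the two formats described above.
    raise ValueError(f"Unsupported spectra file: {file_name}")


if __name__ == "__main__":
    for name in ["run_01.raw", "run_02.mzML"]:
        print(name, needs_conversion(name))
```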
52
+
18
53
19
54
Protein databases
20
55
~~~~~~~~~~~~~~~~~~
@@ -23,10 +58,6 @@ Protein databases can be download from multiple sources; the most common ones ar
23
58
24
59
.. hint:: Contaminants should be appended to the database. Each contaminant protein should carry the prefix ``CONTAMINANT_``.
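
The hint above can be sketched as a small FASTA helper that tags every contaminant entry and appends it to the target database. This is an assumption-laden illustration: the function names and the example entries are hypothetical, and the prefix is placed at the very start of the header line, which may differ from where a given contaminant database puts it.

```python
def tag_contaminants(fasta_text, prefix="CONTAMINANT_"):
    """Add the prefix to every FASTA header that does not already have it."""
    tagged = []
    for line in fasta_text.splitlines():
        if line.startswith(">") and not line[1:].startswith(prefix):
            line = ">" + prefix + line[1:]
        tagged.append(line)
    return "\n".join(tagged) + "\n"


def append_contaminants(database_text, contaminants_text):
    """Return the database followed by the tagged contaminant entries."""
    if database_text and not database_text.endswith("\n"):
        database_text += "\n"
    return database_text + tag_contaminants(contaminants_text)


if __name__ == "__main__":
    database = ">PROT1\nMKTAYIAK\n"      # hypothetical target protein
    contaminants = ">CONT1\nIVGGYTCA\n"  # hypothetical contaminant protein
    print(append_contaminants(database, contaminants), end="")
```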
25
60
26
-
Spectra Data
27
-
~~~~~~~~~~~~~~~~~~~~~~~~~~
28
-
29
-
The spectra data can be provided in RAW files (Thermo instruments) or preferably in mzML. If RAW files are provided, the first step of the identification pipeline convert them into mzML, read :ref:`identification:Mass spectra processing: Raw conversion`.
docs/index.rst
Lines changed: 6 additions & 0 deletions
@@ -3,6 +3,10 @@ quantms: A cloud-based workflow for peptide and protein quantification.
3
3
4
4
Welcome to the `quantms workflow <https://github.com/bigbio/quantms>`_, a cloud-based workflow for quantitative proteomics analysis of large mass-spectrometric data sets. Several labeling techniques as well as label-free quantification are supported.
5
5
6
+
.. image:: images/ms-proteomics.png
7
+
:width: 500
8
+
:align: center
9
+
6
10
Contents
7
11
--------
8
12
@@ -25,6 +29,8 @@ Contents
25
29
.. toctree::
26
30
:maxdepth: 2
27
31
32
+
|
33
+
28
34
Use the following links to get support and help from the quantms maintainers:
29
35
30
36
|Get help on Slack| |Report Issue| |Get help on GitHub Forum|