Commit d097ade

Merge pull request #36 from ypriverol/readthedocs
Readthedocs
2 parents aa3945d + ac18f8e commit d097ade

3 files changed: +45 -6 lines changed

docs/formats.rst

Lines changed: 36 additions & 5 deletions
@@ -14,7 +14,42 @@ Apart of this three main file formats, additionally, multiple file formats are u
 Input formats
 ---------------------------

-The quantms should receive three main inputs: Spectra data files (RAW or mzML); Protein database (Fasta); Experimental design (SDRF).
+quantms requires three main inputs: experimental design (SDRF); spectra data files (RAW or mzML); protein database (FASTA).
+
+SDRF: experimental design
+~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+HUPO-PSI and ProteomeXchange recently developed MAGE-TAB, a standard file format for representing experimental designs. Within MAGE-TAB, the Sample and Data Relationship Format (SDRF) is a lightweight tab-delimited format that represents the sample metadata and its relation to the data files (RAW or mzML files).
+
+.. image:: https://raw.githubusercontent.com/bigbio/proteomics-metadata-standard/master/sdrf-proteomics/images/sdrf-nutshell.png
+   :width: 900
+   :align: center
+
+|
+Several concepts from SDRF are **relevant and important** for the quantms pipeline:
+
+**Peptide Search Parameters**:
+
+- comment[cleavage agent details]: enzyme used in the experiment, including sites and positions.
+- comment[modification parameters]: post-translational modifications that will be considered in the peptide/protein search.
+- comment[precursor mass tolerance], comment[fragment mass tolerance]: precursor and fragment mass tolerances used for the peptide search. Both search engines, Comet and MSGF+, use these parameters.
+
+**Experimental Design**:
+
+- factor value[disease]: The factor value is the variable under study. In a proteomics study it can be the disease, organism part, tumor location, etc. The study variable takes multiple values depending on the samples and conditions. For example, in the SDRF above, the variable under study **factor value[phenotype]** has two values (one for each sample): control (sample 1) and primary tumor (sample 2).
+
+.. important:: When multiple conditions are under study, the user can create multiple SDRFs (one for each variable under study). This is needed because, in LFQ data analysis with match between runs (MBR) enabled, the proteomicsLFQ quantification step needs to match samples that belong to the same condition value.
+
+- characteristics[biological replicate]: Biological replicates are samples that belong to the same condition value and material source.
+- comment[technical replicate]: Technical replicates are repeated measurements of the same sample.
+- comment[fraction identifier]: Fraction identifiers are used to number and identify each fraction (for any fractionation method).
+- comment[label]: The label is used by quantms to associate samples with labels/channels in the experiment (e.g. TMT127).
+
+Spectra Data
+~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+The spectra data can be provided in RAW files (Thermo instruments) or, preferably, in mzML. If RAW files are provided, the first step of the identification pipeline `converts them into mzML <https://quantms.readthedocs.io/en/latest/identification.html#mass-spectra-processing-raw-conversion>`_.
+

 Protein databases
 ~~~~~~~~~~~~~~~~~~
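
To illustrate the SDRF columns introduced in the hunk above, here is a minimal Python sketch that reads a tab-delimited SDRF with the standard `csv` module, splits a `KEY=VALUE;KEY=VALUE` cell, and groups data files by factor value. The inline SDRF content, sample and file names, and the helpers `parse_cell` and `group_by_factor` are hypothetical. The `NT=...;AC=...` cell syntax and the `comment[data file]` column are assumed from the SDRF-Proteomics convention rather than taken from this commit, and quantms does not necessarily read SDRF this way.

```python
import csv
import io
from collections import defaultdict

# Hypothetical two-sample SDRF (tab-delimited); real files carry many more columns.
SDRF_TEXT = (
    "source name\tcharacteristics[biological replicate]\tcomment[data file]\t"
    "comment[label]\tcomment[cleavage agent details]\tfactor value[phenotype]\n"
    "sample 1\t1\trun1.raw\tlabel free sample\tNT=Trypsin;AC=MS:1001251\tcontrol\n"
    "sample 2\t1\trun2.raw\tlabel free sample\tNT=Trypsin;AC=MS:1001251\tprimary tumor\n"
)


def parse_cell(value):
    """Split a 'KEY=VALUE;KEY=VALUE' cell into a dict (assumed SDRF cell syntax)."""
    pairs = (item.split("=", 1) for item in value.split(";") if "=" in item)
    return {key.strip(): val.strip() for key, val in pairs}


def group_by_factor(rows, factor_column):
    """Map each value of the variable under study to the data files that carry it."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[factor_column]].append(row["comment[data file]"])
    return dict(groups)


# Each row of the SDRF links one sample to one data file.
rows = list(csv.DictReader(io.StringIO(SDRF_TEXT), delimiter="\t"))
enzyme = parse_cell(rows[0]["comment[cleavage agent details]"])
groups = group_by_factor(rows, "factor value[phenotype]")

print(enzyme["NT"])
print(groups)
```

Grouping by the factor value column mirrors the point of the `important` note: runs are only comparable within the same condition value, so a sketch like this can be used to check which files would fall into each condition before splitting an SDRF per variable.
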
@@ -23,10 +58,6 @@ Protein databases can be download from multiple sources; the most common ones ar

 .. hint:: Contaminants should be appended to the database. For each contaminant protein, the prefix ``CONTAMINANT_`` should be added to the protein identifier.

-Spectra Data
-~~~~~~~~~~~~~~~~~~~~~~~~~~
-
-The spectra data can be provided in RAW files (Thermo instruments) or preferably in mzML. If RAW files are provided, the first step of the identification pipeline convert them into mzML, read :ref:`identification:Mass spectra processing: Raw conversion`.

 Output formats
 ---------------------------
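
The contaminant hint in the hunk above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper names `prefix_contaminants` and `append_contaminants`, the FASTA entries, and the choice to place `CONTAMINANT_` directly after the `>` of each header are all hypothetical, since the hint does not say exactly where in the header the prefix belongs.

```python
def prefix_contaminants(fasta_text, prefix="CONTAMINANT_"):
    """Prepend `prefix` to every FASTA header that does not already carry it."""
    out = []
    for line in fasta_text.splitlines():
        # Header lines start with '>'; sequence lines are left untouched.
        if line.startswith(">") and not line[1:].startswith(prefix):
            line = ">" + prefix + line[1:]
        out.append(line)
    return "\n".join(out) + "\n"


def append_contaminants(database_fasta, contaminants_fasta):
    """Return the target database followed by the prefixed contaminant entries."""
    if not database_fasta.endswith("\n"):
        database_fasta += "\n"
    return database_fasta + prefix_contaminants(contaminants_fasta)


# Hypothetical entries; real databases come from sources such as UniProt.
TARGET = ">target1 example protein\nMKTAYIAK\n"
CONTAM = ">contam1 example contaminant\nFPTDDDDK\n"

combined = append_contaminants(TARGET, CONTAM)
print(combined)
```

Keeping the prefixing step separate from the concatenation makes it safe to run on a contaminant file that was already prefixed, because headers that start with the prefix are left unchanged.
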

docs/index.rst

Lines changed: 6 additions & 0 deletions
@@ -3,6 +3,10 @@ quantms: A cloud-based workflow for peptide and protein quantification.

 Welcome to the `quantms workflow <https://github.com/bigbio/quantms>`_, a cloud-based workflow for quantitative proteomics analysis of large mass-spectrometric data sets. Several labeling techniques as well as label-free quantification are supported.

+.. image:: images/ms-proteomics.png
+   :width: 500
+   :align: center
+
 Contents
 --------

@@ -25,6 +29,8 @@ Contents
 .. toctree::
    :maxdepth: 2

+|
+
 The following links can be used to get support and help from the quantms maintainers:

 |Get help on Slack| |Report Issue| |Get help on GitHub Forum|

docs/introduction.rst

Lines changed: 3 additions & 1 deletion
@@ -8,6 +8,7 @@ Bottom-up proteomics is a common method to identify proteins and characterize th
    :width: 400
    :align: center

+|

 .. sidebar:: Pipelines and tools
    :subtitle: **It can make your life easier** if you want to explore individual tools:
:subtitle: **It can make your life easier** if you want to explore individual tools:
@@ -30,7 +31,8 @@ Mass spectrometry quantitative data analysis can be divided in three main steps:
 - downstream data analysis and quality control

 .. image:: images/quantms.png
-   :width: 350
+   :width: 450
+   :align: center

 References
 --------------------------------
