1.2.1 Data files

Users must place their data files in the data folder. The original data files go in the dataset subfolder (within data/), together with a tabular metadata file that describes the experimental setup corresponding to the data.

Both quantification and metadata files must be provided in tab-delimited .csv format.

Note: you do not need to provide every type of measure. With at least one type you can still run some of the analyses, provided that the data you supply is suitable for the analysis you choose.

a. Metadata file

The tabular metadata file must contain 6 columns named name_to_plot, timepoint, timenum, condition, compartment, original_name.

Explanation of the metadata format

The columns have the following meaning:

name_to_plot is the string that will appear on the figures produced by DIMet
condition is the experimental condition
timepoint is the sampling time as it is defined in your experimental setup (it is an arbitrary string that can contain non-numerical characters)
timenum is the numerical encoding of the timepoint
compartment is the name of the cellular compartment in which the measurement was made (e.g. "endo", "endocellular", "cyto", etc.)
original_name contains the column names that are provided in the quantification files

Example:

name_to_plot	condition	timepoint	timenum	compartment	original_name
Cond1 T0	cond1	T0	0	comp_name	T0_cond_1
Cond1 T24	cond1	T24	24	comp_name	T24_cond_1
Cond2 T0	cond2	T0	0	comp_name	T0_cond_2
Cond2 T24	cond2	T24	24	comp_name	T24_cond_2
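As a quick sanity check before running an analysis, the metadata file can be validated with a few lines of Python. This is a minimal sketch using only the standard library, not part of DIMet: the function name `read_metadata` is hypothetical, and the two checks (all six columns present, timenum numeric) are simply the requirements described above.

```python
import csv
import io

REQUIRED_COLUMNS = {
    "name_to_plot", "condition", "timepoint",
    "timenum", "compartment", "original_name",
}

def read_metadata(handle):
    """Parse a tab-delimited metadata file and run basic checks (hypothetical helper)."""
    reader = csv.DictReader(handle, delimiter="\t")
    missing = REQUIRED_COLUMNS - set(reader.fieldnames or [])
    if missing:
        raise ValueError(f"missing metadata columns: {sorted(missing)}")
    rows = list(reader)
    for row in rows:
        float(row["timenum"])  # raises ValueError if timenum is not numeric
    return rows

# Inline copy of the header and first two rows of the example above.
example = (
    "name_to_plot\tcondition\ttimepoint\ttimenum\tcompartment\toriginal_name\n"
    "Cond1 T0\tcond1\tT0\t0\tcomp_name\tT0_cond_1\n"
    "Cond1 T24\tcond1\tT24\t24\tcomp_name\tT24_cond_1\n"
)
metadata = read_metadata(io.StringIO(example))
print([row["original_name"] for row in metadata])
```

For a file on disk, pass `open(path, newline="")` instead of the `io.StringIO` object.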

b. Quantification files

Each quantification file is expected to correspond to one type of measure. Supported measure types are:

Isotopologue absolute values
Total metabolite abundances
Mean enrichment (also called Fractional contribution)
Isotopologue proportions

Expected format of the quantification files with examples

Each row in the quantification files contains measurements for a given metabolite. Expected columns are the following:

ID contains the molecule identifiers
All the other columns contain measures in numeric format (no letters or symbols, only numbers).

Note 1: the column names of the quantification files must match the values in the original_name column of the metadata file.

Note 2: for the isotopologues, the ID must follow the convention metaboliteID_m+X (for example: AMP_m+4, cit_m+0, cit_m+1).
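The two notes above can be checked programmatically. The following is a sketch under stated assumptions, not DIMet code: `split_isotopologue_id` and `unmatched_columns` are hypothetical helper names, and the regular expression is one possible reading of the metaboliteID_m+X convention.

```python
import re

# One possible encoding of the metaboliteID_m+X convention (an assumption).
ISOTOPOLOGUE_ID = re.compile(r"^(?P<metabolite>.+)_m\+(?P<label>\d+)$")

def split_isotopologue_id(identifier):
    """Split 'cit_m+1' into ('cit', 1); raise if the convention is not followed."""
    match = ISOTOPOLOGUE_ID.match(identifier)
    if match is None:
        raise ValueError(f"not a valid isotopologue ID: {identifier!r}")
    return match["metabolite"], int(match["label"])

def unmatched_columns(quantification_header, original_names):
    """Return the sample columns that have no entry in original_name."""
    known = set(original_names)
    return [c for c in quantification_header if c != "ID" and c not in known]

print(split_isotopologue_id("AMP_m+4"))
print(unmatched_columns(
    ["ID", "T0_cond_1", "T24_cond_1", "T0_cond_3"],
    ["T0_cond_1", "T24_cond_1", "T0_cond_2", "T24_cond_2"],
))
```

In the second call, `T0_cond_3` is deliberately absent from the metadata, so it is the only column reported.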

The total metabolite abundances file:

ID	T0_cond_1	T24_cond_1	T0_cond_2	T24_cond_2
PEP	3364610.46	10250098.25	1124772.29	1035932.25
citrate	5783654.51	5934305.65	3546334.99	3460334.88
fumarate	354387.74	360087.74	334287.74	350387.74
OA	9435186.33	9435186.33	9435186.33	9435186.33

The Mean enrichment (also called Fractional contribution) file:

ID	T0_cond_1	T24_cond_1	T0_cond_2	T24_cond_2
PEP	0.5603	0.6391	0.9591	0.9553
citrate	0.8057	0.8870	0.7809	0.6918
fumarate	0.001	0	0.1508	0.1511
OA	0.7030	0.7006	0.001	0

The Isotopologue absolute values file:

ID	T0_cond_1	T24_cond_1	T0_cond_2	T24_cond_2
PEP_m+0	357354.66	387054.66	0	0
PEP_m+1	965435.68	975030.68	668.91	568.87
PEP_m+2	1435050.95	7987654.66	136749.05	137709.05
PEP_m+3	606769.17	900358.25	987354.33	897654.33

The Isotopologue proportions file:

ID	T0_cond_1	T24_cond_1	T0_cond_2	T24_cond_2
PEP_m+0	0.106	0.038	0.000	0.000
PEP_m+1	0.287	0.095	0.001	0.001
PEP_m+2	0.427	0.779	0.122	0.133
PEP_m+3	0.180	0.088	0.878	0.867
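The four example files are mutually consistent, which can help when checking your own data. The sketch below uses the PEP values of sample T0_cond_1 from the isotopologue absolute values example and applies the usual definitions: total abundance as the sum of the isotopologues, proportions as each isotopologue divided by that sum, and mean enrichment as the label-weighted mean of the proportions divided by the number of labelable atoms. It is an illustration only, not DIMet's implementation. The number of labelable atoms is assumed here to equal the highest m+X present (3 for PEP), and `label_count` is a hypothetical helper.

```python
# Absolute isotopologue values for PEP, sample T0_cond_1 (from the example above).
absolute = {
    "PEP_m+0": 357354.66,
    "PEP_m+1": 965435.68,
    "PEP_m+2": 1435050.95,
    "PEP_m+3": 606769.17,
}

def label_count(isotopologue_id):
    """Return X from an ID of the form metaboliteID_m+X."""
    return int(isotopologue_id.rsplit("_m+", 1)[1])

# Total abundance is the sum over all isotopologues of the metabolite.
total = sum(absolute.values())

# Isotopologue proportions: each isotopologue divided by the total.
proportions = {iso: value / total for iso, value in absolute.items()}

# Assumption: number of labelable atoms = highest m+X present (3 here).
n_atoms = max(label_count(iso) for iso in absolute)
mean_enrichment = sum(
    label_count(iso) * p for iso, p in proportions.items()
) / n_atoms

print(round(total, 2))
print({iso: round(p, 3) for iso, p in proportions.items()})
print(round(mean_enrichment, 4))
```

After rounding, the results agree with the PEP entries for T0_cond_1 in the total abundances, isotopologue proportions and mean enrichment examples.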

c. Data files for the omics integration (optional)

DIMet offers the possibility of pathway-based integration of the metabolome and the transcriptome through metabolograms.

Data files required for omics integration

Two data types are required:

Metabolite quantification files in the dataset subfolder.
User-provided results of the differential analysis of the transcriptome data, also placed in the dataset subfolder.

In addition to the files of differentially expressed genes, the user must also provide the pathway files (details in item 2.2 of this subsection).

Thus the expected project data structure becomes:

MYPROJECT
  ├── config
  │   ├── analysis
  │   │   ├── dataset
  │   │   │   └── # --->'dataset configuration' yml files
  │   │   ├── # --->'analysis configuration' yml files
  │   ├── # ---> 'general configuration' yml files
  └── data
      └── DATASET1_data
          ├── # ---> tabular .csv files of metabolomics data
          ├── # ---> .csv files required for omics integration (genes and pathways)

2.1 Files for differentially expressed genes (DEGs)

Files for differentially expressed genes (DEGs) must be provided in the tab delimited .csv format. For each file:

The rows represent the genes (except the first one, which is the header with the column names)
The columns provide the information to be integrated; two columns are compulsory:
1. the gene names, given as strings
2. the Fold-Changes (or the log2 Fold-Changes) in numeric format (no letters or symbols, only numbers)

Formatting example of differentially expressed genes files:

log2FoldChange	gene_symbol
-16.1660338229612	GPI
3.32192809488736	HK1
2.32192809488736	RPIA
0.807354922057604	PFKL
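Since the example lists the fold-change column before the gene column, it is safer to look columns up by header name than by position when inspecting a DEGs file. A minimal sketch follows; `read_degs` is a hypothetical helper, and the column names `gene_symbol` and `log2FoldChange` are taken from the example above and may differ in your own files.

```python
import csv
import io

def read_degs(handle, gene_column="gene_symbol", value_column="log2FoldChange"):
    """Return {gene: fold change} from a tab-delimited DEGs file (hypothetical helper)."""
    reader = csv.DictReader(handle, delimiter="\t")
    # float() raises ValueError if a value contains letters or symbols.
    return {row[gene_column]: float(row[value_column]) for row in reader}

# Inline copy of the first rows of the example above.
example = (
    "log2FoldChange\tgene_symbol\n"
    "-16.1660338229612\tGPI\n"
    "3.32192809488736\tHK1\n"
)
degs = read_degs(io.StringIO(example))
print(sorted(degs))
```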

2.2 The metabolites per pathway and genes or transcripts per pathway files

These files contain the user-provided metabolites and genes for each pathway. A metabolite or gene may appear in several pathways. Metabolite identifiers must match those appearing in the quantification files in the dataset subfolder. Gene names must match those appearing in the DEGs file.

Example for metabolites per pathway:

GLYCOLYSIS	PENTOSE_PHOSPHATE	...
Glucose_6P	Ribose_5P	...
Pyruvate	Xylulose_5P	...
PEP	Glucose_6P	...
...	...	...

Example for genes per pathway:

GLYCOLYSIS	PENTOSE_PHOSPHATE	...
GPI	RPIA	...
HK1	PGD	...
PFKL	RBKS	...
...	...	...

All these files must be provided in the tab delimited .csv format.
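Because a single misspelled identifier can silently drop a gene or metabolite from the integration, it may be worth cross-checking the pathway files against the other inputs. The sketch below is one way to do it with the standard library. `read_pathways` and `missing_members` are hypothetical helpers, not DIMet functions, and the layout assumed is the one shown in the examples (one pathway per column, header row with pathway names).

```python
import csv
import io

def read_pathways(handle):
    """Return {pathway: [members]} from a column-per-pathway tab-delimited file."""
    rows = list(csv.reader(handle, delimiter="\t"))
    header, body = rows[0], rows[1:]
    pathways = {name: [] for name in header}
    for row in body:
        for name, cell in zip(header, row):
            if cell.strip():  # columns may have different lengths
                pathways[name].append(cell.strip())
    return pathways

def missing_members(pathways, known_ids):
    """Return, per pathway, the members that are absent from known_ids."""
    report = {}
    for name, members in pathways.items():
        absent = [m for m in members if m not in known_ids]
        if absent:
            report[name] = absent
    return report

genes_file = (
    "GLYCOLYSIS\tPENTOSE_PHOSPHATE\n"
    "GPI\tRPIA\n"
    "HK1\tPGD\n"
    "PFKL\tRBKS\n"
)
# Gene names from the abbreviated DEGs example in item 2.1.
deg_genes = {"GPI", "HK1", "RPIA", "PFKL"}

pathways = read_pathways(io.StringIO(genes_file))
print(missing_members(pathways, deg_genes))
```

Here PGD and RBKS are reported only because the DEGs example is abbreviated to four genes. The same check applies to the metabolites per pathway file, using the ID column of a quantification file as `known_ids`.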
