SAIGE is a popular tool for performing association tests with multiple types of genetic data, such as:
- Common Variants
- Rare Variants
- Gene Burden
Here are some SAIGE-specific resources:
Software needed to run the workflow:
- git
- A text editor, such as vim or nano, for updating the nextflow.config profiles
- Nextflow version 23.04.1.5866
- Singularity version 3.8.3 OR Docker version 4.30.0
- JDK version 11.0.5
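Before starting, it can help to confirm that the tools listed above are on your PATH. The snippet below is a minimal sketch, not part of the toolkit: check_tool is a hypothetical helper name introduced here, and it only tests for presence, not for the specific versions listed.

```shell
# check_tool NAME: report whether a command is available on PATH.
# (check_tool is a hypothetical helper, not something the toolkit provides.)
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1: found"
    else
        echo "$1: MISSING"
        return 1
    fi
}

# Tools that are always needed.
for tool in git nextflow java; do
    check_tool "$tool" || true
done

# Either container runtime is enough, so only complain if both are absent.
if ! command -v singularity >/dev/null 2>&1 && ! command -v docker >/dev/null 2>&1; then
    echo "container runtime: MISSING (need singularity or docker)"
fi
```

To compare against the versions listed above, each tool can report its own version: git --version, nextflow -version, java -version, and singularity --version or docker --version.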
Note: the test data were obtained from the SAIGE GitHub repo.
- Start your own project directory and go there
mkdir my_new_saige_project
cd my_new_saige_project
- Build the saige.sif Singularity image
singularity build saige.sif docker://pennbiobank/saige:latest
  - You do not have to rebuild the image for every project, BUT keeping a copy with the project is good for versioning
  - If you want to reuse containers, you could build them in a more general containers directory somewhere
- Download the source code by cloning from git
git clone https://github.com/PMBB-Informatics-and-Genomics/pmbb-nf-toolkit-saige-family.git
  - You may do this in your project directory, but it often makes sense to clone into a general tools location
- Copy the contents of one of the test data directories
cp -r pmbb-nf-toolkit-saige-family/test_data/ExWAS/test_config_exwas_no_GRM/* .
- Fill out the nextflow.config file to make sure it matches your system
  - See the Nextflow executor information for details
  - From the command line, you can open the file in the text editor vim with: vim nextflow.config
  - The profile's attribute process.container should be set to '/path/to/saige.sif' (replace /path/to with the location where you built the image in step 2)
- See
- Do a stub run to test
nextflow run /path/to/pmbb-nf-toolkit-saige-family/workflows/saige_exwas.nf -profile cluster -stub
  - Note: Replace /path/to/ with the actual path (full or relative) where you cloned the repository in step 3.
  - Make sure the name passed to -profile is the profile you set up
- Now run the workflow on the actual data!
nextflow run /path/to/pmbb-nf-toolkit-saige-family/workflows/saige_exwas.nf -profile cluster
  - Note: Replace /path/to/ with the actual path (full or relative) where you cloned the repository in step 3.
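The last two steps above (stub run, then real run) can be chained so that the real run only starts if the stub run succeeds. The following is a rough sketch under stated assumptions: stub_then_run and the NEXTFLOW_BIN variable are hypothetical names introduced for illustration, and the workflow path is still the placeholder from the steps above.

```shell
# NEXTFLOW_BIN is a hypothetical override so the launcher can be swapped out;
# it defaults to the nextflow command on PATH.
NEXTFLOW_BIN="${NEXTFLOW_BIN:-nextflow}"

# stub_then_run WORKFLOW PROFILE
# Runs the workflow with -stub first and starts the real run only if that succeeds.
stub_then_run() {
    workflow="$1"
    profile="$2"
    "$NEXTFLOW_BIN" run "$workflow" -profile "$profile" -stub || {
        echo "stub run failed; not starting the real run" >&2
        return 1
    }
    "$NEXTFLOW_BIN" run "$workflow" -profile "$profile"
}

# Example call (replace /path/to/ as described in the steps above):
# stub_then_run /path/to/pmbb-nf-toolkit-saige-family/workflows/saige_exwas.nf cluster
```

Stopping after a failed stub run keeps a misconfigured profile or a wrong path from consuming cluster time on the real data.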
To run the workflow on your own data, the steps are similar:
- Start your own project directory and go there
mkdir my_new_saige_project
cd my_new_saige_project
- Build the saige.sif Singularity image
singularity build saige.sif docker://pennbiobank/saige:latest
  - You do not have to rebuild the image for every project, BUT keeping a copy with the project can be good for versioning in case the image gets updated later
  - If you want to reuse containers, you could build them in a more general containers directory somewhere
- Download the source code by cloning from git
git clone https://github.com/PMBB-Informatics-and-Genomics/pmbb-nf-toolkit-saige-family.git
  - You may do this in your project directory, but it often makes sense to clone into a general tools location
- Set up the appropriate config file for the pipeline you want to run
  - Update any and all input files!
  - Double-check the parameters for running the software (QC filters, settings, etc.)
  - Refer to the READMEs directory for more information on every parameter and input file
- Fill out the nextflow.config file to make sure it matches your system
  - See the Nextflow executor information for details
  - From the command line, you can open the file in the text editor vim with: vim nextflow.config
  - The profile's attribute process.container should be set to '/path/to/saige.sif' (replace /path/to/ with the path where you built the image in step 2)
  - Make sure the nextflow.config file has an includeConfig statement
- See
- Do a stub run to test
nextflow run /path/to/pmbb-nf-toolkit-saige-family/workflows/saige_exwas.nf -profile cluster -stub
  - Note: Replace /path/to/ with the actual path (full or relative) where you cloned the repository in step 3.
  - Make sure the name passed to -profile is the profile you set up
- Now run the workflow on the actual data!
nextflow run /path/to/pmbb-nf-toolkit-saige-family/workflows/saige_exwas.nf -profile cluster
  - Note: Replace /path/to/ with the actual path (full or relative) where you cloned the repository in step 3.
  - You do not need to copy YOUR data to the Input/ folder (unless you want to). It usually makes sense to add full paths to your configs/workflow_specific_config as needed.
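Several of the config steps above are easy to miss, so a quick text check of nextflow.config before the stub run may save a failed launch. This is a minimal sketch, assuming a plain-text search is good enough: check_config is a hypothetical helper introduced here, it does not parse the Nextflow config syntax, and it only looks for the /path/to placeholder, an includeConfig statement, and the profile name.

```shell
# check_config FILE PROFILE
# Prints one line per problem found; returns non-zero if there were any.
# (check_config is a hypothetical helper, not something the toolkit provides.)
check_config() {
    config="$1"
    profile="$2"
    problems=0

    if [ ! -f "$config" ]; then
        echo "not found: $config"
        return 1
    fi
    # The /path/to placeholder must be replaced with a real location.
    if grep -qF "/path/to" "$config"; then
        echo "placeholder /path/to is still present"
        problems=$((problems + 1))
    fi
    # The steps above ask for an includeConfig statement in nextflow.config.
    if ! grep -qF "includeConfig" "$config"; then
        echo "no includeConfig statement"
        problems=$((problems + 1))
    fi
    # Plain-text match only; this does not inspect the profiles block itself.
    if ! grep -qF "$profile" "$config"; then
        echo "profile name '$profile' does not appear"
        problems=$((problems + 1))
    fi
    [ "$problems" -eq 0 ]
}

# Example call from the project directory:
# check_config nextflow.config cluster
```

A clean result only means these three text checks passed; the stub run remains the real test of the configuration.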