mblanche/aligner

batchAlignment

Takes SIMR LIMS flowcell directories and sends each sample's fastq files for alignment in parallel using SGE.

SYNOPSIS

batchBowtie [options] 
            [--bowtie bowtie_index]
            [--directories flowcell_1 flowcell_2 ...]
            [-- any extra parameter to pass to bowtie]

 Options:
--help            brief help message
--man             full documentation

--destdir         destination dir to save the results (Def: ~/batchAlignement/aln_timeStamp)
--sub_selection   smaller list of samples to use
--excluded        samples to remove from the job

--queue           SGE queue to use (Def: all.q)

--dryrun          print the jobs that will be sent to SGE

 Every option name can be abbreviated to its shortest unique prefix (e.g. -dir for --directories, -de for --destdir)
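The abbreviation rule above can be illustrated with a small Python sketch. This is not the script's own code: `resolve_option` and the `OPTIONS` list are hypothetical names, and the list only contains the options shown in the summary above. A prefix is accepted when exactly one option name starts with it.

```python
def resolve_option(prefix, option_names):
    """Return the single option name that starts with prefix, or raise ValueError."""
    # An exact match always wins, even if it is also a prefix of a longer name.
    if prefix in option_names:
        return prefix
    matches = [name for name in option_names if name.startswith(prefix)]
    if len(matches) == 1:
        return matches[0]
    if not matches:
        raise ValueError(f"unknown option: {prefix}")
    raise ValueError(f"ambiguous option {prefix}: {', '.join(sorted(matches))}")


# Option names taken from the summary above (illustrative list only).
OPTIONS = ["help", "man", "destdir", "sub_selection", "excluded",
           "queue", "dryrun", "bowtie", "directories"]

print(resolve_option("dir", OPTIONS))  # directories
print(resolve_option("de", OPTIONS))   # destdir
```

With this list, a prefix such as `d` is rejected as ambiguous because it matches destdir, dryrun and directories.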

REQUIRED ARGUMENTS

--directories

Path to a flowcell directory containing the fastq files and the .csv file describing the samples (generated by the SIMR LIMS system). Multiple directories can be passed, separated by spaces.

--bowtie

Path to the root of the bowtie index. If the environment variable BOWTIE_TX_INDEXES exists and points to the directory containing the bowtie indexes, only the name of the index needs to be provided.
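One way to read the BOWTIE_TX_INDEXES rule is sketched below in Python. The function name `resolve_index` is hypothetical, and treating "only the name of the index" as "an argument with no directory component" is an assumption; the script may resolve the index differently.

```python
import os


def resolve_index(bowtie_arg, environ=None):
    """Return the index root to hand to bowtie2 for a given --bowtie value."""
    environ = os.environ if environ is None else environ
    index_dir = environ.get("BOWTIE_TX_INDEXES")
    # Assumption: a bare index name has no directory part, so it is looked up
    # inside BOWTIE_TX_INDEXES; anything else is used as given.
    if index_dir and os.path.dirname(bowtie_arg) == "":
        return os.path.join(index_dir, bowtie_arg)
    return bowtie_arg
```

A full path passed to --bowtie is returned unchanged, so the environment variable only affects bare index names.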

OPTIONS

--help

Prints a brief help message and exits.

--man

Prints the manual page and exits.

--destdir

Name and location of the directory where the final bam files will be saved.

--dryrun

Prints the jobs that will be run without executing them.

--sub_selection

Smaller list of samples to use.

--excluded

Samples to remove from the job.

--queue

SGE queue to use (Def: all.q)

--

Additional parameters to pass to bowtie. They are taken literally and need to be enclosed in double quotes, with internal double quotes properly escaped. For instance:

-- "-N 2 -k5 --ignore-quals"

will be added as is to the bowtie parameter list. Currently, bowtie2 is run with no extra arguments.
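As a rough illustration of what "added as is" could look like, the Python sketch below appends the quoted string to a bowtie2 argument list. How the script actually assembles its command is not shown here, so `build_bowtie_args` and the base command (`bowtie2 -x index`) are assumptions.

```python
import shlex


def build_bowtie_args(index, extra=""):
    """Build a bowtie2 argument list, appending the extra parameters unchanged."""
    # shlex.split honours shell-style quoting, so escaped or quoted values
    # inside the extra string stay together as single arguments.
    return ["bowtie2", "-x", index] + shlex.split(extra)


print(build_bowtie_args("my_index", "-N 2 -k5 --ignore-quals"))
```

With an empty extra string, the list contains only the base command, which matches the case where bowtie2 is run with no extra arguments.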

DESCRIPTION

This program reads the flowcell directories generated by the SIMR LIMS (named by flowcell barcode) in search of a file with the .csv extension (the Sample_Report.csv and other iterations previously used by the LIMS). It then takes the sample name (in column 1) and associates it with the corresponding fastq file(s) (in column 3). One sample can map to many fastq files, even if they are located across flowcells (as long as the flowcell directories are passed as arguments to --directories).
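The sample-to-fastq association described above can be sketched in Python as follows. This is only an illustration: `collect_samples` is a hypothetical name, and the sketch assumes a comma-separated report with one header row and a fastq file name (relative to the flowcell directory) in column 3. The real LIMS reports may differ.

```python
import csv
import glob
import os
from collections import defaultdict


def collect_samples(flowcell_dirs, has_header=True):
    """Map each sample name (column 1) to its fastq paths (column 3)."""
    samples = defaultdict(list)
    for flowcell in flowcell_dirs:
        # Any .csv file in the flowcell directory is treated as a sample report.
        for report in sorted(glob.glob(os.path.join(flowcell, "*.csv"))):
            with open(report, newline="") as handle:
                rows = csv.reader(handle)
                if has_header:
                    next(rows, None)  # assumption: the first row is a header
                for row in rows:
                    if len(row) < 3 or not row[0].strip():
                        continue  # skip short or unnamed rows
                    fastq = os.path.join(flowcell, row[2].strip())
                    samples[row[0].strip()].append(fastq)
    return dict(samples)
```

Because the dictionary is keyed by sample name, a sample sequenced on two flowcells ends up with the fastq files from both directories in a single list.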

Then, the fastq file(s) are split in a tmp directory and aligned in parallel with bowtie2, using jobs sent to the SGE queue. The multiple bam files are then merged and the results are saved individually as .bam files named after the samples in the .csv files. The --destdir option can be used to define a location to save the results; otherwise, the ./bowtieBatch_DD/MM/YY:H:M:S directory will be created and will store the final bam files.
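The splitting step can be pictured with the Python sketch below. How the script actually splits the files is not documented here, so the function name `split_fastq`, the chunk size and the chunk naming are all hypothetical. The sketch also assumes uncompressed fastq files with four lines per read.

```python
import itertools
import os


def split_fastq(fastq_path, tmp_dir, reads_per_chunk=1_000_000):
    """Split a fastq file into chunk files of at most reads_per_chunk reads."""
    os.makedirs(tmp_dir, exist_ok=True)
    base = os.path.basename(fastq_path)
    chunk_paths = []
    with open(fastq_path) as reads:
        for index in itertools.count():
            # Each read is four lines, so a chunk is 4 * reads_per_chunk lines.
            lines = list(itertools.islice(reads, 4 * reads_per_chunk))
            if not lines:
                break
            chunk_path = os.path.join(tmp_dir, f"{base}.chunk{index:04d}")
            with open(chunk_path, "w") as out:
                out.writelines(lines)
            chunk_paths.append(chunk_path)
    return chunk_paths
```

Each returned chunk could then be aligned as its own SGE job, and the per-chunk bam files merged back into one file per sample.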

EXAMPLES

Dry run: printing what will be run without running it

~/scripts/aligner/batchBowtie2 --dryrun --bowtie Drosophila_melanogaster.BDGP5.71.min --dir /n/analysis/Blanchette/sha/MOLNG-61/C05HTACXX /n/analysis/Blanchette/sha/MOLNG-61/C0K08ACXX --destdir Sex_n_Tudor

Running:

~/scripts/aligner/batchBowtie2 --bowtie Drosophila_melanogaster.BDGP5.71.min --dir /n/analysis/Blanchette/sha/MOLNG-61/C05HTACXX /n/analysis/Blanchette/sha/MOLNG-61/C0K08ACXX --destdir Sex_n_Tudor

Debugging:

~/scripts/aligner/batchBowtie2 --bowtie Drosophila_melanogaster.BDGP5.73.min --dir /n/analysis/Blanchette/sha/MOLNG-61/C05HTACXX --destdir temp --debug

~/scripts/aligner/batchBowtie2 --bowtie Drosophila_melanogaster.BDGP5.73.min --dir test --destdir temp
