KatharineME/FASTQ.jl

DNA and cDNA sequence analysis ❄️

🚧 This README is under construction 🚧

Use

cd FASTQ.jl

julia --project

julia> using FASTQ

# Concatenate fastq files
julia> FASTQ.Command.concatenate_fastq()

# Germline DNA alignment and variant calling
julia> FASTQ.Command.call_variants_on_germline_dna()

# Somatic DNA alignment and variant calling
julia> FASTQ.Command.call_variants_on_somatic_dna()

# cDNA alignment and variant calling
julia> FASTQ.Command.call_variants_on_cdna()

# cDNA pseudoalignment
julia> FASTQ.Command.measure_gene_expression()

Prepare Environment

1. Get Homebrew tools and MultiQC

Get Homebrew, then run:

brew install fastp fastqc kallisto samtools bcftools

pip3 install multiqc
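To confirm that the installs above landed on your PATH, a small check like the following may help. It is only a sketch: the function name `missing_commands` is hypothetical and not part of FASTQ.jl, and it only tests that each command can be found, not that its version is suitable.

```shell
# Print every command from the argument list that is not found on PATH.
# Returns 0 when all commands are present, 1 otherwise.
# (Hypothetical helper, not part of FASTQ.jl.)
missing_commands() {
    any_missing=0
    for command_name in "$@"; do
        if ! command -v "$command_name" >/dev/null 2>&1; then
            echo "$command_name"
            any_missing=1
        fi
    done
    return "$any_missing"
}

# Example call with the tools installed in this step:
# missing_commands fastp fastqc kallisto samtools bcftools multiqc
```

An empty output means every listed tool was found.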

2. Get SnpEff

Download SnpEff and put it in tool/.

3. Get minimap2 and STAR

Mac

Unzip the programs in FASTQ.jl/tool/ and link their executables to /usr/local/bin/.

If the programs in tool/ fail to run, compile them from source.
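The linking step can be sketched as below. This is an illustration under assumptions: `link_executable` is a hypothetical helper, and the executable paths inside tool/ in the example calls are guesses that depend on how each archive unpacks, so check them against your own tool/ directory. Writing to /usr/local/bin may require sudo.

```shell
# Symlink one executable into a bin directory, replacing any stale link.
# Usage: link_executable <path-to-executable> <bin-directory>
# (Hypothetical helper, not part of FASTQ.jl.)
link_executable() {
    exe_path=$1
    bin_dir=$2
    if [ ! -x "$exe_path" ]; then
        echo "not executable: $exe_path" >&2
        return 1
    fi
    # Resolve to an absolute path so the link works from any directory.
    exe_abs="$(cd "$(dirname "$exe_path")" && pwd)/$(basename "$exe_path")"
    ln -sf "$exe_abs" "$bin_dir/$(basename "$exe_path")"
}

# Example calls (the paths inside tool/ are assumptions):
# link_executable tool/minimap2-2.24/minimap2 /usr/local/bin
# link_executable tool/STAR-2.7.9a/bin/MacOSX_x86_64/STAR /usr/local/bin
```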

Linux

Get the precompiled Linux versions of minimap2 and STAR from their sources.

4. Get Docker, Manta, and Strelka

Get Docker.

Docker Desktop > Settings > Resources > File Sharing > Add /tmp and /var.

Download strelka-2.9.10.centos6_x86_64.tar.bz2 from strelka releases.

Download manta-1.6.0.centos6_x86_64.tar.bz2 from manta releases.

Put Manta and Strelka in tool/.

5. Set ulimit

Add ulimit -n 10000 to .zshrc.
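If you script your setup, one way to add the line without duplicating it on repeated runs is sketched below. The helper name `ensure_line` is hypothetical, and the sketch assumes the target file ends with a newline.

```shell
# Append a line to a file only if that exact line is not already present.
# Usage: ensure_line <line> <file>
# (Hypothetical helper, not part of FASTQ.jl.)
ensure_line() {
    wanted_line=$1
    target_file=$2
    touch "$target_file"
    # -x matches whole lines, -F treats the text literally, -q suppresses output.
    if ! grep -qxF -e "$wanted_line" "$target_file"; then
        printf '%s\n' "$wanted_line" >> "$target_file"
    fi
}

# Example call for this step:
# ensure_line "ulimit -n 10000" "$HOME/.zshrc"
```

Open a new terminal afterwards (or source .zshrc) so the limit takes effect.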

Get FASTQ

git clone https://github.com/KatharineME/FASTQ.jl

cd FASTQ.jl

julia --project --eval "using Pkg; Pkg.instantiate()"

Test

julia --project --eval "using Pkg; Pkg.test()"

Keep in Mind

FASTQ.jl assumes the following:

input/ should be laid out like this:

[ 160] input
├── [ 128] Sample1
│   ├── [ 0] R1.fastq.gz
│   └── [ 0] R2.fastq.gz
└── [ 128] Sample2
    ├── [ 0] R1.fastq.gz
    └── [ 0] R2.fastq.gz
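A quick way to check an input directory against the layout above is sketched here. `check_input_layout` is a hypothetical helper, and it assumes the read files are named exactly R1.fastq.gz and R2.fastq.gz as in the tree; FASTQ.jl itself may accept other names.

```shell
# Report sample directories under an input directory that lack
# R1.fastq.gz or R2.fastq.gz.
# Returns 0 when every sample has both files, 1 otherwise.
# (Hypothetical helper, not part of FASTQ.jl.)
check_input_layout() {
    input_dir=$1
    layout_bad=0
    for sample_dir in "$input_dir"/*/; do
        # Skip the unexpanded pattern when input_dir has no subdirectories.
        [ -d "$sample_dir" ] || continue
        for read_file in R1.fastq.gz R2.fastq.gz; do
            if [ ! -f "$sample_dir$read_file" ]; then
                echo "missing: $sample_dir$read_file"
                layout_bad=1
            fi
        done
    done
    return "$layout_bad"
}

# Example call:
# check_input_layout input
```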

tool/ should contain the needed tools at the correct versions:

[ 416] tool/
├── [ 544] STAR-2.7.9a
├── [ 15M] STAR-2.7.9a.zip
├── [ 192] manta-1.6.0.centos6_x86_64
├── [2.6K] minimap2-2.24
├── [1.1M] minimap2-2.24.zip
├── [ 416] rtg-tools-3.11
├── [5.1M] rtg-tools-3.11-nojre.zip
├── [ 352] snpEff
└── [ 192] strelka-2.9.10.centos6_x86_64
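The listing above can be turned into a simple presence check. This is a sketch only: `check_tool_directory` is a hypothetical helper, the directory names are copied from the tree above, and it does not verify that the tools inside actually run.

```shell
# Report expected tool directories that are missing from a tool directory.
# Returns 0 when all are present, 1 otherwise.
# (Hypothetical helper, not part of FASTQ.jl; names taken from the tree above.)
check_tool_directory() {
    tool_dir=$1
    tools_bad=0
    for tool_name in STAR-2.7.9a manta-1.6.0.centos6_x86_64 minimap2-2.24 \
        rtg-tools-3.11 snpEff strelka-2.9.10.centos6_x86_64; do
        if [ ! -d "$tool_dir/$tool_name" ]; then
            echo "missing: $tool_dir/$tool_name"
            tools_bad=1
        fi
    done
    return "$tools_bad"
}

# Example call:
# check_tool_directory tool
```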

The reference genome should sit in the same directory as the chromosome and GTF files.

[ 576] GRCh38/GCA_000001405.15_GRCh38_no_alt_plus_hs38d1_analysis_set
├── [1.4G] GCA_000001405.15_GRCh38_full_analysis_set.refseq_annotation.gtf
├── [ 51M] GCA_000001405.15_GRCh38_full_analysis_set.refseq_annotation.gtf.gz
├── [2.9G] GCA_000001405.15_GRCh38_no_alt_plus_hs38d1_analysis_set.fna
├── [120K] GCA_000001405.15_GRCh38_no_alt_plus_hs38d1_analysis_set.fna.fai
├── [847M] GCA_000001405.15_GRCh38_no_alt_plus_hs38d1_analysis_set.fna.gz
├── [120K] GCA_000001405.15_GRCh38_no_alt_plus_hs38d1_analysis_set.fna.gz.fai
├── [754K] GCA_000001405.15_GRCh38_no_alt_plus_hs38d1_analysis_set.fna.gz.gzi
├── [ 0] GCA_000001405.15_GRCh38_no_alt_plus_hs38d1_analysis_set.fna.gz.kallisto_index
├── [6.8G] GCA_000001405.15_GRCh38_no_alt_plus_hs38d1_analysis_set.fna.gz.mmi
├── [ 736] GCA_000001405.15_GRCh38_no_alt_plus_hs38d1_analysis_set.sdf
├── [ 576] StarIndex
├── [ 202] chrn_n.tsv
├── [ 426] chromosome.bed
├── [ 252] chromosome.bed.gz
├── [4.1K] chromosome.bed.gz.tbi
└── [ 227] n_chrn.tsv

Made by Kata 🥋