Skip to content

Commit c1a48a1

Browse files
committed
updated manual
1 parent d595482 commit c1a48a1

File tree

1 file changed

+40
-17
lines changed

1 file changed

+40
-17
lines changed

guide-to-R-scripts.md

+40-17
Original file line numberDiff line numberDiff line change
@@ -2,15 +2,26 @@ General guide to R scripts for analyzing your transcriptome data
22

33
## Gene expression analysis
44

5-
### Compare gene expression using DESeq2
5+
### Normalize gene expression (DeSeq)
66
```
77
library(DESeq2)
8-
counts<-read.delim(‘counts.txt,row.names=1)
9-
cds<-DESeqDataSetFromMatrix(counts.txt,meta,~cluster)
10-
cds<-DESeq(cds)
11-
res<-results(cds)
12-
sig<-res[which(res$padj<0.05),]
13-
write.table(sig,file=‘DEcontigs.txt’,quote=F,sep=‘\t')
8+
#get merged_counts.txt from get-bam-counts.sh script
9+
counts<-read.delim("merged_counts.txt")
10+
11+
#make a meta file with samples in the same order as how they are listed in your directory
12+
meta<-read.delim("meta.txt")
13+
14+
#make contig labels into row names and remove that column
15+
rownames(counts)<-counts[,1]
16+
colnames(counts)<-meta$sample
17+
counts<-as.matrix(counts[,-1])
18+
19+
#normalize read counts and filter for high counts (i)e. more than 10 reads/site)
20+
normCounts<-t(counts)/estimateSizeFactorsForMatrix(counts)
21+
normCounts_10<-normCounts[,colMeans(normCounts)>10]
22+
23+
transposed<-t(normCounts_10)
24+
write.table(transposed,file="normCounts_10.txt", quote=FALSE,row.names=TRUE,col.names=TRUE,eol="\n")
1425
```
1526

1627
### WGCNA
@@ -26,29 +37,41 @@ write.table(sig,file=‘DEcontigs.txt’,quote=F,sep=‘\t')
2637

2738
### A simple way to format your 012 SNP matrix
2839
```
40+
#create 012 files from a your vcf with vcftools-012genotype-matrix.sh
2941
snps<-read.delim('file.012', header=F)
3042
pos<-read.delim('file.012.pos',header=F)
3143
indv<-read.delim<-('file.012.indv',header=F)
3244
45+
#the order of samples in your meta file should match how they are listed in your computer's directory. this example has a meta file with: sample, pop
46+
meta<-read.delim('meta.txt', header=TRUE)
47+
rownames(snps)<-meta$sample
3348
colnames(snps)<-paste(pos[,1],pos[,2],sep='-')
34-
rownames(snps)<-indv[,1]
35-
snps<-as.matrix(snps)
3649
37-
#PCA of SNPs
50+
#create a PCA of SNPs
3851
pc.out<-prcomp(snps)
3952
summary(pc.out)
40-
plot(pc.out$x[,1],pc.out$x[,2]) #PC1 v PC2
4153
54+
#plot with samples colored by population
55+
plot(pc.out$x[,1],pc.out$x[,2], col=meta$pop,main="Title of your PCA", xlab="PC1", ylab="PC2", pch=16, cex=1.5)
56+
57+
#add sample IDs to points
58+
text(pc.out$x[,1],pc.out$x[,2],labels=meta$id,pos=4,cex=0.7, offset=0.1)
59+
60+
#add a legend box
61+
legend(x="topright",legend=unique(meta$pop),fill=unique(meta$pop))
4262
```
4363

44-
### Add meta data to your SNP matrix
45-
- make a meta data file with info about individuals (location, date, etc.)
46-
- make sure your meta file is ordered the same as your vcfs! (i.e. ls your samples in the terminal to see their order)
47-
- script TBD
64+
65+
###Detect outliers with Bayescan
66+
- this is an FST based outlier detection method
67+
- warning: we do not use the FST outputs from this program
68+
69+
70+
### Detect outliers with PCAdapt
71+
- this detects outliers off the first and second principal component
72+
- it may be usefule to use both Bayescan and PCAdapt to remove potential loci under selection to create a neutral SNP dataset
4873

4974

50-
### Allow for missing SNP data with SNPrelate
51-
-
5275

5376
### Look at ancestry with admixture
5477
- Make a plink file from your VCF

0 commit comments

Comments
 (0)