java.lang.IndexOutOfBoundsException & Exception in thread "Thread-25" java.lang.StackOverflowError #38

Open
pswaminath opened this issue Nov 27, 2019 · 23 comments

Comments

@pswaminath

Hi,
I am using JACUSA for my project.
I have gDNA (from bwa aligner) and cDNA (from STAR aligner) bam files sorted and indexed.
I used the following command with java 1.7.
(hs.bed contains chromosome 1 to Y)

java -Xmx50G -jar JACUSA_v1.3.0.jar call-2 -a H:1,D -b hs37d5.bed -p 26 -r rddsnov27aoption.out -s gDNA.bam cDNA.bam &> jacnov27aoption.log

Then I got the following errors. The process got stuck and produced no output, only tmp.gz files.

java.lang.IndexOutOfBoundsException: Index: 0, Size: 0
at java.util.ArrayList.rangeCheck(ArrayList.java:635)
at java.util.ArrayList.get(ArrayList.java:411)
at java.util.Collections$UnmodifiableList.get(Collections.java:1211)
at jacusa.filter.storage.DistanceFilterStorage.processRecord(DistanceFilterStorage.java:39)
at jacusa.pileup.builder.AbstractPileupBuilder.processRecord(AbstractPileupBuilder.java:358)
at jacusa.pileup.builder.AbstractPileupBuilder.adjustWindowStart(AbstractPileupBuilder.java:178)
at jacusa.pileup.iterator.AbstractWindowIterator.adjustWindowStart(AbstractWindowIterator.java:155)
at jacusa.pileup.iterator.AbstractWindowIterator.adjustCurrentGenomicPosition(AbstractWindowIterator.java:148)
at jacusa.pileup.iterator.TwoSampleIterator.hasNext(TwoSampleIterator.java:38)
at jacusa.pileup.worker.AbstractWorker.processParallelPileupIterator(AbstractWorker.java:183)
at jacusa.pileup.worker.AbstractWorker.run(AbstractWorker.java:67)

Exception in thread "Thread-25" java.lang.StackOverflowError
at org.apache.commons.math3.special.Gamma.digamma(Gamma.java:446)
at org.apache.commons.math3.special.Gamma.digamma(Gamma.java:461)
at org.apache.commons.math3.special.Gamma.digamma(Gamma.java:461)
at org.apache.commons.math3.special.Gamma.digamma(Gamma.java:461)

Complete error log attached as pdf below.

jacusaerrorforgitissue.pdf

Please suggest how these errors could be resolved.
Thank you!
Priya

@piechottam
Collaborator

piechottam commented Nov 28, 2019 via email

@pswaminath
Author

Hi Michael,
Thank you for your response.
Here are the answers:

according to your Exception
[...]
at jacusa.filter.storage.DistanceFilterStorage.processRecord(DistanceFilterStorage.java:39)
[...]
means that there are no alignment blocks for some read - this is strange.
This basically means, there is a read marked as mapped but
it does not have any alignment blocks.

Thanks for identifying this error.
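One way to look for such reads is to scan the alignments as SAM text (for example, the output of `samtools view`) and flag records that are marked as mapped but whose CIGAR string contains no reference-aligned operation. The sketch below is only an illustration. The function names are hypothetical, and treating `M`, `=` and `X` as the operations that produce alignment blocks is an assumption about how JACUSA's BAM library builds them.

```python
import re

CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")
UNMAPPED_FLAG = 0x4            # SAM FLAG bit: segment unmapped
ALIGNED_OPS = {"M", "=", "X"}  # assumption: only these yield alignment blocks


def has_alignment_block(cigar):
    """Return True if the CIGAR has at least one M, = or X operation of length > 0."""
    return any(op in ALIGNED_OPS and int(length) > 0
               for length, op in CIGAR_OP.findall(cigar))


def suspicious_reads(sam_lines):
    """Yield (read name, flag, CIGAR) for mapped records without alignment blocks."""
    for line in sam_lines:
        if line.startswith("@"):      # skip SAM header lines
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 6:           # not a complete alignment record
            continue
        name, flag, cigar = fields[0], int(fields[1]), fields[5]
        if flag & UNMAPPED_FLAG:      # unmapped reads are expected to lack blocks
            continue
        if not has_alignment_block(cigar):
            yield name, flag, cigar
```

Any hits would be candidates for the read that triggers the `IndexOutOfBoundsException`; an empty result would suggest the cause lies elsewhere.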

"-b hs37d5.bed"
You are specifically looking for reads mapping to the decoy genome (hs37d5)?
**Yes. hs37d5.bed contains positions in chromosomes 1 through 23, X and Y.**

Do gDNA.bam cDNA.bam contain all reads? Was the alignment against (human
genome + hs37d5) or first against human genome and then unmapped against hs37d5?

The BAM files were obtained by aligning against hs37d5 (GRCh37 + decoy).
**hs37d5.fa: GRCh37 primary assembly, mitochondrial sequences, GL000... seqs, Human herpesvirus (NC_007605) and the decoy sequences.**

Do you get the error if you run JACUSA on human chromosomes 1, - 23, X, Y, MT?
Yes, I got the error when I ran chromosomes 1 through 23, X and Y (alignments in the BAM file restricted to those chromosomes) with hs37d5.bed, and the process got stuck in the same way. I did not check with MT included.

Other test runs that I did:
When I restricted JACUSA to a single region of interest (chromosome 1 only, or chromosome Y only) using the -b option, each run completed.

How big are your files?
The BAM files range in size from 6 to 8 GB.

Other data details: paired-end, 2x150 bp.

It is hard to help without looking at the data.

Hope this helps. Please let me know if you need any other information.
I appreciate your help.

Priya

@pswaminath
Author

Hi Michael,
More information about our data:
DNA library prep:
Agilent Clinical Research Exome V1, 2x125 bp
RNA library prep:
Illumina RNA Access, 2x125 bp
unstranded

Thanks!
Priya

@piechottam
Collaborator

piechottam commented Dec 2, 2019 via email

@pswaminath
Author

Thanks Michael for checking!
I have explored the options that I knew but I haven't got it to work.
Any help will be appreciated.

Priya

@piechottam
Collaborator

piechottam commented Dec 2, 2019 via email

@pswaminath
Author

Hi Michael,

I ran JACUSA with the following command (complete bed file with positions in chr 1 to Y, bam file contains alignments chr 1 to Y)

Command used:

java -jar JACUSA_v1.3.0.jar call-2 -b hs37d5.bed -p 26 -r rddsdec2chr1to22XYNOaoption.out -s gDNAchr1to22andXY.bam TRNASTARwithMDchr1to22andXY.bam &> jacdec2chr1to22XYnooption.log

I have the complete log file, screenshot of tmp.gz files and the last lines of bed file attached here.

tmpgzfilesscreenshot
jacdec2chr1to22XYNOaoption.log

Please take a look.

I noticed that the JACUSA run proceeds up to the chromosome Y positions in the BED file we provided and then gets stuck before wrapping up the result. Please confirm.
It did not stop anywhere in between.

I have copied a few lines from where the error started, followed by the end of the messages. Please see the attached log and screenshots for more information.

[ INFO ] 05:19:44 : Started screening contig Y:28680525-28780524
[ INFO ] 05:19:44 : Started screening contig Y:58999981-59034049
Exception in thread "Thread-21" java.lang.StackOverflowError
at org.apache.commons.math3.special.Gamma.digamma(Gamma.java:446)
at org.apache.commons.math3.special.Gamma.digamma(Gamma.java:461)
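To check where a run stops without reading the whole log by eye, the log can be summarized programmatically. The following is a minimal sketch that assumes the log lines look like the excerpts quoted in this thread (`[ INFO ] hh:mm:ss : Started screening contig CHR:START-END` and `Exception in thread "..." ...`). The function name `summarize_log` is hypothetical.

```python
import re

WINDOW = re.compile(
    r"\[ INFO \] \d{2}:\d{2}:\d{2} : Started screening contig (\S+):(\d+)-(\d+)"
)
EXCEPTION = re.compile(r'Exception in thread "([^"]+)" (\S+)')


def summarize_log(lines):
    """Collect the screened windows and the exceptions reported in a JACUSA log."""
    windows = []
    exceptions = []
    for line in lines:
        match = WINDOW.search(line)
        if match:
            windows.append((match.group(1), int(match.group(2)), int(match.group(3))))
            continue
        match = EXCEPTION.search(line)
        if match:
            exceptions.append((match.group(1), match.group(2)))
    return {
        "window_count": len(windows),
        "last_window": windows[-1] if windows else None,
        "exceptions": exceptions,
    }
```

If the last window reported is also the last region in the BED file, the failure happens after screening has started everywhere, which matches the observation that the run does not stop in between.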

hs37d5bedlastlinesscreenshot

Thank you!
Priya

@piechottam
Collaborator

piechottam commented Dec 4, 2019 via email

@pswaminath
Author

Michael,
I initially ran JACUSA
with the complete BAM file and no -b option, including chr 1 to Y along with MT and the other sequences, and got an error. I don't have the log file for this run. There were Java exceptions, and the run got stuck at the end with the same message:
at org.apache.commons.math3.special.Gamma.digamma(Gamma.java:446)

Yes I can share the bed file. Please see attached the txt file.

Exception in thread "Thread-21" java.lang.StackOverflowError
Yes, this is the only exception I get without the -a H:1 option.

hs37d5.txt
(I was not able to attach the bed file as is so I changed the file extension to .txt)

Thanks!
Priya

@pswaminath
Author

Michael,
I have attached the log file from the run you suggested, with -a H:1 and without a BED file (no -b option).

Command used:

java -Xmx50G -jar JACUSA_v1.3.0.jar call-2 -a H:1 -p 26 -r rddsdec2chr1to22XYwithaoptionnob.out -s AVR085108gDNAchr1to22andXY.bam AVR085108TRNASTARwithMDchr1to22andXY.bam &> jacdec2chr1to22XYwithaoptionnob.log

I still get the error, and the run stops with the same message at the end, after chromosome Y.
There is an exception.
See the attached log file for complete details.

[ INFO ] 04:29:39 : Started screening contig Y:28680525-28780524
[ INFO ] 04:29:39 : Started screening contig Y:58999981-59099980
Exception in thread "Thread-20" java.lang.StackOverflowError
at org.apache.commons.math3.special.Gamma.digamma(Gamma.java:446)
at org.apache.commons.math3.special.Gamma.digamma(Gamma.java:461)
at org.apache.commons.math3.special.Gamma.digamma(Gamma.java:461)
at org.apache.commons.math3.special.Gamma.digamma(Gamma.java:461)

jacdec2chr1to22XYwithaoptionnob.log

@piechottam
Collaborator

piechottam commented Dec 5, 2019 via email

@pswaminath
Author

Hi,
Thanks for your suggestions. I will try running with the -Xss option.

Regarding the D option:
-D 1000
Can I use it together with -a H:1?
Would the command then look like this?
java -Xss4m -jar JACUSA_v1.3.0.jar -a H:1 -D 1000 ........

Priya

@piechottam
Collaborator

piechottam commented Dec 5, 2019 via email

@pswaminath
Author

Michael,
Thanks for your suggestions. I checked chromosome Y only along with two other runs.

Here are my different runs and what I found. Please see the two error log files attached for the failed runs.

  1. With Xss4m and -a H:1 but no b option: same error (after chromosome Y, an exception, and then stuck at the library)
    java -Xss4m -Xmx50G -jar JACUSA_v1.3.0.jar call-2 -a H:1 -p 26 -r rddsdec5chr1to22XYwithaoptionnobXss.out -s AVR085108gDNAchr1to22andXY.bam AVR085108TRNASTARwithMDchr1to22andXY.bam &> jacdec5chr1to22XYwithaoptionnobXss.log

Error lines

[ INFO ] 04:48:45 : Started screening contig Y:28680525-28780524
[ INFO ] 04:48:45 : Started screening contig Y:58999981-59099980
Exception in thread "Thread-23" java.lang.StackOverflowError
at org.apache.commons.math3.special.Gamma.digamma(Gamma.java:461)
at org.apache.commons.math3.special.Gamma.digamma(Gamma.java:461)
at org.apache.commons.math3.special.Gamma.digamma(Gamma.java:461)

  2. With Xss4m and the b option, with only chromosome Y in the BED file:
    it completed without any problems or errors
    java -Xss4m -Xmx50G -jar JACUSA_v1.3.0.jar call-2 -a H:1 -b hs37d5chrynew.bed -p 26 -r rddsdec5chr1to22XYwithaoptionwithbXssonlyY.out -s AVR085108gDNAchr1to22andXY.bam AVR085108TRNASTARwithMDchr1to22andXY.bam &> jacdec5chr1to22XYwithaoptionwithbXssonlyY.log

  3. With Xss4m, -a H:1 and the b option with the full BED file (chr 1 to Y)
java -Xss4m -Xmx50G -jar JACUSA_v1.3.0.jar call-2 -a H:1 -b hs37d5.bed -p 26 -r rddsdec5chr1to22XYwithaoptionwithbfullXss.out -s AVR085108gDNAchr1to22andXY.bam AVR085108TRNASTARwithMDchr1to22andXY.bam &> jacdec5chr1to22XYwithaoptionwithbfullXss.log

Error lines
[ INFO ] 05:10:06 : Started screening contig Y:28494610-28594609
[ INFO ] 05:10:06 : Started screening contig Y:28680525-28780524
[ INFO ] 05:10:06 : Started screening contig Y:58999981-59034049
Exception in thread "Thread-21" java.lang.StackOverflowError
at org.apache.commons.math3.special.Gamma.digamma(Gamma.java:446)
at org.apache.commons.math3.special.Gamma.digamma(Gamma.java:461)
at org.apache.commons.math3.special.Gamma.digamma(Gamma.java:461)

Log file for Run 1:
jacdec5chr1to22XYwithaoptionnobXss.log

Log file for Run 3:
jacdec5chr1to22XYwithaoptionwithbfullXss.log

Should I increase the stack size from 4m to a higher value? I am not sure what maximum I should use with 26 threads. Will this help the run finish?
In all the runs involving all the chromosomes (1 ... Y), the problem seems to happen at the last step, during processing in the library.

Thanks!
Priya

@piechottam
Collaborator

piechottam commented Dec 8, 2019 via email

@pswaminath
Author

Thanks Michael!
I ran JACUSA with different stack and heap sizes (Xss4m with Xmx10G, Xss20M with Xmx20G, and Xss16M with Xmx25G). None of them worked; each run stopped with a StackOverflowError, same as before.

Exception in thread "Thread-14" java.lang.StackOverflowError
at org.apache.commons.math3.special.Gamma.digamma(Gamma.java:461)

Across all the runs I checked, the problem seems to occur mostly after chromosome Y is finished: the run gets stuck while wrapping up the result in the library.
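The repeated `Gamma.digamma(Gamma.java:461)` frames show the function calling itself. One possible explanation for why a larger stack does not help (an assumption, not something confirmed in this thread) is that the recursion never reaches a base case for certain inputs, such as NaN. The sketch below is a simplified model of a recursive digamma, not the Commons Math source; the thresholds `S_LIMIT` and `C_LIMIT` and the depth guard are illustrative assumptions.

```python
import math

EULER_GAMMA = 0.5772156649015329
S_LIMIT = 1e-5   # assumed threshold for the small-argument approximation
C_LIMIT = 49.0   # assumed threshold for the asymptotic expansion


def digamma(x, depth=0, max_depth=200):
    """Recursive digamma model using psi(x) = psi(x + 1) - 1/x between the base cases."""
    if depth > max_depth:
        # Stand-in for the JVM's StackOverflowError.
        raise RecursionError(f"no base case reached for x={x!r}")
    if 0 < x <= S_LIMIT:
        return -EULER_GAMMA - 1.0 / x
    if x >= C_LIMIT:
        inv = 1.0 / (x * x)
        # log(x) - 1/(2x) - 1/(12x^2) + 1/(120x^4) - 1/(252x^6)
        return math.log(x) - 0.5 / x - inv * (1.0 / 12 - inv * (1.0 / 120 - inv / 252))
    # Every comparison with NaN is False, so a NaN argument always lands here.
    return digamma(x + 1.0, depth + 1, max_depth) - 1.0 / x
```

In this model, `digamma(1.0)` terminates after a few dozen calls, while `digamma(float("nan"))` recurses until the guard trips. If JACUSA passes such a value at some site, no `-Xss` setting would be sufficient, and running smaller regions would only succeed when that site is not reached.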

With Xss1024m and Xmx50G, I got an OutOfMemoryError.
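A rough way to judge a stack size with 26 threads is to multiply the per-thread stack size by the thread count and compare it with the heap and the machine's memory. The sketch below does only that arithmetic. It assumes `-Xss` applies to each of the 26 worker threads, and it gives an upper bound rather than a measurement, since the JVM has additional threads and may not commit the full stack for every thread. The helper names are hypothetical.

```python
def parse_size(text):
    """Convert a JVM-style size such as '4m', '8192k' or '50G' to bytes."""
    units = {"k": 1024, "m": 1024 ** 2, "g": 1024 ** 3}
    text = text.strip().lower()
    if text[-1] in units:
        return int(text[:-1]) * units[text[-1]]
    return int(text)


def stack_budget(threads, xss, xmx):
    """Upper-bound estimate of thread-stack and heap memory, in GiB."""
    gib = 1024 ** 3
    stack_total = threads * parse_size(xss)
    heap = parse_size(xmx)
    return {
        "stack_gib": stack_total / gib,
        "heap_gib": heap / gib,
        "combined_gib": (stack_total + heap) / gib,
    }
```

For example, `stack_budget(26, "1024m", "50G")` gives 26 GiB of potential stack on top of a 50 GiB heap, whereas `stack_budget(26, "4m", "50G")` adds only about 0.1 GiB. Whether this is what caused the OutOfMemoryError here is not established in the thread.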

I completed runs for a few chromosomes separately (chr 1, 11, 12, 20, 21, X and Y) and for chr20 and chr21 together. All the individual chromosome runs finished successfully. The combined chr20 and chr21 run also finished successfully and produced output (stack size 8192 KB).

When I ran chr 20, 21, 22, X and Y together, the run could not finish with the above stack size.

I have a run in progress with Xss64M and Xmx30G.

Priya

@piechottam
Collaborator

piechottam commented Dec 12, 2019 via email

@pswaminath
Author

Hi Michael,
Thank you again!
I am running the analysis and I will update you once done.

@pswaminath
Author

Hi Michael,
Update regarding the analysis:

I ran JACUSA for each chromosome and then combined the results.

Thank you!
Priya
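For anyone wanting to reproduce the per-chromosome workaround, it can be sketched as three steps: split the BED file by chromosome, run JACUSA once per chromosome, and concatenate the outputs. The helper names below are hypothetical. The command uses only the `call-2`, `-b` and `-r` arguments that appear in this thread, and the assumption that header lines in JACUSA output start with `#` should be checked against the actual files.

```python
from collections import OrderedDict


def split_bed_by_chrom(bed_lines):
    """Group BED lines by their first column, keeping the input order."""
    groups = OrderedDict()
    for line in bed_lines:
        if not line.strip() or line.startswith(("#", "track", "browser")):
            continue
        chrom = line.split()[0]
        groups.setdefault(chrom, []).append(line)
    return groups


def jacusa_command(bed_path, out_path, bams, jar="JACUSA_v1.3.0.jar"):
    """Build the argument list for one per-chromosome call-2 run."""
    return ["java", "-jar", jar, "call-2", "-b", bed_path, "-r", out_path, *bams]


def combine_results(outputs, header_prefix="#"):
    """Concatenate per-chromosome outputs, keeping header lines from the first only.

    Assumption: header lines start with header_prefix.
    """
    combined = []
    for index, lines in enumerate(outputs):
        for line in lines:
            if index > 0 and line.startswith(header_prefix):
                continue
            combined.append(line)
    return combined
```

Each argument list can be passed to `subprocess.run` after the corresponding per-chromosome BED file has been written to disk.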

@pswaminath
Author

Hi Michael,
I have some questions regarding "Start" and "End" in the following line from the analysis.

My questions:

I am in the process of annotating the variants.
1) Are these potential RNA editing sites reported in a 1-based coordinate system?
2) Is the RNA editing taking place at position 13477 or at 13478?
3) How should I interpret the base1 and base2 columns, G and AG? (Is it a single nucleotide that is changing?)

"chrom" "chromStart" "chromEnd" "name" "stat" "strand" "base11" "bases21" "info" "filter_info" "cov1" "covs1" "cov2" "covs2" "cov" "matrix1.A" "matrix1.C" "matrix1.G" "matrix1.T" "base1" "matrix2.A" "matrix2.C" "matrix2.G" "matrix2.T" "base2" "baseChange" "editingFreq"
"1" 13477 13478 "variant" 3.63902821722496 "." "0,0,36,0" "10,0,729,0" "" "" 36 36 739 739 775 0 0 36 0 "G" 10 0 729 0 "AG" "G->A" 0.013531799729364
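The base columns in this row can be related to the count columns with a few lines of code. The sketch below assumes the four comma-separated counts are ordered A, C, G, T, as the matrix1/matrix2 column names suggest, and that the editing frequency is the alternative-base count divided by the coverage of the second sample. That formula reproduces the value in this row, but whether JACUSAhelper computes it exactly this way is an assumption. The function names are hypothetical.

```python
BASES = "ACGT"


def parse_counts(text):
    """Turn '10,0,729,0' into {'A': 10, 'C': 0, 'G': 729, 'T': 0}."""
    return dict(zip(BASES, (int(value) for value in text.split(","))))


def observed_bases(counts):
    """Concatenate every base with a non-zero count, e.g. 'AG'."""
    return "".join(base for base in BASES if counts[base] > 0)


def editing_frequency(counts, alt_base):
    """Fraction of reads supporting alt_base among all counted reads."""
    total = sum(counts.values())
    return counts[alt_base] / total if total else float("nan")


dna = parse_counts("0,0,36,0")     # bases11 column of the row above
rna = parse_counts("10,0,729,0")   # bases21 column of the row above
ref = observed_bases(dna)
alts = [base for base in observed_bases(rna) if base not in ref]
freq = editing_frequency(rna, alts[0])
```

Under these assumptions, "AG" in base2 lists the bases observed in the RNA sample (10 reads with A and 729 with G) rather than a two-nucleotide allele, so the change at this site involves a single nucleotide.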

Thank you!
Priya

@piechottam
Collaborator

piechottam commented Feb 25, 2020 via email

@pswaminath
Author

Michael,
I appreciate you clarifying my questions.
The output that I quoted is from JACUSAhelper.

You had mentioned:
In your DNA you have a G and in your RNA you might have A or AG.
Looking at the base change column in the results, I thought that it was G->A only.

I am annotating these RNA editing sites using ANNOVAR. Which one should I consider: G->A, G->AG, or both?
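For the annotation step, a site can be written in ANNOVAR's five-column input layout (chromosome, start, end, reference, alternative, with 1-based positions). The sketch below assumes that chromStart and chromEnd follow the BED convention (0-based start, exclusive end), which would place the site at position 13478; this thread does not confirm that convention, so it should be checked against the JACUSA documentation. The function name is hypothetical.

```python
def to_annovar_line(chrom, chrom_start, chrom_end, ref, alt):
    """Format a single-base site as a tab-separated ANNOVAR input line.

    Assumes BED-style input coordinates: 0-based start, exclusive end.
    """
    if chrom_end - chrom_start != 1:
        raise ValueError("expected a single-base interval")
    position = chrom_start + 1  # convert to a 1-based position
    return "\t".join([chrom, str(position), str(position), ref, alt])
```

With the row quoted earlier, `to_annovar_line("1", 13477, 13478, "G", "A")` describes one G-to-A substitution. A site with several non-reference bases would get one line per alternative base.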

Thank you!
Priya

@piechottam
Collaborator

piechottam commented Feb 25, 2020 via email
