Hi!
Could you please explain how the refmap file is generated?
I tested in silico genome creation for 129S1_SvImJ with the VCF file mgp.v5.merged.indels.dbSNP142.normed.vcf.gz, which contains only two indels for this specific strain:
3000019 G -> GA
3001236 A -> ATTTTT
At all other positions in this VCF file, the 129S1_SvImJ column contains ./.:.:.:.:.:.:.:.:.:.:.:.:.:. So, if I understood correctly, these positions should be skipped.
As the reference genome I used chr1.fa from mm10 (it includes only chr1) with >chr1 changed to >1 (file is not attached).
I think the problem is caused by differences between VCFv4.1 and VCFv4.2.
In VCFv4.1, every strain that should be skipped is marked as a single . so the following code works correctly:
if (genotypeString.equals(".")) {
    continue; // skip. no variant for this sample
}
But in VCFv4.2 the records that should be skipped are marked differently, as ./.:.:.:.:.:.:.:.:.:.:.:.:.:. so the check above does not match them.
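One possible way to handle both styles is to look only at the GT subfield of the sample column and treat it as missing when every allele is a dot. The sketch below is not MEA's code: the class name GenotypeFilter and the method isMissingGenotype are hypothetical, and it assumes GT is the first colon-separated subfield of the sample column, as the VCF specification requires when GT is present.

```java
import java.util.List;

// Hypothetical helper, not part of MEA: decides whether a VCF sample
// column is a no-call that should be skipped.
final class GenotypeFilter {

    private GenotypeFilter() {}

    /**
     * Returns true when the GT subfield contains only missing alleles.
     * Assumption: GT is the first colon-separated subfield of the column.
     */
    static boolean isMissingGenotype(String sampleColumn) {
        if (sampleColumn == null || sampleColumn.isEmpty()) {
            return true;
        }
        // Keep only the text before the first ':' (the GT subfield).
        String gt = sampleColumn.split(":", 2)[0];
        // Alleles are separated by '/' (unphased) or '|' (phased).
        for (String allele : gt.split("[/|]")) {
            if (!allele.equals(".")) {
                return false; // at least one allele was actually called
            }
        }
        return true;
    }

    public static void main(String[] args) {
        List<String> samples = List.of(".", "./.:.:.:.", "1/1:33:0", "0/1");
        for (String s : samples) {
            System.out.println(s + " -> " + (isMissingGenotype(s) ? "skip" : "keep"));
        }
    }
}
```

With a check like this, the original condition would become if (GenotypeFilter.isMissingGenotype(genotypeString)) { continue; }, which covers both the bare . and the ./.:.:. form. Whether homozygous-reference calls such as 0/0 should also be skipped is a separate decision that this sketch does not make.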
Hi Michael, thanks for all your comments! Unfortunately, MEA was designed with VCFv4.1 in mind, and the changes in VCFv4.2 do make some of the code wonky. For now we'll ask that users stick with VCFv4.1, but we'll keep an eye on making everything compatible with both. The user guide will be updated to match this.
Command:
The output I got in output.fa.refmap.gz:
How were the lines 3000109 3000100 149 and 3000258 3000250 978 calculated? I assume the correct output should be:
Thanks!