As bcf1_t is a fairly large structure, it adds significant
overhead when the records being sorted are small (e.g. single
sample gVCF). This overhead can be reduced by storing the data
in a more compact form. Variable-length encoding is used for
numbers that aren't directly needed for sorting, as their values
are usually much smaller than the maximum possible. On a test
file with approx. 61 characters per VCF line, up to four times
as many records could be stored before having to spill them.
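
To illustrate why variable-length encoding helps with small values, here is a minimal sketch of a generic base-128 (7 bits per byte) encoder and decoder for 32-bit unsigned integers. The function names varint_put_u32() and varint_get_u32() are hypothetical and chosen for this example only; the encoding actually used by this change may differ in byte order and layout.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of the real code base.
 * Encode v using 7 data bits per byte, least significant group first.
 * The top bit of each byte is set when more bytes follow.
 * buf must have room for 5 bytes. Returns the number of bytes written. */
static size_t varint_put_u32(uint8_t *buf, uint32_t v)
{
    size_t n = 0;
    while (v >= 0x80) {
        buf[n++] = (uint8_t)((v & 0x7f) | 0x80);
        v >>= 7;
    }
    buf[n++] = (uint8_t)v;
    return n;
}

/* Hypothetical helper, not part of the real code base.
 * Decode a value written by varint_put_u32() from at most len bytes.
 * Returns the number of bytes consumed, or 0 if the input is
 * truncated or longer than 5 bytes. */
static size_t varint_get_u32(const uint8_t *buf, size_t len, uint32_t *out)
{
    uint32_t v = 0;
    unsigned shift = 0;
    size_t n = 0;
    while (n < len && shift < 35) {
        uint8_t b = buf[n++];
        v |= (uint32_t)(b & 0x7f) << shift;
        if (!(b & 0x80)) {      /* last byte of this value */
            *out = v;
            return n;
        }
        shift += 7;
    }
    return 0;
}
```

With this scheme, values below 128 take one byte instead of the four used by a fixed-width 32-bit field, while the largest values take five bytes. The saving therefore depends on most stored numbers being small, as described above.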
This change only affects the blocks of data sorted in memory
and then written out by buf_flush(). As the merge_blocks()
function writes BCF and needs far fewer records in memory at
any time, partially merged files are still written in BCF
format.