The kputuw function is considerably faster than kputll because it
encodes 2 digits at a time and also uses __builtin_clz. This changes
kputll to use the same 2-digits-at-a-time trick. I have a
__builtin_clzll variant too, but with longer numbers it's not the main
bottleneck, and we fall back to kputuw for small numbers. This avoids
complicating the code with builtin checks and alternate versions.
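For illustration, the 2-digits-at-a-time idea can be sketched roughly as
below. This is a minimal standalone sketch, not the code in this commit:
ll_to_dec is a hypothetical helper that writes into a caller-supplied
buffer rather than a kstring_t, and it uses a plain loop with no
__builtin_clz digit counting.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not part of htslib): writes the decimal form of v
 * into out, NUL-terminates it, and returns the length excluding the NUL.
 * out must have room for at least 21 bytes (sign + 19 digits + NUL). */
static size_t ll_to_dec(long long v, char *out) {
    /* Lookup table of every two-digit pair "00".."99". */
    static const char pairs[] =
        "00010203040506070809" "10111213141516171819"
        "20212223242526272829" "30313233343536373839"
        "40414243444546474849" "50515253545556575859"
        "60616263646566676869" "70717273747576777879"
        "80818283848586878889" "90919293949596979899";
    char tmp[20];                 /* digits are built right to left */
    char *end = tmp + sizeof(tmp);
    char *p = end;
    size_t len = 0, n;
    /* Negate in unsigned arithmetic so LLONG_MIN does not overflow. */
    unsigned long long x = v < 0 ? 0ULL - (unsigned long long)v
                                 : (unsigned long long)v;

    assert(out != NULL);

    /* One division per two digits instead of one per digit. */
    while (x >= 100) {
        unsigned r = (unsigned)(x % 100);
        x /= 100;
        p -= 2;
        memcpy(p, &pairs[2 * r], 2);
    }
    /* One or two leading digits remain. */
    if (x >= 10) {
        p -= 2;
        memcpy(p, &pairs[2 * x], 2);
    } else {
        *--p = (char)('0' + x);
    }

    if (v < 0)
        out[len++] = '-';
    n = (size_t)(end - p);
    memcpy(out + len, p, n);
    len += n;
    out[len] = '\0';
    return len;
}
```

Halving the number of divisions is where most of the gain is expected to
come from; the real kputuw additionally sizes its output up front.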
An alternative, purely for sam_format1_append, would be something like:

    static inline int kputll_fast(long long c, kstring_t *s) {
        return c <= INT_MAX && c >= INT_MIN ? kputw(c, s) : kputll(c, s);
    }
    #define kputll kputll_fast

This works because BAM/CRAM only support 32-bit numbers for POS, PNEXT
and TLEN anyway, so ll vs w is an irrelevant distinction. However, I
chose to modify the header file so that other callers benefit too.
Overall, compressed BAM to uncompressed SAM conversion is about 5%
quicker (tested on 10 million short-read seqs; the gain will be minimal
on long seqs). This includes decode time and other functions too. The
sam_format1_append component alone is about 15-25% quicker, depending
on compiler and version.