[cuda] Compact int4/int6 weight quant metadata (bf16 -> uint8 + per-row super-scale) #20571
Open
Gasoonjia wants to merge 4 commits into