This repository was archived by the owner on Jul 21, 2025. It is now read-only.
Commit cc729e1
LightSeq QAT (#307)
* ls embedding support qat
* [WIP]ls transformer qat
* fix fairseq transformer cli shape bug of output projection
* ln_bw_i8 test passed!
* test with_mean of ln_i8
* ls encoder attn add qat
* dropout_relu_bias_i8 passed!
* dropout_gelu_bias unit test passed!
* dropout_relu_bias_bwd_i8 passed!
* dropout_gelu_bias_bwd_i8 unit test passed!
* format
* dropout_gelu_bias_bwd_i8 unit test passed!
* format
* polish unit test
* [WIP] ls encoder qat test
* quant_bias_add_transform_20314, quant_transform4d_0213 unit test passed!
* fix unit test bug
* [WIP] ls encoder qat unit test
* fix bug
* set default module to disable quant, fix bugs in examples
* fix encoder bug
* encoder qat test pass
* decoder qat forward test pass
* fix bug in encoder bw
* fix bug of cmax grad
* fix bug of act mask
* fix bug in tensor quantizer
* fix cmax grad bug
* [WIP] decoder support qat
* ls decoder qat pass
* ls encoder qat pass
* add unit test for quant bert encoder
* fix memory bug
* fix cmax grad bug in huggingface
* quant bert enc fw&bw test passed!
* fix hf cmax export bug
* fix fairseq out_proj bug
* fix fairseq shell bug
* fix decoder mem bug
* modify initial lr of fairseq quant training
* decoupled qat code
* modify huggingface training scripts
* add cmax grad
* delete enc_kv output quant
* modify ffn2gemm quant like inference
* fuse dequantize
* fix post ln mem bug
* add decoder self attn qkv cache quant
* export quant model (stage 1)
* export quant model (stage 2)
* export quant model (stage 3)
* support vit quant train
* add gradient clip
* fix hf export bug
* fix quant gpt bug
* support quant gpt training
* modify huggingface training scripts
* support ls bert, gpt export
* support custom quant transformer export
* optimize ffn fake quant and dcmax
* support quant gpt export
* support quant vit export
* add quant linear layer
* fix quant linear layer bug
* support quant vit infer
* speed up cublas igemm on A100 (by huxingwu)
* optimize ls_quant_dropout_act_bias_bwd_kernel
* polish training gemm algo code
* support gemm best algo search on different GPUs and shapes
* search in the range (min_bsz, 512, 1) and (512, max_bsz, 32)
* add configs_sm75/h512_i2048_b1-10016.json
* support col32 igemm
* add configs_sm75/h768_i3072_b1-10016.json
* add configs_sm80/h512_i2048_b1-10016.json
* add configs_sm75/h1024_i4096_b1-10016.json
* add configs_sm80/h768_i3072_b1-10016.json
* fix syntax error
* configs_sm80/h1024_i4096_b1-10016.json
* modify gemm test config format
* merge all the configs to one
* support search all shapes which are not in the config
* polish the merged config
* add cublas_algo_map cpp code
* move get_sm func to lightseq kernels
* move gemm_test to lightseq ops
* modify default config dir, fix algo_map bug
* fix col32 bug
* col major igemm become default
* fix dcmax kernel bug
* loosen cuda 11.6 requirement
* add vit cpp example
* fix bug from col32 gemm and a100 tuned col gemm
* support training encoder qkv_linear auto-tune gemm (in comment)
* add required header file
* dynamic use col32 or col4 in different GPUs
* fix multidefinition bug
* fix weight transform col32 bug
* add best algo for inference gemm (in comments)
* support easy benchmark for gpt and transformer
* support benchmark huggingface
* fix embedding clip_max bug
* ls quant linear support more shape
* fix quant linear bug
* fix quant linear bug
* update pad function for older torch
* fix quant linear bug
* remove redundant code
* fix export bug
* fix format
* fix custom train&infer bug
* fix quant infer size overflow
* fix ls gpt export bug (extra_decode_length)
* fix hf bart cmax init and state
* fix max-batch-tokens bug of bart predict
Co-authored-by: Ying Xiong <xiongying.taka@bytedance.com>
Co-authored-by: duanrenchong <duanrenchong@bytedance.com>
1 parent ae569c2 commit cc729e1
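The commit message entry "search in the range (min_bsz, 512, 1) and (512, max_bsz, 32)" reads like two Python-style `(start, stop, step)` ranges: every batch size below 512 is searched, then every 32nd size from 512 upward. A minimal sketch of that enumeration follows, assuming half-open `range` semantics. The helper name `candidate_batch_sizes` is hypothetical and not taken from the repository, and the bounds 1 and 10016 are borrowed from the `b1-10016` suffix of the config file names on the assumption that it denotes the batch-size span.

```python
from itertools import chain


def candidate_batch_sizes(min_bsz, max_bsz, pivot=512, coarse_step=32):
    """Hypothetical helper: list the batch sizes a GEMM algo search would visit.

    Assumes half-open ranges, as in Python's range():
      dense  part: min_bsz .. pivot - 1, step 1
      coarse part: pivot .. max_bsz - 1, step coarse_step
    """
    fine = range(min_bsz, min(pivot, max_bsz), 1)
    coarse = range(pivot, max_bsz, coarse_step)  # empty when max_bsz <= pivot
    return list(chain(fine, coarse))


# Assumed bounds, taken from the "b1-10016" part of the config file names.
sizes = candidate_batch_sizes(1, 10016)
print(len(sizes), sizes[:3], sizes[510:514], sizes[-1])
```

Searching densely only at small sizes keeps the number of benchmarked shapes manageable while still covering the region where the best algorithm is likely to change most often. That rationale is an inference from the range layout, not something the commit states.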
File tree
149 files changed, +33917 -922 lines changed
- docs
- examples
- inference
- cpp
- python
- export
- fairseq
- huggingface
- test
- training
- custom
- fairseq
- huggingface
- bart/summarization
- bert
- task_glue
- task_ner
- task_qa
- gpt
- vit
- lightseq
- csrc
- kernels
- includes
- layers
- includes
- ops/includes
- pybind
- inference
- model
- proto
- pywrapper
- training
- cli/fs_modules
- csrc/ops/includes
- ops/pytorch
- builder
- pytorch_quantization
- nn/modules
- tests
- gemm_test
- configs