
Commit 2e76737

Update on "int4 gptq shape fix"
Summary: Redoing 5bf70c1 in a way that doesn't get reverted. Note: a device issue also needed to be fixed.

Test Plan:

export MODEL_REPO=meta-llama/Llama-2-7b-chat-hf
python quantize.py --checkpoint_path checkpoints/$MODEL_REPO/model.pth --mode int4-gptq --calibration_tasks wikitext --calibration_limit 5
python eval.py --checkpoint_path checkpoints/$MODEL_REPO/model_int4-gptq.g32.cuda.pth --tasks wikitext --limit 5

Reviewers:
Subscribers:
Tasks:
Tags:

[ghstack-poisoned]
1 parent 749093e commit 2e76737

File tree

4 files changed: +5 −497 lines changed


Diff for: GPTQ.py

+1-1
@@ -150,7 +150,7 @@ def __init__(
         }

         # trace model for one input
-        one_input = tuple([multi.values[0].cpu() for multi in inputs])
+        one_input = [multi.values[0].cpu() for multi in inputs]
         exported_model = torch._dynamo.export(
             model.cpu(), aten_graph=True, pre_dispatch=True, tracing_mode="fake"
         )(*one_input)
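For context, the changed line collects the first captured value from each input before tracing; after the fix, `one_input` is a plain list rather than a tuple, and it still unpacks into the export call the same way. A minimal self-contained sketch of that pattern (the `SimpleNamespace` stand-ins and `fake_export` are hypothetical placeholders, not the real GPTQ.py `MultiInput` objects or `torch._dynamo.export`):

```python
from types import SimpleNamespace

# Hypothetical stand-ins for the calibration inputs GPTQ.py collects;
# each carries a list of captured values per model argument.
inputs = [
    SimpleNamespace(values=[[1.0, 2.0], [3.0, 4.0]]),
    SimpleNamespace(values=[[5.0], [6.0]]),
]

# Before the fix: one_input = tuple([multi.values[0] for multi in inputs])
# After the fix: a plain list. Star-unpacking (*one_input) behaves
# identically for lists and tuples.
one_input = [multi.values[0] for multi in inputs]


def fake_export(*args):
    # Stand-in for torch._dynamo.export(model, ...)(*one_input):
    # just counts the positional arguments it receives.
    return len(args)


assert isinstance(one_input, list)
assert fake_export(*one_input) == 2
```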

0 commit comments
