Commit d2a8da7: Update quantization.md (#735)

Use comments as suggested by GitHub docs

mikekgfb authored and malfet committed Jul 17, 2024 (1 parent: 5d83b0f)
Showing 1 changed file with 2 additions and 1 deletion: docs/quantization.md
@@ -1,9 +1,10 @@

# Quantization

<!--
[shell default]: HF_TOKEN="${SECRET_HF_TOKEN_PERIODIC}" huggingface-cli login

[shell default]: TORCHCHAT_ROOT=${PWD} ./scripts/install_et.sh
-->

## Introduction
Quantization reduces the precision of model parameters and computations from floating point to lower-bit integers, such as 8-bit integers. This minimizes memory requirements, accelerates inference, and decreases power consumption, making models more feasible to deploy on edge devices with limited computational resources. For high-performance devices such as GPUs, quantization reduces the required memory bandwidth and helps exploit the massive compute capabilities of today's server-based accelerators.
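To make the idea concrete, here is a minimal sketch of per-tensor symmetric 8-bit quantization in plain Python. It is illustrative only: the function names and the simple max-abs scaling rule are assumptions for this example, not torchchat's actual quantization schemes, which are selected via its quantization configuration.

```python
# Illustrative per-tensor symmetric int8 quantization (not torchchat's API).

def quantize_int8(values):
    """Map floats to int8 codes using a single symmetric scale."""
    max_abs = max(abs(v) for v in values) or 1.0
    scale = max_abs / 127.0  # symmetric int8 range: [-127, 127]
    q = [max(-127, min(127, round(v / scale))) for v in values]
    return q, scale

def dequantize_int8(q, scale):
    """Recover approximate floats from int8 codes and the scale."""
    return [x * scale for x in q]

weights = [0.5, -1.2, 0.03, 0.9]
q, scale = quantize_int8(weights)
approx = dequantize_int8(q, scale)
# Each reconstructed value is within one quantization step of the original.
assert all(abs(a - b) <= scale for a, b in zip(weights, approx))
```

The int8 codes occupy a quarter of the memory of float32 values, at the cost of a bounded rounding error per element; real schemes refine this with per-channel or per-group scales.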
