
Commit 41c2fc6

Add warning for Llama models
1 parent b4dfa5d commit 41c2fc6

2 files changed (+28 −12 lines)


README.md (+1 −1)
@@ -4,7 +4,7 @@ This repository contains instructions and examples for efficient neural architec
 
 ### :fire:[SQFT: Low-cost Model Adaptation in Low-precision Sparse Foundation Models](./SQFT/README.md)
 
-SQFT is a solution for low-precision sparse parameter-efficient fine-tuning (PEFT) of large models. It includes an innovative strategy that enables the merging of sparse weights with low-rank adapters without losing sparsity and accuracy, overcoming the limitations of previous approaches. SQFT also addresses the challenge of having quantized weights and adapters with different numerical precisions, enabling merging in the desired numerical format without sacrificing accuracy.
+SQFT is a solution for fine-tuning low-precision and sparse large models using parameter-efficient fine-tuning (PEFT). It includes an innovative strategy that enables the merging of sparse weights with low-rank adapters without losing sparsity and accuracy, overcoming the limitations of previous approaches. SQFT also addresses the challenge of having quantized weights and adapters with different numerical precisions, enabling merging in the desired numerical format without sacrificing accuracy.
 
 ### :fire:[Shears: Unstructured Sparsity with Neural Low-rank Adapter Search](./Shears/README.md)
 
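The SQFT description above says sparse weights can be merged with low-rank adapters without losing sparsity. As a rough illustration of that idea only (hypothetical helper names, plain-Python matrices; not code from this repository), a SparsePEFT-style merge can add the low-rank update `B @ A` into the base weight and then re-apply the base weight's zero mask, so the merged weight keeps exactly the original sparsity pattern:

```python
# Illustrative sketch of sparsity-preserving adapter merging.
# Hypothetical helpers, not the repository's API.

def matmul(B, A):
    """Dense matrix product of two lists-of-lists."""
    n, k, m = len(B), len(A), len(A[0])
    return [[sum(B[i][t] * A[t][j] for t in range(k)) for j in range(m)]
            for i in range(n)]

def merge_sparse_adapter(W, A, B, scaling=1.0):
    """Merge the low-rank update B @ A into W, masked by W's zero pattern,
    so pruned (zero) entries of W stay exactly zero after merging."""
    delta = matmul(B, A)
    return [[(W[i][j] + scaling * delta[i][j]) if W[i][j] != 0 else 0.0
             for j in range(len(W[0]))]
            for i in range(len(W))]

# Toy example: a 2x2 weight with one pruned (zero) entry and a rank-1 adapter.
W = [[1.0, 0.0],
     [0.5, 2.0]]
B = [[1.0], [1.0]]
A = [[0.1, 0.1]]
merged = merge_sparse_adapter(W, A, B)
# The pruned position merged[0][1] remains exactly zero.
```

This only shows the masking trick; the actual method also handles the quantized-weight case (merging in the desired numerical format), which this sketch does not cover.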

SQFT/README.md (+27 −11)
@@ -15,10 +15,13 @@ We have released several foundation models (sparse or sparse-and-quantized) for
 
 | Source Model | Sparsity | Sparse Model | Sparse-and-Quantized Model |
 |-----------------------------------------------------------------------------------|----------|------------------------------------------------------------------------------------------------------|------------------------------------------------------------------------------------------------------------------|
-| [Meta-Llama-3-8B](https://huggingface.co/meta-llama/Meta-Llama-3-8B) | 50% | [IntelLabs/sqft-llama-3-8b-50-base](https://huggingface.co/IntelLabs/sqft-llama-3-8b-50-base) | [IntelLabs/sqft-llama-3-8b-50-base-gptq](https://huggingface.co/IntelLabs/sqft-llama-3-8b-50-base-gptq) |
 | [Mistral-7B-v0.3](https://huggingface.co/mistralai/Mistral-7B-v0.3) | 50% | [IntelLabs/sqft-mistral-7b-v0.3-50-base](https://huggingface.co/IntelLabs/sqft-mistral-7b-v0.3-50-base) | [IntelLabs/sqft-mistral-7b-v0.3-50-base-gptq](https://huggingface.co/IntelLabs/sqft-mistral-7b-v0.3-50-base-gptq) |
 | [Phi-3-mini-4k-instruct](https://huggingface.co/microsoft/Phi-3-mini-4k-instruct) | 50% | [IntelLabs/sqft-phi-3-mini-4k-50-base](https://huggingface.co/IntelLabs/sqft-phi-3-mini-4k-50-base) | [IntelLabs/sqft-phi-3-mini-4k-50-base-gptq](https://huggingface.co/IntelLabs/sqft-phi-3-mini-4k-50-base-gptq) |
+| [Meta-Llama-3-8B](https://huggingface.co/meta-llama/Meta-Llama-3-8B) | 50% | [IntelLabs/sqft-llama-3-8b-50-base]() | [IntelLabs/sqft-llama-3-8b-50-base-gptq]() |
+`*` **Llama-3 models are under review**
 
+[//]: # (https://huggingface.co/IntelLabs/sqft-llama-3-8b-50-base)
+[//]: # (https://huggingface.co/IntelLabs/sqft-llama-3-8b-50-base-gptq)
 
 ## Setup
 
## Setup
2427

@@ -304,14 +307,6 @@ lm_eval --model hf \
 
 ## Released Fine-tuned Models 🤗
 
-- Meta-Llama-3-8B
-
-| Base Model | Task | Method | Fine-tuned Model |
-|------------------------------------------------------------------------------------------------|--------|-----------------------|--------------------------------------------------------------------------------------------------------------------------------------|
-| [sqft-llama-3-8b-50-base](https://huggingface.co/IntelLabs/sqft-llama-3-8b-50-base) | GSM8K | SQFT + SparsePEFT | [sqft-llama-3-8b-50-gptq-gsm8k-heu-adapter](https://huggingface.co/IntelLabs/sqft-llama-3-8b-50-gptq-gsm8k-heu-adapter) |
-| [sqft-llama-3-8b-50-base-gptq](https://huggingface.co/IntelLabs/sqft-llama-3-8b-50-base-gptq) | GSM8K | SQFT | [sqft-sparsepeft-llama-3-8b-50-gsm8k-heu](https://huggingface.co/IntelLabs/sqft-sparsepeft-llama-3-8b-50-gsm8k-heu) |
-| [sqft-llama-3-8b-50-base-gptq](https://huggingface.co/IntelLabs/sqft-llama-3-8b-50-base-gptq) | GSM8K | SQFT + QA-SparsePEFT | [sqft-qa-sparsepeft-llama-3-8b-50-gptq-gsm8k-heu](https://huggingface.co/IntelLabs/sqft-qa-sparsepeft-llama-3-8b-50-gptq-gsm8k-heu) |
-
 - Mistral-7B-v0.3
 
 | Base Model | Task | Method | Fine-tuned Model |
@@ -334,13 +329,34 @@ lm_eval --model hf \
 | [sqft-phi-3-mini-4k-50-base-gptq](https://huggingface.co/IntelLabs/sqft-phi-3-mini-4k-50-base-gptq) | CS | SQFT | [sqft-sparsepeft-phi-3-mini-4k-50-cs-heu](https://huggingface.co/IntelLabs/sqft-sparsepeft-phi-3-mini-4k-50-cs-heu) |
 | [sqft-phi-3-mini-4k-50-base-gptq](https://huggingface.co/IntelLabs/sqft-phi-3-mini-4k-50-base-gptq) | CS | SQFT + QA-SparsePEFT | [sqft-qa-sparsepeft-phi-3-mini-4k-50-gptq-cs-heu](https://huggingface.co/IntelLabs/sqft-qa-sparsepeft-phi-3-mini-4k-50-gptq-cs-heu) |
 
+- Meta-Llama-3-8B
+
+| Base Model | Task | Method | Fine-tuned Model |
+|------------------------------------------------------------------------------------------------|--------|-----------------------|-------------------------------------------------------------------------------------------------------------------------------------|
+| [sqft-llama-3-8b-50-base]() | GSM8K | SQFT + SparsePEFT | [sqft-llama-3-8b-50-gptq-gsm8k-heu-adapter]() |
+| [sqft-llama-3-8b-50-base-gptq]() | GSM8K | SQFT | [sqft-sparsepeft-llama-3-8b-50-gsm8k-heu]() |
+| [sqft-llama-3-8b-50-base-gptq]() | GSM8K | SQFT + QA-SparsePEFT | [sqft-qa-sparsepeft-llama-3-8b-50-gptq-gsm8k-heu]() |
+`*` **Llama-3-8B fine-tuned models are under review**
+
+[//]: # (https://huggingface.co/IntelLabs/sqft-llama-3-8b-50-base)
+
+[//]: # (https://huggingface.co/IntelLabs/sqft-llama-3-8b-50-gptq-gsm8k-heu-adapter)
+
+[//]: # (https://huggingface.co/IntelLabs/sqft-llama-3-8b-50-base-gptq)
+
+[//]: # (https://huggingface.co/IntelLabs/sqft-sparsepeft-llama-3-8b-50-gsm8k-heu)
+
+[//]: # (https://huggingface.co/IntelLabs/sqft-llama-3-8b-50-base-gptq)
+
+[//]: # (https://huggingface.co/IntelLabs/sqft-qa-sparsepeft-llama-3-8b-50-gptq-gsm8k-heu)
+
 ## Citation
 If you find SQFT's code and paper helpful, please kindly cite:
 ```bibtex
 @article{munoz2024sqft,
   title = {SQFT: Low-cost Model Adaptation in Low-precision Sparse Foundation Models},
-  author = {J. Pablo Munoz and Jinjie Yuan and Nilesh Jain},
-  journal = {},
+  author = {J. Pablo Muñoz and Jinjie Yuan and Nilesh Jain},
+  journal = {The 2024 Conference on Empirical Methods in Natural Language Processing (Findings)},
   year = {2024},
   url = {}
 }
