Commit 98e762e

Merge pull request #2 from descriptinc/ik/update-readme
Update README.md
2 parents a436c61 + 65ec6b2 commit 98e762e

3 files changed: +30 -5 lines changed

README.md

Lines changed: 30 additions & 5 deletions
@@ -1,11 +1,21 @@
-# Descript Audio Codec (.dac)
-
-<!-- ![](https://static.arxiv.org/static/browse/0.3.4/images/icons/favicon-32x32.png) -->
-
+# Descript Audio Codec (.dac): High-Fidelity Audio Compression with Improved RVQGAN
 
 This repository contains training and inference scripts
 for the Descript Audio Codec (.dac), a high fidelity general
-neural audio codec.
+neural audio codec, introduced in the paper titled **High-Fidelity Audio Compression with Improved RVQGAN**.
+
+![](https://static.arxiv.org/static/browse/0.3.4/images/icons/favicon-16x16.png) [arXiv Paper: High-Fidelity Audio Compression with Improved RVQGAN
+](http://arxiv.org/abs/2306.06546) <br>
+📈 [Demo Site](https://descript.notion.site/Descript-Audio-Codec-11389fce0ce2419891d6591a68f814d5)<br>
+[Model Weights](https://github.com/descriptinc/descript-audio-codec/releases/download/0.0.1/weights.pth)
+
+👉 With Descript Audio Codec, you can compress **44.1 KHz audio** into discrete codes at a **low 8 kbps bitrate**. <br>
+🤌 That's approximately **90x compression** while maintaining exceptional fidelity and minimizing artifacts. <br>
+💪 Our universal model works on all domains (speech, environment, music, etc.), making it widely applicable to generative modeling of all audio. <br>
+👌 It can be used as a drop-in replacement for EnCodec for all audio language modeling applications (such as AudioLMs, MusicLMs, MusicGen, etc.) <br>
+
+<p align="center">
+<img src="./assets/comparsion_stats.png" alt="Comparison of compressions approaches. Our model achieves a higher compression factor compared to all baseline methods. Our model has a ~90x compression factor compared to 32x compression factor of EnCodec and 64x of SoundStream. Note that we operate at a target bitrate of 8 kbps, whereas EnCodec operates at 24 kbps and SoundStream at 6 kbps. We also operate at 44.1 kHz, whereas EnCodec operates at 48 kHz and SoundStream operates at 24 kHz." width=35%></p>
 
 
 ## Usage
## Usage
@@ -17,6 +27,16 @@ cd descript-audio-codec
 pip install .
 ```
 
+### Weights
+Weights are released as part of this repo under MIT license.
+They are automatically downloaded when you first run `encode` or `decode` command. They can be cached locally with
+```
+python3 -m dac download
+```
+We provide a Dockerfile that installs all required dependencies for encoding and decoding. The build process caches model weights inside the image. This allows the image to be used without an internet connection. [Please refer to instructions below.](#docker-image)
+
+
+
 ### Compress audio
 ```
 python3 -m dac encode /path/to/input --output /path/to/output/codes
@@ -93,3 +113,8 @@ tests. To launch these tests please run
 ```
 python -m pytest tests
 ```
+
+## Results
+
+<p align="left">
+<img src="./assets/objective_comparisons.png" width=75%></p>
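
Note on the compression factors quoted in the added README text: the diff gives sample rates and target bitrates but does not say how the "~90x", "32x" and "64x" figures are derived. A minimal sketch of one plausible derivation follows, assuming the uncompressed reference is 16-bit mono PCM. The 16-bit depth and the `compression_factor` helper are assumptions made for illustration and are not taken from the repository.

```python
# Sanity check of the compression factors quoted in the README diff.
# Assumption (not stated in the diff): the uncompressed reference is
# 16-bit mono PCM, so raw bitrate = sample_rate_hz * 16 bits.

BITS_PER_SAMPLE = 16  # assumed bit depth of the uncompressed signal


def compression_factor(sample_rate_hz: float, codec_kbps: float) -> float:
    """Ratio of the raw PCM bitrate to the codec's target bitrate."""
    raw_kbps = sample_rate_hz * BITS_PER_SAMPLE / 1000
    return raw_kbps / codec_kbps


# (sample rate in Hz, target bitrate in kbps), as given in the image alt text
codecs = {
    "DAC": (44_100, 8),
    "EnCodec": (48_000, 24),
    "SoundStream": (24_000, 6),
}

for name, (rate, kbps) in codecs.items():
    print(f"{name}: {compression_factor(rate, kbps):.1f}x")
```

Under that assumption the script prints 88.2x for DAC, 32.0x for EnCodec and 64.0x for SoundStream, which is consistent with the "~90x", "32x" and "64x" figures in the alt text.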

assets/comparsion_stats.png

181 KB

assets/objective_comparisons.png

519 KB
