Commit

update lserve
ys-2020 committed Feb 24, 2025
1 parent e6c7f8b commit b463284
Showing 2 changed files with 9 additions and 1 deletion.
10 changes: 9 additions & 1 deletion _data/publications.yml
@@ -1,11 +1,19 @@
 main:

+- title: "LServe: Efficient Long-sequence LLM Serving with Unified Sparse Attention"
+  authors: <strong>Shang Yang*</strong>, Junxian Guo*, Haotian Tang, Qinghao Hu, Guangxuan Xiao, Jiaming Tang, Yujun Lin, Zhijian Liu, Yao Lu, Song Han.
+  conference_short: MLSys
+  conference: The Eighth Annual Conference on Machine Learning and Systems <strong>(MLSys)</strong>, 2025.
+  paper: https://arxiv.org/abs/2502.14866
+  code: https://github.com/mit-han-lab/omniserve
+  image: ./assets/img/paper_teasers/LServe.png
+
 - title: "QServe: W4A8KV4 Quantization and System Co-design for Efficient LLM Serving"
   authors: Yujun Lin*, Haotian Tang*, <strong>Shang Yang*</strong>, Zhekai Zhang, Guangxuan Xiao, Chuang Gan, Song Han.
   conference_short: MLSys
   conference: The Eighth Annual Conference on Machine Learning and Systems <strong>(MLSys)</strong>, 2025.
   paper: https://arxiv.org/abs/2405.04532
-  code: https://github.com/mit-han-lab/qserve
+  code: https://github.com/mit-han-lab/omniserve
   image: ./assets/img/paper_teasers/QServe.png

 - title: "AWQ: Activation-aware Weight Quantization for On-Device LLM Compression and Acceleration"
Binary file added assets/img/paper_teasers/LServe.png
