Improve `TransformersModel` UX #12785

hmellor · 2025-02-05T18:31:26Z

Improve the user experience by logging a warning instead of raising an error when a `Linear` layer cannot be tensor parallelised.
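
The warn-instead-of-raise behaviour described above might look roughly like the sketch below. It is illustrative only and is not the code from this PR: the `SUPPORTED_STYLES` mapping, the `choose_layer` helper, and the string layer names are all hypothetical stand-ins, and the reason a layer cannot be parallelised (an unrecognised style name) is an assumption.

```python
import logging

logger = logging.getLogger("tp_sketch")

# Hypothetical mapping from a parallel style name to a replacement layer kind.
SUPPORTED_STYLES = {"colwise": "ColumnParallel", "rowwise": "RowParallel"}


def choose_layer(layer_name: str, style: str) -> str:
    """Pick a replacement for a Linear layer, falling back instead of failing."""
    replacement = SUPPORTED_STYLES.get(style)
    if replacement is None:
        # Previously this kind of situation would raise; here we only warn
        # and keep the layer replicated so that loading can continue.
        logger.warning(
            "Layer %s has unsupported parallel style %r; "
            "it will not be tensor parallelised.",
            layer_name,
            style,
        )
        return "Replicated"
    return replacement
```

With this shape, an unsupported layer costs some parallelism but no longer aborts model loading, which is the UX trade-off the description refers to.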

(split from #12776)

Signed-off-by: Harry Mellor <[email protected]>

github-actions · 2025-02-05T18:31:39Z

👋 Hi! Thank you for contributing to the vLLM project.

💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels.

Just a reminder: PRs do not trigger a full CI run by default. Instead, only the fastcheck CI runs, which covers a small, essential subset of CI tests to catch errors quickly. You can run other CI tests on top of those by going to your fastcheck build in the Buildkite UI (linked in the PR checks section) and unblocking them. If you do not have permission to unblock, ping simon-mo or khluu to add you to our Buildkite org.

Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging.

To run CI, PR reviewers can either add the ready label to the PR or enable auto-merge.

🚀

hmellor · 2025-02-05T18:35:07Z

vllm/model_executor/models/transformers.py

        self.vocab_size = config.vocab_size
        self.unpadded_vocab_size = config.vocab_size

        self.model: PreTrainedModel = AutoModel.from_config(
            self.config,
            attn_implementation="vllm",
-            torch_dtype=vllm_config.model_config.dtype,


Note that the dtype of the loaded model is handled by a context manager:

vllm/vllm/model_executor/model_loader/loader.py

Lines 381 to 383 in bc1bdec

    with set_default_torch_dtype(model_config.dtype):
        with target_device:
            model = _initialize_model(vllm_config=vllm_config)
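
To illustrate why an explicit `torch_dtype` argument becomes redundant under such a context manager, here is a minimal standard-library sketch of the pattern. It does not use torch and is not vLLM's `set_default_torch_dtype`: the `_state` dictionary, `get_default_dtype`, `set_default_dtype`, and `TinyModel` are hypothetical stand-ins for a process-wide default dtype and a model that picks it up at construction time.

```python
import contextlib

# Stand-in for a process-wide default, such as a framework's default dtype.
_state = {"default_dtype": "float32"}


def get_default_dtype() -> str:
    return _state["default_dtype"]


@contextlib.contextmanager
def set_default_dtype(dtype: str):
    """Temporarily override the default dtype, restoring it on exit."""
    old = _state["default_dtype"]
    _state["default_dtype"] = dtype
    try:
        yield
    finally:
        # Restore even if model construction raises.
        _state["default_dtype"] = old


class TinyModel:
    """Hypothetical model that reads the default dtype when constructed."""

    def __init__(self) -> None:
        self.dtype = get_default_dtype()


with set_default_dtype("bfloat16"):
    model = TinyModel()  # no dtype argument needed inside the context
```

Anything constructed inside the `with` block inherits the overridden default, so passing the dtype again at the call site would only repeat what the context already sets.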

Signed-off-by: Felix Marty <[email protected]>

Improve TransformersModel UX

05718fd

Signed-off-by: Harry Mellor <[email protected]>

hmellor mentioned this pull request Feb 5, 2025

Use RMSNorm in TransformersModel #12776

Closed

hmellor changed the title ~~Improve TransformersModel UX~~ Improve `TransformersModel` UX Feb 5, 2025

simon-mo approved these changes Feb 6, 2025

simon-mo merged commit 1a6fcad into vllm-project:main Feb 6, 2025
19 of 22 checks passed

hmellor deleted the warn-not-fail branch February 6, 2025 09:34

fxmarty-amd pushed a commit to fxmarty-amd/vllm that referenced this pull request Feb 7, 2025

Improve TransformersModel UX (vllm-project#12785)

c5d7865

Signed-off-by: Felix Marty <[email protected]>

AoyuQC pushed a commit to AoyuQC/vllm that referenced this pull request Feb 8, 2025

Improve TransformersModel UX (vllm-project#12785)

142560a

ShangmingCai pushed a commit to ShangmingCai/vllm that referenced this pull request Feb 10, 2025

Improve TransformersModel UX (vllm-project#12785)

2265167
