Add LLaMA support to embed_to_distrib by yaswanth169 · Pull Request #235 · stanfordnlp/pyvene

yaswanth169 · 2026-03-07T20:41:31Z

Description

Implements the LLaMA branch in embed_to_distrib() in pyvene/models/basic_utils.py, removing the previous assert False so causal tracing and embed→vocab distribution work for LLaMA/LlamaForCausalLM.
Handles model.config.architectures being None when models are built from config (e.g. in tests) by falling back to type(model).__name__.
Keeps GPT-2 behavior and supports both GPT2Model and GPT2LMHeadModel by using model.wte or model.transformer.wte as appropriate.
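Adds unit tests in tests/unit_tests/BasicUtilsTestCase.py for GPT-2 and LLaMA, covering both logits=True and the default softmax output; each test checks the output shape, and the softmax tests also check that each distribution sums to one.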
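
The architecture fallback described above can be illustrated with a minimal, dependency-free sketch. It is not the code from this PR: `resolve_architecture` is a hypothetical helper name, and the `LlamaForCausalLM` class below is a stand-in defined only for the example, not the transformers class.

```python
from types import SimpleNamespace


def resolve_architecture(model):
    """Return an architecture name for `model` (hypothetical helper).

    Prefers model.config.architectures[0]; falls back to the class name
    when the list is None or empty, as can happen for models built
    directly from a config object.
    """
    config = getattr(model, "config", None)
    architectures = getattr(config, "architectures", None)
    if architectures:
        return architectures[0]
    return type(model).__name__


class LlamaForCausalLM:
    """Stand-in class for illustration only (assumption, not transformers)."""

    def __init__(self, architectures=None):
        self.config = SimpleNamespace(architectures=architectures)


from_config = LlamaForCausalLM(architectures=None)
from_checkpoint = LlamaForCausalLM(architectures=["LlamaForCausalLM"])

print(resolve_architecture(from_config))
print(resolve_architecture(from_checkpoint))
```

Both calls resolve to the same name, so a branch keyed on the architecture string behaves the same whether or not config.architectures was populated.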

Testing Done

Ran: python -m unittest tests.unit_tests.BasicUtilsTestCase -v (with PYTHONPATH set to repo root).
All 4 tests pass: test_embed_to_distrib_gpt2_logits, test_embed_to_distrib_gpt2_softmax, test_embed_to_distrib_llama_logits, test_embed_to_distrib_llama_softmax.
Also ran tests.unit_tests.CausalModelTestCase with no regressions.
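
For readers without the repository checked out, the kind of shape and sum-to-one checks listed above can be sketched as follows. This is an assumption-laden stand-in, not the tests in BasicUtilsTestCase.py: `project_to_vocab` is a hypothetical pure-Python substitute for embed_to_distrib, operating on nested lists instead of tensors, and the test names are illustrative.

```python
import math
import unittest


def project_to_vocab(embed, weight, logits=False):
    """Hypothetical stand-in for an embedding-to-vocabulary projection.

    embed:  list of hidden vectors, shape [seq][hidden]
    weight: embedding matrix, shape [vocab][hidden]
    Returns raw scores if logits=True, else a softmax over the vocabulary.
    """
    scores = [
        [sum(h * w for h, w in zip(vec, row)) for row in weight]
        for vec in embed
    ]
    if logits:
        return scores
    distributions = []
    for row in scores:
        peak = max(row)  # subtract the max for numerical stability
        exps = [math.exp(s - peak) for s in row]
        total = sum(exps)
        distributions.append([e / total for e in exps])
    return distributions


class ProjectionChecks(unittest.TestCase):
    def setUp(self):
        # 3 positions with hidden size 2, vocabulary of 4 entries.
        self.embed = [[0.5, -1.0], [2.0, 0.25], [0.0, 0.0]]
        self.weight = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [-1.0, 0.5]]

    def test_logits_shape(self):
        out = project_to_vocab(self.embed, self.weight, logits=True)
        self.assertEqual((len(out), len(out[0])), (3, 4))

    def test_softmax_sums_to_one(self):
        out = project_to_vocab(self.embed, self.weight)
        for row in out:
            self.assertAlmostEqual(sum(row), 1.0, places=6)
```

Saved as a module, these checks would run with python -m unittest in the same way as the command above.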

Checklist:

My PR title strictly follows the format
I have attached the testing log above
I have provided enough comments in my code
I have updated the documentation
I have added tests for my changes

Authors

Yashwanth Devavarapu
Rakshitha Ireddi

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 5d6a68f181


chatgpt-codex-connector · 2026-03-07T20:45:26Z

+            wte = model.wte if hasattr(model, "wte") else model.transformer.wte
+            vocab = torch.matmul(embed, wte.weight.t())


Use GPT-2 LM head for vocab projection

This branch now handles GPT2LMHeadModel, but it always projects through wte.weight. That gives incorrect logits whenever the output embeddings are not tied to the input embeddings (e.g., tie_word_embeddings=False, or checkpoints where lm_head has diverged during fine-tuning). In those valid GPT-2 configurations, embed_to_distrib will silently return the wrong distribution. The projection should prefer model.lm_head when present and fall back to wte only for models without an LM head.
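
One way to read this suggestion is sketched below. It is a hedged illustration rather than the fix that was (or would be) applied to pyvene: `select_projection_weight` is a hypothetical helper, and the models are `SimpleNamespace` stand-ins whose weights are plain strings so the selection logic can be checked without loading a real checkpoint.

```python
from types import SimpleNamespace


def select_projection_weight(model):
    """Pick the matrix used to map hidden states to vocabulary logits.

    Hypothetical helper: prefers model.lm_head.weight when an LM head
    exists, and falls back to the input embedding (wte) otherwise.
    """
    lm_head = getattr(model, "lm_head", None)
    if lm_head is not None:
        return lm_head.weight
    wte = model.wte if hasattr(model, "wte") else model.transformer.wte
    return wte.weight


# Stand-in for a bare GPT2Model: only an input embedding, no LM head.
bare = SimpleNamespace(wte=SimpleNamespace(weight="wte-weights"))

# Stand-in for a GPT2LMHeadModel whose head is NOT tied to wte.
untied = SimpleNamespace(
    transformer=SimpleNamespace(wte=SimpleNamespace(weight="wte-weights")),
    lm_head=SimpleNamespace(weight="lm-head-weights"),
)

print(select_projection_weight(bare))
print(select_projection_weight(untied))
```

With tied embeddings the two weights are the same tensor, so preferring the LM head changes nothing in that case; it only matters when the head has its own parameters.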


yaswanth169 wants to merge 1 commit into stanfordnlp:main from yaswanth169:feature/embed-to-distrib-llama
