Skip to content

[etLLM] extension/llm should have unit tests #8495

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
larryliu0820 opened this issue Feb 14, 2025 · 0 comments · Fixed by #10228
Closed

[etLLM] extension/llm should have unit tests #8495

larryliu0820 opened this issue Feb 14, 2025 · 0 comments · Fixed by #10228
Assignees
Labels
module: llm Issues related to LLM examples and apps, and to the extensions/llm/ code triaged This issue has been looked at a team member, and triaged and prioritized into an appropriate module

Comments

@larryliu0820
Copy link
Contributor

larryliu0820 commented Feb 14, 2025

🚀 The feature, motivation and pitch

TSIA

Alternatives

No response

Additional context

No response

RFC (Optional)

No response

cc @mergennachin @cccclai @helunwencser @jackzhxng

@larryliu0820 larryliu0820 self-assigned this Feb 14, 2025
@github-project-automation github-project-automation bot moved this to To triage in ExecuTorch Core Feb 14, 2025
@jackzhxng jackzhxng added triaged This issue has been looked at a team member, and triaged and prioritized into an appropriate module module: llm Issues related to LLM examples and apps, and to the extensions/llm/ code labels Feb 14, 2025
@larryliu0820 larryliu0820 moved this from To triage to Ready in ExecuTorch Core Feb 18, 2025
facebook-github-bot pushed a commit that referenced this issue Apr 16, 2025
Summary:
Started to implement #9341
Started to fix #8495

This PR introduces `GenerationConfig` which contains the configs that can be changed across different invocations of `generate()`. 

For example, `temperature` is moved out from the runner constructor for it's not tied to the runner instance but instead should be adjustable every time we call `generate()`.

Similarly we put `echo` and `warming` into the config.

We also allow both `seq_len` and `max_new_tokens` to be passed through the config and we determine the value of `max_new_tokens` based on these 2 config values, pte file metadata as well as the number of prompt tokens.

Differential Revision: D73091676
facebook-github-bot pushed a commit that referenced this issue Apr 16, 2025
Summary:

Started to implement #9341
Started to fix #8495

This PR introduces `GenerationConfig` which contains the configs that can be changed across different invocations of `generate()`. 

For example, `temperature` is moved out from the runner constructor for it's not tied to the runner instance but instead should be adjustable every time we call `generate()`.

Similarly we put `echo` and `warming` into the config.

We also allow both `seq_len` and `max_new_tokens` to be passed through the config and we determine the value of `max_new_tokens` based on these 2 config values, pte file metadata as well as the number of prompt tokens.

Reviewed By: iseeyuan

Differential Revision: D73091676
larryliu0820 added a commit that referenced this issue Apr 16, 2025
Summary:

Started to implement #9341
Started to fix #8495

This PR introduces `GenerationConfig` which contains the configs that can be changed across different invocations of `generate()`. 

For example, `temperature` is moved out from the runner constructor for it's not tied to the runner instance but instead should be adjustable every time we call `generate()`.

Similarly we put `echo` and `warming` into the config.

We also allow both `seq_len` and `max_new_tokens` to be passed through the config and we determine the value of `max_new_tokens` based on these 2 config values, pte file metadata as well as the number of prompt tokens.

Reviewed By: iseeyuan

Differential Revision: D73091676
facebook-github-bot pushed a commit that referenced this issue Apr 16, 2025
Summary:

Started to implement #9341
Started to fix #8495

This PR introduces `GenerationConfig` which contains the configs that can be changed across different invocations of `generate()`. 

For example, `temperature` is moved out from the runner constructor for it's not tied to the runner instance but instead should be adjustable every time we call `generate()`.

Similarly we put `echo` and `warming` into the config.

We also allow both `seq_len` and `max_new_tokens` to be passed through the config and we determine the value of `max_new_tokens` based on these 2 config values, pte file metadata as well as the number of prompt tokens.

Reviewed By: iseeyuan

Differential Revision: D73091676
larryliu0820 added a commit that referenced this issue Apr 16, 2025
Summary:
Pull Request resolved: #10228

Started to implement #9341
Started to fix #8495

This PR introduces `GenerationConfig` which contains the configs that can be changed across different invocations of `generate()`.

For example, `temperature` is moved out from the runner constructor for it's not tied to the runner instance but instead should be adjustable every time we call `generate()`.

Similarly we put `echo` and `warming` into the config.

We also allow both `seq_len` and `max_new_tokens` to be passed through the config and we determine the value of `max_new_tokens` based on these 2 config values, pte file metadata as well as the number of prompt tokens.

Reviewed By: iseeyuan

Differential Revision: D73091676
larryliu0820 added a commit that referenced this issue Apr 17, 2025
Summary:

Started to implement #9341
Started to fix #8495

This PR introduces `GenerationConfig` which contains the configs that can be changed across different invocations of `generate()`. 

For example, `temperature` is moved out from the runner constructor for it's not tied to the runner instance but instead should be adjustable every time we call `generate()`.

Similarly we put `echo` and `warming` into the config.

We also allow both `seq_len` and `max_new_tokens` to be passed through the config and we determine the value of `max_new_tokens` based on these 2 config values, pte file metadata as well as the number of prompt tokens.

Reviewed By: iseeyuan

Differential Revision: D73091676
larryliu0820 added a commit that referenced this issue Apr 17, 2025
Summary:
Pull Request resolved: #10228

Started to implement #9341
Started to fix #8495

This PR introduces `GenerationConfig` which contains the configs that can be changed across different invocations of `generate()`.

For example, `temperature` is moved out from the runner constructor for it's not tied to the runner instance but instead should be adjustable every time we call `generate()`.

Similarly we put `echo` and `warming` into the config.

We also allow both `seq_len` and `max_new_tokens` to be passed through the config and we determine the value of `max_new_tokens` based on these 2 config values, pte file metadata as well as the number of prompt tokens.

Reviewed By: iseeyuan

Differential Revision: D73091676
facebook-github-bot pushed a commit that referenced this issue Apr 17, 2025
Summary:

Started to implement #9341
Started to fix #8495

This PR introduces `GenerationConfig` which contains the configs that can be changed across different invocations of `generate()`. 

For example, `temperature` is moved out from the runner constructor for it's not tied to the runner instance but instead should be adjustable every time we call `generate()`.

Similarly we put `echo` and `warming` into the config.

We also allow both `seq_len` and `max_new_tokens` to be passed through the config and we determine the value of `max_new_tokens` based on these 2 config values, pte file metadata as well as the number of prompt tokens.

Reviewed By: iseeyuan

Differential Revision: D73091676
facebook-github-bot pushed a commit that referenced this issue Apr 17, 2025
Summary:

Started to implement #9341
Started to fix #8495

This PR introduces `GenerationConfig` which contains the configs that can be changed across different invocations of `generate()`. 

For example, `temperature` is moved out from the runner constructor for it's not tied to the runner instance but instead should be adjustable every time we call `generate()`.

Similarly we put `echo` and `warming` into the config.

We also allow both `seq_len` and `max_new_tokens` to be passed through the config and we determine the value of `max_new_tokens` based on these 2 config values, pte file metadata as well as the number of prompt tokens.

Reviewed By: iseeyuan

Differential Revision: D73091676
facebook-github-bot pushed a commit that referenced this issue Apr 17, 2025
Summary:

Started to implement #9341
Started to fix #8495

This PR introduces `GenerationConfig` which contains the configs that can be changed across different invocations of `generate()`. 

For example, `temperature` is moved out from the runner constructor for it's not tied to the runner instance but instead should be adjustable every time we call `generate()`.

Similarly we put `echo` and `warming` into the config.

We also allow both `seq_len` and `max_new_tokens` to be passed through the config and we determine the value of `max_new_tokens` based on these 2 config values, pte file metadata as well as the number of prompt tokens.

Reviewed By: iseeyuan

Differential Revision: D73091676
facebook-github-bot pushed a commit that referenced this issue Apr 17, 2025
Summary:

Started to implement #9341
Started to fix #8495

This PR introduces `GenerationConfig` which contains the configs that can be changed across different invocations of `generate()`. 

For example, `temperature` is moved out from the runner constructor for it's not tied to the runner instance but instead should be adjustable every time we call `generate()`.

Similarly we put `echo` and `warming` into the config.

We also allow both `seq_len` and `max_new_tokens` to be passed through the config and we determine the value of `max_new_tokens` based on these 2 config values, pte file metadata as well as the number of prompt tokens.

Reviewed By: iseeyuan

Differential Revision: D73091676
larryliu0820 added a commit that referenced this issue Apr 17, 2025
Summary:

Started to implement #9341
Started to fix #8495

This PR introduces `GenerationConfig` which contains the configs that can be changed across different invocations of `generate()`. 

For example, `temperature` is moved out from the runner constructor for it's not tied to the runner instance but instead should be adjustable every time we call `generate()`.

Similarly we put `echo` and `warming` into the config.

We also allow both `seq_len` and `max_new_tokens` to be passed through the config and we determine the value of `max_new_tokens` based on these 2 config values, pte file metadata as well as the number of prompt tokens.

Reviewed By: iseeyuan

Differential Revision: D73091676
larryliu0820 added a commit that referenced this issue Apr 17, 2025
Summary:
Pull Request resolved: #10228

Started to implement #9341
Started to fix #8495

This PR introduces `GenerationConfig` which contains the configs that can be changed across different invocations of `generate()`.

For example, `temperature` is moved out from the runner constructor for it's not tied to the runner instance but instead should be adjustable every time we call `generate()`.

Similarly we put `echo` and `warming` into the config.

We also allow both `seq_len` and `max_new_tokens` to be passed through the config and we determine the value of `max_new_tokens` based on these 2 config values, pte file metadata as well as the number of prompt tokens.

Reviewed By: iseeyuan

Differential Revision: D73091676
larryliu0820 added a commit that referenced this issue Apr 17, 2025
Summary:

Started to implement #9341
Started to fix #8495

This PR introduces `GenerationConfig` which contains the configs that can be changed across different invocations of `generate()`. 

For example, `temperature` is moved out from the runner constructor for it's not tied to the runner instance but instead should be adjustable every time we call `generate()`.

Similarly we put `echo` and `warming` into the config.

We also allow both `seq_len` and `max_new_tokens` to be passed through the config and we determine the value of `max_new_tokens` based on these 2 config values, pte file metadata as well as the number of prompt tokens.

Reviewed By: iseeyuan

Differential Revision: D73091676
larryliu0820 added a commit that referenced this issue Apr 17, 2025
Summary:
Pull Request resolved: #10228

Started to implement #9341
Started to fix #8495

This PR introduces `GenerationConfig` which contains the configs that can be changed across different invocations of `generate()`.

For example, `temperature` is moved out from the runner constructor for it's not tied to the runner instance but instead should be adjustable every time we call `generate()`.

Similarly we put `echo` and `warming` into the config.

We also allow both `seq_len` and `max_new_tokens` to be passed through the config and we determine the value of `max_new_tokens` based on these 2 config values, pte file metadata as well as the number of prompt tokens.

Reviewed By: iseeyuan

Differential Revision: D73091676
facebook-github-bot pushed a commit that referenced this issue Apr 17, 2025
Summary:

Started to implement #9341
Started to fix #8495

This PR introduces `GenerationConfig` which contains the configs that can be changed across different invocations of `generate()`. 

For example, `temperature` is moved out from the runner constructor for it's not tied to the runner instance but instead should be adjustable every time we call `generate()`.

Similarly we put `echo` and `warming` into the config.

We also allow both `seq_len` and `max_new_tokens` to be passed through the config and we determine the value of `max_new_tokens` based on these 2 config values, pte file metadata as well as the number of prompt tokens.

Reviewed By: iseeyuan

Differential Revision: D73091676
facebook-github-bot pushed a commit that referenced this issue Apr 17, 2025
Summary:

Started to implement #9341
Started to fix #8495

This PR introduces `GenerationConfig` which contains the configs that can be changed across different invocations of `generate()`. 

For example, `temperature` is moved out from the runner constructor for it's not tied to the runner instance but instead should be adjustable every time we call `generate()`.

Similarly we put `echo` and `warming` into the config.

We also allow both `seq_len` and `max_new_tokens` to be passed through the config and we determine the value of `max_new_tokens` based on these 2 config values, pte file metadata as well as the number of prompt tokens.

Reviewed By: iseeyuan

Differential Revision: D73091676
facebook-github-bot pushed a commit that referenced this issue Apr 17, 2025
Summary:

Started to implement #9341
Started to fix #8495

This PR introduces `GenerationConfig` which contains the configs that can be changed across different invocations of `generate()`. 

For example, `temperature` is moved out from the runner constructor for it's not tied to the runner instance but instead should be adjustable every time we call `generate()`.

Similarly we put `echo` and `warming` into the config.

We also allow both `seq_len` and `max_new_tokens` to be passed through the config and we determine the value of `max_new_tokens` based on these 2 config values, pte file metadata as well as the number of prompt tokens.

Reviewed By: iseeyuan

Differential Revision: D73091676
@github-project-automation github-project-automation bot moved this from Ready to Done in ExecuTorch Core Apr 18, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
module: llm Issues related to LLM examples and apps, and to the extensions/llm/ code triaged This issue has been looked at a team member, and triaged and prioritized into an appropriate module
Projects
Status: Done
Status: Done
Development

Successfully merging a pull request may close this issue.

2 participants