Deprecate start_chat() / finish_chat() API in LLM pipeline, update samples
#3217
base: master
Conversation
Pull request overview
This PR deprecates the start_chat() and finish_chat() API methods in the LLMPipeline in favor of using ChatHistory with the generate() method. Deprecation warnings have been added to both C++ and Python bindings.
Changes:
- Added deprecation warnings to start_chat() and finish_chat() methods in C++ and Python
- Updated Python and C++ samples to use ChatHistory instead of the deprecated chat API
- Migrated structured output generation samples to the new chat history pattern
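The migration pattern the PR describes can be sketched in plain Python. `StubPipeline` below is a mock standing in for `ov.genai.LLMPipeline`; only the call shape (a list of role/content messages passed to `generate()`) reflects the new API, and the old pattern is shown in comments for contrast:

```python
class StubPipeline:
    def generate(self, history, *args, **kwargs):
        # A real LLMPipeline would run the model over the chat template;
        # this mock just returns a fixed string.
        return "stub response"

pipe = StubPipeline()

# Old (deprecated) pattern:
#   pipe.start_chat("You are a helpful assistant.")
#   response = pipe.generate("Hello!", config)
#   pipe.finish_chat()

# New pattern: the caller owns the chat history explicitly.
history = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello!"},
]
response = pipe.generate(history)
history.append({"role": "assistant", "content": response})
```

The key design shift is that conversation state moves from hidden pipeline internals into a caller-owned structure.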
Reviewed changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| src/python/py_llm_pipeline.cpp | Added deprecation warnings to Python bindings for start_chat() and finish_chat() |
| src/cpp/src/llm/pipeline.cpp | Added deprecation warnings to C++ implementation |
| src/cpp/include/openvino/genai/llm_pipeline.hpp | Added OPENVINO_DEPRECATED macro to method declarations |
| samples/python/text_generation/structured_output_generation.py | Migrated sample to use ChatHistory instead of deprecated chat methods |
| samples/python/text_generation/structural_tags_generation.py | Migrated sample to use ChatHistory instead of deprecated chat methods |
| samples/cpp/text_generation/structured_output_generation.cpp | Migrated C++ sample to use ChatHistory instead of deprecated chat methods |
```python
json_response = decoded_results.texts[0]
res = json.loads(json_response)
pipe.finish_chat()
print(f"Generated JSON with item quantities: {json_response}")
```
Suggested change (replacing the four lines above):

```suggestion
res = json.loads(decoded_results.texts[0])
print(f"Generated JSON with item quantities: {res}")
```
We need to print the model response (not the deserialized JSON), as this output is compared with the JS sample and should stay aligned.
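The difference matters because printing the parsed object goes through Python's `repr`, which does not match the raw JSON text. A small illustration (the literal below is a made-up stand-in for `decoded_results.texts[0]`):

```python
import json

# Illustrative value only; in the sample this comes from decoded_results.texts[0].
json_response = '{"banana": 2, "apple": 3}'
res = json.loads(json_response)

# Printing the raw model output keeps the exact JSON text, which the
# cross-sample comparison with the JS sample relies on:
print(f"Generated JSON with item quantities: {json_response}")

# Printing the deserialized dict uses Python repr (single quotes), so the
# output no longer matches the other samples character-for-character:
print(f"Generated JSON with item quantities: {res}")
```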
```
@@ -134,8 +137,11 @@ def main():
        )
    )
    config.do_sample = True
    response = pipe.generate(args.prompt, config, streamer=streamer)
    pipe.finish_chat()

    history.append({"role": "user", "content": args.prompt})
```
History creation can be done outside the loop now, because the system and user prompts are the same.
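The point can be sketched in plain Python, with the pipeline call stubbed out: the history is built once and reused unchanged in every iteration, while only the config varies:

```python
# Build the history once: system and user prompts do not change per iteration.
history = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Generate a JSON describing three fruits."},
]

def stub_generate(history, config_name):
    # Stand-in for pipe.generate(history, config); only the call shape matters.
    return f"output for {config_name} over {len(history)} messages"

# The loop varies only the generation config (e.g. with and without
# structured output); the same two-message history is passed each time.
outputs = [stub_generate(history, name) for name in ("plain", "structured")]
```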
Fixed
```python
history.append({"role": "user", "content": args.prompt})
decoded_results = pipe.generate(history, config, streamer=streamer)
response = decoded_results.texts[0]
history.append({"role": "assistant", "content": response})
```
Suggested change (delete this line):

```suggestion
```

We don't need to add the assistant answer to the history, just parse it for the tool calls.
Here I wanted to highlight that the user needs to add the assistant's response to the history manually (and to stay aligned with other samples), but I can remove it if not needed.
And don't we need to clear the history between loop iterations here?
We don't need to modify the history in the loop. The idea is to pass the same history with and without structured output config and compare the outputs, so no modification to the [system, prompt] history is needed.
Fixed
```cpp
 * @param system_message optional system message.
 */
OPENVINO_DEPRECATED(
    "start_chat() / finish_chat() API is deprecated and will be removed in future releases. "
```
Let's specify explicitly in which release it will be removed and create a ticket for the removal. @Wovchena should it be 2026.2?
27.0
As discussed, updated to "in the next major release"
Pull request overview
Copilot reviewed 15 out of 15 changed files in this pull request and generated 2 comments.
```javascript
console.warn(
    "DEPRECATION WARNING: startChat() / finishChat() API is deprecated and will be removed in the next major release.",
    "Please, use generate() with ChatHistory argument.",
);
```
Copilot AI (Jan 23, 2026)
The deprecation message is split across two separate string arguments to console.warn(). This will output them as separate items rather than a single cohesive message. Combine them into a single string for better readability.
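The same point in Python terms: adjacent string literals fuse into a single string, mirroring the single-string form used by the C++ `GENAI_WARN` call. The function below is a hypothetical stand-in, not the real binding:

```python
import warnings

def finish_chat_stub():
    # Adjacent literals concatenate into one string, so the warning is
    # emitted as a single cohesive message rather than separate arguments.
    warnings.warn(
        "startChat() / finishChat() API is deprecated and will be removed "
        "in the next major release. "
        "Please use generate() with ChatHistory argument.",
        DeprecationWarning,
    )

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    finish_chat_stub()
```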
```javascript
console.warn(
    "DEPRECATION WARNING: startChat() / finishChat() API is deprecated and will be removed in the next major release.",
    "Please, use generate() with ChatHistory argument.",
);
```
Copilot AI (Jan 23, 2026)
The deprecation message is split across two separate string arguments to console.warn(). This will output them as separate items rather than a single cohesive message. Combine them into a single string for better readability.
```cpp
std::cout << "\n----------\n"
             "> ";
}
pipe.finish_chat();
```
Shall we show how ChatHistory can finally be used (or be useful) in a sample once it has been filled after generation?
I suppose this is not the main purpose of this sample. ChatHistory details will be covered in docs and other samples.
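For reference, what such a follow-up could look like: a hedged sketch of continuing a multi-turn conversation from a filled history. The model call is mocked (`fake_generate`); a real sample would call `pipe.generate(history, config)`:

```python
def fake_generate(history):
    # Mock model: reports how many user turns it has seen.
    # A real pipeline would run generation over the full history.
    user_turns = sum(1 for m in history if m["role"] == "user")
    return f"reply #{user_turns}"

history = [{"role": "system", "content": "Be terse."}]
for prompt in ("hi", "tell me more"):
    history.append({"role": "user", "content": prompt})
    reply = fake_generate(history)
    # The caller appends the assistant reply so the next turn sees it.
    history.append({"role": "assistant", "content": reply})
```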
Pull request overview
Copilot reviewed 16 out of 16 changed files in this pull request and generated 5 comments.
```cpp
"start_chat() / finish_chat() API is deprecated and will be removed in the next major release. "
"Please use generate() with ChatHistory argument.",
```
Copilot AI (Jan 26, 2026)
The deprecation message should be more consistent across all languages. In C++ and Python the message says 'Please use generate() with ChatHistory argument' but in JavaScript it says 'Please, use generate() with ChatHistory argument' (with a comma after 'Please'). Remove the comma for consistency.
```cpp
               "Please, use generate() with ChatHistory argument.");
    m_pimpl->start_chat(system_message);
}

void ov::genai::LLMPipeline::finish_chat() {
    GENAI_WARN("start_chat() / finish_chat() API is deprecated and will be removed in the next major release. "
               "Please, use generate() with ChatHistory argument.");
```
Copilot AI (Jan 26, 2026)
The deprecation message should be consistent across all languages. In C++ the message says 'Please, use generate() with ChatHistory argument' (with a comma after 'Please') but in Python it says 'Please use generate() with ChatHistory argument' (without a comma). Remove the comma for consistency with the Python version.
```suggestion
               "Please use generate() with ChatHistory argument.");
    m_pimpl->start_chat(system_message);
}

void ov::genai::LLMPipeline::finish_chat() {
    GENAI_WARN("start_chat() / finish_chat() API is deprecated and will be removed in the next major release. "
               "Please use generate() with ChatHistory argument.");
```
```cpp
               "Please, use generate() with ChatHistory argument.");
    m_pimpl->start_chat(system_message);
}

void ov::genai::LLMPipeline::finish_chat() {
    GENAI_WARN("start_chat() / finish_chat() API is deprecated and will be removed in the next major release. "
               "Please, use generate() with ChatHistory argument.");
```
Copilot AI (Jan 26, 2026)
The deprecation message should be consistent across all languages. In C++ the message says 'Please, use generate() with ChatHistory argument' (with a comma after 'Please') but in Python it says 'Please use generate() with ChatHistory argument' (without a comma). Remove the comma for consistency with the Python version.
```suggestion
               "Please use generate() with ChatHistory argument.");
    m_pimpl->start_chat(system_message);
}

void ov::genai::LLMPipeline::finish_chat() {
    GENAI_WARN("start_chat() / finish_chat() API is deprecated and will be removed in the next major release. "
               "Please use generate() with ChatHistory argument.");
```
```cpp
pipe.start_chat(sys_message);
ov::genai::ChatHistory chat_history;

chat_history.push_back({{"role", "system"}, {"content", std::move(sys_message)}});
```
Copilot AI (Jan 26, 2026)
Using std::move(sys_message) here prevents sys_message from being used later in the function. While the code works because sys_message is not used after this line, this creates a subtle maintenance issue. Consider using sys_message without std::move for clarity, as the string is relatively small and the optimization is negligible for a one-time system message.
Suggested change:

```suggestion
chat_history.push_back({{"role", "system"}, {"content", sys_message}});
```
```cpp
while (std::getline(std::cin, prompt)) {
    pipe.generate(prompt, config, streamer);
    chat_history.push_back({{"role", "user"}, {"content", std::move(prompt)}});
```
Copilot AI (Jan 26, 2026)
Using std::move(prompt) here invalidates prompt which is still needed for the loop condition std::getline(std::cin, prompt) in the next iteration. While std::getline will overwrite the string, this creates an unnecessary dependency on that behavior. Remove std::move to make the code more robust.
Suggested change:

```suggestion
chat_history.push_back({{"role", "user"}, {"content", prompt}});
```
Pull request overview
Copilot reviewed 17 out of 17 changed files in this pull request and generated 6 comments.
```typescript
async startChat(systemMessage: string = "") {
    console.warn(
        "DEPRECATION WARNING: startChat() / finishChat() API is deprecated and will be removed in the next major release.",
        "Please, use generate() with ChatHistory argument.",
```
Copilot AI (Jan 27, 2026)
Extra comma after 'Please' is inconsistent with the C++ and Python deprecation messages which use 'Please use' without a comma.
```typescript
async finishChat() {
    console.warn(
        "DEPRECATION WARNING: startChat() / finishChat() API is deprecated and will be removed in the next major release.",
        "Please, use generate() with ChatHistory argument.",
```
Copilot AI (Jan 27, 2026)
Extra comma after 'Please' is inconsistent with the C++ and Python deprecation messages which use 'Please use' without a comma.
```cpp
               "Please, use generate() with ChatHistory argument.");
    m_pimpl->start_chat(system_message);
}

void ov::genai::LLMPipeline::finish_chat() {
    GENAI_WARN("start_chat() / finish_chat() API is deprecated and will be removed in the next major release. "
               "Please, use generate() with ChatHistory argument.");
```
Copilot AI (Jan 27, 2026)
Extra comma after 'Please' is inconsistent with the Python deprecation message and standard English usage.
```suggestion
               "Please use generate() with ChatHistory argument.");
    m_pimpl->start_chat(system_message);
}

void ov::genai::LLMPipeline::finish_chat() {
    GENAI_WARN("start_chat() / finish_chat() API is deprecated and will be removed in the next major release. "
               "Please use generate() with ChatHistory argument.");
```
```cpp
               "Please, use generate() with ChatHistory argument.");
    m_pimpl->start_chat(system_message);
}

void ov::genai::LLMPipeline::finish_chat() {
    GENAI_WARN("start_chat() / finish_chat() API is deprecated and will be removed in the next major release. "
               "Please, use generate() with ChatHistory argument.");
```
Copilot AI (Jan 27, 2026)
Extra comma after 'Please' is inconsistent with the Python deprecation message and standard English usage.
```suggestion
               "Please use generate() with ChatHistory argument.");
    m_pimpl->start_chat(system_message);
}

void ov::genai::LLMPipeline::finish_chat() {
    GENAI_WARN("start_chat() / finish_chat() API is deprecated and will be removed in the next major release. "
               "Please use generate() with ChatHistory argument.");
```
```cpp
 */
OPENVINO_DEPRECATED(
    "start_chat() / finish_chat() API is deprecated and will be removed in the next major release. "
    "Please, use generate() with ChatHistory argument.")
```
Copilot AI (Jan 27, 2026)
Extra comma after 'Please' is inconsistent with the Python deprecation message and standard English usage.
```cpp
 */
OPENVINO_DEPRECATED(
    "start_chat() / finish_chat() API is deprecated and will be removed in the next major release. "
    "Please, use generate() with ChatHistory argument.")
```
Copilot AI (Jan 27, 2026)
Extra comma after 'Please' is inconsistent with the Python deprecation message and standard English usage.
Pull request overview
Copilot reviewed 17 out of 17 changed files in this pull request and generated 4 comments.
```python
json_strs = pipe.generate(prompt, config)
decoded_results = pipe.generate(history, config)
json_strs = decoded_results.texts[0]
# Validate generated JSON
```
Copilot AI (Jan 28, 2026)
This comment does not add value as the code below (json.loads(json_strs)) is self-explanatory. Consider removing the comment or making it more descriptive about what validation errors are expected or how failures are handled.
Suggested change:

```suggestion
# Validate that the model output is well-formed JSON; this will raise if parsing fails
```
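If a more explicit validation step is wanted, a small hedged sketch of the same idea (the literal below is a made-up stand-in for `decoded_results.texts[0]`):

```python
import json

json_strs = '{"apple": 3, "banana": 2}'  # stand-in for decoded_results.texts[0]

try:
    # json.loads raises json.JSONDecodeError on malformed output, so a bare
    # call already acts as the validation step the sample's comment refers to.
    parsed = json.loads(json_strs)
except json.JSONDecodeError as err:
    parsed = None
    print(f"Model produced invalid JSON: {err}")
```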
```cpp
    ov::genai::generation_config(generation_config),
    ov::genai::streamer(print_subword));

history.push_back({{"role", "user"}, {"content", std::move(prompt)}});
```
Copilot AI (Jan 28, 2026)
Using std::move on prompt here makes it unavailable for later use. While this works in the current code flow, it could lead to subtle bugs if the code is refactored. Consider whether this optimization is necessary given that prompt is reassigned in the loop anyway.
```cpp
    ov::genai::generation_config(generation_config),
    ov::genai::streamer(print_subword));

history.push_back({{"role", "user"}, {"content", std::move(prompt)}});
```
Copilot AI (Jan 28, 2026)
Using std::move on prompt here makes it unavailable for later use. While this works in the current code flow, it could lead to subtle bugs if the code is refactored. Consider whether this optimization is necessary given that prompt is reassigned in the loop anyway.
```cpp
pipe.start_chat(sys_message);
ov::genai::ChatHistory chat_history;

chat_history.push_back({{"role", "system"}, {"content", std::move(sys_message)}});
```
Copilot AI (Jan 28, 2026)
Using std::move on sys_message makes it unavailable after this line. This is problematic because if the loop iterates, sys_message would be in an undefined state. Ensure this variable is not used again, or avoid using std::move here.
Description
This PR deprecates the start_chat() / finish_chat() API in the LLM pipeline in favor of ChatHistory usage in the generate() method. It also updates the remaining samples to use ChatHistory.
CVS-170885
Checklist: