Deprecate start_chat() / finish_chat() API in LLM pipeline, update samples
#3217
base: master
Conversation
Pull request overview
This PR deprecates the start_chat() and finish_chat() API methods in the LLMPipeline in favor of using ChatHistory with the generate() method. Deprecation warnings have been added to both C++ and Python bindings.
Changes:
- Added deprecation warnings to start_chat() and finish_chat() methods in C++ and Python
- Updated Python and C++ samples to use ChatHistory instead of the deprecated chat API
- Migrated structured output generation samples to the new chat history pattern
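The migration pattern the PR describes can be sketched in plain Python. `StubPipeline` below is a mock standing in for `ov.genai.LLMPipeline`; only the call shape (a list of role/content messages passed to `generate()`) reflects the new API, and the old pattern is shown in comments for contrast:

```python
class StubPipeline:
    def generate(self, history, *args, **kwargs):
        # A real LLMPipeline would run the model over the chat template;
        # this mock just returns a fixed string.
        return "stub response"

pipe = StubPipeline()

# Old (deprecated) pattern:
#   pipe.start_chat("You are a helpful assistant.")
#   response = pipe.generate("Hello!", config)
#   pipe.finish_chat()

# New pattern: the caller owns the chat history explicitly.
history = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello!"},
]
response = pipe.generate(history)
history.append({"role": "assistant", "content": response})
```

The key design shift is that conversation state moves from hidden pipeline internals into a caller-owned structure.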
Reviewed changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| src/python/py_llm_pipeline.cpp | Added deprecation warnings to Python bindings for start_chat() and finish_chat() |
| src/cpp/src/llm/pipeline.cpp | Added deprecation warnings to C++ implementation |
| src/cpp/include/openvino/genai/llm_pipeline.hpp | Added OPENVINO_DEPRECATED macro to method declarations |
| samples/python/text_generation/structured_output_generation.py | Migrated sample to use ChatHistory instead of deprecated chat methods |
| samples/python/text_generation/structural_tags_generation.py | Migrated sample to use ChatHistory instead of deprecated chat methods |
| samples/cpp/text_generation/structured_output_generation.cpp | Migrated C++ sample to use ChatHistory instead of deprecated chat methods |
```python
json_response = decoded_results.texts[0]
res = json.loads(json_response)
pipe.finish_chat()
print(f"Generated JSON with item quantities: {json_response}")
```
Suggested change (replacing the four lines above):

```suggestion
res = json.loads(decoded_results.texts[0])
print(f"Generated JSON with item quantities: {res}")
```
We need to print the model response (not the deserialized JSON), as this output is compared with the JS sample and should stay aligned.
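The difference matters because printing the parsed object goes through Python's `repr`, which does not match the raw JSON text. A small illustration (the literal below is a made-up stand-in for `decoded_results.texts[0]`):

```python
import json

# Illustrative value only; in the sample this comes from decoded_results.texts[0].
json_response = '{"banana": 2, "apple": 3}'
res = json.loads(json_response)

# Printing the raw model output keeps the exact JSON text, which the
# cross-sample comparison with the JS sample relies on:
print(f"Generated JSON with item quantities: {json_response}")

# Printing the deserialized dict uses Python repr (single quotes), so the
# output no longer matches the other samples character-for-character:
print(f"Generated JSON with item quantities: {res}")
```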
```
@@ -134,8 +137,11 @@ def main():
        )
    )
    config.do_sample = True
    response = pipe.generate(args.prompt, config, streamer=streamer)
    pipe.finish_chat()

    history.append({"role": "user", "content": args.prompt})
```
History creation can be done outside the loop now, because the system and user prompts are the same.
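The point can be sketched in plain Python, with the pipeline call stubbed out: the history is built once and reused unchanged in every iteration, while only the config varies:

```python
# Build the history once: system and user prompts do not change per iteration.
history = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Generate a JSON describing three fruits."},
]

def stub_generate(history, config_name):
    # Stand-in for pipe.generate(history, config); only the call shape matters.
    return f"output for {config_name} over {len(history)} messages"

# The loop varies only the generation config (e.g. with and without
# structured output); the same two-message history is passed each time.
outputs = [stub_generate(history, name) for name in ("plain", "structured")]
```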
Fixed
```python
history.append({"role": "user", "content": args.prompt})
decoded_results = pipe.generate(history, config, streamer=streamer)
response = decoded_results.texts[0]
history.append({"role": "assistant", "content": response})
```
Suggested change (delete this line):

```suggestion
```

We don't need to add the assistant answer to the history, just parse it for the tool calls.
Here I wanted to highlight that the user needs to add the assistant's response to the history manually (and to stay aligned with other samples), but I can remove it if not needed.
And don't we need to clear the history between loop iterations here?
We don't need to modify the history in the loop. The idea is to pass the same history with and without structured output config and compare the outputs, so no modification to the [system, prompt] history is needed.
Fixed
```cpp
 * @param system_message optional system message.
 */
OPENVINO_DEPRECATED(
    "start_chat() / finish_chat() API is deprecated and will be removed in future releases. "
```
Let's specify explicitly in which release it will be removed and create a ticket for the removal. @Wovchena should it be 2026.2?
27.0
As discussed, updated to "in the next major release"
Pull request overview
Copilot reviewed 15 out of 15 changed files in this pull request and generated 2 comments.
```javascript
console.warn(
    "DEPRECATION WARNING: startChat() / finishChat() API is deprecated and will be removed in the next major release.",
    "Please, use generate() with ChatHistory argument.",
);
```
Copilot AI (Jan 23, 2026)
The deprecation message is split across two separate string arguments to console.warn(). This will output them as separate items rather than a single cohesive message. Combine them into a single string for better readability.
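The same point in Python terms: adjacent string literals fuse into a single string, mirroring the single-string form used by the C++ `GENAI_WARN` call. The function below is a hypothetical stand-in, not the real binding:

```python
import warnings

def finish_chat_stub():
    # Adjacent literals concatenate into one string, so the warning is
    # emitted as a single cohesive message rather than separate arguments.
    warnings.warn(
        "startChat() / finishChat() API is deprecated and will be removed "
        "in the next major release. "
        "Please use generate() with ChatHistory argument.",
        DeprecationWarning,
    )

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    finish_chat_stub()
```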
```javascript
console.warn(
    "DEPRECATION WARNING: startChat() / finishChat() API is deprecated and will be removed in the next major release.",
    "Please, use generate() with ChatHistory argument.",
);
```
Copilot AI (Jan 23, 2026)
The deprecation message is split across two separate string arguments to console.warn(). This will output them as separate items rather than a single cohesive message. Combine them into a single string for better readability.
```cpp
std::cout << "\n----------\n"
             "> ";
}
pipe.finish_chat();
```
Shall we show how ChatHistory can finally be used (or be useful) in a sample once it has been filled after generation?
I suppose this is not the main purpose of this sample. ChatHistory details will be covered in docs and other samples.
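For reference, what such a follow-up could look like: a hedged sketch of continuing a multi-turn conversation from a filled history. The model call is mocked (`fake_generate`); a real sample would call `pipe.generate(history, config)`:

```python
def fake_generate(history):
    # Mock model: reports how many user turns it has seen.
    # A real pipeline would run generation over the full history.
    user_turns = sum(1 for m in history if m["role"] == "user")
    return f"reply #{user_turns}"

history = [{"role": "system", "content": "Be terse."}]
for prompt in ("hi", "tell me more"):
    history.append({"role": "user", "content": prompt})
    reply = fake_generate(history)
    # The caller appends the assistant reply so the next turn sees it.
    history.append({"role": "assistant", "content": reply})
```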
Pull request overview
Copilot reviewed 16 out of 16 changed files in this pull request and generated 5 comments.
```cpp
"start_chat() / finish_chat() API is deprecated and will be removed in the next major release. "
"Please use generate() with ChatHistory argument.",
```
Copilot AI (Jan 26, 2026)
The deprecation message should be more consistent across all languages. In C++ and Python the message says 'Please use generate() with ChatHistory argument' but in JavaScript it says 'Please, use generate() with ChatHistory argument' (with a comma after 'Please'). Remove the comma for consistency.
```cpp
               "Please, use generate() with ChatHistory argument.");
    m_pimpl->start_chat(system_message);
}

void ov::genai::LLMPipeline::finish_chat() {
    GENAI_WARN("start_chat() / finish_chat() API is deprecated and will be removed in the next major release. "
               "Please, use generate() with ChatHistory argument.");
```
Copilot AI (Jan 26, 2026)
The deprecation message should be consistent across all languages. In C++ the message says 'Please, use generate() with ChatHistory argument' (with a comma after 'Please') but in Python it says 'Please use generate() with ChatHistory argument' (without a comma). Remove the comma for consistency with the Python version.
```suggestion
               "Please use generate() with ChatHistory argument.");
    m_pimpl->start_chat(system_message);
}

void ov::genai::LLMPipeline::finish_chat() {
    GENAI_WARN("start_chat() / finish_chat() API is deprecated and will be removed in the next major release. "
               "Please use generate() with ChatHistory argument.");
```
```cpp
               "Please, use generate() with ChatHistory argument.");
    m_pimpl->start_chat(system_message);
}

void ov::genai::LLMPipeline::finish_chat() {
    GENAI_WARN("start_chat() / finish_chat() API is deprecated and will be removed in the next major release. "
               "Please, use generate() with ChatHistory argument.");
```
Copilot AI (Jan 26, 2026)
The deprecation message should be consistent across all languages. In C++ the message says 'Please, use generate() with ChatHistory argument' (with a comma after 'Please') but in Python it says 'Please use generate() with ChatHistory argument' (without a comma). Remove the comma for consistency with the Python version.
```suggestion
               "Please use generate() with ChatHistory argument.");
    m_pimpl->start_chat(system_message);
}

void ov::genai::LLMPipeline::finish_chat() {
    GENAI_WARN("start_chat() / finish_chat() API is deprecated and will be removed in the next major release. "
               "Please use generate() with ChatHistory argument.");
```
```cpp
pipe.start_chat(sys_message);
ov::genai::ChatHistory chat_history;

chat_history.push_back({{"role", "system"}, {"content", std::move(sys_message)}});
```
Copilot AI (Jan 26, 2026)
Using std::move(sys_message) here prevents sys_message from being used later in the function. While the code works because sys_message is not used after this line, this creates a subtle maintenance issue. Consider using sys_message without std::move for clarity, as the string is relatively small and the optimization is negligible for a one-time system message.
Suggested change:

```suggestion
chat_history.push_back({{"role", "system"}, {"content", sys_message}});
```
```cpp
while (std::getline(std::cin, prompt)) {
    pipe.generate(prompt, config, streamer);
    chat_history.push_back({{"role", "user"}, {"content", std::move(prompt)}});
```
Copilot AI (Jan 26, 2026)
Using std::move(prompt) here invalidates prompt which is still needed for the loop condition std::getline(std::cin, prompt) in the next iteration. While std::getline will overwrite the string, this creates an unnecessary dependency on that behavior. Remove std::move to make the code more robust.
Suggested change:

```suggestion
chat_history.push_back({{"role", "user"}, {"content", prompt}});
```
Pull request overview
Copilot reviewed 17 out of 17 changed files in this pull request and generated 6 comments.
```typescript
async startChat(systemMessage: string = "") {
    console.warn(
        "DEPRECATION WARNING: startChat() / finishChat() API is deprecated and will be removed in the next major release.",
        "Please, use generate() with ChatHistory argument.",
```
Copilot AI (Jan 27, 2026)
Extra comma after 'Please' is inconsistent with the C++ and Python deprecation messages which use 'Please use' without a comma.
```typescript
async finishChat() {
    console.warn(
        "DEPRECATION WARNING: startChat() / finishChat() API is deprecated and will be removed in the next major release.",
        "Please, use generate() with ChatHistory argument.",
```
Copilot AI (Jan 27, 2026)
Extra comma after 'Please' is inconsistent with the C++ and Python deprecation messages which use 'Please use' without a comma.
```cpp
               "Please, use generate() with ChatHistory argument.");
    m_pimpl->start_chat(system_message);
}

void ov::genai::LLMPipeline::finish_chat() {
    GENAI_WARN("start_chat() / finish_chat() API is deprecated and will be removed in the next major release. "
               "Please, use generate() with ChatHistory argument.");
```
Copilot AI (Jan 27, 2026)
Extra comma after 'Please' is inconsistent with the Python deprecation message and standard English usage.
```suggestion
               "Please use generate() with ChatHistory argument.");
    m_pimpl->start_chat(system_message);
}

void ov::genai::LLMPipeline::finish_chat() {
    GENAI_WARN("start_chat() / finish_chat() API is deprecated and will be removed in the next major release. "
               "Please use generate() with ChatHistory argument.");
```
```cpp
               "Please, use generate() with ChatHistory argument.");
    m_pimpl->start_chat(system_message);
}

void ov::genai::LLMPipeline::finish_chat() {
    GENAI_WARN("start_chat() / finish_chat() API is deprecated and will be removed in the next major release. "
               "Please, use generate() with ChatHistory argument.");
```
Copilot AI (Jan 27, 2026)
Extra comma after 'Please' is inconsistent with the Python deprecation message and standard English usage.
```suggestion
               "Please use generate() with ChatHistory argument.");
    m_pimpl->start_chat(system_message);
}

void ov::genai::LLMPipeline::finish_chat() {
    GENAI_WARN("start_chat() / finish_chat() API is deprecated and will be removed in the next major release. "
               "Please use generate() with ChatHistory argument.");
```
```cpp
 */
OPENVINO_DEPRECATED(
    "start_chat() / finish_chat() API is deprecated and will be removed in the next major release. "
    "Please, use generate() with ChatHistory argument.")
```
Copilot AI (Jan 27, 2026)
Extra comma after 'Please' is inconsistent with the Python deprecation message and standard English usage.
```cpp
 */
OPENVINO_DEPRECATED(
    "start_chat() / finish_chat() API is deprecated and will be removed in the next major release. "
    "Please, use generate() with ChatHistory argument.")
```
Copilot AI (Jan 27, 2026)
Extra comma after 'Please' is inconsistent with the Python deprecation message and standard English usage.
Pull request overview
Copilot reviewed 17 out of 17 changed files in this pull request and generated 4 comments.
```python
json_strs = pipe.generate(prompt, config)
decoded_results = pipe.generate(history, config)
json_strs = decoded_results.texts[0]
# Validate generated JSON
```
Copilot AI (Jan 28, 2026)
This comment does not add value as the code below (json.loads(json_strs)) is self-explanatory. Consider removing the comment or making it more descriptive about what validation errors are expected or how failures are handled.
Suggested change:

```suggestion
# Validate that the model output is well-formed JSON; this will raise if parsing fails
```
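If a more explicit validation step is wanted, a small hedged sketch of the same idea (the literal below is a made-up stand-in for `decoded_results.texts[0]`):

```python
import json

json_strs = '{"apple": 3, "banana": 2}'  # stand-in for decoded_results.texts[0]

try:
    # json.loads raises json.JSONDecodeError on malformed output, so a bare
    # call already acts as the validation step the sample's comment refers to.
    parsed = json.loads(json_strs)
except json.JSONDecodeError as err:
    parsed = None
    print(f"Model produced invalid JSON: {err}")
```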
```cpp
    ov::genai::generation_config(generation_config),
    ov::genai::streamer(print_subword));

history.push_back({{"role", "user"}, {"content", std::move(prompt)}});
```
Copilot AI (Jan 28, 2026)
Using std::move on prompt here makes it unavailable for later use. While this works in the current code flow, it could lead to subtle bugs if the code is refactored. Consider whether this optimization is necessary given that prompt is reassigned in the loop anyway.
```cpp
    ov::genai::generation_config(generation_config),
    ov::genai::streamer(print_subword));

history.push_back({{"role", "user"}, {"content", std::move(prompt)}});
```
Copilot AI (Jan 28, 2026)
Using std::move on prompt here makes it unavailable for later use. While this works in the current code flow, it could lead to subtle bugs if the code is refactored. Consider whether this optimization is necessary given that prompt is reassigned in the loop anyway.
```cpp
pipe.start_chat(sys_message);
ov::genai::ChatHistory chat_history;

chat_history.push_back({{"role", "system"}, {"content", std::move(sys_message)}});
```
Copilot AI (Jan 28, 2026)
Using std::move on sys_message makes it unavailable after this line. This is problematic because if the loop iterates, sys_message would be in an undefined state. Ensure this variable is not used again, or avoid using std::move here.
Description
This PR deprecates the start_chat() / finish_chat() API in the LLM pipeline in favor of ChatHistory usage in the generate() method. It also updates the remaining samples to use ChatHistory.
CVS-170885
Checklist: