server : use std::move whenever possible #12936
Conversation
examples/server/server.cpp
Outdated
@@ -2105,7 +2105,7 @@ struct server_context {
        return true;
    }

-   bool launch_slot_with_task(server_slot & slot, const server_task & task) {
+   bool launch_slot_with_task(server_slot & slot, const server_task && task) {
Suggested change:
-   bool launch_slot_with_task(server_slot & slot, const server_task && task) {
+   bool launch_slot_with_task(server_slot & slot, server_task && task) {
We also move the params and prompt_tokens from the task to the slot, so I think this is currently not possible.
What is not possible? With the const present, the moves of params and prompt_tokens are actually copies. Removing the const from the argument will make them proper moves.
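To illustrate the point, here is a minimal sketch (the struct and member names are simplified stand-ins, not the actual server types): std::move applied to a member of a const object yields a const rvalue, and overload resolution sends that to the copy assignment instead of the move assignment, so the "move" silently degrades into a copy.

#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-ins for server_task / server_slot.
struct Task {
    std::vector<int> prompt_tokens;
    std::string      params;
};

struct Slot {
    std::vector<int> prompt_tokens;
    std::string      params;

    // With a const rvalue reference, task.prompt_tokens is const, so
    // std::move(task.prompt_tokens) is a const rvalue that binds to the
    // copy assignment operator: the token buffer is copied.
    void launch_with_const(const Task && task) {
        prompt_tokens = std::move(task.prompt_tokens); // silently copies
        params        = std::move(task.params);        // silently copies
    }

    // Without const, the same expressions pick the move assignment
    // operators and steal the buffers from the task.
    void launch_without_const(Task && task) {
        prompt_tokens = std::move(task.prompt_tokens); // really moves
        params        = std::move(task.params);        // really moves
    }
};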
The problem is that the prompt_tokens will soon contain image tokens as unique_ptr, so it will require a proper move.
And even without images (the current case, where we only have text tokens), I think a proper move still makes sense here. This function is the last place where the task wants to go (i.e. we expect the task to be destroyed here).
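As a hedged sketch of why a move-only member forces this (the type names here are placeholders, not the actual multimodal types): once the task holds a std::unique_ptr, a const parameter cannot be consumed at all, because unique_ptr has no copy constructor to fall back on, so the signature has to take a non-const rvalue reference.

#include <memory>
#include <utility>
#include <vector>

struct ImageChunk {};                       // placeholder for future image data

struct TaskWithImages {
    std::vector<int>            text_tokens;
    std::unique_ptr<ImageChunk> image;      // move-only member
};

struct Slot {
    TaskWithImages prompt;

    // OK: the unique_ptr is transferred from the task to the slot.
    void launch(TaskWithImages && task) {
        prompt = std::move(task);
    }

    // Would not compile: a const task cannot be moved from, and the
    // implicit copy assignment is deleted because of the unique_ptr.
    // void launch(const TaskWithImages && task) {
    //     prompt = std::move(task);        // error: deleted copy assignment
    // }
};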
Ah, looking back at this now, I realize I was thinking of it the other way around (I thought you wanted to add the const); sorry for the confusion 🥲
examples/server/server.cpp
Outdated
        server_task task(SERVER_TASK_TYPE_METRICS);
        task.id = ctx_server.queue_tasks.get_new_id();
        ctx_server.queue_results.add_waiting_task_id(task.id);
-       ctx_server.queue_tasks.post(task, true); // high-priority task
+       ctx_server.queue_tasks.post(std::move(task), true); // high-priority task

        // get the result
        server_task_result_ptr result = ctx_server.queue_results.recv(task.id);
The task should not be referenced after being moved. We can make this safer like this:
const auto id = ctx_server.queue_tasks.get_new_id();
{
server_task task(SERVER_TASK_TYPE_METRICS);
task.id = id;
ctx_server.queue_results.add_waiting_task_id(task.id);
ctx_server.queue_tasks.post(std::move(task), true); // high-priority task
}
Thanks, that's a good idea. I also spotted some places where task is used after the move. It should be fixed in my last commit: 9487165
Co-authored-by: Georgi Gerganov <[email protected]>
examples/server/server.cpp
Outdated
            queue_tasks.push_front(task);
        } else {
-           queue_tasks.push_back(std::move(task));
+           queue_tasks.push_back(task);
Without these moves, the task will be copied here. Better to keep them.
Thanks, it seems I mistakenly removed these moves. Added them back in 240ea24
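For reference, a minimal sketch of the difference (a simplified stand-in, not the real server_queue): passing the task as an lvalue makes push_back copy it, while std::move lets the deque steal its buffers; it also shows why the final patch caches task.id before the move instead of reading it afterwards.

#include <deque>
#include <utility>
#include <vector>

struct MiniTask {
    int              id = 0;
    std::vector<int> prompt_tokens;
};

static std::deque<MiniTask> queue_tasks;

int post(MiniTask task, bool front) {
    const int task_id = task.id;                 // cache the id before the move
    if (front) {
        queue_tasks.push_front(std::move(task)); // moves the token buffer
    } else {
        queue_tasks.push_back(std::move(task));  // moves as well
        // queue_tasks.push_back(task);          // would copy prompt_tokens
    }
    return task_id;                              // do not read task.id after the move
}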
A few updates and it's good to merge:
diff --git a/examples/server/server.cpp b/examples/server/server.cpp
index 528d4e6a..c580ec12 100644
--- a/examples/server/server.cpp
+++ b/examples/server/server.cpp
@@ -1563,14 +1563,15 @@ struct server_queue {
if (task.type == SERVER_TASK_TYPE_CANCEL) {
cleanup_pending_task(task.id_target);
}
- QUE_DBG("new task, id = %d, front = %d\n", task.id, front);
+ const int task_id = task.id;
+ QUE_DBG("new task, id = %d, front = %d\n", task_id, front);
if (front) {
queue_tasks.push_front(std::move(task));
} else {
queue_tasks.push_back(std::move(task));
}
condition_tasks.notify_one();
- return task.id;
+ return task_id;
}
// multi-task version of post()
@@ -2105,7 +2106,7 @@ struct server_context {
return true;
}
- bool launch_slot_with_task(server_slot & slot, const server_task && task) {
+ bool launch_slot_with_task(server_slot & slot, server_task && task) {
slot.reset();
slot.id_task = task.id;
slot.index = task.index;
@@ -2113,10 +2114,10 @@ struct server_context {
slot.params = std::move(task.params);
slot.prompt_tokens = std::move(task.prompt_tokens);
- if (!are_lora_equal(task.params.lora, slot.lora)) {
+ if (!are_lora_equal(slot.params.lora, slot.lora)) {
// if lora is changed, we cannot reuse cached tokens
slot.cache_tokens.clear();
- slot.lora = task.params.lora;
+ slot.lora = slot.params.lora;
}
bool can_detokenize = can_be_detokenized(ctx, slot.prompt_tokens);
@@ -3952,44 +3953,42 @@ int main(int argc, char ** argv) {
auto completion_id = gen_chatcmplid();
std::unordered_set<int> task_ids;
- {
+ try {
std::vector<server_task> tasks;
- try {
- const auto & prompt = data.at("prompt");
- // TODO: this log can become very long, put it behind a flag or think about a more compact format
- //SRV_DBG("Prompt: %s\n", prompt.is_string() ? prompt.get<std::string>().c_str() : prompt.dump(2).c_str());
-
- std::vector<llama_tokens> tokenized_prompts = tokenize_input_prompts(ctx_server.vocab, prompt, true, true);
- tasks.reserve(tokenized_prompts.size());
- for (size_t i = 0; i < tokenized_prompts.size(); i++) {
- server_task task = server_task(type);
-
- task.id = ctx_server.queue_tasks.get_new_id();
- task.index = i;
-
- task.prompt_tokens = std::move(tokenized_prompts[i]);
- task.params = server_task::params_from_json_cmpl(
- ctx_server.ctx,
- ctx_server.params_base,
- data);
- task.id_selected_slot = json_value(data, "id_slot", -1);
-
- // OAI-compat
- task.params.oaicompat = oaicompat;
- task.params.oaicompat_cmpl_id = completion_id;
- // oaicompat_model is already populated by params_from_json_cmpl
-
- tasks.push_back(std::move(task));
- }
- } catch (const std::exception & e) {
- res_error(res, format_error_response(e.what(), ERROR_TYPE_INVALID_REQUEST));
- return;
+ const auto & prompt = data.at("prompt");
+ // TODO: this log can become very long, put it behind a flag or think about a more compact format
+ //SRV_DBG("Prompt: %s\n", prompt.is_string() ? prompt.get<std::string>().c_str() : prompt.dump(2).c_str());
+
+ std::vector<llama_tokens> tokenized_prompts = tokenize_input_prompts(ctx_server.vocab, prompt, true, true);
+ tasks.reserve(tokenized_prompts.size());
+ for (size_t i = 0; i < tokenized_prompts.size(); i++) {
+ server_task task = server_task(type);
+
+ task.id = ctx_server.queue_tasks.get_new_id();
+ task.index = i;
+
+ task.prompt_tokens = std::move(tokenized_prompts[i]);
+ task.params = server_task::params_from_json_cmpl(
+ ctx_server.ctx,
+ ctx_server.params_base,
+ data);
+ task.id_selected_slot = json_value(data, "id_slot", -1);
+
+ // OAI-compat
+ task.params.oaicompat = oaicompat;
+ task.params.oaicompat_cmpl_id = completion_id;
+ // oaicompat_model is already populated by params_from_json_cmpl
+
+ tasks.push_back(std::move(task));
}
task_ids = server_task::get_list_id(tasks);
ctx_server.queue_results.add_waiting_tasks(tasks);
ctx_server.queue_tasks.post(std::move(tasks));
+ } catch (const std::exception & e) {
+ res_error(res, format_error_response(e.what(), ERROR_TYPE_INVALID_REQUEST));
+ return;
}
bool stream = json_value(data, "stream", false);
Co-authored-by: ggerganov <[email protected]>
Thanks @ggerganov for helping! I underestimated this change. Applied your suggestions in my last commit 👍
* server : use std::move whenever possible
* use r-value ref
* Apply suggestions from code review
Co-authored-by: Georgi Gerganov <[email protected]>
* make task creation scoped
* restore std::move
* fix task_id not set correctly
* apply changes from suggestion
Co-authored-by: ggerganov <[email protected]>
---------
Co-authored-by: Georgi Gerganov <[email protected]>
Ref comment: #12898 (comment)
This PR was made by simply adding std::unique_ptr<int> dummy to the struct that we want to investigate, then removing that unique_ptr after all the std::move calls were added.
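A hedged sketch of that trick (the struct name and members below are illustrative, not the real server_task): temporarily adding a move-only member such as std::unique_ptr<int> implicitly deletes the struct's copy constructor and copy assignment, so every call site that still copies instead of moving turns into a compile error; once the missing std::move calls are in place, the dummy member is removed again.

#include <memory>
#include <utility>
#include <vector>

struct probe_task {                      // illustrative stand-in for server_task
    int                  id = 0;
    std::vector<int>     prompt_tokens;
    std::unique_ptr<int> dummy;          // temporary: makes the type move-only
};

int main() {
    std::vector<probe_task> tasks;

    probe_task task;
    task.id = 1;

    // tasks.push_back(task);            // compile error while dummy is present:
                                         // the copy constructor is deleted
    tasks.push_back(std::move(task));    // only moves compile, copies are flagged
    return 0;
}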