Replies: 1 comment
-
Issue #71 is looking into that.
If I'm getting it right, your suggestion is to reinitialize the model and use the last set of generated tokens as a prompt to resume the interaction. This approach might work, but it would consume a lot of computation time.
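To make the suggestion above concrete, here is a minimal sketch of the "reinitialize and re-prompt" idea. Everything here is hypothetical illustration, not this project's API: `resume_with_tail` and the `keep_ratio` parameter are made up for the example; a real implementation would feed the returned tokens back through a fresh model instance.

```python
# Hypothetical sketch: after resetting the model, keep only the most recent
# tokens of the conversation and use them as the new prompt.
# `resume_with_tail` and `keep_ratio` are illustrative names, not a real API.

def resume_with_tail(history, ctx_len, keep_ratio=0.5):
    """Return the tail of the token history to re-use as the new prompt."""
    keep = int(ctx_len * keep_ratio)
    return history[-keep:]

# Example: with a 2048-token context, keep the last 1024 tokens.
tail = resume_with_tail(list(range(3000)), ctx_len=2048)
```

The cost this thread is worried about comes from re-evaluating those kept tokens from scratch: the model has no stored hidden states for them after a reset, so the whole tail must go through a full forward pass before generation can resume.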
-
When the bot interaction reaches the context length, the program exits.
As far as I understand how the transformer architecture is used here, it stores the hidden states of the transformer as new tokens come in. So when the context length is reached, one would need to discard some tokens at the beginning and restart the inference. Can we implement something like this?
I understand this would mean a noticeable delay, since we would need to re-run inference over a good portion of the chat. Could we maybe start this inference in the background while the user is still chatting within the allowed context length?
If not, could we at least start the new inference at the point where user input is requested, so as to hide at least some of the delay from the user?
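The overlap idea in the last two paragraphs can be sketched with a worker thread: kick off the re-evaluation of the truncated context while the program is waiting for user input, and only join the worker when the next reply has to be generated. This is an illustrative sketch, assuming a stand-in `evaluate` callable in place of the real forward pass; none of these names come from the project.

```python
import threading

# Hypothetical sketch: hide (part of) the re-inference delay by running it
# concurrently with user input. `evaluate` stands in for a full forward pass
# that rebuilds the hidden states for the truncated token window.

def start_background_reinfer(tokens, evaluate):
    """Start re-evaluating the truncated context in a background thread."""
    worker = threading.Thread(target=evaluate, args=(tokens,))
    worker.start()
    return worker

# Simulate the forward pass by recording which tokens were evaluated.
evaluated = []
worker = start_background_reinfer([1, 2, 3], evaluated.extend)
worker.join()  # in practice: join only when the next reply must be generated
```

Whether this actually hides the delay depends on how long the user takes to type versus how long the re-evaluation runs; in the worst case the join still blocks, which matches the fallback in the last paragraph above.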