Public tokenizer errors, hasChatTemplate #171

pcuenca · 2025-01-31T15:30:34Z

See ml-explore/mlx-swift-examples#181 (comment), ml-explore/mlx-swift-examples#150

In addition to making the tokenizer errors public, this PR adds a new hasChatTemplate property. We could also create an accessor for the chat template itself if needed, but we still need to support array values that are in the process of being deprecated anyway.

Would these changes suffice for mlx-swift, @davidkoski? Do you think we'd need something like Python's tokenize=False argument, like @awni mentioned?

davidkoski · 2025-01-31T15:35:51Z

Would tokenize=False be a method that just applied the template and returned the string? The only thing I can think we need it for today is if somebody wanted to log it for debug purposes -- perhaps the llmtool would call it and log it.

Right now the only problem I know of is the caller not knowing if they can safely call the applyChatTemplate method itself, so public error and hasChatTemplate seems sufficient.

pcuenca · 2025-01-31T15:38:58Z

Yes, I think there are interesting use cases for tokenize=False, I've seen people decoding the ids as a workaround.

I'll merge this then and we can incorporate tokenize=False later. Thank you!

pcuenca added 2 commits January 31, 2025 16:19

Make tokenizer errors public, add hasChatTemplate

e1697c6

Tests, better message.

9e70ff8

pcuenca merged commit 55710dd into main Jan 31, 2025
1 check passed

pcuenca deleted the template-helpers branch January 31, 2025 15:39

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Public tokenizer errors, hasChatTemplate #171

Public tokenizer errors, hasChatTemplate #171

Uh oh!

pcuenca commented Jan 31, 2025

Uh oh!

davidkoski commented Jan 31, 2025

Uh oh!

pcuenca commented Jan 31, 2025

Uh oh!

Uh oh!

Uh oh!

Public tokenizer errors, hasChatTemplate #171

Public tokenizer errors, hasChatTemplate #171

Uh oh!

Conversation

pcuenca commented Jan 31, 2025

Uh oh!

davidkoski commented Jan 31, 2025

Uh oh!

pcuenca commented Jan 31, 2025

Uh oh!

Uh oh!

Uh oh!