-
Notifications
You must be signed in to change notification settings - Fork 3k
feat(lmstudio): add generateObject override for OpenAI-compatible APIs #966
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
feat(lmstudio): add generateObject override for OpenAI-compatible APIs #966
Conversation
LM Studio and most OpenAI-compatible APIs don't support OpenAI's structured output feature (zodResponseFormat / chat.completions.parse). This adds a generateObject override that: - Uses standard chat.completions.create() with response_format JSON mode - Includes the JSON schema in the prompt for guidance - Integrates repairJson for robust parsing - Falls back to prompt-only JSON instruction for models without JSON mode This enables features like follow-up suggestions to work with LM Studio, vLLM, LocalAI, and other OpenAI-compatible inference servers. Contributed by The Noble House™ AI Lab (https://thenoblehouse.ai)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
2 issues found across 1 file
Prompt for AI agents (all issues)
Check if these issues are valid — if so, understand the root cause of each and fix them.
<file name="src/lib/models/providers/lmstudio/lmstudioLLM.ts">
<violation number="1" location="src/lib/models/providers/lmstudio/lmstudioLLM.ts:33">
P2: Accessing `input.messages[input.messages.length - 1]` will crash if `messages` array is empty. Consider adding a guard clause at the start of the method.</violation>
<violation number="2" location="src/lib/models/providers/lmstudio/lmstudioLLM.ts:75">
P2: Fallback condition `err.status === 400` is too broad. A 400 error can occur for many reasons (invalid model, malformed request, etc.), not just unsupported JSON mode. Consider checking for more specific error messages or codes to avoid masking unrelated errors.</violation>
</file>
Since this is your first cubic review, here's how it works:
- cubic automatically reviews your code and comments on bugs and improvements
- Teach cubic by replying to its comments. cubic learns from your replies and gets better over time
- Ask questions if you need clarification on any suggestion
Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.
| throw new Error('No response from LM Studio'); | ||
| } catch (err: any) { | ||
| // If JSON mode isn't supported, try without it | ||
| if (err.message?.includes('response_format') || err.status === 400) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
P2: Fallback condition err.status === 400 is too broad. A 400 error can occur for many reasons (invalid model, malformed request, etc.), not just unsupported JSON mode. Consider checking for more specific error messages or codes to avoid masking unrelated errors.
Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At src/lib/models/providers/lmstudio/lmstudioLLM.ts, line 75:
<comment>Fallback condition `err.status === 400` is too broad. A 400 error can occur for many reasons (invalid model, malformed request, etc.), not just unsupported JSON mode. Consider checking for more specific error messages or codes to avoid masking unrelated errors.</comment>
<file context>
@@ -1,5 +1,139 @@
+ throw new Error('No response from LM Studio');
+ } catch (err: any) {
+ // If JSON mode isn't supported, try without it
+ if (err.message?.includes('response_format') || err.status === 400) {
+ return this.generateObjectWithoutJsonMode<T>(input, jsonSchema);
+ }
</file context>
| const messagesWithJsonInstruction = [ | ||
| ...input.messages.slice(0, -1), // All messages except the last | ||
| { | ||
| ...input.messages[input.messages.length - 1], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
P2: Accessing input.messages[input.messages.length - 1] will crash if messages array is empty. Consider adding a guard clause at the start of the method.
Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At src/lib/models/providers/lmstudio/lmstudioLLM.ts, line 33:
<comment>Accessing `input.messages[input.messages.length - 1]` will crash if `messages` array is empty. Consider adding a guard clause at the start of the method.</comment>
<file context>
@@ -1,5 +1,139 @@
+ const messagesWithJsonInstruction = [
+ ...input.messages.slice(0, -1), // All messages except the last
+ {
+ ...input.messages[input.messages.length - 1],
+ content: `${input.messages[input.messages.length - 1].content}\n\nRespond with a valid JSON object matching this schema:\n${JSON.stringify(jsonSchema, null, 2)}`,
+ },
</file context>
|
@cubic-dev-ai Thank you for the thorough review! Both issues have been addressed in commit Issue 1: Empty Array Vulnerability (Line 33)Fixed with: New private validateAndGetLastMessage(
messages: GenerateObjectInput['messages'],
): GenerateObjectInput['messages'][number] {
if (!messages || messages.length === 0) {
throw new Error(
'LMStudioLLM: Cannot generate object with empty messages array. ' +
'At least one message is required.',
);
}
return messages[messages.length - 1];
}This is now called at the start of both Issue 2: Overly Broad Error Handling (Line 75)Fixed with: New private static readonly JSON_MODE_UNSUPPORTED_PATTERNS = [
'response_format',
'json_object',
'json_mode',
'structured output',
'not supported',
'invalid.*format',
] as const;
private isJsonModeUnsupportedError(err: unknown): boolean {
// Pattern matching against error messages, not status codes
// Only falls back when error specifically indicates JSON mode unsupported
}This ensures we only trigger the fallback for actual JSON mode incompatibility, not unrelated 400 errors. Additional Improvements
Please re-review when ready. 🙏 |
@teknomage8 I have started the AI code review. It will take a few minutes to complete. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
1 issue found across 1 file
Prompt for AI agents (all issues)
Check if these issues are valid — if so, understand the root cause of each and fix them.
<file name="src/lib/models/providers/lmstudio/lmstudioLLM.ts">
<violation number="1" location="src/lib/models/providers/lmstudio/lmstudioLLM.ts:33">
P1: Accessing `input.messages[input.messages.length - 1]` without checking if `messages` is empty will cause a TypeError if an empty array is passed. Consider adding a guard check or validating that messages has at least one element.</violation>
</file>
Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.
| const messagesWithJsonInstruction = [ | ||
| ...input.messages.slice(0, -1), // All messages except the last | ||
| { | ||
| ...input.messages[input.messages.length - 1], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
P1: Accessing input.messages[input.messages.length - 1] without checking if messages is empty will cause a TypeError if an empty array is passed. Consider adding a guard check or validating that messages has at least one element.
Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At src/lib/models/providers/lmstudio/lmstudioLLM.ts, line 33:
<comment>Accessing `input.messages[input.messages.length - 1]` without checking if `messages` is empty will cause a TypeError if an empty array is passed. Consider adding a guard check or validating that messages has at least one element.</comment>
<file context>
@@ -1,5 +1,139 @@
+ const messagesWithJsonInstruction = [
+ ...input.messages.slice(0, -1), // All messages except the last
+ {
+ ...input.messages[input.messages.length - 1],
+ content: `${input.messages[input.messages.length - 1].content}\n\nRespond with a valid JSON object matching this schema:\n${JSON.stringify(jsonSchema, null, 2)}`,
+ },
</file context>
Summary
This PR adds proper
generateObjectsupport for LM Studio and other OpenAI-compatible APIs (vLLM, LocalAI, etc.) that don't support OpenAI's structured output feature (zodResponseFormat/chat.completions.parse).Problem
The current
LMStudioLLMclass extendsOpenAILLMwithout any overrides. WhengenerateObjectis called, it uses the parent's implementation which relies on OpenAI's proprietary structured output API:This results in errors like:
chat.completions.parse is not a functionresponse_format.json_schema is not supportedSolution
Override
generateObjectinLMStudioLLMto use the standard chat completion endpoint with JSON mode:response_format: { type: 'json_object' }which is widely supported@toolsycc/json-repairlibrary to handle malformed JSONKey Features
repairJsonlibrary for robust parsingTesting
Tested with:
generateObject)Credits
Contributed by The Noble House™ AI Lab (https://thenoblehouse.ai)
Summary by cubic
Adds generateObject support for LM Studio and other OpenAI-compatible APIs by overriding LMStudioLLM to use JSON mode and manual parsing. Fixes structured output errors and re-enables features like suggestions and search on LM Studio, vLLM, and LocalAI.
Written for commit 7adac48. Summary will update on new commits.