
[Bug]: Incorrect Markdown breaks the chat window #913

Closed · 3 tasks done
f-leeske opened this issue Feb 13, 2025 · 4 comments
Labels: bug (Something isn't working)

f-leeske (Contributor) commented Feb 13, 2025

Your minimal.lua config

---@diagnostic disable: missing-fields

--[[
NOTE: Set the config path to enable the copilot adapter to work.
It will search the following paths for a token:
  - "$CODECOMPANION_TOKEN_PATH/github-copilot/hosts.json"
  - "$CODECOMPANION_TOKEN_PATH/github-copilot/apps.json"
--]]
vim.env["CODECOMPANION_TOKEN_PATH"] = vim.fn.expand("~/.config")

vim.env.LAZY_STDPATH = ".repro"
load(vim.fn.system("curl -s https://raw.githubusercontent.com/folke/lazy.nvim/main/bootstrap.lua"))()

-- Your CodeCompanion setup
local plugins = {
  {
    "olimorris/codecompanion.nvim",
    dependencies = {
      { "nvim-treesitter/nvim-treesitter", build = ":TSUpdate", },
      { "nvim-lua/plenary.nvim", },
      -- Test with blink.cmp
      {
        "saghen/blink.cmp",
        lazy = false,
        version = "*",
        opts = {
          keymap = {
            preset = "enter",
            ["<S-Tab>"] = { "select_prev", "fallback", },
            ["<Tab>"] = { "select_next", "fallback", },
          },
          sources = {
            default = { "lsp", "path", "buffer", "codecompanion", },
            cmdline = {}, -- Disable sources for command-line mode
          },
        },
      },
      -- Test with nvim-cmp
      -- { "hrsh7th/nvim-cmp" },
    },
    opts = {
      --Refer to: https://github.com/olimorris/codecompanion.nvim/blob/main/lua/codecompanion/config.lua
      adapters = {
        openai = function()
          return require("codecompanion.adapters").extend("openai", {
            schema = {
              model = {
                default = "gpt-4o-mini",
              },
            },
          })
        end,
      },
      strategies = {
        --NOTE: Change the adapter as required
        chat = { adapter = "openai", },
        inline = { adapter = "copilot", },
      },
      opts = {
        log_level = "DEBUG",
      },
    },
  },
}

require("lazy.minit").repro({ spec = plugins, })

-- Setup Tree-sitter
local ts_status, treesitter = pcall(require, "nvim-treesitter.configs")
if ts_status then
  treesitter.setup({
    ensure_installed = { "lua", "markdown", "markdown_inline", "yaml", },
    highlight = { enable = true, },
  })
end

-- Setup nvim-cmp
-- local cmp_status, cmp = pcall(require, "cmp")
-- if cmp_status then
--   cmp.setup({
--     mapping = cmp.mapping.preset.insert({
--       ["<C-b>"] = cmp.mapping.scroll_docs(-4),
--       ["<C-f>"] = cmp.mapping.scroll_docs(4),
--       ["<C-Space>"] = cmp.mapping.complete(),
--       ["<C-e>"] = cmp.mapping.abort(),
--       ["<CR>"] = cmp.mapping.confirm({ select = true }),
--       -- Accept currently selected item. Set `select` to `false` to only confirm explicitly selected items.
--     }),
--   })
-- end

Error messages

None

Log output

[INFO] 2025-02-14 00:15:21
Chat request started
[INFO] 2025-02-14 00:15:21
ELAPSED TIME: 18.491409
[INFO] 2025-02-14 00:15:21
Request body file: /tmp/nvim.REDACTED/ecEo3S/0.json
[DEBUG] 2025-02-14 00:15:21
Request:
{ "-sSL", "-D", "/run/user/1000/plenary_curl_add540a4.headers", "-X", "POST", "-H", "Authorization: Bearer REDACTED", "-H", "Content-Type: application/json", "-d", "@/tmp/nvim.REDACTED/ecEo3S/0.json", "--retry", "3", "--retry-delay", "1", "--keepalive-time", "60", "--connect-timeout", "10", "--tcp-nodelay", "--no-buffer", "https://api.openai.com/v1/chat/completions" }
[INFO] 2025-02-14 00:15:25
Chat request finished
[INFO] 2025-02-14 00:15:35
Chat request started
[INFO] 2025-02-14 00:15:35
ELAPSED TIME: 19.261913
[INFO] 2025-02-14 00:15:35
Request body file: /tmp/nvim.REDACTED/ecEo3S/1.json
[DEBUG] 2025-02-14 00:15:35
Request:
{ "-sSL", "-D", "/run/user/1000/plenary_curl_57d44467.headers", "-X", "POST", "-H", "Authorization: Bearer REDACTED", "-H", "Content-Type: application/json", "-d", "@/tmp/nvim.REDACTED/ecEo3S/1.json", "--retry", "3", "--retry-delay", "1", "--keepalive-time", "60", "--connect-timeout", "10", "--tcp-nodelay", "--no-buffer", "https://api.openai.com/v1/chat/completions" }
[INFO] 2025-02-14 00:15:39
Chat request finished

I had a look at /tmp/.../1.json afterwards; it looks like this (taken from another attempt, so the input/output differs slightly from the log above):

{
  "temperature": 1,
  "presence_penalty": 0,
  "messages": [
    {
      "role": "system",
      "content": "You are an AI programming assistant named \"CodeCompanion\". You are currently plugged into the Neovim text editor on a user's machine.\n\nYour core tasks include:\n- Answering general programming questions.\n- Explaining how the code in a Neovim buffer works.\n- Reviewing the selected code in a Neovim buffer.\n- Generating unit tests for the selected code.\n- Proposing fixes for problems in the selected code.\n- Scaffolding code for a new workspace.\n- Finding relevant code to the user's query.\n- Proposing fixes for test failures.\n- Answering questions about Neovim.\n- Running tools.\n\nYou must:\n- Follow the user's requirements carefully and to the letter.\n- Keep your answers short and impersonal, especially if the user's context is outside your core tasks.\n- Minimize additional prose unless clarification is needed.\n- Use Markdown formatting in your answers.\n- Include the programming language name at the start of each Markdown code block.\n- Avoid including line numbers in code blocks.\n- Avoid wrapping the whole response in triple backticks.\n- Only return code that's directly relevant to the task at hand. You may omit code that isn’t necessary for the solution.\n- Use actual line breaks in your responses; only use \"\\n\" when you want a literal backslash followed by 'n'.\n- All non-code text responses must be written in the English language indicated.\n\nWhen given a task:\n1. Think step-by-step and, unless the user requests otherwise or the task is very simple, describe your plan in detailed pseudocode.\n2. Output the final code in a single code block, ensuring that only relevant code is included.\n3. End your response with a short suggestion for the next user turn that directly supports continuing the conversation.\n4. Provide exactly one complete reply per conversation turn."
    },
    {
      "role": "user",
      "content": "Create a simple query for the obsidian datavaiew"
    },
    {
      "role": "assistant",
      "content": "To create a simple query for the Obsidian Dataview plugin, follow these steps:\n\n1. **Identify the data type**: Determine what kind of notes you want to query.\n2. **Define the query structure**: Use the Dataview query syntax.\n\nHere's a pseudocode step-by-step plan:\n\n1. Start with the `table` or `list` keyword to specify the output format.\n2. Select the fields you want to display from your notes.\n3. Specify the `from` clause to pinpoint which notes to include in the query.\n4. (Optional) Add filters using the `where` clause.\n\nExample of a simple query to display all notes with their titles and tags:\n\n```markdown\n```dataview\ntable tags\nfrom \"folder\/name\" \n```\n```\n\nThis query displays a table of notes from the specified folder along with their tags.\n\nNext, let me know if you need help with a more complex query or specific functionalities!"
    }
  ],
  "stream_options": { "include_usage": true },
  "stream": true,
  "model": "gpt-4o-mini",
  "top_p": 1,
  "frequency_penalty": 0
}

Describe the bug

When the LLM response contains incorrect markdown that prevents the H2 heading between the LLM and user messages from being parsed, any further messages from the user don't seem to make it to the LLM. I stumbled upon this when I wanted the LLM to generate some queries for the Obsidian plugin Dataview. I recreated it using the given minimal setup, with openai gpt-4o-mini:

[screenshot: chat buffer showing the response with nested triple-backtick blocks]

After the LLM's response, which includes nested triple-backtick blocks in the markdown, any request I send does get a response from the model, but the output always answers the initial request. Other adapters I tested did not wrap this specific query's answer in an outer triple-backtick block, but there will inevitably be model outputs that parse as incorrect markdown.

Deleting the problematic part in the buffer does not change anything.

Reproduce the bug

Steps to reproduce:

  1. Query the LLM such that it produces incorrect markdown and the H2 heading separating the LLM and user messages is no longer recognised (a way to verify the broken parse is sketched below)
  2. Query the LLM again
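
To verify the broken parse after step 1 (assuming Neovim 0.9+ for vim.treesitter.query.parse), you can run :InspectTree in the chat buffer, or count the heading nodes programmatically:

-- Sketch only: run in the chat buffer after the broken response.
-- If the nested backticks corrupted the markdown tree, headings after
-- the broken block no longer appear as atx_heading nodes.
local parser = vim.treesitter.get_parser(0, "markdown")
local root = parser:parse()[1]:root()
local query = vim.treesitter.query.parse("markdown", "(atx_heading) @h")
local count = 0
for _ in query:iter_captures(root, 0, 0, -1) do
  count = count + 1
end
print(count .. " ATX headings parsed")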

Adapter and model

openai and gpt-4o-mini

Final checks

  • I have made sure this issue exists in the latest version of the plugin
  • I have tested with the minimal.lua file from above and have shared this
  • I have shared the contents of the log file
f-leeske added the bug label on Feb 13, 2025

Important

If your issue does NOT contain a valid minimal.lua then this issue may be closed without a response.
Thanks for respecting my time and efforts.

Thanks @f-leeske. I'll get to this as soon as I can.

In the meantime, please ensure:

  • This is a plugin related issue and not an issue with your configuration
  • You've searched for similar issues (try the discussions too)
  • You've checked out the documentation
  • The issue title is accurate
  • There is a valid minimal.lua file included so I can try and recreate the issue


olimorris commented Feb 13, 2025

To summarize:

  1. You write a query which essentially asks for markdown to be generated
  2. The LLM messes this up
  3. You try to reply and can't

Is that correct? No error messages or anything in the logs? (It might be worth turning the log level to TRACE for everything, but beware of leaving your API keys in.)

Can you open the debug window in the chat buffer and paste the message table after you've tried to reply to the LLM?

Also, what happens if the markdown isn't incorrect? Say with a newer LLM?

I'm perplexed because this shouldn't be causing an issue. We only parse ranges in the chat buffer. After the LLM request completes, the chat buffer class is updated to know where in the buffer the response ended. That becomes the start of the range, so when you submit a user prompt, we run a TS query from there to the end of the buffer and look for H2 headings that match the user role name you defined in the config. So the chat buffer doesn't actually care what the LLM wrote; it might mess up the formatting, but it shouldn't break anything.
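
Roughly, the lookup works along these lines (a simplified sketch, not the actual implementation; start_row and the "Me" role name are placeholders):

-- Simplified sketch of the ranged H2 lookup, not the real code.
-- start_row is where the last LLM response ended; "Me" stands in for
-- whatever user role name is set in the config.
local bufnr = 0
local start_row = 0
local parser = vim.treesitter.get_parser(bufnr, "markdown")
local root = parser:parse()[1]:root()
local query = vim.treesitter.query.parse("markdown", [[
  (atx_heading (atx_h2_marker)) @h2
]])
for _, node in query:iter_captures(root, bufnr, start_row, -1) do
  local text = vim.treesitter.get_node_text(node, bufnr)
  if text:find("Me", 1, true) then
    -- everything below this heading is treated as the user's next prompt
  end
end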

Just to confirm, you've been able to recreate it with the minimal file you supplied? And you've not edited the LLM's response to remove any lines?


olimorris commented Feb 14, 2025

Can confirm I can recreate this with the following prompt:

Can you generate a python code block which is incorrect because it contains some nested triple backticks? Just respond with this, only.

Especially if it leads to:

Here is a broken markdown example:

# Title

Some text with ```nested ``` backticks``` in it.

Another paragraph with ````multiple levels```` of `````backticks````` scattered about.

```python
def code_with_nested():
    print("This has ```nested``` backticks")
    ```
    This shouldn't be here
    ```

olimorris commented:

I'm being dumb. Of course this is going to be an issue, because the markdown LanguageTree (:h LanguageTree) applies to the whole buffer. Whilst I'm restricting my TS queries to a specified range (:h LanguageTree:parse()), it still needs the whole LanguageTree in order to know where to apply the range.

Unfortunately, I think the solution to this will be via prompt engineering. Perhaps ask the LLM to ensure it closes any markdown code block and avoids triple backtick errors.
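
Something along these lines might work as a starting point (a sketch only; it assumes opts.system_prompt accepts a plain string, so check config.lua for your installed version):

-- Sketch: steer the LLM away from unbalanced fences via the system prompt.
-- Note: setting opts.system_prompt like this replaces the default prompt;
-- in practice you would append these rules to the existing one.
require("codecompanion").setup({
  opts = {
    system_prompt = table.concat({
      "Always close every markdown code fence you open.",
      "Never nest triple backticks inside a triple-backtick block;",
      "use a longer outer fence (e.g. four backticks) instead.",
    }, " "),
  },
})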

olimorris closed this as not planned on Feb 14, 2025