fix(bash): correctly highlight doctags in comments again (#4234) #4239

Open: wants to merge 1 commit into base: main

1 change: 1 addition & 0 deletions CHANGES.md
@@ -6,6 +6,7 @@ New Grammars:

 Core Grammars:

+- fix(bash) resolve regression in comment doctag highlighting [wolfgang42][]
 - enh(csp) add missing directives / keywords from MDN (7 more) [Max Liashuk][]
 - enh(ada) add new `parallel` keyword, allow `[]` for Ada 2022 [Max Reznik][]

13 changes: 1 addition & 12 deletions src/languages/bash.js
@@ -38,18 +38,7 @@ export default function(hljs) {
     end: /\)/,
     contains: [ hljs.BACKSLASH_ESCAPE ]
   };
-  const COMMENT = hljs.inherit(
-    hljs.COMMENT(),
-    {
-      match: [
-        /(^|\s)/,
-        /#.*$/
-      ],
-      scope: {
-        2: 'comment'
-      }
-    }
-  );
+  const COMMENT = hljs.COMMENT(/(?<=^|\s)#/, /$/);
Member

We can't use look-behind until v12 because it's a breaking change in older Safari.

How did we do this previously? I also think we do not want any spacing in front of the comment to be scoped as part of the comment.
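
The constraint exists because a regex literal containing a look-behind is a syntax error in engines that lack support, so the whole grammar file would fail to load there. As a minimal sketch, support can be probed at runtime by building the pattern from a string; `supportsLookbehind` is a hypothetical helper name used for illustration and is not part of highlight.js:

```javascript
// Hypothetical helper (not part of highlight.js): reports whether this
// engine can compile a regex that contains a look-behind assertion.
function supportsLookbehind() {
  try {
    // Build the pattern from a string so that an unsupported engine throws
    // a catchable SyntaxError here instead of failing to parse the file.
    new RegExp('(?<=a)b');
    return true;
  } catch (err) {
    return false;
  }
}

console.log(supportsLookbehind());
```

A probe like this only helps code that constructs its patterns dynamically; a grammar written with regex literals still cannot be parsed by an engine without look-behind support.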

Contributor Author

> We can't use look-behind until v12 because it's a breaking change in older Safari.

Yeah, I thought as much. This is not a particularly pressing bug for me so I’d be fine waiting for whenever that happens, unless we can find another way to implement this first.

> How did we do this previously?

Prior to d78749a it was just using hljs.HASH_COMMENT_MODE directly, but that doesn’t handle the whitespace requirement.

> I also think we do not want any spacing in front of the comment to be scoped as part of the comment.

Agreed; if it were OK to scope it as part of the comment, we wouldn’t need the lookbehind and the regex could just be `/(^|\s)#/`. There are also some existing test cases which cover this.
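
The difference between the two patterns comes down to what the match consumes. A small sketch, using a hypothetical `firstMatch` helper that is not part of highlight.js, shows where each pattern starts matching on one of the test lines:

```javascript
// Hypothetical helper: reports where `pattern` first matches in `line`
// and which text the match consumes.
function firstMatch(pattern, line) {
  const m = pattern.exec(line);
  return m ? { index: m.index, text: m[0] } : null;
}

const sample = 'echo asdf #qwert yuiop';

// Look-behind asserts the preceding whitespace without consuming it.
const viaLookbehind = firstMatch(/(?<=^|\s)#/, sample);
// A capture group consumes the whitespace as part of the match.
const viaGroup = firstMatch(/(^|\s)#/, sample);

console.log(viaLookbehind.index, JSON.stringify(viaLookbehind.text)); // 10 "#"
console.log(viaGroup.index, JSON.stringify(viaGroup.text));           // 9 " #"
```

With the capture-group form the match begins one character earlier and includes the space, which is why the space would end up inside the comment scope unless it is handled separately.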

Member

What was breaking if we just matched it as /#/?

Contributor Author

See #3918 and associated tests:

<span class="hljs-built_in">echo</span> asdf#qwert yuiop
<span class="hljs-built_in">echo</span> asdf <span class="hljs-comment">#qwert yuiop</span>

For example, echo foo#bar will print foo#bar; the #bar part is not a comment.
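
To make the breakage concrete, here is a rough sketch that runs outside highlight.js. `commentText` is a hypothetical helper that applies a `#`-to-end-of-line rule with or without the leading-whitespace requirement; it ignores quoting and is only meant to mirror the two test lines:

```javascript
// Hypothetical helper: returns the text that a '#'-to-end-of-line rule
// would treat as a comment, or null when the rule finds no comment.
function commentText(line, requireLeadingSpace) {
  const pattern = requireLeadingSpace ? /(?:^|\s)(#.*)$/ : /(#.*)$/;
  const m = pattern.exec(line);
  return m ? m[1] : null;
}

console.log(commentText('echo foo#bar', false)); // #bar (wrongly treated as a comment)
console.log(commentText('echo foo#bar', true));  // null
console.log(commentText('echo foo #bar', true)); // #bar
```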

Presumably the author of that change had a specific scenario in mind, but when I went looking for real-world examples in the Oil Shell test corpus, I instead found a case where that change has broken the highlighting:

		branch_name=HEAD ;# detached

...which ought to be highlighted as a comment (since it's not inside a command), even though it doesn't have whitespace before it.

So, while that change was well-meaning, on closer inspection I think a very careful reading of POSIX §2.3 would be needed to figure out what the actual requirements are here. At a glance, I see that at least `|`, `;`, `<`, `>`, and `&` can be immediately followed by a comment. Sometimes `(` and `)` can be too, but (I think) only when indicating a standalone subshell. I haven’t looked at it closely, but I think in the general case you may need a full-on stateful parser to determine exactly where in the grammar you are and what should be done with a `#` character in that state.

Given that, this is maybe a question of determining which kinds of brokenness are most acceptable? I certainly don't feel like wrangling the grammar to make sure that every edge case is covered correctly...
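
One way to picture the trade-off is a character-level heuristic rather than a full parser. The sketch below is illustrative only: `COMMENT_PRECEDERS` and `findCommentStart` are hypothetical names, the character list is an assumption taken from the operators mentioned in this thread rather than a verified reading of POSIX, and quoting, escapes, and parentheses are deliberately ignored.

```javascript
// Characters after which this sketch allows a comment to start. This list
// is an assumption based on the discussion, not a verified reading of POSIX.
const COMMENT_PRECEDERS = new Set([' ', '\t', '|', ';', '<', '>', '&']);

// Hypothetical helper: index of the first '#' that starts a comment under
// this heuristic, or -1 if there is none. Quoting and escapes are ignored.
function findCommentStart(line) {
  for (let i = 0; i < line.length; i++) {
    if (line[i] !== '#') continue;
    // A '#' counts only at the start of the line or after an allowed character.
    if (i === 0 || COMMENT_PRECEDERS.has(line[i - 1])) return i;
  }
  return -1;
}

console.log(findCommentStart('echo foo#bar'));                 // -1
console.log(findCommentStart('branch_name=HEAD ;# detached')); // 18
console.log(findCommentStart('echo asdf #qwert yuiop'));       // 10
```

A heuristic like this handles the `;#` case without look-behind, but it will still mislabel a `#` inside a quoted string, which is the kind of brokenness a stateful parser would be needed to avoid.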

Contributor Author

(Also cc'ing @iFreilicht as author of that PR in case they have any more insight into their use case)

Contributor

iFreilicht Apr 23, 2025

@wolfgang42 Thank you! The important part for me is that a command like `nix run nixpkgs#btop` should be correctly highlighted, i.e. `#btop` is not a comment in this case, and `nixpkgs#btop` is a single token. POSIX-compatible shells require a comment to be at the beginning of a line or be preceded by whitespace, or at least I thought that was the whole of it at the time.

   const HERE_DOC = {
     begin: /<<-?\s*(?=\w+)/,
     starts: { contains: [
2 changes: 2 additions & 0 deletions test/markup/bash/not-comments.expect.txt
@@ -1,3 +1,5 @@
 <span class="hljs-built_in">echo</span> asdf#qwert yuiop
 
 <span class="hljs-built_in">echo</span> asdf <span class="hljs-comment">#qwert yuiop</span>
+
+<span class="hljs-comment"># <span class="hljs-doctag">TODO:</span> this *is* a comment</span>
2 changes: 2 additions & 0 deletions test/markup/bash/not-comments.txt
@@ -1,3 +1,5 @@
 echo asdf#qwert yuiop
 
 echo asdf #qwert yuiop
+
+# TODO: this *is* a comment