Font locking (syntax highlighting) doesn't work with keywords starting with a number #581

Closed
BrunoBonacci opened this issue Jan 18, 2021 · 6 comments · Fixed by #628

@BrunoBonacci
Contributor

Expected behavior

The following are both valid keywords; however, font locking doesn't work on the second one:

:foo
:1foo 

Actual behavior

Keywords starting with a number are not colorised as keywords

Steps to reproduce the problem

See the example above.

Environment & Version information

clojure-mode version

clojure-mode (version 20201126.1558)
clojure-mode (version 5.13.0-snapshot)

Emacs version

GNU Emacs 27.1

Operating system

Mac OS 11.1

@bbatsov
Member

bbatsov commented Jan 18, 2021

Probably the regexp for font-locking keywords needs to be tweaked.

@yuhan0
Contributor

yuhan0 commented Feb 27, 2021

I don't think this is a bug: such keywords are not syntactically correct, even though the reader currently accepts them. https://stackoverflow.com/a/39193032

@bbatsov
Member

bbatsov commented Feb 28, 2021

Yeah, that's technically true, but on the other hand I also know the reader is never going to be changed to reject them, as that would break backwards compatibility. That's why it probably makes sense to acknowledge this weird case as valid.

@BrunoBonacci
Contributor Author

Maybe the bug is in the clojuredocs.org description, as the official docs don't mention this constraint:
https://clojure.org/reference/data_structures#Keywords

Keywords are symbolic tokens that express a concept/value by themselves like: :blue, :green, :slow, :fast.
From this point of view it makes sense that I should be able to express things like: :30sec, :1h, :1mb, :3XL if they make sense for someone's program.

I know that there are other ways to express such values, but the point is that the language should allow you to express them like this if you so wish.

The same consideration could apply to symbols themselves. The only reason symbols have to start with a non-numeric character is to avoid ambiguity with numeric literals and their suffixes, such as 3e5, 25N, and 1.3M. Without this constraint, the parser could not tell whether such a token is a symbol or a number. Keywords have no such problem, because they always start with :.
Anyway, that's my opinion.
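
To make the ambiguity argument concrete, here is a small Python sketch. It is not Clojure's actual reader: the `NUMBER` pattern is a deliberately simplified assumption that only covers literals like 3e5, 25N, and 1.3M, and `classify` is a hypothetical helper made up for illustration.

```python
import re

# Simplified, assumed grammar for a few Clojure numeric literals:
# an integer with an optional N suffix, or a decimal/exponent form
# with an optional M suffix. The real reader accepts more forms.
NUMBER = re.compile(r"[+-]?\d+(?:N|(?:\.\d*)?(?:[eE][+-]?\d+)?M?)")


def classify(token):
    """Classify a token the way a toy reader might (hypothetical helper)."""
    if token.startswith(":"):
        # The leading colon settles the question before any digit is seen.
        return "keyword"
    if NUMBER.fullmatch(token):
        return "number"
    if token[:1].isdigit():
        # Starts like a number but is not one: this is the case the
        # "no leading digit" rule for symbols exists to rule out.
        return "invalid"
    return "symbol"


for tok in ["3e5", "25N", "1.3M", "1foo", "foo", ":1foo", ":3e5"]:
    print(tok, "->", classify(tok))
```

In this sketch a bare `1foo` cannot be classified cleanly, while `:1foo` and even `:3e5` are unambiguous because the colon is checked first.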

I had a look at the source, and the keyword pattern reuses the definition of the symbol, which makes the fix slightly more complicated.

@OknoLombarda
Contributor

This happens because clojure-font-lock-keywords reuses the regexp for Clojure symbols when looking for keywords, and that regexp disallows a digit as the first character:

(,(concat "\\(:\\{1,2\\}\\)\\(" clojure--sym-regexp "?\\)\\(/\\)\\(" clojure--sym-regexp "\\)")

It can be fixed by introducing a separate regexp specifically for Clojure keyword symbols:
   (defconst clojure--sym-regexp
     (concat "[^" clojure--sym-forbidden-1st-chars "][^" clojure--sym-forbidden-rest-chars "]*")
     "A regexp matching a Clojure symbol or namespace alias.
 Matches the rule `clojure--sym-forbidden-1st-chars' followed by
+any number of matches of `clojure--sym-forbidden-rest-chars'.")
+  (defconst clojure--keyword-sym-forbidden-1st-chars
+    (concat clojure--sym-forbidden-rest-chars ":'")
+    "A list of chars that a Clojure keyword symbol cannot start with.")
+  (defconst clojure--keyword-sym-regexp
+    (concat "[^" clojure--keyword-sym-forbidden-1st-chars "]"
+            "[^" clojure--sym-forbidden-rest-chars "]*")
+    "A regexp matching a Clojure keyword name or keyword namespace.
+Matches the rule `clojure--keyword-sym-forbidden-1st-chars' followed by
 any number of matches of `clojure--sym-forbidden-rest-chars'."))
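
For anyone who wants to try the idea outside Emacs, here is a rough Python sketch of the same change. The character sets below are assumptions that only approximate clojure-mode's constants, and `name_pattern`, `keyword_regex`, and `highlights` are hypothetical helpers, not part of clojure-mode. Python's `re` syntax is used instead of Emacs regexp syntax.

```python
import re

# Assumed character sets, loosely modelled on clojure-mode's constants;
# the exact upstream values may differ.
FORBIDDEN_REST = "][\";@\\^`~(){},\t\n\r "
SYM_FORBIDDEN_FIRST = FORBIDDEN_REST + "0123456789:'"  # symbols: no leading digit
KW_FORBIDDEN_FIRST = FORBIDDEN_REST + ":'"             # keywords: digits allowed


def name_pattern(forbidden_first, forbidden_rest):
    """One allowed first char followed by any number of allowed chars."""
    first = "[^" + re.escape(forbidden_first) + "]"
    rest = "[^" + re.escape(forbidden_rest) + "]*"
    return first + rest


def keyword_regex(name):
    """One or two colons, an optional 'namespace/' prefix, then a name."""
    return re.compile(r"(:{1,2})(?:(" + name + r")(/))?(" + name + r")")


# Current behaviour: the keyword matcher reuses the symbol pattern.
old = keyword_regex(name_pattern(SYM_FORBIDDEN_FIRST, FORBIDDEN_REST))
# Proposed behaviour: a dedicated keyword pattern that permits a leading digit.
new = keyword_regex(name_pattern(KW_FORBIDDEN_FIRST, FORBIDDEN_REST))


def highlights(regex, text):
    """True if the whole token would be matched as a keyword."""
    return regex.fullmatch(text) is not None


for kw in [":foo", ":1foo", ":30sec", "::3XL"]:
    print(kw, "old:", highlights(old, kw), "new:", highlights(new, kw))
```

The only difference between the two matchers is the first-character class, which is the same shape as the proposed diff: the "rest" class is shared and only the digits are dropped from the forbidden first characters.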

I tried to copy the way the symbol regexp is written (save the formatting), though it looks a little too verbose to me. If it looks okay, I'll send a PR

@bbatsov
Member

bbatsov commented Aug 4, 2022

Looks reasonable to me.
