Lexer does not recognize URI with empty path #32

Open
pieterbos opened this issue Jul 3, 2018 · 1 comment
pieterbos commented Jul 3, 2018

The following URIs in ODIN are not recognized as URIs by the lexer:

http://www.test.example
http://www.test.example/

They are however both valid URIs.

The lexer does not recognize this because of the following lexer rules:

URI : URI_SCHEME SYM_COLON URI_HIER_PART ( '?' URI_QUERY )? ;
fragment URI_HIER_PART : ( '//' URI_AUTHORITY )? URI_PATH ;
fragment URI_PATH   : ( '/' URI_XPALPHA+ )+ ;
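
To see why both inputs are rejected, the rules above can be approximated with regular expressions. This is only a rough sketch: the issue does not show URI_SCHEME, URI_AUTHORITY or URI_XPALPHA, so the character classes below are assumed stand-ins, and the optional query part is left out.

```python
import re

# Assumed stand-ins for fragments that are not shown in the issue.
URI_SCHEME = r"[A-Za-z][A-Za-z0-9+.\-]*"
URI_AUTHORITY = r"[A-Za-z0-9.\-]+(?::[0-9]+)?"
URI_XPALPHA = r"[A-Za-z0-9_.~%+\-]"

# fragment URI_PATH : ( '/' URI_XPALPHA+ )+ ;
URI_PATH = rf"(?:/{URI_XPALPHA}+)+"
# fragment URI_HIER_PART : ( '//' URI_AUTHORITY )? URI_PATH ;
URI_HIER_PART = rf"(?://{URI_AUTHORITY})?{URI_PATH}"
# URI : URI_SCHEME SYM_COLON URI_HIER_PART ;  (query part omitted)
ORIGINAL_URI = re.compile(rf"{URI_SCHEME}:{URI_HIER_PART}")


def is_uri(text):
    """Return True if the whole text matches the approximated URI rule."""
    return ORIGINAL_URI.fullmatch(text) is not None


for candidate in ("http://www.test.example",
                  "http://www.test.example/",
                  "http://www.test.example/aa/bb"):
    print(candidate, is_uri(candidate))
```

Under these assumptions only the last candidate matches, because URI_PATH demands at least one '/' followed by at least one path character.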

At first glance it looks like this can be fixed by simply making the path optional (URI_PATH?). However, this clashes with the labels of the expression grammar.
So I tried:

fragment URI_HIER_PART : ( '//' URI_AUTHORITY ) | URI_PATH | ( '//' URI_AUTHORITY ) URI_PATH ;

This is better, but it still clashes with the following rule statement:

label:/path/to/value + /other_path = 3

because it matches label:/path/to/value as a URI.

So the remaining fixes are:

  1. Require the URI_AUTHORITY: fragment URI_HIER_PART : ( '//' URI_AUTHORITY ) URI_PATH? ;

  2. Match the <>-characters that must always surround a URL in the lexer

  3. Find a way to implement different lexer modes for different parts of the archetype

Option 3 would be best, I think. However, there is no easy way in the current ADL language design to implement lexer mode switching without resorting to rather complicated target-language constructions. So for now I stuck with the first solution for archie, which is at least better than the alternatives. A better fix would be good though!
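
The difference between the second attempt and option 1 can be checked with a small regex approximation. As a hedged sketch: the character classes are assumed stand-ins for fragments not shown here, `lexed_prefix` is a hypothetical helper, and the alternatives are ordered longest first to imitate the lexer's longest-match behaviour.

```python
import re

# Assumed stand-ins for fragments that are not shown in the issue.
URI_SCHEME = r"[A-Za-z][A-Za-z0-9+.\-]*"
URI_AUTHORITY = r"[A-Za-z0-9.\-]+(?::[0-9]+)?"
URI_XPALPHA = r"[A-Za-z0-9_.~%+\-]"
URI_PATH = rf"(?:/{URI_XPALPHA}+)+"

# Second attempt: ( '//' URI_AUTHORITY ) | URI_PATH | ( '//' URI_AUTHORITY ) URI_PATH
SECOND_ATTEMPT = re.compile(
    rf"{URI_SCHEME}:(?://{URI_AUTHORITY}{URI_PATH}|//{URI_AUTHORITY}|{URI_PATH})"
)
# Option 1: ( '//' URI_AUTHORITY ) URI_PATH?
OPTION_1 = re.compile(rf"{URI_SCHEME}:(?://{URI_AUTHORITY})(?:{URI_PATH})?")


def lexed_prefix(pattern, text):
    """Return the text a URI token would consume at the start, or None."""
    match = pattern.match(text)
    return match.group(0) if match else None


statement = "label:/path/to/value + /other_path = 3"
print(lexed_prefix(SECOND_ATTEMPT, statement))  # the label is consumed as a URI
print(lexed_prefix(OPTION_1, statement))        # no URI token at the start
print(lexed_prefix(OPTION_1, "http://www.test.example"))
```

With these stand-ins, the second attempt consumes label:/path/to/value as a URI, while option 1 produces no URI token for the rule statement and still accepts the URI without a path.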

@pieterbos

In addition, to match the trailing slashes in https://www.test.example/ and https://www.test.example/aa/bb/ correctly, we need:

fragment URI_PATH   : '/' | ( '/' URI_XPALPHA+ )+ ('/')?;
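
A quick way to try the adjusted path rule together with option 1 is the regex approximation below. It is a sketch under assumptions: the character classes stand in for fragments not shown in this issue, and `is_uri` is a hypothetical helper.

```python
import re

# Assumed stand-ins for fragments that are not shown in the issue.
URI_SCHEME = r"[A-Za-z][A-Za-z0-9+.\-]*"
URI_AUTHORITY = r"[A-Za-z0-9.\-]+(?::[0-9]+)?"
URI_XPALPHA = r"[A-Za-z0-9_.~%+\-]"

# fragment URI_PATH : '/' | ( '/' URI_XPALPHA+ )+ ('/')? ;
URI_PATH = rf"(?:(?:/{URI_XPALPHA}+)+/?|/)"
# Option 1: ( '//' URI_AUTHORITY ) URI_PATH?
URI = re.compile(rf"{URI_SCHEME}://{URI_AUTHORITY}(?:{URI_PATH})?")


def is_uri(text):
    """Return True if the whole text matches the approximated URI rule."""
    return URI.fullmatch(text) is not None


for candidate in ("https://www.test.example",
                  "https://www.test.example/",
                  "https://www.test.example/aa/bb/",
                  "https://www.test.example/aa//bb"):
    print(candidate, is_uri(candidate))
```

In this approximation the first three candidates match. The last one, with an empty path segment, does not, although RFC 3986 permits empty segments, which is one example of the rule not yet being complete.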

This is probably still not perfect.
