Add unstable frontmatter support #137193
r? @fee1-dead rustbot has assigned @fee1-dead. Use |
Might be worth adding a test case to |
Some changes occurred in src/tools/rustfmt cc @rust-lang/rustfmt |
compiler/rustc_lexer/src/lib.rs
return None;
}
let (fence_pattern, rest) = rest.split_at(fence_end);
let (info, rest) = rest.split_once("\n").unwrap_or((rest, ""));
How is a frontmatter like ---cargo--- going to be parsed? Or ---cargo ---?
It looks like they aren't considered to be valid. Consider making that a bit clearer (maybe here? to say that if there is no newline it will be invalid) and add tests.
Everything after the 3+ - is the infostring, including any dashes that appear after something else. We check whether it's an identifier on the following lines, which will reject that. We have a test case for invalid identifiers.
Or is the concern more with strip_frontmatter("---") (i.e. a test case where there is no newline after the opening)?
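The splitting rule described in this reply can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: `split_opening_fence` and `is_simple_ident` are hypothetical names, and the identifier check is simplified to ASCII as an assumption.

```rust
/// Hypothetical helper: splits an opening line into (fence, infostring).
/// Returns None if the line does not start with at least three dashes.
fn split_opening_fence(line: &str) -> Option<(&str, &str)> {
    // The fence is the leading run of `-`; everything after it is the infostring.
    let fence_end = line.find(|c: char| c != '-').unwrap_or(line.len());
    if fence_end < 3 {
        return None;
    }
    let (fence, info) = line.split_at(fence_end);
    Some((fence, info.trim()))
}

/// Simplified, ASCII-only identifier check (an assumption for illustration).
fn is_simple_ident(s: &str) -> bool {
    let mut chars = s.chars();
    match chars.next() {
        Some(c) if c == '_' || c.is_ascii_alphabetic() => {}
        _ => return false,
    }
    chars.all(|c| c == '_' || c.is_ascii_alphanumeric())
}

fn main() {
    for line in ["---", "---cargo", "---cargo---", "---cargo ---", "--"] {
        match split_opening_fence(line) {
            None => println!("{line:?}: not a fence"),
            Some((fence, info)) => {
                // An empty infostring is accepted in this sketch.
                let valid = info.is_empty() || is_simple_ident(info);
                println!("{line:?}: fence={fence:?} info={info:?} valid={valid}");
            }
        }
    }
}
```

Under these assumptions, both `---cargo---` and `---cargo ---` produce an infostring containing trailing dashes, so the identifier check rejects them.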
☔ The latest upstream changes (presumably #137927) made this pull request unmergeable. Please resolve the merge conflicts. |
With that direction, we are either
As I said, it was a valid option to be considered but we decided to go a different direction in the RFC (referring specifically to the "container format solution", not "defer implementation"). |
Pushing the front matter parsing out of rustc is basically the world we've been living in the last two years but instead of telling rustc how to skip the content, Cargo strips it from a The parallel development isn't just in Cargo but in rustfmt, rustdoc, r-a, etc. |
That's the part I don't get. There seem to be many conflicting requirements that led to this confusing situation that we're trying to understand and you're trying to explain to us.
Adding a new -Z flag doesn't need an MCP
@fee1-dead volunteered above
you already have to do all the work, emitting another -Z flag should not be more work than just removing the new file generation which you were going to do anyway
this was gonna happen anyway as they'll want their own front matter some time in the future. And you can just pass the same flag to them for now instead of implementing parser logic for frontmatter
The change may be just to add another
We can just keep around the -Z flag along with the future -C flag for a while to ease the transition |
There is one definition of "rust file syntax" (being specific to leave off things like rustdoc, etc). A header is included in this syntax that external tools may choose to read or write without having to have knowledge of all of the "rust file syntax". Any tools expected to work with "rust file syntax" (syntax highlighters, The expected path for implementation was that rustc would parse the header. For any tool that doesn't need special support for it, this is all they need.
If rustc won't accept this header, is it actually part of the Rust syntax or is this a Cargo container format around Rust? If the header is Rust syntax but not supported in rustc, what does that mean for the Rust Reference, the Rust Spec, and third-parties implementing support (e.g. If this is Rust syntax but not supported in rustc, that means it could be supported in the future. If rustc defers support for this syntax, will T-compiler be able to provide as meaningful feedback before it gets stabilized and locked-in? If this is Rust syntax, why are we not supporting it every place Rust syntax is expected to work (e.g. in So within that context
Yes but for some, that processing is just "use rustc's API".
No, this is the first of a linear multi-step process for all of the Rust Project tools to support this, not to defer their implementation.
In the short to medium term, I expect this header to only be used for Cargo. Cargo will be expected to do any error reporting before rustc gets to it. There are cases where a user could use this within files within a package that has a
I did not say "rustc shouldn't have good diagnostics". What it shouldn't be is a blocker for adding unstable support for this feature. I question whether it's a blocker for stabilizing an MVP, but I recognize that I have shaky ground to stand on for that.
Ah, I misunderstood as I've seen others have one.
My comment was written from the perspective of us coming back to take the approach from this PR.
We had an RFC and it said that the Rust language doesn't care what's inside of the header.
Okay, so it looks like I misunderstood the motivation behind this PR. My original thought was that this PR wanted it so that a tool that needs parsing the frontmatter and is already parsing the frontmatter does not have to do the work of either pre-stripping the file to a temporary file or something else. Hence we make rustc support just parsing and then skipping it.
What I said above about requiring diagnostics is a nice-to-have if the primary motivation is the one I described above. And when I received pushback, I (begrudgingly) accepted that this PR can land, based on the assumption that the intention for this PR is to make Cargo's life better.

However, if the intention from the very beginning is to add syntax support to rustc, or to implement the lang RFC, there's a higher bar for that. I cannot support this PR being the starting point for properly landing syntax support; that must have good diagnostics, which is an absolute minimum for me. Under no circumstances should syntax support mean that rustc ignores the syntax when the user has gotten it wrong.

But look, I already offered to implement it, so unless you want to actually write the code that is deemed important by at least two compiler maintainers now, I'm always up for doing the work. I also don't want to make anything look like a hostile takeover of your work.

We're way past the time this PR would have landed. It would have landed if diagnostics were actually implemented. It would have landed if my nits were applied and there were no outstanding comments left to be responded to. But now we're just running in circles, putting countless words into this PR where none are needed if the code actually meets the standards of the compiler. And that is the part that confuses me the most.

Feel free to privately or publicly reach out to any compiler team member, or reroll this PR assignment (I don't speak for oli though, but it looks like he felt the diagnostics are important too) if you really believe that this code in this form is good to land, and that reviewers just need to be convinced that this code is already good. I just don't think that strategy is beneficial.
Ok, so wrt code review. I would like it to be implemented in the lexer ( I think that would simplify a few things as you
If you are willing to implement syntax support for frontmatters in rustc, I'm fine with you taking this over. I must have missed that offer. I only remember you offering to implement the byte stripping though I lost track of that through the conversation
I think I might be missing some of what you are intending because it appears you can have What would be the recommended Token break down for this? Rustfmt would eventually need to track open, info string, body, close, and maybe any of the whitespace between any of that. Some options I see:
I switched to this PR's approach at the recommendation of others when working through these kinds of problems |
This list doesn't seem quite right to me but perhaps I'm missing some of the details here. After reading the RFC, my understanding is that:
The RFC says:
I agree that is true of errors originating from the content of the frontmatter block, but the block itself is now a syntactic element of the language and should be treated properly for the purposes of error reporting, just like block comments are. We will want proper error reporting for things like unclosed frontmatter sections even if Cargo explicitly told rustc where the file content should start, because a user could always place one later in the file and we want to give them a useful error message.

Therefore, I think the core requirement from the compiler side is that the frontend should recognize the frontmatter block and produce errors if the block is not immediately at the start of the file, if the file contains more than one block at any position, or if the block is not closed. Whether that happens in the lexer or the parser, I'm not sure of the best approach.

Since Cargo also needs to locate and parse the frontmatter block, it seems wasteful to implement duplicate error reporting for malformed frontmatter. I would suggest that if Cargo cannot successfully parse the frontmatter block itself (eg it does not find 3 or more

I agree that we need proper error reporting for syntactically invalid frontmatter blocks, but I don't personally think that has to be done in this very PR, provided one or more bugs are opened to track that work and marked as open issues on the tracking issue, and there is no regression in the quality of other error messages.

cc @Noratrieb since you also suggested an implementation for this feature on that Zulip thread 🙂
So if I'm understanding, you see reporting of meaningful errors for misplaced, malformed, and multiple frontmatters as a blocker for stabilizing an MVP?
This approach was initially discussed at #137193 (comment) |
For me, it's a blocker for stabilization of the feature. The helpfulness/friendliness of the errors is certainly debatable and I wouldn't block stabilization on having them be absolutely perfect, but the status quo does not reach the bar for me. I would say the errors at least need to describe what element the compiler thinks it saw and offer some useful suggestions for resolving common issues (eg, the unclosed block case should tell you how to close the frontmatter block).

As I said, I wouldn't consider that to be a blocker for landing an initial PR though, as long as we have issues tracking the necessary improvements to error messages and the changes do not regress the UX for any other cases.

In case it is helpful, let me provide concrete suggestions for what I think some of the errors should look like before stabilization:
---
//~^ ERROR unclosed frontmatter block
//~| HELP use `---` to close the block
#![feature(frontmatter)]
fn main() {
}

---cargo,hello-world
//~^ ERROR invalid infostring identifier
//~| NOTE identifiers may not contain `,` characters
---
#![feature(frontmatter)]
fn main() {
}

---cargo
---
---buck
//~^ ERROR only a single frontmatter may appear in a Rust file
---
#![feature(frontmatter)]
fn main() {
}
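To make the three proposed errors concrete, here is a rough sketch of a line-based check that distinguishes them. It is not how rustc implements this: `FrontmatterError` and `check_frontmatter` are hypothetical names, and the sketch assumes that the closing fence is a line of dashes matching the opening fence, that infostrings are ASCII identifiers, and that a second block is only detected directly after the first.

```rust
#[derive(Debug, PartialEq)]
enum FrontmatterError {
    Unclosed,
    InvalidInfostring,
    Multiple,
}

/// Hypothetical check covering the three error cases suggested above.
fn check_frontmatter(src: &str) -> Result<(), FrontmatterError> {
    let mut lines = src.lines();
    let Some(open) = lines.next() else { return Ok(()) };

    // Count the leading dashes of the first line; fewer than 3 means no frontmatter.
    let dashes = open.chars().take_while(|&c| c == '-').count();
    if dashes < 3 {
        return Ok(());
    }

    // Everything after the dashes is the infostring (simplified ASCII check).
    let info = open[dashes..].trim();
    if info.chars().any(|c| !(c == '_' || c.is_ascii_alphanumeric())) {
        return Err(FrontmatterError::InvalidInfostring);
    }

    // Assumption: the block is closed by a line equal to the opening fence.
    let fence = &open[..dashes];
    if !lines.by_ref().any(|l| l.trim_end() == fence) {
        return Err(FrontmatterError::Unclosed);
    }

    // If the next non-blank line opens another fence, report a second block.
    if lines
        .find(|l| !l.trim().is_empty())
        .is_some_and(|l| l.starts_with("---"))
    {
        return Err(FrontmatterError::Multiple);
    }
    Ok(())
}

fn main() {
    let unclosed = "---\n#![feature(frontmatter)]\nfn main() {\n}\n";
    let bad_info = "---cargo,hello-world\n---\n#![feature(frontmatter)]\nfn main() {\n}\n";
    let multiple = "---cargo\n---\n---buck\n---\n#![feature(frontmatter)]\nfn main() {\n}\n";
    for src in [unclosed, bad_info, multiple] {
        println!("{:?}", check_frontmatter(src));
    }
}
```

A real implementation would attach spans and help text to each variant; the enum here only marks which case was hit.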
Just let me know when I can start the work, either after this lands or now :) |
r? @wesleywiser |
I think we should also detect
Some circumstances have changed at work such that I will be unavailable to give this any of my attention, even to discuss this, for the next week. If you are able to move anything forward in that time, feel free to do so. Otherwise, the exact order depends on what direction the implementation for this PR needs to go. Either way, I appreciate the help! |
☔ The latest upstream changes (presumably #139301) made this pull request unmergeable. Please resolve the merge conflicts. |
Update: I got a considerable amount of progress today implementing it in the lexer -> parser level with proper diagnostics (WIP commit). I expect to have a PR ready early next week. |
Whatever PR wins out, please make sure that frontmatter is not observable via
Potential UI test (draft):
extern crate proc_macro;
use proc_macro::TokenStream;
#[proc_macro]
pub fn check(_: TokenStream) -> TokenStream {
    // Parsing a bare frontmatter block as tokens must yield an empty stream.
    assert!("---\n---".parse::<TokenStream>().unwrap().is_empty());
    Default::default()
}

//@ check-pass
//@ proc-macro: makro.rs
//@ edition: 2021
makro::check!();
fn main() {}
This is an implementation for #136889
This strips the frontmatter, like shebang, because nothing within rustc will be using this. rustfmt folks have said this should be good enough for now as rustfmt should successfully work with this (ie not remove it) even though proper formatting would require AST support. I had considered making this part of lexing of tokens but that led to problems with ambiguous tokens. See also zulip.
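As a rough illustration of the "strip it like a shebang" idea, the sketch below returns the source with a leading frontmatter block removed. It is not the function from this PR: `strip_frontmatter_sketch` is a hypothetical name, and the sketch assumes the block starts at byte 0 and is closed by a line equal to the opening fence. Shebang handling and leading whitespace are ignored.

```rust
/// Hypothetical sketch: returns `src` without its leading frontmatter block.
/// If no well-formed block is found, the input is returned unchanged.
fn strip_frontmatter_sketch(src: &str) -> &str {
    // The opening fence is the leading run of at least three dashes.
    let dashes = src.bytes().take_while(|&b| b == b'-').count();
    if dashes < 3 {
        return src;
    }
    let fence = &src[..dashes];

    // Skip the opening line (fence plus infostring); no newline means no block.
    let Some((_, mut rest)) = src.split_once('\n') else { return src };

    // Walk line by line until a line equal to the fence closes the block.
    loop {
        let (line, after) = rest.split_once('\n').unwrap_or((rest, ""));
        if line.trim_end() == fence {
            return after;
        }
        if after.is_empty() {
            // Reached the end without a closing fence: leave the input alone.
            return src;
        }
        rest = after;
    }
}

fn main() {
    let src = "---cargo\n[dependencies]\n---\nfn main() {}\n";
    print!("{}", strip_frontmatter_sketch(src));
}
```

Returning the input unchanged on malformed blocks mirrors the "no fancy error reporting" stance of this description; the malformed fence would then surface as ordinary syntax errors later on.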
This does not do any fancy error reporting for invalid syntax. Besides keeping this first PR simple, we can punt on this because Cargo will be the front line of parsing in the majority of cases, and it will need to have error reporting as good as or better than rustc's for this syntax, to avoid masking the quality of rustc's error reporting.
This includes supporting the syntax in
This leaves to the future