Skip to content

Wrong line numbers produces in rustc_span::SourceFile::lookup_file_pos* #84127

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
klensy opened this issue Apr 12, 2021 · 11 comments
Open

Wrong line numbers produces in rustc_span::SourceFile::lookup_file_pos* #84127

klensy opened this issue Apr 12, 2021 · 11 comments
Labels
A-docs Area: Documentation for any part of the project, including the compiler, standard library, and tools T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.

Comments

@klensy
Copy link
Contributor

klensy commented Apr 12, 2021

Function comments noted, that they returns 1-based line number, but can actually return 0-based, #77080 @richkadel

/// Looks up the file's (1-based) line number and (0-based `CharPos`) column offset, for a
/// given `BytePos`.
pub fn lookup_file_pos(&self, pos: BytePos) -> (usize, CharPos) {
let chpos = self.bytepos_to_file_charpos(pos);
match self.lookup_line(pos) {
Some(a) => {
let line = a + 1; // Line numbers start at 1
let linebpos = self.lines[a];
let linechpos = self.bytepos_to_file_charpos(linebpos);
let col = chpos - linechpos;
debug!("byte pos {:?} is on the line at byte pos {:?}", pos, linebpos);
debug!("char pos {:?} is on the line at char pos {:?}", chpos, linechpos);
debug!("byte is on line: {}", line);
assert!(chpos >= linechpos);
(line, col)
}
None => (0, chpos),
}
}
/// Looks up the file's (1-based) line number, (0-based `CharPos`) column offset, and (0-based)
/// column offset when displayed, for a given `BytePos`.
pub fn lookup_file_pos_with_col_display(&self, pos: BytePos) -> (usize, CharPos, usize) {
let (line, col_or_chpos) = self.lookup_file_pos(pos);
if line > 0 {
let col = col_or_chpos;
let linebpos = self.lines[line - 1];
let col_display = {
let start_width_idx = self
.non_narrow_chars
.binary_search_by_key(&linebpos, |x| x.pos())
.unwrap_or_else(|x| x);
let end_width_idx = self
.non_narrow_chars
.binary_search_by_key(&pos, |x| x.pos())
.unwrap_or_else(|x| x);
let special_chars = end_width_idx - start_width_idx;
let non_narrow: usize = self.non_narrow_chars[start_width_idx..end_width_idx]
.iter()
.map(|x| x.width())
.sum();
col.0 - special_chars + non_narrow
};
(line, col, col_display)
} else {
let chpos = col_or_chpos;
let col_display = {
let end_width_idx = self
.non_narrow_chars
.binary_search_by_key(&pos, |x| x.pos())
.unwrap_or_else(|x| x);
let non_narrow: usize =
self.non_narrow_chars[0..end_width_idx].iter().map(|x| x.width()).sum();
chpos.0 - end_width_idx + non_narrow
};
(0, chpos, col_display)
}
}
}

Perhaps this can be prevented via using different types for 0- and 1- based line numbers.

@klensy klensy changed the title Wrong line numbers produces in rustc_span::SourceFile::lookup_file_pos Wrong line numbers produces in rustc_span::SourceFile::lookup_file_pos* Apr 12, 2021
@klensy
Copy link
Contributor Author

klensy commented Apr 12, 2021

@rustbot label +A-code-coverage

@rustbot rustbot added the A-code-coverage Area: Source-based code coverage (-Cinstrument-coverage) label Apr 12, 2021
@richkadel
Copy link
Contributor

You're absolutely right that this doc comment is incomplete, though I would not say it "0-based" (valid values are always 1-based).

But I should correct the comment. The source uses lookup_line(), which can return None. Adapting the description from that function, we should add the following sentence to the doc comment for lookup_file_pos():

    ///   ...  If the source_file is empty or the position is located before the
    /// first line, a line number of `0` is returned.

@klensy
Copy link
Contributor Author

klensy commented Apr 12, 2021

There also lookup_file_pos_with_col_display.

@richkadel
Copy link
Contributor

Yes, thanks. The same applies in that function's comment as well. We should update both.

@klensy
Copy link
Contributor Author

klensy commented Apr 12, 2021

lookup_file_pos_with_col_display called from lookup_char_pos which can assign struct Loc field line value 0 that annotated as

pub struct Loc {
    /// Information about the original source.
    pub file: Lrc<SourceFile>,
    /// The (1-based) line number.
    pub line: usize,
    /// The (0-based) column offset.
    pub col: CharPos,
    /// The (0-based) column offset when displayed.
    pub col_display: usize,
}

which will invalidate expectations in that struct.

@richkadel
Copy link
Contributor

True (and again, thanks for recognizing these inconsistencies! They should be fixed.)

But if you look at the implementation of lookup_char_pos() prior to my change, if lookup_line(pos) returned an error, it also returned 0 for the line number. So my changes were consistent with the prior implementation.

So I guess the Loc struct line field's doc comment is also incomplete, and should also be updated to reflect the possibility that line can be zero.

The alternative is to change all of these line variables and the struct field to an Option<usize>, but since this would have a lot of ripple effects, I'm assuming we don't want to do that.

@klensy
Copy link
Contributor Author

klensy commented Apr 12, 2021

Yes, i forgot to check previous revisions, sorry.
I'll try to see how lookup_line can return None, perhaps this should emit error instead silently using 0 value.

@richkadel
Copy link
Contributor

richkadel commented Apr 12, 2021

Just to clarify, lookup_line() already returns an Option. Since it can return None for an empty file, that might not be an Err for that function. I don't know if the downstream calls should return None or Err (versus 0) given their expected usages.

@richkadel
Copy link
Contributor

@klensy - Is there a reason you've labeled this related to A-code-coverage? Are you seeing code coverage issues related to empty files or other reasons that the line could be zero?

@richkadel
Copy link
Contributor

This is not a bug in code coverage. Code coverage works correctly, based on the current behavior of SourceMap and Span code. This issue is related to the fact that the SourceMap and Span code behavior behave in a way that might be confusing to users of that API, and that the API documentation is inconsistent with the actual behavior.

@rustbot label -A-code-coverage

@rustbot rustbot removed the A-code-coverage Area: Source-based code coverage (-Cinstrument-coverage) label Apr 24, 2021
@Dylan-DPC Dylan-DPC added C-bug Category: This is a bug. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Feb 13, 2023
@Enselic Enselic added A-docs Area: Documentation for any part of the project, including the compiler, standard library, and tools and removed C-bug Category: This is a bug. labels May 28, 2024
@hkBst
Copy link
Member

hkBst commented Jan 28, 2025

@richkadel, @klensy, is this still relevant, and if so did either of you want to propose a fix?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-docs Area: Documentation for any part of the project, including the compiler, standard library, and tools T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
Development

No branches or pull requests

6 participants