Skip to content

New lint: Give a warning when calling File::bytes, TcpStream::bytes(), etc #14087

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
JonathanBrouwer opened this issue Jan 27, 2025 · 2 comments · Fixed by #14089
Closed

New lint: Give a warning when calling File::bytes, TcpStream::bytes(), etc #14087

JonathanBrouwer opened this issue Jan 27, 2025 · 2 comments · Fixed by #14089
Assignees
Labels
A-lint Area: New lints

Comments

@JonathanBrouwer
Copy link
Contributor

JonathanBrouwer commented Jan 27, 2025

What it does

There should be a lint that gives a warning when calling File::bytes(), TcpStream::bytes(), and maybe others.
These functions (almost?) should never be called since File and TcpStream are unbuffered and will perform a syscall everytime .next() is called. This is also noted on the docs page of the function.

I'm not sure an automatic fix will be possible, since if the bytes() is used only partially, the file is only read partially. Inserting the bufreader would change this behaviour.

Advantage

  • I measured the rewritten code to be ~1000x faster for large file

Drawbacks

  • Theoretically a user could write file.bytes().next(), then dropping the iterator. In this case, adding the BufReader would make no difference. I expect this to be a rare situation though.

Example

let file = File::open("./bytes.txt").unwrap();
let zero_bytes_count = file.bytes().map(|b| b.unwrap()).filter(|b| *b == 0).count();

Could be written as:

let file = BufReader::new(File::open("./bytes.txt").unwrap());
let zero_bytes_count = file.bytes().map(|b| b.unwrap()).filter(|b| *b == 0).count();
@JonathanBrouwer JonathanBrouwer added the A-lint Area: New lints label Jan 27, 2025
@JonathanBrouwer
Copy link
Contributor Author

I asked for feedback on the zulip here as well https://rust-lang.zulipchat.com/#narrow/channel/257328-clippy/topic/Implementing.20a.20new.20clippy.20lint/near/496167132
I'll give implementing this a try

@JonathanBrouwer JonathanBrouwer changed the title New lint: New lint: Give a warning when calling File::bytes, TcpStream::bytes(), etc Jan 27, 2025
@JonathanBrouwer
Copy link
Contributor Author

@rustbot claim

github-merge-queue bot pushed a commit that referenced this issue Feb 12, 2025
Checks for `Read::bytes()` on an unbuffered Read type.
The default implementation calls `read` for each byte, which can be very
inefficient for data that’s not in memory, such as `File`.

Considerations which I'd like to have feedback on:
* Currently this lint triggers when `.bytes()` is called on any type
that implements `std::io::Read` but not `std::io::BufRead`. This is
quite aggressive and in and may result in false positives. Alternatives:
* Only trigger on concrete types, not generic types. This does mean that
cases where a function is generic over a `R: Read` and calls `.bytes()`
are not caught by the lint, which could be quite a nasty case of this
bug.
  * Only trigger on an allowlist of stdlib types
* Compromise: Is it possible to make this lint `pedantic` on types that
are not on a allowlist?
* Theoretically, a trait implementation of `Read` could override
`.bytes()` with an efficient implementation. I'm not sure how to add
this check to the lint, and I can't find any cases of this being done in
practice.
* I don't think an automatic fix for this lint is possible, but I'd love
to be proven wrong
* This is my first time contributing to clippy, please let me know if I
did anything wrong

Fixes #14087
```
changelog: [`unbuffered_bytes`]: new lint
```
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-lint Area: New lints
Projects
None yet
Development

Successfully merging a pull request may close this issue.

1 participant