
Iterating over ranges generates overflow check with opt-level = "z" and lto enabled #53627


Closed
manuelVo opened this issue Aug 23, 2018 · 2 comments
Labels
A-codegen Area: Code generation

Comments

@manuelVo

The following code generates an overflow check when opt-level = "z" and lto = true are set in Cargo.toml. According to the implementation of Iterator for Range, this check shouldn't actually end up in an optimized build: https://doc.rust-lang.org/src/core/iter/range.rs.html#215
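For readers who don't want to follow the link, here is a minimal sketch of the pattern in question. It is an approximation written for illustration, not the library source: `SketchRange` and `add_with_sketch` are hypothetical names, and `checked_add` stands in for whatever stepping helper the real implementation uses. The point is that the increment only runs after `start < end` has been established, so its overflow branch can never be taken.

```rust
/// Hypothetical stand-in for `core::ops::Range<usize>`; not the library source.
struct SketchRange {
    start: usize,
    end: usize,
}

impl Iterator for SketchRange {
    type Item = usize;

    fn next(&mut self) -> Option<usize> {
        if self.start < self.end {
            // Here start < end <= usize::MAX, so start + 1 cannot overflow.
            // The checked add therefore always returns Some; its None branch
            // is the kind of check that appears to survive as the `jb` in
            // the disassembly further down.
            match self.start.checked_add(1) {
                Some(next) => {
                    let current = self.start;
                    self.start = next;
                    Some(current)
                }
                None => None,
            }
        } else {
            None
        }
    }
}

/// Same loop as `add` in this report, driven by the sketch iterator.
fn add_with_sketch(numbers: &[i32]) -> i32 {
    let mut result = 0;
    for i in (SketchRange { start: 0, end: numbers.len() }) {
        result += numbers[i];
    }
    result
}

fn main() {
    // Prints "6 9"
    println!("{} {}", add_with_sketch(&[1, 2, 3]), add_with_sketch(&[4, 5]));
}
```

Since the comparison against `end` already bounds `start`, an optimizer that sees both conditions together should be able to drop the inner branch entirely.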

My build target is x86_64-unknown-linux-gnu. I'm building with cargo build --release.

This is the code producing the bug:

fn main() {
    let arr1 = [1, 2, 3];
    let arr2 = [4, 5];
    println!("{} {}", add(&arr1), add(&arr2));
}

#[inline(never)]
pub extern fn add(numbers: &[i32]) -> i32 {
    let mut result = 0;
    for i in 0..numbers.len() {
        result += numbers[i];
    }
    result
}

Combined with this Cargo.toml:

[package]
name = "rust_binary"
version = "0.1.0"

[profile.release]
lto = true
opt-level = "z"

This is the resulting add function:

0000000000006a76 <rust_binary::add>:
    6a76: xor    ecx,ecx
    6a78: xor    eax,eax
    6a7a: jmp    6a8b <rust_binary::add+0x15>
    6a7c: mov    rdx,rcx
    6a7f: add    rdx,0x1
    6a83: jb     6a90 <rust_binary::add+0x1a>   # This is the overflow check
    6a85: add    eax,DWORD PTR [rdi+rcx*4]
    6a88: inc    rcx
    6a8b: cmp    rcx,rsi
    6a8e: jb     6a7c <rust_binary::add+0x6>
    6a90: ret    

Meta

rustc 1.30.0-nightly (33b923fd4 2018-08-18)
binary: rustc
commit-hash: 33b923fd44c5c5925e635815fce68bdf1f98740f
commit-date: 2018-08-18
host: x86_64-unknown-linux-gnu
release: 1.30.0-nightly
LLVM version: 7.0
@manuelVo
Author

Setting codegen-units=1 fixes this, so this might be a duplicate of #48371.
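For reference, the workaround would look roughly like this in the release profile. This is a sketch that assumes the rest of the Cargo.toml from the original report is unchanged; only the `codegen-units` line is new.

```toml
[profile.release]
lto = true
opt-level = "z"
# Workaround from this comment: compile the crate as a single codegen unit.
codegen-units = 1
```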

@memoryruins added the A-codegen (Area: Code generation) label on Sep 15, 2018
@workingjubilee
Member

Current codegen with an equivalent Godbolt:

example::add:
        xor     ecx, ecx
        xor     eax, eax
.LBB0_1:
        cmp     rsi, rcx
        je      .LBB0_2
        add     eax, dword ptr [rdi + 4*rcx]
        inc     rcx
        jmp     .LBB0_1
.LBB0_2:
        ret

This matches the objdump of a Cargo project I built just to check:

0000000000005ae7 <add>:
    5ae7:	31 c9                	xor    ecx,ecx
    5ae9:	31 c0                	xor    eax,eax
    5aeb:	48 39 ce             	cmp    rsi,rcx
    5aee:	74 08                	je     5af8 <add+0x11>
    5af0:	03 04 8f             	add    eax,DWORD PTR [rdi+rcx*4]
    5af3:	48 ff c1             	inc    rcx
    5af6:	eb f3                	jmp    5aeb <add+0x4>
    5af8:	c3                   	ret    

Appears to be resolved.
