Skip to content

Commit 8df391c

Browse files
Mike PallBuristan
Mike Pall
authored andcommitted
Limit CSE for IR_CARG to fix loop optimizations.
Thanks to Peter Cawley. (cherry picked from commit 3bdc649) `IR_CALLXS` for the vararg function contains `IR_CARG(fptr, ctid)` as the second operand. The `loop_emit_phi()` scans only the first operand of the IR, so the second is not marked as PHI. In this case, when the IR appears in both the invariant and variant parts of the loop, CSE may remove it and thus lead to incorrect emitting results. This patch tweaks the CSE rules to avoid CSE across the `IR_LOOP`. Sergey Kaplun: * added the description and the test for the problem Part of tarantool/tarantool#10199 Reviewed-by: Sergey Bronnikov <[email protected]> Reviewed-by: Maxim Kokryashkin <[email protected]> Signed-off-by: Sergey Kaplun <[email protected]> (cherry picked from commit b52fe97)
1 parent ac1cb36 commit 8df391c

File tree

2 files changed

+67
-0
lines changed

2 files changed

+67
-0
lines changed

Diff for: src/lj_opt_fold.c

+11
Original file line numberDiff line numberDiff line change
@@ -2406,6 +2406,17 @@ LJFOLD(XSNEW any any)
24062406
LJFOLD(BUFHDR any any)
24072407
LJFOLDX(lj_ir_emit)
24082408

2409+
/* -- Miscellaneous ------------------------------------------------------- */
2410+
2411+
LJFOLD(CARG any any)
2412+
LJFOLDF(cse_carg)
2413+
{
2414+
TRef tr = lj_opt_cse(J);
2415+
if (tref_ref(tr) < J->chain[IR_LOOP]) /* CSE across loop? */
2416+
return EMITFOLD; /* Raw emit. Assumes fins is left intact by CSE. */
2417+
return tr;
2418+
}
2419+
24092420
/* ------------------------------------------------------------------------ */
24102421

24112422
/* Every entry in the generated hash table is a 32 bit pattern:
+56
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,56 @@
1+
local ffi = require('ffi')
2+
local table_new = require('table.new')
3+
4+
-- Test file to demonstrate LuaJIT incorrect behaviour for
5+
-- recording the FFI call to the vararg function. See also:
6+
-- https://github.com/LuaJIT/LuaJIT/issues/1244.
7+
local tap = require('tap')
8+
local test = tap.test('lj-1244-missing-phi-carg'):skipcond({
9+
['Test requires JIT enabled'] = not jit.status(),
10+
})
11+
12+
-- Loop unrolls into 2 iterations. Thus means that the loop is
13+
-- executed on trace on the 5th iteration (instead of the usual
14+
-- 4th). Run it even number of iterations to test both, so last is
15+
-- 6th.
16+
local NTESTS = 6
17+
18+
test:plan(NTESTS)
19+
20+
-- XXX: Hack with function's prototypes to avoid creation of
21+
-- custom functions to be loaded via FFI (vararg part will be just
22+
-- ignored).
23+
ffi.cdef[[
24+
double sin(double, ...);
25+
double cos(double, ...);
26+
]]
27+
28+
local EXPECTED = {[0] = ffi.C.sin(0), ffi.C.cos(0)}
29+
30+
-- Array of 2 functions.
31+
local fns = ffi.new('double (*[2])(double, ...)')
32+
fns[0] = ffi.C.cos
33+
fns[1] = ffi.C.sin
34+
35+
-- Avoid reallocating the table on the trace.
36+
local result = table_new(8, 0)
37+
38+
jit.opt.start('hotloop=1')
39+
40+
local fn = fns[0]
41+
-- The first result is `cos()`.
42+
for i = 1, NTESTS do
43+
result[i] = fn(0)
44+
fn = fns[i % 2]
45+
-- The call persists in the invariant part of the loop as well.
46+
-- Hence, XLOAD (part of the IR_CARG -- function to be called)
47+
-- should be marked as PHI, but it isn't due to CSE.
48+
fn(0)
49+
end
50+
51+
for i = 1, NTESTS do
52+
test:is(result[i], EXPECTED[i % 2],
53+
('correct result on iteration %d'):format(i))
54+
end
55+
56+
test:done(true)

0 commit comments

Comments
 (0)