Fast Path Math.min/max_F/D #7617

luke-li-2003 · 2025-01-22T17:09:14Z

Re-enable the fast path for Math.min/max on floating-point operands (float and double), with the behaviour for +/-0.0 and NaN values now handled correctly.
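For reference, the Java semantics that the fast path has to preserve can be modelled as below. This is only a minimal sketch in portable C++, not the instruction sequence the evaluator emits; the function names javaMax and javaMin are hypothetical and do not appear in the PR. It assumes IEEE 754 doubles and no fast-math compiler flags.

```cpp
#include <cmath>
#include <limits>

// Reference model of Java's Math.max(double, double):
//  - if either operand is NaN, the result is NaN
//  - +0.0 is treated as greater than -0.0
double javaMax(double a, double b)
{
    if (std::isnan(a) || std::isnan(b))
        return std::numeric_limits<double>::quiet_NaN();
    if (a == 0.0 && b == 0.0)
        // Both are zeros: the result is -0.0 only when both are negative zero.
        return (std::signbit(a) && std::signbit(b)) ? -0.0 : 0.0;
    return a > b ? a : b;
}

// Reference model of Java's Math.min(double, double):
//  - if either operand is NaN, the result is NaN
//  - -0.0 is treated as smaller than +0.0
double javaMin(double a, double b)
{
    if (std::isnan(a) || std::isnan(b))
        return std::numeric_limits<double>::quiet_NaN();
    if (a == 0.0 && b == 0.0)
        // Both are zeros: the result is -0.0 when at least one is negative zero.
        return (std::signbit(a) || std::signbit(b)) ? -0.0 : 0.0;
    return a < b ? a : b;
}
```

For example, javaMax(-0.0, 0.0) yields +0.0 and javaMin(1.0, NaN) yields NaN, whereas a plain `a > b ? a : b` comparison simply returns the second operand in both of these cases.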

luke-li-2003 · 2025-01-22T17:10:06Z

Issue: https://github.ibm.com/runtimes/openj9-jit-power/issues/416

See also: eclipse-openj9/openj9#20999

rmnattas

Looks good, just a typo and some formatting changes.

rmnattas · 2025-01-22T19:37:35Z

compiler/p/codegen/ControlFlowEvaluator.cpp

-         generateConditionalBranchInstruction(cg, TR::InstOpCode::bnun, node, nan_label, condReg);
-         // Move the NaN which is in one of trgReg or src2Reg to trgReg by fadd
-         generateTrg1Src2Instruction(cg, TR::InstOpCode::fadd, node, trgReg, trgReg, src2Reg);
+         // Go to the nan_lavel for NaN


Small typo: nan_lavel should be nan_label.

rmnattas · 2025-01-22T19:37:38Z

compiler/p/codegen/ControlFlowEvaluator.cpp

+         generateTrg1Src2Instruction(cg,
+            max ? TR::InstOpCode::xsmaxdp : TR::InstOpCode::xsmindp,
+            node,
+            trgReg, src1Reg, src2Reg);


Instruction generation calls are usually written on one line, or on a few lines if needed.

rmnattas

LGTM

luke-li-2003 · 2025-01-23T15:55:46Z

@hzongaro can you review and merge this?

luke-li-2003 · 2025-01-24T16:51:20Z

Performance test for jdk11 SIMDDoubleMaxMinBench

Branch	Score
Baseline	14180465.000000
FastPathMathMinMaxFD	15760804.000000

zl-wang · 2025-01-27T19:27:34Z

compiler/p/codegen/ControlFlowEvaluator.cpp

-         generateConditionalBranchInstruction(cg, TR::InstOpCode::bnun, node, nan_label, condReg);
-         // Move the NaN which is in one of trgReg or src2Reg to trgReg by fadd
-         generateTrg1Src2Instruction(cg, TR::InstOpCode::fadd, node, trgReg, trgReg, src2Reg);
+         // Go to the nan_label for NaN


This will run into functional bugs that are difficult to debug sooner or later: since src1Reg is now used in the control-flow region, you will hit register allocation bugs under high register pressure if it is not added to the dependency conditions.

luke-li-2003

I added a condition; is that correct?

zl-wang

LGTM

zl-wang · 2025-01-27T20:37:42Z

compiler/p/codegen/ControlFlowEvaluator.cpp

      dep->addPostCondition(condReg, TR::RealRegister::NoReg);
      dep->addPostCondition(trgReg, TR::RealRegister::NoReg);
+      dep->addPostCondition(src1Reg, TR::RealRegister::NoReg);
      dep->addPostCondition(src2Reg, TR::RealRegister::NoReg);


I guess it is acceptable to be a little conservative: adding src1Reg to the dependency conditions is not necessary on the integer side, but it is not wrong either.

zl-wang · 2025-01-29T13:37:31Z

compiler/p/codegen/ControlFlowEvaluator.cpp

      dep->addPostCondition(condReg, TR::RealRegister::NoReg);
      dep->addPostCondition(trgReg, TR::RealRegister::NoReg);
+      dep->addPostCondition(src1Reg, TR::RealRegister::NoReg);


I missed this bug previously: trgReg can be the same register as src1Reg in some cases, and the register allocator can get into an endless loop when the same register appears two or more times in a dependency condition.
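One way to guard against listing the same register twice is to skip a register that is already present before adding it. The sketch below is a self-contained toy model and not the OMR API: Register, DepConditions and addPostConditionOnce are hypothetical stand-ins for illustration, and this is not necessarily how the PR resolved the issue.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a virtual register; NOT the OMR TR::Register class.
struct Register
{
    int id;
};

// Hypothetical stand-in for a set of post-dependency conditions.
class DepConditions
{
public:
    // Adds reg as a post-condition unless the same register object is
    // already listed. Returns true when the register was actually added.
    bool addPostConditionOnce(Register *reg)
    {
        if (std::find(_post.begin(), _post.end(), reg) != _post.end())
            return false; // duplicate: adding it again would list it twice
        _post.push_back(reg);
        return true;
    }

    std::size_t size() const { return _post.size(); }

private:
    std::vector<Register *> _post;
};
```

With trgReg and src1Reg pointing at the same register, the second call returns false and the register appears only once in the list. In the real evaluator, an equivalent guard would be a check such as comparing src1Reg with trgReg before calling addPostCondition.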

Re-enable the fast pathing of Math.min/max for floating points with the behaviour of +/-0.0 and NaN values correctly addressed. Signed-off-by: Luke Li <[email protected]>

luke-li-2003 · 2025-02-04T19:02:29Z

@hzongaro the PR is ready now, can you review and merge this?

luke-li-2003 mentioned this pull request Jan 22, 2025

Fast Path Math.min/max_F/D eclipse-openj9/openj9#20999

Open

rmnattas suggested changes Jan 22, 2025

luke-li-2003 force-pushed the FastPathMathMinMaxFD branch 2 times, most recently from 2571b9d to d6e3454 Compare January 22, 2025 19:49

rmnattas approved these changes Jan 22, 2025

hzongaro added comp:compiler arch:power labels Jan 23, 2025

zl-wang suggested changes Jan 27, 2025

luke-li-2003 force-pushed the FastPathMathMinMaxFD branch 2 times, most recently from e9aa67f to d6b87f0 Compare January 27, 2025 19:59

zl-wang approved these changes Jan 27, 2025

zl-wang suggested changes Jan 29, 2025

Fast Path Math.min/max_F/D

3bae6ee

luke-li-2003 force-pushed the FastPathMathMinMaxFD branch from d6b87f0 to 3bae6ee Compare January 29, 2025 16:12

zl-wang approved these changes Jan 29, 2025

hzongaro self-assigned this Feb 4, 2025
