Commit c82e7fd1 authored Feb 10, 2020 by Rafael Auler Committed by Maksim Panchenko Feb 10, 2020

[BOLT] Decoder cache friendly alignment wrt Intel JCC Erratum

Summary:
This diff ports reviews.llvm.org/D70157 to our LLVM tree, which
makes the integrated assembler able to align X86 control-flow changing
instructions in a way to reduce the performance impact of the ucode
update on Intel processors that implement the JCC erratum mitigation.

See white paper "Mitigations for Jump Conditional Code Erratum" by Intel
published November 2019.

To port this patch, I changed classifySecondInstInMacroFusion to analyze
instruction opcodes directly instead of analyzing the CondCond operand
(in more recent versions of LLVM, all conditional branches share the
same opcode, but with a different conditional operand). I also pulled to
our tree Alignment.h as a dependency, and the macroop analyzing helpers.

x86-align-branch-boundary and -x86-align-branch are the two flags that
control nop insertion to avoid disabling the decoder cache, following
the original patch. In BOLT, I added the flag
x86-align-branch-boundary-hot-only to request the alignment to only be
applied to hot code, which is turned on by default. The reason is
because such alignment  is expensive to perform on large modules, but if
we limit it to hot code, the relaxation pass runtime becomes tolerable.

(cherry picked from FBD19828850)

parent d5b8fc8f

Show whitespace changes

Inline Side-by-side

Please to comment