Commits · 05da2fe52162c80dfa18aedf70cf73cb11201811 · Lorenzo Albano / LLVM bpEVL

Nov 14, 2019

Sink all InitializePasses.h includes · 05da2fe5

Reid Kleckner authored Nov 13, 2019

This file lists every pass in LLVM, and is included by Pass.h, which is
very popular. Every time we add, remove, or rename a pass in LLVM, it
causes lots of recompilation.

I found this fact by looking at this table, which is sorted by the
number of times a file was changed over the last 100,000 git commits
multiplied by the number of object files that depend on it in the
current checkout:
  recompiles    touches affected_files  header
  342380        95      3604    llvm/include/llvm/ADT/STLExtras.h
  314730        234     1345    llvm/include/llvm/InitializePasses.h
  307036        118     2602    llvm/include/llvm/ADT/APInt.h
  213049        59      3611    llvm/include/llvm/Support/MathExtras.h
  170422        47      3626    llvm/include/llvm/Support/Compiler.h
  162225        45      3605    llvm/include/llvm/ADT/Optional.h
  158319        63      2513    llvm/include/llvm/ADT/Triple.h
  140322        39      3598    llvm/include/llvm/ADT/StringRef.h
  137647        59      2333    llvm/include/llvm/Support/Error.h
  131619        73      1803    llvm/include/llvm/Support/FileSystem.h

Before this change, touching InitializePasses.h would cause 1345 files
to recompile. After this change, touching it only causes 550 compiles in
an incremental rebuild.
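
The ranking metric described above (touches multiplied by affected files) can be sketched as follows. This is only an illustration of the arithmetic, not the script used to produce the table; the `HeaderStats` struct and the function names are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// One row of the table. The struct and its field names are hypothetical.
struct HeaderStats {
  std::string Header;
  unsigned Touches;       // times the header changed in the commit window
  unsigned AffectedFiles; // object files that depend on the header
};

// Estimated rebuild cost: each touch recompiles every dependent object file.
std::uint64_t recompiles(const HeaderStats &S) {
  return static_cast<std::uint64_t>(S.Touches) * S.AffectedFiles;
}

// Return the rows sorted so that the most expensive headers come first.
std::vector<HeaderStats> rankByRecompiles(std::vector<HeaderStats> Rows) {
  std::sort(Rows.begin(), Rows.end(),
            [](const HeaderStats &A, const HeaderStats &B) {
              return recompiles(A) > recompiles(B);
            });
  return Rows;
}

// Check the metric against two rows taken from the table.
void checkTableRows() {
  assert(recompiles({"llvm/include/llvm/ADT/STLExtras.h", 95, 3604}) == 342380u);
  assert(recompiles({"llvm/include/llvm/InitializePasses.h", 234, 1345}) == 314730u);
}
```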

Reviewers: bkramer, asbirlea, bollu, jdoerfert

Differential Revision: https://reviews.llvm.org/D70211

05da2fe5

Forward declare Optional<T> in STLExtras.h · a36f3163
Reid Kleckner authored Nov 13, 2019
```
WIP stats
```
a36f3163
[AMDGPU] Fixed mfma-loop test. NFC. · af7d4022
Stanislav Mekhanoshin authored Nov 13, 2019

af7d4022

Sink MachineFunction private method out of line · 364d1785

Reid Kleckner authored Nov 13, 2019

This method is private and only called from this file and doesn't need
to be inline. Saves a TargetMachine.h include in MachineFunction.h, a
popular header. The include was introduced in 98603a81 despite the
forward decl of LLVMTargetMachine.
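
A minimal single-file sketch of the general pattern, assuming hypothetical classes `FunctionInfo` and `TargetInfo` rather than the real MachineFunction and TargetMachine code: the header part only forward-declares the other class, and the private helper that needs the full definition is defined out of line.

```cpp
#include <cassert>

// --- "FunctionInfo.h" (hypothetical popular header) ---
class TargetInfo; // forward declaration; "TargetInfo.h" is not included here

class FunctionInfo {
public:
  explicit FunctionInfo(const TargetInfo &T);
  unsigned alignment() const { return Alignment; }

private:
  void init(); // private helper: declared here, defined out of line
  const TargetInfo &TI;
  unsigned Alignment = 0;
};

// --- "TargetInfo.h" (hypothetical heavy header) ---
class TargetInfo {
public:
  explicit TargetInfo(unsigned A) : StackAlign(A) {}
  unsigned stackAlignment() const { return StackAlign; }

private:
  unsigned StackAlign;
};

// --- "FunctionInfo.cpp": the only place that needs the full TargetInfo ---
FunctionInfo::FunctionInfo(const TargetInfo &T) : TI(T) { init(); }
void FunctionInfo::init() { Alignment = TI.stackAlignment(); }

// Usage: callers of the header never see TargetInfo's definition through it.
void example() {
  TargetInfo T(16);
  FunctionInfo F(T);
  assert(F.alignment() == 16);
}
```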

364d1785

[X86] Don't treat mxcsr as a register name when parsing MS inline assembly. · 188d92b9
Craig Topper authored Nov 13, 2019
```
No instruction takes mxcsr as an operand so we should always
treat it as an identifier name.
```
188d92b9

Nov 13, 2019

[X86] Don't set the operation action for i16 SINT_TO_FP to Promote just because SSE1 is enabled. · f7e9d81a
Craig Topper authored Nov 13, 2019
```
Instead do custom promotion in the handler so that we can still
allow i16 to be used with fp80, and with f64 when SSE2 is unavailable.
```
f7e9d81a
[X86] Fix typo in comment. NFC · 787595b2
Craig Topper authored Nov 13, 2019

787595b2

[X86] Move all the FP_TO_XINT/XINT_TO_FP setOperationActions into the same... · fee90672

Craig Topper authored Nov 13, 2019

[X86] Move all the FP_TO_XINT/XINT_TO_FP setOperationActions into the same !useSoftFloat block. Qualify all of the Promote actions for these with !useSoftFloat too. NFCI

The Promote action doesn't apply until LegalizeDAG. By the time
we get there, we would have already softened all the FP operations
if useSoftFloat was true. So there wouldn't be any operation left
to Promote.

fee90672

[PGO][PGSO] Temporarily disable the large working set size behavior. · 3f0969da

Hiroshi Yamauchi authored Nov 13, 2019

Summary:
This temporarily disables the large working set size behavior in profile guided
size optimization due to internal benchmark regressions.

Reviewers: davidxl

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D70207

3f0969da

[SimplifyCFG] add test for select with FMF; NFC · be08af88
Sanjay Patel authored Nov 13, 2019

be08af88

[SLP] fix miscompile on min/max reductions with extra uses (PR43948) · a3e61946

Sanjay Patel authored Nov 13, 2019

The bug manifests as replacing a reduction operand with an undef
value.

The problem appears to be limited to cases where a min/max reduction
has extra uses of the compare operand to the select.

In the general case, we are tracking "ExternallyUsedValues" and
an "IgnoreList" of the reduction operations, but those may not apply
to the final compare+select in a min/max reduction.

For that, we use replaceAllUsesWith (RAUW) to ensure that the new
vectorized reduction values are transferred to all subsequent users.

Differential Revision: https://reviews.llvm.org/D70148

a3e61946

Add -disable-builtin option to opt · 597b77fb

Dimitry Andric authored Nov 13, 2019

Summary:
The option allows disabling specific target library builtin functions,
unlike -disable-simplify-libcalls, which disables all of them.

This is a prerequisite for D70143, which fixes PR43081.

Reviewers: xbolva00, spatel, jdoerfert, efriedma

Reviewed By: efriedma

Subscribers: llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D70193

597b77fb

[dsymutil] Add -dump to llvm-bcanalyzer invocations · 3dfe4cf9
Francis Visoiu Mistrih authored Nov 13, 2019

3dfe4cf9

[TargetLowering] Increase the storage size of NumRegistersForVT to allow the... · 84e83b54

Craig Topper authored Nov 13, 2019

[TargetLowering] Increase the storage size of NumRegistersForVT to allow the type break down for v256i1 and other types to be stored correctly

v256i1 on X86 without avx512 breaks down to 256 i8 values when passed between basic blocks. But the NumRegistersForVT was sized at a byte for each VT. This results in 256 being stored as 0.

This patch enlarges the type to 16 bits and adds an assert to ensure that no information is lost when the entry is stored.
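
The truncation can be reproduced in isolation. The sketch below is not the TargetLowering code; `storeNarrow` and `storeWide` are hypothetical helpers that only model an 8-bit entry versus a 16-bit entry guarded by an assert.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a byte-sized table entry.
// Converting to uint8_t reduces the value modulo 256, so 256 becomes 0.
std::uint8_t storeNarrow(unsigned NumRegs) {
  return static_cast<std::uint8_t>(NumRegs);
}

// Hypothetical widened entry: 16 bits, plus a check that nothing was lost.
std::uint16_t storeWide(unsigned NumRegs) {
  std::uint16_t Stored = static_cast<std::uint16_t>(NumRegs);
  assert(Stored == NumRegs && "register count does not fit in 16 bits");
  return Stored;
}

void example() {
  assert(storeNarrow(256) == 0);  // the information is silently lost
  assert(storeWide(256) == 256);  // the wider entry keeps the value
}
```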

Differential Revision: https://reviews.llvm.org/D70138

84e83b54

[mips] Reduce number of nested `if` statements. NFC · 63bbbcde
Simon Atanasyan authored Nov 12, 2019

63bbbcde
[mips] Add test to check ELF output for JAL XGOT expansion. NFC · 14d31622
Simon Atanasyan authored Nov 10, 2019

14d31622
[mips] Add tests to check `jal sym+offset`. NFC · 3216d284
Simon Atanasyan authored Nov 09, 2019

3216d284

[LiveInterval] Allow updating subranges with slightly out-dated IR · de94cda8

Quentin Colombet authored Nov 12, 2019

During register coalescing, we update the live-intervals on-the-fly.
To do that we are in this strange mode where the live-intervals can
be slightly out-of-sync (more precisely they are forward looking)
compared to what the IR actually represents.
This happens because the register coalescer only updates the IR when
it is done with updating the live-intervals. It has to do it this
way because updating the IR on-the-fly would actually clobber some
information about what the live-ranges that are being updated look like.

This is problematic for updates that rely on the IR to accurately
represent the state of the live-ranges. Right now, we have only
one of those: stripValuesNotDefiningMask.
To reconcile this need for out-of-sync IR, this patch introduces a
new argument to LiveInterval::refineSubRanges that allows the code
doing the live range updates to reason about what the code should
look like after the coalescer has rewritten the registers.
Essentially this captures how a subregister index will be offset
to match its position in a new register class.

E.g., let's say we want to merge:
    V1.sub1:<2 x s32> = COPY V2.sub3:<4 x s32>

We do that by choosing a class where sub1:<2 x s32> and sub3:<4 x s32>
overlap, i.e., by choosing a class where we can find "offset + 1 == 3".
Put differently we align V2's sub3 with V1's sub1:
    V2: sub0 sub1 sub2 sub3
    V1: <offset>  sub0 sub1

This offset will look like a composed subregidx in the class:
     V1.(composed sub2 with sub1):<4 x s32> = COPY V2.sub3:<4 x s32>
 =>  V1.(composed sub2 with sub1):<4 x s32> = COPY V2.sub3:<4 x s32>

Now if we didn't rewrite the uses and def of V1, all the checks for V1
need to account for this offset to match what the live intervals intend
to capture.
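
As a rough illustration of the "offset + 1 == 3" alignment above, here is a simplified model in which subregister index N simply names lane N. That is an assumption made for this sketch; real subregister indices are target-defined, and `alignmentOffset` and `composeWithOffset` are hypothetical helpers, not LLVM APIs.

```cpp
#include <cassert>
#include <optional>

// Simplified model (assumption): subregister index N names lane N.
// Find the offset that places lane DstIdx of the narrow register on top of
// lane SrcIdx of the wide register, or nullopt if no such placement exists.
std::optional<unsigned> alignmentOffset(unsigned DstIdx, unsigned DstLanes,
                                        unsigned SrcIdx, unsigned SrcLanes) {
  if (DstIdx >= DstLanes || SrcIdx >= SrcLanes)
    return std::nullopt; // index outside its register
  if (SrcIdx < DstIdx)
    return std::nullopt; // would require a negative offset
  unsigned Offset = SrcIdx - DstIdx;
  if (Offset + DstLanes > SrcLanes)
    return std::nullopt; // narrow register would not fit in the wide one
  return Offset;
}

// In this model, composing the offset with an index is a plain addition.
unsigned composeWithOffset(unsigned Offset, unsigned Idx) {
  return Offset + Idx;
}

void example() {
  // V1.sub1:<2 x s32> = COPY V2.sub3:<4 x s32>
  std::optional<unsigned> Offset = alignmentOffset(1, 2, 3, 4);
  assert(Offset.has_value() && *Offset == 2);
  assert(composeWithOffset(*Offset, 0) == 2); // V1.sub0 lines up with V2.sub2
  assert(composeWithOffset(*Offset, 1) == 3); // V1.sub1 lines up with V2.sub3
}
```

Under this model, every check on V1's uses and defs has to go through `composeWithOffset` until the registers are actually rewritten.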

Prior to this patch, we would fail to recognize the uses and def of V1
and would end up with machine verifier errors: No live segment at def.
This could lead to miscompiles, as we would drop some live-ranges and
thus miss some interferences.

For this problem to trigger, we need to reach stripValuesNotDefiningMask
while having a mismatch between the IR and the live-ranges (i.e.,
we have to apply a subreg offset to the IR.)

This requires the following three conditions:
1. An update of overlapping subreg lanes: e.g., dsub0 == <ssub0, ssub1>
2. An update with Tuple registers with a possibility to coalesce the
   subreg index: e.g., v1.dsub_1 == v2.dsub_3
3. Subreg liveness enabled.

looking at the IR to decide what is alive and what is not, i.e., calling
stripValuesNotDefiningMask.
coalescer maintains for the live-ranges information.

None of the targets that currently use subreg liveness (i.e., the targets
that fulfill #3, Hexagon, AMDGPU, PowerPC, and SystemZ IIRC) expose #1
and #2, so this patch also artificially enables subreg liveness for ARM,
so that a nice test case can be attached.

de94cda8

[TTI] Fix cast cost on vector types. · 2bf9b9a5
Michael Liao authored Nov 13, 2019
```
- Only split vector types when both src and dst types are splittable.
```
2bf9b9a5

[llvm-bcanalyzer] Don't dump the contents if -dump is not passed · 1ca85b3d

Francis Visoiu Mistrih authored Nov 13, 2019

With all the previous refactorings this slipped through and now we
always dump the contents of the bitcode files, even if -dump is not
passed.

1ca85b3d

[AArch64][v8.3a] Add missing imp-defs on RETA*. · 7313d7d6

Ahmed Bougacha authored Jun 25, 2019

RETA always implicitly uses LR, unlike RET which merely has an
alias that defaults it to LR.
Additionally, RETA implicitly uses SP as well, which it uses as
a discriminator to authenticate LR.

This isn't usually noticeable, because RET_ReallyLR is used in most
of the backend.  However, the post-RA scheduler, if enabled, will
cause miscompiles if the imp-uses are missing.

While there, fix a typo in the lone affected testcase.

7313d7d6

[AArch64][v8.3a] Add LDRA '[xN]!' alias. · 643ac6c0

Ahmed Bougacha authored Jun 25, 2019

The instruction definition has been retroactively expanded to
allow for an alias for '[xN, 0]!' as '[xN]!'.
That wouldn't make sense on LDR, but does for LDRA.

643ac6c0

[SLP] improve test readability; NFC · 142cbe73
Sanjay Patel authored Nov 13, 2019

142cbe73
Fix typo in DwarfDebug [NFC] · 7417cc14
David Stenberg authored Nov 13, 2019

7417cc14

Don't set LLVM_NO_DEAD_STRIP on AIX · 8b2b2c08

David Tenty authored Nov 13, 2019

Summary:
when building plugins, as AIX has symbols in its standard library that
must be garbage collected or we will see link errors. Export lists will
handle this instead on AIX.

Reviewers: stevewan, sfertile, jasonliu, xingxue, DiggerLin

Reviewed By: DiggerLin

Subscribers: mgorny, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D70130

8b2b2c08

[SLP] reduce code duplication for min/max vs. other reductions; NFCI · e9bf7a60
Sanjay Patel authored Nov 13, 2019

e9bf7a60
Fix comment spelling {addresing -> addressing} (NFC) · e5f3760e
Matthew Malcomson authored Nov 13, 2019

e5f3760e

[InstCombine] propagate fast-math-flags (FMF) to select when inverting fcmp+select · 3d6b5398

Sanjay Patel authored Nov 13, 2019

As noted by the FIXME comment, this is not correct based on our current FMF semantics.
We should be propagating FMF from the final value in a sequence (in this case the
'select'). So the behavior even without this patch is wrong, but we did not allow FMF
on 'select' until recently.

But if we do the correct thing right now in this patch, we'll inevitably introduce
regressions because we have not wired up FMF propagation for 'phi' and 'select' in
other passes (like SimplifyCFG) or other places in InstCombine. I'm not seeing a
better incremental way to make progress.

That said, the potential extra damage over the existing wrong behavior from this
patch is very limited. AFAIK, the only way to have different FMF on IR in the same
function is if we have LTO inlined IR from 2 modules that were compiled using
different fast-math settings.

As seen in the tests, we may actually see some improvements with this patch because
adding the FMF to the 'select' allows matching to min/max intrinsics that were
previously missed (in the common case, the 'fcmp' and 'select' should have identical
FMF to begin with).

Next steps in the transition:

1. Make similar changes in instcombine as needed.
2. Enable phi-to-select FMF propagation in SimplifyCFG.
3. Remove dependencies on fcmp with FMF.
4. Deprecate FMF on fcmp.

Differential Revision: https://reviews.llvm.org/D69720

3d6b5398

DWARFDebugLoclists: Add an api to get the location lists of a DWARF unit · 1eea3fa0

Pavel Labath authored Nov 08, 2019

Summary:
This avoids the need to duplicate the location list searching logic in
various users. The "inline location list dumping" code (which is the
only user actually updated to handle DWARF v5 location lists) is
switched to this method. After adding v4 location list support, I'll
switch other users too.

Reviewers: dblaikie, probinson, JDevlieghere, aprantl, SouraVX

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D70084

1eea3fa0

Remove commented out CHECK-NEXT to try and appease llvm-clang-x86_64-expensive-checks-win buildbot · e84b7a5f
Simon Pilgrim authored Nov 13, 2019

e84b7a5f
PowerPC - fix uninitialized variable warnings. NFCI. · 86f07e82
Simon Pilgrim authored Nov 13, 2019

86f07e82
Fix uninitialized variable warning. NFCI. · e1670175
Simon Pilgrim authored Nov 13, 2019

e1670175
Fix uninitialized variable warning. NFCI. · 29a5a6ee
Simon Pilgrim authored Nov 13, 2019

29a5a6ee
Fix uninitialized variable warning. NFCI. · 6ebc5089
Simon Pilgrim authored Nov 13, 2019

6ebc5089
Sparc - fix uninitialized variable warnings. NFCI. · b3be859b
Simon Pilgrim authored Nov 13, 2019

b3be859b
PPCReduceCRLogicals - fix static analyzer warnings. NFC · 66f2ed07
Simon Pilgrim authored Nov 13, 2019
```
- Fix uninitialized variable warnings.
- Fix null dereference warnings.
```
66f2ed07
SLPVectorizer - make comparison operators + isInSchedulingRegion const · d1bd5e47
Simon Pilgrim authored Nov 13, 2019
```
Fixes cppcheck warnings.
```
d1bd5e47

[InstCombine] Avoid moving ops that do restrict undef across shuffles. · f7499011

Florian Hahn authored Nov 13, 2019

I think we have to be a bit more careful when it comes to moving
ops across shuffles, if the op does restrict undef. For example, without
this patch, we would move 'and %v, <0, 0, -1, -1>' over a
'shufflevector %a, undef, <undef, undef, 1, 2>'. As a result, the first
2 lanes of the result are undef after the combine, but they really
should be 0, unless I am missing something.

For ops that do fold to undef on undef operands, the current behavior
should be fine. I've added a conservative check, OpDoesRestrictUndef; maybe
there's a better existing utility?
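
The lane-level effect of the example above can be modeled outside of LLVM. In this sketch an empty `std::optional` stands for an undef lane, and the constant used after moving the 'and' (`<undef, -1, -1, undef>`) is an assumption about what the combine would produce; all names are hypothetical and this is not the InstCombine code.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>

using Lane = std::optional<std::int32_t>; // std::nullopt models an undef lane
using Vec4 = std::array<Lane, 4>;
using Mask4 = std::array<std::optional<unsigned>, 4>; // nullopt: undef index

// Lane-wise 'and': anything and'ed with 0 is 0, even undef. Otherwise an
// undef operand conservatively keeps the result undef in this model.
Lane andLane(Lane A, Lane B) {
  if ((A && *A == 0) || (B && *B == 0))
    return 0;
  if (!A || !B)
    return std::nullopt;
  return *A & *B;
}

Vec4 andVec(const Vec4 &A, const Vec4 &B) {
  Vec4 R;
  for (std::size_t I = 0; I < 4; ++I)
    R[I] = andLane(A[I], B[I]);
  return R;
}

// Single-source shuffle: an undef mask element yields an undef lane.
Vec4 shuffle(const Vec4 &Src, const Mask4 &Mask) {
  Vec4 R;
  for (std::size_t I = 0; I < 4; ++I)
    R[I] = Mask[I] ? Src[*Mask[I]] : Lane();
  return R;
}

const Mask4 ShuffleMask = {std::nullopt, std::nullopt, 1u, 2u};

// Original order: shuffle first, then 'and' with <0, 0, -1, -1>.
Vec4 shuffleThenAnd(const Vec4 &A) {
  const Vec4 C = {Lane(0), Lane(0), Lane(-1), Lane(-1)};
  return andVec(shuffle(A, ShuffleMask), C);
}

// After moving the 'and' across the shuffle (assumed constant, see above).
Vec4 andThenShuffle(const Vec4 &A) {
  const Vec4 NewC = {Lane(), Lane(-1), Lane(-1), Lane()};
  return shuffle(andVec(A, NewC), ShuffleMask);
}

void example() {
  const Vec4 A = {Lane(10), Lane(11), Lane(12), Lane(13)};
  Vec4 Before = shuffleThenAnd(A);
  Vec4 After = andThenShuffle(A);
  assert(Before[0] && *Before[0] == 0); // lane 0 is a known 0
  assert(!After[0]);                    // lane 0 became undef
  assert(Before[2] == After[2] && Before[3] == After[3]);
}
```

In this model the first two lanes are 0 in the original order and undef after the move, which matches the concern described above.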

Reviewers: spatel, RKSimon, lebedev.ri

Reviewed By: spatel

Differential Revision: https://reviews.llvm.org/D70093

f7499011

Revert "[RISCV] Fix wrong CFI directives" · c5b56caa
Luís Marques authored Nov 13, 2019
```
test/DebugInfo/RISCV/relax-debug-frame.ll wasn't properly updated.
```
c5b56caa
[InstCombine] Precommit shuffle tests for D70093. · 70cc355f
Florian Hahn authored Nov 11, 2019

70cc355f