  1. Sep 09, 2019
    • [IfConversion] Correctly handle cases where analyzeBranch fails. · 79f0d3a6
      Eli Friedman authored
      If analyzeBranch fails, on some targets, the out parameters point to
      some blocks in the function. But we can't use that information, so make
      sure to clear it out.  (In some places in IfConversion, we assume that
      any block with a TrueBB is analyzable.)
      
      The change to the testcase makes it trigger a bug on builds without this
      fix: IfConvertDiamond tries to perform a followup "merge" operation,
      which isn't legal, and we somehow end up with a branch to a deleted MBB.
      I'm not sure how this doesn't crash the compiler.
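
      A minimal sketch of the pattern this enforces (hypothetical code, not the
      patch itself), assuming the usual analyzeBranch contract where a true
      return value means the branch could not be analyzed:

        void analyzeBranchOrClear(const TargetInstrInfo *TII,
                                  MachineBasicBlock &MBB,
                                  MachineBasicBlock *&TBB,
                                  MachineBasicBlock *&FBB,
                                  SmallVectorImpl<MachineOperand> &Cond) {
          if (TII->analyzeBranch(MBB, TBB, FBB, Cond)) {
            // On failure, some targets leave TBB/FBB pointing at real blocks.
            // A non-null TrueBB is taken to mean "analyzable", so clear the
            // out parameters before anything reads them.
            TBB = nullptr;
            FBB = nullptr;
            Cond.clear();
          }
        }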
      
      Differential Revision: https://reviews.llvm.org/D67306
      
      llvm-svn: 371434
    • [ARM][ParallelDSP] Fix for sext input · c363deb5
      Sam Parker authored
      The incoming accumulator value can be discovered through a sext, in
      which case there will be a mismatch between the input and the result.
      So sign-extend the accumulator input if we're performing a 64-bit mac.
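
      A hedged sketch of the described fix (Acc, Is64BitMac, and InsertPt are
      illustrative names, not the patch's API):

        IRBuilder<> Builder(InsertPt);
        // If we're building a 64-bit mac but the discovered accumulator is
        // only 32 bits wide, sign-extend it so input and result types match.
        if (Is64BitMac && Acc->getType()->isIntegerTy(32))
          Acc = Builder.CreateSExt(Acc, Builder.getInt64Ty());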
      
      Differential Revision: https://reviews.llvm.org/D67220
      
      llvm-svn: 371370
  2. Sep 05, 2019
    • [IfConversion] Fix diamond conversion with unanalyzable branches. · cae1e47f
      Eli Friedman authored
      The code was incorrectly counting the number of identical instructions,
      and therefore tried to predicate an instruction which should not have
      been predicated.  This could have various effects: a compiler crash,
      an assembler failure, a miscompile, or just generating an extra,
      unnecessary instruction.
      
      Instead of depending on TargetInstrInfo::removeBranch, which only
      works on analyzable branches, just remove all branch instructions.
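
      A paraphrased sketch of the approach (not the verbatim patch): erase every
      branch among the block's terminators instead of calling removeBranch:

        for (auto I = MBB.terminators().begin(), E = MBB.terminators().end();
             I != E;) {
          MachineInstr &MI = *I++; // advance before erasing
          if (MI.isBranch())
            MI.eraseFromParent();
        }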
      
      Fixes https://bugs.llvm.org/show_bug.cgi?id=43121 and
      https://bugs.llvm.org/show_bug.cgi?id=41121 .
      
      Differential Revision: https://reviews.llvm.org/D67203
      
      llvm-svn: 371111
    • [LLVM][Alignment] Make functions using log of alignment explicit · aff45e4b
      Guillaume Chatelet authored
      Summary:
      This patch renames functions that take or return alignment as log2, which will help with the transition to llvm::Align.
      The renaming makes it explicit that we deal with log(alignment) instead of a power-of-two alignment (see the illustrative snippet after the list).
      A few renames uncovered dubious assignments:
      
       - `MirParser`/`MirPrinter` were expecting powers of two, but `MachineFunction` and `MachineBasicBlock` were using log2(align). This patch fixes it and updates the documentation.
       - `MachineBlockPlacement` exposes two flags (`align-all-blocks` and `align-all-nofallthru-blocks`) supposedly interpreted as power-of-two alignments; internally these values are interpreted as log2(align). This patch updates the documentation.
       - `MachineFunction` exposes `align-all-functions`, also interpreted as a power-of-two alignment; internally this value is interpreted as log2(align). This patch updates the documentation.
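
      An illustrative snippet (not from the patch) of the ambiguity the renaming
      removes: a bare 4 means 2^4 = 16 bytes in the log2 form but 4 bytes in the
      power-of-two form, so names should state which form they use.

        #include <cstdint>

        // With an explicit name there is no doubt which form a value is in:
        // alignmentFromLog2(4) == 16, whereas a plain setAlignment(4) could
        // be read as "4 bytes" by callers of a log2-based API.
        uint32_t alignmentFromLog2(uint32_t logAlign) { return 1u << logAlign; }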
      
      Reviewers: lattner, thegameg, courbet
      
      Subscribers: dschuff, arsenm, jyknight, dylanmckay, sdardis, nemanjai, jvesely, nhaehnle, javed.absar, hiraditya, kbarton, fedor.sergeev, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, dexonsmith, PkmX, jocewei, jsji, Jim, s.egerton, llvm-commits, courbet
      
      Tags: #llvm
      
      Differential Revision: https://reviews.llvm.org/D65945
      
      llvm-svn: 371045
  3. Sep 03, 2019
    • Bug fix on function epilog optimization (ARM backend) · 39bf484d
      Oliver Stannard authored
      To save an 'add sp,#val' instruction by adding registers to the final pop instruction,
      the first register transferred by this pop instruction needs to be found.
      If the function to be optimized has a non-void return value, the operand list contains
      r0 (implicit), which prevents the optimization from taking place.
      Therefore implicit register references should be skipped in the search loop,
      because these registers are never popped from the stack.
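
      A hypothetical sketch of the fixed search loop (PopInst and FirstRegPopped
      are illustrative names):

        unsigned FirstRegPopped = 0;
        for (const MachineOperand &MO : PopInst.operands()) {
          // Implicit operands (such as the implicit r0 use modelling the
          // return value) are never popped from the stack, so skip them.
          if (!MO.isReg() || MO.isImplicit())
            continue;
          FirstRegPopped = MO.getReg();
          break;
        }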
      
      Patch by Rainer Herbertz (rOptimizer)!
      
      Differential revision: https://reviews.llvm.org/D66730
      
      llvm-svn: 370728
  4. Aug 29, 2019
    • Revert [MBP] Disable aggressive loop rotate in plain mode · f9f81289
      Jordan Rupprecht authored
      This reverts r369664 (git commit 51f48295)
      
      It causes many benchmark regressions, internally and in llvm's benchmark suite.
      
      llvm-svn: 370398
    • [DebugInfo] LiveDebugValues: correctly discriminate kinds of variable locations · ca0e4b36
      Jeremy Morse authored
      The missing line added by this patch ensures that only spilt variable
      locations are candidates for being restored from the stack. Otherwise,
      register or constant-value information can be interpreted as a spill
      location, through a union.
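
      A paraphrased, self-contained sketch with hypothetical names showing why
      the added check matters: the union is only meaningful as a spill slot when
      the discriminator says so.

        #include <cstdint>

        struct VarLoc {
          enum LocKind { RegisterLoc, SpillLoc, ImmediateLoc } Kind;
          union { unsigned RegNo; int SpillSlot; int64_t Imm; } Loc;
        };

        // Only spilt locations are restore candidates; without the Kind check
        // a RegisterLoc's RegNo could be misread as a matching SpillSlot.
        bool isRestoreCandidate(const VarLoc &VL, int LoadedSlot) {
          return VL.Kind == VarLoc::SpillLoc && VL.Loc.SpillSlot == LoadedSlot;
        }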
      
      The added regression test replicates a scenario where this occurs: the
      stack load from [rsp] causes the register-location DBG_VALUE to be
      "restored" to rsi, when it should be left alone. See PR43058 for details.
      
      Un-XFAIL a test from a previous patch that was suffering from this bug.
      
      Differential Revision: https://reviews.llvm.org/D66895
      
      llvm-svn: 370334
    • [DebugInfo] LiveDebugValues should always revisit backedges if it skips them · 313d2ce9
      Jeremy Morse authored
      The "join" method in LiveDebugValues does not attempt to join unseen
      predecessor blocks if their out-locations aren't yet initialized; instead
      the block should be re-visited later to see if any locations have changed
      validity. However, because all of the blocks were being "process"'d once
      before "join" saw them, the logic in "join" was actually ignoring
      legitimate out-locations on the first pass through. This meant that some
      invalidated locations were not removed from the head of loops, allowing
      illegal locations to persist.
      
      Fix this by removing the run of "process" before the main join/process loop
      in ExtendRanges. Now the unseen predecessors that "join" skips truly are
      uninitialized, and we come back to the block at a later time to re-run
      "join"; see the @baz function added.
      
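      A self-contained sketch (stand-in types, not the pass's real code) of the
      resulting fixed-point driver: because nothing is "process"'d ahead of time,
      a "join" that skips an uninitialized predecessor is guaranteed to run again
      once that predecessor produces out-locations.

        #include <deque>
        #include <set>
        #include <vector>

        struct Block { std::vector<Block *> succs; };

        void solve(std::deque<Block *> worklist, bool (*join)(Block &),
                   bool (*process)(Block &)) {
          std::set<Block *> visited;
          while (!worklist.empty()) {
            Block *b = worklist.front();
            worklist.pop_front();
            bool first = visited.insert(b).second;
            // join() may ignore predecessors with uninitialized out-locations;
            // that is safe because any change re-queues this block below.
            if ((join(*b) || first) && process(*b))
              for (Block *s : b->succs)
                worklist.push_back(s); // includes loop backedges
          }
        }
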
      This also fixes another fault where stack/register transfers in the entry
      block (or any other block that precedes all loops) were initially ignored
      and then never revisited. The added MIR test checks for this behaviour.
      
      XFail a test that exposes another bug; a fix for this is coming in D66895.
      
      Differential Revision: https://reviews.llvm.org/D66663
      
      llvm-svn: 370328
  5. Aug 28, 2019
    • [ARM][ParallelDSP] Change search for muls · a761ba0f
      Sam Parker authored
      rL369567 reverted a couple of recent changes made to ARMParallelDSP
      because of a miscompilation error: PR43073.
      
      The issue stemmed from an underlying bug that was caused by adding
      muls into a reduction before it was proved that they could be executed
      in parallel with another mul.
      
      Most of the changes here are from the previously reverted commits.
      Additional changes have been made in the following areas (see the sketch
      after the list):
      1) The Search function now doesn't insert any muls into the Reduction
         object. That now happens once the search has successfully finished.
      2) For any muls added into the reduction but that weren't paired, we
         accumulate their values as an input into the smlad.
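
      A hedged sketch of the new flow (MulCandidate, Reduction, and the helper
      names are illustrative, not the pass's actual API):

        SmallVector<MulCandidate, 8> Candidates;
        if (!Search(BB, Candidates))
          return false; // nothing was inserted, so nothing needs undoing

        for (MulCandidate &M : Candidates) {
          if (M.Paired)
            Reduction.addMulPair(M); // proved parallel; becomes the smlad
          else
            // Unpaired muls feed their value into the accumulator input.
            Acc = Builder.CreateAdd(Acc, M.Root);
        }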
      
      Differential Revision: https://reviews.llvm.org/D66660
      
      llvm-svn: 370171
  6. Aug 13, 2019
    • GlobalISel: Add more verifier checks for G_SHUFFLE_VECTOR · 0a04a062
      Matt Arsenault authored
      llvm-svn: 368705
    • GlobalISel: Change representation of shuffle masks · 5af9cf04
      Matt Arsenault authored
      Currently shufflemasks get emitted like any other constant, and you end
      up with a bunch of virtual registers of G_CONSTANT with a
      G_BUILD_VECTOR. The AArch64 selector then asserts on anything that
      doesn't fit this pattern. This isn't an ideal representation; a better
      one should avoid legalization and have fewer opportunities for a
      representational error.
      
      Rather than invent a new shuffle mask operand type, similar to what
      ShuffleVectorSDNode does, just track the original IR Constant mask
      operand. I don't completely like the idea of adding another link to
      the IR, but MIR is already quite dependent on IR constants, and this
      will allow sharing the shuffle mask utility functions with the IR.
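
      A hedged sketch of what the shared representation enables (the operand
      index and surrounding setup are assumptions, not the patch's code):

        // The mask operand refers to the original IR Constant, so the IR
        // utility can decode it for MIR clients too.
        const Constant *Mask = MI.getOperand(3).getShuffleMask();
        SmallVector<int, 16> MaskElts;
        ShuffleVectorInst::getShuffleMask(Mask, MaskElts);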
      
      llvm-svn: 368704
    • Revert r368276 "[TargetLowering] SimplifyDemandedBits - call SimplifyMultipleUseDemandedBits for ISD::EXTRACT_VECTOR_ELT" · 5390d25f
      Hans Wennborg authored
      
      This introduced a false positive MemorySanitizer warning about use of
      uninitialized memory in a vectorized crc function in Chromium. That suggests
      maybe something is not right with this transformation. See
      https://crbug.com/992853#c7 for a reproducer.
      
      This also reverts the follow-up commits r368307 and r368308 which
      depended on this.
      
      > This patch attempts to peek through vectors based on the demanded bits/elt of a particular ISD::EXTRACT_VECTOR_ELT node, allowing us to avoid dependencies on ops that have no impact on the extract.
      >
      > In particular this helps remove some unnecessary scalar->vector->scalar patterns.
      >
      > The wasm shift patterns are annoying - @tlively has indicated that the wasm vector shift codegen is to be refactored in the near term and this isn't considered a major issue.
      >
      > Differential Revision: https://reviews.llvm.org/D65887
      
      llvm-svn: 368660
  7. Aug 12, 2019
    • Revert r368339 "[MBP] Disable aggressive loop rotate in plain mode" · a45f301f
      Hans Wennborg authored
      It caused assertions to fire when building Chromium:
      
        lib/CodeGen/LiveDebugValues.cpp:331: bool
        {anonymous}::LiveDebugValues::OpenRangesSet::empty() const: Assertion
        `Vars.empty() == VarLocs.empty() && "open ranges are inconsistent"' failed.
      
      See https://crbug.com/992871#c3 for how to reproduce.
      
      > Patch https://reviews.llvm.org/D43256 introduced a more aggressive loop layout optimization which depends on profile information. If profile information is not available, statically estimated profile information (generated by BranchProbabilityInfo.cpp) is used. If the user program doesn't behave as BranchProbabilityInfo.cpp expects, the layout may be worse.
      >
      > To be conservative, this patch restores the original layout algorithm in plain mode. But users can still try the aggressive layout optimization with -force-precise-rotation-cost=true.
      >
      > Differential Revision: https://reviews.llvm.org/D65673
      
      llvm-svn: 368579
  8. Aug 09, 2019
    • [globalisel] Add G_SEXT_INREG · e9a57c2b
      Daniel Sanders authored
      Summary:
      Targets often have instructions that can sign-extend certain cases faster
      than the equivalent shift-left/arithmetic-shift-right. Such cases can be
      identified by matching a shift-left/shift-right pair but there are some
      issues with this in the context of combines. For example, suppose you can
      sign-extend 8-bit up to 32-bit with a target extend instruction.
        %1:_(s32) = G_SHL %0:_(s32), i32 24 # (I've inlined the G_CONSTANT for brevity)
        %2:_(s32) = G_ASHR %1:_(s32), i32 24
        %3:_(s32) = G_ASHR %2:_(s32), i32 1
      would reasonably combine to:
        %1:_(s32) = G_SHL %0:_(s32), i32 24
        %2:_(s32) = G_ASHR %1:_(s32), i32 25
      which no longer matches the special case. If your shifts and extend are
      equal cost, this would break even as a pair of shifts but if your shift is
      more expensive than the extend then it's cheaper as:
        %2:_(s32) = G_SEXT_INREG %0:_(s32), i32 8
        %3:_(s32) = G_ASHR %2:_(s32), i32 1
      It's possible to match the shift-pair in ISel and emit an extend and ashr.
      However, this is far from the only way to break this shift pair and make
      it hard to match the extends. Another example is that with the right
      known-zeros, this:
        %1:_(s32) = G_SHL %0:_(s32), i32 24
        %2:_(s32) = G_ASHR %1:_(s32), i32 24
        %3:_(s32) = G_MUL %2:_(s32), i32 2
      can become:
        %1:_(s32) = G_SHL %0:_(s32), i32 24
        %2:_(s32) = G_ASHR %1:_(s32), i32 23
      
      All upstream targets have been configured to lower it to the current
      G_SHL,G_ASHR pair but will likely want to make it legal in some cases to
      handle their faster cases.
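
      A hedged sketch of that default lowering with MachineIRBuilder (exact
      helper names and surrounding variables are assumptions):

        // G_SEXT_INREG %src, B  ==>  ashr (shl %src, S), S,  S = width - B
        LLT Ty = MRI.getType(Src);
        unsigned ShAmt = Ty.getScalarSizeInBits() - B;
        auto Amt = MIRBuilder.buildConstant(Ty, ShAmt);
        auto Shl = MIRBuilder.buildShl(Ty, Src, Amt);
        MIRBuilder.buildAShr(Dst, Shl, Amt);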
      
      To follow-up: Provide a way to legalize based on the constant. At the
      moment, I'm thinking that the best way to achieve this is to provide the
      MI in LegalityQuery but that opens the door to breaking core principles
      of the legalizer (legality is not context sensitive). That said, it's
      worth noting that looking at other instructions and acting on that
      information doesn't violate this principle in itself. It's only a
      violation if, at the end of legalization, a pass that checks legality
      without being able to see the context would say an instruction might not be
      legal. That's a fairly subtle distinction so to give a concrete example,
      saying %2 in:
        %1 = G_CONSTANT 16
        %2 = G_SEXT_INREG %0, %1
      is legal is in violation of that principle if the legality of %2 depends
      on %1 being constant and/or being 16. However, legalizing to either:
        %2 = G_SEXT_INREG %0, 16
      or:
        %1 = G_CONSTANT 16
        %2:_(s32) = G_SHL %0, %1
        %3:_(s32) = G_ASHR %2, %1
      depending on whether %1 is constant and 16 does not violate that principle
      since both outputs are genuinely legal.
      
      Reviewers: bogner, aditya_nandakumar, volkan, aemerson, paquette, arsenm
      
      Subscribers: sdardis, jvesely, wdng, nhaehnle, rovka, kristof.beyls, javed.absar, hiraditya, jrtc27, atanasyan, Petar.Avramovic, llvm-commits
      
      Tags: #llvm
      
      Differential Revision: https://reviews.llvm.org/D61289
      
      llvm-svn: 368487
    • [ARM][ParallelDSP] Replace SExt uses · 0dba791a
      Sam Parker authored
      As loads are combined and widened, we replaced the operands of their
      sext users, whereas we should have been replacing the uses of the sext.
      I've added a load of tests; only a few of them originally caused
      assertion failures, and the rest improve pattern coverage.
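
      An illustrative sketch of the corrected direction (User and WideValue are
      assumptions, not the patch's variables):

        // Redirect the uses *of the sext* to the widened value, instead of
        // rewriting the operands of the sext itself.
        if (auto *SExt = dyn_cast<SExtInst>(User)) {
          SExt->replaceAllUsesWith(WideValue);
          SExt->eraseFromParent();
        }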
      
      Differential Revision: https://reviews.llvm.org/D65740
      
      llvm-svn: 368404
  9. Aug 08, 2019
    • [MBP] Disable aggressive loop rotate in plain mode · 80347c3a
      Guozhi Wei authored
      Patch https://reviews.llvm.org/D43256 introduced a more aggressive loop layout optimization which depends on profile information. If profile information is not available, statically estimated profile information (generated by BranchProbabilityInfo.cpp) is used. If the user program doesn't behave as BranchProbabilityInfo.cpp expects, the layout may be worse.
      
      To be conservative, this patch restores the original layout algorithm in plain mode. But users can still try the aggressive layout optimization with -force-precise-rotation-cost=true.
      
      Differential Revision: https://reviews.llvm.org/D65673
      
      llvm-svn: 368339
    • [TargetLowering] SimplifyDemandedBits - call SimplifyMultipleUseDemandedBits for ISD::EXTRACT_VECTOR_ELT · e2e36679
      Simon Pilgrim authored
      
      This patch attempts to peek through vectors based on the demanded bits/elt of a particular ISD::EXTRACT_VECTOR_ELT node, allowing us to avoid dependencies on ops that have no impact on the extract.
      
      In particular this helps remove some unnecessary scalar->vector->scalar patterns.
      
      The wasm shift patterns are annoying - @tlively has indicated that the wasm vector shift codegen is to be refactored in the near term and this isn't considered a major issue.
      
      Differential Revision: https://reviews.llvm.org/D65887
      
      llvm-svn: 368276
  10. Aug 05, 2019
    • Reland: Fix and test inter-procedural register allocation for ARM · 8ed8353f
      Oliver Stannard authored
      Add an explicit construction of the ArrayRef; gcc 5 and earlier don't
      seem to select the ArrayRef constructor which takes a C array when the
      construction is implicit.
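
      An illustrative sketch of the workaround (takesArrayRef is a hypothetical
      callee):

        static const MCPhysReg Regs[] = {ARM::R0, ARM::R1};
        // gcc 5 and earlier may not pick ArrayRef's C-array constructor when
        // the conversion is implicit, so construct the ArrayRef explicitly:
        takesArrayRef(ArrayRef<MCPhysReg>(Regs));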
      
      Original commit message:
      
      - Avoid a crash when IPRA calls ARMFrameLowering::determineCalleeSaves
        with a null RegScavenger. Simply not updating the register scavenger
        is fine because IPRA only cares about the SavedRegs vector; the actual
        code of the function has already been generated at this point.
      - Add a new hook to TargetRegisterInfo to get the set of registers which
        can be clobbered inside a call, even if the compiler can see both
        sides, by linker-generated code.
      
      Differential revision: https://reviews.llvm.org/D64908
      
      llvm-svn: 367819
  11. Aug 02, 2019
    • [IPRA][ARM] Disable no-CSR optimisation for ARM · 4b7239eb
      Oliver Stannard authored
      This optimisation isn't generally profitable for ARM, because we can
      save/restore many registers in the prologue and epilogue using the PUSH
      and POP instructions, but mostly use individual LDR/STR instructions for
      other spills.
      
      Differential revision: https://reviews.llvm.org/D64910
      
      llvm-svn: 367670
    • Fix and test inter-procedural register allocation for ARM · f6b00c27
      Oliver Stannard authored
      - Avoid a crash when IPRA calls ARMFrameLowering::determineCalleeSaves
        with a null RegScavenger. Simply not updating the register scavenger
        is fine because IPRA only cares about the SavedRegs vector; the actual
        code of the function has already been generated at this point.
      - Add a new hook to TargetRegisterInfo to get the set of registers which
        can be clobbered inside a call, even if the compiler can see both
        sides, by linker-generated code.
      
      Differential revision: https://reviews.llvm.org/D64908
      
      llvm-svn: 367669