Commits · b9d01aa29e5d0aa433c2fc62ace709fe69c45ceb · Lorenzo Albano / LLVM bpEVL

Jul 11, 2018

[Power9] Add remaining __float128 builtin support for FMA round to odd · b9d01aa2

Stefan Pintilie authored Jul 11, 2018

Implement this as it is done on GCC:

__float128 a, b, c, d;
a = __builtin_fmaf128_round_to_odd (b, c, d);         // generates xsmaddqpo
a = __builtin_fmaf128_round_to_odd (b, c, -d);        // generates xsmsubqpo
a = - __builtin_fmaf128_round_to_odd (b, c, d);       // generates xsnmaddqpo
a = - __builtin_fmaf128_round_to_odd (b, c, -d);      // generates xsnmsubqpo

Differential Revision: https://reviews.llvm.org/D48218

llvm-svn: 336754

b9d01aa2
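
The four sign patterns in the commit above can be illustrated with the standard `std::fma` on `double`. This is only a sketch of the arithmetic: it uses ordinary rounding rather than round-to-odd, it does not use `__float128`, and the helper names (`madd`, `msub`, `nmadd`, `nmsub`) are hypothetical labels chosen to echo the instruction mnemonics.

```cpp
#include <cmath>

// Hypothetical helpers mirroring the four forms from the commit message.
// They use double and default rounding, NOT __float128 round-to-odd.
double madd(double B, double C, double D) { return std::fma(B, C, D); }    //   B*C + D
double msub(double B, double C, double D) { return std::fma(B, C, -D); }   //   B*C - D
double nmadd(double B, double C, double D) { return -std::fma(B, C, D); }  // -(B*C + D)
double nmsub(double B, double C, double D) { return -std::fma(B, C, -D); } // -(B*C - D)
```

Negating the addend selects the subtract form, and negating the whole result selects the negated form, which is why four instructions cover all combinations.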

[test cases] add test cases to find more abs patterns · fb361d25

Chen Zheng authored Jul 11, 2018


  Differential Revision: https://reviews.llvm.org/D49123

llvm-svn: 336752

fb361d25

[ARM] Treat cmn immediates as legal in isLegalICmpImmediate. · d2c73923

Eli Friedman authored Jul 10, 2018

The original code attempted to do this, but the std::abs() call didn't
actually do anything due to implicit type conversions.  Fix the type
conversions, and perform the correct check for negative immediates.

This probably has very little practical impact, but it's worth fixing
just to avoid confusion in the future, I think.

Differential Revision: https://reviews.llvm.org/D48907

llvm-svn: 336742

d2c73923

[X86] Teach X86InstrInfo::commuteInstructionImpl to use MOVSD/MOVSS for BLEND... · 860ab496

Craig Topper authored Jul 10, 2018

[X86] Teach X86InstrInfo::commuteInstructionImpl to use MOVSD/MOVSS for BLEND under optsize when the immediate allows it.

Isel currently emits movss/movsd a lot of the time and an accidental double commute turns it into a blend.

Ideally we'd select blend directly in isel under optspeed and not rely on the double commute to create blend.

llvm-svn: 336731

860ab496

Jul 10, 2018

[ThinLTO] Use std::map to get deterministic imports files · c0320ef4

Teresa Johnson authored Jul 10, 2018

Summary:
I noticed that the .imports files emitted for distributed ThinLTO
backends do not have consistent ordering. This is because StringMap
iteration order is not guaranteed to be deterministic. Since we already
have a std::map with this information, used when emitting the individual
index files (ModuleToSummariesForIndex), use it for the imports files as
well.

This issue is likely causing some unnecessary rebuilds of the ThinLTO
backends in our distributed build system as the imports files are inputs
to those backends.

Reviewers: pcc, steven_wu, mehdi_amini

Subscribers: mehdi_amini, inglorion, eraman, steven_wu, dexonsmith, llvm-commits

Differential Revision: https://reviews.llvm.org/D48783

llvm-svn: 336721

c0320ef4
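
The ordering argument in the ThinLTO commit above can be sketched in isolation. `std::map` iterates in ascending key order, so output derived from it does not depend on insertion order, whereas a hash-based container such as `StringMap` gives no such guarantee. The function `orderedImports` and the file names are hypothetical; this is not the code from the patch.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical helper: return module paths in the order std::map yields them.
// Because std::map keeps keys sorted, the result is the same for any
// insertion order of the same set of paths.
std::vector<std::string> orderedImports(const std::vector<std::string> &Paths) {
  std::map<std::string, int> Sorted;
  for (const std::string &P : Paths)
    Sorted.emplace(P, 0); // the mapped value is unused in this sketch
  std::vector<std::string> Out;
  for (const auto &Entry : Sorted)
    Out.push_back(Entry.first);
  return Out;
}
```

Writing a file from such a sequence yields byte-identical output across runs, which is what a build system needs to avoid spurious rebuilds.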

[GlobalISel][X86_64] Support for G_SITOFP · 48ca0550
Alexander Ivchenko authored Jul 10, 2018
```
The instruction selection is automatically handled by tablegen

llvm-svn: 336703
```
48ca0550
[Evaluator] Examine alias when evaluating function call · 6a572b8e
Eugene Leviant authored Jul 10, 2018
```
This fixes PR38120

llvm-svn: 336702
```
6a572b8e
[DAGCombiner] Add special case fast paths for udiv x,1 and udiv x,-1 · 4cb46093
Simon Pilgrim authored Jul 10, 2018
```
udiv x,-1 was going down the (slow) BuildUDIV route resulting in unnecessary shifts.

llvm-svn: 336701
```
4cb46093
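
The two special cases in the DAGCombiner commit above follow from simple arithmetic, sketched here on 32-bit values. Dividing by 1 returns the operand, and dividing by the all-ones value (what `-1` means for an unsigned divide) gives 1 only when the operand is itself all-ones and 0 otherwise. The function names are hypothetical, and the exact nodes the combiner emits are not shown in the commit message.

```cpp
#include <cstdint>

// udiv x, 1  is just x.
uint32_t udivByOne(uint32_t X) { return X; }

// udiv x, -1: as an unsigned value -1 is UINT32_MAX, so the quotient is
// 1 when x == UINT32_MAX and 0 for every smaller x.
uint32_t udivByAllOnes(uint32_t X) { return X == UINT32_MAX ? 1u : 0u; }
```

Both forms need no shifts or multiplies, unlike a general divide-by-constant expansion.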

AMDGPU: Make hidden argument metadata consistent with · f0badd5a

Konstantin Zhuravlyov authored Jul 10, 2018

amdgpu-implicitarg-num-bytes attribute

Differential Revision: https://reviews.llvm.org/D49096

llvm-svn: 336697

f0badd5a

[InstCombine] allow flag propagation when using safe constant · c8d9d812
Sanjay Patel authored Jul 10, 2018
```
This corresponds with the code for the single binop pattern
added in rL336684.

llvm-svn: 336696
```
c8d9d812
[X86] Add srem/udiv/urem by constant tests · 9bd9fef4
Simon Pilgrim authored Jul 10, 2018
```
Match the tests in combine-sdiv.ll

llvm-svn: 336694
```
9bd9fef4
[WebAssembly] Add a few missing {{$}}s to a test · 9ef850b8
Heejin Ahn authored Jul 10, 2018
```
llvm-svn: 336691
```
9ef850b8
AMDGPU/NFC: Fix typo in test name · 75024cf4
Konstantin Zhuravlyov authored Jul 10, 2018
```
hsa-metadata-enqueu-kernel.ll ->
hsa-metadata-enqueue-kernel.ll

llvm-svn: 336689
```
75024cf4
Update test to work on Windows · 88ef98b0
Paul Robinson authored Jul 10, 2018
```
llvm-svn: 336687
```
88ef98b0

[InstCombine] safely allow non-commutative binop identity constant folds · 509a1e7a

Sanjay Patel authored Jul 10, 2018

This was originally intended with D48893, but as discussed there, we
have to make the folds safe from producing extra poison. This should
give the single binop folds the same capabilities as the existing
folds for 2-binops+shuffle.

LLVM binary opcode review: there are a total of 18 binops. There are 7 
commutative binops (add, mul, and, or, xor, fadd, fmul) which we already 
fold. We're able to fold 6 more opcodes with this patch (shl, lshr, ashr,
fdiv, udiv, sdiv). There are no folds for srem/urem/frem AFAIK. We don't 
bother with sub/fsub with constant operand 1 because those are 
canonicalized to add/fadd. 7 + 6 + 3 + 2 = 18.

llvm-svn: 336684

509a1e7a
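
To make the idea of an identity constant for a non-commutative binop concrete, here is a small sketch for three of the integer opcodes named in the commit above. For these operations the identity exists only as the right-hand operand: shifting by 0 or dividing by 1 returns the left operand unchanged. The `BinOp` enum and both functions are hypothetical and are not the InstCombine implementation.

```cpp
#include <cstdint>

enum class BinOp { Shl, LShr, UDiv };

// Right-hand identity constant C such that (X op C) == X.
uint32_t rhsIdentity(BinOp Op) {
  switch (Op) {
  case BinOp::Shl:
  case BinOp::LShr:
    return 0; // shifting by zero leaves X unchanged
  case BinOp::UDiv:
    return 1; // dividing by one leaves X unchanged
  }
  return 0; // not reached for the opcodes above
}

// Evaluate X op C. The caller must pass C < 32 for shifts and C != 0 for UDiv.
uint32_t apply(BinOp Op, uint32_t X, uint32_t C) {
  switch (Op) {
  case BinOp::Shl:
    return X << C;
  case BinOp::LShr:
    return X >> C;
  case BinOp::UDiv:
    return X / C;
  }
  return X; // not reached for the opcodes above
}
```

Swapping the operands breaks the identity (for example, `1 / 8` is not `8`), which is why these folds only apply when the constant is operand 1.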

[Hexagon] Change .mir testcase to make sure function is not in SSA form · c87ecf25

Krzysztof Parzyszek authored Jul 10, 2018

If a machine function satisfies SSA, the IsSSA property is assumed even
if the pass to be executed runs after exiting SSA form. If the pass
output then does not conform to SSA, a verifier error will be flagged
(with expensive checks enabled).

llvm-svn: 336682

c87ecf25

Support -fdebug-prefix-map in llvm-mc. This is useful to omit the · c17c8bf7

Paul Robinson authored Jul 10, 2018

debug compilation dir when compiling assembly files with -g.
Part of PR38050.

Patch by Siddhartha Bagaria!

Differential Revision: https://reviews.llvm.org/D48988

llvm-svn: 336680

c17c8bf7

[InstCombine] drop poison flags when shuffle mask undef propagates to constant · 3333106a
Sanjay Patel authored Jul 10, 2018
```
llvm-svn: 336679
```
3333106a

[AArch64][SVE] Asm: Support for predicated unary operations. · 53108d48

Sander de Smalen authored Jul 10, 2018

This patch adds support for the following instructions:
  CLS  (Count Leading Sign bits)
  CLZ  (Count Leading Zeros)
  CNT  (Count non-zero bits)
  CNOT (Logically invert boolean condition in vector)
  NOT  (Bitwise invert vector)
  FABS (Floating-point absolute value)
  FNEG (Floating-point negate)

All operations are predicated and unary, e.g.
  clz  z0.s, p0/m, z1.s

- CLS, CLZ, CNT, CNOT and NOT have variants for 8, 16, 32
  and 64 bit elements.

- FABS and FNEG have variants for 16, 32 and 64 bit elements.

llvm-svn: 336677

53108d48

Reapply "AMDGPU: Force inlining if LDS global address is used" · a680199a
Matt Arsenault authored Jul 10, 2018
```
This reverts commit r336623

llvm-svn: 336675
```
a680199a

[InstCombine] allow more shuffle-binop folds with safe constants · 06ea4206

Sanjay Patel authored Jul 10, 2018

The case with 2 variables is more complicated than the case where
we eliminate the shuffle entirely because a shuffle with an undef 
mask element creates an undef result. 

I'm not aware of any current analysis/transform that recognizes 
undef propagating to a div/rem/shift, but we have to guard against 
the possibility.

llvm-svn: 336668

06ea4206

[DebugInfo][LoopVectorize] Preserve DL in induction PHI and Add · 612bf7ca
Anastasis Grammenos authored Jul 10, 2018
```
Differential Revision: https://reviews.llvm.org/D48968

llvm-svn: 336667
```
612bf7ca

[Hexagon] Add implicit uses even when untied explicit uses are present · c052451a

Krzysztof Parzyszek authored Jul 10, 2018

An explicit untied use is not sufficient to maintain liveness of a
register redefined in a predicated instruction. For example
  %1 = COPY %0
  ...
  %1 = A2_paddif %2, %1, 1
could become
  $r1 = COPY $r0
  ...
  $r1 = A2_paddif $p0, $r1, 1
and later
  $r1 = COPY $r0                ;; this is not really dead!
  ...
  $r1 = A2_paddif $p0, $r0, 1

llvm-svn: 336662

c052451a

[LowerSwitch] Fixed faulty PHI nodes · 1ffeb5d7

Karl-Johan Karlsson authored Jul 10, 2018

Summary:
Fixed two cases where PHI nodes need to be updated by lowerswitch.

When lowerswitch finds out that the switch default branch is not
reachable, it removes the old default and replaces it with the most
popular block from the cases, but it forgets to update the PHI
nodes in the default block.

The PHI nodes also need to be updated when the switch is replaced
with a single branch.

Reviewers: hans, reames, arsenm

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D47203

llvm-svn: 336659

1ffeb5d7

[PM/Unswitch] Fix a collection of closely related issues with trivial · 47dc3a34

Chandler Carruth authored Jul 10, 2018

switch unswitching.

The core problem was the way we handled unswitching trivial exit
edges through the default successor of a switch. For some reason
I thought the right way to do this was to add a block containing
unreachable and point the default successor at this block. In
retrospect, this has an amazing number of problems.

The first issue is the one that this pass has always worked around -- we
have to *detect* such edges and avoid unswitching them again. This
seemed pretty easy really. You just look for an edge to a block
containing unreachable. However, this pattern is woefully unsound. So
many things can break it. The amazing thing is that I found a test case
where *simple-loop-unswitch itself* breaks this! When we do
a *non-trivial* unswitch of a switch we will end up splitting this exit
edge. The result will be a default successor that is an exit and
terminates in ... a perfectly normal branch. So the first test case that
I started trying to fix is added to the nontrivial test cases. This is
a ridiculous example that did just amazing things previously. With just
unswitch, it would create 10+ copies of this stuff stamped out. But if
you combine it *just right* with a bunch of other passes (like
simplify-cfg, loop rotate, and some LICM) you can get it to do this
infinitely. Or at least, I never got it to finish. =[

This, in turn, uncovered another related issue. When we are manipulating
these switches after doing a trivial unswitch we never correctly updated
PHI nodes to reflect our edits. As soon as I started changing how these
edges were managed, it became obvious there were more issues that
I couldn't realistically leave unaddressed, so I wrote more test cases
around PHI updates here and ensured all of that works now.

And this, in turn, required some adjustment to how we collect and manage
the exit successor when it is the default successor. That showed a clear
bug where we failed to include it in our search for the outer-most loop
reached by an unswitched exit edge. This was actually already tested and
the test case didn't work. I (wrongly) thought that was due to SCEV
failing to analyze the switch. In fact, it was just a simple bug in the
code that skipped the default successor. While changing this, I handled
it correctly and have updated the test to reflect that we now get
precise SCEV analysis of trip counts for the outer loop in one of these
cases.

llvm-svn: 336646

47dc3a34

[X86] Fast-isel tests for lowered truncation intrinsics · 89c919c2

Mikhail Dvoretckii authored Jul 10, 2018

This patch adds fast-isel tests for the IR patterns produced for truncation
intrinsics in rC336643.

Differential Revision: https://reviews.llvm.org/D48822

llvm-svn: 336645

89c919c2

[X86][SSE] Prefer BLEND(SHL(v,c1),SHL(v,c2)) over MUL(v, c3) · d32ca2c0

Simon Pilgrim authored Jul 10, 2018

Now that rL336250 has landed, we should prefer 2 immediate shifts + a shuffle blend over performing a multiply. Despite the increase in instructions, this is quicker (especially for slow v4i32 multiplies) and avoids loads and constant pool usage. It does mean, however, that we increase register pressure. The code size will go up a little but by less than what we save on the constant pool data.

This patch also adds support for v16i16 to the BLEND(SHIFT(v,c1),SHIFT(v,c2)) combine, and also prevents blending on pre-SSE41 shifts if it would introduce extra blend masks/constant pool usage.

Differential Revision: https://reviews.llvm.org/D48936

llvm-svn: 336642

d32ca2c0
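
The equivalence behind the BLEND(SHL(v,c1),SHL(v,c2)) commit above can be modelled on plain arrays. When every lane of the multiplier is one of two powers of two, the product equals a per-lane choice between two uniformly shifted copies of the input. This is a scalar model with hypothetical names (`V4`, `mulByConstants`, `shlAll`, `blend`), not the X86 lowering code.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

using V4 = std::array<uint32_t, 4>;

// Lane-wise multiply by a constant vector (wraps modulo 2^32).
V4 mulByConstants(const V4 &V, const V4 &C) {
  V4 R{};
  for (std::size_t I = 0; I < R.size(); ++I)
    R[I] = V[I] * C[I];
  return R;
}

// Shift every lane left by the same immediate amount (Amt < 32).
V4 shlAll(const V4 &V, unsigned Amt) {
  V4 R{};
  for (std::size_t I = 0; I < R.size(); ++I)
    R[I] = V[I] << Amt;
  return R;
}

// Per-lane select: take B where TakeB is true, otherwise A.
V4 blend(const V4 &A, const V4 &B, const std::array<bool, 4> &TakeB) {
  V4 R{};
  for (std::size_t I = 0; I < R.size(); ++I)
    R[I] = TakeB[I] ? B[I] : A[I];
  return R;
}
```

For a multiplier of {2, 2, 8, 8}, the blend of `shlAll(V, 1)` and `shlAll(V, 3)` with the mask {false, false, true, true} gives the same lanes as the multiply, without loading a constant vector.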

[X86] Regenerate vector-shuffle-512-v8.ll so the script will merge the 32 and... · 5fd020c0
Craig Topper authored Jul 10, 2018
```
[X86] Regenerate vector-shuffle-512-v8.ll so the script will merge the 32 and 64 bit checks together. NFC

llvm-svn: 336641
```
5fd020c0

[X86] Correct vfixupimm load patterns to look for an integer load, not a... · 866a377e

Craig Topper authored Jul 10, 2018

[X86] Correct vfixupimm load patterns to look for an integer load, not a floating point load bitcasted to integer.

DAG combine wouldn't let a floating point load bitcasted to integer exist. It would just be an integer load.

llvm-svn: 336626

866a377e

[X86] Add test cases that show failure to fold load into vfixupimm... · 59fd2f4c

Craig Topper authored Jul 10, 2018

[X86] Add test cases that show failure to fold load into vfixupimm instructions due to bad isel pattern.

llvm-svn: 336625

59fd2f4c

Revert "AMDGPU: Force inlining if LDS global address is used" · 688e7522
Vlad Tsyrklevich authored Jul 10, 2018
```
This reverts commit r336587; it was causing test failures on the
sanitizer bots.

llvm-svn: 336623
```
688e7522

[InstCombine] allow more shuffle folds using safe constants · 69faf464

Sanjay Patel authored Jul 09, 2018

getSafeVectorConstantForBinop() was calling getBinOpIdentity() assuming
that the constant we wanted was operand 1 (RHS). That's wrong, but I
don't think we could expose a bug or even a suboptimal fold from that
because the callers have other guards for any binop that would have
been affected.

llvm-svn: 336617

69faf464

[WebAssembly] Support for binary atomic RMW instructions · fed7382e

Heejin Ahn authored Jul 09, 2018

Summary:
This adds support for binary atomic read-modify-write instructions:
add, sub, and, or, xor, and xchg.

This does not yet support translations of some of LLVM IR atomicrmw
instructions (nand, max, min, umax, and umin) that do not have a direct
counterpart in wasm instructions.

Reviewers: dschuff

Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits

Differential Revision: https://reviews.llvm.org/D49088

llvm-svn: 336615

fed7382e
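
For orientation, the six operations listed in the WebAssembly commit above correspond to source-level operations such as the `std::atomic` member functions below, which Clang typically lowers to `atomicrmw` instructions in LLVM IR. The function `runBinaryRmwOps` is a hypothetical example; the backend patterns themselves are not shown here.

```cpp
#include <atomic>
#include <cstdint>

// Apply each binary read-modify-write operation once, in sequence.
// Every call atomically updates Cell and returns the value it held before.
uint32_t runBinaryRmwOps(std::atomic<uint32_t> &Cell) {
  Cell.fetch_add(5);       // atomicrmw add
  Cell.fetch_sub(2);       // atomicrmw sub
  Cell.fetch_and(0xFFu);   // atomicrmw and
  Cell.fetch_or(0x100u);   // atomicrmw or
  Cell.fetch_xor(0x1u);    // atomicrmw xor
  return Cell.exchange(0); // atomicrmw xchg; returns the value before the swap
}
```

The standard library has no `fetch_nand`, and atomic min/max are not part of `std::atomic` in C++17 either, which is in line with the commit's note that those `atomicrmw` variants have no direct wasm counterpart and are left for later.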

llvm: Add support for "-fno-delete-null-pointer-checks" · 77eeac3d

Manoj Gupta authored Jul 09, 2018

Summary:
Support for this option is needed for building the Linux kernel.
Kernel developers request this feature very frequently.

More details : https://lkml.org/lkml/2018/4/4/601

GCC option description for -fdelete-null-pointer-checks:
Assume that programs cannot safely dereference null pointers,
and that no code or data element resides at address zero.

-fno-delete-null-pointer-checks is the inverse of this, implying that
null pointer dereferencing is not undefined.

This CL implements the feature in LLVM IR as the function attribute
"null-pointer-is-valid"="true" (under review at D47894).
The CL updates several passes that assumed null pointer dereferencing is
undefined so that they do not optimize when the
"null-pointer-is-valid"="true" attribute is present.

Reviewers: t.p.northover, efriedma, jyknight, chandlerc, rnk, srhines, void, george.burgess.iv

Reviewed By: efriedma, george.burgess.iv

Subscribers: eraman, haicheng, george.burgess.iv, drinkcat, theraven, reames, sanjoy, xbolva00, llvm-commits

Differential Revision: https://reviews.llvm.org/D47895

llvm-svn: 336613

77eeac3d

Make llvm.objectsize more conservative with null · 3fbfa9c4

George Burgess IV authored Jul 09, 2018

In non-zero address spaces, we were reporting that an object at `null`
always occupies zero bytes. This is incorrect in many cases, so just
return `unknown` in those cases for now.

Differential Revision: https://reviews.llvm.org/D48860

llvm-svn: 336611

3fbfa9c4

Jul 09, 2018

Fix line endings. NFCI. · 017c68c1
Simon Pilgrim authored Jul 09, 2018
```
llvm-svn: 336602
```
017c68c1

[Power9] Add __float128 builtins for Rounding Operations · 133acb22

Stefan Pintilie authored Jul 09, 2018

Added __float128 support for a number of rounding operations:

trunc
rint
nearbyint
round
floor
ceil

Differential Revision: https://reviews.llvm.org/D48415

llvm-svn: 336601

133acb22

[WebAssembly] Improve readability of load/stores and tests. NFC. · d31bc986

Heejin Ahn authored Jul 09, 2018

Summary:
- Changed variable/function names to be more consistent
- Improved comments in test files
- Added more tests
- Fixed a few typos
- Misc. cosmetic changes

Reviewers: dschuff

Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits

Differential Revision: https://reviews.llvm.org/D49087

llvm-svn: 336598

d31bc986

[Power9] [LLVM] Add __float128 support for trunc to double round to odd · 58e3e0a8

Stefan Pintilie authored Jul 09, 2018

Add support for this builtin:
double __builtin_truncf128_round_to_odd(__float128)

Differential Revision: https://reviews.llvm.org/D48483

llvm-svn: 336595

58e3e0a8

RenameIndependentSubregs: Fix handling of undef tied operands · 7139dea6

Mark Searles authored Jul 09, 2018

Ensure that, when updating a tied operand pair, only that pair is
updated.

Differential Revision: https://reviews.llvm.org/D49052

llvm-svn: 336593

7139dea6