Commits · 4506e447c1b6b8bb1da0f4bd2ea3b353d72f4a87 · Lorenzo Albano / LLVM bpEVL

Dec 27, 2016

[InstCombine][X86] Add DemandedElts support for PMULDQ/PMULUDQ instructions · c9cf7fc7

Simon Pilgrim authored Dec 26, 2016

PMULDQ/PMULUDQ vXi64 instructions only use the even-numbered v2Xi32 input elements, a fact that SimplifyDemandedVectorElts should try to exploit.

Differential Revision: https://reviews.llvm.org/D28119

llvm-svn: 290554

c9cf7fc7
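
The demanded-elements property described in this commit can be illustrated with a scalar model. The sketch below is not LLVM code: `pmuludq_model` and `odd_elements_are_dead` are hypothetical names, and the model covers only the unsigned 128-bit form (four 32-bit inputs, two 64-bit results). It shows that overwriting the odd-numbered input elements leaves the result unchanged.

```cpp
#include <array>
#include <cstdint>

// Scalar model (assumption: 128-bit unsigned form) of PMULUDQ: each 64-bit
// result lane is the product of the even-numbered 32-bit input elements.
std::array<uint64_t, 2> pmuludq_model(const std::array<uint32_t, 4> &a,
                                      const std::array<uint32_t, 4> &b) {
  return {static_cast<uint64_t>(a[0]) * b[0],
          static_cast<uint64_t>(a[2]) * b[2]};
}

// Returns true when clobbering the odd elements leaves the result unchanged,
// i.e. those elements are not demanded by the multiply.
bool odd_elements_are_dead() {
  std::array<uint32_t, 4> a = {3, 111, 0xFFFFFFFFu, 222};
  std::array<uint32_t, 4> b = {5, 333, 2, 444};
  const std::array<uint64_t, 2> before = pmuludq_model(a, b);
  a[1] = a[3] = b[1] = b[3] = 0; // overwrite the non-demanded elements
  return before == pmuludq_model(a, b);
}
```

Because the odd elements never reach the result, a simplifier is free to replace whatever computes them with undefined values.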

Dec 26, 2016

clang-format NewGVN files · 85f91b0e
Daniel Berlin authored Dec 26, 2016
```
llvm-svn: 290551
```
85f91b0e

Misc cleanups and simplifications for NewGVN. · 85cbc8c0

Daniel Berlin authored Dec 26, 2016

Mostly use a bit more idiomatic C++ where we can,
so we can combine some things later.

Reviewers: davide

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D28111

llvm-svn: 290550

85cbc8c0

Don't use our own incorrect version of isTriviallyDeadInstruction in NewGVN. Fixes PR/31472 · d59e8010
Daniel Berlin authored Dec 26, 2016
```
llvm-svn: 290549
```
d59e8010

[NewGVN] Add a flag to enable the pass via `-mllvm`. · fe7a3ee5

Davide Italiano authored Dec 26, 2016

NewGVN can be tested passing `-mllvm -enable-newgvn` to clang.

Differential Revision:  https://reviews.llvm.org/D28059

llvm-svn: 290548

fe7a3ee5

[NewGVN] Fold lookupOperandLeader() when there's only one use. NFCI. · a312ca84
Davide Italiano authored Dec 26, 2016
```
llvm-svn: 290543
```
a312ca84

[AVX-512][InstCombine] Teach InstCombine to turn scalar add/sub/mul/div with... · 7b788ada

Craig Topper authored Dec 26, 2016

[AVX-512][InstCombine] Teach InstCombine to turn scalar add/sub/mul/div with rounding intrinsics into normal IR operations if the rounding mode is CUR_DIRECTION.

Summary:
I only do this for unmasked cases for now because isel is failing to fold the mask. I'll try to fix that soon.

I'll do the same thing for packed add/sub/mul/div in a future patch.

Reviewers: delena, RKSimon, zvi, craig.topper

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D27879

llvm-svn: 290535

7b788ada

[AVX-512][InstCombine] Teach InstCombine to convert masked vpermv intrinsics... · e3280457

Craig Topper authored Dec 25, 2016

[AVX-512][InstCombine] Teach InstCombine to convert masked vpermv intrinsics into shufflevector instructions

Summary:
This patch adds support for converting the masked vpermv intrinsics into shufflevector instructions if the indices are constants.

We also need to wrap a select instruction around the shuffle to take care of the masking part. InstCombine will take care of optimizing the select if the mask is constant so I didn't bother checking for that.

Reviewers: zvi, delena, spatel, RKSimon

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D27825

llvm-svn: 290530

e3280457
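
The shuffle-plus-select decomposition described in this commit can be sketched on plain arrays. This is an illustrative model only, not the InstCombine code: the names `shuffle`, `select_lanes`, and `masked_permute` are hypothetical, the lane count of 8 is an arbitrary choice, and wrapping indices modulo the lane count is an assumption of the model.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

constexpr std::size_t kLanes = 8;
using Vec = std::array<int32_t, kLanes>;
using Indices = std::array<unsigned, kLanes>;

// Step 1, the shuffle: lane i of the result reads src[idx[i]].
// Assumption of this model: indices wrap modulo the lane count.
Vec shuffle(const Vec &src, const Indices &idx) {
  Vec out{};
  for (std::size_t i = 0; i < kLanes; ++i)
    out[i] = src[idx[i] % kLanes];
  return out;
}

// Step 2, the select: mask bit i picks the shuffled lane; a clear bit keeps
// the pass-through lane.
Vec select_lanes(uint8_t mask, const Vec &if_set, const Vec &if_clear) {
  Vec out{};
  for (std::size_t i = 0; i < kLanes; ++i)
    out[i] = ((mask >> i) & 1u) ? if_set[i] : if_clear[i];
  return out;
}

// The masked permute is the select wrapped around the shuffle.
Vec masked_permute(const Vec &src, const Indices &idx, uint8_t mask,
                   const Vec &passthru) {
  return select_lanes(mask, shuffle(src, idx), passthru);
}
```

With an all-ones mask the select is a no-op and only the shuffle remains, which matches the commit's remark that a constant mask can be optimized separately.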

[MemorySSA] Define a restricted upward AccessList splice. · 4213d941
Bryant Wong authored Dec 25, 2016
```
Differential Revision: https://reviews.llvm.org/D26661

llvm-svn: 290527
```
4213d941

Dec 25, 2016
- Value number stores and memory states so we can detect when memory states are... · d7c12ee5
  Daniel Berlin authored Dec 25, 2016
```
Value number stores and memory states so we can detect when memory states are equivalent (i.e., a store of the same value to memory).

Reviewers: davide

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D28084

llvm-svn: 290525
```
  d7c12ee5
- Rename GVNExpression *ops_ members to *op_* to match conventions in the rest of LLVM · 65f5f0d7
  Daniel Berlin authored Dec 25, 2016
```
llvm-svn: 290524
```
  65f5f0d7

Dec 24, 2016

[NewGVN] Prefer `auto` to explicit type when the latter is obvious. · 463c32ea
Davide Italiano authored Dec 24, 2016
```
llvm-svn: 290499
```
463c32ea
Mark isOnlyReachableViaThisEdge as const · 8a6a8614
Daniel Berlin authored Dec 24, 2016
```
llvm-svn: 290468
```
8a6a8614

[PM] Teach the always inlining test case to be much more strict about · 4eaff12b

Chandler Carruth authored Dec 23, 2016

whether functions are removed, and fix the new PM's always inliner to
actually pass this test.

Without this, the new PM's always inliner leaves all the functions
kicking around which won't work out very well given the semantics of
always inline.

Doing this really highlights how frustrating the current alwaysinline
semantic contract is though -- why can we put it on *external*
functions, etc?

Also I've added a number of tricky and interesting test cases for
removing functions with the always inliner. There is one remaining case
not handled -- fully removing comdats -- and I've left a FIXME about
this.

llvm-svn: 290457

4eaff12b

Dec 23, 2016
- Function-import: Disable IRVerifier on lazy-loaded modules: the ODR... · 94f86ad4
  Mehdi Amini authored Dec 23, 2016
```
Function-import: Disable IRVerifier on lazy-loaded modules: the ODR TypeUniquing generates invalid debug info.

llvm-svn: 290442
```
  94f86ad4
- Fix build after r290437 (missing include) · fc06b83e
  Mehdi Amini authored Dec 23, 2016
```
llvm-svn: 290438
```
  fc06b83e
- FunctionImport: fix typo '#ifndef NDEBUG' instead of '#ifndef DEBUG' · 9a9077fd
  Mehdi Amini authored Dec 23, 2016
```
llvm-svn: 290437
```
  9a9077fd
- [LICM] Plug a leak freeing the ASTs before clearing the map. · b9ff23a4
  Davide Italiano authored Dec 23, 2016
```
llvm-svn: 290433
```
  b9ff23a4
- [LICM] Work around LICM needs to maintain state across loops. · 34f94384
  Davide Italiano authored Dec 23, 2016
```
The pass creates some state which expects to be cleaned up by
a later instance of the same pass. opt-bisect happens to expose
this less-than-ideal design because calling skipLoop() will result in
this state not being cleaned up at times and an assertion firing
in `doFinalization()`. Chandler tells me the new pass manager will
give us options to avoid these design traps, but until it's ready,
we need a workaround for the current pass infrastructure. Fix provided
by Andy Kaylor, see the review for a complete discussion.

Differential Revision:  https://reviews.llvm.org/D25848

llvm-svn: 290427
```
  34f94384
- [NewGVN] Remove (for now) unused code. NFCI. · 0ff94162
  Davide Italiano authored Dec 23, 2016
```
llvm-svn: 290420
```
  0ff94162
- [ThinLTO] Verify lazy-loaded source module for function importing when assertions are enabled (NFC) · 96cdc493
  Mehdi Amini authored Dec 23, 2016
```
llvm-svn: 290416
```
  96cdc493
- Enable '-Wstring-conversion' and fix some bad asserts that it helped · ee086761
  Chandler Carruth authored Dec 23, 2016
```
find.

Notable is the assert in NewGVN which had no effect because of the bug.

llvm-svn: 290400
```
  ee086761
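
One common shape of the mistake that `-Wstring-conversion` flags is an assert whose whole condition is a string literal. The sketch below is an assumption about that general pattern, not a reproduction of the NewGVN assert mentioned in the commit, and the function names are hypothetical. It evaluates the two conditions directly so the difference is visible without aborting.

```cpp
// Broken pattern: the condition is the literal itself. A string literal
// decays to a non-null pointer, so the assert can never fire:
//   assert("should not get here");
//
// Intended pattern: `false && "message"` is always false, so the assert
// fires when reached (in builds without NDEBUG) and prints the message:
//   assert(false && "should not get here");

// Condition evaluated by the broken pattern.
bool broken_condition() {
  return static_cast<bool>("should not get here");
}

// Condition evaluated by the intended pattern.
bool intended_condition() {
  return false && static_cast<bool>("should not get here");
}
```

The explicit `static_cast<bool>` is used here only to spell out the conversion that happens implicitly inside the assert.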

Dec 22, 2016

[cfi] Emit jump tables as a function-level inline asm. · 27d4c9b7

Evgeniy Stepanov authored Dec 22, 2016

Use a dummy private function with inline asm calls instead of module
level asm blocks for CFI jumptables.

The main advantage is that now jumptable codegen can be affected by
the function attributes (like target_cpu on ARM). Module level asm
gets the default subtarget based on the target triple, which is often
not good enough.

This change also uses asm constraints/arguments to reference
jumptable targets and aliases directly. We no longer do asm name
mangling in an IR pass.

Differential Revision: https://reviews.llvm.org/D28012

llvm-svn: 290384

27d4c9b7

[GVN] Initial check-in of a new global value numbering algorithm. · 7e274e02

Davide Italiano authored Dec 22, 2016

The code has been developed by Daniel Berlin over the years, and
the goal of the new implementation is to address shortcomings of
the current GVN infrastructure, e.g. long compile times for large
testcases, lack of phi predication, and no load/store value numbering.

The current code just implements the "core" GVN algorithm, although
other pieces (load coercion, phi handling, predicate system) are
already implemented in a branch out of tree. Once the core is stable,
we'll start adding pieces on top of the base framework.
The tests currently living in test/Transform/NewGVN are copies
of the ones in GVN, with proper `XFAIL` markers (missing features in NewGVN).
A flag will be added in a future commit to enable NewGVN, so that
interested parties can exercise this code easily.

Differential Revision:  https://reviews.llvm.org/D26224

llvm-svn: 290346

7e274e02

[PM] Introduce a reasonable port of the main per-module pass pipeline · e3f5064b

Chandler Carruth authored Dec 22, 2016

from the old pass manager in the new one.

I'm not trying to support (initially) the numerous options that are
currently available to customize the pass pipeline. If we end up really
wanting them, we can add them later, but I suspect many are no longer
interesting. The simplicity of omitting them will help a lot as we sort
out what the pipeline should look like in the new PM.

I've also documented to the best of my ability *why* each pass or group
of passes is used so that reading the pipeline is more helpful. In many
cases I think we have some questionable choices of ordering and I've
left FIXME comments in place so we know what to come back and revisit
going forward. But for now, I've left it as similar to the current
pipeline as I could.

Lastly, I've had to comment out several places where passes are not
ported to the new pass manager or where the loop pass infrastructure is
not yet ready. I did at least fix a few bugs in the loop pass
infrastructure uncovered by running the full pipeline, but I didn't want
to go too far in this patch -- I'll come back and re-enable these as the
infrastructure comes online. But I'd like to keep the comments in place
because I don't want to lose track of which passes need to be enabled
and where they go.

One thing that seemed like a significant API improvement was to require
that we don't build pipelines for O0. It seems to have no real benefit.

I've also switched back to returning pass managers by value as at this
API layer it feels much more natural to me for composition. But if
others disagree, I'm happy to go back to an output parameter.

I'm not 100% happy with the testing strategy currently, but it seems at
least OK. I may come back and try to refactor or otherwise improve this
in subsequent patches but I wanted to at least get a good starting point
in place.

Differential Revision: https://reviews.llvm.org/D28042

llvm-svn: 290325

e3f5064b

Refactor the DIExpression fragment query interface (NFC) · 49797ca6
Adrian Prantl authored Dec 22, 2016
```
... so it becomes available to DIExpressionCursor.

llvm-svn: 290322
```
49797ca6
Pass GetAssumptionCache to InlineFunctionInfo constructor · 180bd9f6
Easwaran Raman authored Dec 22, 2016
```
Differential revision: https://reviews.llvm.org/D28038

llvm-svn: 290295
```
180bd9f6

Dec 21, 2016

Revert "[InstCombine] New opportunities for FoldAndOfICmp and FoldXorOfICmp" · b0761a0c
David Majnemer authored Dec 21, 2016
```
This reverts commit r289813, it caused PR31449.

llvm-svn: 290266
```
b0761a0c

[LDist] Match behavior between invoking via optimization pipeline or opt -loop-distribute · 32e6a34c

Adam Nemet authored Dec 21, 2016

In r267672, where the loop distribution pragma was introduced, I tried
hard to keep the old behavior for opt: when opt is invoked
with -loop-distribute, it should distribute the loop (it's off by
default when run via the optimization pipeline).

As MichaelZ has discovered, this has the unintended consequence of
breaking a very common developer workflow for reproducing compilations
using opt: first you print the pass pipeline of clang
with -debug-pass=Arguments, and then you invoke opt with the returned
arguments.

clang -debug-pass will include -loop-distribute, but the pass is invoked
with default=off, so nothing happens unless the loop carries the pragma.
Through opt (default=on), by contrast, we try to distribute all loops.

This changes opt's default to off as well to match clang.  The tests are
modified to explicitly enable the transformation.

llvm-svn: 290235

32e6a34c

IPO: Remove the ModuleSummary argument to the FunctionImport pass. NFCI. · 598bd2a2

Peter Collingbourne authored Dec 21, 2016

No existing client is passing a non-null value here. This will come back
in a slightly different form as part of the type identifier summary work.

Differential Revision: https://reviews.llvm.org/D28006

llvm-svn: 290222

598bd2a2

[Analysis] Centralize objectsize lowering logic. · 3f08914e

George Burgess IV authored Dec 20, 2016

We're currently doing nearly the same thing for @llvm.objectsize in
three different places: two of them are missing checks for overflow,
and one of them could subtly break if InstCombine gets much smarter
about removing alloc sites. Seems like a good idea to not do that.

llvm-svn: 290214

3f08914e

Dec 20, 2016

[LoopUnroll] Modify a comment to clarify the usage of TripCount. NFC. · b29dd010

Haicheng Wu authored Dec 20, 2016

Make it clear that TripCount is the upper bound of the iteration on which
control exits LatchBlock.

Differential Revision: https://reviews.llvm.org/D26675

llvm-svn: 290199

b29dd010

[PM] Provide an initial, minimal port of the inliner to the new pass manager. · 1d963114

Chandler Carruth authored Dec 20, 2016

This doesn't implement *every* feature of the existing inliner, but
tries to implement the most important ones for building a functional
optimization pipeline and beginning to sort out bugs, regressions, and
other problems.

Notable, but intentional omissions:
- No alloca merging support. Why? Because it isn't clear we want to do
  this at all. Active discussion and investigation is going on to remove
  it, so for simplicity I omitted it.
- No support for trying to iterate on "internally" devirtualized calls.
  Why? Because it adds what I suspect is inappropriate coupling for
  little or no benefit. We will have an outer iteration system that
  tracks devirtualization including that from function passes and
  iterates already. We should improve that rather than approximate it
  here.
- No optimization remarks. Why? Purely to make the patch smaller, no other
  reason at all.

The last one I'll probably work on almost immediately. But I wanted to
skip it in the initial patch to try to focus the change as much as
possible as there is already a lot of code moving around and both of
these *could* be skipped without really disrupting the core logic.

A summary of the different things happening here:

1) Adding the usual new PM class and rigging.

2) Fixing minor underlying assumptions in the inline cost analysis or
   inline logic that don't generally hold in the new PM world.

3) Adding the core pass logic which is in essence a loop over the calls
   in the nodes in the call graph. This is a bit duplicated from the old
   inliner, but only a handful of lines could realistically be shared.
   (I tried at first, and it really didn't help anything.) All told,
   this is only about 100 lines of code, and most of that is the
   mechanics of wiring up analyses from the new PM world.

4) Updating the LazyCallGraph (in the new PM) based on the *newly
   inlined* calls and references. This is very minimal because we cannot
   form cycles.

5) When inlining removes the last use of a function, eagerly nuking the
   body of the function so that any "one use remaining" inline cost
   heuristics are immediately refined, and queuing these functions to be
   completely deleted once inlining is complete and the call graph
   updated to reflect that they have become dead.

6) After all the inlining for a particular function, updating the
   LazyCallGraph and the CGSCC pass manager to reflect the
   function-local simplifications that are done immediately and
   internally by the inline utilities. These are the exact same
   fundamental set of CG updates done by arbitrary function passes.

7) Adding a bunch of test cases to specifically target CGSCC and other
   subtle aspects in the new PM world.

Many thanks to the careful review from Easwaran and Sanjoy and others!

Differential Revision: https://reviews.llvm.org/D24226

llvm-svn: 290161

1d963114

[IR] Remove the DIExpression field from DIGlobalVariable. · bceaaa96

Adrian Prantl authored Dec 20, 2016

This patch implements PR31013 by introducing a
DIGlobalVariableExpression that holds a pair of DIGlobalVariable and
DIExpression.

Currently, DIGlobalVariable holds a DIExpression. This is not the
best way to model this:

(1) The DIGlobalVariable should describe the source level variable,
    not how to get to its location.

(2) It makes it unsafe/hard to update the expressions when we call
    replaceExpression on the DIGlobalVariable.

(3) It makes it impossible to represent a global variable that is in
    more than one location (e.g., a variable with multiple
    DW_OP_LLVM_fragment-s).  We also moved away from attaching the
    DIExpression to DILocalVariable for the same reasons.

This reapplies r289902 with additional testcase upgrades and a change
to the Bitcode record for DIGlobalVariable, that makes upgrading the
old format unambiguous also for variables without DIExpressions.

<rdar://problem/29250149>
https://llvm.org/bugs/show_bug.cgi?id=31013
Differential Revision: https://reviews.llvm.org/D26769

llvm-svn: 290153

bceaaa96
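
The pairing described in this commit can be pictured with a toy data model. None of the types below are the LLVM classes: `GlobalVarInfo`, `FragmentExpr`, and `GlobalVarExpr` are hypothetical stand-ins, and the fragment is reduced to a bit offset and size for illustration. The sketch shows how keeping the expression outside the variable lets one variable appear in several locations.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in for DIGlobalVariable: source-level facts only.
struct GlobalVarInfo {
  std::string name;
};

// Hypothetical stand-in for a DIExpression that describes one fragment.
struct FragmentExpr {
  uint64_t offset_bits;
  uint64_t size_bits;
};

// Hypothetical stand-in for DIGlobalVariableExpression: the pair.
struct GlobalVarExpr {
  const GlobalVarInfo *var;
  FragmentExpr expr;
};

// Counts the location entries that refer to one variable.
std::size_t count_locations(const std::vector<GlobalVarExpr> &entries,
                            const GlobalVarInfo &var) {
  std::size_t n = 0;
  for (const GlobalVarExpr &e : entries)
    if (e.var == &var)
      ++n;
  return n;
}

// Sums the bits described by all fragments of one variable.
uint64_t covered_bits(const std::vector<GlobalVarExpr> &entries,
                      const GlobalVarInfo &var) {
  uint64_t bits = 0;
  for (const GlobalVarExpr &e : entries)
    if (e.var == &var)
      bits += e.expr.size_bits;
  return bits;
}
```

In this model, replacing an expression means replacing a pair and never mutating the variable description, which is the separation that points (1) and (2) of the commit message argue for.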

Dec 19, 2016

[LV] Sink tripcount query to where it's actually used. NFC. · fb7dd86f
Michael Kuperstein authored Dec 19, 2016
```
llvm-svn: 290142
```
fb7dd86f

[InstCombine] use commutative matcher for pattern with commutative operators · 5a443ac0

Sanjay Patel authored Dec 19, 2016

This is a case that was missed in:
https://reviews.llvm.org/rL290067
...and it would regress if we fix operand complexity (PR28296).

llvm-svn: 290127

5a443ac0

[InstCombine] add folds for icmp (umin|umax X, Y), X · dd46b529

Sanjay Patel authored Dec 19, 2016

This is a follow-up to:
https://reviews.llvm.org/rL289855 (https://reviews.llvm.org/D27531)
https://reviews.llvm.org/rL290111

llvm-svn: 290118

dd46b529
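
Comparisons of an unsigned min or max against one of its own operands reduce to simpler comparisons. The sketch below checks a few such identities exhaustively over 8-bit values. The identities are standard facts about min and max, and they are not claimed to be the exact set of folds added by this commit. The function name is hypothetical.

```cpp
#include <algorithm>

// Exhaustively checks min/max comparison identities for all pairs of
// unsigned 8-bit values. Returns false on the first counterexample.
bool check_unsigned_minmax_identities() {
  for (unsigned x = 0; x < 256; ++x) {
    for (unsigned y = 0; y < 256; ++y) {
      const unsigned mx = std::max(x, y);
      const unsigned mn = std::min(x, y);
      if ((mx == x) != (x >= y)) return false; // umax(X,Y) == X  <=>  X >= Y
      if ((mx > x) != (y > x)) return false;   // umax(X,Y) >  X  <=>  Y >  X
      if ((mn == x) != (x <= y)) return false; // umin(X,Y) == X  <=>  X <= Y
      if ((mn < x) != (y < x)) return false;   // umin(X,Y) <  X  <=>  Y <  X
      if (mx < x) return false;                // umax(X,Y) <  X  never holds
      if (mn > x) return false;                // umin(X,Y) >  X  never holds
    }
  }
  return true;
}
```

Each right-hand side avoids computing the min or max at all, which is what makes rewrites of this kind attractive to an instruction combiner.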

[LoopVersioning] Require loop-simplify form for loop versioning. · 2e03213f

Florian Hahn authored Dec 19, 2016

Summary:
Requiring loop-simplify form for loop versioning ensures that the
runtime check block always dominates the exit block.
    
This patch closes #30958 (https://llvm.org/bugs/show_bug.cgi?id=30958).

Reviewers: silviu.baranga, hfinkel, anemet, ashutosh.nema

Subscribers: ashutosh.nema, mzolotukhin, efriedma, hfinkel, llvm-commits

Differential Revision: https://reviews.llvm.org/D27469

llvm-svn: 290116

2e03213f

[InstCombine] add folds for icmp (smax X, Y), X · 8296c6c9
Sanjay Patel authored Dec 19, 2016
```
This is a follow-up to:
https://reviews.llvm.org/rL289855 (D27531)

llvm-svn: 290111
```
8296c6c9

Revert @llvm.assume with operator bundles (r289755-r289757) · aec2fa35

Daniel Jasper authored Dec 19, 2016

This creates non-linear behavior in the inliner (see more details in
r289755's commit thread).

llvm-svn: 290086

aec2fa35