Commits · bd254f26012752426104588640bccb85d086d660 · Roger Ferrer / llvm-epi-0.8

Jun 08, 2013

Fix an assertion in MemCpyOpt pass. · bd254f26

Shuxin Yang authored Jun 07, 2013

  The MemCpyOpt pass is capable of optimizing:
      callee(&S); copy N bytes from S to D.
    into:
      callee(&D);
subject to some legality constraints. 

  An assertion is triggered when the compiler tries to evaluate "sizeof(typeof(D))"
while D is an opaque-typed, 'sret' formal argument of the function being compiled,
i.e. the signature of the function being compiled is something like this:
  T caller(...,%opaque* noalias nocapture sret %D, ...)

  The fix is that when we come across such a situation, instead of calling some
utility functions to get the size of D's type (which will crash), we simply
assume D has at least N bytes, as implied by the copy instruction.

rdar://14073661 

llvm-svn: 183584

bd254f26

Jun 04, 2013

IndVarSimplify: check if loop invariant expansion can trap · 29130c5e

David Majnemer authored Jun 04, 2013

IndVarSimplify is willing to move divide instructions outside of their
loop bodies if they are invariant of the loop.  However, it may not be
safe to expand them if we do not know if they can trap.

Instead, check whether it is safe to expand the instruction, and skip
the expansion if it is not.
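
The hazard can be illustrated outside of LLVM. The sketch below is a Python analogy, not the pass's code: all function names are hypothetical, and Python's ZeroDivisionError stands in for a hardware trap. It shows that hoisting a loop-invariant division changes behavior when the loop body would never have run.

```python
def sum_in_loop(n, a, b):
    """Divide inside the loop body: the division runs only if the loop runs."""
    total = 0
    for _ in range(n):
        total += a // b
    return total


def sum_hoisted(n, a, b):
    """Divide hoisted above the loop: the division runs even when n == 0."""
    q = a // b  # loop-invariant, but raises when b == 0
    total = 0
    for _ in range(n):
        total += q
    return total


def hoisting_is_safe(b):
    """Hypothetical stand-in for the safety check: hoist only if b is known nonzero."""
    return b != 0
```

With n == 0 and b == 0 the original loop returns normally, while the hoisted version raises; a check like hoisting_is_safe is what keeps the two equivalent.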

This fixes PR16041.

Testcase by Rafael Ávila de Espíndola.

llvm-svn: 183239

29130c5e

May 31, 2013

Loop Strength Reduce: Scaling factor cost. · bf490d4a

Quentin Colombet authored May 31, 2013

Account for the cost of scaling factor in Loop Strength Reduce when rating the
formulae. This uses a target hook.

The default implementation of the hook is: if the addressing mode is legal, the
scaling factor is free.

<rdar://problem/13806271>

llvm-svn: 183045

bf490d4a

Modify how the formulae are rated in Loop Strength Reduce. · 8aa7abe2

Quentin Colombet authored May 31, 2013

Namely, check whether the target allows folding more than one register in the
addressing mode and, if so, adjust the cost accordingly.

Prior to this commit, reg1 + scale * reg2 accesses were artificially preferred
to reg1 + reg2 accesses. Indeed, the cost model wrongly assumed that reg1 + reg2
needs a temporary register for the computation, whereas it was correctly
estimated for reg1 + scale * reg2.

<rdar://problem/13973908>

llvm-svn: 183021

8aa7abe2

May 25, 2013
- Replace Count{Leading,Trailing}Zeros_{32,64} with count{Leading,Trailing}Zeros. · df1ecbd7
  Michael J. Spencer authored May 24, 2013
```
llvm-svn: 182680
```
  df1ecbd7
May 09, 2013

[GVN] Split critical-edge on the fly, instead of postponing edge-splitting to next · 1d8d7e4d

Shuxin Yang authored May 09, 2013

  iteration.
  
  This is one step toward non-iterative GVN. My local hack suggests that getting rid
of iteration will speed up GVN by 30%+ on a medium-sized input (2k LOC, C++).
I cannot explain why not 2x or more at this moment.
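
For readers unfamiliar with the term, an edge is critical when its source block has more than one successor and its destination block has more than one predecessor. The following is a minimal Python sketch of splitting such an edge immediately, assuming a CFG stored as a dict from block name to successor list; the representation and the function names are illustrative and are not GVN's data structures.

```python
def is_critical(cfg, src, dst):
    """True if src has several successors and dst has several predecessors."""
    preds = [block for block, succs in cfg.items() if dst in succs]
    return len(cfg[src]) > 1 and len(preds) > 1


def split_edge(cfg, src, dst):
    """Insert a fresh block on the edge src -> dst and return its name."""
    new_block = f"{src}.{dst}.split"
    cfg[src] = [new_block if succ == dst else succ for succ in cfg[src]]
    cfg[new_block] = [dst]
    return new_block


def example():
    """entry branches to 'then' and 'merge'; entry -> merge is critical."""
    cfg = {"entry": ["then", "merge"], "then": ["merge"], "merge": []}
    if is_critical(cfg, "entry", "merge"):
        split_edge(cfg, "entry", "merge")
    return cfg
```

After the split, code can be placed in the new block without affecting the other path into the merge block.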

llvm-svn: 181532

1d8d7e4d

May 08, 2013
- Fix a bug in codegenprep where it was losing track of values OptimizeMemoryInst · 5fb1963f
  Nick Lewycky authored May 08, 2013
```
by switching to a ValueMap. Patch by Andrea DiBiagio!

llvm-svn: 181397
```
  5fb1963f
May 06, 2013

Rotate multi-exit loops even if the latch was simplified. · 9c72b071

Andrew Trick authored May 06, 2013

Test case by Michele Scandale!

Fixes PR10293: Load not hoisted out of loop with multiple exits.

There are a few regressions with this patch, now tracked by
rdar:13817079, and a roughly equal number of improvements. The
regressions are almost certainly bad luck because LoopRotate has very
little idea of whether rotation is profitable. Doing better requires a
more comprehensive solution.

This checkin is a quick fix that lacks generality (PR10293 has
a counter-example). But it trivially fixes the case in PR10293 without
interfering with other cases, and it does satisfy the criterion that
LoopRotate is a loop canonicalization pass that should avoid
heuristics and special cases.

I can think of two approaches that would probably be better in
the long run. Ultimately they may both make sense.

(1) LoopRotate should check that the current header would make a good
loop guard, and that the loop does not already have a sufficient
guard. The artificial SimplifiedLoopLatch check would be unnecessary,
and the design would be more general and canonical. Two difficulties:

- We need a strong guarantee that we won't endlessly rotate, so the
  analysis would need to be precise in order to avoid the
  SimplifiedLoopLatch precondition.

- Analyses like this are usually based on SCEV, which we don't want to
  rely on.

(2) Rotate on-demand in late loop passes. This could even be done by
shoving the loop back on the queue after the optimization that needs
it. This could work well when we find LICM opportunities in
multi-branch loops. This requires some work, and it doesn't really
solve the problem of SCEV wanting a loop guard before the analysis.

llvm-svn: 181230

9c72b071

May 03, 2013

Decompose GVN::processNonLocalLoad() (about 400 LOC) into smaller helper... · 637b9beb

Shuxin Yang authored May 03, 2013

Decompose GVN::processNonLocalLoad() (about 400 LOC) into smaller helper functions. No functional change.

This function consists of the following steps:
   1. Collect dependent memory accesses.
   2. Analyze availability.
   3. Perform full redundancy elimination, or
   4. Perform PRE, depending on the availability.

 Steps 2, 3 and 4 are now moved to three helper routines.

llvm-svn: 181047

637b9beb

May 02, 2013

[GV] Remove dead code which is really difficult to decipher. · af2c3ddf

Shuxin Yang authored May 02, 2013

Actually it took me a couple of hours trying to make sense of it,
only to find that it is dead code.  I guess the original author used
"allSingleSucc" to indicate whether there are any critical edges emanating
from some blocks, and tried to perform code motion (actually speculation)
in the presence of these critical edges; but later on he/she changed his/her
mind and decided to perform edge-splitting first.

llvm-svn: 180951

af2c3ddf

May 01, 2013

This patch breaks up Wrap.h so that it does not have to include all of · dec20e43

Filip Pizlo authored May 01, 2013

the things, and renames it to CBindingWrapping.h.  I also moved 
CBindingWrapping.h into Support/.

This new file just contains the macros for defining different wrap/unwrap 
methods.

The calls to those macros, as well as any custom wrap/unwrap definitions 
(like for array of Values for example), are put into corresponding C++ 
headers.

Doing this required some #include surgery, since some .cpp files relied 
on the fact that including Wrap.h implicitly caused the inclusion of a 
bunch of other things.

This also now means that the C++ headers will include their corresponding 
C API headers; for example Value.h must include llvm-c/Core.h.  I think 
this is harmless, since the C API headers contain just external function 
declarations and some C types, so I don't believe there should be any 
nasty dependency issues here.

llvm-svn: 180881

dec20e43

SROA: Generate selects instead of shuffles when blending values because this... · 1e211913

Nadav Rotem authored May 01, 2013

SROA: Generate selects instead of shuffles when blending values because this is the canonical form.
Shuffles are more difficult to lower and we usually don't touch them, while we do optimize selects more often.

llvm-svn: 180875

1e211913

Apr 27, 2013

Fix a XOR reassociation bug. · 04a4fd43

Shuxin Yang authored Apr 27, 2013

When the Reassociator optimizes "(x | C1)" ^ "(x & C2)", it may swap the two
subexpressions; however, it forgot to swap the cached constants (C1 and C2)
accordingly.
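
Why the stale constants matter can be shown with a small Python sketch. It assumes the rewrite in question has the form (x | c1) ^ (x & c2) => (x & c3) ^ c1 with c3 = ~c1 ^ c2; the function names are hypothetical, and the code only models the arithmetic, not the Reassociate pass itself.

```python
def original(x, or_const, and_const):
    """The expression before rewriting: (x | or_const) ^ (x & and_const)."""
    return (x | or_const) ^ (x & and_const)


def rewritten(x, or_const, and_const):
    """The rewritten form: (x & c3) ^ or_const, where c3 = ~or_const ^ and_const."""
    c3 = ~or_const ^ and_const
    return (x & c3) ^ or_const


def demo():
    x, c1, c2 = 0b1010, 0b1100, 0b0110
    expected = original(x, c1, c2)
    correct = rewritten(x, c1, c2)  # constants follow their operands
    stale = rewritten(x, c2, c1)    # operands swapped, constants left behind
    return expected, correct, stale
```

The rewrite is only valid when each constant stays paired with its own subexpression; feeding the constants in the wrong roles yields a different value for the same x.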

rdar://13739160

llvm-svn: 180676

04a4fd43

Apr 23, 2013

Move C++ code out of the C headers and into either C++ headers · 04d4e931

Eric Christopher authored Apr 22, 2013

or the C++ files themselves. This enables people to use
just a C compiler to interoperate with LLVM.

llvm-svn: 180063

04d4e931

Apr 22, 2013

Clarify that llvm.used can contain aliases. · 74f2e46e

Rafael Espindola authored Apr 22, 2013

Also add a check for llvm.used in the verifier and simplify clients now that
they can assume they have a ConstantArray.

llvm-svn: 180019

74f2e46e

Apr 21, 2013

SROA: Don't crash on a select with two identical operands. · 0212dc27

Benjamin Kramer authored Apr 21, 2013

This is an edge case that can happen if we modify a chain of multiple selects.
Update all operands in that case and remove the assert. PR15805.

llvm-svn: 179982

0212dc27

Apr 18, 2013
- Fix a comment, PR15777. · 8cf09416
  Chris Lattner authored Apr 18, 2013
```
llvm-svn: 179775
```
  8cf09416
Apr 15, 2013
- Fix a typo in comment. · 0f38c1e3
  Jim Grosbach authored Apr 15, 2013
```
llvm-svn: 179542
```
  0f38c1e3
Apr 09, 2013

Redo the fix Benjamin Kramer committed in r178793 about iterator invalidation in Reassociate. · 331f01dc

Shuxin Yang authored Apr 08, 2013

I brazenly think this change is slightly simpler than r178793 because: 
  - no "state" in functor
  - "OpndPtrs[i]" looks simpler than "&Opnds[OpndIndices[i]]" 

  While I can reproduce the problem in Valgrind, it is rather difficult to come up
with a standalone test case. The reason is that when an iterator is invalidated,
the stale invalidated elements are not yet clobbered by nonsense data, so the
optimizer can still proceed successfully.

  Thanks to Benjamin for fixing this bug and generously providing the test case.

llvm-svn: 179062

331f01dc

Apr 07, 2013

Fix PR15674 (and PR15603): a SROA think-o. · 0e8a52d1

Chandler Carruth authored Apr 07, 2013

The fix for PR14972 in r177055 introduced a real think-o in the *store*
side, likely because I was much more focused on the load side. While we
can arbitrarily widen (or narrow) a loaded value, we can't arbitrarily
widen a value to be stored, as that changes the width of memory access!
Lock down the code path in the store rewriting which would do this to
only handle the intended circumstance.
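
The asymmetry between loads and stores can be seen in a byte-level Python sketch. This is an illustration under stated assumptions (a little-endian bytearray as memory, hypothetical load and store helpers), not SROA's rewriting logic.

```python
def load(mem, offset, width):
    """Read `width` bytes at `offset` as a little-endian integer."""
    return int.from_bytes(mem[offset:offset + width], "little")


def store(mem, offset, width, value):
    """Write `value` as `width` little-endian bytes at `offset`."""
    mem[offset:offset + width] = value.to_bytes(width, "little")


def demo():
    mem = bytearray([0x11, 0x22, 0x33, 0x44])

    # Widening a load is harmless: read 4 bytes, keep only the low byte.
    widened_load_ok = (load(mem, 0, 4) & 0xFF) == load(mem, 0, 1)

    # A 1-byte store leaves the neighbouring bytes alone...
    narrow = bytearray(mem)
    store(narrow, 0, 1, 0xAA)

    # ...but the same value stored with a 4-byte access clobbers them.
    wide = bytearray(mem)
    store(wide, 0, 4, 0xAA)

    return widened_load_ok, narrow, wide
```

The widened load can always be truncated back, whereas the widened store has already overwritten bytes that the original program never touched.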

All of the existing tests continue to pass, and I've added a test from
the PR.

llvm-svn: 178974

0e8a52d1

Apr 05, 2013

Disable the optimization about promoting vector-element-access with symbolic index. · 95adf525

Shuxin Yang authored Apr 05, 2013

This optimization is unstable at this moment:
  1) it blocks us on a very important application
  2) see PR15200
  3) test6 and test7 in test/Transforms/ScalarRepl/dynamic-vector-gep.ll
     (the CHECK command compares the output against the wrong result)

   I personally believe this optimization should not have any impact on
autovectorized code, as the auto-vectorizer is supposed to place gather/scatter
in the "right" way.  Although in theory downstream optimizers might reveal
some gather/scatter optimization opportunities, the chance is quite slim.

   For hand-crafted vectorized code, in terms of redundancy elimination,
load-CSE, copy-propagation and DSE can collectively achieve the same result,
but in a much simpler way. Moreover, these optimizers are able to
improve the code in an incremental way; in contrast, SROA is sort of an all-or-none
approach. However, SROA might win slightly in stack size, as it tries to figure
out a stretch of memory tightly covering the area accessed by the dynamic index.

 rdar://13174884
 PR15200

llvm-svn: 178912

95adf525

Apr 04, 2013

Reassociate: Avoid iterator invalidation. · dd67654a

Benjamin Kramer authored Apr 04, 2013

OpndPtrs stored pointers into the Opnd vector that become invalid when the
vector grows. Store indices instead. Sadly I only have a large testcase that
only triggers under valgrind, so I didn't include it.

llvm-svn: 178793

dd67654a

Apr 01, 2013
- Correct assertion condition · 6662fd0f
  Shuxin Yang authored Apr 01, 2013
```
llvm-svn: 178484
```
  6662fd0f
Mar 30, 2013

Implement XOR reassociation. It is based on following rules: · 7b0c94e2

Shuxin Yang authored Mar 30, 2013

  rule 1: (x | c1) ^ c2 => (x & ~c1) ^ (c1^c2),
     only useful when c1=c2
  rule 2: (x & c1) ^ (x & c2) = (x & (c1^c2))
  rule 3: (x | c1) ^ (x | c2) = (x & c3) ^ c3 where c3 = c1 ^ c2
  rule 4: (x | c1) ^ (x & c2) => (x & c3) ^ c1, where c3 = ~c1 ^ c2
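
The four rules are bitwise identities, so they can be checked exhaustively on a small bit width. The snippet below is a sanity-check sketch in Python, not part of the patch; the 4-bit mask and the helper names are choices made for this illustration.

```python
from itertools import product

MASK = 0xF  # check on 4-bit values; each identity holds bit by bit


def inv(v):
    """Bitwise NOT restricted to the masked width."""
    return ~v & MASK


def rules_hold(x, c1, c2):
    rule1 = ((x | c1) ^ c2) == ((x & inv(c1)) ^ (c1 ^ c2))
    rule2 = ((x & c1) ^ (x & c2)) == (x & (c1 ^ c2))
    c3 = c1 ^ c2
    rule3 = ((x | c1) ^ (x | c2)) == ((x & c3) ^ c3)
    c3 = inv(c1) ^ c2
    rule4 = ((x | c1) ^ (x & c2)) == ((x & c3) ^ c1)
    return rule1 and rule2 and rule3 and rule4


all_hold = all(rules_hold(x, c1, c2)
               for x, c1, c2 in product(range(MASK + 1), repeat=3))
```

For rule 1, when c1 equals c2 the constant term c1 ^ c2 is zero, so the whole expression collapses to x & ~c1, which is why the rule is only profitable in that case.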

 It reduces an application's size (in terms of # of instructions) by 8.9%.
 Reviewed by Pete Cooper. Thanks a lot!

 rdar://13212115  

llvm-svn: 178409

7b0c94e2

Mar 24, 2013
- Minor cleanups. No functionality change. · 4f9d1e85
  Jakub Staszak authored Mar 24, 2013
```
llvm-svn: 177837
```
  4f9d1e85
- Use dyn_cast instead of isa && cast. · f6df1e3d
  Jakub Staszak authored Mar 24, 2013
```
No functionality change.

llvm-svn: 177836
```
  f6df1e3d
Mar 21, 2013

[SROA] Prefix names using a custom IRBuilder inserter. · 34f0c7fc

Chandler Carruth authored Mar 21, 2013

The key part of this is ensuring that name prefixes remain in a Twine
form until we get to a point where we can nuke them under NDEBUG. This
is tricky using the old APIs as they played fast and loose with Twine,
which is prone to serious error. The inserter is much cleaner as it is
actually in the call stack leading to the setName call, and so has
a good opportunity to prepend the prefix.

This matters more than you might imagine because most runs over an
alloca find a single partition, and rewrite 3 or 4 instructions
referring to it. As a consequence doing this lazily and exclusively with
Twine allows the optimizer to delete more of it and shaves another 2% to
3% off of the release build's SROA run time for PR15412. I also think
the APIs are cleaner, and the use of Twine is more reliable, so
I consider it a win-win despite the churn required to reach this state.
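
The idea of keeping a name in lazy form until it is known to be needed can be sketched in Python. Everything here is a hypothetical analogy: LazyName plays the role of a Twine, the keep_names flag stands in for a build without NDEBUG, and a dict stands in for an instruction. It is not the IRBuilder inserter API.

```python
class LazyName:
    """Remembers the pieces of a name; joins them only when render() is called."""

    def __init__(self, *parts):
        self.parts = parts

    def with_prefix(self, prefix):
        return LazyName(prefix, *self.parts)

    def render(self):
        return "".join(self.parts)


class PrefixingInserter:
    """Prepends a prefix at insertion time, or drops the name entirely."""

    def __init__(self, prefix, keep_names):
        self.prefix = prefix
        self.keep_names = keep_names
        self.renders = 0  # how many name strings were actually built

    def insert(self, instruction, name):
        if self.keep_names:
            instruction["name"] = name.with_prefix(self.prefix).render()
            self.renders += 1
        return instruction


debug = PrefixingInserter("alloca.0.", keep_names=True)
release = PrefixingInserter("alloca.0.", keep_names=False)
named = debug.insert({"op": "load"}, LazyName("load"))
unnamed = release.insert({"op": "load"}, LazyName("load"))
```

Because the string is only built inside insert, the configuration that discards names never pays for the concatenation.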

llvm-svn: 177631

34f0c7fc

simplify-libcalls: Removed unused variable · cf691565

Meador Inge authored Mar 21, 2013

The 'Modified' variable should have been removed from SimplifyLibCalls
in r177619, but was missed.  This commit removes it.

llvm-svn: 177622

cf691565

Move library call prototype attribute inference to functionattrs · 6b6a161c

Meador Inge authored Mar 21, 2013

The simplify-libcalls pass implemented a doInitialization hook to infer
function prototype attributes for well-known functions.  Given that the
simplify-libcalls pass is going away *and* that the functionattrs pass
is already in place to deduce function attributes, I am moving this logic
to the functionattrs pass.  This approach was discussed during patch
review:
http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20121126/157465.html.

llvm-svn: 177619

6b6a161c

Mar 20, 2013
- Fix a silly search-and-replace goof with r177495 that only broke · 0fad1752
  Chandler Carruth authored Mar 20, 2013
```
non-release builds.

llvm-svn: 177498
```
  0fad1752
- [SROA] Don't preserve the IR names in release builds. · d177f861
  Chandler Carruth authored Mar 20, 2013
```
This is especially important because the new SROA pass goes to great
lengths to provide helpful names for debugging, and as a consequence
they can become very slow to render.

Good for between 5% and 15% of the SROA runtime on some slow test cases
such as the one in PR15412.

llvm-svn: 177495
```
  d177f861
- Move the endif to the correct line so we don't have warnings about · 0941b662
  Chandler Carruth authored Mar 20, 2013
```
unused statistics variables.

llvm-svn: 177494
```
  0941b662
- Introduce some new statistics to help track the exact behavior of the · 5f5b6163
  Chandler Carruth authored Mar 20, 2013
```
new SROA pass.

llvm-svn: 177493
```
  5f5b6163
Mar 19, 2013
- Update global merge pass according to Duncan's advice: · 2393cb92
  Quentin Colombet authored Mar 19, 2013
```
- Remove useless includes
- Change misleading comments
- Move code into doFinalization

llvm-svn: 177445
```
  2393cb92
- IndVarSimplify: do not recompute an IV value outside of the loop if : · 87c473f0
  Arnaud A. de Grandmaison authored Mar 19, 2013
```
- it is trivially known to be used inside the loop in a way that can not be optimized away
- there is no use outside of the loop which can take advantage of the computation hoisting

llvm-svn: 177432
```
  87c473f0
- Revert "Cleanup some SCEV logic a bit." · f3a2544d
  Andrew Trick authored Mar 19, 2013
```
This reverts commit 82cd8f7382322bee7a71cdc31f7a923c44d37d32.

Just add a comment instead!

llvm-svn: 177377
```
  f3a2544d
- Cleanup some SCEV logic a bit. · de788665
  Andrew Trick authored Mar 19, 2013
```
Make the code more obvious to scan-build and humans.

llvm-svn: 177375
```
  de788665
- Tighten up an internal LSR API that should check for NULL. · a1c01ba8
  Andrew Trick authored Mar 19, 2013
```
No test case, but should fix a scan-build warning.

llvm-svn: 177374
```
  a1c01ba8
- Make method private. Keep coding standard. · bc421efd
  Jakub Staszak authored Mar 18, 2013
```
llvm-svn: 177348
```
  bc421efd
Mar 18, 2013
- Extend global merge pass to optionally consider global constant variables. · 8fc34097
  Quentin Colombet authored Mar 18, 2013
```
Also add some checks to not merge globals used within landing pad instructions or marked as "used".

llvm-svn: 177331
```
  8fc34097