Commits · f6b687e5d1f196ccc42d361497fe4be738d70c24 · Roger Ferrer / llvm-epi-0.8

May 12, 2012
- Teach Function::hasAddressTaken that BlockAddress doesn't really take · ca0c4996
  Jay Foad authored May 12, 2012
```
the address of a function.

llvm-svn: 156703
```
May 11, 2012
- objectsize: add a few more tests and fix a bug · e2cfd3ce
  Nuno Lopes authored May 11, 2012
```
llvm-svn: 156625
```
- Fix a minor logic mistake transforming compares in instcombine. PR12514. · e0a64d83
  Eli Friedman authored May 11, 2012
```
llvm-svn: 156600
```
- objectsize: add support for GEPs with non-constant indexes · f5730303
  Nuno Lopes authored May 10, 2012
```
add an additional parameter to InstCombiner::EmitGEPOffset() to force it to *not* emit operations with the NUW flag

llvm-svn: 156585
```
May 10, 2012
- Teach DeadStoreElimination to eliminate exit-block stores with phi addresses. · ed7c24e2
  Dan Gohman authored May 10, 2012
```
llvm-svn: 156558
```
- teach DSE and isInstructionTriviallyDead() about calloc · 300d6299
  Nuno Lopes authored May 10, 2012
```
llvm-svn: 156553
```
- Fix the objc_storeStrong recognizer to stop before walking off the · f8b19d09
  Dan Gohman authored May 09, 2012
```
end of a basic block if there's no store.

llvm-svn: 156520
```
May 09, 2012
- objectsize: · 7100f463
  Nuno Lopes authored May 09, 2012
```
refactor code a bit to enable future changes to support run-time information
add support to compute allocation sizes at run-time if penalty > 1 (e.g., malloc(x), calloc(x, y), and VLAs)

llvm-svn: 156515
```
- Remove unused variable to get rid of warning. · 28540adf
  Craig Topper authored May 09, 2012
```
llvm-svn: 156466
```
- Miscellaneous accumulated cleanups. · 41375a35
  Dan Gohman authored May 08, 2012
```
llvm-svn: 156445
```
- Fix objc_storeStrong pattern matching to catch a potential use of the · 61708d37
  Dan Gohman authored May 08, 2012
```
old value after the store but before it is released.
This fixes rdar:/11116986.

llvm-svn: 156442
```
May 08, 2012

Calling ReassociateExpression recursively is extremely dangerous since it will · 3bbb1d50

Duncan Sands authored May 08, 2012

replace the operands of expressions with only one use with undef and generate
a new expression for the original without using RAUW to update the original.
Thus any copies of the original expression held in a vector may end up
referring to some bogus value - and using a ValueHandle won't help since there
is no RAUW. There is already a mechanism for getting the effect of recursion
non-recursively: adding the value to be recursed on to RedoInsts. But it wasn't
being used systematically. Have various places where recursion had snuck in at
some point use the RedoInsts mechanism instead. Fixes PR12169.

llvm-svn: 156379
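
The RedoInsts idea in this message can be illustrated outside LLVM. What follows is a minimal sketch under stated assumptions, not the Reassociate pass's code: `Node`, `simplifyOnce`, and `runToFixpoint` are hypothetical names, and the rewrite is a toy constant fold. It shows a rewrite step that queues deferred work in a redo list instead of calling itself recursively, so no caller is left holding a stale copy of an expression that a nested call rewrote.

```cpp
#include <cassert>
#include <vector>

// Toy expression node (hypothetical): a constant or an add of two other nodes.
struct Node {
  bool IsConst;
  long Value;   // meaningful when IsConst is true
  int LHS, RHS; // operand indices, meaningful when IsConst is false
};

// One non-recursive rewrite step. Work that a recursive version would do
// immediately on the operands is pushed onto Redo instead.
void simplifyOnce(std::vector<Node> &Nodes, int N, std::vector<int> &Redo) {
  Node &Cur = Nodes[N];
  if (Cur.IsConst)
    return;
  const Node &L = Nodes[Cur.LHS];
  const Node &R = Nodes[Cur.RHS];
  if (L.IsConst && R.IsConst) {
    Cur = Node{true, L.Value + R.Value, -1, -1};
    return;
  }
  // Redo is drained last-in first-out, so N is queued before its operands
  // and is therefore revisited only after they have been handled.
  Redo.push_back(N);
  if (!L.IsConst)
    Redo.push_back(Cur.LHS);
  if (!R.IsConst)
    Redo.push_back(Cur.RHS);
}

// Driver: keep draining the redo list until nothing is left to revisit.
long runToFixpoint(std::vector<Node> &Nodes, int Root) {
  std::vector<int> Redo{Root};
  while (!Redo.empty()) {
    int N = Redo.back();
    Redo.pop_back();
    simplifyOnce(Nodes, N, Redo);
  }
  assert(Nodes[Root].IsConst && "adds over constants must fold completely");
  return Nodes[Root].Value;
}
```

Calling `runToFixpoint(Nodes, Root)` on a tree of adds over constants folds the whole tree while visiting each node at most twice. The only state shared between steps is the node table and the redo list.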

Allow NULL LoopPassManager argument in UnrollLoop. PR12734. · d29cd732
Andrew Trick authored May 08, 2012
```
llvm-svn: 156358
```

May 07, 2012

Teach reassociate to commute FMul's and FAdd's in order to canonicalize the... · f4f80e1f

Owen Anderson authored May 07, 2012

Teach reassociate to commute FMul's and FAdd's in order to canonicalize the order of their operands across instructions.  This allows for greater CSE opportunities.

llvm-svn: 156323
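
Why a fixed operand order exposes more CSE opportunities can be shown with a small sketch. This is not the Reassociate implementation: `BinOp`, `canonicalize`, and `countDistinct` are hypothetical names, operands are plain integer ids, and "smaller id first" stands in for whatever ranking the pass really uses.

```cpp
#include <cstddef>
#include <set>
#include <tuple>
#include <utility>
#include <vector>

// Toy binary instruction (hypothetical): an opcode and two operand ids.
struct BinOp {
  char Opcode;
  int LHS, RHS;
};

// Only addition and multiplication are treated as commutative here.
bool isCommutative(char Opcode) { return Opcode == '+' || Opcode == '*'; }

// Put the operands of a commutative operation into one fixed order.
BinOp canonicalize(BinOp I) {
  if (isCommutative(I.Opcode) && I.LHS > I.RHS)
    std::swap(I.LHS, I.RHS);
  return I;
}

// Count the expressions that remain distinct after canonicalization. Entries
// that collapse onto the same key are what a CSE pass could share.
std::size_t countDistinct(const std::vector<BinOp> &Insts) {
  std::set<std::tuple<char, int, int>> Seen;
  for (BinOp I : Insts) {
    I = canonicalize(I);
    Seen.insert(std::make_tuple(I.Opcode, I.LHS, I.RHS));
  }
  return Seen.size();
}
```

In this model `b * a` and `a * b` map to the same key once canonicalized, while `b - a` and `a - b` stay distinct because subtraction is not commuted.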

May 06, 2012

Switch the select to branch transformation on by default. · 3d38c17b

Benjamin Kramer authored May 06, 2012

The primitive conservative heuristic seems to give a slight overall
improvement while not regressing stuff. Make it available to wider
testing. If you notice any speed regressions (or significant code
size regressions) let me know!

llvm-svn: 156258

Remove trailing spaces. · cfc46f82
Jakub Staszak authored May 06, 2012
```
llvm-svn: 156257
```

May 05, 2012

CodeGenPrepare: Add a transform to turn selects into branches in some cases. · 047d7ca0

Benjamin Kramer authored May 05, 2012

This came up when a change in block placement formed a cmov and slowed down a
hot loop by 50%:

	ucomisd	(%rdi), %xmm0
	cmovbel	%edx, %esi

cmov is a really bad choice in this context because it doesn't get branch
prediction. If we emit it as a branch, an out-of-order CPU can do a better job
(if the branch is predicted right) and avoid waiting for the slow load+compare
instruction to finish. Of course it won't help if the branch is unpredictable,
but those are really rare in practice.

This patch uses a dumb conservative heuristic: it turns all cmovs that have one
use and a direct memory operand into branches. cmovs usually save some code
size, so we disable the transform in -Os mode. In-order architectures are also
unlikely to benefit; those are covered by the
"predictableSelectIsExpensive" flag.

It would be better to reuse branch probability info here, but BPI doesn't
currently support select instructions. It would make sense to use the same
heuristics as the if-converter pass, which performs the opposite of this
transform.


Test suite shows a small improvement here and there on corei7-level machines,
but the actual results depend a lot on the used microarchitecture. The
transformation is currently disabled by default and available by passing the
-enable-cgp-select2branch flag to the code generator.

Thanks to Chandler for the initial test case, and to him and Evan Cheng for providing
me with comments and test-suite numbers that were more stable than mine :)

llvm-svn: 156234
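
The heuristic described in this message can be summarised as a single predicate. The sketch below is an illustration only, not the CodeGenPrepare code: `SelectInfo`, `TargetInfo`, their fields, and `shouldFormBranch` are hypothetical names, and encoding "one use and a direct memory operand" as three booleans is an assumption made for brevity.

```cpp
// Hypothetical summary of the facts the heuristic looks at for one select.
struct SelectInfo {
  bool CondIsCompare;           // the select condition comes from a compare
  bool CompareHasOneUse;        // that compare feeds only this select
  bool CompareLoadsFromMemory;  // a compare operand is a load used only there
};

// Hypothetical summary of the target and optimization mode.
struct TargetInfo {
  bool PredictableSelectIsExpensive; // per the message, not set for in-order cores
  bool OptForSize;                   // -Os: keep the (usually smaller) cmov
};

// Returns true when the select should be rewritten as a branch.
bool shouldFormBranch(const SelectInfo &S, const TargetInfo &T) {
  // Bail out where the message says the transform is disabled or unhelpful.
  if (!T.PredictableSelectIsExpensive || T.OptForSize)
    return false;
  // Conservative pattern: a single-use compare fed directly from memory,
  // which is where the cmov would otherwise wait on the slow load+compare.
  return S.CondIsCompare && S.CompareHasOneUse && S.CompareLoadsFromMemory;
}
```

A caller would evaluate `shouldFormBranch` once per select. In this model, every case that does not match the pattern keeps the cmov.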

Small fix in InstCombineCasts.cpp. Restored "alloca + bitcast" reducing for... · cb2a1a34

Stepan Dyatkovskiy authored May 05, 2012

Small fix in InstCombineCasts.cpp. Restored "alloca + bitcast" reducing for case when alloca's size is calculated within the "add/sub/... nsw".
Also added fix to 2011-06-13-nsw-alloca.ll test.

llvm-svn: 156231

May 04, 2012

Teach the code extractor how to extract a sequence of blocks from · 6781821c
Chandler Carruth authored May 04, 2012
```
RegionInfo's RegionNode. This mirrors the logic for automating the
extraction from a Loop.

llvm-svn: 156208
```

Factor the computation of input and output sets into a public interface · 14316fcf

Chandler Carruth authored May 04, 2012

of the CodeExtractor utility. This allows speculatively computing input
and output sets to measure the likely size impact of the code
extraction.

These sets cannot be reused sadly -- we mutate the function prior to
forming the final sets used by the actual extraction.

The interface has been revamped slightly to make it easier to use
correctly by making the interface const and sinking the computation of
the number of exit blocks into the full extraction function and away
from the rest of this logic which just computed two output parameters.

llvm-svn: 156168
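
The notion of input and output sets can be made concrete with a small model. This is a sketch under stated assumptions rather than the CodeExtractor interface: `Inst`, `RegionIO`, and `findInputsOutputs` are hypothetical names, values and blocks are integer ids, and any operand id without a defining instruction is treated as a function argument, that is, as defined outside the region.

```cpp
#include <map>
#include <set>
#include <vector>

// Toy instruction (hypothetical): defines value Id in block Block and uses
// the values listed in Operands.
struct Inst {
  int Id;
  int Block;
  std::vector<int> Operands;
};

struct RegionIO {
  std::set<int> Inputs;  // defined outside the region, used inside it
  std::set<int> Outputs; // defined inside the region, used outside it
};

// Computes both sets without modifying its arguments, so a client can call
// it speculatively to estimate the cost of extracting Region.
RegionIO findInputsOutputs(const std::vector<Inst> &Insts,
                           const std::set<int> &Region) {
  std::map<int, int> DefBlock; // value id -> defining block
  for (const Inst &I : Insts)
    DefBlock[I.Id] = I.Block;

  RegionIO IO;
  for (const Inst &I : Insts) {
    bool UserInside = Region.count(I.Block) != 0;
    for (int Op : I.Operands) {
      auto It = DefBlock.find(Op);
      // Unknown ids model function arguments: defined outside the region.
      bool DefInside = It != DefBlock.end() && Region.count(It->second) != 0;
      if (UserInside && !DefInside)
        IO.Inputs.insert(Op);
      if (!UserInside && DefInside)
        IO.Outputs.insert(Op);
    }
  }
  return IO;
}
```

The sizes of `Inputs` and `Outputs` give a rough idea of how many parameters and returned values the extracted function would need, which is the kind of size estimate the message mentions.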

Rather than trying to gracefully handle input sequences with repeated · 44e13911

Chandler Carruth authored May 04, 2012

blocks, assert that this doesn't happen. We don't want to bother trying
to support this call pattern as it isn't necessary.

llvm-svn: 156167

Fix a goof with my previous commit by completely returning when we · 0a570552
Chandler Carruth authored May 04, 2012
```
detect an ineligible block rather than just breaking out of the loop.

llvm-svn: 156166
```
Hoist a safety assert from the extraction method into the construction · 2f5d0191
Chandler Carruth authored May 04, 2012
```
of the extractor itself.

llvm-svn: 156164
```

Move the CodeExtractor utility to a dedicated header file / source file, · 0fde0015

Chandler Carruth authored May 04, 2012

and expose it as a utility class rather than as free function wrappers.

The simple free-function interface works well for the bugpoint-specific
pass's uses of code extraction, but in an upcoming patch for more
advanced code extraction, they simply don't expose a rich enough
interface. I need to expose various stages of the process of doing the
code extraction and query information to decide whether or not to
actually complete the extraction or give up.

Rather than build up a new predicate model and pass that into these
functions, just take the class that was actually implementing the
functions and lift it up into a proper interface that can be used to
perform code extraction. The interface is cleaned up and re-documented
to work better in a header. It is also now set up to accept the blocks to
be extracted in the constructor rather than in a method.

In passing this essentially reverts my previous commit here exposing
a block-level query for eligibility of extraction. That is no longer
necessary with the richer interface, as clients can query the
extraction object for eligibility directly. This will reduce the number
of walks of the input basic block sequence by quite a bit which is
useful if this enters the normal optimization pipeline.

llvm-svn: 156163

Add 'landingpad' instructions to the list of instructions to ignore. · fa0ebcd1
Bill Wendling authored May 04, 2012
```
Also combine the code in the 'assert' statement.

llvm-svn: 156155
```

A pile of long over-due refactorings here. There are some very, *very* · da7513a8

Chandler Carruth authored May 04, 2012

minor behavior changes with this, but nothing I have seen evidence of in
the wild or expect to be meaningful. The real goal is unifying our logic
and simplifying the interfaces. A summary of the changes follows:

- Make 'callIsSmall' actually accept a callsite so it can handle
  intrinsics, and simplify callers appropriately.
- Nuke a completely bogus declaration of 'callIsSmall' that was still
  lurking in InlineCost.h... No idea how this got missed.
- Teach the 'isInstructionFree' about the various more intelligent
  'free' heuristics that got added to the inline cost analysis during
  review and testing. This mostly surrounds int->ptr and ptr->int casts.
- Switch most of the interesting parts of the inline cost analysis that
  were essentially computing 'is this instruction free?' to use the code
  metrics routine instead. This way we won't keep duplicating logic.

All of this is motivated by the desire to allow other passes to compute
a roughly equivalent 'cost' metric for a particular basic block as the
inline cost analysis. Sadly, re-using the same analysis for both is
really messy because only the actual inline cost analysis is ever going
to go to the contortions required for simplification, SROA analysis,
etc.

llvm-svn: 156140

Factor the logic for testing whether a basic block is viable for code · a46e6242

Chandler Carruth authored May 03, 2012

extraction into a public interface. Also clean it up and apply it more
consistently such that we check for landing pads *anywhere* in the
extracted code, not just in single-block extraction.

This will be used to guide decisions in passes that are planning to
eventually perform a round of code extraction.

llvm-svn: 156114

remove calls to calloc if the allocated memory is not used (it was already being done for malloc) · d4cf35d7
Nuno Lopes authored May 03, 2012
```
fix a few typos found by Chad in my previous commit

llvm-svn: 156110
```

May 03, 2012
- add support for calloc to objectsize lowering · d2b71e7f
  Nuno Lopes authored May 03, 2012
```
llvm-svn: 156102
```
- replace 'break's with 'return 0' in visitCallInst code for objectsize, since... · 22f6f3b0
  Nuno Lopes authored May 03, 2012
```
replace 'break's with 'return 0' in visitCallInst code for objectsize, since there is no need to fall back to visitCallSite.
This gives a 0.9% improvement in a test case

llvm-svn: 156069
```
- Whitespace cleanup. · c94d86c4
  Bill Wendling authored May 02, 2012
```
llvm-svn: 156034
```
May 02, 2012
- [tsan] typo and style (thanks to Nick Lewycky) · ae7188d9
  Kostya Serebryany authored May 02, 2012
```
llvm-svn: 155986
```
- The value held in the vector may be RAUW'ed by some of the canonicalization · 274ba89d
  Bill Wendling authored May 02, 2012
```
methods. Use a weak value handle to keep up with this.
PR12245

llvm-svn: 155984
```
May 01, 2012
- An instruction in a loop is not guaranteed to be executed just because the loop · 78ee67e8
  Nick Lewycky authored May 01, 2012
```
has no exit blocks. Fixes PR12706!

llvm-svn: 155884
```
- Add support for llvm.arm.neon.vmull* intrinsics to InstCombine. Fixes · 3a90fabd
  Lang Hames authored May 01, 2012
```
<rdar://problem/11291436>.

This is a second attempt at a fix for this, the first was r155468. Thanks
to Chandler, Bob and others for the feedback that helped me improve this.

llvm-svn: 155866
```
Apr 30, 2012

Second attempt at PR12573: · bf4b9afb

Bill Wendling authored Apr 30, 2012

Allow the "SplitCriticalEdge" function to split the edge to a landing pad. If
the pass is *sure* that it thinks it knows what it's doing, then it may go ahead
and specify that the landing pad can have its critical edge split. The loop
unswitch pass is one of these passes. It will split the critical edges of all
edges coming from a loop to a landing pad not within the loop. Doing so will
retain important loop analysis information, such as loop simplify.

llvm-svn: 155817

Use an ArrayRef instead of explicit vector type. · 325e6cd9
Bill Wendling authored Apr 30, 2012
```
llvm-svn: 155816
```
Remove hack from r154987. The problem persists even with it, so it's not even a good hack. · 712d85a8
Bill Wendling authored Apr 30, 2012
```
llvm-svn: 155813
```
Make sure HoistInsertPosition finds a position that is dominated by all · dd489314
Rafael Espindola authored Apr 30, 2012
```
inputs.

llvm-svn: 155809
```

Apr 27, 2012

Don't vectorize target-specific types (ppc_fp128, x86_fp80, etc.). · 27c32461

Hal Finkel authored Apr 27, 2012

Target specific types should not be vectorized. As a practical matter,
these types are already register matched (at least in the x86 case),
and codegen does not always work correctly (at least in the ppc case,
and this is not worth fixing because ppc_fp128 is currently broken and
will probably go away soon).

llvm-svn: 155729
