Commits · f5cca68c2c625e545ed5e47238fc176328517a1c · Roger Ferrer / llvm-epi-0.8

Dec 31, 2012
- Fix LICM's memory promotion optimization to preserve TBAA tags when · f5cca68c
  Chris Lattner authored Dec 31, 2012
```
promoting a store in a loop.  This was noticed when working on PR14753,
but isn't directly related.

llvm-svn: 171281
```
  f5cca68c
- teach instcombine to preserve TBAA tag when merging two stores, part of · eeefe1bc
  Chris Lattner authored Dec 31, 2012
```
PR14753

llvm-svn: 171279
```
  eeefe1bc
- Transform (A == C1 || A == C2) into (A & ~(C1 ^ C2)) == C1 · ea2b9b9d
  Jakub Staszak authored Dec 31, 2012
```
if C1 and C2 differ in only one bit.
Fixes PR14708.

llvm-svn: 171270
```
  ea2b9b9d
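The rewrite described in ea2b9b9d can be checked with a small model. This is a minimal sketch, assuming 32-bit unsigned values; the helper names `differInOneBit`, `eitherEqual`, and `maskedEqual` are hypothetical and are not taken from the LLVM patch. Clearing the differing bit in A can only produce C1 when C1 itself has that bit clear, so the sketch compares against `c1 & mask`, which matches the form in the commit title whenever C1 is the constant with the bit clear.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: true when c1 and c2 differ in exactly one bit,
// i.e. c1 ^ c2 is a power of two.
inline bool differInOneBit(std::uint32_t c1, std::uint32_t c2) {
  const std::uint32_t x = c1 ^ c2;
  return x != 0 && (x & (x - 1)) == 0;
}

// Original form: two compares joined by a logical or.
inline bool eitherEqual(std::uint32_t a, std::uint32_t c1, std::uint32_t c2) {
  return a == c1 || a == c2;
}

// Rewritten form: mask out the single differing bit, then compare once.
inline bool maskedEqual(std::uint32_t a, std::uint32_t c1, std::uint32_t c2) {
  assert(differInOneBit(c1, c2) && "only valid for one-bit differences");
  const std::uint32_t mask = ~(c1 ^ c2);
  // Masking c1 as well keeps the result correct regardless of which
  // constant has the differing bit set.
  return (a & mask) == (c1 & mask);
}
```

For example, with C1 = 4 and C2 = 6 the mask clears bit 1, and both 4 and 6 reduce to 4 under it, so a single compare covers both cases.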
Dec 30, 2012
- Support ppcf128 in SelectionDAG::getConstantFP · 6dbdd430
  Hal Finkel authored Dec 30, 2012
```
Fixes pr14751.

Patch by Kai; Thanks!

llvm-svn: 171261
```
  6dbdd430
- LoopVectorizer: Fix a bug in the code that updates the loop exiting block. · 0b37f143
  Nadav Rotem authored Dec 30, 2012
```
LCSSA PHIs may have undef values. The vectorizer updates values that are used by outside users such as PHIs.
The bug happened because undefs are not loop values. This patch handles these PHIs.

PR14725

llvm-svn: 171251
```
  0b37f143
- Tests: rewrite 'opt ... %s' to 'opt ... < %s' so that opt does not emit a ModuleID · 56bf2e18
  Dmitri Gribenko authored Dec 30, 2012
```
This is done to avoid odd test failures, like the one fixed in r171243.

llvm-svn: 171250
```
  56bf2e18
- Add a check to the test Analysis/ScalarEvolution/2010-09-03-RequiredTransitive.ll · 10c4b4d2
  Dmitri Gribenko authored Dec 30, 2012
```
This test did not test anything at all (except for opt crashing, but that was
not the reason why it was added).

llvm-svn: 171248
```
  10c4b4d2
- Tests: rewrite 'opt ... %s' to 'opt ... < %s' so that opt does not emit a ModuleID · b137c9e5
  Dmitri Gribenko authored Dec 30, 2012
```
This is done to avoid odd test failures, like the one fixed in r171243.

llvm-svn: 171246
```
  b137c9e5
- llvm/test/Transforms/GVN/null-aliases-nothing.ll: Fix a RUN line not to emit ModuleID. · 5a495a5c
  NAKAMURA Takumi authored Dec 30, 2012
```
Larry Evans reported that it fails if the source tree path contains "load", as in "download".

llvm-svn: 171243
```
  5a495a5c
Dec 28, 2012

Fix a stunning oversight in the inline cost analysis. It was never · 86ed5308

Chandler Carruth authored Dec 28, 2012

propagating one of the values it simplified to a constant across
a myriad of instructions. Notably, ptrtoint instructions when we had
a constant pointer (say, 0) didn't propagate that, blocking a massive
number of down-stream optimizations.

This was uncovered when investigating why we fail to inline and delete
the boilerplate in:

  #include <vector>

  void f() {
    std::vector<int> v;
    v.push_back(1);
  }

It turns out most of the efforts I've made thus far to improve the
analysis weren't making it far purely because of this. After this is
fixed, the store-to-load forwarding patch enables LLVM to optimize the
above to an empty function. We still can't nuke a second push_back, but
for different reasons.

There is a very real chance this will cause somewhat noticeable changes
in inlining behavior, so please let me know if you see regressions (or
improvements!) because of this patch.

llvm-svn: 171196

86ed5308

Teach the inline cost analysis about calls that can be simplified and · 753e21d0

Chandler Carruth authored Dec 28, 2012

how to propagate constants through insert and extract value
instructions.

With the recent improvements to instsimplify, this allows inline cost
analysis to constant fold through intrinsic functions, including notably
the with.overflow intrinsic math routines which often show up inside of
STL abstractions. This is yet another piece in the puzzle of breaking
down the code for:

  #include <vector>

  void f() {
    std::vector<int> v;
    v.push_back(1);
  }

But it still isn't enough. There is a pile of bugs in inline cost still
blocking this.

llvm-svn: 171195

753e21d0
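The `with.overflow` intrinsics mentioned in 753e21d0 return a two-field aggregate: the wrapped result and an overflow flag, which callers then pull apart with extractvalue. The following is a minimal sketch, in plain C++, of what constant folding a signed 32-bit add with overflow amounts to. `SAddResult` and `foldSAddWithOverflow` are hypothetical names used for illustration; they are not LLVM APIs.

```cpp
#include <cstdint>
#include <limits>

// Models the {i32, i1} aggregate: field 0 is the wrapped sum,
// field 1 is the overflow flag.
struct SAddResult {
  std::int32_t value;
  bool overflow;
};

// Hypothetical folder: computes both fields from two known constants.
inline SAddResult foldSAddWithOverflow(std::int32_t a, std::int32_t b) {
  // The exact sum always fits in 64 bits, so overflow is a range check.
  const std::int64_t wide =
      static_cast<std::int64_t>(a) + static_cast<std::int64_t>(b);
  // Unsigned addition wraps modulo 2^32 without undefined behavior.
  const std::uint32_t wrapped =
      static_cast<std::uint32_t>(a) + static_cast<std::uint32_t>(b);
  SAddResult r;
  // Modular on mainstream compilers; guaranteed since C++20.
  r.value = static_cast<std::int32_t>(wrapped);
  r.overflow = wide < std::numeric_limits<std::int32_t>::min() ||
               wide > std::numeric_limits<std::int32_t>::max();
  return r;
}
```

Once both fields are known constants, an extractvalue of either field is itself a constant, which is the kind of propagation the commit message describes.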

Teach instsimplify to use the constant folder where appropriate for · f6182155

Chandler Carruth authored Dec 28, 2012

constant folding calls. Add the initial tests for this which show that
now instsimplify can simplify blindingly obvious code patterns expressed
with both intrinsics and library calls.

llvm-svn: 171194

f6182155

AVX: Move the ZEXT/ANYEXT DAGCo optimizations to the lowering of these... · 3da9ac72

Nadav Rotem authored Dec 28, 2012

AVX: Move the ZEXT/ANYEXT DAGCo optimizations to the lowering of these optimizations. The old test cases still cover all of these lowerings and optimizations. The only change is that anyext no longer needs to zero a register, because it does not use the same code path as zero_extend.

llvm-svn: 171178

3da9ac72

Dec 27, 2012

[ASan] Fix lifetime intrinsics handling. Now for each intrinsic we check if it... · 29dd7f20

Alexey Samsonov authored Dec 27, 2012

[ASan] Fix lifetime intrinsics handling. Now for each intrinsic we check if it describes one of 'interesting' allocas. Assume that allocas can go through casts and phi-nodes before appearing as llvm.lifetime arguments.

llvm-svn: 171153

29dd7f20

On AVX/AVX2 the type v8i1 is legalized to v8i16, which is an XMM-sized · 2a054b44

Nadav Rotem authored Dec 27, 2012

register. In most cases we actually compare or select YMM-sized registers
and mixing the two types creates horrible code. This commit optimizes
some of the transition sequences.

PR14657.

llvm-svn: 171148

2a054b44

For the dwarf5 split debug info code split out the string section · 3bf29fda
Eric Christopher authored Dec 27, 2012
```
per compile unit/skeleton compile unit. Update tests accordingly.

llvm-svn: 171133
```
3bf29fda
FileCheck-ize. · c8a88ee6
Eric Christopher authored Dec 27, 2012
```
llvm-svn: 171132
```
c8a88ee6
FileCheck-ize. · d6152aab
Eric Christopher authored Dec 27, 2012
```
llvm-svn: 171131
```
d6152aab

Right now all of the relocations are 32-bit dwarf, and the relocation · 5a6acfa4

Eric Christopher authored Dec 27, 2012

information doesn't return an addend for Rel relocations. Go ahead
and use this information to fix relocation handling inside dwarfdump
for 32-bit ELF REL.

llvm-svn: 171126

5a6acfa4
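ELF REL entries carry no explicit addend field; the addend is implicit in the bytes stored at the relocation site. A minimal sketch of resolving a 32-bit absolute relocation (symbol value plus addend) under that convention follows. It assumes little-endian section contents, and `readLE32` and `resolveRel32` are hypothetical names, not the dwarfdump code from 5a6acfa4.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Reads a 32-bit little-endian value from buf at byte offset off.
inline std::uint32_t readLE32(const std::vector<std::uint8_t>& buf,
                              std::size_t off) {
  assert(off + 4 <= buf.size() && "relocation site out of bounds");
  return static_cast<std::uint32_t>(buf[off]) |
         (static_cast<std::uint32_t>(buf[off + 1]) << 8) |
         (static_cast<std::uint32_t>(buf[off + 2]) << 16) |
         (static_cast<std::uint32_t>(buf[off + 3]) << 24);
}

// Hypothetical resolver for a REL-style 32-bit absolute relocation:
// the addend is whatever is already stored at the relocation site.
inline std::uint32_t resolveRel32(const std::vector<std::uint8_t>& section,
                                  std::size_t offset,
                                  std::uint32_t symbolValue) {
  const std::uint32_t implicitAddend = readLE32(section, offset);
  return symbolValue + implicitAddend;
}
```

With RELA entries the addend would instead come from the relocation record itself, so the read from the section contents would be skipped.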

If all of the write objects are identified then we can vectorize the loop even... · 5350cd31

Nadav Rotem authored Dec 26, 2012

If all of the write objects are identified then we can vectorize the loop even if the read objects are unidentified.

PR14719.

llvm-svn: 171124

5350cd31

Dec 26, 2012
- LoopVectorizer: Optimize the vectorization of consecutive memory access when... · 3f7c4f36
  Nadav Rotem authored Dec 26, 2012
```
LoopVectorizer: Optimize the vectorization of consecutive memory access when the iteration step is -1

llvm-svn: 171114
```
  3f7c4f36
- [msan] Raise alignment of origin stores/loads when possible. · 5eb5bf8b
  Evgeniy Stepanov authored Dec 26, 2012
```
Origin alignment is as high as the alignment of the corresponding application
location, but never less than 4.

llvm-svn: 171110
```
  5eb5bf8b
- llvm/test/CodeGen/X86: FileCheck-ize two tests in r171083. · 40aa3285
  NAKAMURA Takumi authored Dec 26, 2012
```
llvm-svn: 171084
```
  40aa3285
- llvm/test/CodeGen/X86: Disable avx in two tests corresponding to r171082. · 334f6853
  NAKAMURA Takumi authored Dec 26, 2012
```
llvm-svn: 171083
```
  334f6853
- BBVectorize: Use VTTI to compute costs for intrinsics vectorization · 30e95a8e
  Hal Finkel authored Dec 26, 2012
```
For the time being this includes only some dummy test cases. Once the
generic implementation of the intrinsics cost function does something other
than assuming scalarization in all cases, or some target specializes the
interface, some real test cases can be added.

Also, for consistency, I changed the type of IID from unsigned to Intrinsic::ID
in a few other places.

llvm-svn: 171079
```
  30e95a8e
- LoopVectorize: Enable vectorization of the fmuladd intrinsic · b44f8901
  Hal Finkel authored Dec 25, 2012
```
llvm-svn: 171076
```
  b44f8901
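The [msan] origin alignment rule from 5eb5bf8b above (as high as the application location's alignment, but never less than 4) reduces to a single clamp. A minimal sketch, where `kMinOriginAlignment` and `originAlignment` are names chosen for illustration and the function signature is an assumption rather than the pass's actual interface:

```cpp
#include <algorithm>

// Assumed lower bound, taken from the commit message ("never less than 4").
constexpr unsigned kMinOriginAlignment = 4;

// Hypothetical helper: alignment to use for an origin load or store,
// given the alignment of the corresponding application access.
inline unsigned originAlignment(unsigned appAlignment) {
  return std::max(appAlignment, kMinOriginAlignment);
}
```

So a byte-aligned application access still gets a 4-byte-aligned origin access, while a 16-byte-aligned one keeps its 16-byte alignment.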
Dec 25, 2012

BBVectorize: Enable vectorization of the fmuladd intrinsic · 2a456112
Hal Finkel authored Dec 25, 2012
```
llvm-svn: 171075
```
2a456112

Loosen scheduling restrictions on the PPC dcbt intrinsic · 2ebe6d08

Hal Finkel authored Dec 25, 2012

As with the prefetch intrinsic to which it maps, simply have dcbt
marked as reading from and writing to its arguments instead of having
unmodeled side effects. While this might cause unwanted code motion
(because aliasing checks don't really capture cache-line sharing),
it is more important that prefetches in unrolled loops don't block
the scheduler from rearranging the unrolled loop body.

llvm-svn: 171073

2ebe6d08

Expand PPC64 atomic load and store · 1b5ff08d

Hal Finkel authored Dec 25, 2012

Use of store or load with the atomic specifier on 64-bit types would
cause instruction-selection failures. As with the 32-bit case, these
can use the default expansion in terms of cmp-and-swap.

llvm-svn: 171072

1b5ff08d
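Expressing an atomic load or store in terms of compare-and-swap can be illustrated with `std::atomic`. This is a minimal sketch of the general idea only; whether the default expansion has exactly this shape is an assumption, and `loadViaCas` and `storeViaCas` are hypothetical names.

```cpp
#include <atomic>
#include <cstdint>

// Load: attempt to swap 0 for 0. If the location holds 0, the swap
// succeeds and changes nothing; otherwise it fails and writes the
// current value into `expected`. Either way `expected` ends up holding
// the value that was observed.
inline std::uint64_t loadViaCas(std::atomic<std::uint64_t>& loc) {
  std::uint64_t expected = 0;
  loc.compare_exchange_strong(expected, 0);
  return expected;
}

// Store: retry the swap until it succeeds. Each failure refreshes
// `expected` with the value currently in the location.
inline void storeViaCas(std::atomic<std::uint64_t>& loc,
                        std::uint64_t desired) {
  std::uint64_t expected = 0;
  while (!loc.compare_exchange_weak(expected, desired)) {
    // `expected` was updated by the failed attempt; try again.
  }
}
```

The cost of this approach is that even a load performs a read-modify-write, which is why it serves as a fallback rather than a preferred lowering.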

[msan] Fix handling of vectors of pointers. · f19c086d

Evgeniy Stepanov authored Dec 25, 2012

VectorType::getInteger() cannot be used with them, because pointer size
depends on the target.

llvm-svn: 171070

f19c086d

[msan] Fix handling of select with vector condition. · ec837128
Evgeniy Stepanov authored Dec 25, 2012
```
llvm-svn: 171069
```
ec837128
Harden test so it's not affected by changes to compare lowering. · a9f265ee
Benjamin Kramer authored Dec 25, 2012
```
This only failed on hosts that don't have SSE41.

llvm-svn: 171066
```
a9f265ee
X86: Shave off one shuffle from the pcmpeqq sequence for SSE2 by making use of the commutativity of AND. · 81b5a8fd
Benjamin Kramer authored Dec 25, 2012
```
llvm-svn: 171064
```
81b5a8fd

X86: Custom lower <2 x i64> eq and ne when SSE41 is not available. · df4af41b

Benjamin Kramer authored Dec 25, 2012

pcmpeqd, pshufd, pshufd, pand is cheaper than unpack + cmpq, sbbq, cmpq, sbbq + pack.
Small speedup on loop-vectorized viterbi (-march=core2).

llvm-svn: 171063

df4af41b
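The idea behind building a 64-bit lane equality from 32-bit compares can be shown with a scalar model: compare the 32-bit halves, swap the halves within each 64-bit element, and AND the two masks so that an element is all-ones only when both halves matched. This is a sketch under assumptions: the type `V4` and the helper names are hypothetical, and the model uses a single lane swap, which is not necessarily the exact shuffle sequence the backend emits.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Four 32-bit lanes: lanes 0,1 form 64-bit element 0; lanes 2,3 form element 1.
using V4 = std::array<std::uint32_t, 4>;

// Models pcmpeqd: per-lane equality producing all-ones or all-zeros.
inline V4 cmpEq32(const V4& a, const V4& b) {
  V4 r{};
  for (std::size_t i = 0; i < 4; ++i)
    r[i] = (a[i] == b[i]) ? 0xFFFFFFFFu : 0u;
  return r;
}

// Models a pshufd that swaps the two 32-bit halves of each 64-bit element.
inline V4 swapHalves(const V4& v) {
  V4 r{};
  r[0] = v[1];
  r[1] = v[0];
  r[2] = v[3];
  r[3] = v[2];
  return r;
}

// Models pand: per-lane bitwise AND.
inline V4 andLanes(const V4& a, const V4& b) {
  V4 r{};
  for (std::size_t i = 0; i < 4; ++i)
    r[i] = a[i] & b[i];
  return r;
}

// 64-bit equality: both halves of an element must compare equal.
inline V4 cmpEq64(const V4& a, const V4& b) {
  const V4 eq = cmpEq32(a, b);
  return andLanes(eq, swapHalves(eq));
}
```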

Dec 24, 2012
- Fix typo "Makre" -> "Make". · fb432580
  Nick Lewycky authored Dec 24, 2012
```
llvm-svn: 171043
```
  fb432580
- llvm/test/CodeGen/X86/fold-vex.ll: Add explicit triple. · 1b18db7e
  NAKAMURA Takumi authored Dec 24, 2012
```
llvm-svn: 171029
```
  1b18db7e
- Some x86 instructions can load/store one of the operands to memory. On SSE,... · dc0ad92b
  Nadav Rotem authored Dec 24, 2012
```
Some x86 instructions can load/store one of the operands to memory. On SSE, this memory needs to be aligned.
When these instructions are encoded in VEX (on AVX) there is no such requirement. This changes the folding
tables and removes the alignment restrictions from VEX-encoded instructions.

llvm-svn: 171024
```
  dc0ad92b
- LoopVectorizer: When checking for vectorizable types, also check · 5f7c12cf
  Nadav Rotem authored Dec 24, 2012
```
the StoreInst operands.

PR14705.

llvm-svn: 171023
```
  5f7c12cf
- LoopVectorizer: Fix an endless loop in the code that looks for reductions. · bd5d1d83
  Nadav Rotem authored Dec 24, 2012
```
The bug was in the code that detects PHIs in if-then-else block sequence.

PR14701.

llvm-svn: 171008
```
  bd5d1d83
Dec 23, 2012

CostModel: Change the default target-independent implementation for finding · cf9999d9

Nadav Rotem authored Dec 23, 2012

the cost of arithmetic functions. We now assume that the cost of arithmetic
operations that are marked as Legal or Promote is low, but that the cost of
ops marked as Custom is higher.

llvm-svn: 171002

cf9999d9
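The rule in cf9999d9 amounts to a lookup from an operation's legalization action to a relative cost. A minimal sketch follows; the enum, the function name, and the numeric costs 1 and 2 are all hypothetical, since the commit message states only that Legal and Promote are cheap and Custom is more expensive.

```cpp
// Hypothetical mirror of the three legalization actions named in the
// commit message.
enum class LegalizeAction { Legal, Promote, Custom };

// Hypothetical cost function. The values 1 and 2 are placeholders that
// only preserve the ordering described: Legal == Promote < Custom.
inline unsigned arithmeticOpCost(LegalizeAction action) {
  switch (action) {
  case LegalizeAction::Legal:
  case LegalizeAction::Promote:
    return 1;
  case LegalizeAction::Custom:
    return 2;
  }
  return 2; // Unreachable for valid enum values; keeps compilers quiet.
}
```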