Commits · 19e30d5a7d3369feae36cbe7f4f4c1f358881c41 · Roger Ferrer / llvm-epi-0.8

Jan 21, 2011
- Actually check memcpy lengths, instead of just commenting about · 19e30d5a
  Dan Gohman authored Jan 21, 2011
```
how they should be checked.

llvm-svn: 123999
```
  19e30d5a
- Just because we have determined that an (fcmp | fcmp) is true for A < B, · a834200d
  Owen Anderson authored Jan 21, 2011
```
A == B, and A > B, does not mean we can fold it to true.  We still need to
check for A ? B (A unordered B).

llvm-svn: 123993
```
  a834200d
- SCCP doesn't actually preserve the CFG. It will delete and insert terminator · ae0275e0
  Nick Lewycky authored Jan 21, 2011
```
instructions.

llvm-svn: 123973
```
  ae0275e0
- fix PR9013, an infinite loop in instcombine. · b5e15d19
  Chris Lattner authored Jan 21, 2011
```
llvm-svn: 123968
```
  b5e15d19
- update obsolete comment. · f4ca47bd
  Chris Lattner authored Jan 21, 2011
```
llvm-svn: 123965
```
  f4ca47bd
- Don't try to pull vector bitcasts that change the number of elements through · 6a083cf8
  Nick Lewycky authored Jan 21, 2011
```
a select. A vector select is pairwise on each element so we'd need a new
condition with the right number of elements to select on. Fixes PR8994.

llvm-svn: 123963
```
  6a083cf8
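
A note on the fcmp fix in a834200d above: the three ordered outcomes (A < B, A == B, A > B) do not cover every case, because either operand may be NaN. The sketch below uses Python floats, which follow IEEE 754 comparison semantics, to show that the disjunction is false for unordered operands. The function names are hypothetical and are not taken from the LLVM source.

```python
import math


def ordered_disjunction(a: float, b: float) -> bool:
    """(a < b) | (a == b) | (a > b): the three ordered outcomes OR'd together."""
    return (a < b) or (a == b) or (a > b)


def is_unordered(a: float, b: float) -> bool:
    """Two floats are unordered when at least one of them is NaN."""
    return math.isnan(a) or math.isnan(b)


nan = float("nan")
samples = [(1.0, 2.0), (2.0, 2.0), (3.0, 2.0), (nan, 2.0), (1.0, nan), (nan, nan)]
for a, b in samples:
    # The disjunction holds exactly when the operands are ordered,
    # so it cannot be folded to a constant true.
    assert ordered_disjunction(a, b) == (not is_unordered(a, b))
```

Folding the expression to true would therefore be wrong whenever a NaN can reach either operand.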
Jan 20, 2011

At -O1, -O2 and -O3 the early-cse pass is run before instcombine has run. According to my · 8fb2c382

Duncan Sands authored Jan 20, 2011

auto-simplifier the transform most missed by early-cse is (zext X) != 0 -> X != 0.
This patch adds this transform and some related logic to InstructionSimplify
and removes some of the logic from instcombine (unfortunately not all because
there are several situations in which instcombine can improve things by making
new instructions, whereas instsimplify is not allowed to do this). At -O2 this
often results in more than 15% more simplifications by early-cse, and results in
hundreds of lines of bitcode being eliminated from the testsuite. I did see some
small negative effects in the testsuite, for example a few additional instructions
in three programs. One program, 483.xalancbmk, got an additional 35 instructions,
which seems to be due to a function getting an additional instruction and then
being inlined all over the place.

llvm-svn: 123911

8fb2c382
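
The transform (zext X) != 0 -> X != 0 from 8fb2c382 is valid because zero extension only adds zero bits on top of the value and so never changes whether it is zero. A minimal sketch that checks this exhaustively for a small bit width follows. The helper names and the choice of 8 and 32 bits are assumptions made for illustration.

```python
def zext(x: int, from_bits: int, to_bits: int) -> int:
    """Zero-extend: keep the low from_bits of x and fill the upper bits with zeros."""
    assert 0 < from_bits <= to_bits
    return x & ((1 << from_bits) - 1)


def zext_ne_zero_is_equivalent(from_bits: int = 8, to_bits: int = 32) -> bool:
    """Check (zext X) != 0 against X != 0 for every from_bits-wide value."""
    return all(
        (zext(x, from_bits, to_bits) != 0) == (x != 0)
        for x in range(1 << from_bits)
    )
```

Because the simplified form reuses the existing value X and creates no new instruction, it fits the restriction the commit message describes for InstructionSimplify.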

Jan 19, 2011
- Add unnamed_addr when we can show that address of a global is not used. · fc355bc0
  Rafael Espindola authored Jan 19, 2011
```
llvm-svn: 123834
```
  fc355bc0
Jan 18, 2011
- fix rdar://8878965, a regression I introduced with the recent · 86d56c65
  Chris Lattner authored Jan 18, 2011
```
llvm.objectsize changes.

llvm-svn: 123771
```
  86d56c65
- Convert a std::map to a DenseMap for another 1.7% speedup on -scalarrepl. · fc210c79
  Cameron Zwarich authored Jan 18, 2011
```
llvm-svn: 123732
```
  fc210c79
- Make a std::vector a SmallVector<*, 32> like the other vectors in the same · 6968c41a
  Cameron Zwarich authored Jan 18, 2011
```
function. This seems to be about a 1.5% speedup of -scalarrepl on test-suite
with SPEC2000 and SPEC2006.

llvm-svn: 123731
```
  6968c41a
- Reduce indentation and remove commented out code. · ecd5b9ab
  Rafael Espindola authored Jan 18, 2011
```
llvm-svn: 123729
```
  ecd5b9ab
- Remove code for updating dominance frontiers and some outdated references to · b703654e
  Cameron Zwarich authored Jan 18, 2011
```
dominance and post-dominance frontiers.

llvm-svn: 123725
```
  b703654e
- Remove outdated references to dominance frontiers. · 4694e695
  Cameron Zwarich authored Jan 18, 2011
```
llvm-svn: 123724
```
  4694e695
Jan 17, 2011

Remove dead code that I apparently wrote a while back. We seem to be doing well enough · 459e0799

Owen Anderson authored Jan 17, 2011

without whatever this was trying to do.  When/if someone has the time to do some empirical
evaluations, it might be worth it to figure out what this code was trying to do and see if
it's worth resurrecting/fixing.

llvm-svn: 123684

459e0799

Roll r123609 back in with two changes that fix test failures with expensive · b410858a

Cameron Zwarich authored Jan 17, 2011

checks enabled:

1) Use '<' to compare integers in a comparison function rather than '<='.

2) Use the uniqued set DefBlocks rather than Info.DefiningBlocks to initialize
the priority queue.

The speedup of scalarrepl on test-suite + SPEC2000 + SPEC2006 is a bit less, at
just under 16% rather than 17%.

llvm-svn: 123662

b410858a
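
On change 1) in b410858a: C++ standard library sorting and heap routines require a comparison function that is a strict weak ordering, and '<=' is not one because it reports an element as ordered before itself. The commit does not name the function involved, so the sketch below only illustrates the general property. All names in it are hypothetical.

```python
def less(a: int, b: int) -> bool:
    return a < b


def less_equal(a: int, b: int) -> bool:
    return a <= b


def is_strict_ordering(comp, values) -> bool:
    """Check two necessary properties of a strict weak ordering over all pairs."""
    # Irreflexive: no element compares before itself.
    irreflexive = all(not comp(v, v) for v in values)
    # Asymmetric: a before b rules out b before a.
    asymmetric = all(
        not (comp(a, b) and comp(b, a)) for a in values for b in values
    )
    return irreflexive and asymmetric


levels = [0, 1, 1, 2, 3]  # hypothetical integer keys, including a duplicate
assert is_strict_ordering(less, levels)
assert not is_strict_ordering(less_equal, levels)
```

A comparator that fails these properties can appear to work in ordinary builds and still trip consistency checks in checked builds, which is consistent with the failures described above.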

Roll out r123609 due to failures on the llvm-x86_64-linux-checks bot. · 67431d79
Cameron Zwarich authored Jan 17, 2011
```
llvm-svn: 123618
```
67431d79

Eliminate the use of dominance frontiers in PromoteMemToReg. In addition to · 814cd923

Cameron Zwarich authored Jan 17, 2011

eliminating a potentially quadratic data structure, this also gives a 17%
speedup when running -scalarrepl on test-suite + SPEC2000 + SPEC2006. My initial
experiment gave a greater speedup, around 25%, but I moved the dominator tree
level computation from dominator tree construction to PromoteMemToReg.

Since this approach to computing IDFs has a much lower overhead than the old
code using precomputed DFs, it is worth looking at using this new code for the
second scalarrepl pass as well.

llvm-svn: 123609

814cd923
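
For readers unfamiliar with computing iterated dominance frontiers (IDFs) without precomputed DFs, here is a minimal sketch of one level-based formulation, in the style of Sreedhar and Gao's phi-placement algorithm. It processes defining blocks from the bottom of the dominator tree upwards using a priority queue keyed on tree level. Whether it matches the committed code in detail is an assumption. The function name, the dictionary-based CFG and the precomputed `idom` map are hypothetical, and the liveness pruning that PromoteMemToReg performs is omitted.

```python
import heapq


def iterated_dominance_frontier(succs, idom, def_blocks):
    """succs: block -> list of CFG successors; idom: block -> immediate
    dominator (None for the entry block); def_blocks: blocks with a definition."""
    defs = set(def_blocks)

    # Dominator tree children, derived from the idom map.
    children = {block: [] for block in succs}
    for block, parent in idom.items():
        if parent is not None:
            children[parent].append(block)

    level = {}

    def level_of(block):
        # Depth of the block in the dominator tree (entry block is level 0).
        if block not in level:
            parent = idom[block]
            level[block] = 0 if parent is None else level_of(parent) + 1
        return level[block]

    # heapq is a min-heap, so negate the level to pop the deepest block first.
    heap = [(-level_of(block), block) for block in defs]
    heapq.heapify(heap)

    idf = set()
    walked = set()  # dominator tree nodes whose subtree walk is already done
    while heap:
        neg_level, root = heapq.heappop(heap)
        root_level = -neg_level
        worklist = [root]
        walked.add(root)
        while worklist:
            node = worklist.pop()
            for succ in succs[node]:
                if idom[succ] == node:
                    continue  # dominator tree edge, not a join edge
                if level_of(succ) > root_level:
                    continue  # target lies too deep to be in the root's frontier
                if succ in idf:
                    continue
                idf.add(succ)
                if succ not in defs:
                    # A frontier block acts as a new definition point.
                    heapq.heappush(heap, (-level_of(succ), succ))
            for child in children[node]:
                if child not in walked:
                    walked.add(child)
                    worklist.append(child)
    return idf
```

The `defs` set plays the role of a uniqued collection of defining blocks, so each block enters the queue at most once.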

Jan 16, 2011

Teach DAE to look for functions whose arguments are unused, and change all... · d3db8334

Anders Carlsson authored Jan 16, 2011

Teach DAE to look for functions whose arguments are unused, and change all callers to pass in an undef value instead.

llvm-svn: 123596

d3db8334

tidy up a comment, as suggested by duncan · 7c9f4c9c
Chris Lattner authored Jan 16, 2011
```
llvm-svn: 123590
```
7c9f4c9c

Don't merge two constants if we care about the address of both. · 751677a0

Rafael Espindola authored Jan 16, 2011

This fixes the original testcase in PR8927. It also causes a clang
binary built with a patched clang to increase in size by 0.21%.

We can probably get some of the size back by writing a pass that
detects that a global never has its pointer compared and adds
unnamed_addr to it (maybe extend global opt). It is also possible that
there are some other cases clang could add unnamed_addr to.

I will investigate extending globalopt next.

llvm-svn: 123584

751677a0
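
The rule in 751677a0 can be read as: two constants with identical contents may be merged only if the address of at least one of them is not significant. Below is a minimal sketch of that predicate under this reading. The `GlobalConst` class and `can_merge` function are hypothetical models, not LLVM APIs.

```python
from dataclasses import dataclass


@dataclass
class GlobalConst:
    """Hypothetical stand-in for a constant global variable."""
    name: str
    initializer: bytes
    unnamed_addr: bool = False  # True when the address is not significant


def can_merge(a: GlobalConst, b: GlobalConst) -> bool:
    """Merge only equal constants, and never when both addresses matter."""
    same_contents = a.initializer == b.initializer
    both_addresses_matter = not a.unnamed_addr and not b.unnamed_addr
    return same_contents and not both_addresses_matter
```

Under this model, every global that gains unnamed_addr becomes a merge candidate again, which is why the message expects to recover some of the size increase.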

fix PR8932, a case where arg promotion could infinitely promote. · e5f8de86
Chris Lattner authored Jan 16, 2011
```
llvm-svn: 123574
```
e5f8de86
simplify a little · ed1fb92c
Chris Lattner authored Jan 16, 2011
```
llvm-svn: 123573
```
ed1fb92c

if an alloca is only ever accessed as a unit, and is accessed with load/store instructions, · 6fab2e94

Chris Lattner authored Jan 16, 2011

then don't try to decimate it into its individual pieces.  This will just make a mess of the
IR and is pointless if none of the elements are individually accessed.  This was generating
really terrible code for std::bitset (PR8980) because it happens to be lowered by clang
as an {[8 x i8]} structure instead of {i64}.

The testcase now is optimized to:

define i64 @test2(i64 %X) {
  br label %L2

L2:                                               ; preds = %0
  ret i64 %X
}

before we generated:

define i64 @test2(i64 %X) {
  %sroa.store.elt = lshr i64 %X, 56
  %1 = trunc i64 %sroa.store.elt to i8
  %sroa.store.elt8 = lshr i64 %X, 48
  %2 = trunc i64 %sroa.store.elt8 to i8
  %sroa.store.elt9 = lshr i64 %X, 40
  %3 = trunc i64 %sroa.store.elt9 to i8
  %sroa.store.elt10 = lshr i64 %X, 32
  %4 = trunc i64 %sroa.store.elt10 to i8
  %sroa.store.elt11 = lshr i64 %X, 24
  %5 = trunc i64 %sroa.store.elt11 to i8
  %sroa.store.elt12 = lshr i64 %X, 16
  %6 = trunc i64 %sroa.store.elt12 to i8
  %sroa.store.elt13 = lshr i64 %X, 8
  %7 = trunc i64 %sroa.store.elt13 to i8
  %8 = trunc i64 %X to i8
  br label %L2

L2:                                               ; preds = %0
  %9 = zext i8 %1 to i64
  %10 = shl i64 %9, 56
  %11 = zext i8 %2 to i64
  %12 = shl i64 %11, 48
  %13 = or i64 %12, %10
  %14 = zext i8 %3 to i64
  %15 = shl i64 %14, 40
  %16 = or i64 %15, %13
  %17 = zext i8 %4 to i64
  %18 = shl i64 %17, 32
  %19 = or i64 %18, %16
  %20 = zext i8 %5 to i64
  %21 = shl i64 %20, 24
  %22 = or i64 %21, %19
  %23 = zext i8 %6 to i64
  %24 = shl i64 %23, 16
  %25 = or i64 %24, %22
  %26 = zext i8 %7 to i64
  %27 = shl i64 %26, 8
  %28 = or i64 %27, %25
  %29 = zext i8 %8 to i64
  %30 = or i64 %29, %28
  ret i64 %30
}

In this case, instcombine was able to eliminate the nonsense, but in PR8980 enough
PHIs are in play that instcombine backs off.  It's better to not generate this stuff
in the first place.

llvm-svn: 123571

6fab2e94
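
The "before" IR in 6fab2e94 splits the i64 into eight i8 pieces with lshr and trunc, then rebuilds it with zext, shl and or. The sketch below mirrors those steps on Python integers to show that the round trip is the identity, which is why returning %X directly is equivalent. The function name is hypothetical.

```python
MASK64 = (1 << 64) - 1
SHIFTS = range(56, -8, -8)  # 56, 48, ..., 8, 0, as in the IR above


def split_and_reassemble(x: int) -> int:
    """Split a 64-bit value into eight bytes and OR them back together."""
    x &= MASK64
    # lshr followed by trunc to i8: one byte per shift amount.
    pieces = [(x >> shift) & 0xFF for shift in SHIFTS]
    result = 0
    # zext to i64, shl back into position, then or into the accumulator.
    for piece, shift in zip(pieces, SHIFTS):
        result |= (piece << shift) & MASK64
    return result
```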

Use an irbuilder to get some trivial constant folding when doing a store · 7cd8cf7d
Chris Lattner authored Jan 16, 2011
```
of a constant.

llvm-svn: 123570
```
7cd8cf7d
remove a dead check, this was needed before we had an explicit veto on uses of phis. · adb1a233
Chris Lattner authored Jan 16, 2011
```
llvm-svn: 123569
```
adb1a233

enhance FoldOpIntoPhi in instcombine to try harder when a phi has · d55581de

Chris Lattner authored Jan 16, 2011

multiple uses.  In some cases, all the uses are the same operation,
so instcombine can go ahead and promote the phi.  In the testcase
this pushes an add out of the loop.

llvm-svn: 123568

d55581de

remove the AllowAggressive argument to FoldOpIntoPhi. It is forced to false in the · ea7131a0
Chris Lattner authored Jan 16, 2011
```
first line of the function because it isn't a good idea, even for compares.

llvm-svn: 123566
```
ea7131a0
more cleanups: use the IR builder. · ff2e7377
Chris Lattner authored Jan 16, 2011
```
llvm-svn: 123565
```
ff2e7377
tidy up code. · 25ce2805
Chris Lattner authored Jan 16, 2011
```
llvm-svn: 123564
```
25ce2805
Improve the safety of my globalopt enhancement by ensuring that the bitcast · 4e54efd6
Owen Anderson authored Jan 16, 2011
```
of the stored value to the new store type is always.  Also, add a testcase.

llvm-svn: 123563
```
4e54efd6
simplify this code, it is still broken but will follow up on llvm-commits. · 8b4952fc
Chris Lattner authored Jan 16, 2011
```
llvm-svn: 123558
```
8b4952fc
remove the partial specialization pass. It is unmaintained and has bugs. · 1e209b87
Chris Lattner authored Jan 16, 2011
```
llvm-svn: 123554
```
1e209b87

Jan 15, 2011
- Add missing whitespace. · 4a1ff16b
  Nick Lewycky authored Jan 15, 2011
```
llvm-svn: 123543
```
  4a1ff16b
- Make constmerge a two-pass algorithm so that it won't miss merging · 0296a481
  Nick Lewycky authored Jan 15, 2011
```
opportunities. Fixes PR8978.

llvm-svn: 123541
```
  0296a481
- Try to unbreak selfhost. · ed5f2e50
  Benjamin Kramer authored Jan 15, 2011
```
llvm-svn: 123537
```
  ed5f2e50
- Add a cache that protects mergefunc's internals from more surprises in DenseSet. · 540f9536
  Nick Lewycky authored Jan 15, 2011
```
Also, replace tabs with spaces. Yes, it's 2011.

llvm-svn: 123535
```
  540f9536
- temporarily revert r123526. While working on a follow-on patch I · af263907
  Chris Lattner authored Jan 15, 2011
```
realized that ConstantFoldTerminator doesn't preserve dominfo.

llvm-svn: 123527
```
  af263907
- fix rdar://8785296 - -fcatch-undefined-behavior generates inefficient code · 8df83c4a
  Chris Lattner authored Jan 15, 2011
```
The basic issue is that isel (very reasonably!) expects conditional branches
to be folded, so CGP leaving around a bunch of dead computation feeding
conditional branches isn't such a good idea.  Just fold branches on constants
into unconditional branches.

llvm-svn: 123526
```
  8df83c4a
- simplify code, no functionality change. · ee588def
  Chris Lattner authored Jan 15, 2011
```
llvm-svn: 123525
```
  ee588def
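
As an illustration of the idea in 8df83c4a, folding a conditional branch on a constant into an unconditional branch, here is a minimal sketch over a toy terminator representation. The tuple encoding and the function name are hypothetical and do not model LLVM's IR classes.

```python
def fold_constant_branch(term):
    """term is ("br", target) or ("condbr", cond, true_target, false_target).

    cond is either a Python bool (a known constant) or a string naming a
    value that is only known at run time.
    """
    if term[0] == "condbr" and isinstance(term[1], bool):
        _, cond, true_target, false_target = term
        # Only one successor can ever be taken, so branch to it directly.
        return ("br", true_target if cond else false_target)
    # Unconditional branches and branches on unknown values are left alone.
    return term
```

Once the branch is unconditional, the computation that only fed the old condition becomes dead and can be removed.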