- Jan 02, 2011
-
-
Chris Lattner authored
so that Dominators.h is *just* domtree. Also prune #includes a bit. llvm-svn: 122714
-
Chris Lattner authored
llvm-svn: 122713
-
Chris Lattner authored
sure that the loop we're promoting into a memcpy doesn't mutate the input of the memcpy. Before we were just checking that the dest of the memcpy wasn't mod/ref'd by the loop. llvm-svn: 122712
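To illustrate the distinction (a made-up example, not from the patch): a loop can leave the memcpy destination untouched by anything else and still be unsafe to promote, because it stores into the memcpy source that later iterations read.

  /* Hypothetical loop: converting it to a single memcpy(dst, src, n) would
     be wrong. The destination is only written by the copy itself, but the
     loop also writes the *source* element that the next iteration reads,
     so a memcpy of the original src bytes produces different results. */
  void shift_fill(char *dst, char *src, unsigned n) {
    for (unsigned i = 0; i + 1 < n; ++i) {
      dst[i] = src[i];      /* the store that looks like a memcpy idiom */
      src[i + 1] = dst[i];  /* feedback into the source read next iteration */
    }
  }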
-
Chris Lattner authored
mess with it. We'd rather peel/unroll it than convert all of its stores into memsets. llvm-svn: 122711
-
Benjamin Kramer authored
This allows us to compile: void test(char *s, int a) { __builtin_memset(s, a, 15); } into 1 mul + 3 stores instead of 3 muls + 3 stores. llvm-svn: 122710
-
Benjamin Kramer authored
Lower the i8 extension in memset to a multiply instead of a potentially long series of shifts and ors. We could implement a DAGCombine to turn x * 0x0101 back into logic operations on targets that don't support the multiply, or where it is slow (P4), if someone cares enough.

Example code:

  void test(char *s, int a) {
    __builtin_memset(s, a, 4);
  }

before:

  _test:                                  ## @test
          movzbl  8(%esp), %eax
          movl    %eax, %ecx
          shll    $8, %ecx
          orl     %eax, %ecx
          movl    %ecx, %eax
          shll    $16, %eax
          orl     %ecx, %eax
          movl    4(%esp), %ecx
          movl    %eax, 4(%ecx)
          movl    %eax, (%ecx)
          ret

after:

  _test:                                  ## @test
          movzbl  8(%esp), %eax
          imull   $16843009, %eax, %eax   ## imm = 0x1010101
          movl    4(%esp), %ecx
          movl    %eax, 4(%ecx)
          movl    %eax, (%ecx)
          ret

llvm-svn: 122707
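The lowering relies on the arithmetic identity that multiplying a zero-extended byte by 0x01010101 replicates it into every byte of a 32-bit word, which is exactly what the old shift/or sequence computed. A small stand-alone check of that identity (an illustrative sketch, not code from the patch):

  #include <stdint.h>
  #include <stdio.h>

  /* Splat a byte across a 32-bit word two ways: the shift/or sequence the
     old lowering produced, and the single multiply the new lowering emits. */
  static uint32_t splat_shift_or(uint8_t b) {
    uint32_t x = b;
    x |= x << 8;
    x |= x << 16;
    return x;
  }

  static uint32_t splat_multiply(uint8_t b) {
    return (uint32_t)b * 0x01010101u;  /* imm = 0x1010101, as in the "after" asm */
  }

  int main(void) {
    for (unsigned v = 0; v < 256; ++v)
      if (splat_shift_or((uint8_t)v) != splat_multiply((uint8_t)v))
        return 1;  /* never taken: the two forms agree for every byte value */
    printf("multiply splat matches shift/or splat for all byte values\n");
    return 0;
  }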
-
Oscar Fuentes authored
llvm-svn: 122706
-
Nick Lewycky authored
another function. llvm-svn: 122705
-
Chris Lattner authored
blocks in a loop, instead of just the header block. This makes it more aggressive and able to handle Duncan's Ada examples. llvm-svn: 122704
-
Chris Lattner authored
llvm-svn: 122703
-
Chris Lattner authored
isExitBlockDominatedByBlockInLoop is a relic of the days when domtree was *just* a tree and didn't have DFS numbers. Checking DFS numbers is faster and easier than "limiting the search of the tree". llvm-svn: 122702
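The DFS-number check being referred to is the usual interval test on a depth-first numbering of the dominator tree: A dominates B exactly when B's entry/exit numbers nest inside A's. A minimal sketch, with made-up field names rather than LLVM's actual DomTreeNode interface:

  /* Dominance query via DFS numbers (hypothetical field names). Each node
     stores the numbers assigned when a DFS over the dominator tree enters
     and leaves it; nesting of the intervals is equivalent to dominance. */
  struct DomNode {
    unsigned dfs_in;   /* preorder number on entering the node          */
    unsigned dfs_out;  /* number after finishing the node's subtree     */
  };

  /* Constant-time check, instead of walking up the tree from B to A. */
  static int dominates(const struct DomNode *a, const struct DomNode *b) {
    return a->dfs_in <= b->dfs_in && b->dfs_out <= a->dfs_out;
  }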
-
Chris Lattner authored
llvm-svn: 122701
-
Chris Lattner authored
llvm-svn: 122700
-
Duncan Sands authored
in the PR, the pass could break LCSSA form when inserting preheaders. It probably would be easy enough to fix this, but since currently we always go into LCSSA form after running this pass, doing so is not urgent. llvm-svn: 122695
-
Chris Lattner authored
header for now for memset/memcpy opportunities. It turns out that loop-rotate is successfully rotating loops, but *DOESN'T MERGE THE BLOCKS*, turning "for loops" into 2 basic block loops that loop-idiom was ignoring. With this fix, we form many *many* more memcpy and memsets than before, including on the "history" loops in the viterbi benchmark, which look like this:

  for (j=0; j<MAX_history; ++j) {
    history_new[i][j+1] = history[2*i][j];
  }

Transforming these loops into memcpy's speeds up the viterbi benchmark from 11.98s to 3.55s on my machine. Woo. llvm-svn: 122685
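For reference, a hand-written equivalent of what the transformation effectively produces for that inner loop. The element type and MAX_history value below are assumptions made for the sketch, not taken from the benchmark:

  #include <string.h>

  #define MAX_history 64          /* assumed value, for illustration only */
  typedef char hist_t;            /* assumed element type */

  /* The original inner loop: copies MAX_history contiguous elements. */
  void copy_history_loop(hist_t history_new[][MAX_history + 1],
                         hist_t history[][MAX_history + 1], int i) {
    for (int j = 0; j < MAX_history; ++j)
      history_new[i][j + 1] = history[2 * i][j];
  }

  /* What loop-idiom can now form: one memcpy of the whole range. */
  void copy_history_memcpy(hist_t history_new[][MAX_history + 1],
                           hist_t history[][MAX_history + 1], int i) {
    memcpy(&history_new[i][1], &history[2 * i][0],
           MAX_history * sizeof(hist_t));
  }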
-
Chris Lattner authored
llvm-svn: 122683
-
Chris Lattner authored
llvm-svn: 122682
-
Chris Lattner authored
size of a loop header instead of its own code size estimator. This allows it to handle bitcasts, etc., more precisely. llvm-svn: 122681
-
Chris Lattner authored
llvm-svn: 122678
-
Nick Lewycky authored
maintains the guarantee the DenseSet expects: that two elements it contains never go from unequal to equal under its nose. As a side effect, this also lets us switch from iterating to a fixed point to actually maintaining a work queue of functions to look at again, and since we don't add thunks to our work queue, we don't need to detect and ignore them. llvm-svn: 122677
-
- Jan 01, 2011
-
-
Chris Lattner authored
llvm-svn: 122676
-
Chris Lattner authored
llvm-svn: 122675
-
Chris Lattner authored
loop idiom pass exposed. llvm-svn: 122674
-
Rafael Espindola authored
llvm-svn: 122667
-
Anton Korobeynikov authored
earlyclobber stuff. This should fix PRs 2313 and 8157. Unfortunately, no testcase, since it'd be dependent on register assignments. llvm-svn: 122663
-
Chris Lattner authored
new testcase. llvm-svn: 122662
-
Duncan Sands authored
is the wrong hammer for this nail, and is probably right. llvm-svn: 122661
-
Chris Lattner authored
aggressively. In practice, this doesn't help anything, though; see the TODO. llvm-svn: 122660
-
Chris Lattner authored
should be correct now. llvm-svn: 122659
-
Rafael Espindola authored
llvm-svn: 122658
-
Duncan Sands authored
numbering, in which it considers (for example) "%a = add i32 %x, %y" and "%b = add i32 %x, %y" to be equal because the operands are equal and the result of the instructions only depends on the values of the operands. This has almost no effect (it removes 4 instructions from gcc-as-one-file), and perhaps slows down compilation: I measured a 0.4% slowdown on the large gcc-as-one-file testcase, but it wasn't statistically significant. llvm-svn: 122654
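A minimal sketch of the value-numbering idea described here (hypothetical code, not the pass's real implementation): a pure expression is identified by its opcode plus the value numbers of its operands, so two textually distinct instructions that compute the same expression receive the same number.

  #include <stdio.h>

  /* Expression key: opcode and the value numbers of the two operands. */
  struct Expr { char op; int lhs, rhs; };

  static struct Expr table[256];
  static int num_exprs;

  /* Look up the expression; reuse its number if seen, else assign a new one. */
  static int value_number(char op, int lhs, int rhs) {
    for (int i = 0; i < num_exprs; ++i)
      if (table[i].op == op && table[i].lhs == lhs && table[i].rhs == rhs)
        return i;
    table[num_exprs].op = op;
    table[num_exprs].lhs = lhs;
    table[num_exprs].rhs = rhs;
    return num_exprs++;
  }

  int main(void) {
    int x = value_number('x', -1, -1);  /* leaf values get their own numbers */
    int y = value_number('y', -1, -1);
    int a = value_number('+', x, y);    /* %a = add i32 %x, %y */
    int b = value_number('+', x, y);    /* %b = add i32 %x, %y */
    printf("a and b get the same value number: %s\n", a == b ? "yes" : "no");
    return 0;
  }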
-
Che-Liang Chiou authored
llvm-svn: 122653
-
Che-Liang Chiou authored
llvm-svn: 122652
-
- Dec 31, 2010
-
-
Oscar Fuentes authored
is necessary for executing the custom command that runs the assembler. Fixes PR8877. llvm-svn: 122649
-
Duncan Sands authored
operands are visited before the instructions themselves. llvm-svn: 122647
-
Duncan Sands authored
llvm-svn: 122645
-
- Dec 30, 2010
-
-
Benjamin Kramer authored
llvm-svn: 122642
-
Nick Lewycky authored
Fixes PR8861. llvm-svn: 122641
-
Che-Liang Chiou authored
llvm-svn: 122638
-
Chris Lattner authored
llvm-svn: 122637
-