Commits · 5b3638b6e74f5791c03a0fc2517e9528460e0786 · Roger Ferrer / llvm-epi-0.8

Dec 01, 2008

Eliminate use of setvector for the DeadInsts set, just use a smallvector. · 2aebea57
Chris Lattner authored Dec 01, 2008
```
This is a lot cheaper and conceptually simpler.

llvm-svn: 60332
```
2aebea57
DeleteTriviallyDeadInstructions is always passed the · 4da78e37
Chris Lattner authored Dec 01, 2008
```
DeadInsts ivar, just use it directly.

llvm-svn: 60330
```
4da78e37

simplify DeleteTriviallyDeadInstructions again, unlike my previous · a68a5a47

Chris Lattner authored Dec 01, 2008

buggy rewrite, this notifies ScalarEvolution of a pending instruction
about to be removed and then erases it, instead of erasing it then 
notifying.

llvm-svn: 60329

a68a5a47
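
The ordering described in a68a5a47 (notify the analysis first, then erase) can be sketched as below. This is a minimal illustration, not the LLVM code: `Inst`, `Block`, `AnalysisCache`, `forget`, and `eraseInst` are all hypothetical stand-ins, with `AnalysisCache` playing the role of ScalarEvolution as a client that keeps per-instruction records.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <unordered_map>
#include <vector>

// Hypothetical stand-ins; none of these are LLVM types.
struct Inst { int id; };

// Plays the role of an analysis that caches facts keyed by instruction address.
struct AnalysisCache {
  std::unordered_map<const Inst*, int> facts;
  // Must be called while the instruction is still alive.
  void forget(const Inst* i) { facts.erase(i); }
};

struct Block {
  std::vector<std::unique_ptr<Inst>> insts;
};

// Notify first, then erase, so the cache never keeps a key that points at
// freed memory.
inline void eraseInst(Block& bb, AnalysisCache& cache, std::size_t idx) {
  assert(idx < bb.insts.size());
  cache.forget(bb.insts[idx].get());                                   // 1. notify
  bb.insts.erase(bb.insts.begin() + static_cast<std::ptrdiff_t>(idx)); // 2. erase
}
```

Reversing the two steps would hand the cache a pointer to an already destroyed object, which is the kind of bug the commit message attributes to the earlier rewrite.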

simplify these patterns using m_Specific. No need to grep for · 9e6b2434
Chris Lattner authored Dec 01, 2008
```
xor in testcase (or is a substring).

llvm-svn: 60328
```
9e6b2434

Teach jump threading to clean up after itself, DCE and constfolding the · 88a1f021

Chris Lattner authored Dec 01, 2008

new instructions it simplifies. Because we're threading jumps on edges
with constants coming in from PHIs, we inherently are exposing a lot more
constants to the new block. Folding them and deleting dead conditions
allows the cost model in jump threading to be more accurate as it iterates.

llvm-svn: 60327

88a1f021

Change instcombine to use FoldPHIArgGEPIntoPHI to fold two operand PHIs · 084b3a47

Chris Lattner authored Dec 01, 2008

instead of using FoldPHIArgBinOpIntoPHI.  In addition to being more
obvious, this also fixes a problem where instcombine wouldn't merge two
phis that had different variable indices.  This prevented instcombine
from factoring big chunks of code in 403.gcc.  For example:

 insn_cuid.exit:                
-       %tmp336 = load i32** @uid_cuid, align 4      
-       %tmp337 = getelementptr %struct.rtx_def* %insn_addr.0.ph.i, i32 0, i32 3    
-       %tmp338 = bitcast [1 x %struct.rtunion]* %tmp337 to i32*               
-       %tmp339 = load i32* %tmp338, align 4           
-       %tmp340 = getelementptr i32* %tmp336, i32 %tmp339     
        br label %bb62
 
 bb61:       
-       %tmp341 = load i32** @uid_cuid, align 4     
-       %tmp342 = getelementptr %struct.rtx_def* %insn, i32 0, i32 3        
-       %tmp343 = bitcast [1 x %struct.rtunion]* %tmp342 to i32*           
-       %tmp344 = load i32* %tmp343, align 4        
-       %tmp345 = getelementptr i32* %tmp341, i32 %tmp344          
        br label %bb62
 
 bb62:      
-       %iftmp.62.0.in = phi i32* [ %tmp345, %bb61 ], [ %tmp340, %insn_cuid.exit ]         
+       %insn.pn2 = phi %struct.rtx_def* [ %insn, %bb61 ], [ %insn_addr.0.ph.i, %insn_cuid.exit ]         
+       %tmp344.pn.in.in = getelementptr %struct.rtx_def* %insn.pn2, i32 0, i32 3     
+       %tmp344.pn.in = bitcast [1 x %struct.rtunion]* %tmp344.pn.in.in to i32*  
+       %tmp341.pn = load i32** @uid_cuid     
+       %tmp344.pn = load i32* %tmp344.pn.in 
+       %iftmp.62.0.in = getelementptr i32* %tmp341.pn, i32 %tmp344.pn   
        %iftmp.62.0 = load i32* %iftmp.62.0.in     

llvm-svn: 60325

084b3a47

Teach inst combine to merge GEPs through PHIs. This is really · 9d02a70a

Chris Lattner authored Dec 01, 2008

important because it is sinking the loads using the GEPs, but
not the GEPs themselves.  This triggers 647 times on 403.gcc
and makes the .s file much much nicer.  For example before:

        je      LBB1_87 ## bb78
LBB1_62:        ## bb77
        leal    84(%esi), %eax
LBB1_63:        ## bb79
        movl    (%eax), %eax
...
LBB1_87:        ## bb78
        movl    $0, 4(%esp)
        movl    %esi, (%esp)
        call    L_make_decl_rtl$stub
        jmp     LBB1_62 ## bb77


after:

        jne     LBB1_63 ## bb79
LBB1_62:        ## bb78
        movl    $0, 4(%esp)
        movl    %esi, (%esp)
        call    L_make_decl_rtl$stub
LBB1_63:        ## bb79
        movl    84(%esi), %eax

The input code was the following (instcombine now merges the GEPs and
eliminates the PHI):

        br i1 %tmp233, label %bb78, label %bb77
bb77:           
        %tmp234 = getelementptr %struct.tree_node* %t_addr.3, i32 0, i32 0, i32 22              
        br label %bb79
bb78:           
        call void @make_decl_rtl(%struct.tree_node* %t_addr.3, i8* null) nounwind
        %tmp235 = getelementptr %struct.tree_node* %t_addr.3, i32 0, i32 0, i32 22              
        br label %bb79
bb79:           
        %iftmp.12.0.in = phi %struct.rtx_def** [ %tmp235, %bb78 ], [ %tmp234, %bb77 ]           
        %iftmp.12.0 = load %struct.rtx_def** %iftmp.12.0.in             

llvm-svn: 60322

9d02a70a

Make GVN be more intelligent about redundant load · 9ce8995d

Chris Lattner authored Dec 01, 2008

elimination: when finding dependent load/stores, realize that
they are the same if aliasing claims must alias instead of relying
on the pointers to be exactly equal.  This makes load elimination
more aggressive.  For example, on 403.gcc, we had:

<     68 gvn    - Number of instructions PRE'd
< 152718 gvn    - Number of instructions deleted
<  49699 gvn    - Number of loads deleted
<   6153 memdep - Number of dirty cached non-local responses
< 169336 memdep - Number of fully cached non-local responses
< 162428 memdep - Number of uncached non-local responses

now we have:

>     64 gvn    - Number of instructions PRE'd
> 153623 gvn    - Number of instructions deleted
>  49856 gvn    - Number of loads deleted
>   5022 memdep - Number of dirty cached non-local responses
> 159030 memdep - Number of fully cached non-local responses
> 162443 memdep - Number of uncached non-local responses

That's an extra 157 loads deleted and extra 905 other instructions nuked.

This slows down GVN very slightly, from 3.91 to 3.96s.

llvm-svn: 60314

9ce8995d

Reimplement the non-local dependency data structure in terms of a sorted · 7e61dafc

Chris Lattner authored Dec 01, 2008

vector instead of a densemap.  This shrinks the memory usage of this thing
substantially (the high water mark) as well as making operations like
scanning it faster.  This speeds up memdep slightly, gvn goes from
3.9376s to 3.9118s on 403.gcc.

This also splits out the statistics for the cached non-local case to
differentiate between the dirty and clean cached case.  Here's the stats
for 403.gcc:

  6153 memdep - Number of dirty cached non-local responses
169336 memdep - Number of fully cached non-local responses
162428 memdep - Number of uncached non-local responses

yay for caching :)

llvm-svn: 60313

7e61dafc
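
The data-structure change in 7e61dafc (a sorted vector in place of a hash map) can be illustrated with a small sketch. It assumes integer block IDs and integer results purely for illustration; `NonLocalCache`, `BlockId`, `DepResult`, `finalize`, and `find` are hypothetical names, not memdep's interface.

```cpp
#include <algorithm>
#include <cassert>
#include <utility>
#include <vector>

using BlockId = int;    // hypothetical key type
using DepResult = int;  // hypothetical payload type
using Entry = std::pair<BlockId, DepResult>;

struct NonLocalCache {
  std::vector<Entry> entries;  // contiguous storage, sorted by BlockId

  // Sort once after a batch of insertions.
  void finalize() { std::sort(entries.begin(), entries.end()); }

  // Binary search on the key; returns nullptr when the block is absent.
  const DepResult* find(BlockId bb) const {
    auto it = std::lower_bound(
        entries.begin(), entries.end(), bb,
        [](const Entry& e, BlockId key) { return e.first < key; });
    if (it != entries.end() && it->first == bb) return &it->second;
    return nullptr;
  }
};
```

A sorted vector stores only the entries themselves, so scanning all of them is a linear walk over contiguous memory, while point lookups remain logarithmic.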

Implement ((A|B)&1)|(B&-2) -> (A&1) | B transformation. This also takes care of · 5b902c5b
Bill Wendling authored Dec 01, 2008
```
permutations of this pattern.

llvm-svn: 60312
```
5b902c5b
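
The identity behind 5b902c5b can be checked exhaustively on small values. The sketch below is only a sanity check of the algebra, not the instcombine code; the function names are hypothetical. Bit 0 of both sides is `A0 | B0`, and every other bit is taken from `B`.

```cpp
#include <cassert>
#include <cstdint>

// Left-hand side: ((A|B)&1) | (B&-2). In two's complement, -2 equals ~1.
inline std::uint32_t patternBefore(std::uint32_t a, std::uint32_t b) {
  return ((a | b) & 1u) | (b & ~1u);
}

// Right-hand side: (A&1) | B.
inline std::uint32_t patternAfter(std::uint32_t a, std::uint32_t b) {
  return (a & 1u) | b;
}

// Compare both forms for every pair of 8-bit inputs.
inline bool identityHolds() {
  for (std::uint32_t a = 0; a < 256; ++a)
    for (std::uint32_t b = 0; b < 256; ++b)
      if (patternBefore(a, b) != patternAfter(a, b)) return false;
  return true;
}
```
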
Cache analyses in ivars and add some useful DEBUG output. · 8541edec
Chris Lattner authored Dec 01, 2008
```
This speeds up GVN from 4.0386s to 3.9376s.

llvm-svn: 60310
```
8541edec
improve indentation, do cheap checks before expensive ones, · 80c7d81e
Chris Lattner authored Nov 30, 2008
```
remove some fixme's.  This speeds up GVN very slightly on 403.gcc 
(4.06->4.03s)

llvm-svn: 60309
```
80c7d81e

Nov 30, 2008
- Minor cleanup: use getTrue and getFalse where appropriate. No · 11c15a5d
  Eli Friedman authored Nov 30, 2008
```
functional change.

llvm-svn: 60307
```
  11c15a5d
- Some minor cleanups to instcombine; no functionality change. · 55e4becb
  Eli Friedman authored Nov 30, 2008
```
Note that the FoldOpIntoPhi call is dead because it's impossible for the 
first operand of a subtraction to be both a ConstantInt and a PHINode.

llvm-svn: 60306
```
  55e4becb
- Add instruction combining for ((A&~B)|(~A&B)) -> A^B and all permutations. · de89bc27
  Bill Wendling authored Nov 30, 2008
```
llvm-svn: 60291
```
  de89bc27
- Implement (A&((~A)|B)) -> A&B transformation in the instruction combiner. This · 9eef421e
  Bill Wendling authored Nov 30, 2008
```
takes care of all permutations of this pattern.

llvm-svn: 60290
```
  9eef421e
- Forgot one remaining call to getSExtValue(). · 2fe32298
  Bill Wendling authored Nov 30, 2008
```
llvm-svn: 60289
```
  2fe32298
- getSExtValue() doesn't work for ConstantInts with bitwidth > 64 bits. Use all · 2d2e7861
  Bill Wendling authored Nov 30, 2008
```
APInt calls instead.

This fixes PR3144.

llvm-svn: 60288
```
  2d2e7861
- Optimize memmove and memset into the LLVM builtins. Note that these · 09bc6109
  Eli Friedman authored Nov 30, 2008
```
only show up in code from front-ends besides llvm-gcc, like clang.

llvm-svn: 60287
```
  09bc6109
- Don't make TwoToExp signed by default. · 7abf352f
  Bill Wendling authored Nov 30, 2008
```
llvm-svn: 60279
```
  7abf352f
- From Hacker's Delight: · af200e92
  Bill Wendling authored Nov 30, 2008
```
"For signed integers, the determination of overflow of x*y is not so simple. If
x and y have the same sign, then overflow occurs iff xy > 2**31 - 1. If they
have opposite signs, then overflow occurs iff xy < -2**31."

In this case, x == -1.

llvm-svn: 60278
```
  af200e92
- Instcombine was illegally transforming -X/C into X/-C when either X or C · 70635ade
  Bill Wendling authored Nov 30, 2008
```
overflowed on negation. This commit checks to make sure that neither C nor X
overflows. This requires that the RHS of X (a subtract instruction) be a
constant integer.

llvm-svn: 60275
```
  70635ade
- Fix a fixme by making memdep's handling of allocations more logical. · 3ff6d015
  Chris Lattner authored Nov 30, 2008
```
If we see that a load depends on the allocation of its memory with no
intervening stores, we now return a 'None' dependency instead of "Normal".
This tweaks GVN to do its optimization with the new result.

llvm-svn: 60267
```
  3ff6d015
- Eliminate the dropInstruction method, which is not needed any more. · 63bd586d
  Chris Lattner authored Nov 29, 2008
```
Fix a subtle iterator invalidation bug I introduced in the last commit.

llvm-svn: 60258
```
  63bd586d
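
The overflow problem fixed in 70635ade above (-X/C rewritten as X/-C) is easiest to see on a narrow type. The sketch below uses 8-bit integers and a wrapping negation helper; all names are hypothetical, and it only illustrates the arithmetic for an overflowing X, not the instcombine check itself.

```cpp
#include <cassert>
#include <cstdint>

// Two's-complement negation on 8 bits, wrapping the way `sub i8 0, x` does.
inline std::int8_t wrapNeg(std::int8_t x) {
  return static_cast<std::int8_t>(
      static_cast<std::uint8_t>(0u - static_cast<std::uint8_t>(x)));
}

// True when negating v wraps back to v itself (only the minimum value).
inline bool negationOverflows(std::int8_t v) { return v == INT8_MIN; }

// (-X)/C, the original form. The caller must pass c != 0.
inline int divNegX(std::int8_t x, std::int8_t c) { return wrapNeg(x) / c; }

// X/(-C), the rewritten form. The caller must pass c != 0.
inline int divNegC(std::int8_t x, std::int8_t c) { return x / wrapNeg(c); }
```

With X = 6 and C = 3 both forms give -2. With X = -128 and C = 2, negating X wraps back to -128, so the original form gives -64 while the rewritten form gives 64, which is why the rewrite has to be guarded.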
Nov 29, 2008

Change MemDep::getNonLocalDependency to return its results as · 1c6b62eb
Chris Lattner authored Nov 29, 2008
```
a smallvector instead of a DenseMap.  This speeds up GVN by 5%
on 403.gcc.

llvm-svn: 60255
```
1c6b62eb
reimplement getNonLocalDependency with a simpler worklist · f280b0c7
Chris Lattner authored Nov 29, 2008
```
formulation that is faster and doesn't require nonLazyHelper.
Much less code.

llvm-svn: 60253
```
f280b0c7
Fix a thinko that manifested as a crash on clamav last night. · 8c5ff516
Chris Lattner authored Nov 29, 2008
```
llvm-svn: 60251
```
8c5ff516

Split getDependency into getDependency and getDependencyFrom, the · 51ba8d06

Chris Lattner authored Nov 29, 2008

former does caching, the latter doesn't.  This dramatically simplifies
the logic in getDependency and getDependencyFrom.

llvm-svn: 60234

51ba8d06
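
The split described in 51ba8d06 follows a common shape: one entry point that consults a cache and one that always does the work. The sketch below assumes integer queries and a placeholder computation; only the two method names come from the commit message, while their signatures, the `DepAnalysis` type, and the `scans` counter are hypothetical.

```cpp
#include <cassert>
#include <unordered_map>

struct DepAnalysis {
  std::unordered_map<int, int> cache;  // query -> cached result
  int scans = 0;                       // counts how often the real work ran

  // Uncached: always performs the scan. The arithmetic is a placeholder
  // standing in for the actual dependency walk.
  int getDependencyFrom(int query, int start) {
    ++scans;
    return (query + start) % 7;
  }

  // Cached: answers from the map when possible, otherwise delegates to the
  // uncached query and remembers the result.
  int getDependency(int query) {
    auto it = cache.find(query);
    if (it != cache.end()) return it->second;
    int result = getDependencyFrom(query, 0);
    cache.emplace(query, result);
    return result;
  }
};
```

Keeping the caching policy in one function means the scanning function carries no cache bookkeeping at all.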

Temporarily revert r60195. It's causing an optimized bootstrap of llvm-gcc to fail. · 469e3aa6
Bill Wendling authored Nov 29, 2008
```
llvm-svn: 60233
```
469e3aa6

Introduce and use a new MemDepResult class to hold the results of a memdep · 7f9c8a0f

Chris Lattner authored Nov 29, 2008

query.  This makes it crystal clear what cases can escape from MemDep that
the clients have to handle.  This also gives the clients a nice simplified
interface to it that is easy to poke at.

This patch also makes DepResultTy and MemoryDependenceAnalysis::DepType
private, yay.

llvm-svn: 60231

7f9c8a0f

Reimplement the internal abstraction used by MemDep in terms · de04e117

Chris Lattner authored Nov 29, 2008

of a pointer/int pair instead of a manually bitmangled pointer.
This forces clients to think a little more about checking the 
appropriate pieces and will be useful for internal 
implementation improvements later.

I'm not particularly happy with this.  After going through this
I don't think that the clients of memdep should be exposed to
the internal type at all.  I'll fix this in a subsequent commit.

This has no functionality change.

llvm-svn: 60230

de04e117
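
The difference between a hand-mangled pointer and a pointer/int pair, as discussed in de04e117, is that the pair exposes typed accessors instead of raw masking at every use site. The sketch below is a simplified stand-in written for this illustration: it is not `llvm::PointerIntPair`, and `DepKind`, `Inst`, and `DepResult` are hypothetical names. It assumes pointers aligned to at least 4 bytes so the low two bits are free.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical tag values stored in the two low bits of the pointer.
enum class DepKind : std::uintptr_t { Invalid = 0, Normal = 1, NonLocal = 2, None = 3 };

struct alignas(4) Inst { int id; };

class DepResult {
  std::uintptr_t bits = 0;
  static constexpr std::uintptr_t kMask = 3;

public:
  DepResult() = default;
  DepResult(Inst* inst, DepKind kind)
      : bits(reinterpret_cast<std::uintptr_t>(inst) |
             static_cast<std::uintptr_t>(kind)) {
    // The low bits of the pointer must be zero for the tag to fit.
    assert((reinterpret_cast<std::uintptr_t>(inst) & kMask) == 0);
  }

  // Clients go through these accessors and never see the packed word.
  Inst* getPointer() const { return reinterpret_cast<Inst*>(bits & ~kMask); }
  DepKind getKind() const { return static_cast<DepKind>(bits & kMask); }
};
```

Because the packed word is private, a client has to ask for the kind or the pointer explicitly, which is the "think a little more about checking the appropriate pieces" effect the message describes.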

Nov 28, 2008
- don't revisit instructions off the beginning of the block. · f3f6a801
  Chris Lattner authored Nov 28, 2008
```
llvm-svn: 60221
```
  f3f6a801
- simplify some code, remove escaped newline. · f2a8ba4c
  Chris Lattner authored Nov 28, 2008
```
llvm-svn: 60213
```
  f2a8ba4c
- don't call MergeBasicBlockIntoOnlyPred on a block whose only · 8a172daa
  Chris Lattner authored Nov 28, 2008
```
predecessor is itself.  This doesn't make sense, and this is
a dead infinite loop anyway.

llvm-svn: 60210
```
  8a172daa
- rewrite a big chunk of how DSE does recursive dead operand · 1adb6759
  Chris Lattner authored Nov 28, 2008
```
elimination to use more modern infrastructure.  Also do a bunch
of small cleanups.

llvm-svn: 60201
```
  1adb6759
- Simplify LoopStrengthReduce::DeleteTriviallyDeadInstructions by · c077a2a5
  Chris Lattner authored Nov 27, 2008
```
making it use RecursivelyDeleteTriviallyDeadInstructions to do
the heavy lifting.

llvm-svn: 60195
```
  c077a2a5
- use continue to reduce indentation · 96e2dbe0
  Chris Lattner authored Nov 27, 2008
```
llvm-svn: 60192
```
  96e2dbe0
Nov 27, 2008

remove doConstantPropagation and dceInstruction, they are just · c6c481cd

Chris Lattner authored Nov 27, 2008

wrappers around the interesting code and use an obscure iterator
abstraction that dates back many many years.

Move EraseDeadInstructions to Transforms/Utils and name it
RecursivelyDeleteTriviallyDeadInstructions.

llvm-svn: 60191

c6c481cd
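
The idea behind a recursive dead-instruction deleter such as the one named in c6c481cd can be sketched with a worklist over a toy IR. This is an assumption-laden illustration, not the LLVM utility: `Inst`, its fields, `isTriviallyDead`, and `recursivelyDeleteDead` are hypothetical, and "erasing" is modelled by a flag.

```cpp
#include <cassert>
#include <vector>

// Toy instruction: operands it reads, how many users read it, and whether it
// has side effects (stores, calls, and so on).
struct Inst {
  std::vector<Inst*> operands;
  int numUses = 0;
  bool hasSideEffects = false;
  bool erased = false;
};

inline bool isTriviallyDead(const Inst& i) {
  return !i.erased && i.numUses == 0 && !i.hasSideEffects;
}

// Deletes root if it is trivially dead, then any operands that become dead as
// a result. Returns the number of instructions deleted.
inline int recursivelyDeleteDead(Inst* root) {
  int deleted = 0;
  std::vector<Inst*> worklist{root};
  while (!worklist.empty()) {
    Inst* inst = worklist.back();
    worklist.pop_back();
    if (!isTriviallyDead(*inst)) continue;
    for (Inst* op : inst->operands) {
      --op->numUses;  // this use disappears with inst
      if (op->numUses == 0) worklist.push_back(op);
    }
    inst->operands.clear();
    inst->erased = true;
    ++deleted;
  }
  return deleted;
}
```

An operand with remaining users or side effects is left alone, so the walk stops at the first value that is still needed.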

simplify code. · 5ef9ebf7
Chris Lattner authored Nov 27, 2008
```
llvm-svn: 60190
```
5ef9ebf7
simplify this logic. · c92fa42d
Chris Lattner authored Nov 27, 2008
```
llvm-svn: 60189
```
c92fa42d