Commits · 41377175d3ae2b27ff6e19d998cc9c7a891c6543 · Roger Ferrer / llvm-epi-0.8

Apr 29, 2008
- Remove debugging code. · 41377175
  Owen Anderson authored Apr 29, 2008
```
llvm-svn: 50383
```
  41377175
- Add dead loop elimination, which removes dead loops for which we can compute · 94ad7024
  Owen Anderson authored Apr 29, 2008
```
the trip count.

llvm-svn: 50382
```
  94ad7024
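
As a rough illustration of the kind of loop such a pass could remove, here is a minimal sketch. The function names are hypothetical and the example is not taken from the LLVM test suite. The loop has a trip count that can be computed (`n` iterations), has no side effects, and computes nothing that is read afterwards, so deleting it does not change what the function returns.

```cpp
// Hypothetical example: the loop runs exactly n times (for n >= 0), writes
// no memory visible outside the function, and its result is never read.
constexpr int withDeadLoop(int n) {
  int unused = 0;
  for (int i = 0; i < n; ++i)
    unused += i;  // value is never used after the loop
  return 42;
}

// What the function reduces to once the dead loop is deleted.
constexpr int withoutDeadLoop(int /*n*/) { return 42; }
```

A known trip count matters because it shows the loop terminates; removing a loop that might run forever would change program behavior.
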
Apr 28, 2008

Fix DSE to not eliminate volatile loads with no uses. · 8cb19d96
Dan Gohman authored Apr 28, 2008
```
llvm-svn: 50370
```
8cb19d96

Teach InstCombine's ComputeMaskedBits what SelectionDAG's · 72ec3f45

Dan Gohman authored Apr 28, 2008

ComputeMaskedBits knows about cttz, ctlz, and ctpop. Teach
SelectionDAG's ComputeMaskedBits what InstCombine's knows
about SRem. And teach them both some things about high bits
in Mul, UDiv, URem, and Sub. This allows instcombine and
dagcombine to eliminate sign-extension operations in
several new cases.

llvm-svn: 50358

72ec3f45
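
The reasoning behind two of these cases can be sketched as follows. This is a simplified model on 32-bit values with hypothetical helper names, not LLVM's actual ComputeMaskedBits code. A bit count (ctpop, ctlz, cttz) of a 32-bit value is at most 32, and an unsigned remainder is always smaller than its divisor, so in both cases the high bits of the result are known to be zero.

```cpp
#include <cstdint>

// Number of bits needed to represent every value in [0, maxValue].
constexpr unsigned bitsNeeded(uint32_t maxValue) {
  unsigned bits = 0;
  while (maxValue != 0) {
    ++bits;
    maxValue >>= 1;
  }
  return bits;
}

// Mask with every bit set except the lowest `lowBits` bits (lowBits <= 32).
constexpr uint32_t highBitsAbove(unsigned lowBits) {
  return ~static_cast<uint32_t>((uint64_t{1} << lowBits) - 1);
}

// Bits known to be zero in ctpop/ctlz/cttz of a 32-bit value:
// the result is at most 32, which fits in 6 bits.
constexpr uint32_t knownZeroForBitCount32() {
  return highBitsAbove(bitsNeeded(32));
}

// Bits known to be zero in (x urem divisor) for a nonzero constant divisor:
// the remainder is at most divisor - 1.
constexpr uint32_t knownZeroForURem(uint32_t divisor) {
  return highBitsAbove(bitsNeeded(divisor - 1));
}
```

When the sign bit of a value is known to be zero, sign-extending it gives the same result as zero-extending it, which is the kind of fact that lets a combiner drop or simplify the extension.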

Fix PR2256, yet another miscompilation in simplifycfg of i · 8be72700
Chris Lattner authored Apr 28, 2008
```
multiple return values.

Bill, please pull this into Tak.

llvm-svn: 50332
```
8be72700

Apr 27, 2008

Implement a significant optimization for inline asm: · 22379734

Chris Lattner authored Apr 27, 2008

When choosing between constraints with multiple options,
like "ir", test to see if we can use the 'i' constraint and
go with that if possible.  This produces more optimal ASM in
all cases (sparing a register and an instruction to load it),
and fixes inline asm like this:

void test () {
  asm volatile (" %c0 %1 " : : "imr" (42), "imr"(14));
}

Previously we would dump "42" into a memory location (which
is ok for the 'm' constraint) which would cause a problem
because the 'c' modifier is not valid on memory operands.

Isn't it great how inline asm turns 'missed optimization'
into 'compile failed'??

Incidentally, this was the todo in 
PowerPC/2007-04-24-InlineAsm-I-Modifier.ll

Please do NOT pull this into Tak.

llvm-svn: 50315

22379734
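
The selection rule described above can be sketched as a small function. The name `chooseConstraint` and the fallback rule (take the first letter other than 'i') are assumptions made for illustration; the actual code in LLVM's inline asm lowering is more involved.

```cpp
// Hypothetical helper: pick one letter from a multi-letter constraint such
// as "imr". Prefer 'i' (immediate) when the operand is a known constant.
// Otherwise fall back to the first non-'i' letter, which is a simplification
// of the real selection logic.
constexpr char chooseConstraint(const char *options, bool operandIsConstant) {
  if (operandIsConstant) {
    for (const char *p = options; *p != '\0'; ++p)
      if (*p == 'i')
        return 'i';
  }
  for (const char *p = options; *p != '\0'; ++p)
    if (*p != 'i')
      return *p;
  return '\0';  // no usable constraint letter
}
```

With this rule, the operand `"imr" (42)` from the example is matched as an immediate, so the 'c' modifier in `%c0` is applied to a constant rather than to a memory operand.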

Move a bunch of inline asm code out of line. · 4793515a
Chris Lattner authored Apr 27, 2008
```
llvm-svn: 50313
```
4793515a

Apr 26, 2008

When SRoA'ing a global variable, make sure the new globals get the · 67ca6f63

Chris Lattner authored Apr 26, 2008

appropriate alignment.  This fixes a miscompilation of 252.eon on
x86-64 (rdar://5891920).

Bill, please pull this into Tak.

llvm-svn: 50308

67ca6f63
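
One way to derive a safe alignment for each new global is sketched below, under the assumption that every field keeps its byte offset from the original aggregate. The helper names are hypothetical and this is not necessarily the exact computation used in the commit. A field at offset `fieldOffset` inside a global aligned to `globalAlign` is guaranteed to be aligned to the largest power of two that divides both numbers.

```cpp
#include <cstdint>

// Largest power of two dividing both a and b (at least one must be nonzero).
constexpr uint64_t minAlign(uint64_t a, uint64_t b) {
  return (a | b) & (~(a | b) + 1);  // isolate the lowest set bit of (a | b)
}

// Alignment that can be guaranteed for a field at byte offset `fieldOffset`
// inside a global whose alignment is `globalAlign` (a power of two).
constexpr uint64_t fieldAlignment(uint64_t globalAlign, uint64_t fieldOffset) {
  return minAlign(globalAlign, fieldOffset);
}
```

For a 16-byte-aligned global, the field at offset 0 keeps 16-byte alignment, a field at offset 8 gets 8, and a field at offset 12 gets 4.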

Apr 25, 2008

change comments per review · 0d1d3df5
Dale Johannesen authored Apr 25, 2008
```
llvm-svn: 50300
```
0d1d3df5

Remove the code from CodeGenPrepare that moved getresult instructions · ca95a5f4

Dan Gohman authored Apr 25, 2008

to the block that defines their operands. This doesn't work in the
case that the operand is an invoke, because invoke is a terminator
and must be the last instruction in a block.

Replace it with support in SelectionDAGISel for copying struct values
into sequences of virtual registers.

llvm-svn: 50279

ca95a5f4

Feedback from chris · ca270ad9
Nate Begeman authored Apr 25, 2008
```
llvm-svn: 50271
```
ca270ad9
Remove 'unwinds to' support from mainline. This patch undoes r47802 r47989 · 4d43d3c7
Nick Lewycky authored Apr 25, 2008
```
r48047 r48084 r48085 r48086 r48088 r48096 r48099 r48109 and r48123.

llvm-svn: 50265
```
4d43d3c7
Teach the PruningFunctionCloner how to look through loads with · 6fed3b20
Nate Begeman authored Apr 25, 2008
```
ConstantExpression GEPs pointing into constant globals.

llvm-svn: 50256
```
6fed3b20

Don't infinitely thread branches when a threaded edge · f7de5284

Chris Lattner authored Apr 25, 2008

goes back to the block, e.g.:

  Threading edge through bool from 'bb37.us.thread3829' to 'bb37.us' with cost: 1, across block:

bb37.us:		; preds = %bb37.us.thread3829, %bb37.us, %bb33
	%D1361.1.us = phi i32 [ %tmp36, %bb33 ], [ %D1361.1.us, %bb37.us ], [ 0, %bb37.us.thread3829 ]		; <i32> [#uses=2]
	%tmp39.us = icmp eq i32 %D1361.1.us, 0		; <i1> [#uses=1]
	br i1 %tmp39.us, label %bb37.us, label %bb42.us

llvm-svn: 50251

f7de5284

Apr 24, 2008
- Adjust inline cost computation to be less aggressive. · 608eeef5
  Evan Cheng authored Apr 24, 2008
```
llvm-svn: 50222
```
  608eeef5
- code restructuring, no functionality change. · 97951ac5
  Chris Lattner authored Apr 24, 2008
```
llvm-svn: 50203
```
  97951ac5
- Don't replace multiple result of calls with undef, · 12f1e007
  Chris Lattner authored Apr 24, 2008
```
sccp tracks getresult values, not call values in this
case.

llvm-svn: 50202
```
  12f1e007
- code cleanup, no functionality change. · 769203cb
  Chris Lattner authored Apr 24, 2008
```
llvm-svn: 50201
```
  769203cb
- Split some code out of the main SimplifyCFG loop into its own function. · 86bbf338
  Chris Lattner authored Apr 24, 2008
```
Fix said code to handle merging return instructions together correctly
when handling multiple return values.

llvm-svn: 50199
```
  86bbf338
Apr 23, 2008
- Check type instead of no. of operands. · 8f83081f
  Devang Patel authored Apr 23, 2008
```
llvm-svn: 50179
```
  8f83081f
- Rewrite previous patch to suit Chris's preference. · f6e15a47
  Dale Johannesen authored Apr 23, 2008
```
llvm-svn: 50174
```
  f6e15a47
- simplify code for propagation of constant arguments into · a82d691c
  Chris Lattner authored Apr 23, 2008
```
callees.

llvm-svn: 50142
```
  a82d691c
- Fix a number of bugs in ipconstantprop, simplify the code, fit in 80 cols, · 5f1802cf
  Chris Lattner authored Apr 23, 2008
```
fix read after free bug (PR2238).

llvm-svn: 50141
```
  5f1802cf
- Rewrite multiple return value handling in SCCP. Before, the -sccp pass · 5a58a4dc
  Chris Lattner authored Apr 23, 2008
```
would turn every getresult instruction into undef.  This helps with
rdar://5778210

llvm-svn: 50140
```
  5a58a4dc
- Do not change the type of a ByVal argument to a · 493527d8
  Dale Johannesen authored Apr 23, 2008
```
type of a different size.

llvm-svn: 50121
```
  493527d8
- Don't do: "(X & 4) >> 1 == 2 --> (X & 4) == 4" if there is more than one... · 1c89ca72
  Evan Cheng authored Apr 23, 2008
```
Don't do: "(X & 4) >> 1 == 2  --> (X & 4) == 4" if there is more than one use of the shift result.

llvm-svn: 50118
```
  1c89ca72
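
The rewrite named in the commit above is value-preserving, which the following sketch checks for small inputs. The function names are hypothetical. Both forms test whether bit 2 of `x` is set.

```cpp
#include <cstdint>

// The comparison as written before the rewrite.
constexpr bool original(uint32_t x) { return ((x & 4u) >> 1) == 2u; }

// The comparison after folding the shift into the constant.
constexpr bool simplified(uint32_t x) { return (x & 4u) == 4u; }

// Only bit 2 of x matters, so checking every value of the low 8 bits
// covers all distinct cases.
constexpr bool agreeOnLowBits() {
  for (uint32_t x = 0; x < 256; ++x)
    if (original(x) != simplified(x))
      return false;
  return true;
}
```

A plausible reason for the single-use restriction is that the rewrite only pays off when the shift becomes dead afterwards. If the shift result has other users, the shift has to stay, so nothing is saved.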
Apr 22, 2008

Start doing the significantly useful part of jump threading: handle cases · 37e9c187

Chris Lattner authored Apr 22, 2008

where a comparison has a phi input and that phi is a constant.  For example,
stuff like:

  Threading edge through bool from 'bb2149' to 'bb2231' with cost: 1, across block:
bb2237:		; preds = %bb2231, %bb2149
	%tmp2328.rle = phi i32 [ %tmp2232, %bb2231 ], [ %tmp2232439, %bb2149 ]		; <i32> [#uses=2]
	%done.0 = phi i32 [ %done.2, %bb2231 ], [ 0, %bb2149 ]		; <i32> [#uses=1]
	%tmp2239 = icmp eq i32 %done.0, 0		; <i1> [#uses=1]
	br i1 %tmp2239, label %bb2231, label %bb2327

or

bb38.i298:		; preds = %bb33.i295, %bb1693
	%tmp39.i296.rle = phi %struct.ibox* [ null, %bb1693 ], [ %tmp39.i296.rle1109, %bb33.i295 ]		; <%struct.ibox*> [#uses=2]
	%minspan.1.i291.reg2mem.1 = phi i32 [ 32000, %bb1693 ], [ %minspan.0.i288, %bb33.i295 ]		; <i32> [#uses=1]
	%tmp40.i297 = icmp eq %struct.ibox* %tmp39.i296.rle, null		; <i1> [#uses=1]
	br i1 %tmp40.i297, label %implfeeds.exit311, label %bb43.i301

This triggers thousands of times in spec.

llvm-svn: 50110

37e9c187
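
At the source level, the first example corresponds roughly to the sketch below. It is a hypothetical model, not code from the commit: the return values 1 and 2 stand for the two branch targets (`bb2231` and `bb2327`), and `fromFirstPath` stands for arriving from `bb2149`, the predecessor whose phi input is the constant 0.

```cpp
// Before threading: the merge block re-tests `done`, even though one
// predecessor always arrives with done == 0.
constexpr int beforeThreading(bool fromFirstPath, int incomingDone) {
  int done = fromFirstPath ? 0 : incomingDone;  // models the phi node
  if (done == 0)                                // models the icmp + br
    return 1;
  return 2;
}

// After threading: the constant predecessor jumps straight to the target
// its comparison would have selected and skips the test entirely.
constexpr int afterThreading(bool fromFirstPath, int incomingDone) {
  if (fromFirstPath)
    return 1;  // threaded edge, no comparison needed
  if (incomingDone == 0)
    return 1;
  return 2;
}
```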

Dig through multiple levels of AND to thread jumps if needed. · d5425e8f
Chris Lattner authored Apr 22, 2008
```
llvm-svn: 50106
```
d5425e8f

Teach jump threading to thread through blocks like: · 3df4c15d

Chris Lattner authored Apr 22, 2008

  br (and X, phi(Y, Z, false)), label L1, label L2

This triggers once on 252.eon and 6 times on 176.gcc.  Blocks 
in question often look like this:

bb262:		; preds = %bb261, %bb248
	%iftmp.251.0 = phi i1 [ true, %bb261 ], [ false, %bb248 ]		; <i1> [#uses=4]
	%tmp270 = icmp eq %struct.rtx_def* %tmp.0.i, null		; <i1> [#uses=1]
	%bothcond = or i1 %iftmp.251.0, %tmp270		; <i1> [#uses=1]
	br i1 %bothcond, label %bb288, label %bb273

In this case, it is clear that it doesn't matter if tmp.0.i is null when coming from bb261.  When coming from bb248, it is all that matters.


Another random example:

check_asm_operands.exit:		; preds = %check_asm_operands.exit.thr_comm, %bb30.i, %bb12.i, %bb6.i413
	%tmp.0.i420 = phi i1 [ true, %bb6.i413 ], [ true, %bb12.i ], [ true, %bb30.i ], [ false, %check_asm_operands.exit.thr_comm ]		; <i1> [#uses=1]
	call void @llvm.stackrestore( i8* %savedstack ) nounwind 
	%tmp4389 = icmp eq i32 %added_sets_1.0, 0		; <i1> [#uses=1]
	%tmp4394 = icmp eq i32 %added_sets_2.0, 0		; <i1> [#uses=1]
	%bothcond80 = and i1 %tmp4389, %tmp4394		; <i1> [#uses=1]
	%bothcond81 = and i1 %bothcond80, %tmp.0.i420		; <i1> [#uses=1]
	br i1 %bothcond81, label %bb4398, label %bb4397

Here is the case from 252.eon:

bb290.i.i:		; preds = %bb23.i57.i.i, %bb8.i39.i.i, %bb100.i.i, %bb100.i.i, %bb85.i.i110
	%myEOF.1.i.i = phi i1 [ true, %bb100.i.i ], [ true, %bb100.i.i ], [ true, %bb85.i.i110 ], [ true, %bb8.i39.i.i ], [ false, %bb23.i57.i.i ]		; <i1> [#uses=2]
	%i.4.i.i = phi i32 [ %i.1.i.i, %bb85.i.i110 ], [ %i.0.i.i, %bb100.i.i ], [ %i.0.i.i, %bb100.i.i ], [ %i.3.i.i, %bb8.i39.i.i ], [ %i.3.i.i, %bb23.i57.i.i ]		; <i32> [#uses=3]
	%tmp292.i.i = load i8* %tmp16.i.i100, align 1		; <i8> [#uses=1]
	%tmp293.not.i.i = icmp ne i8 %tmp292.i.i, 0		; <i1> [#uses=1]
	%bothcond.i.i = and i1 %tmp293.not.i.i, %myEOF.1.i.i		; <i1> [#uses=1]
	br i1 %bothcond.i.i, label %bb202.i.i, label %bb301.i.i
  Factoring out 3 common predecessors.

On the path from any blocks other than bb23.i57.i.i, the load and compare 
are dead.

llvm-svn: 50096

3df4c15d
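
The pattern `br (and X, phi(Y, Z, false))` can be modeled as in the sketch below. The enum and function names are hypothetical, and the return values 1 and 2 stand for the labels L1 and L2. When control arrives from the predecessor that feeds `false` into the phi, the `and` is false whatever X is, so that edge can go straight to L2.

```cpp
// Which predecessor reached the block; each one supplies a different phi input.
enum Pred { FromY, FromZ, FromFalse };

// Before threading: every predecessor evaluates the full condition.
constexpr int branchOnAnd(Pred pred, bool x, bool y, bool z) {
  bool phi = pred == FromY ? y : pred == FromZ ? z : false;
  return (x && phi) ? 1 : 2;
}

// After threading: the 'false' predecessor goes directly to L2 and never
// looks at x.
constexpr int branchOnAndThreaded(Pred pred, bool x, bool y, bool z) {
  if (pred == FromFalse)
    return 2;
  bool phi = pred == FromY ? y : z;
  return (x && phi) ? 1 : 2;
}

// Exhaustively compare both versions over all predecessors and inputs.
constexpr bool threadingPreservesResult() {
  for (int p = 0; p < 3; ++p)
    for (int bits = 0; bits < 8; ++bits) {
      bool x = bits & 1, y = bits & 2, z = bits & 4;
      Pred pred = static_cast<Pred>(p);
      if (branchOnAnd(pred, x, y, z) != branchOnAndThreaded(pred, x, y, z))
        return false;
    }
  return true;
}
```

This matches the observation in the message that work feeding X, such as the load and compare in the 252.eon block, is dead on the threaded path.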

refactor some code, no functionality change. · e369c35a
Chris Lattner authored Apr 22, 2008
```
llvm-svn: 50094
```
e369c35a
remove dead code. · 8fb13cbe
Chris Lattner authored Apr 22, 2008
```
llvm-svn: 50080
```
8fb13cbe

optimize "p != gep p, ..." better. This allows us to compile · c3a43935

Chris Lattner authored Apr 22, 2008

getelementptr-seteq.ll into:

define i1 @test(i64 %X, %S* %P) {
	%C = icmp eq i64 %X, -1		; <i1> [#uses=1]
	ret i1 %C
}

instead of:

define i1 @test(i64 %X, %S* %P) {
	%A.idx.mask = and i64 %X, 4611686018427387903		; <i64> [#uses=1]
	%C = icmp eq i64 %A.idx.mask, 4611686018427387903		; <i1> [#uses=1]
	ret i1 %C
}

And fixes the second half of PR2235.  This speeds up the insertion sort
case by 45%, from 1.12s to 0.77s.  In practice, this will significantly
speed up for loops structured like:

for (double *P = Base + N; P != Base; --P)
  ...

Which happens frequently for C++ iterators.

llvm-svn: 50079

c3a43935
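
A complete version of a loop with that shape might look like the sketch below. The names `sumBackward` and `demo` are hypothetical. The exit test `p != base` compares a pointer derived from `base` against `base` itself, which is the kind of comparison the message describes.

```cpp
#include <cstddef>

// Sums base[0..n) by walking a pointer down from one past the end.
constexpr double sumBackward(const double *base, std::size_t n) {
  double total = 0.0;
  for (const double *p = base + n; p != base; --p)
    total += p[-1];  // element just below the current position
  return total;
}

// Small usage example on a three-element array.
constexpr double demo() {
  const double values[] = {1.0, 2.0, 3.0};
  return sumBackward(values, 3);
}
```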

Apr 21, 2008
- fix grammar-o, thanks to Duncan for noticing. · bab7bec9
  Chris Lattner authored Apr 21, 2008
```
llvm-svn: 50047
```
  bab7bec9
- Remove unneeded #include's. · a5b96ece
  Owen Anderson authored Apr 21, 2008
```
llvm-svn: 50035
```
  a5b96ece
- Refactor memcpyopt based on Chris' suggestions. Consolidate several functions · 6a7355ca
  Owen Anderson authored Apr 21, 2008
```
and simplify code that was fallout from the separation of memcpyopt and gvn.

llvm-svn: 50034
```
  6a7355ca
- don't assume that the argument passed to fprintf("%s" is a string. This · ad0d42ba
  Chris Lattner authored Apr 21, 2008
```
fixes a crash in opt on 433.milc.

llvm-svn: 50023
```
  ad0d42ba
- Use the new SplitBlockPredecessors to implement a todo. · f6236cc2
  Chris Lattner authored Apr 21, 2008
```
llvm-svn: 50022
```
  f6236cc2
- Move SplitBlockPredecessors out of loopsimplify into BasicBlockUtils.h · a5b11705
  Chris Lattner authored Apr 21, 2008
```
as a global helper function.  At the same time, switch it from taking
a vector of predecessors to an arbitrary sequential input.  This allows
us to switch LoopSimplify to use a SmallVector for various temporary
vectors that it passed into SplitBlockPredecessors.

llvm-svn: 50020
```
  a5b11705
- Move domtree/frontier updating earlier, allowing us to use it to update phi · d418b06a
  Chris Lattner authored Apr 21, 2008
```
nodes, removing a hack.

llvm-svn: 50019
```
  d418b06a
- Factor dominator tree and frontier updating into SplitBlockPredecessors · 96e9e222
  Chris Lattner authored Apr 21, 2008
```
instead of doing it after every call.

llvm-svn: 50018
```
  96e9e222