Commits · 045f81981a61d283af9d6df1050cef6e7c7ff629 · Roger Ferrer / llvm-epi-0.8

Jan 22, 2010
- Revert LoopStrengthReduce.cpp to pre-r94061 for now. · 045f8198
  Dan Gohman authored Jan 22, 2010
```
llvm-svn: 94123
```
  045f8198
- No need to look through bitcasts for DbgInfoIntrinsic · 7b151e9f
  Victor Hernandez authored Jan 21, 2010
```
llvm-svn: 94114
```
  7b151e9f
- DbgInfoIntrinsics no longer appear in an instruction's use list · ae4d9497
  Victor Hernandez authored Jan 21, 2010
```
llvm-svn: 94113
```
  ae4d9497
- No need to look through bitcasts for DbgInfoIntrinsic · 5f5abd59
  Victor Hernandez authored Jan 21, 2010
```
llvm-svn: 94112
```
  5f5abd59
- DbgInfoIntrinsics no longer appear in an instruction's use list; so clean up... · 1df65186
  Victor Hernandez authored Jan 21, 2010
```
DbgInfoIntrinsics no longer appear in an instruction's use list; so clean up looking for them in use iterations and remove OnlyUsedByDbgInfoIntrinsics()

llvm-svn: 94111
```
  1df65186
- When inserting expressions for post-increment users which contain · b1ee154b
  Dan Gohman authored Jan 21, 2010
```
loop-variant components, adds must be inserted after the increment.
Keep track of the increment position for this case, and insert
these adds in the correct location.

llvm-svn: 94110
```
  b1ee154b
Jan 21, 2010

Include IVUsers information in LSR's debug output. · cb8d577e
Dan Gohman authored Jan 21, 2010
```
llvm-svn: 94108
```
cb8d577e

Prune the search for candidate formulae if the number of register · 29916e02

Dan Gohman authored Jan 21, 2010

operands exceeds the number of registers used in the initial
solution, as that wouldn't lead to a profitable solution anyway.

llvm-svn: 94107

29916e02

Add a comment. · c903499f
Dan Gohman authored Jan 21, 2010
```
llvm-svn: 94104
```
c903499f

It turns out that this #include is needed because otherwise · 24716b6c

Chris Lattner authored Jan 21, 2010

ValueMapper.cpp ends up calling an out of line 
__ZNK4llvm12PATypeHolder3getEv, which is a template and llvm-config
determines arbitrarily to use the one in libipo.  This sucks, but
keeping the #include is a reasonable workaround.

llvm-svn: 94103

24716b6c

unbreak the build, apparently without this transformutils starts depending on libipa? · 9889b4be
Chris Lattner authored Jan 21, 2010
```
llvm-svn: 94102
```
9889b4be
tidy up · e39837d5
Chris Lattner authored Jan 21, 2010
```
llvm-svn: 94101
```
e39837d5
Don't need to include IntrinsicInst.h any more · a9ad174b
Victor Hernandez authored Jan 21, 2010
```
llvm-svn: 94092
```
a9ad174b
No need to map NULL operands of metadata · d089f4e1
Victor Hernandez authored Jan 21, 2010
```
llvm-svn: 94091
```
d089f4e1

Re-implement the main strength-reduction portion of LoopStrengthReduction. · 51ad99d2

Dan Gohman authored Jan 21, 2010

This new version is much more aggressive about doing "full" reduction in
cases where it reduces register pressure, and also more aggressive about
rewriting induction variables to count down (or up) to zero when doing so
reduces register pressure.

It currently uses fairly simplistic algorithms for finding reuse
opportunities, but it introduces a new framework that allows it to combine
multiple strategies at once to form hybrid solutions, instead of doing
all full-reduction or all base+index.

llvm-svn: 94061

51ad99d2
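As a source-level illustration of "rewriting induction variables to count down to zero", here is a minimal C sketch. It is not LSR's actual output, and the names `sum_count_up` and `sum_count_down` are hypothetical. The second function is hand-written in the shape such a rewrite aims for: the exit test compares a single counter against zero, so the loop no longer needs both the index and the bound.

```c
#include <assert.h>
#include <stddef.h>

/* Original shape: the index counts up and is compared against n on
   every iteration, so both i and n are needed inside the loop. */
long sum_count_up(const int *a, size_t n) {
    assert(a != NULL || n == 0);
    long total = 0;
    for (size_t i = 0; i < n; ++i)
        total += a[i];
    return total;
}

/* Hand-rewritten shape (illustrative only): a pointer walks the array
   and one counter runs down to zero, so the exit test is a compare
   against zero. */
long sum_count_down(const int *a, size_t n) {
    assert(a != NULL || n == 0);
    long total = 0;
    for (size_t remaining = n; remaining != 0; --remaining)
        total += *a++;
    return total;
}
```

Both functions return the same sum for any input. Whether the second form actually lowers register pressure depends on the target, which is the decision the commit message says the new LSR makes.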

Add strcpy_chk -> strcpy support for "don't know" object size · fa863258
Eric Christopher authored Jan 21, 2010
```
answers.  This will update as object size checking gets better information.

llvm-svn: 94059
```
fa863258
simplify this code. · 3c5bf713
Chris Lattner authored Jan 20, 2010
```
llvm-svn: 94048
```
3c5bf713

Jan 20, 2010
- Move per-function inline threshold calculation to a method. · 8a19d3c9
  Jakob Stoklund Olesen authored Jan 20, 2010
```
No functional change except the forgotten test for
InlineLimit.getNumOccurrences() == 0 in the CurrentThreshold2 calculation.

llvm-svn: 94007
```
  8a19d3c9
- Switch Elts from vector to SmallVector · f2462407
  Victor Hernandez authored Jan 20, 2010
```
llvm-svn: 93989
```
  f2462407
- Map operands of all function-local metadata, not just metadata passed to... · 5fa88d4e
  Victor Hernandez authored Jan 20, 2010
```
Map operands of all function-local metadata, not just metadata passed to llvm.dbg.declare intrinsics

llvm-svn: 93979
```
  5fa88d4e
Jan 19, 2010

When doing address-mode sinking, expand the base register first, rather · ca19445d

Dan Gohman authored Jan 19, 2010

than the scaled register. This makes it more likely that subsequent
AddrModeMatcher queries will match the new address the same way as the
old, instead of accidentally matching what had been the base register
as the new scaled register, and then failing to match the scaled register.
This fixes some problems with address-mode sinking multiple muls into a
block, which will be a lot more common with some upcoming
LoopStrengthReduction changes.

llvm-svn: 93935

ca19445d

optimize ~(~X >>s Y) --> (X >>s Y), patch by Edmund Grimley · 18f49ce2
Chris Lattner authored Jan 19, 2010
```
Evans!

llvm-svn: 93884
```
18f49ce2
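The fold above works because an arithmetic right shift fills with copies of the sign bit, so inverting every bit before the shift and again after it cancels out. A minimal C sketch of both sides of the identity follows. The function names are hypothetical, and the code assumes that `>>` on a negative `int32_t` is an arithmetic shift, which is implementation-defined in C but holds on mainstream compilers.

```c
#include <assert.h>
#include <stdint.h>

/* Left-hand side of the fold: ~(~x >>s y). */
int32_t not_ashr_not(int32_t x, unsigned y) {
    assert(y < 32);   /* shifting by the bit width or more is undefined */
    return ~(~x >> y);
}

/* Right-hand side of the fold: x >>s y. */
int32_t plain_ashr(int32_t x, unsigned y) {
    assert(y < 32);
    return x >> y;
}
```

For example, with `x = -8` and `y = 1`, `~x` is 7, `7 >> 1` is 3, and `~3` is -4, which matches `-8 >> 1`.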

Fix a crash in scalarrepl for memcpy/memmove where the source and destination · 58d59fe3

Bob Wilson authored Jan 19, 2010

are the same.  I had already fixed a similar problem where the source and
destination were different bitcasts derived from the same alloca, but the
previous fix still did not handle the case where both operands are exactly
the same value.  Radar 7552893.

llvm-svn: 93848

58d59fe3

Fix comment. · 84bd316b
Eric Christopher authored Jan 19, 2010
```
llvm-svn: 93831
```
84bd316b

Jan 18, 2010

my instcombine transformations to make extension elimination more · 43f2fa62

Chris Lattner authored Jan 18, 2010

aggressive changed the canonical form from sext(trunc(x)) to ashr(shl(x)),
make sure to transform a couple more things into that canonical form,
and catch a case where we missed turning zext/shl/ashr into a single sext.

llvm-svn: 93787

43f2fa62
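To make the shift-pair form of sext(trunc(x)) concrete, here is a minimal C sketch for a 32-bit value truncated to 8 bits. The shift pair is taken here to be a left shift followed by an arithmetic right shift by the same amount. The widths and function names are assumptions chosen for illustration, and the code relies on two implementation-defined behaviours that hold on mainstream compilers: out-of-range conversion to a signed type wraps, and `>>` on a negative value is arithmetic.

```c
#include <assert.h>
#include <stdint.h>

/* sext(trunc(x)): keep the low 8 bits of x, then sign-extend to 32 bits. */
int32_t sext_of_trunc(int32_t x) {
    return (int32_t)(int8_t)x;
}

/* Shift-pair form: move the low 8 bits to the top of the word, then
   shift back arithmetically so the sign bit is replicated. */
int32_t shl_then_ashr(int32_t x) {
    uint32_t top = (uint32_t)x << 24;   /* shift as unsigned to avoid signed overflow */
    return (int32_t)top >> 24;
}

/* Aborts via assert if the two forms disagree on x. */
void check_forms_agree(int32_t x) {
    assert(sext_of_trunc(x) == shl_then_ashr(x));
}
```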

While mapping llvm.dbg.declare intrinsic manually map its operand, if possible, · 696cb8d4
Devang Patel authored Jan 18, 2010
```
because it points to an alloca instruction through metadata.

llvm-svn: 93757
```
696cb8d4

Jan 17, 2010
- Convert some of the dynamic opcode lookups into static ones. · cdea3572
  Owen Anderson authored Jan 17, 2010
```
llvm-svn: 93693
```
  cdea3572
- Fix comment. · fa1edea9
  Owen Anderson authored Jan 17, 2010
```
llvm-svn: 93679
```
  fa1edea9
Jan 15, 2010
- Fix a comment typo. · e0da4b6c
  Bob Wilson authored Jan 15, 2010
```
llvm-svn: 93560
```
  e0da4b6c
Jan 14, 2010

When the visitSub method was split into visitSub and visitFSub, this xform was · ad7a5b07

Bill Wendling authored Jan 13, 2010

added to the FSub version. However, the original version of this xform guarded
against doing this for floating point (!Op0->getType()->isFPOrFPVector()).

This is causing LLVM to perform incorrect xforms for code like:

void func(double *rhi, double *rlo, double xh, double xl, double yh, double yl){
  double mh, ml;
  double c = 134217729.0;
  double up, u1, u2, vp, v1, v2;
        
  up = xh*c;
  u1 = (xh - up) + up;
  u2 = xh - u1;
        
  vp = yh*c;
  v1 = (yh - vp) + vp;
  v2 = yh - v1;
        
  mh = xh*yh;
  ml = (((u1*v1 - mh) + (u1*v2)) + (u2*v1)) + (u2*v2);
  ml += xh*yl + xl*yh;
        
  *rhi = mh + ml;
  *rlo = (mh - (*rhi)) + ml;
}

The last line was optimized away, but *rlo is intended to be the difference
between the infinitely precise result of mh + ml and its value after rounding
to double precision.

llvm-svn: 93369

ad7a5b07
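The function quoted above depends on expressions such as `(mh - (*rhi)) + ml` not being simplified as if they were real-number arithmetic. The same effect can be shown with simpler inputs in the minimal C sketch below. The name `sum_rounding_error` is hypothetical, the code assumes IEEE-754 doubles with round-to-nearest and no fast-math flags, and `volatile` is used only to discourage the compiler from keeping excess precision or folding the expression.

```c
#include <assert.h>

/* Returns the part of a + b that is lost when the sum is rounded to
   double. Over the reals, (a - (a + b)) + b is always 0; in floating
   point it is the rounding error, provided |a| >= |b|. */
double sum_rounding_error(double a, double b) {
    assert((a < 0 ? -a : a) >= (b < 0 ? -b : b));
    volatile double hi = a + b;   /* rounded sum */
    volatile double t = a - hi;   /* difference between a and the rounded sum */
    return t + b;                 /* what the rounding discarded */
}
```

With `a = 1.0` and `b = 1e-17`, the rounded sum is exactly 1.0, so the function returns `1e-17`. An optimizer that rewrites the expression algebraically would return 0 instead, which is the kind of miscompile the commit describes.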

Jan 12, 2010

1) Use the new SimplifyInstructionsInBlock routine instead of the copy · 573da8ac

Chris Lattner authored Jan 12, 2010

in JT.

2) When cloning blocks for PHI or xor conditions, use
instsimplify to simplify the code as we go.  This allows us to 
squish common cases early in JT which opens up opportunities for
subsequent iterations, and allows it to completely simplify the
testcase.

llvm-svn: 93253

573da8ac

add a helper function. · 7c743f2c
Chris Lattner authored Jan 12, 2010
```
llvm-svn: 93251
```
7c743f2c
tidy up · af7855d5
Chris Lattner authored Jan 12, 2010
```
llvm-svn: 93222
```
af7855d5

Teach jump threading to duplicate small blocks when the branch · eb73bdb2

Chris Lattner authored Jan 12, 2010

condition is a xor with a phi node.  This eliminates nonsense
like this from 176.gcc in several places:

 LBB166_84:
        testl   %eax, %eax
-       setne   %al
-       xorb    %cl, %al
-       notb    %al
-       testb   $1, %al
-       je      LBB166_85
+       je      LBB166_69
+       jmp     LBB166_85

This is rdar://7391699

llvm-svn: 93221

eb73bdb2

some cleanup, and make it obvious that ProcessJumpOnPHI only works · 6a19ed0b
Chris Lattner authored Jan 11, 2010
```
on branches by renaming it and checking for a branch at the call site.

llvm-svn: 93208
```
6a19ed0b

Jan 11, 2010

reenable the piece that turns trunc(zext(x)) -> x even if zext has multiple uses, · d1a3efed
Chris Lattner authored Jan 11, 2010
```
codegen has no apparent problem with the trunc version of this, because it turns
into a simple subreg idiom

llvm-svn: 93202
```
d1a3efed

Disable folding sext(trunc(x)) -> x (and other similar cast/cast cases) when the · a6b1356c

Chris Lattner authored Jan 11, 2010

trunc has multiple uses.  Codegen is not able to coalesce the subreg case 
correctly and so this leads to higher register pressure and spilling (see PR5997).

This speeds up 256.bzip2 from 8.60 -> 8.04s on my machine, ~7%.

llvm-svn: 93200

a6b1356c

add one more bitfield optimization, allowing clang to generate · 95188694

Chris Lattner authored Jan 11, 2010

good code on PR4216:

_test_bitfield:                                             ## @test_bitfield
	orl	$32962, %edi
	movl	$4294941946, %eax
	andq	%rdi, %rax
	ret

instead of:

_test_bitfield:
        movl    $4294941696, %ecx
        movl    %edi, %eax
        orl     $194, %edi
        orl     $32768, %eax
        andq    $250, %rdi
        andq    %rax, %rcx
        movq    %rdi, %rax
        orq     %rcx, %rax
        ret

Evan is looking into the remaining andq+imm -> andl optimization.

llvm-svn: 93147

95188694

Extend CanEvaluateZExtd to handle and/or/xor more aggressively in the · 0a854204

Chris Lattner authored Jan 11, 2010

BitsToClear case.  This allows it to promote expressions which have an
and/or/xor after the lshr, promoting cases like test2 (from PR4216) 
and test3 (random example extracted from a spec benchmark).

clang now compiles the code in PR4216 into:

_test_bitfield:                                             ## @test_bitfield
	movl	%edi, %eax
	orl	$194, %eax
	movl	$4294902010, %ecx
	andq	%rax, %rcx
	orl	$32768, %edi
	andq	$39936, %rdi
	movq	%rdi, %rax
	orq	%rcx, %rax
	ret

instead of:

_test_bitfield:                                             ## @test_bitfield
	movl	%edi, %eax
	orl	$194, %eax
	movl	$4294902010, %ecx
	andq	%rax, %rcx
	shrl	$8, %edi
	orl	$128, %edi
	shlq	$8, %rdi
	andq	$39936, %rdi
	movq	%rdi, %rax
	orq	%rcx, %rax
	ret

which is still not great, but is progress.

llvm-svn: 93145

0a854204

Remove the dead TD argument to CanEvaluateZExtd, and add a · 12bd8992

Chris Lattner authored Jan 11, 2010

new BitsToClear result which allows us to start promoting
expressions that end with a lshr-by-constant.  This is
conservatively correct and better than what we had before
(see testcases) but still needs to be extended further.

llvm-svn: 93144

12bd8992
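One way to read the BitsToClear idea is sketched below in C. When a logical shift right is evaluated in the wider type, it pulls down bits that the narrow computation would never have seen, so those bits must be masked off along with the usual upper half. The i64/i32 widths and the function names are assumptions chosen for illustration. This is not the LLVM implementation.

```c
#include <assert.h>
#include <stdint.h>

/* Narrow form: truncate to 32 bits, shift right logically, zero-extend. */
uint64_t narrow_form(uint64_t a, unsigned shift) {
    assert(shift < 32);
    uint32_t b = (uint32_t)a;   /* trunc i64 -> i32 */
    uint32_t c = b >> shift;    /* lshr i32 */
    return (uint64_t)c;         /* zext i32 -> i64 */
}

/* Promoted form: shift in 64 bits, then clear the upper 32 bits plus
   `shift` extra bits that the wide shift pulled down from above bit 31. */
uint64_t promoted_form(uint64_t a, unsigned shift) {
    assert(shift < 32);
    uint64_t mask = UINT64_C(0xFFFFFFFF) >> shift;   /* low (32 - shift) bits */
    return (a >> shift) & mask;
}
```

With `a = 0xDEADBEEF12345678` and a shift of 8, both forms yield `0x123456`. The promoted form needs only one shift and one mask, with no narrowing or widening casts.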