Commits · 030f02021b6359ec5641622cf1aa63d873ecf55a · Roger Ferrer / llvm-epi-0.8

Sep 01, 2010

licm is wasting time hoisting constant foldable operations, · 030f0202

Chris Lattner authored Aug 31, 2010

instead of hoisting them, just fold them away.  This occurs in the
testcase for PR8041, for example.

llvm-svn: 112669

030f0202

Aug 31, 2010

Merge 2010-08-31-InfiniteRecursion.ll into crash.ll. · a5e6b3ec
Owen Anderson authored Aug 31, 2010
```
llvm-svn: 112635
```
a5e6b3ec
Add a test for the duplicated-conditional situation illustrated by PR5652. · 799a08ae
Owen Anderson authored Aug 31, 2010
```
llvm-svn: 112621
```
799a08ae
merge two tests. · e2295f1c
Chris Lattner authored Aug 31, 2010
```
llvm-svn: 112617
```
e2295f1c
Manually reduce this testcase. · 3931c859
Owen Anderson authored Aug 31, 2010
```
llvm-svn: 112615
```
3931c859
merge two tests and convert to filecheck. · fbcd165b
Chris Lattner authored Aug 31, 2010
```
llvm-svn: 112613
```
fbcd165b

Add a micro-test for the transforms I added to JumpThreading. · ada06237

Owen Anderson authored Aug 31, 2010

I have not been able to find a way to test each in isolation, for a few reasons:
1) The ability to look through non-i1 BinaryOperators requires the ability to look through non-constant
ICmps in order for it to ever trigger.
2) The ability to do LVI-powered PHI value determination only matters in cases that ProcessBranchOnPHI
can't handle. Since it already handles all the cases without other instructions in the def-use chain
between the PHI and the branch, it requires the ability to look through ICmps and/or BinaryOperators
as well.

llvm-svn: 112611

ada06237

Rename test directory to reflect new pass name. · 064b139c
Owen Anderson authored Aug 31, 2010
```
llvm-svn: 112592
```
064b139c
Rename ValuePropagation to a more descriptive CorrelatedValuePropagation. · 48d58ad6
Owen Anderson authored Aug 31, 2010
```
llvm-svn: 112591
```
48d58ad6

More Chris-inspired JumpThreading fixes: use ConstantExpr to correctly... · 3997a07f

Owen Anderson authored Aug 31, 2010

More Chris-inspired JumpThreading fixes: use ConstantExpr to correctly constant-fold undef, and be more careful with its return value.
This actually exposed an infinite recursion bug in ComputeValueKnownInPredecessors which theoretically already existed (in JumpThreading's
handling of and/or of i1's), but never manifested before. This patch adds a tracking set to prevent this case.

llvm-svn: 112589

3997a07f
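
The tracking-set idea from 3997a07f can be sketched outside LLVM. This is a minimal illustration, not the actual change to ComputeValueKnownInPredecessors: the `node` struct, `reaches_constant`, and the `visited` array are hypothetical names invented here. It shows how marking a node before recursing into its operands keeps a walk over a cyclic value graph from recursing forever.

```c
#include <stdbool.h>

/* Hypothetical value graph: each node has up to two operands, given as
   indices into the node array, or -1 for "no operand". Cycles are allowed,
   much like PHI nodes that feed each other around a loop. */
struct node {
    int operand[2];
    bool is_constant;
};

/* Returns true if a constant node is reachable from `id`.
   `visited` plays the role of the tracking set: a node that is already
   marked is not explored again, so a cycle terminates the walk. */
bool reaches_constant(const struct node *g, int id, bool *visited)
{
    if (id < 0)
        return false;
    if (visited[id])
        return false;            /* seen before: stop instead of looping */
    visited[id] = true;
    if (g[id].is_constant)
        return true;
    for (int i = 0; i < 2; i++)
        if (reaches_constant(g, g[id].operand[i], visited))
            return true;
    return false;
}
```

Without the `visited` check, a two-node cycle with no constant in it would recurse without bound.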

Remove r111665, which implemented store-narrowing in InstCombine. Chris... · 376597c1

Owen Anderson authored Aug 31, 2010

Remove r111665, which implemented store-narrowing in InstCombine. Chris discovered a miscompilation in it, and it's not easily
fixable at the optimizer level. I'll investigate reimplementing it in DAGCombine.

llvm-svn: 112575

376597c1

Combine these two tests, and make sure there's a newline at the end of the file. · 70b17c50
Owen Anderson authored Aug 30, 2010
```
llvm-svn: 112554
```
70b17c50

Aug 30, 2010
- Correct bogus module triple specifications. · 68c30907
  Duncan Sands authored Aug 30, 2010
```
llvm-svn: 112469
```
  68c30907
Aug 29, 2010
- LICM does get dead instructions input to it. Instead of sinking them · 263f8046
  Chris Lattner authored Aug 29, 2010
```
out of loops, just delete them.

llvm-svn: 112451
```
  263f8046
Aug 28, 2010

remove the ABCD and SSI passes. They don't have any clients that · 504e5100

Chris Lattner authored Aug 28, 2010

I'm aware of, aren't maintained, and LVI will be replacing their value.
nlewycky approved this on irc.

llvm-svn: 112355

504e5100

handle the constant case of vector insertion. For something · d0214f3e

Chris Lattner authored Aug 28, 2010

like this:

struct S { float A, B, C, D; };

struct S g;
struct S bar() { 
  struct S A = g;
  ++A.B;
  A.A = 42;
  return A;
}

we now generate:

_bar:                                   ## @bar
## BB#0:                                ## %entry
	movq	_g@GOTPCREL(%rip), %rax
	movss	12(%rax), %xmm0
	pshufd	$16, %xmm0, %xmm0
	movss	4(%rax), %xmm2
	movss	8(%rax), %xmm1
	pshufd	$16, %xmm1, %xmm1
	unpcklps	%xmm0, %xmm1
	addss	LCPI1_0(%rip), %xmm2
	pshufd	$16, %xmm2, %xmm2
	movss	LCPI1_1(%rip), %xmm0
	pshufd	$16, %xmm0, %xmm0
	unpcklps	%xmm2, %xmm0
	ret

instead of:

_bar:                                   ## @bar
## BB#0:                                ## %entry
	movq	_g@GOTPCREL(%rip), %rax
	movss	12(%rax), %xmm0
	pshufd	$16, %xmm0, %xmm0
	movss	4(%rax), %xmm2
	movss	8(%rax), %xmm1
	pshufd	$16, %xmm1, %xmm1
	unpcklps	%xmm0, %xmm1
	addss	LCPI1_0(%rip), %xmm2
	movd	%xmm2, %eax
	shlq	$32, %rax
	addq	$1109917696, %rax       ## imm = 0x42280000
	movd	%rax, %xmm0
	ret

llvm-svn: 112345

d0214f3e

optimize bitcasts from large integers to vector into vector · dd660104

Chris Lattner authored Aug 28, 2010

element insertion from the pieces that feed into the vector.
This handles a pattern that occurs frequently due to code
generated for the x86-64 abi.  We now compile something like
this:

struct S { float A, B, C, D; };
struct S g;
struct S bar() { 
  struct S A = g;
  ++A.A;
  ++A.C;
  return A;
}

into all nice vector operations:

_bar:                                   ## @bar
## BB#0:                                ## %entry
	movq	_g@GOTPCREL(%rip), %rax
	movss	LCPI1_0(%rip), %xmm1
	movss	(%rax), %xmm0
	addss	%xmm1, %xmm0
	pshufd	$16, %xmm0, %xmm0
	movss	4(%rax), %xmm2
	movss	12(%rax), %xmm3
	pshufd	$16, %xmm2, %xmm2
	unpcklps	%xmm2, %xmm0
	addss	8(%rax), %xmm1
	pshufd	$16, %xmm1, %xmm1
	pshufd	$16, %xmm3, %xmm2
	unpcklps	%xmm2, %xmm1
	ret

instead of icky integer operations:

_bar:                                   ## @bar
	movq	_g@GOTPCREL(%rip), %rax
	movss	LCPI1_0(%rip), %xmm1
	movss	(%rax), %xmm0
	addss	%xmm1, %xmm0
	movd	%xmm0, %ecx
	movl	4(%rax), %edx
	movl	12(%rax), %esi
	shlq	$32, %rdx
	addq	%rcx, %rdx
	movd	%rdx, %xmm0
	addss	8(%rax), %xmm1
	movd	%xmm1, %eax
	shlq	$32, %rsi
	addq	%rax, %rsi
	movd	%rsi, %xmm1
	ret

This resolves rdar://8360454

llvm-svn: 112343

dd660104

Add a prototype of a new peephole optimizing pass that uses LazyValue info to... · cf7f9411

Owen Anderson authored Aug 27, 2010

Add a prototype of a new peephole optimizing pass that uses LazyValue info to simplify PHIs and selects.
This pass addresses the missed optimizations from PR2581 and PR4420.

llvm-svn: 112325

cf7f9411
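
As a rough source-level picture of what cf7f9411 targets, the sketch below shows a select whose condition is already decided by the branch that guards it. The function names are hypothetical and the example is not taken from PR2581 or PR4420; it only illustrates the kind of fact a lazy value analysis could supply.

```c
/* Inside the `x == 0` branch the value of x is known, so the
   conditional expression always yields b. */
int pick(int x, int a, int b)
{
    if (x == 0)
        return x ? a : b;   /* x is 0 here, so this is just b */
    return a;
}

/* What the function reduces to once the select is simplified. */
int pick_simplified(int x, int a, int b)
{
    return x == 0 ? b : a;
}
```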

tidy up test. · 954e9557
Chris Lattner authored Aug 27, 2010
```
llvm-svn: 112321
```
954e9557

Enhance the shift propagator to handle the case when you have: · 6c1395f6

Chris Lattner authored Aug 27, 2010

A = shl x, 42
...
B = lshr ..., 38

which can be transformed into:
A = shl x, 4
...

iff we can prove that the would-be-shifted-in bits
are already zero.  This eliminates two shifts in the testcase
and allows elimination of the whole i128 chain in the real example.

llvm-svn: 112314

6c1395f6
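
The shl 42 / lshr 38 case in 6c1395f6 can be checked on 64-bit integers. This sketch assumes the simplest situation, with no operations between the two shifts, and states the precondition directly on `x`; the pass itself reasons through the intervening operations, which is not modeled here. All function names are hypothetical.

```c
#include <stdint.h>

/* Original pair: shift left by 42, then logical shift right by 38. */
uint64_t shl42_lshr38(uint64_t x)
{
    return (x << 42) >> 38;
}

/* Folded form: a single shift left by 42 - 38 = 4. */
uint64_t shl4(uint64_t x)
{
    return x << 4;
}

/* For this direct case the two forms agree exactly when bits 22..59 of x
   are zero. Those are the bits the pair discards but the single shift
   would keep. Bits 60..63 are shifted out by both forms. */
int fold_is_safe(uint64_t x)
{
    uint64_t low60 = (UINT64_C(1) << 60) - 1;
    uint64_t low22 = (UINT64_C(1) << 22) - 1;
    return (x & (low60 & ~low22)) == 0;
}
```

When `fold_is_safe` fails, for example with only bit 30 set, the pair yields 0 while the single shift yields a nonzero value, which is why the fold needs the known-zero proof.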

Implement a pretty general logical shift propagation · 18d7fc8f

Chris Lattner authored Aug 27, 2010

framework, which is good at ripping through bitfield
operations.  This generalizes a bunch of the existing
xforms that instcombine does, such as
  (x << c) >> c -> and
to handle intermediate logical nodes.  This is useful for
ripping up the "promote to large integer" code produced by
SRoA.

llvm-svn: 112304

18d7fc8f
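
The `(x << c) >> c -> and` identity mentioned in 18d7fc8f, and one way it can extend through an intermediate logical node, can be written out on 32-bit unsigned values. This is a sketch under assumptions: the function names are hypothetical, `c` must be less than 32, and the folded form shown for the intermediate `and` is one plausible result, not necessarily what instcombine emits.

```c
#include <stdint.h>

/* Shift pair: clears the top c bits of x. Requires c < 32. */
uint32_t shift_pair(uint32_t x, unsigned c)
{
    return (x << c) >> c;
}

/* Equivalent mask form. */
uint32_t mask_form(uint32_t x, unsigned c)
{
    return x & (UINT32_MAX >> c);
}

/* Same pair with a logical `and` sitting between the shifts. */
uint32_t through_and(uint32_t x, uint32_t m, unsigned c)
{
    return ((x << c) & m) >> c;
}

/* The right shift can be pushed through the `and`: m >> c already has
   its top c bits clear, so no separate mask is needed. */
uint32_t through_and_folded(uint32_t x, uint32_t m, unsigned c)
{
    return x & (m >> c);
}
```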

Aug 27, 2010

merge and filecheckize test · 606b76eb
Chris Lattner authored Aug 27, 2010
```
llvm-svn: 112289
```
606b76eb
merge two tests · c665156e
Chris Lattner authored Aug 27, 2010
```
llvm-svn: 112288
```
c665156e

teach the truncation optimization that an entire chain of · 73984346

Chris Lattner authored Aug 27, 2010

computation can be truncated if it is fed by a sext/zext that doesn't
have to be exactly equal to the truncation result type.

llvm-svn: 112285

73984346
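
The case described in 73984346, where the extension source type differs from the truncation result type, can be mirrored in C with a 16-bit input, a 64-bit computation, and a 32-bit result. The function names and the particular chain of operations are assumptions chosen for illustration. The low 32 bits of a multiply, add, or xor depend only on the low 32 bits of the operands, so the whole chain can run at 32 bits.

```c
#include <stdint.h>

/* Wide form: extend to 64 bits, compute, then truncate to 32 bits. */
uint32_t wide_then_trunc(uint16_t v)
{
    uint64_t w = (uint64_t)v;            /* like zext i16 -> i64 */
    w = (w * 0x01000193u + 7u) ^ 0x55u;  /* chain computed in 64 bits */
    return (uint32_t)w;                  /* like trunc i64 -> i32 */
}

/* Narrow form: extend only to 32 bits and compute there.
   Unsigned arithmetic wraps modulo 2^32, matching the truncation. */
uint32_t narrow_chain(uint16_t v)
{
    uint32_t n = (uint32_t)v;            /* like zext i16 -> i32 */
    n = (n * 0x01000193u + 7u) ^ 0x55u;
    return n;
}
```

The multiplier is large enough that the 64-bit product exceeds 32 bits for big inputs, so the example exercises the wrapping case and not only values that fit.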

Add an instcombine to clean up a common pattern produced · 90cd746e

Chris Lattner authored Aug 27, 2010

by the SRoA "promote to large integer" code, eliminating
some type conversions like this:

   %94 = zext i16 %93 to i32                       ; <i32> [#uses=2]
   %96 = lshr i32 %94, 8                           ; <i32> [#uses=1]
   %101 = trunc i32 %96 to i8                      ; <i8> [#uses=1]

This also unblocks other xforms; clang is now able to compile:

struct S { float A, B, C, D; };
float foo(struct S A) { return A.A + A.B+A.C+A.D; }

into:

_foo:                                   ## @foo
## BB#0:                                ## %entry
	pshufd	$1, %xmm0, %xmm2
	addss	%xmm0, %xmm2
	movdqa	%xmm1, %xmm3
	addss	%xmm2, %xmm3
	pshufd	$1, %xmm1, %xmm0
	addss	%xmm3, %xmm0
	ret

on x86-64, instead of:

_foo:                                   ## @foo
## BB#0:                                ## %entry
	movd	%xmm0, %rax
	shrq	$32, %rax
	movd	%eax, %xmm2
	addss	%xmm0, %xmm2
	movapd	%xmm1, %xmm3
	addss	%xmm2, %xmm3
	movd	%xmm1, %rax
	shrq	$32, %rax
	movd	%eax, %xmm0
	addss	%xmm3, %xmm0
	ret

This seems pretty close to optimal to me, at least without
using horizontal adds.  This also triggers in lots of other
code, including SPEC.

llvm-svn: 112278

90cd746e

Use LVI to eliminate conditional branches where we've tested a related... · 6ebbd923

Owen Anderson authored Aug 27, 2010

Use LVI to eliminate conditional branches where we've tested a related condition previously.  Update tests for this change.
This fixes PR5652.

llvm-svn: 112270

6ebbd923
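
A source-level sketch of the situation 6ebbd923 describes: a branch whose outcome follows from a related condition tested earlier. The functions are hypothetical and not the PR5652 testcase.

```c
/* The inner test is decided by the outer one: x > 10 implies x > 5. */
int classify(int x)
{
    if (x > 10) {
        if (x > 5)
            return 2;
        return 1;   /* unreachable given the outer condition */
    }
    return 0;
}

/* The same function with the redundant branch removed. */
int classify_simplified(int x)
{
    return x > 10 ? 2 : 0;
}
```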

filecheckize · c188b96b
Chris Lattner authored Aug 26, 2010
```
llvm-svn: 112235
```
c188b96b
rename test. · 387d6bcd
Chris Lattner authored Aug 26, 2010
```
llvm-svn: 112234
```
387d6bcd
optimize "integer extraction out of the middle of a vector" as produced · bfd22281
Chris Lattner authored Aug 26, 2010
```
by SRoA.  This is part of rdar://7892780, but needs another xform to
expose this.

llvm-svn: 112232
```
bfd22281

Aug 26, 2010

optimize bitcast(trunc(bitcast(x))) where the result is a float and 'x' · d4ebd6df

Chris Lattner authored Aug 26, 2010

is a vector to be a vector element extraction.  This allows clang to
compile:

struct S { float A, B, C, D; };
float foo(struct S A) { return A.A + A.B+A.C+A.D; }

into:

_foo:                                   ## @foo
## BB#0:                                ## %entry
	movd	%xmm0, %rax
	shrq	$32, %rax
	movd	%eax, %xmm2
	addss	%xmm0, %xmm2
	movapd	%xmm1, %xmm3
	addss	%xmm2, %xmm3
	movd	%xmm1, %rax
	shrq	$32, %rax
	movd	%eax, %xmm0
	addss	%xmm3, %xmm0
	ret

instead of:

_foo:                                   ## @foo
## BB#0:                                ## %entry
	movd	%xmm0, %rax
	movd	%eax, %xmm0
	shrq	$32, %rax
	movd	%eax, %xmm2
	addss	%xmm0, %xmm2
	movd	%xmm1, %rax
	movd	%eax, %xmm1
	addss	%xmm2, %xmm1
	shrq	$32, %rax
	movd	%eax, %xmm0
	addss	%xmm1, %xmm0
	ret

... eliminating half of the horribleness.

llvm-svn: 112227

d4ebd6df

filecheckize · 3c19d3d5
Chris Lattner authored Aug 26, 2010
```
llvm-svn: 112225
```
3c19d3d5
rename test · 7717c616
Chris Lattner authored Aug 26, 2010
```
llvm-svn: 112224
```
7717c616
Make JumpThreading smart enough to properly thread StrSwitch when it's compiled with clang++. · bd2ecc7e
Owen Anderson authored Aug 26, 2010
```
llvm-svn: 112198
```
bd2ecc7e

Aug 25, 2010

DIGlobalVariable can be used to encode debug info for globals that are... · 01262e12

Devang Patel authored Aug 25, 2010

DIGlobalVariable can be used to encode debug info for globals that are directly folded into a constant by the FE.

llvm-svn: 112072

01262e12

In the default address space, any GEP off of null results in a trap value if... · 4afea9e3

Owen Anderson authored Aug 25, 2010

In the default address space, any GEP off of null results in a trap value if you try to load it.  Thus,
any load in the default address space that completes implies that the base value that it GEP'd from
was not null.

llvm-svn: 112015

4afea9e3
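
The reasoning in 4afea9e3 has a familiar shape in C: once a load through a pointer has completed, a later null check on that pointer cannot succeed. The struct and function below are hypothetical. Dereferencing a null pointer is already undefined in C, so this only mirrors the argument and is not a statement about LLVM IR semantics.

```c
#include <stddef.h>

struct point {
    int x;
    int y;
};

int read_y_then_check(const struct point *p)
{
    int y = p->y;       /* a load from an address computed off p */
    if (p == NULL)      /* the load completed, so p was not null */
        return -1;
    return y;
}
```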

Aug 20, 2010
- Re-apply r111568 with a fix for the clang self-host. · 84c29a09
  Owen Anderson authored Aug 20, 2010
```
llvm-svn: 111665
```
  84c29a09
- Previous revert failed to remove this file. · 3323651e
  Owen Anderson authored Aug 19, 2010
```
llvm-svn: 111582
```
  3323651e
- Revert r111568 to unbreak clang self-host. · 43057cd5
  Owen Anderson authored Aug 19, 2010
```
llvm-svn: 111571
```
  43057cd5
- When a set of bitmask operations, typically from a bitfield initialization,... · bb723b22
  Owen Anderson authored Aug 19, 2010
```
When a set of bitmask operations, typically from a bitfield initialization, only modifies the low bytes of a value,
we can narrow the store to only over-write the affected bytes.

llvm-svn: 111568
```
  bb723b22
Aug 19, 2010
- Fixed and reactivated a partial specialization test · d4b6ab98
  Kenneth Uildriks authored Aug 19, 2010
```
llvm-svn: 111516
```
  d4b6ab98