- Aug 29, 2010
-
Chris Lattner authored
preserves domfrontier. It does preserve AA though. llvm-svn: 112419
-
Chris Lattner authored
require DomFrontier. Dropping this doesn't actually save any runs of the pass though. llvm-svn: 112418
-
Chris Lattner authored
Among other things, this uses SSAUpdater instead of PromoteMemToReg. llvm-svn: 112417
-
Chris Lattner authored
llvm-svn: 112412
-
Chris Lattner authored
This leads to much simpler code. llvm-svn: 112410
-
Chris Lattner authored
llvm-svn: 112409
-
Chris Lattner authored
llvm-svn: 112408
-
Chris Lattner authored
getUniqueExitBlocks instead of getExitBlocks and a manual set to eliminate dupes. llvm-svn: 112405
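For context, a minimal sketch of the difference, written against the present-day LoopInfo API (the two helper functions are hypothetical):

    #include "llvm/ADT/SmallPtrSet.h"
    #include "llvm/ADT/SmallVector.h"
    #include "llvm/Analysis/LoopInfo.h"
    using namespace llvm;

    // Before: getExitBlocks may report the same block several times
    // (once per exiting edge), so callers deduplicated by hand.
    static void uniqueExitsByHand(Loop *L, SmallVectorImpl<BasicBlock *> &Out) {
      SmallVector<BasicBlock *, 8> Exits;
      L->getExitBlocks(Exits);
      SmallPtrSet<BasicBlock *, 8> Seen;
      for (BasicBlock *BB : Exits)
        if (Seen.insert(BB).second) // keep only the first occurrence
          Out.push_back(BB);
    }

    // After: let LoopInfo deduplicate internally.
    static void uniqueExits(Loop *L, SmallVectorImpl<BasicBlock *> &Out) {
      L->getUniqueExitBlocks(Out); // already duplicate-free
    }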
-
Chris Lattner authored
llvm-svn: 112404
-
- Aug 28, 2010
-
Chris Lattner authored
being actively maintained, improved, or extended. llvm-svn: 112356
-
Chris Lattner authored
I'm aware of, aren't maintained, and LVI will be replacing their value. nlewycky approved this on IRC. llvm-svn: 112355
-
Chris Lattner authored
llvm-svn: 112351
-
Chris Lattner authored
llvm-svn: 112350
-
Chris Lattner authored
like this:

    struct S { float A, B, C, D; };
    struct S g;
    struct S bar() {
      struct S A = g;
      ++A.B;
      A.A = 42;
      return A;
    }

we now generate:

    _bar:                                   ## @bar
    ## BB#0:                                ## %entry
            movq    _g@GOTPCREL(%rip), %rax
            movss   12(%rax), %xmm0
            pshufd  $16, %xmm0, %xmm0
            movss   4(%rax), %xmm2
            movss   8(%rax), %xmm1
            pshufd  $16, %xmm1, %xmm1
            unpcklps        %xmm0, %xmm1
            addss   LCPI1_0(%rip), %xmm2
            pshufd  $16, %xmm2, %xmm2
            movss   LCPI1_1(%rip), %xmm0
            pshufd  $16, %xmm0, %xmm0
            unpcklps        %xmm2, %xmm0
            ret

instead of:

    _bar:                                   ## @bar
    ## BB#0:                                ## %entry
            movq    _g@GOTPCREL(%rip), %rax
            movss   12(%rax), %xmm0
            pshufd  $16, %xmm0, %xmm0
            movss   4(%rax), %xmm2
            movss   8(%rax), %xmm1
            pshufd  $16, %xmm1, %xmm1
            unpcklps        %xmm0, %xmm1
            addss   LCPI1_0(%rip), %xmm2
            movd    %xmm2, %eax
            shlq    $32, %rax
            addq    $1109917696, %rax       ## imm = 0x42280000
            movd    %rax, %xmm0
            ret

llvm-svn: 112345
-
Chris Lattner authored
element insertion from the pieces that feed into the vector. This handles a pattern that occurs frequently due to code generated for the x86-64 ABI. We now compile something like this:

    struct S { float A, B, C, D; };
    struct S g;
    struct S bar() {
      struct S A = g;
      ++A.A;
      ++A.C;
      return A;
    }

into all nice vector operations:

    _bar:                                   ## @bar
    ## BB#0:                                ## %entry
            movq    _g@GOTPCREL(%rip), %rax
            movss   LCPI1_0(%rip), %xmm1
            movss   (%rax), %xmm0
            addss   %xmm1, %xmm0
            pshufd  $16, %xmm0, %xmm0
            movss   4(%rax), %xmm2
            movss   12(%rax), %xmm3
            pshufd  $16, %xmm2, %xmm2
            unpcklps        %xmm2, %xmm0
            addss   8(%rax), %xmm1
            pshufd  $16, %xmm1, %xmm1
            pshufd  $16, %xmm3, %xmm2
            unpcklps        %xmm2, %xmm1
            ret

instead of icky integer operations:

    _bar:                                   ## @bar
            movq    _g@GOTPCREL(%rip), %rax
            movss   LCPI1_0(%rip), %xmm1
            movss   (%rax), %xmm0
            addss   %xmm1, %xmm0
            movd    %xmm0, %ecx
            movl    4(%rax), %edx
            movl    12(%rax), %esi
            shlq    $32, %rdx
            addq    %rcx, %rdx
            movd    %rdx, %xmm0
            addss   8(%rax), %xmm1
            movd    %xmm1, %eax
            shlq    $32, %rsi
            addq    %rax, %rsi
            movd    %rsi, %xmm1
            ret

This resolves rdar://8360454

llvm-svn: 112343
-
Benjamin Kramer authored
llvm-svn: 112332
-
Owen Anderson authored
Add a prototype of a new peephole optimizing pass that uses LazyValueInfo to simplify PHIs and selects. This pass addresses the missed optimizations from PR2581 and PR4420. llvm-svn: 112325
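Illustratively (a hypothetical example, not the PR testcases): on the path where the select below executes, LazyValueInfo knows the value of x, so the select folds to one operand.

    // Inside the taken branch, x is known to be 0, so "x ? a : b"
    // must yield b; the select can be simplified away.
    int select_example(int x, int a, int b) {
      if (x == 0)
        return x ? a : b; // foldable to 'b' here
      return a;
    }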
-
Chris Lattner authored
A = shl x, 42
    ...
B = lshr ..., 38

which can be transformed into:

A = shl x, 4
    ...

iff we can prove that the would-be-shifted-in bits are already zero. This eliminates two shifts in the testcase and allows elimination of the whole i128 chain in the real example. llvm-svn: 112314
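The arithmetic behind the transform can be sanity-checked with a small C++ program (a sketch, not the InstCombine code; it assumes an i64 value whose bits above bit 21 are known zero):

    #include <cassert>
    #include <cstdint>

    // (x << 42) >> 38 equals x << 4 whenever bits 22 and up of x are
    // already zero, i.e. the replacement shift would only shift in
    // zeros where the original pair guaranteed zeros.
    static uint64_t twoShifts(uint64_t x) { return (x << 42) >> 38; }
    static uint64_t oneShift(uint64_t x) { return x << 4; }

    int main() {
      for (uint64_t x = 0; x < (1ULL << 22); x += 4097)
        assert(twoShifts(x) == oneShift(x));
      return 0;
    }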
-
Chris Lattner authored
framework, which is good at ripping through bitfield operations. This generalizes a bunch of the existing xforms that instcombine does, such as (x << c) >> c -> and, to handle intermediate logical nodes. This is useful for ripping up the "promote to large integer" code produced by SRoA. llvm-svn: 112304
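The (x << c) >> c case reduces to a single mask; a quick hypothetical check of that identity for unsigned 64-bit values:

    #include <cassert>
    #include <cstdint>

    // With logical shifts, (x << c) >> c just clears the top c bits,
    // which is the same as ANDing with an all-ones mask shifted right.
    int main() {
      const uint64_t x = 0xDEADBEEFCAFEF00DULL;
      for (unsigned c = 0; c < 64; ++c)
        assert(((x << c) >> c) == (x & (~0ULL >> c)));
      return 0;
    }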
-
- Aug 27, 2010
-
Chris Lattner authored
more general simplify demanded bits logic. llvm-svn: 112291
-
Owen Anderson authored
llvm-svn: 112286
-
Chris Lattner authored
computation can be truncated if it is fed by a sext/zext that doesn't have to be exactly equal to the truncation result type. llvm-svn: 112285
-
Chris Lattner authored
by the SRoA "promote to large integer" code, eliminating some type conversions like this:

    %94 = zext i16 %93 to i32                       ; <i32> [#uses=2]
    %96 = lshr i32 %94, 8                           ; <i32> [#uses=1]
    %101 = trunc i32 %96 to i8                      ; <i8> [#uses=1]

This also unblocks other xforms from happening; now clang is able to compile:

    struct S { float A, B, C, D; };
    float foo(struct S A) { return A.A + A.B + A.C + A.D; }

into:

    _foo:                                   ## @foo
    ## BB#0:                                ## %entry
            pshufd  $1, %xmm0, %xmm2
            addss   %xmm0, %xmm2
            movdqa  %xmm1, %xmm3
            addss   %xmm2, %xmm3
            pshufd  $1, %xmm1, %xmm0
            addss   %xmm3, %xmm0
            ret

on x86-64, instead of:

    _foo:                                   ## @foo
    ## BB#0:                                ## %entry
            movd    %xmm0, %rax
            shrq    $32, %rax
            movd    %eax, %xmm2
            addss   %xmm0, %xmm2
            movapd  %xmm1, %xmm3
            addss   %xmm2, %xmm3
            movd    %xmm1, %rax
            shrq    $32, %rax
            movd    %eax, %xmm0
            addss   %xmm3, %xmm0
            ret

This seems pretty close to optimal to me, at least without using horizontal adds. This also triggers in lots of other code, including SPEC.

llvm-svn: 112278
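The eliminated zext/lshr/trunc chain computes nothing the narrow type couldn't; a small hypothetical check of the equivalence:

    #include <cassert>
    #include <cstdint>

    // zext i16 -> lshr 8 -> trunc i8 extracts the high byte of a
    // 16-bit value; widening to 32 bits first changes nothing.
    static uint8_t viaI32(uint16_t v) { return (uint8_t)((uint32_t)v >> 8); }
    static uint8_t direct(uint16_t v) { return (uint8_t)(v >> 8); }

    int main() {
      for (uint32_t v = 0; v <= 0xFFFF; ++v)
        assert(viaI32((uint16_t)v) == direct((uint16_t)v));
      return 0;
    }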
-
Owen Anderson authored
Use LVI to eliminate conditional branches where we've tested a related condition previously. Update tests for this change. This fixes PR5652. llvm-svn: 112270
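The shape of the redundancy this removes, in a hypothetical example: once the first branch on the condition executes, the second test is decided on every incoming path, so the branch itself (not just the compare) can be threaded away.

    // On both paths reaching the second 'if', LVI already knows
    // whether p > 10, so each predecessor can jump straight to the
    // correct successor.
    int f(int p) {
      int r = 0;
      if (p > 10)
        r = 1;
      if (p > 10) // fully determined by the path taken above
        r += 2;
      return r;
    }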
-
Chris Lattner authored
by SRoA. This is part of rdar://7892780, but needs another xform to expose it. llvm-svn: 112232
-
- Aug 26, 2010
-
Chris Lattner authored
is a vector to be a vector element extraction. This allows clang to compile:

    struct S { float A, B, C, D; };
    float foo(struct S A) { return A.A + A.B + A.C + A.D; }

into:

    _foo:                                   ## @foo
    ## BB#0:                                ## %entry
            movd    %xmm0, %rax
            shrq    $32, %rax
            movd    %eax, %xmm2
            addss   %xmm0, %xmm2
            movapd  %xmm1, %xmm3
            addss   %xmm2, %xmm3
            movd    %xmm1, %rax
            shrq    $32, %rax
            movd    %eax, %xmm0
            addss   %xmm3, %xmm0
            ret

instead of:

    _foo:                                   ## @foo
    ## BB#0:                                ## %entry
            movd    %xmm0, %rax
            movd    %eax, %xmm0
            shrq    $32, %rax
            movd    %eax, %xmm2
            addss   %xmm0, %xmm2
            movd    %xmm1, %rax
            movd    %eax, %xmm1
            addss   %xmm2, %xmm1
            shrq    $32, %rax
            movd    %eax, %xmm0
            addss   %xmm1, %xmm0
            ret

... eliminating half of the horribleness.

llvm-svn: 112227
-
Owen Anderson authored
llvm-svn: 112198
-
Dan Gohman authored
fix: add a flag to MapValue and friends which indicates whether any module-level mappings are being made. In the common case of inlining, no module-level mappings are needed, so MapValue doesn't need to examine non-function-local metadata, which can be very expensive in the case of a large module with really deep metadata (e.g. a large C++ program compiled with -g). This flag is a little awkward; perhaps eventually it can be moved into the ClonedCodeInfo class. llvm-svn: 112190
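In today's tree this flag survives as a RemapFlags bit; a minimal sketch against the current ValueMapper API (the wrapper function is hypothetical):

    #include "llvm/Transforms/Utils/ValueMapper.h"
    using namespace llvm;

    // When inlining within a single module, no module-level entities
    // are remapped, so the mapper may skip walking non-function-local
    // metadata (the expensive case described above).
    static Value *remapForInlining(const Value *V, ValueToValueMapTy &VM) {
      return MapValue(V, VM, RF_NoModuleLevelChanges);
    }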
-
Daniel Dunbar authored
except ...", it is causing *massive* performance regressions when building Clang with itself (-O3 -g). llvm-svn: 112158
-
Daniel Dunbar authored
individual ...", which depends on r111922, which I am reverting. llvm-svn: 112157
-
Chris Lattner authored
llvm-svn: 112130
-
Dan Gohman authored
and was over-complicated, and replacing it with a simple implementation. llvm-svn: 112120
-
Chris Lattner authored
llvm-svn: 112104
-
- Aug 25, 2010
-
Dan Gohman authored
instructions, not when remapping modules. llvm-svn: 112091
-
Devang Patel authored
DIGlobalVariable can be used to encode debug info for globals that are directly folded into a constant by the front end. llvm-svn: 112072
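The scenario in a hypothetical C++ snippet: the front end folds every use of the constant, no storage remains, and only the debug-info entry can still name it for a debugger.

    // kScale is folded into its uses; a DIGlobalVariable entry is
    // what lets a debugger still print it.
    static const int kScale = 8;
    int scaled(int x) { return x * kScale; }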
-
- Aug 24, 2010
-
Dan Gohman authored
which does the same thing. This eliminates redundant code and handles MDNodes better. MDNode linking still doesn't fully work, though. llvm-svn: 111941
-
Owen Anderson authored
llvm-svn: 111923
-
Dan Gohman authored
that it avoids a lot of unnecessary cloning by not remapping MDNode cycles when none of the nodes in the cycle actually need to be remapped. Also, it uses the new temporary MDNode mechanism. llvm-svn: 111922
-
- Aug 23, 2010
-
Owen Anderson authored
llvm-svn: 111834
-
Owen Anderson authored
llvm-svn: 111816
-