Aug 29, 2010
Chris Lattner authored
for the unroller to pretend it supports updating it. It still has a horrible hack for DomTree. llvm-svn: 112444
Dan Gohman authored
other filtering techniques, as those may allow it to filter out more obviously unprofitable candidates. llvm-svn: 112441
Dan Gohman authored
LSRInstance data structures up to date. This fixes some pessimizations caused by stale data which will be exposed in an upcoming change. llvm-svn: 112440
Dan Gohman authored
NarrowSearchSpaceUsingHeuristics into separate functions. llvm-svn: 112439
Dan Gohman authored
llvm-svn: 112438
Dan Gohman authored
llvm-svn: 112437
Dan Gohman authored
the high-level logic. llvm-svn: 112436
Dan Gohman authored
llvm-svn: 112435
Dan Gohman authored
llvm-svn: 112434
Chris Lattner authored
preserves domfrontier. It does preserve AA though. llvm-svn: 112419
Chris Lattner authored
require DomFrontier. Dropping this doesn't actually save any runs of the pass though. llvm-svn: 112418
Chris Lattner authored
Among other things, this uses SSAUpdater instead of PromoteMemToReg. llvm-svn: 112417
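For reference, the SSAUpdater pattern this relies on looks roughly like the following (a minimal C++ sketch against the present-day SSAUpdater interface; every name here is an illustrative stand-in, not code from the pass):

  #include "llvm/IR/BasicBlock.h"
  #include "llvm/IR/Value.h"
  #include "llvm/Transforms/Utils/SSAUpdater.h"
  using namespace llvm;

  // Rewrite a value into SSA form without computing dominance frontiers:
  // declare the value that is live-out of each defining block, then ask
  // for the value at a use point; PHI nodes are created on demand.
  Value *rewriteSketch(Value *InitialVal, Value *UpdatedVal,
                       BasicBlock *EntryBB, BasicBlock *LoopBB,
                       BasicBlock *UseBB) {
    SSAUpdater SSA;
    SSA.Initialize(InitialVal->getType(), InitialVal->getName());
    SSA.AddAvailableValue(EntryBB, InitialVal);
    SSA.AddAvailableValue(LoopBB, UpdatedVal);
    return SSA.GetValueInMiddleOfBlock(UseBB);
  }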
Chris Lattner authored
llvm-svn: 112412
Chris Lattner authored
This leads to much simpler code. llvm-svn: 112410
Chris Lattner authored
llvm-svn: 112409
Chris Lattner authored
llvm-svn: 112408
Chris Lattner authored
getUniqueExitBlocks instead of getExitBlocks and a manual set to eliminate dupes. llvm-svn: 112405
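The difference is mechanical; roughly (a hedged sketch against the LoopInfo API, with illustrative names):

  #include "llvm/ADT/SmallPtrSet.h"
  #include "llvm/ADT/SmallVector.h"
  #include "llvm/Analysis/LoopInfo.h"
  using namespace llvm;

  void exitBlocksSketch(Loop *L) {
    SmallVector<BasicBlock *, 8> Exits;

    // Before: getExitBlocks returns one entry per exiting edge, so
    // duplicates had to be weeded out by hand with a set.
    L->getExitBlocks(Exits);
    SmallPtrSet<BasicBlock *, 8> Unique(Exits.begin(), Exits.end());

    // After: getUniqueExitBlocks returns each exit block exactly once.
    Exits.clear();
    L->getUniqueExitBlocks(Exits);
  }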
Chris Lattner authored
llvm-svn: 112404
Aug 28, 2010
Chris Lattner authored
being actively maintained, improved, or extended. llvm-svn: 112356
Chris Lattner authored
I'm aware of, aren't maintained, and LVI will be replacing their value. nlewycky approved this on IRC. llvm-svn: 112355
Chris Lattner authored
llvm-svn: 112351
Chris Lattner authored
llvm-svn: 112350
Chris Lattner authored
like this:

  struct S { float A, B, C, D; };
  struct S g;
  struct S bar() {
    struct S A = g;
    ++A.B;
    A.A = 42;
    return A;
  }

we now generate:

  _bar:                            ## @bar
  ## BB#0:                         ## %entry
          movq    _g@GOTPCREL(%rip), %rax
          movss   12(%rax), %xmm0
          pshufd  $16, %xmm0, %xmm0
          movss   4(%rax), %xmm2
          movss   8(%rax), %xmm1
          pshufd  $16, %xmm1, %xmm1
          unpcklps %xmm0, %xmm1
          addss   LCPI1_0(%rip), %xmm2
          pshufd  $16, %xmm2, %xmm2
          movss   LCPI1_1(%rip), %xmm0
          pshufd  $16, %xmm0, %xmm0
          unpcklps %xmm2, %xmm0
          ret

instead of:

  _bar:                            ## @bar
  ## BB#0:                         ## %entry
          movq    _g@GOTPCREL(%rip), %rax
          movss   12(%rax), %xmm0
          pshufd  $16, %xmm0, %xmm0
          movss   4(%rax), %xmm2
          movss   8(%rax), %xmm1
          pshufd  $16, %xmm1, %xmm1
          unpcklps %xmm0, %xmm1
          addss   LCPI1_0(%rip), %xmm2
          movd    %xmm2, %eax
          shlq    $32, %rax
          addq    $1109917696, %rax      ## imm = 0x42280000
          movd    %rax, %xmm0
          ret

llvm-svn: 112345
Chris Lattner authored
element insertion from the pieces that feed into the vector. This handles a pattern that occurs frequently due to code generated for the x86-64 ABI. We now compile something like this:

  struct S { float A, B, C, D; };
  struct S g;
  struct S bar() {
    struct S A = g;
    ++A.A;
    ++A.C;
    return A;
  }

into all nice vector operations:

  _bar:                            ## @bar
  ## BB#0:                         ## %entry
          movq    _g@GOTPCREL(%rip), %rax
          movss   LCPI1_0(%rip), %xmm1
          movss   (%rax), %xmm0
          addss   %xmm1, %xmm0
          pshufd  $16, %xmm0, %xmm0
          movss   4(%rax), %xmm2
          movss   12(%rax), %xmm3
          pshufd  $16, %xmm2, %xmm2
          unpcklps %xmm2, %xmm0
          addss   8(%rax), %xmm1
          pshufd  $16, %xmm1, %xmm1
          pshufd  $16, %xmm3, %xmm2
          unpcklps %xmm2, %xmm1
          ret

instead of icky integer operations:

  _bar:                            ## @bar
          movq    _g@GOTPCREL(%rip), %rax
          movss   LCPI1_0(%rip), %xmm1
          movss   (%rax), %xmm0
          addss   %xmm1, %xmm0
          movd    %xmm0, %ecx
          movl    4(%rax), %edx
          movl    12(%rax), %esi
          shlq    $32, %rdx
          addq    %rcx, %rdx
          movd    %rdx, %xmm0
          addss   8(%rax), %xmm1
          movd    %xmm1, %eax
          shlq    $32, %rsi
          addq    %rax, %rsi
          movd    %rsi, %xmm1
          ret

This resolves rdar://8360454

llvm-svn: 112343
Benjamin Kramer authored
llvm-svn: 112332
Owen Anderson authored
Add a prototype of a new peephole optimizing pass that uses LazyValueInfo to simplify PHIs and selects. This pass addresses the missed optimizations from PR2581 and PR4420. llvm-svn: 112325
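The flavor of case such a pass can catch looks like this (an illustration of value-based select folding, not the PR2581/PR4420 testcases themselves):

  // Inside the 'then' block x > 10 is known, so lazy value information
  // can prove the select condition (x > 5) always true and fold it.
  int selectSketch(int x) {
    if (x > 10) {
      int a = x * 2;
      int b = x * 3;
      return x > 5 ? a : b;   // simplifies to 'return a;'
    }
    return 0;
  }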
Chris Lattner authored
A = shl x, 42
  ...
  B = lshr ..., 38

which can be transformed into:

  A = shl x, 4
  ...

iff we can prove that the would-be-shifted-in bits are already zero. This eliminates two shifts in the testcase and allows elimination of the whole i128 chain in the real example. llvm-svn: 112314
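Worked through at 128 bits (a hedged illustration using the Clang/GCC __int128 extension):

  // Bit i of (x << 42) >> 38 is bit i-4 of x for i in [4, 89] and zero
  // above that, while x << 4 carries x's bits all the way up to bit 127.
  // So the shift pair equals a single 'shl x, 4' exactly when x's bits
  // 86..123, the ones that would land at positions 90..127, are already
  // known to be zero.
  unsigned __int128 shiftPair(unsigned __int128 x) {
    return (x << 42) >> 38;   // -> x << 4 under that condition
  }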
Chris Lattner authored
framework, which is good at ripping through bitfield operations. This generalizes a bunch of the existing xforms that instcombine does, such as (x << c) >> c -> and, to handle intermediate logical nodes. This is useful for ripping up the "promote to large integer" code produced by SRoA. llvm-svn: 112304
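For instance, the classic form and its generalization through an intervening logical node, sketched for 32-bit unsigned values:

  unsigned classic(unsigned x) {
    return (x << 6) >> 6;                // -> x & 0x03FFFFFF
  }

  // The same demanded-bits reasoning now applies with a logical op
  // sitting between the two shifts:
  unsigned throughOr(unsigned x, unsigned y) {
    return ((x << 6) | (y << 6)) >> 6;   // -> (x | y) & 0x03FFFFFF
  }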
Aug 27, 2010
Chris Lattner authored
more general simplify demanded bits logic. llvm-svn: 112291
Owen Anderson authored
llvm-svn: 112286
Chris Lattner authored
computation can be truncated if it is fed by a sext/zext that doesn't have to be exactly equal to the truncation result type. llvm-svn: 112285
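A small illustration of the generalized rule (types deliberately chosen so the zext source and the truncation result differ):

  // The multiply happens in 32 bits, but only the low 16 survive the
  // truncation, so it can be evaluated directly at 16 bits even though
  // the zext source type (i8) is not the truncation result type (i16).
  unsigned short narrowSketch(unsigned char a) {
    return (unsigned short)((unsigned)a * 259u);
  }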
Chris Lattner authored
by the SRoA "promote to large integer" code, eliminating some type conversions like this:

  %94 = zext i16 %93 to i32                       ; <i32> [#uses=2]
  %96 = lshr i32 %94, 8                           ; <i32> [#uses=1]
  %101 = trunc i32 %96 to i8                      ; <i8> [#uses=1]

This also unblocks other xforms from happening; now clang is able to compile:

  struct S { float A, B, C, D; };
  float foo(struct S A) { return A.A + A.B + A.C + A.D; }

into:

  _foo:                            ## @foo
  ## BB#0:                         ## %entry
          pshufd  $1, %xmm0, %xmm2
          addss   %xmm0, %xmm2
          movdqa  %xmm1, %xmm3
          addss   %xmm2, %xmm3
          pshufd  $1, %xmm1, %xmm0
          addss   %xmm3, %xmm0
          ret

on x86-64, instead of:

  _foo:                            ## @foo
  ## BB#0:                         ## %entry
          movd    %xmm0, %rax
          shrq    $32, %rax
          movd    %eax, %xmm2
          addss   %xmm0, %xmm2
          movapd  %xmm1, %xmm3
          addss   %xmm2, %xmm3
          movd    %xmm1, %rax
          shrq    $32, %rax
          movd    %eax, %xmm0
          addss   %xmm3, %xmm0
          ret

This seems pretty close to optimal to me, at least without using horizontal adds. This also triggers in lots of other code, including SPEC.

llvm-svn: 112278
Owen Anderson authored
Use LVI to eliminate conditional branches where we've tested a related condition previously. Update tests for this change. This fixes PR5652. llvm-svn: 112270
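The shape of a branch this catches (illustrative only; use() is a hypothetical helper):

  void use(int *);   // hypothetical

  void branchSketch(int *p) {
    if (p == 0)
      return;
    // ... code that never reassigns p ...
    if (p != 0)      // LVI knows p is non-null here; the branch folds away
      use(p);
  }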
Chris Lattner authored
by SRoA. This is part of rdar://7892780, but needs another xform to expose this. llvm-svn: 112232
Aug 26, 2010
Chris Lattner authored
is a vector to be a vector element extraction. This allows clang to compile:

  struct S { float A, B, C, D; };
  float foo(struct S A) { return A.A + A.B + A.C + A.D; }

into:

  _foo:                            ## @foo
  ## BB#0:                         ## %entry
          movd    %xmm0, %rax
          shrq    $32, %rax
          movd    %eax, %xmm2
          addss   %xmm0, %xmm2
          movapd  %xmm1, %xmm3
          addss   %xmm2, %xmm3
          movd    %xmm1, %rax
          shrq    $32, %rax
          movd    %eax, %xmm0
          addss   %xmm3, %xmm0
          ret

instead of:

  _foo:                            ## @foo
  ## BB#0:                         ## %entry
          movd    %xmm0, %rax
          movd    %eax, %xmm0
          shrq    $32, %rax
          movd    %eax, %xmm2
          addss   %xmm0, %xmm2
          movd    %xmm1, %rax
          movd    %eax, %xmm1
          addss   %xmm2, %xmm1
          shrq    $32, %rax
          movd    %eax, %xmm0
          addss   %xmm1, %xmm0
          ret

... eliminating half of the horribleness.

llvm-svn: 112227
Owen Anderson authored
llvm-svn: 112198
Dan Gohman authored
fix: add a flag to MapValue and friends which indicates whether any module-level mappings are being made. In the common case of inlining, no module-level mappings are needed, so MapValue doesn't need to examine non-function-local metadata, which can be very expensive in the case of a large module with really deep metadata (e.g. a large C++ program compiled with -g). This flag is a little awkward; perhaps eventually it can be moved into the ClonedCodeInfo class. llvm-svn: 112190
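In rough form, the control flow the flag enables (mapValueSketch and remapNormally are hypothetical names, written against the 2010-era llvm/Metadata.h where MDNode was still a Value; this is not the exact signature that landed):

  #include "llvm/Metadata.h"
  using namespace llvm;

  const Value *remapNormally(const Value *V);   // hypothetical fallback

  // With no module-level mappings (the common inlining case), metadata
  // that is not function-local cannot change, so skip the deep walk.
  const Value *mapValueSketch(const Value *V, bool ModuleLevelChanges) {
    if (const MDNode *MD = dyn_cast<MDNode>(V))
      if (!ModuleLevelChanges && !MD->isFunctionLocal())
        return V;
    return remapNormally(V);
  }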
Daniel Dunbar authored
except ...", it is causing *massive* performance regressions when building Clang with itself (-O3 -g). llvm-svn: 112158
Daniel Dunbar authored
individual ...", which depends on r111922, which I am reverting. llvm-svn: 112157
Chris Lattner authored
llvm-svn: 112130