- Aug 29, 2010
-
Chris Lattner authored
keeping them around until the pass is destroyed, keep them around a) just when useful (not for outer loops) and b) destroy them right after we use them. This should reduce memory use and fix potential bugs where a loop is deleted and another loop gets allocated to the same address. llvm-svn: 112446
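For illustration, a minimal generic sketch of the ownership pattern being described (hypothetical names, not the actual LICM code): build per-loop data only when it is useful and destroy it immediately after use, so nothing can dangle if a later loop is allocated at the same address.

    #include <memory>

    struct Loop {};                          // stand-in for the real loop class
    struct SubloopAliasInfo {};              // hypothetical per-loop analysis results

    // Hypothetical helper: compute the analysis for one loop on demand.
    static std::unique_ptr<SubloopAliasInfo> computeAliasInfo(Loop &) {
      return std::make_unique<SubloopAliasInfo>();
    }

    static void processLoop(Loop &L, bool IsOuterLoop) {
      std::unique_ptr<SubloopAliasInfo> Info;
      if (!IsOuterLoop)                      // a) build only when it is useful
        Info = computeAliasInfo(L);
      // ... transform the loop, consulting *Info when present ...
    }                                        // b) Info destroyed right after use,
                                             //    never outliving the loop itself

    int main() {
      Loop L;
      processLoop(L, /*IsOuterLoop=*/false);
      return 0;
    }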
-
Chris Lattner authored
claims that it preserves domfrontier if it doesn't really. llvm-svn: 112445
-
Chris Lattner authored
for the unroller to pretend it supports updating it. It still has a horrible hack for DomTree. llvm-svn: 112444
-
Dan Gohman authored
other filtering techniques, as those may allow it to filter out more obviously unprofitable candidates. llvm-svn: 112441
-
Dan Gohman authored
LSRInstance data structures up to date. This fixes some pessimizations caused by stale data which will be exposed in an upcoming change. llvm-svn: 112440
-
Dan Gohman authored
NarrowSearchSpaceUsingHeuristics into separate functions. llvm-svn: 112439
-
Dan Gohman authored
llvm-svn: 112438
-
Dan Gohman authored
llvm-svn: 112437
-
Dan Gohman authored
the high-level logic. llvm-svn: 112436
-
Dan Gohman authored
llvm-svn: 112435
-
Dan Gohman authored
llvm-svn: 112434
-
Chris Lattner authored
preserves domfrontier. It does preserve AA though. llvm-svn: 112419
-
Chris Lattner authored
require DomFrontier. Dropping this doesn't actually save any runs of the pass though. llvm-svn: 112418
-
Chris Lattner authored
Among other things, this uses SSAUpdater instead of PromoteMemToReg. llvm-svn: 112417
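For illustration, a minimal sketch (not code from the commit) of the general SSAUpdater usage pattern that replaces alloca-based promotion; it assumes the current LLVM interface (Initialize / AddAvailableValue / GetValueInMiddleOfBlock), which may differ in detail from the 2010 API.

    #include "llvm/ADT/ArrayRef.h"
    #include "llvm/IR/BasicBlock.h"
    #include "llvm/IR/Instructions.h"
    #include "llvm/Transforms/Utils/SSAUpdater.h"
    #include <utility>
    using namespace llvm;

    // Rewrite uses of a promoted value without inserting allocas: tell
    // SSAUpdater which value is live out of which block and let it build
    // the needed PHI nodes on demand.
    static void rewriteUses(LoadInst &OrigLoad,
                            ArrayRef<std::pair<BasicBlock *, Value *>> AvailIn) {
      SSAUpdater SSA;
      SSA.Initialize(OrigLoad.getType(), OrigLoad.getName());
      for (const auto &BV : AvailIn)
        SSA.AddAvailableValue(BV.first, BV.second);   // value live out of block
      // Whatever value reaches the load's block replaces the load; PHIs are
      // inserted where control flow merges.
      Value *V = SSA.GetValueInMiddleOfBlock(OrigLoad.getParent());
      OrigLoad.replaceAllUsesWith(V);
    }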
-
Chris Lattner authored
llvm-svn: 112412
-
Chris Lattner authored
This leads to much simpler code. llvm-svn: 112410
-
Chris Lattner authored
llvm-svn: 112409
-
Chris Lattner authored
llvm-svn: 112408
-
Chris Lattner authored
getUniqueExitBlocks instead of getExitBlocks and a manual set to eliminate dupes. llvm-svn: 112405
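For illustration, a minimal before/after sketch of this cleanup (not from the commit); the signatures are assumed from the current LLVM Loop API headers.

    #include "llvm/ADT/SmallPtrSet.h"
    #include "llvm/ADT/SmallVector.h"
    #include "llvm/Analysis/LoopInfo.h"
    #include "llvm/IR/BasicBlock.h"
    using namespace llvm;

    // Before: collect all exit blocks, then de-duplicate by hand.
    static void exitBlocksOld(Loop &L, SmallVectorImpl<BasicBlock *> &Out) {
      SmallVector<BasicBlock *, 8> Exits;
      L.getExitBlocks(Exits);                // may contain duplicates
      SmallPtrSet<BasicBlock *, 8> Seen;
      for (BasicBlock *BB : Exits)
        if (Seen.insert(BB).second)
          Out.push_back(BB);
    }

    // After: let the Loop API return each exit block exactly once.
    static void exitBlocksNew(Loop &L, SmallVectorImpl<BasicBlock *> &Out) {
      L.getUniqueExitBlocks(Out);
    }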
-
Chris Lattner authored
llvm-svn: 112404
-
- Aug 28, 2010
-
Chris Lattner authored
being actively maintained, improved, or extended. llvm-svn: 112356
-
Chris Lattner authored
I'm aware of, aren't maintained, and LVI will be replacing their value. nlewycky approved this on irc. llvm-svn: 112355
-
Chris Lattner authored
llvm-svn: 112351
-
Chris Lattner authored
llvm-svn: 112350
-
Chris Lattner authored
like this:

    struct S { float A, B, C, D; };
    struct S g;
    struct S bar() {
      struct S A = g;
      ++A.B;
      A.A = 42;
      return A;
    }

we now generate:

    _bar:                       ## @bar
    ## BB#0:                    ## %entry
        movq _g@GOTPCREL(%rip), %rax
        movss 12(%rax), %xmm0
        pshufd $16, %xmm0, %xmm0
        movss 4(%rax), %xmm2
        movss 8(%rax), %xmm1
        pshufd $16, %xmm1, %xmm1
        unpcklps %xmm0, %xmm1
        addss LCPI1_0(%rip), %xmm2
        pshufd $16, %xmm2, %xmm2
        movss LCPI1_1(%rip), %xmm0
        pshufd $16, %xmm0, %xmm0
        unpcklps %xmm2, %xmm0
        ret

instead of:

    _bar:                       ## @bar
    ## BB#0:                    ## %entry
        movq _g@GOTPCREL(%rip), %rax
        movss 12(%rax), %xmm0
        pshufd $16, %xmm0, %xmm0
        movss 4(%rax), %xmm2
        movss 8(%rax), %xmm1
        pshufd $16, %xmm1, %xmm1
        unpcklps %xmm0, %xmm1
        addss LCPI1_0(%rip), %xmm2
        movd %xmm2, %eax
        shlq $32, %rax
        addq $1109917696, %rax  ## imm = 0x42280000
        movd %rax, %xmm0
        ret

llvm-svn: 112345
-
Chris Lattner authored
element insertion from the pieces that feed into the vector. This handles a pattern that occurs frequently due to code generated for the x86-64 abi. We now compile something like this:

    struct S { float A, B, C, D; };
    struct S g;
    struct S bar() {
      struct S A = g;
      ++A.A;
      ++A.C;
      return A;
    }

into all nice vector operations:

    _bar:                       ## @bar
    ## BB#0:                    ## %entry
        movq _g@GOTPCREL(%rip), %rax
        movss LCPI1_0(%rip), %xmm1
        movss (%rax), %xmm0
        addss %xmm1, %xmm0
        pshufd $16, %xmm0, %xmm0
        movss 4(%rax), %xmm2
        movss 12(%rax), %xmm3
        pshufd $16, %xmm2, %xmm2
        unpcklps %xmm2, %xmm0
        addss 8(%rax), %xmm1
        pshufd $16, %xmm1, %xmm1
        pshufd $16, %xmm3, %xmm2
        unpcklps %xmm2, %xmm1
        ret

instead of icky integer operations:

    _bar:                       ## @bar
        movq _g@GOTPCREL(%rip), %rax
        movss LCPI1_0(%rip), %xmm1
        movss (%rax), %xmm0
        addss %xmm1, %xmm0
        movd %xmm0, %ecx
        movl 4(%rax), %edx
        movl 12(%rax), %esi
        shlq $32, %rdx
        addq %rcx, %rdx
        movd %rdx, %xmm0
        addss 8(%rax), %xmm1
        movd %xmm1, %eax
        shlq $32, %rsi
        addq %rax, %rsi
        movd %rsi, %xmm1
        ret

This resolves rdar://8360454

llvm-svn: 112343
-
Benjamin Kramer authored
llvm-svn: 112332
-
Owen Anderson authored
Add a prototype of a new peephole optimizing pass that uses LazyValue info to simplify PHIs and selects. This pass addresses the missed optimizations from PR2581 and PR4420. llvm-svn: 112325
-
Chris Lattner authored
    A = shl x, 42
    ...
    B = lshr ..., 38

which can be transformed into:

    A = shl x, 4
    ...

iff we can prove that the would-be-shifted-in bits are already zero. This eliminates two shifts in the testcase and allows elimination of the whole i128 chain in the real example. llvm-svn: 112314
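For illustration, a minimal check of the arithmetic behind this transform (not from the commit; the real example is an i128 chain, but 64-bit logical shifts show the same point): ((x << 42) >> 38) equals (x << 4) exactly when the bits the right shift would pull in, i.e. the top 38 bits of x << 4, are already zero.

    #include <cassert>
    #include <cstdint>

    // The two-shift form from the commit message, on a 64-bit value.
    static uint64_t twoShifts(uint64_t x) { return (x << 42) >> 38; }

    // The narrowed single-shift form it can be rewritten to.
    static uint64_t oneShift(uint64_t x)  { return x << 4; }

    int main() {
      // When the would-be-shifted-in bits (the top 42 bits of x) are zero,
      // the two forms agree.
      for (uint64_t x = 0; x < (1ULL << 22); x += 12345)
        assert(twoShifts(x) == oneShift(x));

      // When they are not zero, the rewrite would be wrong, which is why
      // the proof obligation matters.
      uint64_t big = 1ULL << 23;
      assert(twoShifts(big) != oneShift(big));
      return 0;
    }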
-
Chris Lattner authored
framework, which is good at ripping through bitfield operations. This generalizes a bunch of the existing xforms that instcombine does, such as (x << c) >> c -> and, to handle intermediate logical nodes. This is useful for ripping up the "promote to large integer" code produced by SRoA. llvm-svn: 112304
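For illustration, a minimal check of the (x << c) >> c -> and equivalence mentioned above (not from the commit), using 32-bit logical shifts: shifting left and then logically right by the same amount just clears the top c bits, i.e. it is an AND with an all-ones mask shifted right by c.

    #include <cassert>
    #include <cstdint>

    int main() {
      for (uint32_t c = 0; c < 32; ++c) {
        for (uint32_t x : {0u, 1u, 0x1234u, 0xdeadbeefu, 0xffffffffu}) {
          uint32_t shifted = (x << c) >> c;          // (x << c) >> c ...
          uint32_t masked  = x & (0xffffffffu >> c); // ... is just an AND
          assert(shifted == masked);
        }
      }
      return 0;
    }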
-
- Aug 27, 2010
-
Chris Lattner authored
more general simplify demanded bits logic. llvm-svn: 112291
-
Owen Anderson authored
llvm-svn: 112286
-
Chris Lattner authored
computation can be truncated if it is fed by a sext/zext that doesn't have to be exactly equal to the truncation result type. llvm-svn: 112285
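For illustration, a sketch (not from the commit) of the kind of equivalence this relies on, written with C++ integer types: a computation fed by a zero-extended value and then truncated gives the same low bits as doing the computation at the narrower width, even though the extension's destination width (64 bits here) is not the truncation result width (32 bits).

    #include <cassert>
    #include <cstdint>

    int main() {
      for (uint32_t xi = 0; xi <= 0xFFFF; ++xi) {
        uint16_t x = (uint16_t)xi;
        // zext i16 -> i64, compute with mul/add/shl/xor, then trunc i64 -> i32 ...
        uint64_t wide = (uint64_t)x;
        uint32_t a = (uint32_t)((wide * 0x9E3779B9ull + 7) ^ (wide << 5));
        // ... matches doing the whole computation at i32 (zext i16 -> i32).
        uint32_t narrow = (uint32_t)x;
        uint32_t b = (narrow * 0x9E3779B9u + 7u) ^ (narrow << 5);
        assert(a == b);   // truncation distributes over mul, add, shl, xor
      }
      return 0;
    }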
-
Chris Lattner authored
by the SRoA "promote to large integer" code, eliminating some type conversions like this:

    %94 = zext i16 %93 to i32      ; <i32> [#uses=2]
    %96 = lshr i32 %94, 8          ; <i32> [#uses=1]
    %101 = trunc i32 %96 to i8     ; <i8> [#uses=1]

This also unblocks other xforms from happening; now clang is able to compile:

    struct S { float A, B, C, D; };
    float foo(struct S A) { return A.A + A.B + A.C + A.D; }

into:

    _foo:                       ## @foo
    ## BB#0:                    ## %entry
        pshufd $1, %xmm0, %xmm2
        addss %xmm0, %xmm2
        movdqa %xmm1, %xmm3
        addss %xmm2, %xmm3
        pshufd $1, %xmm1, %xmm0
        addss %xmm3, %xmm0
        ret

on x86-64, instead of:

    _foo:                       ## @foo
    ## BB#0:                    ## %entry
        movd %xmm0, %rax
        shrq $32, %rax
        movd %eax, %xmm2
        addss %xmm0, %xmm2
        movapd %xmm1, %xmm3
        addss %xmm2, %xmm3
        movd %xmm1, %rax
        shrq $32, %rax
        movd %eax, %xmm0
        addss %xmm3, %xmm0
        ret

This seems pretty close to optimal to me, at least without using horizontal adds. This also triggers in lots of other code, including SPEC.

llvm-svn: 112278
-
Owen Anderson authored
Use LVI to eliminate conditional branches where we've tested a related condition previously. Update tests for this change. This fixes PR5652. llvm-svn: 112270
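For illustration, a minimal sketch of the kind of correlated-condition source pattern involved (not from the commit, hypothetical function): once the first branch has established a fact about a value, a later branch on a weaker, related condition is statically decidable on that path and can be folded away.

    #include <cassert>

    // Hypothetical example of a correlated-condition pattern.
    static int clampPositive(int x) {
      if (x > 10) {
        // On this path x > 10 is already known, so the weaker test below is
        // provably true; a value-range analysis can fold the branch away.
        if (x > 0)
          return x;
        return -1;   // unreachable on this path
      }
      return 0;
    }

    int main() {
      assert(clampPositive(42) == 42);
      assert(clampPositive(3) == 0);
      return 0;
    }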
-
Chris Lattner authored
by SRoA. This is part of rdar://7892780, but needs another xform to expose this. llvm-svn: 112232
-
- Aug 26, 2010
-
Chris Lattner authored
is a vector to be a vector element extraction. This allows clang to compile:

    struct S { float A, B, C, D; };
    float foo(struct S A) { return A.A + A.B + A.C + A.D; }

into:

    _foo:                       ## @foo
    ## BB#0:                    ## %entry
        movd %xmm0, %rax
        shrq $32, %rax
        movd %eax, %xmm2
        addss %xmm0, %xmm2
        movapd %xmm1, %xmm3
        addss %xmm2, %xmm3
        movd %xmm1, %rax
        shrq $32, %rax
        movd %eax, %xmm0
        addss %xmm3, %xmm0
        ret

instead of:

    _foo:                       ## @foo
    ## BB#0:                    ## %entry
        movd %xmm0, %rax
        movd %eax, %xmm0
        shrq $32, %rax
        movd %eax, %xmm2
        addss %xmm0, %xmm2
        movd %xmm1, %rax
        movd %eax, %xmm1
        addss %xmm2, %xmm1
        shrq $32, %rax
        movd %eax, %xmm0
        addss %xmm1, %xmm0
        ret

... eliminating half of the horribleness.

llvm-svn: 112227
-
Owen Anderson authored
llvm-svn: 112198
-
Dan Gohman authored
fix: add a flag to MapValue and friends which indicates whether any module-level mappings are being made. In the common case of inlining, no module-level mappings are needed, so MapValue doesn't need to examine non-function-local metadata, which can be very expensive in the case of a large module with really deep metadata (e.g. a large C++ program compiled with -g). This flag is a little awkward; perhaps eventually it can be moved into the ClonedCodeInfo class. llvm-svn: 112190
-
Daniel Dunbar authored
except ...", it is causing *massive* performance regressions when building Clang with itself (-O3 -g). llvm-svn: 112158
-