- Jan 18, 2011
-
-
Cameron Zwarich authored
llvm-svn: 123724
-
- Jan 17, 2011
-
-
Owen Anderson authored
without whatever this was trying to do. When/if someone has the time to do some empirical evaluations, it might be worth it to figure out what this code was trying to do and see if it's worth resurrecting/fixing. llvm-svn: 123684
-
Cameron Zwarich authored
checks enabled: 1) Use '<' to compare integers in a comparison function rather than '<='. 2) Use the uniqued set DefBlocks rather than Info.DefiningBlocks to initialize the priority queue. The speedup of scalarrepl on test-suite + SPEC2000 + SPEC2006 is a bit less, at just under 16% rather than 17%. llvm-svn: 123662
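A minimal standalone illustration of the first fix (my own example, not the scalarrepl/mem2reg source): a std::priority_queue comparator must define a strict weak ordering, and '<=' makes comp(x, x) true, which the expensive/debug STL checks assert on. The Entry layout below is assumed purely for illustration.

#include <cstdio>
#include <queue>
#include <utility>
#include <vector>

// Hypothetical queue element: (dominator-tree level, block id).
using Entry = std::pair<unsigned, unsigned>;

struct ByLevel {
  bool operator()(const Entry &A, const Entry &B) const {
    return A.first < B.first;      // correct: irreflexive strict weak ordering
    // return A.first <= B.first;  // wrong: comp(x, x) == true trips debug checks
  }
};

int main() {
  std::priority_queue<Entry, std::vector<Entry>, ByLevel> PQ;
  PQ.push({3, 7});
  PQ.push({1, 2});
  PQ.push({3, 5});
  while (!PQ.empty()) {
    std::printf("level %u, block %u\n", PQ.top().first, PQ.top().second);
    PQ.pop();
  }
  return 0;
}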
-
Cameron Zwarich authored
llvm-svn: 123618
-
Cameron Zwarich authored
eliminating a potentially quadratic data structure, this also gives a 17% speedup when running -scalarrepl on test-suite + SPEC2000 + SPEC2006. My initial experiment gave a greater speedup of around 25%, but I moved the dominator tree level computation from dominator tree construction to PromoteMemToReg. Since this approach to computing IDFs has a much lower overhead than the old code using precomputed DFs, it is worth looking at using this new code for the second scalarrepl pass as well. llvm-svn: 123609
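For readers unfamiliar with the technique, here is a compact standalone sketch of computing iterated dominance frontiers with a priority queue keyed on dominator-tree level. It shows the general idea only; it is not the PromoteMemToReg code, and the hard-coded diamond CFG, field names, and layout are mine.

#include <cstdio>
#include <queue>
#include <set>
#include <utility>
#include <vector>

struct Block {
  std::vector<int> Succs;        // CFG successors
  std::vector<int> DomChildren;  // children in the dominator tree
  int IDom;                      // immediate dominator (-1 for the entry)
  unsigned Level;                // depth in the dominator tree
};

int main() {
  // Diamond CFG: 0 -> {1, 2}, 1 -> 3, 2 -> 3; the entry dominates everything.
  std::vector<Block> CFG = {
      /*0*/ {{1, 2}, {1, 2, 3}, -1, 0},
      /*1*/ {{3}, {}, 0, 1},
      /*2*/ {{3}, {}, 0, 1},
      /*3*/ {{}, {}, 0, 1},
  };
  std::set<int> DefBlocks = {1, 2};  // blocks that store to the alloca

  // Max-heap on dominator-tree level: process the deepest definitions first.
  using Entry = std::pair<unsigned, int>;  // (level, block)
  std::priority_queue<Entry> PQ;
  for (int B : DefBlocks)
    PQ.push({CFG[B].Level, B});

  std::set<int> InQueue(DefBlocks), IDF;
  while (!PQ.empty()) {
    unsigned RootLevel = PQ.top().first;
    int Root = PQ.top().second;
    PQ.pop();

    // Walk Root's dominator subtree; a CFG edge leaving it toward a block at
    // Root's level or shallower is a dominance-frontier edge.
    std::vector<int> Worklist = {Root};
    std::set<int> Visited = {Root};
    while (!Worklist.empty()) {
      int BB = Worklist.back();
      Worklist.pop_back();
      for (int Succ : CFG[BB].Succs) {
        if (CFG[Succ].IDom == BB)
          continue;  // a dominator-tree edge is never a frontier edge
        if (CFG[Succ].Level > RootLevel)
          continue;  // Succ is still dominated by Root
        if (!IDF.insert(Succ).second)
          continue;  // already known to need a PHI
        if (InQueue.insert(Succ).second)
          PQ.push({CFG[Succ].Level, Succ});  // iterate: the PHI is a new def
      }
      for (int Child : CFG[BB].DomChildren)
        if (Visited.insert(Child).second)
          Worklist.push_back(Child);
    }
  }

  for (int B : IDF)
    std::printf("PHI needed in block %d\n", B);  // prints: block 3
  return 0;
}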
-
- Jan 16, 2011
-
-
Anders Carlsson authored
Teach DAE to look for functions whose arguments are unused, and change all callers to pass in an undef value instead. llvm-svn: 123596
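A rough sketch of the transformation being described (not the in-tree DeadArgumentElimination code; it is written against the current LLVM C++ API, whose headers and signatures differ from the 2011 tree, and the helper name is mine):

#include "llvm/IR/Constants.h"
#include "llvm/IR/Function.h"
#include "llvm/IR/Instructions.h"

using namespace llvm;

// If an argument is never read inside F, every direct caller can pass undef
// in that position, letting later passes delete the computation of the value.
static bool passUndefForUnusedArgs(Function &F) {
  bool Changed = false;
  for (Argument &Arg : F.args()) {
    if (!Arg.use_empty())
      continue;  // the callee actually uses this argument
    unsigned ArgNo = Arg.getArgNo();
    for (User *U : F.users()) {
      auto *CB = dyn_cast<CallBase>(U);
      if (!CB || CB->getCalledFunction() != &F)
        continue;  // not a direct call to F
      Value *Old = CB->getArgOperand(ArgNo);
      CB->setArgOperand(ArgNo, UndefValue::get(Old->getType()));
      Changed = true;
    }
  }
  return Changed;
}

The real pass is considerably more careful (varargs, indirect uses of F, attribute updates, and so on); this only shows the core rewrite.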
-
Chris Lattner authored
llvm-svn: 123590
-
Rafael Espindola authored
This fixes the original testcase in PR8927. It also causes a clang binary built with a patched clang to increase in size by 0.21%. We can probably get some of the size back by writing a pass that detects that a global never has its pointer compared and adds unnamed_addr to it (maybe extend global opt). It is also possible that there are some other cases clang could add unnamed_addr to. I will investigate extending globalopt next. llvm-svn: 123584
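As a hedged illustration of the kind of pass being proposed here (my own, deliberately over-conservative sketch against the current LLVM C++ API, not anything in the tree): if a global's address is only ever used directly as the pointer operand of loads and stores, its address is never compared or escaped, so marking it unnamed_addr is safe.

#include "llvm/IR/GlobalVariable.h"
#include "llvm/IR/Instructions.h"

using namespace llvm;

// Hypothetical helper: returns true if it marked GV unnamed_addr.
static bool maybeAddUnnamedAddr(GlobalVariable &GV) {
  for (User *U : GV.users()) {
    if (isa<LoadInst>(U))
      continue;  // loading through the global is fine
    if (auto *SI = dyn_cast<StoreInst>(U))
      if (SI->getPointerOperand() == &GV && SI->getValueOperand() != &GV)
        continue;  // storing through it is fine; storing its address is not
    return false;  // any other use (compare, GEP, cast, call, ...) gives up
  }
  GV.setUnnamedAddr(GlobalValue::UnnamedAddr::Global);
  return true;
}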
-
Chris Lattner authored
llvm-svn: 123574
-
Chris Lattner authored
llvm-svn: 123573
-
Chris Lattner authored
then don't try to decimate it into its individual pieces. This will just make a mess of the IR and is pointless if none of the elements are individually accessed. This was generating really terrible code for std::bitset (PR8980) because it happens to be lowered by clang as an {[8 x i8]} structure instead of {i64}.

The testcase now is optimized to:

define i64 @test2(i64 %X) {
  br label %L2
L2:                                               ; preds = %0
  ret i64 %X
}

before we generated:

define i64 @test2(i64 %X) {
  %sroa.store.elt = lshr i64 %X, 56
  %1 = trunc i64 %sroa.store.elt to i8
  %sroa.store.elt8 = lshr i64 %X, 48
  %2 = trunc i64 %sroa.store.elt8 to i8
  %sroa.store.elt9 = lshr i64 %X, 40
  %3 = trunc i64 %sroa.store.elt9 to i8
  %sroa.store.elt10 = lshr i64 %X, 32
  %4 = trunc i64 %sroa.store.elt10 to i8
  %sroa.store.elt11 = lshr i64 %X, 24
  %5 = trunc i64 %sroa.store.elt11 to i8
  %sroa.store.elt12 = lshr i64 %X, 16
  %6 = trunc i64 %sroa.store.elt12 to i8
  %sroa.store.elt13 = lshr i64 %X, 8
  %7 = trunc i64 %sroa.store.elt13 to i8
  %8 = trunc i64 %X to i8
  br label %L2
L2:                                               ; preds = %0
  %9 = zext i8 %1 to i64
  %10 = shl i64 %9, 56
  %11 = zext i8 %2 to i64
  %12 = shl i64 %11, 48
  %13 = or i64 %12, %10
  %14 = zext i8 %3 to i64
  %15 = shl i64 %14, 40
  %16 = or i64 %15, %13
  %17 = zext i8 %4 to i64
  %18 = shl i64 %17, 32
  %19 = or i64 %18, %16
  %20 = zext i8 %5 to i64
  %21 = shl i64 %20, 24
  %22 = or i64 %21, %19
  %23 = zext i8 %6 to i64
  %24 = shl i64 %23, 16
  %25 = or i64 %24, %22
  %26 = zext i8 %7 to i64
  %27 = shl i64 %26, 8
  %28 = or i64 %27, %25
  %29 = zext i8 %8 to i64
  %30 = or i64 %29, %28
  ret i64 %30
}

In this case, instcombine was able to eliminate the nonsense, but in PR8980 enough PHIs are in play that instcombine backs off. It's better to not generate this stuff in the first place.

llvm-svn: 123571
-
Chris Lattner authored
of a constant. llvm-svn: 123570
-
Chris Lattner authored
llvm-svn: 123569
-
Chris Lattner authored
multiple uses. In some cases, all the uses are the same operation, so instcombine can go ahead and promote the phi. In the testcase this pushes an add out of the loop. llvm-svn: 123568
-
Chris Lattner authored
first line of the function because it isn't a good idea, even for compares. llvm-svn: 123566
-
Chris Lattner authored
llvm-svn: 123565
-
Chris Lattner authored
llvm-svn: 123564
-
Owen Anderson authored
of the stored value to the new store type is always. Also, add a testcase. llvm-svn: 123563
-
Chris Lattner authored
llvm-svn: 123558
-
Chris Lattner authored
llvm-svn: 123554
-
- Jan 15, 2011
-
-
Nick Lewycky authored
llvm-svn: 123543
-
Nick Lewycky authored
opportunities. Fixes PR8978. llvm-svn: 123541
-
Benjamin Kramer authored
llvm-svn: 123537
-
Nick Lewycky authored
Also, replace tabs with spaces. Yes, it's 2011. llvm-svn: 123535
-
Chris Lattner authored
realize that ConstantFoldTerminator doesn't preserve dominfo. llvm-svn: 123527
-
Chris Lattner authored
rdar://8785296: The basic issue is that isel (very reasonably!) expects conditional branches to be folded, so CGP leaving around a bunch of dead computation feeding conditional branches isn't such a good idea. Just fold branches on constants into unconditional branches. llvm-svn: 123526
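A rough sketch of that fold in isolation (the in-tree change may well route through ConstantFoldTerminator instead; this standalone helper uses the current LLVM C++ API and a function name of my own):

#include "llvm/IR/BasicBlock.h"
#include "llvm/IR/Constants.h"
#include "llvm/IR/Instructions.h"

using namespace llvm;

// Turn 'br i1 <constant>, label %T, label %F' into an unconditional branch so
// that isel never sees an obviously foldable conditional branch.
static bool foldBranchOnConstant(BranchInst *BI) {
  if (!BI->isConditional())
    return false;
  auto *Cond = dyn_cast<ConstantInt>(BI->getCondition());
  if (!Cond)
    return false;
  BasicBlock *Taken = BI->getSuccessor(Cond->isOne() ? 0 : 1);
  BasicBlock *Dead = BI->getSuccessor(Cond->isOne() ? 1 : 0);
  // One outgoing edge disappears; fix up PHI nodes in its target.
  Dead->removePredecessor(BI->getParent());
  BranchInst::Create(Taken, BI);  // unconditional branch, inserted before BI
  BI->eraseFromParent();
  return true;
}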
-
Chris Lattner authored
llvm-svn: 123525
-
Chris Lattner authored
have objectsize folding recursively simplify away their result when it folds. It is important to catch this here, because otherwise we won't eliminate the cross-block values at isel and other times. llvm-svn: 123524
-
Chris Lattner authored
potentially invalidate it (like inline asm lowering) to be sunk into their proper place, cleaning up a ton of code. llvm-svn: 123523
-
Chris Lattner authored
This fixes rdar://8808586 which observed that we used to compile:

union xy {
  struct x { _Bool b[15]; } x;
  __attribute__((packed)) struct y {
    __attribute__((packed)) unsigned long  b0to7;
    __attribute__((packed)) unsigned int   b8to11;
    __attribute__((packed)) unsigned short b12to13;
    __attribute__((packed)) unsigned char  b14;
  } y;
};

struct x foo(union xy *xy) {
  return xy->x;
}

into:

_foo:                                   ## @foo
        movq    (%rdi), %rax
        movabsq $1095216660480, %rcx    ## imm = 0xFF00000000
        andq    %rax, %rcx
        movabsq $-72057594037927936, %rdx ## imm = 0xFF00000000000000
        andq    %rax, %rdx
        movzbl  %al, %esi
        orq     %rdx, %rsi
        movq    %rax, %rdx
        andq    $65280, %rdx            ## imm = 0xFF00
        orq     %rsi, %rdx
        movq    %rax, %rsi
        andq    $16711680, %rsi         ## imm = 0xFF0000
        orq     %rdx, %rsi
        movl    %eax, %edx
        andl    $-16777216, %edx        ## imm = 0xFFFFFFFFFF000000
        orq     %rsi, %rdx
        orq     %rcx, %rdx
        movabsq $280375465082880, %rcx  ## imm = 0xFF0000000000
        movq    %rax, %rsi
        andq    %rcx, %rsi
        orq     %rdx, %rsi
        movabsq $71776119061217280, %r8 ## imm = 0xFF000000000000
        andq    %r8, %rax
        orq     %rsi, %rax
        movzwl  12(%rdi), %edx
        movzbl  14(%rdi), %esi
        shlq    $16, %rsi
        orl     %edx, %esi
        movq    %rsi, %r9
        shlq    $32, %r9
        movl    8(%rdi), %edx
        orq     %r9, %rdx
        andq    %rdx, %rcx
        movzbl  %sil, %esi
        shlq    $32, %rsi
        orq     %rcx, %rsi
        movl    %edx, %ecx
        andl    $-16777216, %ecx        ## imm = 0xFFFFFFFFFF000000
        orq     %rsi, %rcx
        movq    %rdx, %rsi
        andq    $16711680, %rsi         ## imm = 0xFF0000
        orq     %rcx, %rsi
        movq    %rdx, %rcx
        andq    $65280, %rcx            ## imm = 0xFF00
        orq     %rsi, %rcx
        movzbl  %dl, %esi
        orq     %rcx, %rsi
        andq    %r8, %rdx
        orq     %rsi, %rdx
        ret

We now compile this into:

_foo:                                   ## @foo
## BB#0:                                ## %entry
        movzwl  12(%rdi), %eax
        movzbl  14(%rdi), %ecx
        shlq    $16, %rcx
        orl     %eax, %ecx
        shlq    $32, %rcx
        movl    8(%rdi), %edx
        orq     %rcx, %rdx
        movq    (%rdi), %rax
        ret

A small improvement :-)

llvm-svn: 123520
-
Chris Lattner authored
no functionality change currently. llvm-svn: 123517
-
Chris Lattner authored
llvm-svn: 123516
-
Chris Lattner authored
means that are about to disappear. llvm-svn: 123515
-
Chris Lattner authored
llvm-svn: 123514
-
Chris Lattner authored
to use it. llvm-svn: 123501
-
- Jan 14, 2011
-
-
Owen Anderson authored
llvm-svn: 123480
-
Owen Anderson authored
bitcasts, at least in simple cases. This fixes clang's CodeGenCXX/virtual-base-dtor.cpp llvm-svn: 123477
-
Chris Lattner authored
llvm-svn: 123457
-
Chris Lattner authored
"promote a bunch of load and stores" logic, allowing the code to be shared and reused. llvm-svn: 123456
-
Chris Lattner authored
and one that uses SSAUpdater (-scalarrepl-ssa) llvm-svn: 123436
-