Commits · a834008959d8849c0bd5e13551859bb2c0fb5fd9 · Roger Ferrer / llvm-epi

Dec 29, 2010
- test/Transforms/ConstProp/logicaltest.ll: FileCheck-ize. · a8340089
  NAKAMURA Takumi authored Dec 29, 2010
```
llvm-svn: 122620
```
  a8340089
Dec 27, 2010

implement enough of the memset inference algorithm to recognize and insert · 29e14edc

Chris Lattner authored Dec 26, 2010

memsets.  This is still missing one important validity check, but this is enough
to compile stuff like this:

void test0(std::vector<char> &X) {
  for (std::vector<char>::iterator I = X.begin(), E = X.end(); I != E; ++I)
    *I = 0;
}

void test1(std::vector<int> &X) {
  for (long i = 0, e = X.size(); i != e; ++i)
    X[i] = 0x01010101;
}

With:
 $ clang t.cpp -S -o - -O2 -emit-llvm | opt -loop-idiom | opt -O3 | llc 

to:

__Z5test0RSt6vectorIcSaIcEE:            ## @_Z5test0RSt6vectorIcSaIcEE
## BB#0:                                ## %entry
	subq	$8, %rsp
	movq	(%rdi), %rax
	movq	8(%rdi), %rsi
	cmpq	%rsi, %rax
	je	LBB0_2
## BB#1:                                ## %bb.nph
	subq	%rax, %rsi
	movq	%rax, %rdi
	callq	___bzero
LBB0_2:                                 ## %for.end
	addq	$8, %rsp
	ret
...
__Z5test1RSt6vectorIiSaIiEE:            ## @_Z5test1RSt6vectorIiSaIiEE
## BB#0:                                ## %entry
	subq	$8, %rsp
	movq	(%rdi), %rax
	movq	8(%rdi), %rdx
	subq	%rax, %rdx
	cmpq	$4, %rdx
	jb	LBB1_2
## BB#1:                                ## %for.body.preheader
	andq	$-4, %rdx
	movl	$1, %esi
	movq	%rax, %rdi
	callq	_memset
LBB1_2:                                 ## %for.end
	addq	$8, %rsp
	ret

llvm-svn: 122573

29e14edc
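
The `test1` loop in 29e14edc can become a `memset` because every byte of `0x01010101` is the same value. The sketch below checks that equivalence by hand; it assumes a 32-bit `int`, and `fill_loop`, `fill_memset`, and `same_result` are hypothetical names for illustration, not part of LLVM.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

static_assert(sizeof(int) == 4, "sketch assumes a 32-bit int");

// Store loop in the shape of test1 from the commit message.
void fill_loop(std::vector<int> &X) {
  for (long i = 0, e = X.size(); i != e; ++i)
    X[i] = 0x01010101;
}

// Hand-written equivalent: a single memset with the repeated byte 0x01.
void fill_memset(std::vector<int> &X) {
  if (!X.empty())
    std::memset(X.data(), 1, X.size() * sizeof(int));
}

// Hypothetical helper: true when both versions leave identical contents.
bool same_result(std::size_t n) {
  std::vector<int> a(n, -1), b(n, -1);
  fill_loop(a);
  fill_memset(b);
  return a == b;
}
```

A stored value whose bytes differ, such as `0x01020304`, has no single-byte `memset` equivalent.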

Dec 26, 2010
- start using irbuilder to make mem intrinsics in a few passes. · 6cf8d6cc
  Chris Lattner authored Dec 26, 2010
```
llvm-svn: 122572
```
  6cf8d6cc
Dec 24, 2010

MemCpyOpt: Turn memcpys from a constant into a memset if possible. · ea9152e5

Benjamin Kramer authored Dec 24, 2010

This allows us to compile "int cst[] = {-1, -1, -1};" into
  movl  $-1, 16(%rsp)
  movq  $-1, 8(%rsp)
instead of
  movl  _cst+8(%rip), %eax
  movl  %eax, 16(%rsp)
  movq  _cst(%rip), %rax
  movq  %rax, 8(%rsp)

llvm-svn: 122548

ea9152e5
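
The rewrite in ea9152e5 is valid because every byte of the constant is `0xFF`, so copying from it is indistinguishable from a byte fill. A minimal sketch of that equivalence, assuming two's-complement `int`; `memcpy_matches_memset` is a hypothetical helper name.

```cpp
#include <cassert>
#include <cstring>

// Constant initializer from the commit message: every byte is 0xFF.
static const int cst[] = {-1, -1, -1};

// Hypothetical helper: compares a copy of cst with a 0xFF byte fill.
bool memcpy_matches_memset() {
  int a[3];
  int b[3];
  std::memcpy(a, cst, sizeof cst);   // what the source code asks for
  std::memset(b, 0xFF, sizeof b);    // what the optimizer can emit instead
  return std::memcmp(a, b, sizeof a) == 0;
}
```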

When determining if we can fold (x >> C1) << C2, the bits that we need to verify are zero · 226ac14a
Owen Anderson authored Dec 23, 2010
```
are not the low bits of x, but the bits that WILL be the low bits after the operation completes.

llvm-svn: 122529
```
226ac14a
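
One way to read 226ac14a is through the general identity for logical shifts with `C2 <= C1`: `(x >> C1) << C2` equals `(x >> (C1 - C2))` with the low `C2` result bits masked off. The mask can be dropped only if those result bits are already zero. The sketch below illustrates that arithmetic on 32-bit values; it is not the InstCombine code, and all function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// The original shift pair. Requires c1 < 32 and c2 < 32.
constexpr std::uint32_t shift_pair(std::uint32_t x, unsigned c1, unsigned c2) {
  return (x >> c1) << c2;
}

// Equivalent form for c2 <= c1 < 32: one shift plus a mask that clears
// the low c2 bits of the result.
constexpr std::uint32_t shift_and_mask(std::uint32_t x, unsigned c1, unsigned c2) {
  return (x >> (c1 - c2)) & (~std::uint32_t{0} << c2);
}

// True when the mask is redundant, i.e. the bits that end up as the low
// c2 bits after the single shift are already zero.
constexpr bool mask_is_redundant(std::uint32_t x, unsigned c1, unsigned c2) {
  return ((x >> (c1 - c2)) & ~(~std::uint32_t{0} << c2)) == 0;
}
```

For `x = 0xFFFFFFE0`, `C1 = 8`, `C2 = 3`, the low five bits of `x` are zero, yet the mask is still needed, because bits 5 through 7 of `x` land in the low three bits of the result.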

Dec 23, 2010
- InstCombine: creating selects from -1 and 0 is fine, they combine into a sext from i1. · 8ef5001b
  Benjamin Kramer authored Dec 22, 2010
```
llvm-svn: 122453
```
  8ef5001b
Dec 22, 2010

When determining whether the new instruction was already present in · a45cfbd4

Duncan Sands authored Dec 22, 2010

the original instruction, half the cases were missed (making it not
wrong but suboptimal).  Also correct a typo (A <-> B) in the second
chunk. 

llvm-svn: 122414

a45cfbd4

Make this test not depend on how the variable is named. · bab9456f
Duncan Sands authored Dec 22, 2010
```
llvm-svn: 122413
```
bab9456f

Add a generic expansion transform: A op (B op' C) -> (A op B) op' (A op C) · fbb9ac3c

Duncan Sands authored Dec 22, 2010

if both A op B and A op C simplify.  This fires fairly often but doesn't
make that much difference.  On gcc-as-one-file it removes two "and"s and
turns one branch into a select.

llvm-svn: 122399

fbb9ac3c
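
As a concrete instance of the expansion in fbb9ac3c, take `op` as `&` and `op'` as `|`. When `B` is `~A`, the term `A & B` simplifies to zero and the whole expression collapses to `A & C`. The sketch below only checks the algebra; `original` and `expanded` are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>

// A op (B op' C) with op = & and op' = |.
constexpr std::uint32_t original(std::uint32_t a, std::uint32_t b, std::uint32_t c) {
  return a & (b | c);
}

// (A op B) op' (A op C): the expanded form.
constexpr std::uint32_t expanded(std::uint32_t a, std::uint32_t b, std::uint32_t c) {
  return (a & b) | (a & c);
}
```

The expansion pays off only when both inner terms simplify, which is the condition the commit message states.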

Give GVN back the ability to perform simple conditional propagation on conditional branch values. · 5ab8d4b5

Owen Anderson authored Dec 21, 2010

I still think that LVI should be handling this, but that capability is some ways off in the future,
and this matters for some significant benchmarks.

llvm-svn: 122378

5ab8d4b5

Dec 21, 2010

Add an additional InstructionSimplify factorization test. · 76befde9
Duncan Sands authored Dec 21, 2010
```
llvm-svn: 122333
```
76befde9

While I don't think any later transforms can fire, it seems cleaner to · fecc6422

Duncan Sands authored Dec 21, 2010

not assume this (for example in case more transforms get added below
it).  Suggested by Frits van Bommel.

llvm-svn: 122332

fecc6422

Fix typo in comment, spotted by Deewiant. · 07c17132
Duncan Sands authored Dec 21, 2010
```
llvm-svn: 122329
```
07c17132

Teach InstructionSimplify about distributive laws. These transforms fire · ee3ec6eb

Duncan Sands authored Dec 21, 2010

quite often, but don't make much difference in practice presumably because
instcombine also knows them and more.

llvm-svn: 122328

ee3ec6eb

Add generic simplification of associative operations, generalizing · 6c7a52cf

Duncan Sands authored Dec 21, 2010

a couple of existing transforms.  This fires surprisingly often, for
example when compiling gcc "(X+(-1))+1->X" fires quite a lot as well
as various "and" simplifications (usually with a phi node operand).
Most of the time this doesn't make a real difference since the same
thing would have been done elsewhere anyway, e.g. by instcombine, but
there are a few places where this results in simplifications that we
were not doing before.

llvm-svn: 122326

6c7a52cf
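
The `(X+(-1))+1->X` case mentioned in 6c7a52cf is plain reassociation: the two constants combine to zero. A small sketch using wrapping 32-bit unsigned arithmetic, with a matching example for `and`; the function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// (X + (-1)) + 1, written with wrapping unsigned arithmetic.
constexpr std::uint32_t add_then_cancel(std::uint32_t x) {
  return (x + 0xFFFFFFFFu) + 1u;
}

// (X & M1) & M2 as written in the source.
constexpr std::uint32_t and_twice(std::uint32_t x, std::uint32_t m1, std::uint32_t m2) {
  return (x & m1) & m2;
}

// X & (M1 & M2): the reassociated form, where M1 & M2 can fold.
constexpr std::uint32_t and_once(std::uint32_t x, std::uint32_t m1, std::uint32_t m2) {
  return x & (m1 & m2);
}
```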

Dec 20, 2010
- Teach InstCombine to merge (icmp ult (X + CA), C1) | (icmp eq X, C2) into... · 68531bae
  Benjamin Kramer authored Dec 20, 2010
```
Teach InstCombine to merge (icmp ult (X + CA), C1) | (icmp eq X, C2) into (icmp ult (X + CA), C1 + 1) if C2 + CA == C1.

InstCombine creates these so now we compile x == 23 || x == 24 || x == 25 to
  %x.off = add i32 %x, -23
  %1 = icmp ult i32 %x.off, 3
instead of
  %x.off = add i32 %x, -23
  %1 = icmp ult i32 %x.off, 2
  %cmp3 = icmp eq i32 %x, 25
  %ret2 = or i1 %1, %cmp3

llvm-svn: 122248
```
  68531bae
- Have SimplifyBinOp dispatch Xor, Add and Sub to the corresponding methods · ed6d6c33
  Duncan Sands authored Dec 20, 2010
```
(they had just been forgotten before).  Adding Xor causes "main" in the
existing testcase 2010-11-01-lshr-mask.ll to be hugely more simplified.

llvm-svn: 122245
```
  ed6d6c33
- fix PR8807 by making transformConstExprCastCall aware of byval arguments. · 27ca8ebd
  Chris Lattner authored Dec 20, 2010
```
llvm-svn: 122238
```
  27ca8ebd
- when eliding a byval copy due to inlining a readonly function, we have · 0f114952
  Chris Lattner authored Dec 20, 2010
```
to make sure that the reused alloca has sufficient alignment.

llvm-svn: 122236
```
  0f114952
- pull byval processing out to its own helper function. · 00997445
  Chris Lattner authored Dec 20, 2010
```
llvm-svn: 122235
```
  00997445
- fix PR8769, a miscompilation by inliner when inlining a function with a byval · 7394680a
  Chris Lattner authored Dec 20, 2010
```
argument.  The generated alloca has to have at least the alignment of the
byval, if not, the client may be making assumptions that the new alloca won't
satisfy.

llvm-svn: 122234
```
  7394680a
- merge two tests. · a9a5c59d
  Chris Lattner authored Dec 20, 2010
```
llvm-svn: 122233
```
  a9a5c59d
- filecheckize · 6f3ddbd5
  Chris Lattner authored Dec 20, 2010
```
llvm-svn: 122232
```
  6f3ddbd5
- Test case for r122215 when InstCombine optimizes memset · 7bcead02
  Mon P Wang authored Dec 20, 2010
```
llvm-svn: 122216
```
  7bcead02
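
The range merge from 68531bae in the list above can be checked by hand: with `CA = -23`, `C1 = 2`, and `C2 = 25`, the condition `C2 + CA == C1` holds, so the extra equality test folds into the unsigned range check. The sketch below compares the three forms; the function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Source-level condition from the commit message.
bool chain(std::uint32_t x) { return x == 23 || x == 24 || x == 25; }

// Intermediate form: range check for {23, 24} plus a separate test for 25.
bool partial(std::uint32_t x) { return (x - 23u) < 2u || x == 25; }

// Merged form: one wrapping subtract and one unsigned compare.
bool merged(std::uint32_t x) { return (x - 23u) < 3u; }

// Hypothetical helper: true when all three forms agree on [lo, hi).
bool agree_on_range(std::uint32_t lo, std::uint32_t hi) {
  for (std::uint32_t x = lo; x < hi; ++x)
    if (chain(x) != partial(x) || chain(x) != merged(x))
      return false;
  return true;
}
```

Values below 23 wrap to large unsigned numbers after the subtraction, so they fail the `< 3` test as required.
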
Dec 19, 2010

X86 supports i8/i16 overflow ops (except i8 multiplies), we should · 1e8c032a

Chris Lattner authored Dec 19, 2010

generate them.  

Now we compile:

define zeroext i8 @X(i8 signext %a, i8 signext %b) nounwind ssp {
entry:
  %0 = tail call %0 @llvm.sadd.with.overflow.i8(i8 %a, i8 %b)
  %cmp = extractvalue %0 %0, 1
  br i1 %cmp, label %if.then, label %if.end

into:

_X:                                     ## @X
## BB#0:                                ## %entry
	subl	$12, %esp
	movb	16(%esp), %al
	addb	20(%esp), %al
	jo	LBB0_2

Before we were generating:

_X:                                     ## @X
## BB#0:                                ## %entry
	pushl	%ebp
	movl	%esp, %ebp
	subl	$8, %esp
	movb	12(%ebp), %al
	testb	%al, %al
	setge	%cl
	movb	8(%ebp), %dl
	testb	%dl, %dl
	setge	%ah
	cmpb	%cl, %ah
	sete	%cl
	addb	%al, %dl
	testb	%dl, %dl
	setge	%al
	cmpb	%al, %ah
	setne	%al
	andb	%cl, %al
	testb	%al, %al
	jne	LBB0_2

llvm-svn: 122186

1e8c032a

recognize an unsigned add with overflow idiom into uadd. · 5e0c0c72

Chris Lattner authored Dec 19, 2010

This resolves a README entry and technically resolves PR4916,
but we still get poor code for the testcase in that PR because
GVN isn't CSE'ing uadd with add, filed as PR8817.

Previously we got:

_test7:                                 ## @test7
	addq	%rsi, %rdi
	cmpq	%rdi, %rsi
	movl	$42, %eax
	cmovaq	%rsi, %rax
	ret

Now we get:

_test7:                                 ## @test7
	addq	%rsi, %rdi
	movl	$42, %eax
	cmovbq	%rsi, %rax
	ret

llvm-svn: 122182

5e0c0c72
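
A common source-level form of the unsigned add-with-overflow idiom is to compare the wrapped sum against one of the operands. The sketch below checks that form against a widened reference computation; it is an illustration of the idiom, not the pattern matcher from 5e0c0c72, and the names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Idiom: the 32-bit sum wrapped if and only if it is smaller than an operand.
bool overflows_idiom(std::uint32_t a, std::uint32_t b) {
  return static_cast<std::uint32_t>(a + b) < a;
}

// Reference: do the add in 64 bits and test whether it fits in 32.
bool overflows_reference(std::uint32_t a, std::uint32_t b) {
  std::uint64_t wide = static_cast<std::uint64_t>(a) + b;
  return wide > 0xFFFFFFFFull;
}
```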

optimize uadd(x, cst) into a comparison when the normal · 33dc3f0c
Chris Lattner authored Dec 19, 2010
```
result is dead.  This is required for my next patch to not
regress the testsuite.

llvm-svn: 122181
```
33dc3f0c

generalize the sadd creation code to not require that the · 79874566

Chris Lattner authored Dec 19, 2010

sadd formed is half the size of the original type. We can
now compile this into a sadd.i8:

unsigned char X(char a, char b) {
  int res = a+b;
  if ((unsigned )(res+128) > 255U)
    abort();
  return res;
}

llvm-svn: 122178

79874566
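
The range check in the function above is a signed 8-bit overflow test in disguise: adding 128 maps the valid results `[-128, 127]` onto `[0, 255]`, and anything outside becomes a larger unsigned value. The sketch below verifies this over all 8-bit inputs; it uses `signed char` explicitly and hypothetical function names.

```cpp
#include <cassert>

// The idiom from the commit message, returning the abort() condition.
bool range_check_idiom(signed char a, signed char b) {
  int res = a + b;
  return static_cast<unsigned>(res + 128) > 255U;
}

// Direct statement of "the sum does not fit in a signed 8-bit value".
bool overflows_i8(signed char a, signed char b) {
  int res = a + b;
  return res < -128 || res > 127;
}

// Hypothetical helper: exhaustively compares both predicates.
bool agree_everywhere() {
  for (int a = -128; a <= 127; ++a)
    for (int b = -128; b <= 127; ++b)
      if (range_check_idiom(static_cast<signed char>(a), static_cast<signed char>(b)) !=
          overflows_i8(static_cast<signed char>(a), static_cast<signed char>(b)))
        return false;
  return true;
}
```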

fix another miscompile in the llvm.sadd formation logic: it wasn't · c56c8453

Chris Lattner authored Dec 19, 2010

checking to see if the high bits of the original add result were dead.
Inserting a smaller add and zexting back to that size is not good enough.

This is likely to be the fix for PR8816.

llvm-svn: 122177

c56c8453

fix a bug (possibly 8816) in the sadd forming xform: it isn't · f29562db
Chris Lattner authored Dec 19, 2010
```
profitable (or safe) to promote code when the add-with-constant
has other uses.

llvm-svn: 122175
```
f29562db
Enhance LICM to promote alias sets whose pointers themselves are stored, · 408a684d
Chris Lattner authored Dec 19, 2010
```
which doesn't affect the memory address being promoted.

llvm-svn: 122172
```
408a684d
fix PR8602, a bug in an assertion: a volatile store *of* a pointer · 3337a814
Chris Lattner authored Dec 19, 2010
```
does not make the alias set for that pointer volatile, just stores
*to* the pointer.

llvm-svn: 122171
```
3337a814
revert r122164, I'm going to go with a different approach. · fb888622
Chris Lattner authored Dec 19, 2010
```
llvm-svn: 122168
```
fb888622

first step to fixing PR8642: don't fold away empty basic blocks · 583ec6fa

Chris Lattner authored Dec 19, 2010

which have trapping constant exprs in them due to PHI nodes.
Eliminating them can cause the constant expr to be evaluated
on new paths if the input edges are critical.

llvm-svn: 122164

583ec6fa

move this test into the ARM test so that it is only run when the arm backend · 2b43f2df
Chris Lattner authored Dec 19, 2010
```
is enabled.

llvm-svn: 122163
```
2b43f2df

Dec 18, 2010

Add vector versions of some existing scalar transforms to aid codegen in... · 7aa18bf4

Nate Begeman authored Dec 17, 2010

Add vector versions of some existing scalar transforms to aid codegen in matching psign & pblend operations to the IR produced by clang/gcc for their C idioms.

llvm-svn: 122105

7aa18bf4

Dec 17, 2010

Reapply r121905 (automatic synthesis of @llvm.sadd.with.overflow) with a fix... · 1294ea7d

Owen Anderson authored Dec 17, 2010

Reapply r121905 (automatic synthesis of @llvm.sadd.with.overflow) with a fix for a bug that manifested itself
on the DragonEgg self-host bot. Unfortunately, the testcase is pretty messy and doesn't reduce well due to
interactions with other parts of InstCombine.

llvm-svn: 122072

1294ea7d

SimplifyCFG: Ranges can be larger than 64 bits. Fixes Release-selfhost build. · e5f49c4f
Benjamin Kramer authored Dec 17, 2010
```
llvm-svn: 122054
```
e5f49c4f

improve switch formation to handle small range · d14b0f1d

Chris Lattner authored Dec 17, 2010

comparisons formed by comparisons.  For example,
this:

void foo(unsigned x) {
  if (x == 0 || x == 1 || x == 3 || x == 4 || x == 6) 
    bar();
}

compiles into:

_foo:                                   ## @foo
## BB#0:                                ## %entry
	cmpl	$6, %edi
	ja	LBB0_2
## BB#1:                                ## %entry
	movl	%edi, %eax
	movl	$91, %ecx
	btq	%rax, %rcx
	jb	LBB0_3

instead of:

_foo:                                   ## @foo
## BB#0:                                ## %entry
	cmpl	$2, %edi
	jb	LBB0_4
## BB#1:                                ## %switch.early.test
	cmpl	$6, %edi
	ja	LBB0_3
## BB#2:                                ## %switch.early.test
	movl	%edi, %eax
	movl	$88, %ecx
	btq	%rax, %rcx
	jb	LBB0_4

This catches a bunch of cases in GCC, which look like this:

 %804 = load i32* @which_alternative, align 4, !tbaa !0
 %805 = icmp ult i32 %804, 2
 %806 = icmp eq i32 %804, 3
 %or.cond121 = or i1 %805, %806
 %807 = icmp eq i32 %804, 4
 %or.cond124 = or i1 %or.cond121, %807
 br i1 %or.cond124, label %.thread, label %808

turning this into a range comparison.

llvm-svn: 122045

d14b0f1d
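
The constant `91` in the new code for d14b0f1d is a bitmask: `91 = 0b1011011`, with bits 0, 1, 3, 4, and 6 set, one per case value. The old code used `88 = 0b1011000` (bits 3, 4, 6) after a separate `x < 2` test. The sketch below checks the single-mask form against the source condition; the function names are hypothetical.

```cpp
#include <cassert>

// Source-level condition from foo() in the commit message.
bool in_set_chain(unsigned x) {
  return x == 0 || x == 1 || x == 3 || x == 4 || x == 6;
}

// Range check followed by a bit test against the mask 91.
bool in_set_bittest(unsigned x) {
  return x <= 6 && ((91u >> x) & 1u) != 0;
}

// Hypothetical helper: true when both forms agree on [0, limit).
bool agree_up_to(unsigned limit) {
  for (unsigned x = 0; x < limit; ++x)
    if (in_set_chain(x) != in_set_bittest(x))
      return false;
  return true;
}
```

The `x <= 6` guard corresponds to the `cmpl $6` / `ja` pair and keeps the shift amount in range.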

Revert r64460. strtol and friends cannot be marked readonly, even with · 93dc2b80

Dan Gohman authored Dec 17, 2010

a null endptr argument, because they may write to errno.

This fixes a selfhost miscompile observed on Linux targets when TBAA
was enabled.

llvm-svn: 122014

93dc2b80
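
The `errno` write is observable even with a null `endptr`: on overflow the C standard requires `strtol` to return `LONG_MAX` and set `errno` to `ERANGE`. A minimal sketch of that behavior, with a hypothetical helper name:

```cpp
#include <cassert>
#include <cerrno>
#include <climits>
#include <cstdlib>

// Hypothetical helper: parses a value far too large for long, with a
// null endptr, and reports whether strtol wrote ERANGE to errno.
bool strtol_sets_errno_on_overflow() {
  errno = 0;
  long v = std::strtol("999999999999999999999999999999", nullptr, 10);
  return errno == ERANGE && v == LONG_MAX;
}
```

A call that modifies `errno` is not read-only, so a `readonly` marking would let the optimizer treat `errno` as unchanged across the call.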