Commits · 08e5e62f98ea015e52038b9c6d36d7bd59626541 · Roger Ferrer / llvm-epi-0.8

Jan 14, 2009

Fix the time regression I introduced in 464.h264ref with · 1f0e0e7c

Dale Johannesen authored Jan 14, 2009

my earlier patch to this file.

The issue there was that all uses of an IV inside a loop
are actually references to Base[IV*2], and there was one
use outside that was the same but LSR didn't see the base
or the scaling because it didn't recurse into uses outside
the loop; thus, it used base+IV*scale mode inside the loop
instead of pulling base out of the loop.  This was extra bad
because register pressure later forced both base and IV into
memory.  Doing that recursion, at least enough
to figure out addressing modes, is a good idea in general;
the change in AddUsersIfInteresting does this.  However,
there were side effects....

It is also possible for recursing outside the loop to
introduce another IV where there was only 1 before (if
the refs inside are not scaled and the ref outside is).
I don't think this is a common case, but it's in the testsuite.
It is right to be very aggressive about getting rid of
such introduced IVs (CheckForIVReuse and the handling of
nonzero RewriteFactor in StrengthReduceStridedIVUsers).
In the testcase in question the new IV produced this way
has both a nonconstant stride and a nonzero base, neither
of which was handled before.  And when inserting 
new code that feeds into a PHI, it's right to put such 
code at the original location rather than in the PHI's 
immediate predecessor(s) when the original location is outside 
the loop (a case that couldn't happen before)
(RewriteInstructionToUseNewBase); better to avoid making
multiple copies of it in this case.

Also, the mechanism for keeping SCEV's corresponding to GEP's
no longer works, as the GEP might change after its SCEV
is remembered, invalidating the SCEV, and we might get a bad
SCEV value when looking up the GEP again for a later loop.  
This also couldn't happen before, as we weren't recursing
into GEP's outside the loop.

Also, when we build an expression that involves a (possibly
non-affine) IV from a different loop as well as an IV from
the one we're interested in (containsAddRecFromDifferentLoop),
don't recurse into that.  We can't do much with it and will
get in trouble if we try to create new non-affine IVs or something.

More testcases are coming.

llvm-svn: 62212

1f0e0e7c
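
The addressing-mode issue described in commit 1f0e0e7c above can be pictured with a small hand-written sketch. It is not LSR output and not the 464.h264ref code; the function names are hypothetical, and the array is assumed to hold at least `2 * n` elements. The first form keeps both the base and the induction variable live and computes `base + IV*scale` on every access; the second folds the base into a pointer before the loop and steps it by the scaled stride.

```cpp
#include <cstddef>

// Form 1: each access computes base + i * 2, so `base` and `i` are both
// needed inside the loop.
long sum_scaled_index(const int *base, std::size_t n) {
    long total = 0;
    for (std::size_t i = 0; i < n; ++i)
        total += base[i * 2];
    return total;
}

// Form 2: the base is pulled out of the loop into a running pointer that
// advances by the scaled stride. Assumes `base` points to at least 2 * n ints.
long sum_strided_pointer(const int *base, std::size_t n) {
    long total = 0;
    const int *end = base + 2 * n;
    for (const int *p = base; p != end; p += 2)
        total += *p;
    return total;
}
```

Both functions return the same sum; the difference is only in how many values must stay in registers across iterations, which is the register-pressure concern the commit message raises.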

rewrite OptimizeAwayTrappingUsesOfLoads to 1) avoid a temporary · 2538eb66

Chris Lattner authored Jan 14, 2009

vector and extraneous loop over it, 2) not delete globals used by
phis/selects etc which could actually be useful.  This fixes PR3321.
Many thanks to Duncan for narrowing this down.

llvm-svn: 62201

2538eb66

Jan 13, 2009
- Fix testsuite regressions from recursive inlining. · 0aeabdff
  Dale Johannesen authored Jan 13, 2009
```
llvm-svn: 62189
```
  0aeabdff
- Make instcombine ensure that all allocas are explicitly aligned at at · 59af7737
  Dan Gohman authored Jan 13, 2009
```
least their preferred alignment.

llvm-svn: 62176
```
  59af7737
- Correct a comment. · 944ccc5d
  Duncan Sands authored Jan 13, 2009
```
llvm-svn: 62165
```
  944ccc5d
Jan 12, 2009

Enable recursive inlining. Reduce inlining threshold · 433a9086
Dale Johannesen authored Jan 12, 2009
```
back to 200; 400 seems to be too high, loses more than
it gains.

llvm-svn: 62107
```
433a9086
Rename getABITypeSize to getTypePaddedSize, as · dc020f9c
Duncan Sands authored Jan 12, 2009
```
suggested by Chris.

llvm-svn: 62099
```
dc020f9c

Increase default inlining aggressiveness in partial · f8468529

Dale Johannesen authored Jan 11, 2009

compensation for turning off gcc's inliner.  This gets
us closer to the amount of inlining we were getting before.
It is not a win on everything, of course, but seems to
gain overall.

llvm-svn: 62058

f8468529

Jan 11, 2009

Duncan is nervous about undefinedness of % with negatives. I'm · bd3c7c8b
Chris Lattner authored Jan 11, 2009
```
not thrilled about 64-bit % in general, so rewrite to use * instead.

llvm-svn: 62047
```
bd3c7c8b
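
As a rough illustration of the idea in commit bd3c7c8b above (avoiding `%` on possibly negative 64-bit values), the sketch below splits an offset into an element index and a non-negative remainder using only `/` and `*`. It is not the code from the commit; `IndexAndRemainder` and `split_offset` are hypothetical names. Before C++11 the rounding of `/` and the sign of `%` with negative operands were implementation-defined, so the sketch normalizes the result explicitly.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical result type: offset == index * size + remainder,
// with remainder in [0, size).
struct IndexAndRemainder {
    int64_t index;
    int64_t remainder;
};

IndexAndRemainder split_offset(int64_t offset, int64_t size) {
    assert(size > 0 && "element size must be positive");
    int64_t index = offset / size;
    int64_t remainder = offset - index * size;  // no % operator involved
    // If the division rounded toward zero for a negative offset, the
    // remainder is negative; step down one element to bring it into range.
    if (remainder < 0) {
        --index;
        remainder += size;
    }
    IndexAndRemainder result = {index, remainder};
    return result;
}
```

For example, `split_offset(-5, 4)` yields index `-2` and remainder `3`, since `-2 * 4 + 3 == -5`.
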
do not generate GEPs into vectors where they don't already exist. · b1915168
Chris Lattner authored Jan 11, 2009
```
We should treat vectors as atomic types, not like arrays.

llvm-svn: 62046
```
b1915168

Make a couple of cleanups to the instcombine bitcast/gep · 171d2d47

Chris Lattner authored Jan 11, 2009

canonicalization transform based on duncan's comments:

1) improve the comment about %.
2) within our index loop make sure the offset stays 
   within the *type size*, instead of within the *abi size*.
   This allows us to reason explicitly about landing in tail
   padding and means that issues like non-zero offsets into
   [0 x foo] types don't occur anymore.

llvm-svn: 62045

171d2d47

Jan 09, 2009

fix typo Duncan noticed. · 5f54d509
Chris Lattner authored Jan 09, 2009
```
llvm-svn: 61997
```
5f54d509
Fix PR3304 · ae0e857b
Chris Lattner authored Jan 09, 2009
```
llvm-svn: 61995
```
ae0e857b
Removed trailing whitespace from Makefiles. · 5cbf2239
Misha Brukman authored Jan 09, 2009
```
llvm-svn: 61991
```
5cbf2239

Implement rdar://6480391 , extending of equality icmp's to avoid a truncation. · f50aa6ae

Chris Lattner authored Jan 09, 2009

I noticed this in the code compiled for a routine using std::map, which produced
this code:
	%25 = tail call i32 @memcmp(i8* %24, i8* %23, i32 6) nounwind readonly
	%.lobit.i = lshr i32 %25, 31		; <i32> [#uses=1]
	%tmp.i = trunc i32 %.lobit.i to i8		; <i8> [#uses=1]
	%toBool = icmp eq i8 %tmp.i, 0		; <i1> [#uses=1]
	br i1 %toBool, label %bb3, label %bb4
which compiled to:

	call	L_memcmp$stub
	shrl	$31, %eax
	testb	%al, %al
	jne	LBB1_11	## 

with this change, we compile it to:

	call	L_memcmp$stub
	testl	%eax, %eax
	js	LBB1_11

This triggers all the time in common code, with patterns like this:

	%169 = and i32 %ply, 1		; <i32> [#uses=1]
	%170 = trunc i32 %169 to i8		; <i8> [#uses=1]
	%toBool = icmp ne i8 %170, 0		; <i1> [#uses=1]

 	%7 = lshr i32 %6, 24		; <i32> [#uses=1]
	%9 = trunc i32 %7 to i8		; <i8> [#uses=1]
	%10 = icmp ne i8 %9, 0		; <i1> [#uses=1]

etc

llvm-svn: 61985

f50aa6ae
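
The three IR patterns in commit f50aa6ae above share one property: the value being truncated already fits in 8 bits (a shift right by 31 leaves one bit, `and 1` leaves one bit, a shift right by 24 leaves eight bits), so the truncation discards only zero bits and the comparison can be done at full width. The C++ sketch below restates that equivalence; the function names are hypothetical and this is not the instcombine implementation.

```cpp
#include <cstdint>

// Sign-bit pattern: trunc(x >> 31) == 0 versus a direct sign-bit test.
bool narrow_sign_clear(uint32_t x) { return static_cast<uint8_t>(x >> 31) == 0; }
bool wide_sign_clear(uint32_t x) { return (x & 0x80000000u) == 0; }

// Low-bit pattern: trunc(x & 1) != 0 versus the same test at 32 bits.
bool narrow_low_bit_set(uint32_t x) { return static_cast<uint8_t>(x & 1u) != 0; }
bool wide_low_bit_set(uint32_t x) { return (x & 1u) != 0; }

// High-byte pattern: trunc(x >> 24) != 0 versus the same test at 32 bits.
bool narrow_high_byte_set(uint32_t x) { return static_cast<uint8_t>(x >> 24) != 0; }
bool wide_high_byte_set(uint32_t x) { return (x >> 24) != 0; }
```

Each narrow/wide pair agrees for every 32-bit input, which is why the 8-bit `testb` can become a 32-bit `testl` followed by a sign or zero check.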

Remove some old code that looks like a remnant from signed-types days. · 0f7cf1d7
Chris Lattner authored Jan 09, 2009
```
llvm-svn: 61984
```
0f7cf1d7
Fix PR3298, a crash in Jump Threading. Apparently even · 482eb70a
Chris Lattner authored Jan 09, 2009
```
jump threading can have bugs, who knew? ;-)

llvm-svn: 61983
```
482eb70a
Fix part 3/2 of PR3290, making instcombine zap (gep(bitcast)) when possible. · fef138b1
Chris Lattner authored Jan 09, 2009
```
llvm-svn: 61980
```
fef138b1
move some code, check to see if the input to the GEP is a bitcast · a784a2ce
Chris Lattner authored Jan 09, 2009
```
(which is constant time and cheap) before checking hasAllZeroIndices.

llvm-svn: 61976
```
a784a2ce
Adjustments to last patch based on review. · 4755d9df
Dale Johannesen authored Jan 09, 2009
```
llvm-svn: 61969
```
4755d9df

Jan 08, 2009

Do not inline functions with (dynamic) alloca into · b48fc71f

Dale Johannesen authored Jan 08, 2009

functions that don't already have a (dynamic) alloca.
Dynamic allocas cause inefficient codegen and we shouldn't
propagate this (behavior follows gcc).  Two existing tests
assumed such inlining would be done; they are hacked by
adding an alloca in the caller, preserving the point of
the tests.

llvm-svn: 61946

b48fc71f

This implements the second half of the fix for PR3290, handling · c518dfd1

Chris Lattner authored Jan 08, 2009

loads from allocas that cover the entire aggregate.  This handles
some memcpy/byval cases that are produced by llvm-gcc.  This triggers
a few times in kc++ (with std::pair<std::_Rb_tree_const_iterator
<kc::impl_abstract_phylum*>,bool>) and once in 176.gcc (with %struct..0anon).

llvm-svn: 61915

c518dfd1

Jan 07, 2009

Whitespace - correct formatting. · 0bcf0858
Duncan Sands authored Jan 07, 2009
```
llvm-svn: 61879
```
0bcf0858

Remove alloca tracking from nocapture analysis. Not only · 289f59f2

Duncan Sands authored Jan 07, 2009

was it not very helpful, it was also wrong!  The problem
is shown in the testcase: the alloca might be passed to
a nocapture callee which dereferences it and returns the
original pointer.  Because it was a nocapture call, we
think we don't need to track its uses, but we do.

llvm-svn: 61876

289f59f2
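
One reading of the problem in commit 289f59f2 above, written as a C++ analogy (the commit's actual testcase is not shown here, and both function names are hypothetical): a pointer is stored into a local slot, the slot's address is passed to a callee that does not capture the slot, and the callee loads the stored pointer and returns it. The slot never escapes, yet the original pointer does.

```cpp
// Hypothetical callee: `slot` itself is not captured (its address is not
// saved anywhere), but the pointer stored in it is handed back.
int *load_from_slot(int **slot) { return *slot; }

// `p` is stored into a local slot (playing the role of the alloca) and the
// slot is only passed to a non-capturing callee. The result still aliases p.
int *round_trip(int *p) {
    int *slot = p;
    return load_from_slot(&slot);
}
```

Calling `round_trip(&v)` returns `&v`, so an analysis that stops following `p` once it is stored into the slot would wrongly conclude that `p` does not escape.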

Reorder these. · 94bcbbab
Duncan Sands authored Jan 07, 2009
```
llvm-svn: 61873
```
94bcbbab
Use a switch rather than a sequence of "isa" tests. · 02599850
Duncan Sands authored Jan 07, 2009
```
llvm-svn: 61872
```
02599850
The verifier checks that the aliasee is not null. · 187c5716
Duncan Sands authored Jan 07, 2009
```
llvm-svn: 61870
```
187c5716

Implement the first half of PR3290: if there is a store of an · f2b8c82a

Chris Lattner authored Jan 07, 2009

integer to a (transitive) bitcast of the alloca and if that integer
has the full size of the alloca, then it clobbers the whole thing.
Handle this by extracting pieces out of the stored integer and 
filing them away in the SROA'd elements.

This triggers fairly frequently because the CFE uses integers to
pass small structs by value and the inliner exposes these.  For 
example, in kimwitu++, I see a bunch of these with i64 stores to
"%struct.std::pair<std::_Rb_tree_const_iterator<kc::impl_abstract_phylum*>,bool>"

In 176.gcc I see a few i32 stores to "%struct..0anon".

In the testcase, this is a difference between compiling test1 to:

_test1:
	subl	$12, %esp
	movl	20(%esp), %eax
	movl	%eax, 4(%esp)
	movl	16(%esp), %eax
	movl	%eax, (%esp)
	movl	(%esp), %eax
	addl	4(%esp), %eax
	addl	$12, %esp
	ret

vs:

_test1:
	movl	8(%esp), %eax
	addl	4(%esp), %eax
	ret

The second half of this will be to handle loads of the same form.

llvm-svn: 61853

f2b8c82a
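
The "extracting pieces out of the stored integer" step from commit f2b8c82a above can be sketched in C++ as follows. This is an illustration only, assuming a little-endian layout and a two-field struct of 32-bit integers; `Pair`, `split_packed`, and `sum_fields` are hypothetical names, not part of the SROA pass.

```cpp
#include <cstdint>

struct Pair {
    uint32_t first;
    uint32_t second;
};

// A 64-bit integer covers the whole struct. Instead of storing it to memory
// and reloading the fields, take each field directly from the integer.
// Little-endian assumption: the low 32 bits correspond to `first`.
Pair split_packed(uint64_t packed) {
    Pair p;
    p.first = static_cast<uint32_t>(packed);         // truncate: low half
    p.second = static_cast<uint32_t>(packed >> 32);  // shift, then truncate
    return p;
}

// Once the fields are plain values, uses like this need no stack traffic.
uint32_t sum_fields(uint64_t packed) {
    Pair p = split_packed(packed);
    return p.first + p.second;
}
```

With the fields available as separate values, the store/reload sequence through the stack seen in the first `_test1` listing is no longer needed.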

Factor a bunch of code out into a helper method. · 9a2de65f
Chris Lattner authored Jan 07, 2009
```
llvm-svn: 61852
```
9a2de65f
use continue to simplify code and reduce nesting, no functionality · db561146
Chris Lattner authored Jan 07, 2009
```
change.

llvm-svn: 61851
```
db561146
Get TargetData once up front and cache as an ivar instead of · 938b54f3
Chris Lattner authored Jan 07, 2009
```
requerying it all over the place.

llvm-svn: 61850
```
938b54f3
Use the hasAllZeroIndices predicate to simplify some · a63dba9e
Chris Lattner authored Jan 07, 2009
```
code, no functionality change.

llvm-svn: 61849
```
a63dba9e

Jan 06, 2009

Change m_ConstantInt and m_SelectCst to take their constant integers · 2fdcc59b

Chris Lattner authored Jan 05, 2009

as template arguments instead of as instance variables, exposing more
optimization opportunities to the compiler earlier.

llvm-svn: 61776

2fdcc59b
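
The difference between a constant held as an instance variable and one passed as a template argument, as described in commit 2fdcc59b above, can be sketched like this. The two matcher types are hypothetical stand-ins and do not reproduce LLVM's pattern-matching classes.

```cpp
#include <cstdint>

// Instance-variable form: the constant to match is stored in the object and
// read when match() runs.
struct MatchIntRuntime {
    int64_t expected;
    explicit MatchIntRuntime(int64_t v) : expected(v) {}
    bool match(int64_t value) const { return value == expected; }
};

// Template-argument form: the constant is part of the type, so the compiler
// sees it as a literal wherever the matcher is instantiated.
template <int64_t Expected>
struct MatchIntStatic {
    bool match(int64_t value) const { return value == Expected; }
};
```

Usage is `MatchIntRuntime(1).match(x)` versus `MatchIntStatic<1>().match(x)`; both give the same answer, but the second carries no per-object state and exposes the constant to the compiler before any inlining takes place.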

Jan 05, 2009
- Teach the internalize pass to also internalize · 582c53d1
  Duncan Sands authored Jan 05, 2009
```
global aliases.

llvm-svn: 61754
```
  582c53d1
- Find loop back edges only after empty blocks are eliminated. · 8804293f
  Evan Cheng authored Jan 05, 2009
```
llvm-svn: 61752
```
  8804293f
- Not having an aliasee is a theoretical possibility. · 52e5deec
  Duncan Sands authored Jan 05, 2009
```
llvm-svn: 61745
```
  52e5deec
- Format more neatly. · 821d13cf
  Duncan Sands authored Jan 05, 2009
```
llvm-svn: 61744
```
  821d13cf
- Remove trailing spaces. · d24b93f3
  Duncan Sands authored Jan 05, 2009
```
llvm-svn: 61743
```
  d24b93f3
- Delete unused global aliases with internal linkage. · f5dbbae4
  Duncan Sands authored Jan 05, 2009
```
In fact this also deletes those with linkonce linkage,
however this is currently dead because for the moment
aliases aren't allowed to have this linkage type.

llvm-svn: 61742
```
  f5dbbae4
- Tidy up #includes, deleting a bunch of unnecessary #includes. · 906152a2
  Dan Gohman authored Jan 05, 2009
```
llvm-svn: 61715
```
  906152a2