- Sep 13, 2011
Eli Friedman authored
llvm-svn: 139565
-
- Sep 12, 2011
Eli Friedman authored
No tests; these changes aren't really interesting in the sense that the logic is the same for volatile and atomic. I believe this completes all of the changes necessary for the optimizer to handle loads and stores correctly. I'm going to try and come up with some additional testing, though. llvm-svn: 139533
-
- Jul 18, 2011
Chris Lattner authored
llvm-svn: 135375
-
- Jul 15, 2011
Chad Rosier authored
is named after a common idiom (i.e., memset/memcpy). Otherwise, we can run into infinite recursion. Ideally, the user should use the correct -fno-builtin flag, but in case they don't, we should play nicely. rdar://9763412 llvm-svn: 135286
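
For illustration only (this sketch is mine, not from the commit or rdar://9763412): a hand-rolled memset built without -fno-builtin contains exactly the byte-store loop that loop-idiom would normally turn into a call to memset, so converting it would make the function call itself forever. The signature below assumes an LP64 target where size_t is unsigned long.

  extern "C" void *memset(void *Dst, int Val, unsigned long Len) {
    char *P = static_cast<char *>(Dst);
    for (unsigned long i = 0; i < Len; ++i)
      P[i] = static_cast<char>(Val);   // the loop the optimizer would otherwise replace with a memset call
    return Dst;
  }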
-
- Jun 28, 2011
Andrew Trick authored
llvm-svn: 133992
-
Andrew Trick authored
llvm-svn: 133991
-
- May 22, 2011
Chris Lattner authored
failing to form a memset, then having to delete it" but my approximation isn't safe for self-recurrent loops. Instead of doing a hack, just do it the right way. llvm-svn: 131858
-
- May 04, 2011
Devang Patel authored
llvm-svn: 130869
-
- Mar 14, 2011
Andrew Trick authored
properties. Added the self-wrap flag for SCEV::AddRecExpr. A slew of temporary FIXMEs indicate the intention of the no-self-wrap flag without changing behavior in this revision. llvm-svn: 127590
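
As a rough illustration of the "self-wrap" property (my example, not from the commit): an affine recurrence over a narrow type can step past its maximum and come back around, and the new flag records when SCEV may assume that never happens for a given AddRecExpr.

  #include <cstdint>
  #include <cstdio>

  int main() {
    uint8_t i = 0;                 // follows the recurrence {0,+,1} in 8 bits
    for (int n = 0; n < 300; ++n)
      ++i;
    std::printf("%u\n", i);        // prints 44 (300 mod 256): the recurrence wrapped past its start
    return 0;
  }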
-
Andrew Trick authored
llvm-svn: 127589
-
- Mar 07, 2011
Devang Patel authored
Radar 9097659 llvm-svn: 127182
-
- Feb 21, 2011
Chris Lattner authored
llvm-svn: 126125
-
Chris Lattner authored
llvm-svn: 126102
-
- Feb 19, 2011
Chris Lattner authored
constant, including globals. This makes us generate much more "pretty" pattern globals as well because it doesn't break it down to an array of bytes all the time. This enables us to handle stores of relocatable globals. This kicks in about 48 times in 254.gap, giving us stuff like this:

  @.memset_pattern40 = internal constant [2 x %struct.TypHeader* (%struct.TypHeader*, %struct.TypHeader*)*] [%struct.TypHeader* (%struct.TypHeader*, %struct.TypHeader*)* @IsFalse, %struct.TypHeader* (%struct.TypHeader*, %struct.TypHeader*)* @IsFalse], align 16
  ...
  call void @memset_pattern16(i8* %scevgep5859, i8* bitcast ([2 x %struct.TypHeader* (%struct.TypHeader*, %struct.TypHeader*)*]* @.memset_pattern40 to i8*), i64 %tmp75) nounwind

llvm-svn: 126044
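
A hedged sketch (my reconstruction, not the 254.gap source) of the kind of loop that benefits: the stored value is the address of a global, which cannot be expressed as a byte splat but can be replicated into a 16-byte pattern global and passed to memset_pattern16, as in the IR above.

  typedef long (*Handler)(long);
  long IsFalse(long) { return 0; }   // stand-in for the @IsFalse global above

  void reset_table(Handler *T, long N) {
    for (long i = 0; i < N; ++i)
      T[i] = IsFalse;                // strided store of a relocatable constant
  }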
-
rdar://9009151
Chris Lattner authored
unsplatable values into memset_pattern16 when it is available (recent darwins). This transforms lots of strided loop stores of ints for example, like 5 in vpr:

  Formed memset:   call void @memset_pattern16(i8* %4, i8* getelementptr inbounds ([16 x i8]* @.memset_pattern9, i32 0, i32 0), i64 %tmp25)
    from store to: {%3,+,4}<%11> at:   store i32 3, i32* %scevgep, align 4, !tbaa !4

llvm-svn: 126040
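
For context, a hedged sketch (not the vpr source) of a loop that produces IR like the above: the constant 3 is not a repeated single byte, so plain memset cannot express the fill, but the 16-byte pattern call can.

  void fill_threes(int *A, long N) {
    for (long i = 0; i < N; ++i)
      A[i] = 3;                      // 4-byte-strided stores of a non-splat constant
  }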
-
- Feb 18, 2011
Chris Lattner authored
to hack on memset, memcpy etc. llvm-svn: 125974
-
- Feb 15, 2011
Duncan Sands authored
llvm-svn: 125563
-
- Jan 04, 2011
Chris Lattner authored
when safe. The testcase is basically this nested loop:

  void foo(char *X) {
    for (int i = 0; i != 100; ++i)
      for (int j = 0; j != 100; ++j)
        X[j+i*100] = 0;
  }

which gets turned into a single memset now. clang -O3 doesn't optimize this yet though due to a phase ordering issue I haven't analyzed yet.
llvm-svn: 122806
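
A hedged sketch of the end result (the function name is mine, not the pass's output): the two loops together touch X[0..9999] contiguously, so the whole nest collapses to a single call.

  #include <cstring>

  void foo_flattened(char *X) {
    std::memset(X, 0, 100 * 100);    // one memset replaces both loops
  }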
-
Chris Lattner authored
instruction *after* the store. The store will always be deleted if the transformation kicks in, so we'd do an N^2 scan of every loop block. Whoops. llvm-svn: 122805
-
Chris Lattner authored
stop setting NSW: signed overflow is possible. Thanks to Dan for pointing these out. llvm-svn: 122790
-
Owen Anderson authored
llvm-svn: 122788
-
- Jan 03, 2011
Chris Lattner authored
llvm-svn: 122720
-
Chris Lattner authored
Teach it to CSE the rest of the non-side-effecting instructions. llvm-svn: 122716
-
- Jan 02, 2011
Chris Lattner authored
sure that the loop we're promoting into a memcpy doesn't mutate the input of the memcpy. Before we were just checking that the dest of the memcpy wasn't mod/ref'd by the loop. llvm-svn: 122712
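
An illustrative sketch (mine, not the actual testcase) of why the extra check matters: this loop writes into the region it also reads from, so the scalar loop smears X[0] forward, while a memcpy over the same ranges would overlap (undefined behavior) and compute something different.

  void smear(char *X, long N) {
    for (long i = 0; i < N; ++i)
      X[i + 1] = X[i];               // each read sees the store from the previous iteration
  }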
-
Chris Lattner authored
mess with it. We'd rather peel/unroll it than convert all of its stores into memsets. llvm-svn: 122711
-
Chris Lattner authored
blocks in a loop, instead of just the header block. This makes it more aggressive and able to handle Duncan's Ada examples. llvm-svn: 122704
-
Chris Lattner authored
llvm-svn: 122701
-
Chris Lattner authored
header for now for memset/memcpy opportunities. It turns out that loop-rotate is successfully rotating loops, but *DOESN'T MERGE THE BLOCKS*, turning "for loops" into 2 basic block loops that loop-idiom was ignoring. With this fix, we form many *many* more memcpys and memsets than before, including on the "history" loops in the viterbi benchmark, which look like this:

  for (j=0; j<MAX_history; ++j) {
    history_new[i][j+1] = history[2*i][j];
  }

Transforming these loops into memcpys speeds up the viterbi benchmark from 11.98s to 3.55s on my machine. Woo.
llvm-svn: 122685
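
For illustration (the element type and bounds are my assumptions, not viterbi's), the inner copy above is what becomes a single memcpy per outer iteration once rotated two-block loops are scanned:

  #include <cstring>

  enum { MAX_history = 64 };         // assumed value, for the sketch only

  void copy_history(int history_new[][MAX_history + 1],
                    int history[][MAX_history], int i) {
    std::memcpy(&history_new[i][1], &history[2 * i][0],
                MAX_history * sizeof(int));
  }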
-
Chris Lattner authored
llvm-svn: 122683
-
Chris Lattner authored
llvm-svn: 122682
-
Chris Lattner authored
llvm-svn: 122678
-
- Jan 01, 2011
Chris Lattner authored
new testcase. llvm-svn: 122662
-
Chris Lattner authored
aggressively. In practice, this doesn't help anything though; see the todo. llvm-svn: 122660
-
Chris Lattner authored
should be correct now. llvm-svn: 122659
-
- Dec 28, 2010
Chris Lattner authored
check for "multiple of a byte" in size to make it clear that the >> 3 below is safe. llvm-svn: 122604
-
Duncan Sands authored
llvm-svn: 122593
-
- Dec 27, 2010
Chris Lattner authored
llvm-svn: 122585
-
Chris Lattner authored
llvm-svn: 122574
-
Chris Lattner authored
memsets. This is still missing one important validity check, but this is enough to compile stuff like this:

  void test0(std::vector<char> &X) {
    for (std::vector<char>::iterator I = X.begin(), E = X.end(); I != E; ++I)
      *I = 0;
  }

  void test1(std::vector<int> &X) {
    for (long i = 0, e = X.size(); i != e; ++i)
      X[i] = 0x01010101;
  }

With:

  $ clang t.cpp -S -o - -O2 -emit-llvm | opt -loop-idiom | opt -O3 | llc

to:

  __Z5test0RSt6vectorIcSaIcEE:            ## @_Z5test0RSt6vectorIcSaIcEE
  ## BB#0:                                ## %entry
          subq    $8, %rsp
          movq    (%rdi), %rax
          movq    8(%rdi), %rsi
          cmpq    %rsi, %rax
          je      LBB0_2
  ## BB#1:                                ## %bb.nph
          subq    %rax, %rsi
          movq    %rax, %rdi
          callq   ___bzero
  LBB0_2:                                 ## %for.end
          addq    $8, %rsp
          ret
  ...
  __Z5test1RSt6vectorIiSaIiEE:            ## @_Z5test1RSt6vectorIiSaIiEE
  ## BB#0:                                ## %entry
          subq    $8, %rsp
          movq    (%rdi), %rax
          movq    8(%rdi), %rdx
          subq    %rax, %rdx
          cmpq    $4, %rdx
          jb      LBB1_2
  ## BB#1:                                ## %for.body.preheader
          andq    $-4, %rdx
          movl    $1, %esi
          movq    %rax, %rdi
          callq   _memset
  LBB1_2:                                 ## %for.end
          addq    $8, %rsp
          ret

llvm-svn: 122573
-
- Dec 26, 2010
Chris Lattner authored
llvm-svn: 122567
-