- Dec 19, 2012
-
-
Nadav Rotem authored
Enable the loop vectorizer in clang and not in the pass manager, so that we can disable it in clang. llvm-svn: 170470
-
- Dec 18, 2012
-
-
Benjamin Kramer authored
LoopVectorize: Emit reductions as log2(vectorsize) shuffles + vector ops instead of scalar operations. For example on x86 with SSE4.2 a <8 x i8> add reduction becomes movdqa %xmm0, %xmm1 movhlps %xmm1, %xmm1 ## xmm1 = xmm1[1,1] paddw %xmm0, %xmm1 pshufd $1, %xmm1, %xmm0 ## xmm0 = xmm1[1,0,0,0] paddw %xmm1, %xmm0 phaddw %xmm0, %xmm0 pextrb $0, %xmm0, %edx instead of pextrb $2, %xmm0, %esi pextrb $0, %xmm0, %edx addb %sil, %dl pextrb $4, %xmm0, %esi addb %dl, %sil pextrb $6, %xmm0, %edx addb %sil, %dl pextrb $8, %xmm0, %esi addb %dl, %sil pextrb $10, %xmm0, %edi pextrb $14, %xmm0, %edx addb %sil, %dil pextrb $12, %xmm0, %esi addb %dil, %sil addb %sil, %dl llvm-svn: 170439
-
Nadav Rotem authored
llvm-svn: 170416
-
Nadav Rotem authored
getScalarSizeInBits could not handle vectors of pointers. llvm-svn: 170412
-
Rafael Espindola authored
llvm-svn: 170404
-
- Dec 17, 2012
-
-
Chandler Carruth authored
This was a silly oversight, we weren't pruning allocas which were used by variable-length memory intrinsics from the set that could be widened and promoted as integers. Fix that. llvm-svn: 170353
-
Evgeniy Stepanov authored
llvm-svn: 170347
-
Chandler Carruth authored
This also cleans up a bit of the memcpy call rewriting by sinking some irrelevant code further down and making the call-emitting code a bit more concrete. Previously, memcpy of a subvector would actually miscompile (!!!) the copy into a single vector element copy. I have no idea how this ever worked. =/ This is the memcpy half of PR14478 which we probably weren't noticing previously because it didn't actually assert. The rewrite relies on the newly refactored insert- and extractVector functions to do the heavy lifting, and those are the same as used for loads and stores which makes the test coverage a bit more meaningful here. llvm-svn: 170338
-
Evgeniy Stepanov authored
Check whether a BB is known as reachable before adding it to the worklist. This way BB's with multiple predecessors are added to the list no more than once. llvm-svn: 170335
-
Chandler Carruth authored
The first half of fixing this bug was actually in r170328, but was entirely coincidental. It did however get me to realize the nature of the bug, and adapt the test case to test more interesting behavior. In turn, that uncovered the rest of the bug which I've fixed here. This should fix two new asserts that showed up in the vectorize nightly tester. llvm-svn: 170333
-
Chandler Carruth authored
I noticed this while looking at r170328. We only ever do a vector rewrite when the alloca *is* the vector type, so it's good to not paper over bugs here by doing a convertValue that isn't needed. llvm-svn: 170331
-
Chandler Carruth authored
This will allow its use inside of memcpy rewriting as well. This routine is more complex than extractVector, and some of its uses are not 100% where I want them to be so there is still some work to do here. While this can technically change the output in some cases, it shouldn't be a change that matters -- IE, it can leave some dead code lying around that prior versions did not, etc. Yet another step in the refactorings leading up to the solution to the last component of PR14478. llvm-svn: 170328
-
Chandler Carruth authored
The method helpers all implicitly act upon the alloca, and what we really want is a fully generic helper. Doing memcpy rewrites is more special than all other rewrites because we are at times rewriting instructions which touch pointers *other* than the alloca. As a consequence all of the helpers needed by memcpy rewriting of sub-vector copies will need to be generalized fully. Note that all of these helpers ({insert,extract}{Integer,Vector}) are woefully uncommented. I'm going to go back through and document them once I get the factoring correct. No functionality changed. llvm-svn: 170325
-
Chandler Carruth authored
This makes it suitable for use in rewriting memcpy in the presence of subvector memcpy intrinsics. No functionality changed. llvm-svn: 170324
-
Chandler Carruth authored
PR14478 highlights a serious problem in SROA that simply wasn't being exercised due to a lack of vector input code mixed with C-library function calls. Part of SROA was written carefully to handle subvector accesses via memset and memcpy, but the rewriter never grew support for this. Fixing it required refactoring the subvector access code in other parts of SROA so it could be shared, and then fixing the splat formation logic and using subvector insertion (this patch). The PR isn't quite fixed yet, as memcpy is still broken in the same way. I'm starting on that series of patches now. Hopefully this will be enough to bring the bullet benchmark back to life with the bb-vectorizer enabled, but that may require fixing memcpy as well. llvm-svn: 170301
-
Chandler Carruth authored
No functionality changed. Another step of refactoring toward solving PR14487. llvm-svn: 170300
-
Chandler Carruth authored
No functionality changed. Refactoring leading up to the fix for PR14478 which requires some significant changes to the memset and memcpy rewriting. llvm-svn: 170299
-
- Dec 15, 2012
-
-
Chandler Carruth authored
The alloca width is based on the alloc size, not the type size. llvm-svn: 170270
-
NAKAMURA Takumi authored
llvm-svn: 170267
-
- Dec 14, 2012
-
-
Michael Ilseman authored
Add back FoldOpIntoPhi optimizations with fix. Included test cases to help catch these errors and to test the presence of the optimization itself llvm-svn: 170248
-
Nadav Rotem authored
llvm-svn: 170246
-
Shuxin Yang authored
Implement rule : "x * (select cond 1.0, 0.0) -> select cond x, 0.0" llvm-svn: 170226
-
Evgeniy Stepanov authored
llvm-svn: 170203
-
Evgeniy Stepanov authored
Origin address is always 4 byte aligned, and the access type is always i32. llvm-svn: 170199
-
Evgeniy Stepanov authored
This change moves the code for default shadow propagaition (handleShadowOr) and origin tracking (setOriginForNaryOp) into a new builder-like class. Also gets rid of handleShadowOrBinary. llvm-svn: 170192
-
Nadav Rotem authored
llvm-svn: 170172
-
Nadav Rotem authored
llvm-svn: 170166
-
Nadav Rotem authored
llvm-svn: 170162
-
Nadav Rotem authored
Enable the Loop Vectorizer by default for O2 and O3. Disable if-conversion by default. I plan to revert this patch later today. llvm-svn: 170157
-
- Dec 13, 2012
-
-
NAKAMURA Takumi authored
This assumes (1 << n) is always not zero. Consider n is greater than word size. Although I know it is undefined, this transforms undefined behavior hidden. This led clang unexpected behavior with some failures. I will investigate to fix undefined shl in clang. llvm-svn: 170128
-
Eric Christopher authored
it seems to be breaking self-host for a few people and is PR14592. This reverts commit r170024. llvm-svn: 170106
-
Rafael Espindola authored
llvm-svn: 170094
-
Rafael Espindola authored
In a previous thread it was pointed out that isPowerOfTwo is not a very precise name since it can return false for powers of two if it is unable to show that they are powers of two. llvm-svn: 170093
-
Michael Ilseman authored
Provides m_Argument that allows matching against a CallSite's specified argument. Provides m_Intrinsic pattern that can be templatized over the intrinsic id and bind/match arguments similarly to other pattern matchers. Implementations provided for 0 to 4 arguments, though it's very simple to extend for more. Also provides example template specialization for bswap (m_BSwap) and example of code cleanup for its use. llvm-svn: 170091
-
Quentin Colombet authored
Better controls the inlining of functions when the caller function has MinSize attribute. Basically, when the caller function has this attribute, we do not "force" the inlining of callee functions carrying the InlineHint attribute (i.e., functions defined with inline keyword) llvm-svn: 170065
-
Nadav Rotem authored
Teach the cost model about the optimization in r169904: Truncation of induction variables costs the same as scalar trunc. llvm-svn: 170051
-
Chad Rosier authored
llvm-svn: 170050
-
- Dec 12, 2012
-
-
Michael Ilseman authored
llvm-svn: 170024
-
Michael Ilseman authored
llvm-svn: 170022
-
David Majnemer authored
llvm-svn: 170020
-