- Dec 31, 2012
-
-
Chris Lattner authored
promoting a store in a loop. This was noticed when working on PR14753, but isn't directly related. llvm-svn: 171281
-
Chris Lattner authored
PR14753 llvm-svn: 171279
-
Jakub Staszak authored
if C1 and C2 differ only with one bit. Fixes PR14708. llvm-svn: 171270
-
- Dec 30, 2012
-
-
Nadav Rotem authored
LCSSA PHIs may have undef values. The vectorizer updates values that are used by outside users such as PHIs. The bug happened because undefs are not loop values. This patch handles these PHIs. PR14725 llvm-svn: 171251
-
Dmitri Gribenko authored
This is done to avoid odd test failures, like the one fixed in r171243. llvm-svn: 171250
-
Dmitri Gribenko authored
This is done to avoid odd test failures, like the one fixed in r171243. llvm-svn: 171246
-
NAKAMURA Takumi authored
Larry Evans reported it fails if source tree contains "load", like "download". llvm-svn: 171243
-
- Dec 28, 2012
-
-
Chandler Carruth authored
propagating one of the values it simplified to a constant across a myriad of instructions. Notably, ptrtoint instructions when we had a constant pointer (say, 0) didn't propagate that, blocking a massive number of down-stream optimizations. This was uncovered when investigating why we fail to inline and delete the boilerplate in:
    void f() { std::vector<int> v; v.push_back(1); }
It turns out most of the efforts I've made thus far to improve the analysis weren't making it far purely because of this. After this is fixed, the store-to-load forwarding patch enables LLVM to optimize the above to an empty function. We still can't nuke a second push_back, but for different reasons. There is a very real chance this will cause somewhat noticeable changes in inlining behavior, so please let me know if you see regressions (or improvements!) because of this patch. llvm-svn: 171196
-
Chandler Carruth authored
how to propagate constants through insert and extract value instructions. With the recent improvements to instsimplify, this allows inline cost analysis to constant fold through intrinsic functions, including notably the with.overflow intrinsic math routines which often show up inside of STL abstractions. This is yet another piece in the puzzle of breaking down the code for:
    void f() { std::vector<int> v; v.push_back(1); }
But it still isn't enough. There are a pile of bugs in inline cost still blocking this. llvm-svn: 171195
-
Chandler Carruth authored
constant folding calls. Add the initial tests for this which show that now instsimplify can simplify blindingly obvious code patterns expressed with both intrinsics and library calls. llvm-svn: 171194
-
- Dec 27, 2012
-
-
Nadav Rotem authored
If all of the write objects are identified then we can vectorize the loop even if the read objects are unidentified. PR14719. llvm-svn: 171124
-
- Dec 26, 2012
-
-
Nadav Rotem authored
LoopVectorizer: Optimize the vectorization of consecutive memory access when the iteration step is -1 llvm-svn: 171114
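A minimal sketch of what a consecutive access with step -1 looks like, and one plausible way to handle it with wide accesses: load a contiguous block, then reverse the lanes. This is an illustration only, not the vectorizer's code; the function names and the width `VF = 4` are assumptions made for the example.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Scalar form: the index into src decreases by 1 on every iteration.
std::vector<int> reversed_copy_scalar(const std::vector<int>& src) {
    const std::size_t n = src.size();
    std::vector<int> dst(n);
    for (std::size_t j = 0; j < n; ++j)
        dst[j] = src[n - 1 - j];
    return dst;
}

// Chunked form (hypothetical width VF): the VF elements read by VF scalar
// iterations are contiguous in memory, so they can be fetched with one wide
// load and put in iteration order with a lane reversal.
std::vector<int> reversed_copy_chunked(const std::vector<int>& src) {
    constexpr std::size_t VF = 4;
    const std::size_t n = src.size();
    std::vector<int> dst(n);
    std::size_t j = 0;
    for (; j + VF <= n; j += VF) {
        std::array<int, VF> lanes;
        // "Wide load" of the contiguous block src[n-j-VF .. n-j).
        std::copy_n(src.begin() + static_cast<std::ptrdiff_t>(n - j - VF), VF,
                    lanes.begin());
        // "Reverse shuffle" so lane 0 holds src[n-1-j].
        std::reverse(lanes.begin(), lanes.end());
        // "Wide store" to dst[j .. j+VF).
        std::copy_n(lanes.begin(), VF,
                    dst.begin() + static_cast<std::ptrdiff_t>(j));
    }
    // Scalar remainder loop for the last n % VF elements.
    for (; j < n; ++j)
        dst[j] = src[n - 1 - j];
    return dst;
}
```

Both functions should return the same vector for any input; the chunked form only changes how the elements are fetched.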
-
Hal Finkel authored
For the time being this includes only some dummy test cases. Once the generic implementation of the intrinsics cost function does something other than assuming scalarization in all cases, or some target specializes the interface, some real test cases can be added. Also, for consistency, I changed the type of IID from unsigned to Intrinsic::ID in a few other places. llvm-svn: 171079
-
Hal Finkel authored
llvm-svn: 171076
-
- Dec 25, 2012
-
-
Hal Finkel authored
llvm-svn: 171075
-
- Dec 24, 2012
-
-
Nick Lewycky authored
llvm-svn: 171043
-
Nadav Rotem authored
the StoreInst operands. PR14705. llvm-svn: 171023
-
Nadav Rotem authored
The bug was in the code that detects PHIs in if-then-else block sequence. PR14701. llvm-svn: 171008
-
- Dec 23, 2012
-
-
Nadav Rotem authored
the cost of arithmetic functions. We now assume that the cost of arithmetic operations that are marked as Legal or Promote is low, but ops that are marked as custom are higher. llvm-svn: 171002
-
Nadav Rotem authored
them more expensive. llvm-svn: 170995
-
- Dec 21, 2012
-
-
Nadav Rotem authored
memory bound checks. Before the fix we were (incorrectly) able to vectorize this loop from the Livermore Loops benchmark:
    for ( k=1 ; k<n ; k++ ) x[k] = x[k-1] + y[k];
llvm-svn: 170811
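The quoted loop carries a dependence from one iteration to the next: x[k] reads the x[k-1] that the previous iteration just wrote. The sketch below shows why treating it as vectorizable changes the result. It is an illustration under assumptions, not the vectorizer's logic: the function names and the width `VF = 4` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Scalar reference: each iteration sees the store made by the previous one.
std::vector<int> recurrence_scalar(std::vector<int> x, const std::vector<int>& y) {
    for (std::size_t k = 1; k < x.size(); ++k)
        x[k] = x[k - 1] + y[k];
    return x;
}

// Naively "vectorized" form: within a chunk, all loads happen before any
// store, as in a vector load / add / store sequence. The loads of x[k-1]
// therefore miss the values the same chunk is about to write.
std::vector<int> recurrence_naive_chunks(std::vector<int> x, const std::vector<int>& y) {
    constexpr std::size_t VF = 4;
    const std::size_t n = x.size();
    std::size_t k = 1;
    for (; k + VF <= n; k += VF) {
        int tmp[VF];
        for (std::size_t i = 0; i < VF; ++i)
            tmp[i] = x[k + i - 1] + y[k + i];
        for (std::size_t i = 0; i < VF; ++i)
            x[k + i] = tmp[i];
    }
    // Scalar remainder.
    for (; k < n; ++k)
        x[k] = x[k - 1] + y[k];
    return x;
}
```

With x = {1, 0, 0, ...} and y all ones, the scalar loop produces a running count, while the chunked form does not, so the two disagree.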
-
- Dec 20, 2012
-
-
Nadav Rotem authored
Before if-conversion we could check if a value is loop invariant if it was declared inside the basic block. Now that loops have multiple blocks this check is incorrect. This fixes External/SPEC/CINT95/099_go/099_go llvm-svn: 170756
-
James Molloy authored
Add a new attribute, 'noduplicate'. If a function contains a noduplicate call, the call cannot be duplicated: jump threading, loop unrolling, loop unswitching, and loop rotation are inhibited if they would duplicate the call. Similarly, inlining of the function is inhibited if that would duplicate the call (in particular, inlining is still allowed when there is only one callsite and the function has internal linkage). llvm-svn: 170704
-
- Dec 19, 2012
-
-
Paul Redmond authored
When the least significant set bit of C is greater than V, (x&C) must be greater than V if it is not zero, so the comparison can be simplified. Although this was suggested in Target/X86/README.txt, it benefits any architecture with a directly testable form of AND. Patch by Kevin Schoedel. llvm-svn: 170576
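The reasoning above can be checked exhaustively for small widths. The sketch below assumes an unsigned comparison and assumes the simplified form is a test against zero, `(x & C) != 0`; neither detail is spelled out in the message, and the helper names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Least significant set bit of c, or 0 when c == 0.
constexpr unsigned lowest_set_bit(std::uint8_t c) {
    return static_cast<unsigned>(c) & (0u - static_cast<unsigned>(c));
}

// Checks every 8-bit (x, C, V) triple: whenever the lowest set bit of C is
// greater than V, (x & C) > V must agree with (x & C) != 0.
bool fold_holds_for_all_8bit() {
    for (unsigned c = 1; c < 256; ++c) {
        for (unsigned v = 0; v < 256; ++v) {
            if (lowest_set_bit(static_cast<std::uint8_t>(c)) <= v)
                continue;  // precondition not met, no claim made
            for (unsigned x = 0; x < 256; ++x) {
                const bool original = (x & c) > v;
                const bool simplified = (x & c) != 0;
                if (original != simplified)
                    return false;
            }
        }
    }
    return true;
}
```

The check passes because any nonzero value of `x & C` is at least as large as the lowest set bit of C, which is already greater than V.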
-
Benjamin Kramer authored
- An MVT can become an EVT when being split (e.g. v2i8 -> v1i8, the latter doesn't exist) - Return the scalar value when an MVT is scalarized (v1i64 -> i64) Fixes PR14639ff. llvm-svn: 170546
-
Shuxin Yang authored
rdar://12801297 InstCombine for unsafe floating-point add/sub. llvm-svn: 170471
-
- Dec 18, 2012
-
-
Benjamin Kramer authored
LoopVectorize: Emit reductions as log2(vectorsize) shuffles + vector ops instead of scalar operations. For example on x86 with SSE4.2 a <8 x i8> add reduction becomes
    movdqa %xmm0, %xmm1
    movhlps %xmm1, %xmm1 ## xmm1 = xmm1[1,1]
    paddw %xmm0, %xmm1
    pshufd $1, %xmm1, %xmm0 ## xmm0 = xmm1[1,0,0,0]
    paddw %xmm1, %xmm0
    phaddw %xmm0, %xmm0
    pextrb $0, %xmm0, %edx
instead of
    pextrb $2, %xmm0, %esi
    pextrb $0, %xmm0, %edx
    addb %sil, %dl
    pextrb $4, %xmm0, %esi
    addb %dl, %sil
    pextrb $6, %xmm0, %edx
    addb %sil, %dl
    pextrb $8, %xmm0, %esi
    addb %dl, %sil
    pextrb $10, %xmm0, %edi
    pextrb $14, %xmm0, %edx
    addb %sil, %dil
    pextrb $12, %xmm0, %esi
    addb %dil, %sil
    addb %sil, %dl
llvm-svn: 170439
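The log2(vectorsize) shape of the reduction can be sketched in scalar C++: each step adds the upper half of the live lanes onto the lower half, so 8 lanes need 3 steps. This only models the idea; the function names are hypothetical and the inner loop stands in for one shuffle plus one vector add.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Tree reduction: 8 lanes -> 4 -> 2 -> 1, i.e. log2(8) = 3 halving steps.
std::uint8_t reduce_add_log2(std::array<std::uint8_t, 8> v) {
    for (std::size_t half = v.size() / 2; half >= 1; half /= 2) {
        // One "shuffle + vector add": fold lanes [half, 2*half) onto [0, half).
        for (std::size_t i = 0; i < half; ++i)
            v[i] = static_cast<std::uint8_t>(v[i] + v[i + half]);
    }
    return v[0];
}

// Straight-line scalar reduction for comparison: 7 dependent adds.
std::uint8_t reduce_add_scalar(const std::array<std::uint8_t, 8>& v) {
    std::uint8_t sum = 0;
    for (std::uint8_t e : v)
        sum = static_cast<std::uint8_t>(sum + e);
    return sum;
}
```

Because i8 addition wraps modulo 256 and is associative, both orders give the same result, which is what makes the reassociated form legal for integer adds.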
-
Nadav Rotem authored
into the same file in the future. llvm-svn: 170414
-
Nadav Rotem authored
getScalarSizeInBits could not handle vectors of pointers. llvm-svn: 170412
-
- Dec 17, 2012
-
-
Chandler Carruth authored
This was a silly oversight, we weren't pruning allocas which were used by variable-length memory intrinsics from the set that could be widened and promoted as integers. Fix that. llvm-svn: 170353
-
Chandler Carruth authored
This also cleans up a bit of the memcpy call rewriting by sinking some irrelevant code further down and making the call-emitting code a bit more concrete. Previously, memcpy of a subvector would actually miscompile (!!!) the copy into a single vector element copy. I have no idea how this ever worked. =/ This is the memcpy half of PR14478 which we probably weren't noticing previously because it didn't actually assert. The rewrite relies on the newly refactored insert- and extractVector functions to do the heavy lifting, and those are the same as used for loads and stores which makes the test coverage a bit more meaningful here. llvm-svn: 170338
-
Chandler Carruth authored
The first half of fixing this bug was actually in r170328, but was entirely coincidental. It did however get me to realize the nature of the bug, and adapt the test case to test more interesting behavior. In turn, that uncovered the rest of the bug which I've fixed here. This should fix two new asserts that showed up in the vectorize nightly tester. llvm-svn: 170333
-
Chandler Carruth authored
PR14478 highlights a serious problem in SROA that simply wasn't being exercised due to a lack of vector input code mixed with C-library function calls. Part of SROA was written carefully to handle subvector accesses via memset and memcpy, but the rewriter never grew support for this. Fixing it required refactoring the subvector access code in other parts of SROA so it could be shared, and then fixing the splat formation logic and using subvector insertion (this patch). The PR isn't quite fixed yet, as memcpy is still broken in the same way. I'm starting on that series of patches now. Hopefully this will be enough to bring the bullet benchmark back to life with the bb-vectorizer enabled, but that may require fixing memcpy as well. llvm-svn: 170301
-
- Dec 15, 2012
-
-
Chandler Carruth authored
llvm-svn: 170271
-
Chandler Carruth authored
The alloca width is based on the alloc size, not the type size. llvm-svn: 170270
-
- Dec 14, 2012
-
-
Michael Ilseman authored
Add back FoldOpIntoPhi optimizations with fix. Included test cases to help catch these errors and to test the presence of the optimization itself llvm-svn: 170248
-
Nadav Rotem authored
llvm-svn: 170240
-
Shuxin Yang authored
Implement rule : "x * (select cond 1.0, 0.0) -> select cond x, 0.0" llvm-svn: 170226
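A small sketch of the two sides of this rule in C++, using a ternary in place of `select`. The message does not say under which conditions the rule is applied, so the sketch only shows where the two forms agree and where plain IEEE-754 arithmetic makes them differ; the function names are hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Left-hand side: x * (select cond 1.0, 0.0)
double mul_by_select(double x, bool cond) { return x * (cond ? 1.0 : 0.0); }

// Right-hand side: select cond x, 0.0
double select_of_x(double x, bool cond) { return cond ? x : 0.0; }

// True when both forms produce the same value with the same sign.
bool forms_agree(double x, bool cond) {
    const double a = mul_by_select(x, cond);
    const double b = select_of_x(x, cond);
    return a == b && std::signbit(a) == std::signbit(b);
}
```

For ordinary non-negative finite inputs the forms agree. They differ when cond is false and x is infinite or NaN (the product is NaN, the select is 0.0) or negative (the product is -0.0, the select is +0.0), which is why a rewrite like this is usually tied to relaxed floating-point semantics.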
-
- Dec 13, 2012
-
-
NAKAMURA Takumi authored
This assumes (1 << n) is never zero. Consider the case where n is greater than the word size. Although I know that case is undefined, this transform hides the undefined behavior. This led to unexpected behavior in clang, with some failures. I will investigate fixing the undefined shl in clang. llvm-svn: 170128
-
Quentin Colombet authored
Better controls the inlining of functions when the caller function has the MinSize attribute. Basically, when the caller function has this attribute, we do not "force" the inlining of callee functions carrying the InlineHint attribute (i.e., functions defined with the inline keyword). llvm-svn: 170065
-