- May 22, 2013
-
-
Nadav Rotem authored
We are not working on a DAG and I ran into a number of problems when I enabled the vectorizations of 'diamond-trees' (trees that share leafs). * Imroved the numbering API. * Changed the placement of new instructions to the last root. * Fixed a bug with external tree users with non-zero lane. * Fixed a bug in the placement of in-tree users. llvm-svn: 182508
-
Jean-Luc Duprat authored
The earlier change list introduced the following inst combines: B * (uitofp i1 C) —> select C, B, 0 A * (1 - uitofp i1 C) —> select C, 0, A select C, 0, B + select C, A, 0 —> select C, A, B Together these 3 changes would simplify : A * (1 - uitofp i1 C) + B * uitofp i1 C down to : select C, B, A In practice we found that the first two substitutions can have a negative effect on performance, because they reduce opportunities to use FMA contractions; between the two options FMAs are often the better choice. This change list amends the previous one to enable just these inst combines: select C, B, 0 + select C, 0, A —> select C, B, A A * (1 - uitofp i1 C) + B * uitofp i1 C —> select C, B, A llvm-svn: 182499
-
Arnold Schwaighofer authored
The Value pointers we store in the induction variable list can be RAUW'ed by a call to SCEVExpander::expandCodeFor, use a TrackingVH instead. Do the same thing in some other places where we store pointers that could potentially be RAUW'ed. Fixes PR16073. llvm-svn: 182485
-
- May 21, 2013
-
-
Evgeniy Stepanov authored
This stuff is used on platforms where MSan does not have a proper VarArg implementation (anything other than x86_64 at the moment). llvm-svn: 182375
-
- May 20, 2013
-
-
Bill Wendling authored
llvm-svn: 182315
-
Hal Finkel authored
As discussed, LoopUtils.h is a better name. llvm-svn: 182314
-
Hal Finkel authored
Other passes, PPC counter-loop formation for example, also need to add loop preheaders outside of the regular loop simplification pass. This makes InsertPreheaderForLoop a global function so that it can be used by other passes. No functionality change intended. llvm-svn: 182299
-
- May 18, 2013
-
-
Arnold Schwaighofer authored
We might encouter single edge PHIs - handle them with an identity select. Fixes PR15990. llvm-svn: 182199
-
- May 17, 2013
-
-
Matt Arsenault authored
llvm-svn: 182164
-
Benjamin Kramer authored
llvm-svn: 182100
-
- May 16, 2013
-
-
Evgeniy Stepanov authored
They are always defined in the main executable. llvm-svn: 181994
-
Arnold Schwaighofer authored
We only want to check this once, not for every conditional block in the loop. No functionality change (except that we don't perform a check redudantly anymore). llvm-svn: 181942
-
- May 15, 2013
-
-
Michael Gottesman authored
[objc-arc] Fixed a spelling error and made the statistic descriptions be consistent about their usage of periods. llvm-svn: 181901
-
Arnold Schwaighofer authored
No functionality change. llvm-svn: 181862
-
Arnold Schwaighofer authored
InstCombine can be uncooperative to vectorization and sink loads into conditional blocks. This prevents vectorization. Undo this optimization if there are unconditional memory accesses to the same addresses in the loop. radar://13815763 llvm-svn: 181860
-
Sylvestre Ledru authored
llvm-svn: 181848
-
- May 14, 2013
-
-
Manman Ren authored
CXAAtExitFn was set outside a loop and before optimizations where functions can be deleted. This patch will set CXAAtExitFn inside the loop and after optimizations. Seg fault when running LTO because of accesses to a deleted function. rdar://problem/13838828 llvm-svn: 181838
-
Michael Gottesman authored
llvm-svn: 181760
-
Arnold Schwaighofer authored
We used to give up if we saw two integer inductions. After this patch, we base further induction variables on the chosen one like we do in the reverse induction and pointer induction case. Fixes PR15720. radar://13851975 llvm-svn: 181746
-
Michael Gottesman authored
llvm-svn: 181745
-
Michael Gottesman authored
[objc-arc-opts] In the presense of an alloca unconditionally remove RR pairs if and only if we are both KnownSafeBU/KnownSafeTD rather than just either or. In the presense of a block being initialized, the frontend will emit the objc_retain on the original pointer and the release on the pointer loaded from the alloca. The optimizer will through the provenance analysis realize that the two are related (albiet different), but since we only require KnownSafe in one direction, will match the inner retain on the original pointer with the guard release on the original pointer. This is fixed by ensuring that in the presense of allocas we only unconditionally remove pointers if both our retain and our release are KnownSafe (i.e. we are KnownSafe in both directions) since we must deal with the possibility that the frontend will emit what (to the optimizer) appears to be unbalanced retain/releases. An example of the miscompile is: %A = alloca retain(%x) retain(%x) <--- Inner Retain store %x, %A %y = load %A ... DO STUFF ... release(%y) call void @use(%x) release(%x) <--- Guarding Release getting optimized to: %A = alloca retain(%x) store %x, %A %y = load %A ... DO STUFF ... release(%y) call void @use(%x) rdar://13750319 llvm-svn: 181743
-
- May 13, 2013
-
-
Matt Beaumont-Gay authored
Suppresses an unused-variable warning in -Asserts builds. llvm-svn: 181733
-
Michael Gottesman authored
[objc-arc-opts] Add comment to BBState making it clear that get{TopDown,BottomUp}PtrState will create a new PtrState object if it does not find a PtrState for Arg. llvm-svn: 181726
-
Michael Gottesman authored
This makes the statistics gathering completely independent of the actual optimization occuring, preventing any sort of bleeding over from occuring. Additionally, it simplifies a switch statement in the non-statistic gathering case. llvm-svn: 181719
-
Duncan Sands authored
read in asserts. llvm-svn: 181689
-
Nadav Rotem authored
llvm-svn: 181684
-
Nadav Rotem authored
The external user does not have to be in lane #0. We have to save the lane for each scalar so that we know which vector lane to extract. llvm-svn: 181674
-
Nadav Rotem authored
SLPVectorizer: Clear the map that maps between scalars to vectors after each round of vectorization. Testcase in the next commit. llvm-svn: 181673
-
- May 12, 2013
-
-
David Majnemer authored
There are two transforms in visitUrem that conflict with each other. *) One, if a divisor is a power of two, subtracts one from the divisor and turns it into a bitwise-and. *) The other unwraps both operands if they are surrounded by zext instructions. Flipping the order allows the subtraction to go beneath the sign extension. llvm-svn: 181668
-
Arnold Schwaighofer authored
Use the widest induction type encountered for the cannonical induction variable. We used to turn the following loop into an empty loop because we used i8 as induction variable type and truncated 1024 to 0 as trip count. int a[1024]; void fail() { int reverse_induction = 1023; unsigned char forward_induction = 0; while ((reverse_induction) >= 0) { forward_induction++; a[reverse_induction] = forward_induction; --reverse_induction; } } radar://13862901 llvm-svn: 181667
-
Arnold Schwaighofer authored
No functionality change intended. llvm-svn: 181666
-
Arnold Schwaighofer authored
No functionality change intended. llvm-svn: 181665
-
- May 11, 2013
-
-
David Majnemer authored
Use isKnownToBeAPowerOfTwo in visitUrem so that we may more aggressively fold away urem instructions. llvm-svn: 181661
-
Nadav Rotem authored
For example: bar() { int a = A[i]; int b = A[i+1]; B[i] = a; B[i+1] = b; foo(a); <--- a is used outside the vectorized expression. } llvm-svn: 181648
-
Nadav Rotem authored
llvm-svn: 181647
-
- May 10, 2013
-
-
Benjamin Kramer authored
The shift amount may be larger than the type leading to undefined behavior. Limit the transform to constant shift amounts. While there update the bits to clear in the result which may enable additional optimizations. PR15959. llvm-svn: 181604
-
Benjamin Kramer authored
PR15952. llvm-svn: 181586
-
- May 09, 2013
-
-
Dmitri Gribenko authored
llvm-svn: 181551
-
Shuxin Yang authored
iteration. This on step toward non-iterative GVN. My local hack suggests that getting rid of iteration will speedup GVN by 30%+ on a medium sized input (2k LOC, C++). I cannot explain why not 2x or more at this moment. llvm-svn: 181532
-
Rafael Espindola authored
When we replace an internal alias with its target, be careful not to replace the entry in llvm.used (and llvm.compiler_used). llvm-svn: 181524
-