- Apr 30, 2012
-
-
Bill Wendling authored
Allow the "SplitCriticalEdge" function to split the edge to a landing pad. If the pass is *sure* that it thinks it knows what it's doing, then it may go ahead and specify that the landing pad can have its critical edge split. The loop unswitch pass is one of these passes. It will split the critical edges of all edges coming from a loop to a landing pad not within the loop. Doing so will retain important loop analysis information, such as loop simplify. llvm-svn: 155817
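For readers unfamiliar with the term, a critical edge runs from a block with more than one successor to a block with more than one predecessor, and splitting it means routing it through a fresh block. The sketch below is a toy model only: the dictionary CFG, the block names, and the helpers `predecessors`, `is_critical_edge`, and `split_edge` are all hypothetical and are not LLVM's API. It also ignores the extra constraints that make landing pads awkward to split in real IR.

```python
# Toy CFG (hypothetical): block name -> list of successor block names.
cfg = {
    "header": ["body", "exit"],
    "body": ["header", "lpad"],   # invoke: normal edge to header, unwind edge to lpad
    "other": ["lpad"],            # a second predecessor of the landing pad
    "lpad": [],
    "exit": [],
}

def predecessors(cfg, block):
    """Return the blocks that have an edge into `block`."""
    return [b for b, succs in cfg.items() if block in succs]

def is_critical_edge(cfg, src, dst):
    """An edge is critical if src has several successors and dst several predecessors."""
    return len(cfg[src]) > 1 and len(predecessors(cfg, dst)) > 1

def split_edge(cfg, src, dst):
    """Route the edge src -> dst through a new block and return its name."""
    new_block = f"{src}.{dst}.split"
    cfg[src] = [new_block if s == dst else s for s in cfg[src]]
    cfg[new_block] = [dst]
    return new_block

was_critical = is_critical_edge(cfg, "body", "lpad")
new_block = split_edge(cfg, "body", "lpad")
```

After the split, the edge leaving `body` ends in a block with a single predecessor, so it is no longer critical in this model.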
-
Bill Wendling authored
llvm-svn: 155816
-
Bill Wendling authored
llvm-svn: 155813
-
Rafael Espindola authored
inputs. llvm-svn: 155809
-
- Apr 27, 2012
-
-
Hal Finkel authored
Target specific types should not be vectorized. As a practical matter, these types are already register matched (at least in the x86 case), and codegen does not always work correctly (at least in the ppc case, and this is not worth fixing because ppc_fp128 is currently broken and will probably go away soon). llvm-svn: 155729
-
David Blaikie authored
llvm-svn: 155727
-
Dan Gohman authored
llvm-svn: 155725
-
Mon P Wang authored
The limit is set to an arbitrary 1000 recursion depth to avoid stack overflow issues. <rdar://problem/11286839>. llvm-svn: 155722
-
Kostya Serebryany authored
llvm-svn: 155701
-
Kostya Serebryany authored
llvm-svn: 155698
-
Jakob Stoklund Olesen authored
The required checks are moved to ChainInstruction() itself and the policy decisions are moved to IVChain::isProfitableInc(). Also cache the ExprBase in IVChain to avoid frequent recomputations. No functional change intended. llvm-svn: 155676
-
Jakob Stoklund Olesen authored
No functional change intended. llvm-svn: 155675
-
Chad Rosier authored
(x & y) | (x ^ y) -> x | y
(x & y) + (x ^ y) -> x | y
Patch by Manman Ren. rdar://10770603 llvm-svn: 155674
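Both rewrites rest on the fact that `x & y` and `x ^ y` never have a set bit in common, so OR-ing them or adding them (with no carries) gives `x | y`. A minimal sketch that checks this exhaustively for small fixed-width integers follows; the function name `check_or_identities` and the 8-bit width are illustrative choices, not part of the patch.

```python
def check_or_identities(bits=8):
    """Exhaustively check both identities for all pairs of `bits`-wide values."""
    mask = (1 << bits) - 1
    for x in range(1 << bits):
        for y in range(1 << bits):
            # (x & y) | (x ^ y) == x | y
            assert ((x & y) | (x ^ y)) == (x | y)
            # (x & y) + (x ^ y) == x | y, with the sum truncated to the bit width
            assert (((x & y) + (x ^ y)) & mask) == (x | y)
    return True

ok = check_or_identities(8)
```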
-
- Apr 26, 2012
-
-
Chandler Carruth authored
elements to minimize the number of multiplies required to compute the final result. This uses a heuristic to attempt to form near-optimal binary exponentiation-style multiply chains. While there are some cases it misses, it seems to do at least a decent job on a very diverse range of inputs. Initial benchmarks show no interesting regressions, and an 8% improvement on SPASS. Let me know if any other interesting results (in either direction) crop up! Credit to Richard Smith for the core algorithm, and helping code the patch itself. llvm-svn: 155616
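As a rough illustration of what a "binary exponentiation-style multiply chain" buys, the sketch below computes a power by repeated squaring and counts the multiplies it performs. It is plain square-and-multiply, not the heuristic from the patch; the name `pow_by_squaring` is hypothetical.

```python
def pow_by_squaring(x, n):
    """Return (x**n, multiplies_used) for n >= 1 using square-and-multiply."""
    assert n >= 1
    result = None   # no factor accumulated yet
    base = x        # holds x**(2**k) on the k-th iteration
    mults = 0
    while n:
        if n & 1:
            if result is None:
                result = base          # first factor costs no multiply
            else:
                result = result * base
                mults += 1
        n >>= 1
        if n:                          # only square if more bits remain
            base = base * base
            mults += 1
    return result, mults

value8, mults8 = pow_by_squaring(3, 8)     # 3 multiplies instead of 7
value15, mults15 = pow_by_squaring(2, 15)  # 6 multiplies instead of 14
```

The binary method is not always optimal: for an exponent of 15 it uses 6 multiplies, while a chain through x^3, x^6, and x^12 needs only 5. That gap is consistent with the message calling the result "near-optimal".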
-
- Apr 25, 2012
-
-
Jakob Stoklund Olesen authored
llvm-svn: 155567
-
Lang Hames authored
in poor taste. Talking through some alternate solutions with Chandler. llvm-svn: 155530
-
Dan Gohman authored
of a precise count. Also, move RRInfo's Partial field into PtrState, now that it won't increase the size. llvm-svn: 155513
-
Dan Gohman authored
These lists exclude invoke unwind edges and loop backedges which are being ignored. This makes it easier to ignore them consistently. llvm-svn: 155500
-
- Apr 24, 2012
-
-
Lang Hames authored
<rdar://problem/11291436>. llvm-svn: 155468
-
- Apr 23, 2012
-
-
Jakob Stoklund Olesen authored
Original commit message: Defer some shl transforms to DAGCombine.
The shl instruction is used to represent multiplication by a constant power of two as well as bitwise left shifts. Some InstCombine transformations would turn an shl instruction into a bit mask operation, making it difficult for later analysis passes to recognize the constant multiplication.
Disable those shl transformations, deferring them to DAGCombine time. An 'shl X, C' instruction is now treated mostly the same way as 'mul X, C'.
These transformations are deferred:
  (X >>? C) << C   --> X & (-1 << C)  (When X >> C has multiple uses)
  (X >>? C1) << C2 --> X << (C2-C1) & (-1 << C2)  (When C2 > C1)
  (X >>? C1) << C2 --> X >>? (C1-C2) & (-1 << C2)  (When C1 > C2)
The corresponding exact transformations are preserved, just like div-exact + mul:
  (X >>?,exact C) << C   --> X
  (X >>?,exact C1) << C2 --> X << (C2-C1)
  (X >>?,exact C1) << C2 --> X >>?,exact (C1-C2)
The disabled transformations could also prevent the instruction selector from recognizing rotate patterns in hash functions and cryptographic primitives. I have a test case for that, but it is too fragile. llvm-svn: 155362
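The three deferred rewrites can be sanity-checked numerically. The sketch below models 32-bit values with Python integers and reads `>>?` as a logical shift right only. The helper names (`shl`, `lshr`, `high_mask`, `check_deferred_shl_identities`) and the 32-bit width are assumptions made for illustration, not anything taken from InstCombine.

```python
import random

WIDTH = 32
MASK = (1 << WIDTH) - 1

def shl(x, c):
    """Left shift truncated to WIDTH bits."""
    return (x << c) & MASK

def lshr(x, c):
    """Logical right shift of a WIDTH-bit value."""
    return (x & MASK) >> c

def high_mask(c):
    """The constant (-1 << c) truncated to WIDTH bits."""
    return (-1 << c) & MASK

def check_deferred_shl_identities(trials=2000, seed=0):
    """Randomly test (X >> C1) << C2 against the masked forms listed above."""
    rng = random.Random(seed)
    for _ in range(trials):
        x = rng.getrandbits(WIDTH)
        c1 = rng.randrange(WIDTH)
        c2 = rng.randrange(WIDTH)
        lhs = shl(lshr(x, c1), c2)
        if c1 == c2:
            rhs = x & high_mask(c1)
        elif c2 > c1:
            rhs = shl(x, c2 - c1) & high_mask(c2)
        else:
            rhs = lshr(x, c1 - c2) & high_mask(c2)
        assert lhs == rhs
    return True

identities_hold = check_deferred_shl_identities()
```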
-
Alexander Potapenko authored
Fix issue 67 by checking that the interface functions weren't redefined in the compiled source file. llvm-svn: 155346
-
Kostya Serebryany authored
llvm-svn: 155341
-
- Apr 20, 2012
-
-
Jakob Stoklund Olesen authored
While the patch was perfect and defect free, it exposed a really nasty bug in X86 SelectionDAG that caused an llc crash when compiling lencod. I'll put the patch back in after fixing the SelectionDAG problem. llvm-svn: 155181
-
Bill Wendling authored
llvm-svn: 155166
-
- Apr 19, 2012
-
-
Dan Gohman authored
loop repeatedly making the same change. This is for rdar://11256239. llvm-svn: 155160
-
Jakob Stoklund Olesen authored
The shl instruction is used to represent multiplication by a constant power of two as well as bitwise left shifts. Some InstCombine transformations would turn an shl instruction into a bit mask operation, making it difficult for later analysis passes to recognize the constant multiplication.
Disable those shl transformations, deferring them to DAGCombine time. An 'shl X, C' instruction is now treated mostly the same way as 'mul X, C'.
These transformations are deferred:
  (X >>? C) << C   --> X & (-1 << C)  (When X >> C has multiple uses)
  (X >>? C1) << C2 --> X << (C2-C1) & (-1 << C2)  (When C2 > C1)
  (X >>? C1) << C2 --> X >>? (C1-C2) & (-1 << C2)  (When C1 > C2)
The corresponding exact transformations are preserved, just like div-exact + mul:
  (X >>?,exact C) << C   --> X
  (X >>?,exact C1) << C2 --> X << (C2-C1)
  (X >>?,exact C1) << C2 --> X >>?,exact (C1-C2)
The disabled transformations could also prevent the instruction selector from recognizing rotate patterns in hash functions and cryptographic primitives. I have a test case for that, but it is too fragile. llvm-svn: 155136
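The rotate pattern mentioned at the end is usually written as a left shift OR-ed with a complementary right shift. The sketch below shows that idiom for an assumed 32-bit width; `rotl32` is a hypothetical helper, and the code says nothing about how the instruction selector actually matches the pattern.

```python
MASK32 = 0xFFFFFFFF

def rotl32(x, r):
    """Rotate a 32-bit value left by r bits using the shl / lshr / or idiom."""
    r %= 32
    x &= MASK32
    # The high bits shifted out by the left shift come back in via the right shift.
    return ((x << r) | (x >> (32 - r))) & MASK32

rotated = rotl32(0x12345678, 8)
```

If one of the two shifts were rewritten into a mask operation first, the shl / lshr / or shape would no longer be visible, which is presumably the concern the message raises.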
-
Dan Gohman authored
a function with arguments. This fixes rdar://11265785. llvm-svn: 155073
-
- Apr 18, 2012
-
-
Bill Wendling authored
If the loop contains invoke instructions, whose unwind edge escapes the loop, then don't try to unswitch the loop. Doing so may cause the unwind edge to be split, which not only is non-trivial but doesn't preserve loop simplify information. Fixes PR12573 llvm-svn: 154987
-
Andrew Trick authored
This introduces a threshold of 200 IV Users, which is very conservative but should be sufficient to avoid serious compile time sink or stack overflow. The llvm test-suite with LTO never exceeds 190 users per loop. The bug doesn't relate to a specific type of loop. Checking in an arbitrary giant loop as a unit test would be silly. Fixes rdar://11262507. llvm-svn: 154983
-
Joe Groff authored
also fix SimplifyLibCalls to use TLI rather than compile-time conditionals to enable optimizations on floor, ceil, round, rint, and nearbyint llvm-svn: 154960
-
- Apr 16, 2012
-
-
Hal Finkel authored
llvm-svn: 154810
-
Bill Wendling authored
llvm-svn: 154793
-
Hal Finkel authored
llvm-svn: 154787
-
- Apr 14, 2012
-
-
Hal Finkel authored
When vectorizing pointer types it is important to realize that potential pairs cannot be connected via the address pointer argument of a load or store. This is because even after vectorization, the address is still a scalar because the address of the higher half of the pair is implicit from the address of the lower half (it need not be, and should not be, explicitly computed). llvm-svn: 154735
-
Hal Finkel authored
llvm-svn: 154734
-
- Apr 13, 2012
-
-
Hal Finkel authored
llvm-svn: 154700
-
Dan Gohman authored
llvm-svn: 154687
-
Dan Gohman authored
their argument as "escape" points for objc_retainBlock optimization. This fixes rdar://11229925. llvm-svn: 154682
-
Hal Finkel authored
As has been suggested by Duncan and others, Early-CSE and GVN should do similar redundancy elimination, but Early-CSE is much less expensive. Most of my autovectorization benchmarks show a performance regression, but all of these are < 0.1%, and so I think that it is still worth using the less expensive pass. llvm-svn: 154673
-
Dan Gohman authored
library return value optimization for phi uses. Even when the phi itself is not dominated, the specific use may be dominated. llvm-svn: 154647
-