- Oct 29, 2012
-
-
Duncan Sands authored
wrapper returns a vector of integers when passed a vector of pointers) by having getIntPtrType itself return a vector of integers in this case. Outside of this wrapper, I didn't find anywhere in the codebase that was relying on the old behaviour for vectors of pointers, so give this a whirl through the buildbots. llvm-svn: 166939
-
Nadav Rotem authored
Change the PassManagerBuilder (used by -O3) loop vectorizer flag from -vectorize to -vectorize-loops because we dont want to share the same flag as the bb-vectorizer. llvm-svn: 166937
-
Rafael Espindola authored
split module can see each other. If it is keeping a symbol that already has a non local linkage, it doesn't need to change it. llvm-svn: 166908
-
Rafael Espindola authored
output of both llvm-extract foo.ll -func=bar and llvm-extract foo.ll -func=bar -delete so the two new files could not be linked together anymore. With this change alias are handled almost like functions and global variables. Almost because with alias we cannot just clear the initializer/body, we have to create a new declaration and replace the alias with it. The net result is that now the output of the above commands can be linked even if foo.ll has aliases. llvm-svn: 166907
-
- Oct 27, 2012
-
-
Benjamin Kramer authored
I don't think this is possible with the current implementation but that may change eventually. llvm-svn: 166877
-
Benjamin Kramer authored
This turns loops like for (unsigned i = 0; i != n; ++i) p[i] = p[i+1]; into memmove, which has a highly optimized implementation in most libcs. This was really easy with the new DependenceAnalysis :) llvm-svn: 166875
-
Benjamin Kramer authored
Requires a lot less code and complexity on loop-idiom's side and the more precise analysis can catch more cases, like the one I included as a test case. This also fixes the edge-case miscompilation from PR9481. Compile time performance seems to be slightly worse, but this is mostly due to an extra LCSSA run scheduled by the PassManager and should be fixed there. llvm-svn: 166874
-
Hal Finkel authored
The monolithic interface for instruction costs has been split into several functions. This is the corresponding change. No functionality change is intended. llvm-svn: 166865
-
Nadav Rotem authored
1. Fix a bug in getTypeConversion. When a *simple* type is split, we need to return the type of the split result. 2. Change the maximum vectorization width from 4 to 8. 3. A test for both. llvm-svn: 166864
-
Nadav Rotem authored
Refactor the VectorTargetTransformInfo interface. Add getCostXXX calls for different families of opcodes, such as casts, arithmetic, cmp, etc. Port the LoopVectorizer to the new API. The LoopVectorizer now finds instructions which will remain uniform after vectorization. It uses this information when calculating the cost of these instructions. llvm-svn: 166836
-
- Oct 26, 2012
-
-
Rafael Espindola authored
list of externals. This makes sense since a shared library with no symbols can still be useful if it has static constructors. llvm-svn: 166795
-
Benjamin Kramer authored
This is currently true, but may change when DA grows more aggressive caching. Without this setting it's impossible to use DA from a LoopPass because DA is a function pass and cannot be properly scheduled in between LoopPasses. The LoopManager reacts to this with an infinite loop which made this really annoying to debug. llvm-svn: 166788
-
Benjamin Kramer authored
The LoopSimplify bug is pretty harmless because the loop goes from unanalyzable to analyzable but the LCSSA bug is very nasty. It only comes into play with a specific order of the LoopPassManager worklist and can cause actual miscompilations, when a SCEV refers to a value that has been replaced with PHI node. SCEVExpander may then insert code into the wrong place, either violating domination or randomly miscompiling stuff. Comes with an extensive test case reduced from the test-suite with bugpoint+SCEVValidator. llvm-svn: 166787
-
Hal Finkel authored
This change reflects VTTI refactoring; no functionality change intended. llvm-svn: 166752
-
Hal Finkel authored
Once vector-of-pointer support works, then this can be reverted. llvm-svn: 166741
-
Hal Finkel authored
This is needed so that perl's SHA can be compiled (otherwise BBVectorize takes far too long to find its fixed point). I'll try to come up with a reduced test case. llvm-svn: 166738
-
- Oct 25, 2012
-
-
Hal Finkel authored
This is the first of several steps to incorporate information from the new TargetTransformInfo infrastructure into BBVectorize. Two things are done here: 1. Target information is used to determine if it is profitable to fuse two instructions. This means that the cost of the vector operation must not be more expensive than the cost of the two original operations. Pairs that are not profitable are no longer considered (because current cost information is incomplete, for intrinsics for example, equal-cost pairs are still considered). 2. The 'cost savings' computed for the profitability check are also used to rank the DAGs that represent the potential vectorization plans. Specifically, for nodes of non-trivial depth, the cost savings is used as the node weight. The next step will be to incorporate the shuffle costs into the DAG weighting; this will give the edges of the DAG weights as well. Once that is done, when target information is available, we should be able to dispense with the depth heuristic. llvm-svn: 166716
-
Nadav Rotem authored
llvm-svn: 166715
-
Jakob Stoklund Olesen authored
The isValueEqualityComparison() guard at the top of SimplifySwitch() only applies to some of the possible transformations. The newer transformations work just fine on large switches, and the check on predecessor count is nonsensical. llvm-svn: 166710
-
Chandler Carruth authored
smaller integer loads and stores. The high-level motivation is that the frontend sometimes generates a single whole-alloca integer load or store during ABI lowering of splittable allocas. We need to be able to break this apart in order to see the underlying elements and properly promote them to SSA values. The hope is that this fixes some performance regressions on x86-32 with the new SROA pass. Unfortunately, this causes quite a bit of churn in the test cases, and bloats some IR that comes out. When we see an alloca that consists soley of bits and bytes being extracted and re-inserted, we now do some splitting first, before building widened integer "bucket of bits" representations. These are always well folded by instcombine however, so this shouldn't actually result in missed opportunities. If this splitting of all-integer allocas does cause problems (perhaps due to smaller SSA values going into the RA), we could potentially go to some extreme measures to only do this integer splitting trick when there are non-integer component accesses of an alloca, but discovering this is quite expensive: it adds yet another complete walk of the recursive use tree of the alloca. Either way, I will be watching build bots and LNT bots to see what fallout there is here. If anyone gets x86-32 numbers before & after this change, I would be very interested. llvm-svn: 166662
-
Nadav Rotem authored
Patch by Paul Redmond <paul.redmond@intel.com>. llvm-svn: 166649
-
Nadav Rotem authored
llvm-svn: 166643
-
Nadav Rotem authored
llvm-svn: 166642
-
Micah Villmow authored
llvm-svn: 166634
-
- Oct 24, 2012
-
-
Hal Finkel authored
GVN will now generate ptrtoint instructions for vectors of pointers. Fixes PR14166. llvm-svn: 166624
-
Nadav Rotem authored
llvm-svn: 166622
-
Nadav Rotem authored
llvm-svn: 166620
-
Micah Villmow authored
llvm-svn: 166607
-
Micah Villmow authored
Back out r166591, not sure why this made it through since I cancelled the command. Bleh, sorry about this! llvm-svn: 166596
-
Micah Villmow authored
llvm-svn: 166591
-
Micah Villmow authored
This checkin also adds in some tests that utilize these paths and updates some of the clients. llvm-svn: 166578
-
- Oct 23, 2012
-
-
Nadav Rotem authored
Use the AliasAnalysis isIdentifiedObj because it also understands mallocs and c++ news. PR14158. llvm-svn: 166491
-
Duncan Sands authored
llvm-svn: 166475
-
Duncan Sands authored
%V = mul i64 %N, 4 %t = getelementptr i8* bitcast (i32* %arr to i8*), i32 %V into %t1 = getelementptr i32* %arr, i32 %N %t = bitcast i32* %t1 to i8* incorporating the multiplication into the getelementptr. This happens all the time in dragonegg, for example for int foo(int *A, int N) { return A[N]; } because gcc turns this into byte pointer arithmetic before it hits the plugin: D.1590_2 = (long unsigned int) N_1(D); D.1591_3 = D.1590_2 * 4; D.1592_5 = A_4(D) + D.1591_3; D.1589_6 = *D.1592_5; return D.1589_6; The D.1592_5 line is a POINTER_PLUS_EXPR, which is turned into a getelementptr on a bitcast of A_4 to i8*, so this becomes exactly the kind of IR that the transform fires on. An analogous transform (with no testcases!) already existed for bitcasts of arrays, so I rewrote it to share code with this one. llvm-svn: 166474
-
Richard Smith authored
every TU where it's implicitly instantiated, even if there's an implicit instantiation for the same types available in another TU. llvm-svn: 166470
-
Julien Lerouge authored
llvm-svn: 166456
-
Julien Lerouge authored
llvm-svn: 166454
-
- Oct 22, 2012
-
-
Julien Lerouge authored
deterministic, replace it with a DenseMap<std::pair<unsigned, unsigned>, PHINode*> (we already have a map from BasicBlock to unsigned). <rdar://problem/12541389> llvm-svn: 166435
-
Nadav Rotem authored
Fix by Shivarama Rao <Shivarama.Rao@amd.com> llvm-svn: 166427
-
Argyrios Kyrtzidis authored
llvm-svn: 166424
-