- Jul 27, 2013
-
-
Nadav Rotem authored
SLP Vectorizer: Don't vectorize really short chains because they are already handled by the SelectionDAG store-vectorizer, which does a better job of deciding when to vectorize. llvm-svn: 187267
-
Nadav Rotem authored
SLP Vectorizer: Disable the vectorization of non-power-of-two chains, such as <3 x float>, because we don't have a good cost model for these types. llvm-svn: 187265
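
The restriction described above amounts to a power-of-two test on the chain length. A minimal sketch in Python, not the vectorizer's actual code; the function name `is_power_of_two` is hypothetical:

```python
def is_power_of_two(n):
    """Return True when n is a positive power of two (1, 2, 4, 8, ...)."""
    # A power of two has exactly one bit set, so clearing the lowest
    # set bit with n & (n - 1) leaves zero.
    return n > 0 and (n & (n - 1)) == 0


# A <4 x float> chain passes the check; a <3 x float> chain is rejected.
chain_lengths = [3, 4]
accepted = [n for n in chain_lengths if is_power_of_two(n)]
```
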
-
Owen Anderson authored
llvm-svn: 187253
-
- Jul 26, 2013
-
-
Owen Anderson authored
also worthwhile for it to look through FP extensions and truncations, whose application commutes with fneg. llvm-svn: 187249
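
The claim that negation commutes with FP truncation can be checked numerically. This is only an illustrative sketch in Python, assuming the default round-to-nearest mode; the helper `fptrunc` is a hypothetical stand-in for a double-to-float truncation and is not LLVM code. Extension from float to double is exact, so the same property holds there trivially.

```python
import struct


def fptrunc(x):
    """Round a binary64 value to the nearest binary32 value."""
    return struct.unpack("f", struct.pack("f", x))[0]


samples = [0.1, 1.0 / 3.0, 2.5, 1e-10, 123456.789]

# Round-to-nearest is symmetric around zero, so truncating the negated
# value gives the same result as negating the truncated value.
commutes = all(fptrunc(-x) == -fptrunc(x) for x in samples)
```
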
-
Stephen Lin authored
llvm-svn: 187225
-
Chandler Carruth authored
robust. It now uses an InstVisitor and worklist to actually walk the uses of the Alloca transitively and detect the pattern which we can directly promote: loads & stores of the whole alloca and instructions we can completely ignore. Also, with this new implementation, teach both the predicate for testing whether we can promote and the promotion engine itself to use the same code so we no longer have strange divergence between the two code paths. I've added some silly test cases to demonstrate that we can handle slightly more degenerate code patterns now. See below for why this is even interesting. Performance impact: roughly 1% regression in the performance of SROA or ScalarRepl on a large C++-ish test case where most of the allocas are basically ready for promotion. The reason is silly redundant work that I've left FIXMEs for and which I'll address in the next commit. I wanted to separate this commit as it changes the behavior. Once the redundant work in removing the dead uses of the alloca is fixed, this code appears to be faster than the old version. =] So why is this useful? Because the previous requirement for promotion required a *specific* visit pattern of the uses of the alloca to verify: we *had* to look for no more than 1 intervening use. The end goal is to have SROA automatically detect when an alloca is already promotable and directly hand it to the mem2reg machinery rather than trying to partition and rewrite it. This is a 25% or more performance improvement for SROA, and a significant chunk of the delta between it and ScalarRepl. To get there, we need to make mem2reg actually capable of promoting allocas which *look* promotable to SROA without having SROA do tons of work to massage the code into just the right form. This is actually the tip of the iceberg. There are tremendous potential savings we can realize here by de-duplicating work between mem2reg and SROA. llvm-svn: 187191
-
Bill Schmidt authored
This patch provides basic support for powerpc64le as an LLVM target. However, use of this target will not actually generate little-endian code. Instead, use of the target will cause the correct little-endian built-in defines to be generated, so that code that tests for __LITTLE_ENDIAN__, for example, will be correctly parsed for syntax-only testing. Code generation will otherwise be the same as powerpc64 (big-endian), for now. The patch leaves open the possibility of creating a little-endian PowerPC64 back end, but there is no immediate intent to create such a thing. The LLVM portions of this patch simply add ppc64le coverage everywhere that ppc64 coverage currently exists. There is nothing of any import worth testing until such time as little-endian code generation is implemented. In the corresponding Clang patch, there is a new test case variant to ensure that correct built-in defines for little-endian code are generated. llvm-svn: 187179
-
- Jul 25, 2013
-
-
Rafael Espindola authored
The language reference says that: "If a symbol appears in the @llvm.used list, then the compiler, assembler, and linker are required to treat the symbol as if there is a reference to the symbol that it cannot see" Since even the linker cannot see the reference, we must assume that the reference can be using the symbol table. For example, a user can add __attribute__((used)) to a debug helper function like dump and use it from a debugger. llvm-svn: 187103
-
Nick Lewycky authored
llvm-svn: 187099
-
Rafael Espindola authored
Thanks to Nick Lewycky for noticing it. llvm-svn: 187098
-
- Jul 24, 2013
-
-
Benjamin Kramer authored
While there, shrink a dangerously large SmallPtrSet. llvm-svn: 187050
-
Chandler Carruth authored
schedule an alloca for another iteration in SROA. This only showed up with a mixture of promotable and unpromotable selects and phis. Added a test case for this. llvm-svn: 187031
-
Chandler Carruth authored
pending speculation for a phi node. The problem here is that we were using growth of the speculation set as an indicator of whether speculation would occur, and if the phi node is already in the set we don't see it grow. This is a symptom of the fact that this signal is a total hack. Unfortunately, I couldn't really come up with a non-hacky way of signaling that promotion remains valid *after* speculation occurs, such that we only speculate when all else looks good for promotion. In the end, I went with at least a much more explicit approach of doing the work of queuing inside the phi and select processing and setting a preposterously named flag to convey that we're in the special state of requiring speculation before promotion. Thanks to Richard Trieu and Nick Lewycky for the excellent work reducing a testcase for this from a pretty giant, nasty assert in a big application. =] The testcase was excellent. llvm-svn: 187029
-
Matt Arsenault authored
llvm-svn: 186997
-
- Jul 23, 2013
-
-
Nick Lewycky authored
llvm-svn: 186893
-
Jakub Staszak authored
llvm-svn: 186892
-
Jakub Staszak authored
llvm-svn: 186890
-
Nadav Rotem authored
When we vectorize across multiple basic blocks we may vectorize PHINodes that create a cycle. We already break the cycle on phi-nodes, but arithmetic operations are still duplicated. This patch adds code that checks if the operation that we are vectorizing was vectorized during the visit of the operands and uses this value if it can. llvm-svn: 186883
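
The reuse check described above can be pictured with a toy graph walk. This is a minimal sketch in Python and not the SLP vectorizer's implementation; `Node`, `vectorize`, the `cache` dictionary and the `created` list are all hypothetical names introduced for illustration.

```python
class Node:
    """Toy IR node: kind is 'phi', 'arith' or 'leaf'."""

    def __init__(self, name, kind, operands=None):
        self.name = name
        self.kind = kind
        self.operands = operands or []


def vectorize(node, cache, created):
    """Return the vector value for node, creating it at most once."""
    if node in cache:
        return cache[node]
    if node.kind == "phi":
        # Register the phi before visiting operands to break the cycle.
        cache[node] = "vec_" + node.name
        created.append(cache[node])
        for op in node.operands:
            vectorize(op, cache, created)
        return cache[node]
    for op in node.operands:
        vectorize(op, cache, created)
    # The operand visit may have come back around through a phi and
    # already vectorized this node; reuse that value if so.
    if node in cache:
        return cache[node]
    cache[node] = "vec_" + node.name
    created.append(cache[node])
    return cache[node]


# Build the cycle: add = phi + one, phi = [zero, add].
zero = Node("zero", "leaf")
one = Node("one", "leaf")
phi = Node("phi", "phi")
add = Node("add", "arith", [phi, one])
phi.operands = [zero, add]

created = []
result = vectorize(add, {}, created)
```

Without the second cache lookup, the walk starting at `add` would create `vec_add` twice: once while visiting the phi's operands and once on returning to the outer call.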
-
Jakub Staszak authored
llvm-svn: 186880
-
- Jul 22, 2013
-
-
Jakub Staszak authored
llvm-svn: 186877
-
Matt Arsenault authored
llvm-svn: 186858
-
Nadav Rotem authored
Fix an obvious typo in the loop vectorizer where the cost model uses the wrong variable. The variable BlockCost is ignored. We don't have tests for the effect of if-conversion loops because it requires a big test (that includes if-converted loops) and it is difficult to find and balance a loop to do the right thing. llvm-svn: 186845
-
Nadav Rotem authored
llvm-svn: 186808
-
- Jul 21, 2013
-
-
Benjamin Kramer authored
llvm-svn: 186790
-
Chandler Carruth authored
to iterate over. llvm-svn: 186788
-
Nadav Rotem authored
llvm-svn: 186786
-
Chandler Carruth authored
helper function. This leaves both trivial cases handled entirely in helper functions and merely manages the list of allocas to process in the run method. The next step will be to handle all of the trivial promotion work prior to even creating the core class and the subsequent simplifications that enables. llvm-svn: 186784
-
Chandler Carruth authored
a single block into the helper routine. This takes advantage of the fact that we can directly replace uses prior to any store with undef to simplify matters and unconditionally promote allocas only used within one block. I've removed the special handling for the case of no stores existing. This has no semantic effect but might slow things down. I'll fix that in a later patch when I refactor this entire thing to make the different cases easier to manage. llvm-svn: 186783
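
The single-block case described above can be sketched as one forward walk over the block. This is an illustrative Python model, not the mem2reg code; the instruction tuples and the name `promote_single_block` are hypothetical.

```python
UNDEF = "undef"


def promote_single_block(instructions):
    """Map each load to the value it should be replaced with.

    instructions is a list, in block order, of ("store", value) or
    ("load", load_name) tuples that all refer to the same alloca.
    """
    current = UNDEF  # nothing has been stored yet
    replacements = {}
    for kind, operand in instructions:
        if kind == "store":
            current = operand
        elif kind == "load":
            # Loads before any store see undef; later loads see the
            # most recently stored value.
            replacements[operand] = current
    return replacements


block = [("load", "a"), ("store", "x"), ("load", "b"),
         ("store", "y"), ("load", "c")]
replacements = promote_single_block(block)
```

Because a block with no stores simply maps every load to undef, this sketch needs no separate branch for that case.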
-
Chandler Carruth authored
llvm-svn: 186782
-
Chandler Carruth authored
handles the general cases. The hope is to refactor this so that we don't end up building the entire class for the trivial cases. I also want to lift a lot of the early pre-processing in the initial segment of run() into a separate routine, and really none of it needs to happen inside the primary promotion class. These routines in particular used none of the actual state in the promotion class, so they don't really make sense as members. llvm-svn: 186781
-
Chandler Carruth authored
This struct is nicely independent of everything else, and we already needed a forward declaration here. It's simpler to just define it immediately. llvm-svn: 186780
-
Chandler Carruth authored
llvm-svn: 186779
-
Rafael Espindola authored
GlobalOpt simplifies llvm.compiler.used by removing any members that are also in the more strict llvm.used. Handle the special case where llvm.compiler.used becomes empty. llvm-svn: 186778
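
The simplification described above is essentially a set difference with an empty-result case. A minimal sketch in Python, not GlobalOpt's code; the function name is hypothetical, and returning `None` to mean "drop the list entirely" is an assumption about how the empty case is handled.

```python
def simplify_compiler_used(used, compiler_used):
    """Drop members of compiler_used that already appear in used.

    Returns the remaining members in their original order, or None
    when nothing is left (assumed here to mean the list is removed).
    """
    strict = set(used)
    remaining = [sym for sym in compiler_used if sym not in strict]
    return remaining or None


partial = simplify_compiler_used(["dump"], ["dump", "helper"])
emptied = simplify_compiler_used(["dump", "helper"], ["helper", "dump"])
```
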
-
Chandler Carruth authored
that ensued from that. llvm-svn: 186777
-
Chandler Carruth authored
pattern and conform to the naming conventions. llvm-svn: 186776
-
Chandler Carruth authored
subsequent changes don't introduce inconsistencies. llvm-svn: 186775
-
Chandler Carruth authored
those baked into DenseMap now. llvm-svn: 186773
-
Chandler Carruth authored
functionality changed. llvm-svn: 186772
-
- Jul 20, 2013
-
-
Benjamin Kramer authored
While there, replace an explicit struct with std::mem_fun. llvm-svn: 186761
-
Stephen Lin authored
llvm-svn: 186759
-