- Sep 03, 2013
-
-
Matt Arsenault authored
This is another one that doesn't matter much, but uses the right GEP index types in the first place. llvm-svn: 189854
-
Matt Arsenault authored
This doesn't actually matter, since alloca is always 0 address space, but this is more consistent. llvm-svn: 189853
-
Yi Jiang authored
1) If the width of vectorization list candidate is bigger than vector reg width, we will break it down to fit the vector reg. 2) We do not vectorize the width which is not power of two. The performance result shows it will help some spec benchmarks. mesa improved 6.97% and ammp improved 1.54%. llvm-svn: 189830
-
Evgeniy Stepanov authored
llvm-svn: 189796
-
Evgeniy Stepanov authored
Select condition shadow was being ignored resulting in false negatives. This change OR-s sign-extended condition shadow into the result shadow. llvm-svn: 189785
-
- Aug 31, 2013
-
-
Benjamin Kramer authored
The existing code missed some edge cases when e.g. we're going to emit sqrtf but only the availability of sqrt was checked. This happens on odd platforms like windows. llvm-svn: 189724
-
- Aug 30, 2013
-
-
Bill Wendling authored
llvm-svn: 189697
-
Benjamin Kramer authored
PR17026. Also avoid undefined shifts and shift amounts larger than 64 bits (those are always undef because we can't represent integer types that large). llvm-svn: 189672
-
Bill Wendling authored
llvm-svn: 189632
-
- Aug 29, 2013
-
-
Hal Finkel authored
Revert unintentional commit (of an unreviewed change). Original commit message: Add getUnrollingPreferences to TTI Allow targets to customize the default behavior of the generic loop unrolling transformation. This will be used by the PowerPC backend when targeting the A2 core (which is in-order with a deep pipeline), and using more aggressive defaults is important. llvm-svn: 189566
-
Hal Finkel authored
Allow targets to customize the default behavior of the generic loop unrolling transformation. This will be used by the PowerPC backend when targeting the A2 core (which is in-order with a deep pipeline), and using more aggressive defaults is important. llvm-svn: 189565
-
Nadav Rotem authored
Vectorizer/PassManager: I am working on moving the vectorizer out of the SCC passes. This patch moves the SLP-vectorizer and BB-vectorizer back into SCC passes for two reasons: 1. They are a kind of cannonicalization. 2. The performance measurements show that it is better to keep them in. There should be no functional change if you are not enabling the LateVectorization mode. llvm-svn: 189539
-
Matt Arsenault authored
llvm-svn: 189524
-
- Aug 28, 2013
-
-
Hal Finkel authored
When unrolling is disabled in the pass manager, the loop vectorizer should also not unroll loops. This will allow the -fno-unroll-loops option in Clang to behave as expected (even for vectorizable loops). The loop vectorizer's -force-vector-unroll option will (continue to) override the pass-manager setting (including -force-vector-unroll=0 to force use of the internal auto-selection logic). In order to test this, I added a flag to opt (-disable-loop-unrolling) to force disable unrolling through opt (the analog of -fno-unroll-loops in Clang). Also, this fixes a small bug in opt where the loop vectorizer was enabled only after the pass manager populated the queue of passes (the global_alias.ll test needed a slight update to the RUN line as a result of this fix). llvm-svn: 189499
-
Alexey Samsonov authored
llvm-svn: 189473
-
Peter Collingbourne authored
Differential Revision: http://llvm-reviews.chandlerc.com/D1503 llvm-svn: 189408
-
- Aug 27, 2013
-
-
Nadav Rotem authored
This patch merges LoopVectorize of InnerLoopVectorizer and InnerLoopUnroller by adding checks for VF=1. This helps in erasing the Unroller code that is almost identical to the InnerLoopVectorizer code. llvm-svn: 189391
-
Michael Gottesman authored
Noticed by Stephen Checkoway <s@pahtak.org>. llvm-svn: 189312
-
Matt Arsenault authored
The builder inserts from before the insert point, not after, so this would insert before the last instruction in the bundle instead of after it. I'm not sure if this can actually be a problem with any of the current insertions. llvm-svn: 189285
-
Nadav Rotem authored
This patch enables unrolling of loops when vectorization is legal but not profitable. We add a new class InnerLoopUnroller, that extends InnerLoopVectorizer and replaces some of the vector-specific logic with scalars. This patch does not introduce any runtime regressions and improves the following workloads: SingleSource/Benchmarks/Shootout/matrix -22.64% SingleSource/Benchmarks/Shootout-C++/matrix -13.06% External/SPEC/CINT2006/464_h264ref/464_h264ref -3.99% SingleSource/Benchmarks/Adobe-C++/simple_types_constant_folding -1.95% llvm-svn: 189281
-
- Aug 26, 2013
-
-
Yi Jiang authored
llvm-svn: 189265
-
Matt Arsenault authored
llvm-svn: 189264
-
Matt Arsenault authored
llvm-svn: 189234
-
Matt Arsenault authored
llvm-svn: 189233
-
- Aug 24, 2013
-
-
Matt Arsenault authored
llvm-svn: 189179
-
Benjamin Kramer authored
Replace instances of this scattered around the code base. llvm-svn: 189169
-
- Aug 23, 2013
-
-
Peter Collingbourne authored
llvm-svn: 189133
-
Evgeniy Stepanov authored
The code was erroneously reading overflow area shadow from the TLS slot, bypassing the local copy. Reading shadow directly from TLS is wrong, because it can be overwritten by a nested vararg call, if that happens before va_start. llvm-svn: 189104
-
Richard Sandiford authored
...so that it can be used for z too. Most of the code is the same. The only real change is to use TargetTransformInfo to test when a sqrt instruction is available. The pass is opt-in because at the moment it only handles sqrt. llvm-svn: 189097
-
Alexey Samsonov authored
llvm-svn: 189091
-
Michael Gottesman authored
Update StripDeadDebugInfo to use DebugInfoFinder so that it is no longer stale to the point of not working and more resilient to debug info changes. The current version of StripDeadDebugInfo became stale and no longer actually worked since it was expecting an older version of debug info. This patch updates it to use DebugInfoFinder and the modern DebugInfo classes as much as possible to make it more redundent to such changes. Additionally, the only place where that was avoided (the code where we replace the old sets with the new), I call verify on the DIContextUnit implying that if the format changes and my live set changes no longer make sense an assert will be hit. In order to ensure that that occurs I have included a test case. The actual stripping of the dead debug info follows the same strategy as was used before in this class: find the live set and replace the old set in the given compile unit (which may contain dead global variables/functions) with the new live one. llvm-svn: 189078
-
- Aug 22, 2013
-
-
Peter Collingbourne authored
DataFlowSanitizer: Replace non-instrumented aliases of instrumented functions, and vice versa, with wrappers. Differential Revision: http://llvm-reviews.chandlerc.com/D1442 llvm-svn: 189054
-
Peter Collingbourne authored
Differential Revision: http://llvm-reviews.chandlerc.com/D1441 llvm-svn: 189053
-
Peter Collingbourne authored
DFSan changes the ABI of each function in the module. This makes it possible for a function with the native ABI to be called with the instrumented ABI, or vice versa, thus possibly invoking undefined behavior. A simple way of statically detecting instances of this problem is to prepend the prefix "dfs$" to the name of each instrumented-ABI function. This will not catch every such problem; in particular function pointers passed across the instrumented-native barrier cannot be used on the other side. These problems could potentially be caught dynamically. Differential Revision: http://llvm-reviews.chandlerc.com/D1373 llvm-svn: 189052
-
Chandler Carruth authored
using GEPs. Previously, it used a number of different heuristics for analyzing the GEPs. Several of these were conservatively correct, but failed to fall back to SCEV even when SCEV might have given a reasonable answer. One was simply incorrect in how it was formulated. There was good code already to recursively evaluate the constant offsets in GEPs, look through pointer casts, etc. I gathered this into a form code like the SLP code can use in a previous commit, which allows all of this code to become quite simple. There is some performance (compile time) concern here at first glance as we're directly attempting to walk both pointers constant GEP chains. However, a couple of thoughts: 1) The very common cases where there is a dynamic pointer, and a second pointer at a constant offset (usually a stride) from it, this code will actually not do any unnecessary work. 2) InstCombine and other passes work very hard to collapse constant GEPs, so it will be rare that we iterate here for a long time. That said, if there remain performance problems here, there are some obvious things that can improve the situation immensely. Doing a vectorizer-pass-wide memoizer for each individual layer of pointer values, their base values, and the constant offset is likely to be able to completely remove redundant work and strictly limit the scaling of the work to scrape these GEPs. Since this optimization was not done on the prior version (which would still benefit from it), I've not done it here. But if folks have benchmarks that slow down it should be straight forward for them to add. I've added a test case, but I'm not really confident of the amount of testing done for different access patterns, strides, and pointer manipulation. llvm-svn: 189007
-
Matt Arsenault authored
llvm-svn: 188980
-
Michael Gottesman authored
llvm-svn: 188957
-
Michael Gottesman authored
llvm-svn: 188956
-
Yunzhong Gao authored
Replace "(255 & value)" with "(0xFF & value)" to improve clarity. llvm-svn: 188941
-
- Aug 21, 2013
-
-
Matt Arsenault authored
llvm-svn: 188926
-