- Jan 27, 2014
-
-
Tobias Grosser authored
This seems helpful not only for Polly; it should also help reduce bugs in general. llvm-svn: 200225
-
Tobias Grosser authored
llvm-svn: 200224
-
Simon Atanasyan authored
llvm-svn: 200223
-
Simon Atanasyan authored
llvm-svn: 200222
-
Simon Atanasyan authored
llvm-svn: 200221
-
Tobias Grosser authored
Reiterating: llvm-gcc has been dead for a long time. llvm-svn: 200220
-
Chandler Carruth authored
cold loops as if they were being optimized for size. Nothing fancy here. Simple test case included.

The nice thing is that we can now incrementally build on top of this to drive other heuristics. All of the infrastructure work is done to get the profile information into this layer.

The remaining work necessary to make this a fully general purpose loop unroller for very hot loops is to make it a fully general purpose loop unroller. Things I know of but am not going to have time to benchmark and fix in the immediate future:

1) Don't disable the entire pass when the target is lacking vector registers. This really doesn't make any sense any more.
2) Teach the unroller at least and the vectorizer potentially to handle non-if-converted loops. This is trivial for the unroller but hard for the vectorizer.
3) Compute the relative hotness of the loop and thread that down to the various places that make cost tradeoffs (very likely only the unroller makes sense here, and then only when dealing with loops that are small enough for unrolling to not completely blow out the LSD).

I'm still dubious how useful hotness information will be. So far, my experiments show that if we can get the correct logic for determining when unrolling actually helps performance, the code size impact is completely unimportant and we can unroll in all cases. But at least we'll no longer burn code size on cold code.

One somewhat unrelated idea that I've had forever but not had time to implement: mark all functions which are only reachable via the global constructors rigging in the module as optsize. This would also decrease the impact of any more aggressive heuristics here on code size. llvm-svn: 200219
-
Benjamin Kramer authored
Insert before the terminating instruction of the dominating block instead. llvm-svn: 200218
-
Kostya Serebryany authored
[sanitizer] revert r200197: the buggy kernel (https://bugzilla.kernel.org/show_bug.cgi?id=67651) is almost unusable with asan even with this workaround (too slow), so this workaround makes no sense. The asan/msan bootstrap bot was changed to use a non-buggy kernel llvm-svn: 200217
-
Benjamin Kramer authored
llvm-svn: 200216
-
Chandler Carruth authored
to stabilize a test that really is trying to test generic behavior and not a specific target's behavior. llvm-svn: 200215
-
Chandler Carruth authored
object and fewer pointless variables. Also, add a clarifying comment and a FIXME because the code which disables *all* vectorization if we can't use implicit floating point instructions just makes no sense at all. llvm-svn: 200214
-
Chandler Carruth authored
powers of two. This is essentially always the correct thing given the impact on alignment, scaling factors that can be used in addressing modes, etc. Also, fix the management of the unroll vs. small loop cost to more accurately model things in this world. Enhance a test case to actually exercise more of the unroll machinery by using synthetic constants rather than a specific target model. Before this change, with the added flags, this test would unroll 3 times instead of either 2 or 4 (the two sensible answers). While I don't expect this to make a huge difference, if there are lots of loops sitting right on the edge of hitting the 'small unroll' factor, they might change behavior. However, I've benchmarked moving the small loop cost up and down in many different ways and by a huge factor (2x) without seeing more than 0.2% code size growth. Small adjustments such as the series that led up here have led to about 1% improvement on some benchmarks, but it is very close to the noise floor so I mostly checked that nothing regressed. Let me know if you see bad behavior on other targets, but I don't expect this to be a sufficiently dramatic change to trigger anything. llvm-svn: 200213
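The power-of-two clamping described in this message can be sketched as follows. This is a minimal Python illustration, not the vectorizer's code: the helper name `floor_power_of_two` and the choice to round down rather than up are assumptions, since the message only says that 2 and 4 are the sensible answers where 3 was chosen before.

```python
def floor_power_of_two(factor):
    """Largest power of two that is <= factor (hypothetical helper).

    Rounding down is an assumption made for this sketch.
    """
    if factor < 1:
        raise ValueError("unroll factor must be at least 1")
    # bit_length() gives the position of the highest set bit.
    return 1 << (factor.bit_length() - 1)


# A raw factor of 3 is clamped to a power of two; 4 is left alone.
clamped_three = floor_power_of_two(3)
clamped_four = floor_power_of_two(4)
```

Power-of-two factors keep the unrolled body friendly to alignment and to the scaling factors available in addressing modes, which is the motivation the message gives.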
-
Chandler Carruth authored
with the unrolling behavior in the loop vectorizer. No functionality changed at this point. These are a bit hack-y, but talking with Hal, there doesn't seem to be a cleaner way to easily experiment with different thresholds here and he was also interested in them so I wanted to commit them. Suggestions for improvement are very welcome here. llvm-svn: 200212
-
Chandler Carruth authored
number of vector registers rather than toggling between vector and scalar register number based on VF. I don't have a test case as I spotted this by inspection and on X86 it only makes a difference if your target is lacking SSE and thus has *no* vector registers. If someone wants to add a test case for this for ARM or somewhere else where this is more significant, that would be awesome. Also made the variable name a bit more sensible while I'm here. llvm-svn: 200211
-
Nick Lewycky authored
Fix crasher introduced in r200203 and caught by a libc++ buildbot. Don't assume that getMulExpr returns a SCEVMulExpr, it may have simplified it to something else! llvm-svn: 200210
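The pitfall named in this message, a factory that may hand back a simplified node of a different kind, can be illustrated with a toy model. Everything below is hypothetical Python written for illustration (`Const`, `Unknown`, `Mul`, `get_mul_expr`, `operands_if_mul`); it is not the SCEV API, and the simplification rules are invented for the sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Const:
    value: int


@dataclass(frozen=True)
class Unknown:
    name: str


@dataclass(frozen=True)
class Mul:
    lhs: object
    rhs: object


def get_mul_expr(lhs, rhs):
    """Build lhs * rhs, simplifying where possible (toy rules only)."""
    if isinstance(lhs, Const) and isinstance(rhs, Const):
        return Const(lhs.value * rhs.value)  # constant folding
    if lhs == Const(1):
        return rhs  # 1 * x simplifies to x
    if rhs == Const(1):
        return lhs  # x * 1 simplifies to x
    return Mul(lhs, rhs)


def operands_if_mul(expr):
    """Return the operands only when the result really is a Mul node."""
    if isinstance(expr, Mul):
        return (expr.lhs, expr.rhs)
    return None  # the factory simplified the product away
```

A caller that unconditionally reads `.lhs` from the result would fail on the folded cases; checking the node kind first avoids that.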
-
Tobias Grosser authored
llvm-svn: 200209
-
Tobias Grosser authored
Restricting Polly to -O3 does not make a lot of sense, as it is opt-in anyway and users who specifically request it should get it. If this causes performance problems, we should address them by scheduling the right cleanup passes rather than just preventing the user from trying. Also, restricting Polly to -O3 made bugpoint not work with the -O3 flag and Polly enabled. llvm-svn: 200208
-
Tobias Grosser authored
Those test cases should be tested in the LLVM test suite. For Polly we should extract regression tests for the individual passes. llvm-svn: 200206
-
Tobias Grosser authored
The polly test suite is now -O3 clean. llvm-svn: 200205
-
Tobias Grosser authored
This is not only unnecessary: if -O3 changes, it can cause arbitrary test failures. For example, a recent change by Chandler caused -O3 to unroll the loop body, which made the loop we wanted to detect disappear and consequently made this test case fail. llvm-svn: 200204
-
Nick Lewycky authored
Teach SCEV to handle more cases of 'and X, CST', specifically where CST is any number of contiguous 1 bits in a row, with any number of leading and trailing 0 bits. Unfortunately, this in turn led to some lower quality SCEVs due to some different paths through expression simplification, so add getUDivExactExpr and use it. This fixes all instances of the problems that I found, but we can make that function smarter as necessary. Merge test "xor-and.ll" into "and-xor.ll" since I needed to update it anyways. Test 'nsw-offset.ll' analyzes a little deeper, %n now gets a scev in terms of %no instead of a SCEVUnknown. llvm-svn: 200203
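The arithmetic fact behind this kind of mask can be checked with a short sketch: when the constant is one contiguous run of 1 bits, the `and` equals a divide by a power of two, a truncation to the run's width, and a multiply back. The Python below only demonstrates that identity on non-negative integers. The helper names are assumptions, and the exact expression the commit builds inside SCEV is not shown here.

```python
def trailing_zeros(mask):
    """Number of 0 bits below the lowest set bit of a positive integer."""
    return (mask & -mask).bit_length() - 1


def is_contiguous_mask(mask):
    """True if mask is a single run of 1 bits (zeros may surround it)."""
    if mask <= 0:
        return False
    run = mask >> trailing_zeros(mask)
    return (run & (run + 1)) == 0


def and_as_arithmetic(x, mask):
    """Compute x & mask using divide, truncate, multiply instead of 'and'."""
    if not is_contiguous_mask(mask):
        raise ValueError("mask must be one contiguous run of 1 bits")
    tz = trailing_zeros(mask)
    width = (mask >> tz).bit_length()  # length of the run of 1 bits
    quotient = x // (1 << tz)          # drop the low zero positions
    truncated = quotient % (1 << width)  # keep only 'width' bits
    return truncated * (1 << tz)       # move the bits back into place
```

For example, `and_as_arithmetic(x, 0b00111100)` agrees with `x & 0b00111100` for any non-negative `x`.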
-
Stepan Dyatkovskiy authored
llvm-svn: 200202
-
Stepan Dyatkovskiy authored
The issue stems from DAGCombiner::MergeConsecutiveStores, more precisely from its sorting of the mem-op sequence. Consider how MergeConsecutiveStores handles the following example:

  store i8 1, a[0]
  store i8 2, a[1]
  store i8 3, a[1] ; a[1] again.
  return ; DAG starts here

1. The method collects all 3 stores.
2. It sorts them by distance from the base pointer (farthest with highest index).
3. It takes the first consecutive non-overlapping stores and (if possible) replaces them with a single store instruction.

The point is that we can't determine here which 'store' instruction will be second after sorting ('store 2' or 'store 3'). It happens that 'store 3' ends up second and 'store 2' third. So after merging we get the following result:

  store i16 (1 | 3 << 8), base ; is a[0] but bit-casted to i16
  store i8 2, a[1]

So we actually swapped 'store 3' and 'store 2' and got the wrong contents in a[1].

Fix: in the sort routine, also take the mem-op sequence number into account. llvm-svn: 200201
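The reordering described in this message can be reproduced with a toy model. The sketch below is plain Python, not the DAGCombiner code: the store records, `merge_first_pair`, and the rule that the merged store is emitted before the leftover store are assumptions chosen to match the result listed in the message. Python's own sort is stable, so the ambiguous ordering is written out by hand.

```python
def run(stores, size=2):
    """Apply (offset, [bytes]) stores in order and return final memory."""
    mem = [0] * size
    for offset, values in stores:
        for i, value in enumerate(values):
            mem[offset + i] = value
    return mem


def merge_first_pair(sorted_stores):
    """Merge the first two adjacent stores; emit leftovers afterwards."""
    first, second, *rest = sorted_stores
    assert second["offset"] == first["offset"] + 1
    merged = (first["offset"], [first["value"], second["value"]])
    return [merged] + [(s["offset"], [s["value"]]) for s in rest]


# The three stores from the example, numbered in program order.
program = [
    {"seq": 0, "offset": 0, "value": 1},
    {"seq": 1, "offset": 1, "value": 2},
    {"seq": 2, "offset": 1, "value": 3},
]
expected = run([(s["offset"], [s["value"]]) for s in program])

# Sorting by offset alone leaves the two a[1] stores in unspecified
# order; this is one valid outcome, with 'store 3' ahead of 'store 2'.
by_offset_only = [program[0], program[2], program[1]]
broken = run(merge_first_pair(by_offset_only))

# Breaking ties on the sequence number makes the order deterministic.
by_offset_and_seq = sorted(program, key=lambda s: (s["offset"], s["seq"]))
fixed = run(merge_first_pair(by_offset_and_seq))
```

In the toy model the offset-only order leaves 2 in a[1], while the tie-broken order preserves the program's final value of 3.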
-
Evgeniy Stepanov authored
llvm-svn: 200200
-
Evgeniy Stepanov authored
llvm-svn: 200199
-
Chandler Carruth authored
LoopVectorize pass. The logic here doesn't make much sense. We *only* unrolled if the unvectorized loop was a reduction loop with a single basic block *and* small loop body. The reduction part in particular doesn't make much sense. Instead, if we just fall through to the vectorized unroll logic, the behavior makes more sense: unroll if there is a vectorized reduction that could be hacked on by the SLP vectorizer *or* if the loop is small. This is mostly a cleanup and nothing in the test suite really exercises this, but I did run benchmarks across this change and saw no really significant changes. llvm-svn: 200198
-
Kostya Serebryany authored
[sanitizer] increase the mmap granularity in sanitizer allocator from 2^16 to 2^18. This is a partial workaround for the recent kernel bug https://bugzilla.kernel.org/show_bug.cgi?id=67651 llvm-svn: 200197
-
Michel Danzer authored
Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 200196
-
Michel Danzer authored
Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 200195
-
Alp Toker authored
There are a couple of interesting things here that we want to check over (particularly the expecting asserts in StringRef) and get right for general use in ADT, so hold back on this one. For clang we have a workable templated solution to use in the meantime. This reverts commit r200187. llvm-svn: 200194
-
Alp Toker authored
We might want to try a different strategy, so hold back on this for the moment, but fix the off-by-one error in the original function template. This reverts commit r200190. llvm-svn: 200193
-
Rafael Espindola authored
Testing this also found the missing '\n' after .frame that this patch also fixes. llvm-svn: 200192
-
Rui Ueyama authored
editbin.exe and link.exe both accept the /highentropyva option to set this bit, so doing s/VIRTUAL_ADDRESS/VA/ should make sense. llvm-svn: 200191
-
Alp Toker authored
This is one of various functions in clang that don't handle arbitrary strings well and can benefit from compile-time safety checks. Also fixes an off-by-one error that caused one additional null byte to get added to the end of custom diagnostic descriptions. ConstStringRef handles tricky details like that for us now. Requires supporting changes in LLVM r200187. llvm-svn: 200190
-
Richard Smith authored
throw-expression, the result is also a glvalue and isn't unnecessarily coerced to a prvalue. llvm-svn: 200189
-
Alp Toker authored
StringRef is a low-level data wrapper that shouldn't know about language strings like 'true' and 'false' whereas StringExtras is just the place for higher-level utilities. llvm-svn: 200188
-
Alp Toker authored
(1) Add llvm_expect(), an asserting macro that can be evaluated as a constexpr expression as well as a runtime assert or compiler hint in release builds. This technique can be used to construct functions that are both unevaluated and compiled depending on usage. (2) Update StringRef using llvm_expect() to preserve runtime assertions while extending the same checks to static asserts in C++11 builds that support the feature. (3) Introduce ConstStringRef, a strong subclass of StringRef that references compile-time constant strings. It's convertible to, but not from, ordinary StringRef and thus can be used to add compile-time safety to various interfaces in LLVM and clang that only accept fixed inputs such as diagnostic format strings that tend to get misused. llvm-svn: 200187
-
Rafael Espindola authored
llvm-svn: 200186
-
Rui Ueyama authored
llvm-svn: 200185
-