Commits · 6a62ebdd0f30bea0112ba48e7b7b9966ff1a9505 · Roger Ferrer / llvm-epi-0.8

Jan 27, 2014

TODO: Add topic 'teach bugpoint to extract regions' · 6a62ebdd

Tobias Grosser authored Jan 27, 2014

This does not only seem helpful for Polly, but it should help in general to
further reduce bugs.

llvm-svn: 200225

6a62ebdd

Detection: Allow to filter the regions that can be detected · 4449e526
Tobias Grosser authored Jan 27, 2014
```
llvm-svn: 200224
```
4449e526
[Mips] Fix __mips macro definition. · 26292ccc
Simon Atanasyan authored Jan 27, 2014
```
llvm-svn: 200223
```
26292ccc
[Mips] Change default CPU for MIPS 32/64 targets. Now they are mips32r2/mips64r2 respectively. · 1a3665b6
Simon Atanasyan authored Jan 27, 2014
```
llvm-svn: 200222
```
1a3665b6
[Mips] Add tests to check MIPS arch macros. · 682b49b9
Simon Atanasyan authored Jan 27, 2014
```
llvm-svn: 200221
```
682b49b9
Do not reference llvm-gcc from bugpoint · 0811b945
Tobias Grosser authored Jan 27, 2014
```
Reiterating: llvm-gcc has been dead for a long time.
llvm-svn: 200220
```
0811b945

[vectorize] Initial version of respecting PGO in the vectorizer: treat · e24f3973

Chandler Carruth authored Jan 27, 2014

cold loops as-if they were being optimized for size.

Nothing fancy here. Simple test case included. The nice thing is that we
can now incrementally build on top of this to drive other heuristics.
All of the infrastructure work is done to get the profile information
into this layer.

The remaining work necessary to make this a fully general purpose loop
unroller for very hot loops is to make it a fully general purpose loop
unroller. Things I know of but am not going to have time to benchmark
and fix in the immediate future:

1) Don't disable the entire pass when the target is lacking vector
   registers. This really doesn't make any sense any more.
2) Teach the unroller at least and the vectorizer potentially to handle
   non-if-converted loops. This is trivial for the unroller but hard for
   the vectorizer.
3) Compute the relative hotness of the loop and thread that down to the
   various places that make cost tradeoffs (very likely only the
   unroller makes sense here, and then only when dealing with loops that
   are small enough for unrolling to not completely blow out the LSD).

I'm still dubious how useful hotness information will be. So far, my
experiments show that if we can get the correct logic for determining
when unrolling actually helps performance, the code size impact is
completely unimportant and we can unroll in all cases. But at least
we'll no longer burn code size on cold code.

One somewhat unrelated idea that I've had forever but not had time to
implement: mark all functions which are only reachable via the global
constructors rigging in the module as optsize. This would also decrease
the impact of any more aggressive heuristics here on code size.

llvm-svn: 200219

e24f3973

ConstantHoisting: We can't insert instructions directly in front of a PHI node. · 9e709bce
Benjamin Kramer authored Jan 27, 2014
```
Insert before the terminating instruction of the dominating block instead.

llvm-svn: 200218
```
9e709bce

[sanitizer] revert r200197: the buggy kernel... · 7fe86589

Kostya Serebryany authored Jan 27, 2014

[sanitizer] revert r200197: the buggy kernel (https://bugzilla.kernel.org/show_bug.cgi?id=67651) is almost unusable with asan even with this workaround (too slow), so this workaround makes no sense. The asan/msan bootstrap bot was changed to use a non-buggy kernel

llvm-svn: 200217

7fe86589

XCore: Fix typo in function name. · 9d990792
Benjamin Kramer authored Jan 27, 2014
```
llvm-svn: 200216
```
9d990792
[vectorizer] Add an override for the target instruction cost and use it · edfa37ef
Chandler Carruth authored Jan 27, 2014
```
to stabilize a test that really is trying to test generic behavior and
not a specific target's behavior.

llvm-svn: 200215
```
edfa37ef

[vectorizer] Simplify code to use existing helpers on the Function · 2bb03ba6

Chandler Carruth authored Jan 27, 2014

object and fewer pointless variables.

Also, add a clarifying comment and a FIXME because the code which
disables *all* vectorization if we can't use implicit floating point
instructions just makes no sense at all.

llvm-svn: 200214

2bb03ba6

[vectorizer] Teach the loop vectorizer's unroller to only unroll by · 147c2327

Chandler Carruth authored Jan 27, 2014

powers of two. This is essentially always the correct thing given the
impact on alignment, scaling factors that can be used in addressing
modes, etc. Also, fix the management of the unroll vs. small loop cost
to more accurately model things with this world.

Enhance a test case to actually exercise more of the unroll machinery if
using synthetic constants rather than a specific target model. Before
this change, with the added flags this test will unroll 3 times instead
of either 2 or 4 (the two sensible answers).

While I don't expect this to make a huge difference, if there are lots
of loops sitting right on the edge of hitting the 'small unroll' factor,
they might change behavior. However, I've benchmarked moving the small
loop cost up and down in many various ways and by a huge factor (2x)
without seeing more than 0.2% code size growth. Small adjustments such
as the series that led up here have led to about 1% improvement on some
benchmarks, but it is very close to the noise floor so I mostly checked
that nothing regressed. Let me know if you see bad behavior on other
targets but I don't expect this to be a sufficiently dramatic change to
trigger anything.
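
As a rough illustration of the power-of-two constraint described above, the sketch below rounds a candidate unroll factor down to a power of two, so a factor of 3 becomes 2. The helper name `floorToPowerOfTwo` is hypothetical, and rounding down (rather than up to 4) is an assumption; the commit message only says that 2 and 4 are the sensible answers.

```cpp
#include <cassert>

// Hypothetical helper for illustration; not the code from the commit.
// Rounds an unroll factor down to the nearest power of two.
unsigned floorToPowerOfTwo(unsigned Factor) {
  assert(Factor > 0 && "unroll factor must be at least 1");
  unsigned Result = 1;
  // Keep doubling while the next power of two still fits within Factor.
  while (Result <= Factor / 2)
    Result *= 2;
  return Result;
}
```

Calling `floorToPowerOfTwo(3)` yields 2, while factors that are already powers of two are returned unchanged.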

llvm-svn: 200213

147c2327

[vectorizer] Add some flags which are useful for conducting experiments · 7f90b453

Chandler Carruth authored Jan 27, 2014

with the unrolling behavior in the loop vectorizer. No functionality
changed at this point.

These are a bit hack-y, but talking with Hal, there doesn't seem to be
a cleaner way to easily experiment with different thresholds here and he
was also interested in them so I wanted to commit them. Suggestions for
improvement are very welcome here.

llvm-svn: 200212

7f90b453

[vectorizer] Fix a trivial oversight where we always requested the · 328998b2

Chandler Carruth authored Jan 27, 2014

number of vector registers rather than toggling between vector and
scalar register number based on VF. I don't have a test case as
I spotted this by inspection and on X86 it only makes a difference if
your target is lacking SSE and thus has *no* vector registers.

If someone wants to add a test case for this for ARM or somewhere else
where this is more significant, that would be awesome.

Also made the variable name a bit more sensible while I'm here.

llvm-svn: 200211

328998b2

Fix crasher introduced in r200203 and caught by a libc++ buildbot. Don't... · 629199cc

Nick Lewycky authored Jan 27, 2014

Fix crasher introduced in r200203 and caught by a libc++ buildbot. Don't assume that getMulExpr returns a SCEVMulExpr, it may have simplified it to something else!

llvm-svn: 200210

629199cc

Remove an unused parameter · 5ff8579c
Tobias Grosser authored Jan 27, 2014
```
llvm-svn: 200209
```
5ff8579c

Allow Polly at all optimization levels · f2021094

Tobias Grosser authored Jan 27, 2014

Restricting Polly to -O3 does not make a lot of sense as it is opt-in anyway
and users who specifically request it should get it. If this causes performance
problems we should rather address them by scheduling the right cleanup passes
than just prevent the user from trying.

Also restricting Polly to -O3 made bugpoint not work with the -O3 flag and polly
enabled.

llvm-svn: 200208

f2021094

Do not test polybench with 'make check-polly' · 24d7e669

Tobias Grosser authored Jan 27, 2014

Those test cases should be tested in the LLVM test suite. For Polly we should
extract regression tests for the individual passes.

llvm-svn: 200206

24d7e669

Remove other unnecessary uses of -O3 in the test suite · 54646f7f
Tobias Grosser authored Jan 27, 2014
```
The polly test suite is now -O3 clean.

llvm-svn: 200205
```
54646f7f

Do not run -O3 to canonicalize test case · a7fea838

Tobias Grosser authored Jan 27, 2014

This is not only not necessary, but in case -O3 changes this can actually
cause arbitrarily failing test cases such as, e.g., a recent change by Chandler
that caused -O3 to unroll the loop body, which made the loop we wanted to
detect disappear and consequently this test case fail.

llvm-svn: 200204

a7fea838

Teach SCEV to handle more cases of 'and X, CST', specifically where CST is any... · 31eaca55

Nick Lewycky authored Jan 27, 2014

Teach SCEV to handle more cases of 'and X, CST', specifically where CST is any number of contiguous 1 bits in a row, with any number of leading and trailing 0 bits.

Unfortunately, this in turn led to some lower quality SCEVs due to some different paths through expression simplification, so add getUDivExactExpr and use it. This fixes all instances of the problems that I found, but we can make that function smarter as necessary.

Merge test "xor-and.ll" into "and-xor.ll" since I needed to update it anyways. Test 'nsw-offset.ll' analyzes a little deeper, %n now gets a scev in terms of %no instead of a SCEVUnknown.
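
The class of constants described above (one run of contiguous 1 bits with any number of leading and trailing 0 bits) can be recognized with a couple of bit tricks. This is only a sketch of that property check; the function name `isContiguousOnes` is hypothetical and the code is not taken from the SCEV change.

```cpp
#include <cstdint>

// Hypothetical helper for illustration; not the code from the commit.
// Returns true if Value is a single run of contiguous 1 bits, e.g. 0x0FF0.
bool isContiguousOnes(uint64_t Value) {
  if (Value == 0)
    return false;
  // Fill the trailing zeros with ones: 0x0FF0 becomes 0x0FFF.
  uint64_t Filled = Value | (Value - 1);
  // A value of the form 0...01...1 shares no bits with its successor.
  return (Filled & (Filled + 1)) == 0;
}
```

For example, `0x0FF0` and `0x1` qualify, while `0x0F0F` does not because its 1 bits form two separate runs.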

llvm-svn: 200203

31eaca55

Additional fix for 200201: due to dependence on bitwidth test was moved to X86 directory. · 55139555
Stepan Dyatkovskiy authored Jan 27, 2014
```
llvm-svn: 200202
```
55139555

Fix for PR18102. · 157bb42e

Stepan Dyatkovskiy authored Jan 27, 2014

The issue stems from DAGCombiner::MergeConsecutiveStores, more precisely from
the sorting of the mem-ops sequence.

Consider how MergeConsecutiveStores works for the following example:

store i8 1, a[0]
store i8 2, a[1]
store i8 3, a[1]   ; a[1] again.
return   ; DAG starts here

1. The method collects all three stores.
2. It sorts them by distance from the base pointer (farthest with highest
index).
3. It takes the first consecutive non-overlapping stores and (if possible) replaces
them with a single store instruction.

The point is, we can't determine here which 'store' instruction
would be the second after sorting ('store 2' or 'store 3').
It happens that 'store 3' would be the second, and 'store 2' would be the third.

So after merging we have the following result:

store i16 (1 | 3 << 8), base   ; is a[0] but bit-casted to i16
store i8 2, a[1]

So actually we swapped 'store 3' and 'store 2' and got wrong contents in a[1].

Fix: In the sort routine, also take the mem-op sequence number into account.

llvm-svn: 200201
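
The tie-break described in this fix can be sketched as a comparator that orders by offset first and by sequence number second, so that two stores to the same address keep their program order. The `StoreRecord` struct and `sortStores` function are hypothetical stand-ins, not the DAGCombiner data structures, and the ascending direction of the tie-break is an assumption.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical stand-in for a collected store; not an LLVM type.
struct StoreRecord {
  int Offset;      // distance from the base pointer, in bytes
  unsigned SeqNum; // position in the original instruction sequence
  int Value;       // value being stored
};

// Sort by offset; for equal offsets fall back to the sequence number so the
// relative order of overlapping stores is deterministic.
void sortStores(std::vector<StoreRecord> &Stores) {
  std::sort(Stores.begin(), Stores.end(),
            [](const StoreRecord &A, const StoreRecord &B) {
              if (A.Offset != B.Offset)
                return A.Offset < B.Offset;
              return A.SeqNum < B.SeqNum;
            });
}
```

With the three stores from the example, 'store 2' now always sorts before 'store 3', whatever order they were collected in.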

157bb42e

[msan] Disable mmap outside of application address range. · 067f5471
Evgeniy Stepanov authored Jan 27, 2014
```
llvm-svn: 200200
```
067f5471
[asan] Android setup: do "adb root" before "adb remount". · aecead9d
Evgeniy Stepanov authored Jan 27, 2014
```
llvm-svn: 200199
```
aecead9d

[vectorizer] Clean up the handling of unvectorized loop unrolling in the · 56612b20

Chandler Carruth authored Jan 27, 2014

LoopVectorize pass.

The logic here doesn't make much sense. We *only* unrolled if the
unvectorized loop was a reduction loop with a single basic block *and*
small loop body. The reduction part in particular doesn't make much
sense. Instead, if we just fall through to the vectorized unroll logic
it makes more sense of unrolling if there is a vectorized reduction that
could be hacked on by the SLP vectorizer *or* if the loop is small.

This is mostly a cleanup and nothing in the test suite really exercises
this, but I did run benchmarks across this change and saw no really
significant changes.

llvm-svn: 200198

56612b20

[sanitizer] increase the mmap granularity in sanitizer allocator from 2^16 to... · 0a5049b7

Kostya Serebryany authored Jan 27, 2014

[sanitizer] increase the mmap granularity in sanitizer allocator from 2^16 to 2^18. This is a partial workaround for the fresh Kernel bug https://bugzilla.kernel.org/show_bug.cgi?id=67651

llvm-svn: 200197

0a5049b7

R600/SI: Add intrinsic for BUFFER_LOAD_DWORD* instructions · 13736221
Michel Danzer authored Jan 27, 2014
```
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 200196
```
13736221
R600/SI: Add intrinsic for S_SENDMSG instruction · 6064f57a
Michel Danzer authored Jan 27, 2014
```
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 200195
```
6064f57a

Roll back the ConstStringRef change for now · 17d4e98e

Alp Toker authored Jan 27, 2014

There are a couple of interesting things here that we want to check over
(particularly the expecting asserts in StringRef) and get right for general use
in ADT so hold back on this one. For clang we have a workable templated
solution to use in the meanwhile.

This reverts commit r200187.

llvm-svn: 200194

17d4e98e

Roll back the use of ConstStringRef for now · 0d865d3d

Alp Toker authored Jan 27, 2014

We might want to try a different strategy so hold back on this for the moment, but
fix the off-by-one error in the original function template.

This reverts commit r200190.

llvm-svn: 200193

0d865d3d

Print .mask and .fmask with the target streamer. · 25fa291f
Rafael Espindola authored Jan 27, 2014
```
Testing this also found the missing '\n' after .frame that this patch also
fixes.

llvm-svn: 200192
```
25fa291f

Rename IMAGE_DLL_CHARACTERISTICS_HIGH_ENTROPY_VA. · 06dc5e79

Rui Ueyama authored Jan 27, 2014

editbin.exe and link.exe both accept the /highentropyva option to set this bit, so
doing s/VIRTUAL_ADDRESS/VA/ should make sense.

llvm-svn: 200191

06dc5e79

Use ConstStringRef facility for getCustomDiagID() safety · a8050174

Alp Toker authored Jan 27, 2014

This is one of various functions in clang that don't handle arbitrary strings
well and can benefit from compile-time safety checks.

Also fixes an off-by-one error that caused one additional null byte to get
added to the end of custom diagnostic descriptions. ConstStringRef handles
tricky details like that for us now.

Requires supporting changes in LLVM r200187.
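
The kind of off-by-one mentioned above is easy to hit when a function template takes a string literal by array reference, because the deduced array size includes the terminating null byte. The sketch below shows the general pattern only; `makeDescription` is a hypothetical function, not clang's getCustomDiagID() or the ConstStringRef class.

```cpp
#include <cstddef>
#include <string>

// Hypothetical function for illustration; not the clang API.
// Binds only to char arrays whose size is known at compile time,
// such as string literals.
template <std::size_t N>
std::string makeDescription(const char (&Literal)[N]) {
  // N counts the terminating '\0', so the text itself has N - 1 characters.
  // Passing N here instead would append a stray null byte.
  return std::string(Literal, N - 1);
}
```

`makeDescription("abc")` produces a three-character string, whereas using `N` as the length would yield four characters ending in a null byte.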

llvm-svn: 200190

a8050174

PR17052 / DR1560 (+DR1550): In a conditional expression between a glvalue and a · 6a6a4bbd
Richard Smith authored Jan 27, 2014
```
throw-expression, the result is also a glvalue and isn't unnecessarily coerced
to a prvalue.

llvm-svn: 200189
```
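
A small sketch of what this rule permits, assuming a compiler that implements DR1560: the conditional below is an lvalue, so it can bind directly to an `int &` return type. The names `Counter` and `counterOrThrow` are hypothetical and not taken from the commit's tests.

```cpp
#include <type_traits>

int Counter = 0;

// Returns a reference to Counter, or throws when Ok is false.
int &counterOrThrow(bool Ok) {
  // One operand is a glvalue and the other a throw-expression, so the
  // result keeps the glvalue's value category instead of becoming a prvalue.
  return Ok ? Counter : throw 1;
}

// decltype of an lvalue expression of type int is int &.
static_assert(
    std::is_same<decltype(true ? Counter : throw 1), int &>::value,
    "conditional with a throw-expression operand stays an lvalue");
```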
6a6a4bbd

Move true/false StringRef helper to StringExtras · 3bb1de78

Alp Toker authored Jan 27, 2014

StringRef is a low-level data wrapper that shouldn't know about language
strings like 'true' and 'false' whereas StringExtras is just the place for
higher-level utilities.

llvm-svn: 200188

3bb1de78

StringRef: Extend constexpr capabilities and introduce ConstStringRef · 042f41b0

Alp Toker authored Jan 27, 2014

(1) Add llvm_expect(), an asserting macro that can be evaluated as a constexpr
    expression as well as a runtime assert or compiler hint in release builds. This
    technique can be used to construct functions that are both unevaluated and
    compiled depending on usage.

(2) Update StringRef using llvm_expect() to preserve runtime assertions while
    extending the same checks to static asserts in C++11 builds that support the
    feature.

(3) Introduce ConstStringRef, a strong subclass of StringRef that references
    compile-time constant strings. It's convertible to, but not from, ordinary
    StringRef and thus can be used to add compile-time safety to various interfaces
    in LLVM and clang that only accept fixed inputs such as diagnostic format
    strings that tend to get misused.
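
To illustrate the idea in (1) of a check that works both at compile time and at run time, here is a minimal C++11 sketch. It uses a throw-expression inside a conditional rather than the llvm_expect() macro from this commit, whose definition is not shown here; `checkedIndex` is a hypothetical name.

```cpp
#include <stdexcept>

// Hypothetical constexpr-friendly check; not the llvm_expect() macro.
// In a constant expression a failing check is a compile error, because a
// throw cannot be evaluated at compile time. At run time it throws.
constexpr int checkedIndex(int Index, int Size) {
  return (Index >= 0 && Index < Size)
             ? Index
             : throw std::out_of_range("index out of range");
}

// Evaluated entirely at compile time.
static_assert(checkedIndex(2, 4) == 2, "in-range index passes the check");
```

The same function can be called with run-time values, in which case an out-of-range index raises `std::out_of_range` instead of failing the build.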

llvm-svn: 200187

042f41b0

Print .frame via the target streamer. · 054234fa
Rafael Espindola authored Jan 27, 2014
```
llvm-svn: 200186
```
054234fa
[PECOFF] Implement relocations for x86-64. · b73d2852
Rui Ueyama authored Jan 27, 2014
```
llvm-svn: 200185
```
b73d2852