Commits · ae27b2380fe0b7fed17766ef8cba0683706cfc48 · Raul Torres / llvm-target-spread

Mar 03, 2016

[MBP] Renaming a confusing variable and add clarifying comments · ae27b238
Philip Reames authored Mar 03, 2016
```
Was discussed as part of http://reviews.llvm.org/D17830

llvm-svn: 262571
```
ae27b238
[lanai] Fixing file path used in test · 2a064143
Jacques Pienaar authored Mar 03, 2016
```
llvm-svn: 262567
```
2a064143
TargetSchedule: Allow explicit Unsupported markers in InstRW · 0f521c54
Matthias Braun authored Mar 03, 2016
```
llvm-svn: 262549
```
0f521c54
TableGen: Accept itinerary data when checking for schedmodel completeness · 42d9ad9c
Matthias Braun authored Mar 03, 2016
```
llvm-svn: 262548
```
42d9ad9c

[MBP] Avoid placing random blocks between loop preheader and header · 23d93398

Philip Reames authored Mar 03, 2016

If we have a loop with a rarely taken path, we will prune that from the blocks which get added as part of the loop chain. The problem is that we weren't then recognizing the loop chain as schedulable when considering the preheader when forming the function chain. We'd then fall to various non-predecessors before finally scheduling the loop chain (as if the CFG was unnatural.) The net result was that there could be lots of garbage between a loop preheader and the loop, even though we could have directly fallen into the loop. It also meant we separated hot code with regions of colder code.

The particular reason for the rejection of the loop chain was that we were scanning the predecessors of the header, seeing the backedge, believing that was a globally more important predecessor (true), but forgetting to account for the fact that the backedge predecessor was already part of the existing loop chain (oops!).
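
As a rough illustration of the loop shape discussed above, the sketch below has a preheader, a header, and a path through the body that is expected to be taken rarely. The function name `sumNonNegative` and its parameters are hypothetical and do not come from the patch, and whether block placement actually treats the negative-value path as cold depends on the branch probabilities available to the compiler.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical example, not taken from LLVM: sums the non-negative entries
// and counts the negative ones on a path assumed to be rarely taken.
long sumNonNegative(const int *data, std::size_t n, std::size_t *numNegative) {
  assert(data != nullptr || n == 0);
  long sum = 0;                          // preheader: runs once before the loop
  for (std::size_t i = 0; i < n; ++i) {  // header: target of the backedge
    if (data[i] < 0) {                   // assumed-cold path inside the loop
      if (numNegative != nullptr)
        ++*numNegative;
      continue;
    }
    sum += data[i];                      // hot path
  }
  return sum;
}
```

With a layout that falls directly from the preheader into the header, the hot path stays contiguous and the cold block can be placed elsewhere.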

Differential Revision: http://reviews.llvm.org/D17830

llvm-svn: 262547

23d93398

[X86] Don't give catch objects a displacement of zero · 1ef65402

David Majnemer authored Mar 03, 2016

Catch objects with a displacement of zero do not initialize a catch
object.  The displacement is relative to %rsp at the end of the
function's prologue for x86_64 targets.

If we place an object at the top-of-stack, we will end up with a
displacement of zero resulting in our catch object remaining
uninitialized.

Address this by creating our catch objects as fixed objects.  We will
ensure that the UnwindHelp object is created after the catch objects so
that no catch object will have a displacement of zero.

Differential Revision: http://reviews.llvm.org/D17823

llvm-svn: 262546

1ef65402

[AArch64] add tests to demonstrate existing codegen for PR26819 · 84056497
Sanjay Patel authored Mar 02, 2016
```
llvm-svn: 262540
```
84056497
AMDGPU: Simplify boolean conditional return statements · 8226fc48
Matt Arsenault authored Mar 02, 2016
```
Patch by Richard Thomson

llvm-svn: 262536
```
8226fc48

Mar 02, 2016

[MBP] Remove overly verbose debug output · 02e1132a
Philip Reames authored Mar 02, 2016
```
llvm-svn: 262531
```
02e1132a

Explode store of arrays in instcombine · 3b8b2ea2

Amaury Sechet authored Mar 02, 2016

Summary: This is the last step toward supporting aggregate memory access in instcombine. This explodes stores of arrays into a series of stores for each element, allowing them to be optimized.

Reviewers: joker.eph, reames, hfinkel, majnemer, mgrang

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D17828

llvm-svn: 262530

3b8b2ea2

[llvm-nm] Restore the previous behaviour (pre r262525). · 98be3f2b
Davide Italiano authored Mar 02, 2016
```
It broke some buildbots.

Pointy-hat to:  me

llvm-svn: 262529
```
98be3f2b
[llvm-nm] Fix rendering of -s grouping with all the other options. · f156763c
Davide Italiano authored Mar 02, 2016
```
llvm-svn: 262525
```
f156763c
[MBP] Adjust debug output to be more focused and approachable · b9688f43
Philip Reames authored Mar 02, 2016
```
llvm-svn: 262522
```
b9688f43

Unpack array of all sizes in InstCombine · 7cd3fe7d

Amaury Sechet authored Mar 02, 2016

Summary: This is another step toward improving fca support. This unpacks a load of an array into a series of loads of the array's elements.

Reviewers: chandlerc, joker.eph, majnemer, reames, hfinkel

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D15890

llvm-svn: 262521

7cd3fe7d

Really fix ASAN leak/etc issues with MemorySSA unittests · 6412002d
Daniel Berlin authored Mar 02, 2016
```
llvm-svn: 262519
```
6412002d
[libFuzzer] add -Werror for libFuzzer build rule · 4394b31e
Kostya Serebryany authored Mar 02, 2016
```
llvm-svn: 262517
```
4394b31e
Revert "Fix ASAN detected errors in code and test" (it was not meant to be committed yet) · 989e601b
Daniel Berlin authored Mar 02, 2016
```
This reverts commit 890bbccd600ba1eb050353d06a29650ad0f2eb95.

llvm-svn: 262512
```
989e601b
Fix ASAN detected errors in code and test · 27ed1c2e
Daniel Berlin authored Mar 02, 2016
```
llvm-svn: 262511
```
27ed1c2e

Add another test for the GlobalOpt change in r212079. · 9ab86aab

Bob Wilson authored Mar 02, 2016

This is a test that Akira Hatanaka wrote to test GlobalOpt's handling of
aliases with GEP operands. David Majnemer independently made the same
change to GlobalOpt in r212079. Akira's test is a useful addition, so I'm
pulling it over from the llvm repo for Swift on GitHub.

llvm-svn: 262510

9ab86aab

[libFuzzer] more trophies · 721f61a0
Kostya Serebryany authored Mar 02, 2016
```
llvm-svn: 262509
```
721f61a0

[ARM] Merging 64-bit divmod lib calls into one · 93e42d99

Renato Golin authored Mar 02, 2016

When div+rem calls on the same arguments are found, the ARM back-end merges the
two calls into one __aeabi_divmod call for values of up to 32 bits. However,
for 64-bit values, which also have a lib call (__aeabi_ldivmod), it wasn't
merging the calls, and thus calling ldivmod twice and spilling the temporary
results, which generated pretty bad code.

This patch legalises 64-bit lib calls for divmod, so that now all the spilling
and the second call are gone. It also relaxes the DivRem combiner a bit on the
legal type check, since it was already checking for isLegalOrCustom on every
value, so the extra check for isTypeLegal was redundant.

This patch fixes PR17193 (and a long time FIXME in the tests).
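
For illustration, the source pattern described above looks roughly like the sketch below: a 64-bit division and remainder on the same operands. The names `DivRem64` and `divRem64` are hypothetical, and whether a single `__aeabi_ldivmod` call is actually emitted depends on the target and the optimization level.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper type, not part of LLVM: holds both results.
struct DivRem64 {
  std::int64_t quot;
  std::int64_t rem;
};

// The quotient and remainder use the same operands, so a back-end that
// legalises the combined operation can serve both from one runtime call.
DivRem64 divRem64(std::int64_t num, std::int64_t den) {
  assert(den != 0 && "division by zero is undefined");
  return DivRem64{num / den, num % den};
}
```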

llvm-svn: 262507

93e42d99

Revert "[X86] Elide references to _chkstk for dynamic allocas" · 65f9d9cd

Reid Kleckner authored Mar 02, 2016

This reverts commit r262370.

It turns out there is code out there that does sequences of allocas
greater than 4K: http://crbug.com/591404

The goal of this change was to improve the code size of inalloca call
sequences, but we got tangled up in the mess of dynamic allocas.
Instead, we should come back later with a separate MI pass that uses
dominance to optimize the full sequence. This should also be able to
remove the often unneeded stacksave/stackrestore pairs around the call.

llvm-svn: 262505

65f9d9cd

ARM: Introduce conservative load/store optimization mode · f290912d

Matthias Braun authored Mar 02, 2016

Most of the time ARM has the CCR.UNALIGN_TRP bit set to false which
means that unaligned loads/stores do not trap and even extensive testing
will not catch these bugs. However the multi/double variants are not
affected by this bit and will still trap. In effect a more aggressive
load/store optimization will break existing (bad) code.

These bugs do not necessarily manifest in the broken code where the
misaligned pointer is formed but often later in perfectly legal code
where it is accessed. This means recompiling system libraries (which
have no alignment bugs) with a newer compiler will break existing
applications (with alignment bugs) that worked before.

So (under protest) I implemented this safe mode which limits the
formation of multi/double operations to cases that are not affected by
user code (stack operations like spills/reloads) or cases where the
normal operations trap anyway (floating point load/stores). It is
disabled by default.
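
As a sketch of the class of user-code bug mentioned above: casting a misaligned byte address to a wider pointer type and dereferencing it is undefined behaviour in C++, even when the hardware tolerates it for single loads. The helper below, with the hypothetical name `loadU64`, shows the portable alternative of copying the bytes out with `std::memcpy`.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical helper: reads a 64-bit value from an address that may not be
// 8-byte aligned. Copying through memcpy avoids forming a misaligned
// std::uint64_t pointer, so the compiler may not assume alignment here.
std::uint64_t loadU64(const unsigned char *p) {
  assert(p != nullptr);
  std::uint64_t value;
  std::memcpy(&value, p, sizeof value);
  return value;
}
```

Code written this way stays correct regardless of how aggressively the compiler forms multi/double load/store instructions.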

Differential Revision: http://reviews.llvm.org/D17015

llvm-svn: 262504

f290912d

SelectionDAG: Use correctly sized allocation functions for SDNodes · b2ecee9c

Justin Bogner authored Mar 02, 2016

The placement new calls here were all calling the allocation function
in RecyclingAllocator/Recycler for SDNode, instead of the function for
the specific subclass we were constructing.

Since this particular allocator always overallocates it more or less
worked, but would hide what we're actually doing from any memory
tools. Also, if you tried to change this allocator to something like a
BumpPtrAllocator or MallocAllocator, the compiler would crash horribly
all the time.
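
A minimal sketch of the idea, using toy types rather than LLVM's own: the allocation request is sized for the concrete subclass being constructed, not for the base class. `Node`, `ConstantNode`, `RecordingAllocator`, and `create` are all hypothetical names chosen for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <vector>

// Toy stand-ins; these are not LLVM's SDNode or Recycler.
struct Node {
  int opcode;
  explicit Node(int op) : opcode(op) {}
};
struct ConstantNode : Node {
  long long value;
  ConstantNode(int op, long long v) : Node(op), value(v) {}
};

// Records the size of each request so the test can inspect it.
class RecordingAllocator {
public:
  template <typename T> void *allocate() {
    requests.push_back(sizeof(T));
    return ::operator new(sizeof(T));
  }
  void deallocate(void *p) { ::operator delete(p); }
  std::vector<std::size_t> requests;
};

// Allocates storage sized for T itself, then constructs T in place.
template <typename T, typename... Args>
T *create(RecordingAllocator &alloc, Args... args) {
  void *mem = alloc.allocate<T>();
  assert(mem != nullptr);
  return new (mem) T(args...);
}
```

Asking for `sizeof(Node)` and then constructing a `ConstantNode` in that storage would only work if the allocator happened to hand back more memory than requested.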

Part of llvm.org/PR26808.

llvm-svn: 262500

b2ecee9c

[AArch64] Enable non-leaf frame pointer elimination. · 62c1a1e7

Geoff Berry authored Mar 02, 2016

Summary:
This change enables frame pointer elimination in non-leaf functions.
The -fomit-frame-pointer option still needs to be used when compiling
via clang (or an equivalent method of not setting the
'no-frame-pointer-elim*' function attributes if generating llvm IR via
some other method) to take advantage of this optimization.

This change should be NFC when compiling via clang without
-fomit-frame-pointer.

Reviewers: t.p.northover

Subscribers: aemerson, rengolin, tberghammer, qcolombet, llvm-commits, danalbert, mcrosier, srhines

Differential Revision: http://reviews.llvm.org/D17730

llvm-svn: 262495

62c1a1e7

[CMake] Add test-depends target to build dependencies of check-all · 7d942d73

Chris Bieneman authored Mar 02, 2016

This is just another convenience target for bots to use. It enables isolation of building and testing.

llvm-svn: 262494

7d942d73

[cmake] Check the compiler version first · 9fac19f0

Reid Kleckner authored Mar 02, 2016

Otherwise users get messages from CheckAtomic about missing libatomic
instead of a sensible message that says "use GCC 4.7 or newer".

I structured the change along the lines of HandleLLVMStdlib.cmake, so
that the standalone build of Clang still gets the compiler version
check.

Reviewers: beanz

Differential Revision: http://reviews.llvm.org/D17789

llvm-svn: 262491

9fac19f0

[AA] Hoist the logic to reformulate various AA queries in terms of other · 12884f7f

Chandler Carruth authored Mar 02, 2016

parts of the AA interface out of the base class of every single AA
result object.

Because this logic reformulates the query in terms of some other aspect
of the API, it would easily cause O(n^2) query patterns in alias
analysis. These could in turn be magnified further based on the number
of call arguments, and then further based on the number of AA queries
made for a particular call. This ended up causing problems for Rust that
were actually noticeable enough to get a bug (PR26564) and probably other
places as well.

When originally re-working the AA infrastructure, the desire was to
regularize the pattern of refinement without losing any generality.
While I think it was successful, that is clearly proving to be too
costly. And the cost is needless: we gain no actual improvement for this
generality of making a direct query to tbaa actually be able to
re-use some other alias analysis's refinement logic for one of the other
APIs, or some such. In short, this is entirely wasted work.

To the extent possible, delegation to other API surfaces should be done
at the aggregation layer so that we can avoid re-walking the
aggregation. In fact, this significantly simplifies the logic as we no
longer need to smuggle the aggregation layer into each alias analysis
(or the TargetLibraryInfo into each alias analysis just so we can form
argument memory locations!).

However, we also have some delegation logic inside of BasicAA and some
of it even makes sense. When the delegation logic is baking in specific
knowledge of aliasing properties of the LLVM IR, as opposed to simply
reformulating the query to utilize a different alias analysis interface
entry point, it makes a lot of sense to restrict that logic to
a different layer such as BasicAA. So one aspect of the delegation that
was in every AA base class is that when we don't have operand bundles,
we re-use function AA results as a fallback for callsite alias results.
This relies on the IR properties of calls and functions w.r.t. aliasing,
and so seems a better fit to BasicAA. I've lifted the logic up to that
point where it seems to be a natural fit. This still does a bit of
redundant work (we query function attributes twice, once via the
callsite and once via the function AA query) but it is *exactly* twice
here, no more.

The end result is that all of the delegation logic is hoisted out of the
base class and into either the aggregation layer when it is a pure
retargeting to a different API surface, or into BasicAA when it relies
on the IR's aliasing properties. This should fix the quadratic query
pattern reported in PR26564, although I don't have a stand-alone test
case to reproduce it.

It also seems general goodness. Now the numerous AAs that don't need
target library info don't carry it around and depend on it. I think
I can even rip out the general access to the aggregation layer and only
expose that in BasicAA as it is the only place where we re-query in that
manner.

However, this is a non-trivial change to the AA infrastructure so I want
to get some additional eyes on this before it lands. Sadly, it can't
wait long because we should really cherry pick this into 3.8 if we're
going to go this route.

Differential Revision: http://reviews.llvm.org/D17329

llvm-svn: 262490

12884f7f

[X86][SSSE3] Added combine test for unary shuffle (pshufb) only referencing... · 537907fd

Simon Pilgrim authored Mar 02, 2016

[X86][SSSE3] Added combine test for unary shuffle (pshufb) only referencing elements from one of the inputs of a binary shuffle (punpcklbw)

llvm-svn: 262486

537907fd

[LLVM][AVX512]PSRAWI Change imm8 to int. · 927fdaee
Michael Zuckerman authored Mar 02, 2016
```
Differential Revision: http://reviews.llvm.org/D17705

llvm-svn: 262480
```
927fdaee

[X86][SSE] Lower 128-bit MOVDDUP with existing VBROADCAST mechanisms · c02b7262

Simon Pilgrim authored Mar 02, 2016

We have a number of useful lowering strategies for VBROADCAST instructions (both from memory and register element 0) which the 128-bit form of the MOVDDUP instruction can make use of.

This patch tweaks lowerVectorShuffleAsBroadcast to enable it to broadcast 2f64 args using MOVDDUP as well.

It does require a slight tweak to the lowerVectorShuffleAsBroadcast mechanism as the existing MOVDDUP lowering uses isShuffleEquivalent which can match binary shuffles that can lower to (unary) broadcasts.

Differential Revision: http://reviews.llvm.org/D17680

llvm-svn: 262478

c02b7262

Revert "[AMDGPU] table-driven parser/printer for amd_kernel_code_t structure fields" · f2fbabe9
Nikolay Haustov authored Mar 02, 2016
```
Build failure with clang.

llvm-svn: 262477
```
f2fbabe9
Revert "[AMDGPU] Using table-driven amd_kernel_code_t field parser in assembler." · f0f24628
Nikolay Haustov authored Mar 02, 2016
```
Build failure with clang.

llvm-svn: 262475
```
f0f24628

[AMDGPU] Using table-driven amd_kernel_code_t field parser in assembler. · 73447a97

Nikolay Haustov authored Mar 02, 2016

complementary patch to table-driven amd_kernel_code_t field parser/printer utility. lit tests passed.

Patch by: Valery Pykhtin

Differential Revision: http://reviews.llvm.org/D17151

llvm-svn: 262474

73447a97

[AMDGPU] table-driven parser/printer for amd_kernel_code_t structure fields · 6c8c7496

Nikolay Haustov authored Mar 02, 2016

This is going to be used in .hsatext disassembler and can be used
in current assembler parser (lit tests passed on parsing).
Code using these helpers isn't included in this patch.

Benefits:

- unified approach
- fast field name lookup on parsing

Later I would like to enhance some of the field naming/syntax using this code.

Patch by: Valery Pykhtin

Differential Revision: http://reviews.llvm.org/D17150

llvm-svn: 262473

6c8c7496

libfuzzer: fix compiler warnings · 2eed1218

Dmitry Vyukov authored Mar 02, 2016

- unused sigaction/setitimer result (used in assert)
- unchecked fscanf return value
- signed/unsigned comparison

llvm-svn: 262472

2eed1218

[X86] Remove unnecessary call to isReg from emitter's DestMem handling for VEX... · 1d3f4aef

Craig Topper authored Mar 02, 2016

[X86] Remove unnecessary call to isReg from emitter's DestMem handling for VEX prefix. The operand is always a register. NFC

llvm-svn: 262468

1d3f4aef

[X86] Make X86MCCodeEmitter::DetermineREXPrefix locate operands more like how... · 6a7cd422
Craig Topper authored Mar 02, 2016
```
[X86] Make X86MCCodeEmitter::DetermineREXPrefix locate operands more like how VEX prefix handling does.

llvm-svn: 262467
```
6a7cd422

[X86] Permit reading of the FLAGS register without it being previously defined · 5aadde1e

David Majnemer authored Mar 02, 2016

We modeled the RDFLAGS{32,64} operations as "using" {E,R}FLAGS.
While technically correct, this is not desirable for folks who want
to examine aspects of the FLAGS register which are not related to
computation like whether or not CPUID is a valid instruction.

Differential Revision: http://reviews.llvm.org/D17782

llvm-svn: 262465

5aadde1e

[X86] Remove assertion I accidentally left in. · d4dabb39
Craig Topper authored Mar 02, 2016
```
llvm-svn: 262464
```
d4dabb39