Commits · 171b71d1b80d2a2f8b3bd1bb8c9e6b5410272222 · Roger Ferrer / llvm-epi

Apr 13, 2016

[mips][microMIPS] Add CodeGen support for DIV, MOD, DIVU, MODU, DDIV, DMOD,... · 58d6a959

Zlatko Buljan authored Apr 13, 2016

[mips][microMIPS] Add CodeGen support for DIV, MOD, DIVU, MODU, DDIV, DMOD, DDIVU and DMODU instructions
Differential Revision: http://reviews.llvm.org/D17137

This patch was reverted when the patch it depends on, http://reviews.llvm.org/D17068, was reverted.
The cause was a test-suite failure.
That failure is hopefully fixed in the dependency, so this patch is committed again.

llvm-svn: 266179

58d6a959

[InstCombine] We folded an fcmp to an i1 instead of a vector of i1 · 3ee5f344

David Majnemer authored Apr 13, 2016

Remove an ad-hoc transform in InstCombine and replace it with more
general machinery (ValueTracking, InstructionSimplify and VectorUtils).

This fixes PR27332.

llvm-svn: 266175

3ee5f344

Simplify LTOInternalize into UpdateLLVMCompilerUsed · ce23e970

Mehdi Amini authored Apr 13, 2016

It is now only doing the update to the llvm.compiler_used global.
The client has to call the internalization stage separately.
Hopefully the code is simpler to understand this way.

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 266174

ce23e970

Minor cleanup in Internalize, hide helper class using anonymous namespace (NFC) · 10593830
Mehdi Amini authored Apr 13, 2016
```
From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 266173
```
10593830

LTOInternalize: Use a StringSet instead of a sorted vector and a binary search... · 16fcb418

Mehdi Amini authored Apr 13, 2016

LTOInternalize: Use a StringSet instead of a sorted vector and a binary search query for each function

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 266172

16fcb418
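As a rough illustration of the data-structure swap described in the commit above, the sketch below contrasts a binary search over a sorted vector with a hash-set lookup. It uses `std::unordered_set` as a stand-in for `llvm::StringSet` so that it compiles without LLVM headers; the function names `mustPreserveSorted` and `mustPreserveSet` are hypothetical and do not come from the patch.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <unordered_set>
#include <vector>

// Before (sketch): names live in a sorted vector and every query is an
// O(log n) binary search. The vector must already be sorted.
bool mustPreserveSorted(const std::vector<std::string> &SortedNames,
                        const std::string &Name) {
  assert(std::is_sorted(SortedNames.begin(), SortedNames.end()));
  return std::binary_search(SortedNames.begin(), SortedNames.end(), Name);
}

// After (sketch): names live in a hash set and every query is an expected
// O(1) lookup. std::unordered_set stands in for llvm::StringSet here.
bool mustPreserveSet(const std::unordered_set<std::string> &Names,
                     const std::string &Name) {
  return Names.count(Name) != 0;
}
```

Both helpers answer the same membership question; the set version simply avoids having to keep the names sorted and pays less per query when there is one query per function.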

[mips][microMIPS] Fix for "Cannot copy registers" assertion · 11dd31df

Hrvoje Varga authored Apr 13, 2016

Differential Revision: http://reviews.llvm.org/D17068

This change contains a fix for the failing test-suite, so this patch should hopefully work now.

llvm-svn: 266171

11dd31df

Move "ExternalSymbols" out of LTOInternalize (NFC) · deee003a

Mehdi Amini authored Apr 13, 2016

This is not really related to internalization per se.

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 266170

deee003a

Really return whether Internalize did change the Module or not. · 59269a87
Mehdi Amini authored Apr 13, 2016
```
From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 266169
```
59269a87
Modernize Internalizer with for-range loop (NFC) · 3949b9e6
Mehdi Amini authored Apr 13, 2016
```
From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 266168
```
3949b9e6

Refactor the InternalizePass into a helper class, and expose it through a... · 24d3414f

Mehdi Amini authored Apr 13, 2016

Refactor the InternalizePass into a helper class, and expose it through a public free function (NFC)

There is really no reason to require instantiating a pass manager to
internalize.

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 266167

24d3414f

Refactor Internalization pass to use a callback instead of a StringSet (NFC) · 40787099

Mehdi Amini authored Apr 13, 2016

This will save a bunch of copies / initializations of intermediate
data structures, and (hopefully) simplify the code.

This also abstracts the symbol preservation mechanism out of the
Internalization pass and into the client code, which is not forced
to keep a map of strings, for instance (ThinLTO will prefer hashes).

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 266163

40787099
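The callback-based design described in the commit above can be sketched roughly as follows. This is not the LLVM API: `Symbol`, `internalizeSymbols`, and the `MustPreserve` parameter are hypothetical names chosen for illustration, and a real module holds far more than a name and a flag.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Hypothetical stand-in for a global value in a module.
struct Symbol {
  std::string Name;
  bool Internal;
};

// Marks every symbol as internal unless the client's callback asks to
// preserve it. Returns how many symbols were changed, so the caller can
// tell whether anything was modified.
std::size_t internalizeSymbols(
    std::vector<Symbol> &Symbols,
    const std::function<bool(const Symbol &)> &MustPreserve) {
  assert(MustPreserve && "the preservation callback must be callable");
  std::size_t Changed = 0;
  for (Symbol &S : Symbols) {
    if (S.Internal || MustPreserve(S))
      continue;
    S.Internal = true;
    ++Changed;
  }
  return Changed;
}
```

Because the helper only sees a predicate, the client is free to back it with a string set, a hash table of GUIDs, or anything else, and no intermediate set has to be built just to call it.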

Recommit r265547, and r265610,r265639,r265657 on top of it, plus · 9a16d655

Wei Mi authored Apr 13, 2016

two fixes: one for an error reported by verify-regalloc, and
another for the live range update of a phi after rematerialization.

r265547:
Replace analyzeSiblingValues with a new algorithm to fix its compile-time
issue. The patch addresses PR17409 and its duplicates.

analyzeSiblingValues is an N x N complexity algorithm, where N is
the number of siblings generated by register splitting. Although it
causes significant compile-time issues when N is large, it is also
important for performance, since it removes redundant spills and
enables rematerialization.

To solve the compile-time issue, the patch removes analyzeSiblingValues
and replaces it with lower-cost alternatives in two parts. The
first part creates a new spill hoisting method in postOptimization of
register allocation. It does spill hoisting once, after all the spills
are generated, instead of inside every instance of selectOrSplit. The
second part queries the defining expression of the original register for
rematerialization and keeps it always available during register allocation,
even if it is already dead. It deletes those dead instructions only in
postOptimization. With these two parts, the patch can remove
analyzeSiblingValues without sacrificing performance.

Patches on top of r265547:
r265610 "Fix the compare-clang diff error introduced by r265547."
r265639 "Fix the sanitizer bootstrap error in r265547."
r265657 "InlineSpiller.cpp: Escap \@ in r265547. [-Wdocumentation]"

Differential Revision: http://reviews.llvm.org/D15302
Differential Revision: http://reviews.llvm.org/D18934
Differential Revision: http://reviews.llvm.org/D18935
Differential Revision: http://reviews.llvm.org/D18936

llvm-svn: 266162

9a16d655

Fix FunctionImport export list computation: need to take a reference to a map... · ef7555fb

Mehdi Amini authored Apr 13, 2016

Fix FunctionImport export list computation: need to take a reference to a map entry to actually modify it

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 266159

ef7555fb
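The class of bug named in the commit title above (modifying a copy of a map entry instead of the entry itself) can be reproduced in a few lines. The sketch below is not the FunctionImport code; `ExportLists`, `addExportByCopy`, and `addExportByRef` are hypothetical names used only to show the difference between `auto` and `auto &`.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical shape: module name -> list of exported function names.
using ExportLists = std::map<std::string, std::vector<std::string>>;

// Buggy variant: `auto` deduces a value type, so List is a copy of the map
// entry and the push_back is lost when the function returns.
void addExportByCopy(ExportLists &Exports, const std::string &Module,
                     const std::string &Fn) {
  assert(!Fn.empty() && "this sketch expects a non-empty function name");
  auto List = Exports[Module];
  List.push_back(Fn);
}

// Fixed variant: `auto &` binds to the entry stored in the map, so the
// push_back modifies the map itself.
void addExportByRef(ExportLists &Exports, const std::string &Module,
                    const std::string &Fn) {
  assert(!Fn.empty() && "this sketch expects a non-empty function name");
  auto &List = Exports[Module];
  List.push_back(Fn);
}
```

Note that the buggy variant still creates an empty entry through `operator[]`, which can make the mistake harder to spot: the key is present, but its list never grows.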

AMDGPU: Add test for m0 initialization in basic loop · 887d4767

Matt Arsenault authored Apr 13, 2016

Initialization of m0 is emitted for each LDS operation, so
every block with LDS usage ends up with one. MachineLICM
used to fail to hoist this out of the loop, so every loop
iteration with LDS usage in it would re-initialize it.

This seems to be fixed now, so add a test to make sure that
it stays this way.

llvm-svn: 266156

887d4767

AMDGPU: Remove leftover ShaderType attributes in tests · b34eea9c
Matt Arsenault authored Apr 13, 2016
```
llvm-svn: 266155
```
b34eea9c
LTOInternalize: Fix member type, should be a reference and not a copy · 3dfc952e
Mehdi Amini authored Apr 12, 2016
```
From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 266153
```
3dfc952e

AMDGPU/SI: Fix spilling of 96-bit registers · 703b2ec4

Tom Stellard authored Apr 12, 2016

Summary:
It seems like this was broken in r252327.  I thought we had test cases
for this, but it's really hard to trigger spills of this exact register
size since they aren't used very much.

Reviewers: arsenm, nhaehnle

Subscribers: nhaehnle, arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D19021

llvm-svn: 266152

703b2ec4

Fix mismatch on returned type between header and implementation for createNameAnonFunctionPass() · 818f67ad
Mehdi Amini authored Apr 12, 2016
```
From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 266151
```
818f67ad

CodeGen: Clear the MFI's save and restore point after PrologEpilogInserter · 263f314b

Justin Bogner authored Apr 12, 2016

This state is no longer useful and not guaranteed to be valid in later
codegen passes. For example, see the added test, which would print a
savepoint of %bb.-1 without this change, and crashes with a
use-after-free error under ASan if you apply the recycling allocator
patch from llvm.org/PR26808.

llvm-svn: 266150

263f314b

Add space between words in verify-scev-maps option help message · e48e3937
Jeroen Ketema authored Apr 12, 2016
```
llvm-svn: 266149
```
e48e3937

[x86, InstCombine] fix masked load pass-through operand to be a zero vector · 5e5056d9

Sanjay Patel authored Apr 12, 2016

This bug was introduced with:
http://reviews.llvm.org/rL262269

AVX masked loads are specified to set vector lanes to zero when the high bit of the mask
element for that lane is zero:
"If the mask is 0, the corresponding data element is set to zero in the load form of these
instructions, and unmodified in the store form." --Intel manual

Differential Revision: http://reviews.llvm.org/D19017

llvm-svn: 266148

5e5056d9
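The masked-load rule quoted from the Intel manual in the commit above can be modeled in scalar code. The sketch below is an assumption-laden model of a 4-lane, 32-bit masked load, not the InstCombine transform or the intrinsic itself; `maskedLoad4` is a hypothetical name.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Scalar model of a 4-lane masked load: a lane is loaded only when the high
// (sign) bit of its mask element is set; every other lane is zero. That is
// why the pass-through operand of the equivalent generic masked load has to
// be a zero vector.
std::array<std::int32_t, 4>
maskedLoad4(const std::int32_t *Src, const std::array<std::int32_t, 4> &Mask) {
  assert(Src != nullptr);
  std::array<std::int32_t, 4> Result{}; // all lanes start as zero
  for (std::size_t I = 0; I < 4; ++I)
    if (Mask[I] < 0) // negative means the high bit is set
      Result[I] = Src[I];
  return Result;
}
```

Only the high bit matters: a mask element of 1 has its high bit clear, so that lane reads as zero even though the element is nonzero.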

[AArch64] Fuse AES{D,E}/AESMC for Exynos M1. (NFC) · 551af44e
Evandro Menezes authored Apr 12, 2016
```
llvm-svn: 266144
```
551af44e
Pre-fill LibcallRoutineNames with nullptr. · 7873fb9d
James Y Knight authored Apr 12, 2016
```
And rearrange InitLibcallNames slightly.

llvm-svn: 266142
```
7873fb9d

Apr 12, 2016

Update psabi link for x86-64. Add link to linux gabi supplement. · 7c7e73b5
James Y Knight authored Apr 12, 2016
```
llvm-svn: 266137
```
7c7e73b5
[MC/ELFObjectWriter] Fix indentation of class body. · a0fa2621
David Blaikie authored Apr 12, 2016
```
llvm-svn: 266136
```
a0fa2621
Fixed a few typos and formatting problems. NFCI. · 99775c1b
David L Kreitzer authored Apr 12, 2016
```
llvm-svn: 266135
```
99775c1b
[DebugInfo] Add error message to test. · 245383c5
Davide Italiano authored Apr 12, 2016
```
Suggested by Rafael as post-commit review (r266102).

llvm-svn: 266134
```
245383c5

Add a pass to name anonymous/nameless functions · d5faa267

Mehdi Amini authored Apr 12, 2016

Summary:
To handle aliases to nameless functions correctly, we need to be able
to refer to them through a GUID in the summary.
Here we name them using a hash of the non-private global names in the module.

Reviewers: tejohnson

Subscribers: joker.eph, llvm-commits

Differential Revision: http://reviews.llvm.org/D18883

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 266132

d5faa267

Move summary creation out of llvm-as into opt · 68da426e

Mehdi Amini authored Apr 12, 2016

Summary:
Let's keep llvm-as "dumb": it converts textual IR to bitcode. This
commit removes the dependency from llvm-as to libLLVMAnalysis.
We'll add back summary in llvm-as if we get to a textual
representation for it at some point. In the meantime, opt seems
like a better place for that.

Reviewers: tejohnson

Subscribers: joker.eph, llvm-commits

Differential Revision: http://reviews.llvm.org/D19032

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 266131

68da426e

X86: Avoid accessing SDValues after they've been RAUW'd · 32ad24d4

Justin Bogner authored Apr 12, 2016

This fixes two use-after-frees in selectLEA64_32Addr. If matchAddress
matches an ADD with an AND as an operand, and that AND hits one of the
"heroic transforms" that folds masks and shifts, we end up with N
pointing to an SDNode that was deleted. Make sure we're done accessing
it before that.

Found by ASan with the recycling allocator changes in llvm.org/PR26808.

llvm-svn: 266130

32ad24d4

NFC: MergeFunctions return early · f90029bb
JF Bastien authored Apr 12, 2016
```
Same effect, easier to read.

llvm-svn: 266128
```
f90029bb

AMDGPU: add llvm.amdgcn.buffer.load/store intrinsics · df77c9ad

Nicolai Haehnle authored Apr 12, 2016

Summary:
They correspond to BUFFER_LOAD/STORE_DWORD[_X2,X3,X4] and mostly behave like
llvm.amdgcn.buffer.load/store.format. They will be used by Mesa for SSBO and
atomic counters at least when robust buffer access behavior is desired.
(These instructions perform no format conversion and do buffer range checking
per component.)

As a side effect of sharing patterns with llvm.amdgcn.buffer.store.format,
it has become trivial to add support for the f32 and v2f32 variants of that
intrinsic, so the patch does so.

Also DAG-ify (and fix) some tests that I noticed intermittent failures in
while developing this patch.

Some tests were (temporarily) adjusted for the required mayLoad/hasSideEffects
changes to the BUFFER_STORE_DWORD* instructions. See also
http://reviews.llvm.org/D18291.

Reviewers: arsenm, tstellarAMD, mareko

Subscribers: arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D18292

llvm-svn: 266126

df77c9ad

[ThinLTO] Only compute imports for current module in FunctionImport pass · c86af334

Teresa Johnson authored Apr 12, 2016

Summary:
The function import pass was computing all the imports for all the
modules in the index, and only using the imports for the current module.
Change this to instead compute only for the given module. This means
that the exports list can't be populated, but they weren't being used
anyway.

Longer term, the linker can collect all the imports and export lists
and serialize them out for consumption by the distributed backend
processes which use this pass.

Reviewers: joker.eph

Subscribers: llvm-commits, joker.eph

Differential Revision: http://reviews.llvm.org/D18945

llvm-svn: 266125

c86af334

NFC: MergeFunctions update more comments · 1bb32ac4
JF Bastien authored Apr 12, 2016
```
They are wordy. Some words were wrong.

llvm-svn: 266124
```
1bb32ac4

Add __atomic_* lowering to AtomicExpandPass. · 19f6cce4

James Y Knight authored Apr 12, 2016

(Recommit of r266002, with r266011, r266016, and not accidentally
including an extra unused/uninitialized element in LibcallRoutineNames)

AtomicExpandPass can now lower atomic load, atomic store, atomicrmw, and
cmpxchg instructions to __atomic_* library calls, when the target
doesn't support atomics of a given size.

This is the first step towards moving all atomic lowering from clang
into llvm. When all is done, the behavior of __sync_* builtins,
__atomic_* builtins, and C11 atomics will be unified.

Previously LLVM would pass everything through to the ISelLowering
code. There, unsupported atomic instructions would turn into __sync_*
library calls. Because of that behavior, Clang currently avoids emitting
llvm IR atomic instructions when this would happen, and emits __atomic_*
library functions itself, in the frontend.

This change makes LLVM able to emit __atomic_* libcalls, and thus will
eventually allow clang to depend on LLVM to do the right thing.

It is advantageous to do the new lowering to atomic libcalls in
AtomicExpandPass, before ISel time, because it's important that all
atomic operations for a given size either lower to __atomic_*
libcalls (which may use locks), or native instructions which won't. No
mixing and matching.

At the moment, this code is enabled only for SPARC, as a
demonstration. The next commit will expand support to all of the other
targets.

Differential Revision: http://reviews.llvm.org/D18200

llvm-svn: 266115

19f6cce4
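The "no mixing and matching" rule in the commit above can be pictured as a decision that depends only on the operand size, never on the kind of atomic operation. The following is a minimal sketch under that assumption; `AtomicOp`, `Lowering`, `chooseLowering`, and the `MaxNativeSizeInBytes` parameter are hypothetical and are not the AtomicExpandPass interface, which may take other properties into account as well.

```cpp
#include <cassert>

enum class AtomicOp { Load, Store, RMW, CmpXchg };
enum class Lowering { NativeInstruction, LibraryCall };

// Hypothetical decision function: the operation kind is deliberately
// ignored, so every atomic operation of a given size gets the same answer.
// Mixing lock-based libcalls with native instructions on the same object
// would not be safe, because the native instructions do not take the lock.
Lowering chooseLowering(AtomicOp /*Op*/, unsigned SizeInBytes,
                        unsigned MaxNativeSizeInBytes) {
  assert(SizeInBytes > 0 && "atomic operands have a nonzero size");
  return SizeInBytes <= MaxNativeSizeInBytes ? Lowering::NativeInstruction
                                             : Lowering::LibraryCall;
}
```

With a hypothetical limit of 8 bytes, a 4-byte load and a 4-byte cmpxchg would both be native, while a 16-byte load and a 16-byte cmpxchg would both become library calls.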

[WebAssembly] Fix debug info in reg-stackify.ll test · b861ec87
Derek Schuff authored Apr 12, 2016
```
It lacked a CU and thus became invalid with r266102

llvm-svn: 266114
```
b861ec87
Delete mergefunctions.clang.svn.patch · 4c3fa5f9
JF Bastien authored Apr 12, 2016
```
The patch doesn't apply, and was removed from zorg by rL266094.

llvm-svn: 266112
```
4c3fa5f9

AMDGPU/SI: Insert wait states required after v_readfirstlane on SI · ab1d3a9d

Tom Stellard authored Apr 12, 2016

Summary:
We will be able to handle this case much better once the hazard recognizer
is finished, but this conservative implementation fixes a hang with the piglit
test:

spec/arb_arrays_of_arrays/execution/sampler/fs-nested-struct-arrays-nonconst-nested-arra

Reviewers: arsenm, nhaehnle

Subscribers: arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D18988

llvm-svn: 266105

ab1d3a9d

AMDGPU: Eliminate half of i64 or if one operand is zero_extend from i32 · 3b08238f

Matt Arsenault authored Apr 12, 2016

This helps clean up some of the mess when expanding unaligned 64-bit
loads when they are changed to be promoted to v2i32, and fixes situations
where or x, 0 was emitted after splitting 64-bit ors during moveToVALU.

I think this could be a generic combine but I'm not sure.

llvm-svn: 266104

3b08238f
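The arithmetic behind the combine in the commit above is easy to check in plain integer code: when one operand of a 64-bit or is a zero-extended 32-bit value, its high half is zero, so the high half of the result is just the high half of the other operand. The sketch below shows this at the C++ level only, not as a DAG combine; all function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Reference form: a full 64-bit or with a zero-extended 32-bit operand.
std::uint64_t orWithZext(std::uint64_t X, std::uint32_t Y) {
  return X | static_cast<std::uint64_t>(Y);
}

// Split form: only the low half needs an or. The high half would be
// `hi(X) | 0`, which is simply hi(X), so that or is dropped.
std::uint64_t orSplit(std::uint64_t X, std::uint32_t Y) {
  std::uint32_t Lo = static_cast<std::uint32_t>(X) | Y;
  std::uint32_t Hi = static_cast<std::uint32_t>(X >> 32);
  return (static_cast<std::uint64_t>(Hi) << 32) | Lo;
}

// Small helper that checks the two forms agree for one input pair.
void checkOrSplit(std::uint64_t X, std::uint32_t Y) {
  assert(orWithZext(X, Y) == orSplit(X, Y));
}
```

On a target that performs 64-bit operations as pairs of 32-bit operations, the split form saves one instruction per such or.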

[IR/Verifier] Each DISubprogram with isDefinition: true must belong to a CU. · b390d8ee

Davide Italiano authored Apr 12, 2016

Add a check to catch violations. ~60 tests were broken and prevented
this change from being committed. Adrian and I (thanks Adrian!) went
through them in the last week or so, updating them. The check can be
done more efficiently, but I'd still like to get this in ASAP to
avoid more broken tests being checked in (if any).

PR:  27101
llvm-svn: 266102

b390d8ee