Commits · fa4e63c0d60a5f14379053fd6d05b594f40f91f6 · Lorenzo Albano / LLVM bpEVL

Mar 09, 2018

[Debug] Retain both sets of debug intrinsics in HoistThenElseCodeToIf · fa4e63c0

Ulrich Weigand authored Mar 09, 2018

When hoisting common code from the "then" and "else" branches of a condition
to before the "if", there is no need to require that debug intrinsics match
before moving them (and merging them).  Instead, we can simply always keep
all debug intrinsics from both sides of the "if".

This fixes PR36410, which describes a problem where as a result of the attempt
to merge debug locations for two debug intrinsics we end up with an invalid
intrinsic, where the scope indicated in the !dbg location no longer matches
the scope of the variable tracked by the intrinsic.

In addition, this has the benefit that we no longer throw away information
that is actually still valid, helping to generate better debug data.

Reviewed By: vsk

Differential Revision: https://reviews.llvm.org/D44312

llvm-svn: 327175


[Power9] Code Cleanup and adding Comments for Power 9 Scheduler · 735817aa

Stefan Pintilie authored Mar 09, 2018

Cleaned up the code by removing ItinRW entries that are not needed and resource
types that are no longer used.

Also added more comments to the td files related to the Power 9 scheduler model.

llvm-svn: 327174


[NFC] Consolidate six getPointerOperand() utility functions into one place · 038ede2a

Renato Golin authored Mar 09, 2018

There are six separate instances of the getPointerOperand() utility.
LoopVectorize.cpp has one of them,
and I don't want to create a 7th one while I'm trying to move
LoopVectorizationLegality into a separate file
(eventual objective is to move it to Analysis tree).

See http://lists.llvm.org/pipermail/llvm-dev/2018-February/120999.html
for llvm-dev discussions

Closes D43323.

Patch by Hideki Saito <hideki.saito@intel.com>.

llvm-svn: 327173


Correct load-op-store cycle detection analysis · 0fab4178

Nirav Dave authored Mar 09, 2018

Add missing cycle dependency checks in load-op-store fusion.

Fixes PR36274.

Reviewers: craig.topper, bogner

Subscribers: hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D43154

llvm-svn: 327172


Improve Dependency analysis when doing multi-node Instruction Selection · d668f69e

Nirav Dave authored Mar 09, 2018

Relanding after fixing NodeId Invariant.

Clean up cycle/validity checks in ISel (IsLegalToFold,
HandleMergeInputChains) and X86 (isFusableLoadOpStore). Now do a full
search for cycles / dependencies, pruning the search when the topological
property of NodeId allows.

As part of this, propagate the NodeId-based cutoffs to narrow
hasPredecessorHelper searches.

Reviewers: craig.topper, bogner

Subscribers: llvm-commits, hiraditya

Differential Revision: https://reviews.llvm.org/D41293

llvm-svn: 327171


[DAG] Enforce stricter NodeId invariant during Instruction selection · 071699bf

Nirav Dave authored Mar 09, 2018

Instruction Selection makes use of the topological ordering of nodes
by node id (a node's operands have smaller node id than it) when doing
cycle detection.  During selection we may violate this property as a
selection of multiple nodes may induce a use dependence (and thus a
node id restriction) between two unrelated nodes. If a selected node
has an unselected successor this may cause us to miss a cycle when
detecting an invalid selection.

This patch fixes this by giving all unselected successors of a
selected node a negated node id.  We avoid pruning on such negative
ids but can still reconstruct the original id for pruning.
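
The negated-id scheme can be illustrated with a small standalone sketch. The helper names (`invalidateId`, `originalId`, `canPrune`) and the `-(id + 1)` encoding are assumptions made for illustration and are not taken from the patch; the sketch only shows how a negated id can both block pruning and remain recoverable.

```cpp
#include <cassert>

// Hypothetical helpers, not LLVM's API. Valid ids are assumed non-negative.
// Encoding as -(id + 1) keeps id 0 distinguishable once negated.
inline int invalidateId(int id) {
  assert(id >= 0 && "id is already invalidated");
  return -(id + 1);
}

inline bool isInvalidated(int id) { return id < 0; }

// Recover the id a node had before it was invalidated.
inline int originalId(int id) { return isInvalidated(id) ? -(id + 1) : id; }

// During a predecessor search for a target node, a visited node with a
// smaller valid id cannot lead to the target through its operands, so the
// search may stop there. Invalidated ids are never pruned on.
inline bool canPrune(int visitedId, int targetId) {
  return !isInvalidated(visitedId) && visitedId < targetId;
}
```

With this encoding, `canPrune(invalidateId(3), 5)` is false even though the original id 3 is below 5, which is the conservative behaviour the message describes.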

In-tree targets have been updated to replace DAG-level replacements
with ISel-level ones which enforce this property.

This preemptively fixes PR36312 before the triggering commit r324359 relands.

Reviewers: craig.topper, bogner, jyknight

Subscribers: arsenm, nhaehnle, javed.absar, llvm-commits, hiraditya

Differential Revision: https://reviews.llvm.org/D43198

llvm-svn: 327170


Make early exit hasPredecessorHelper return true. NFCI. · 775f07d1

Nirav Dave authored Mar 09, 2018

All uses conservatively assume that the early-exit case is a
predecessor. Changing the default removes the checking code from all uses.

llvm-svn: 327169


cfi: Disable simple-pass.cpp on Darwin. · 43b055f8

Peter Collingbourne authored Mar 09, 2018

-mretpoline does not work yet on Darwin.

llvm-svn: 327168

[sanitizer] Revert rCRT327145 · 112d7a43

Kostya Kortchinsky authored Mar 09, 2018

Summary:
It breaks the Chromium toolchain due to:
```
lib/sanitizer_common/sanitizer_allocator_primary32.h:269:34: error: requested alignment is not an integer constant
   struct ALIGNED(kCacheLineSize) SizeClassInfo {
```

Reviewers: alekseyshl, thakis

Reviewed By: thakis

Subscribers: kubamracek, delcypher, #sanitizers, llvm-commits

Differential Revision: https://reviews.llvm.org/D44320

llvm-svn: 327167


Fix Clang test case. · a2f10056

Peter Collingbourne authored Mar 09, 2018

llvm-svn: 327166

Don't use -pie in relocatable link. · d48c0cd9

Evgeniy Stepanov authored Mar 09, 2018

Summary:
Android, in particular, got PIE enabled by default in r316606. It resulted in
relocatable links passing both -r and -pie to the linker, which is not allowed.

Reviewers: srhines

Subscribers: cfe-commits

Differential Revision: https://reviews.llvm.org/D44229

llvm-svn: 327165


[llvm-objdump] Support disassembling by symbol name · b0e4b916

Rafael Auler authored Mar 09, 2018

Summary:
Add a new option -df to llvm-objdump that takes function names
as arguments and instructs the disassembler to only dump those function
contents. Based on code originally written by Bill Nell.

Reviewers: espindola, JDevlieghere

Differential Revision: https://reviews.llvm.org/D44224

llvm-svn: 327164


Use branch funnels for virtual calls when retpoline mitigation is enabled. · 2974856a

Peter Collingbourne authored Mar 09, 2018

The retpoline mitigation for variant 2 of CVE-2017-5715 inhibits the
branch predictor, and as a result it can lead to a measurable loss of
performance. We can reduce the performance impact of retpolined virtual
calls by replacing them with a special construct known as a branch
funnel, which is an instruction sequence that implements virtual calls
to a set of known targets using a binary tree of direct branches. This
allows the processor to speculatively execute valid implementations of the
virtual function without allowing for speculative execution of calls
to arbitrary addresses.
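
As a rough source-level illustration of the idea (the patch itself generates machine code, not C++), a branch funnel for four known targets might behave like the sketch below. The vtable addresses, the `impl*` functions and `branchFunnel` are hypothetical names chosen for the example, and the vtables are assumed to be laid out in ascending address order.

```cpp
#include <cstdint>

// Hypothetical implementations of one virtual function in four classes.
inline int implA() { return 1; }
inline int implB() { return 2; }
inline int implC() { return 3; }
inline int implD() { return 4; }

// Hypothetical vtable addresses, assumed to be laid out in ascending order.
constexpr std::uintptr_t kVTableA = 0x1000;
constexpr std::uintptr_t kVTableB = 0x1040;
constexpr std::uintptr_t kVTableC = 0x1080;
constexpr std::uintptr_t kVTableD = 0x10c0;

// Binary tree of comparisons on the vtable address. Every leaf is a direct
// call to a known target, so no indirect branch is needed.
inline int branchFunnel(std::uintptr_t vtable) {
  if (vtable < kVTableC) {
    if (vtable < kVTableB)
      return implA();
    return implB();
  }
  if (vtable < kVTableD)
    return implC();
  return implD();
}
```

The depth of the tree grows logarithmically with the number of targets, which is why the technique is described for a set of known targets rather than arbitrary call sites.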

This patch extends the whole-program devirtualization pass to replace
certain virtual calls with calls to branch funnels, which are
represented using a new llvm.icall.jumptable intrinsic. It also extends
the LowerTypeTests pass to recognize the new intrinsic, generate code
for the branch funnels (x86_64 only for now) and lay out virtual tables
as required for each branch funnel.

The implementation supports full LTO as well as ThinLTO, and extends the
ThinLTO summary format used for whole-program devirtualization to
support branch funnels.

For more details see RFC:
http://lists.llvm.org/pipermail/llvm-dev/2018-January/120672.html

Differential Revision: https://reviews.llvm.org/D42453

llvm-svn: 327163


[SymbolFilePDB] Keep searching until the file name is found for the pdb compiland · dee18b82

Aaron Smith authored Mar 09, 2018

Reviewers: zturner, rnk, lldb-commits

Reviewed By: zturner

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D44182

llvm-svn: 327162


Avoid creating a Constant for each value in a ConstantDataSequential. · 08fa5942

Alina Sbirlea authored Mar 09, 2018

Summary: We create a ConstantDataSequential (ConstantDataArray or ConstantDataVector) to avoid creating a Constant for each element in an array of constants. But then in AsmPrinter, we do create a ConstantFP for each element in the ConstantDataSequential. This triggers excessive memory use when generating large global FP constants.

Reviewers: bogner, lhames, t.p.northover

Subscribers: jlebar, sanjoy, llvm-commits

Differential Revision: https://reviews.llvm.org/D44277

llvm-svn: 327161


Delay creating an alias for @@@. · 47b4d6ba

Rafael Espindola authored Mar 09, 2018

With this we only create an alias for @@@ once we know if it should
use @ or @@. This avoids last-minute renames and hacks to handle MS
names.

This only handles the ELF writer. LTO still has issues with @@@
aliases.

llvm-svn: 327160


[X86][AVX] createVariablePermute - fix v2i64/v2f64 VPERMILPD index creation. · 2cd489fe

Simon Pilgrim authored Mar 09, 2018

The input indices vector will put the index in bit0, but VPERMILPD actually selects off bit1 - so we need to scale accordingly.
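
A scalar model may make the scaling clearer. The functions below are illustrative stand-ins written for this note, not code from the patch: `permilpd` models the variable form of VPERMILPD on a 2 x f64 vector, where bit 1 of each 64-bit control element selects the source lane, and `scaleIndices` shifts a plain 0/1 index into that bit.

```cpp
#include <array>
#include <cstdint>

using Vec2f64 = std::array<double, 2>;
using Vec2i64 = std::array<std::uint64_t, 2>;

// Model of variable VPERMILPD on two lanes: bit 1 of each control element
// selects the source lane, bit 0 is ignored.
inline Vec2f64 permilpd(const Vec2f64 &src, const Vec2i64 &ctrl) {
  return Vec2f64{{src[(ctrl[0] >> 1) & 1], src[(ctrl[1] >> 1) & 1]}};
}

// Move a lane index held in bit 0 up to bit 1 (equivalent to doubling it).
inline Vec2i64 scaleIndices(const Vec2i64 &idx) {
  return Vec2i64{{idx[0] << 1, idx[1] << 1}};
}
```

Passing the unscaled indices `{1, 0}` to `permilpd` would select lane 0 twice, because bit 1 is clear in both elements; after `scaleIndices` the same indices swap the two lanes as intended.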

llvm-svn: 327159


TableGen: Remove space at EOL in TGLexer.{h,cpp} · 169ec09c

Nicolai Haehnle authored Mar 09, 2018

Change-Id: Ica5f39470174e85f173d3b6db95789033f75ce17
llvm-svn: 327158

[X86][SSE] createVariablePermute - move source vector canonicalization to top of function. NFCI. · 230d38b5

Simon Pilgrim authored Mar 09, 2018

This is to make it easier to return early from the switch statement with custom lowering.

llvm-svn: 327157

[ELF] Convert {read,write}*be to endianness-aware read/write. · 0c483024

Fangrui Song authored Mar 09, 2018

Subscribers: emaste, nemanjai, arichardson, kbarton, llvm-commits

Differential Revision: https://reviews.llvm.org/D44227

llvm-svn: 327156


[LV] Adding test for r327109 · bc94b98c

Renato Golin authored Mar 09, 2018

llvm-svn: 327155

ELF: Do not create multiple thunks for the same virtual address. · 04ff1226

Peter Collingbourne authored Mar 09, 2018

This avoids creating multiple thunks for symbols with aliases or which
belong to ICF'd sections. This patch reduces the size of Chromium for
Android by 260KB (0.8% of .text).

Differential Revision: https://reviews.llvm.org/D44284

llvm-svn: 327154


[AMDGPU] Supported ds_read_b128 generation; Widened vector length for local address-space. · a7cb3112

Farhana Aleen authored Mar 09, 2018

Summary: Starting from the 2nd generation of GCN, the ISA supports ds_read_b128 in addition to ds_read_b64.
This patch adds the ds_read_b128 instruction pattern and generation of this instruction.
In the vectorizer, this patch also widens the vector length so that the vectorizer generates
128-bit loads for the local address space, which get translated to ds_read_b128.
Since the performance benefit is not clear, the compiler generates ds_read_b128 only under -amdgpu-ds128.

Author: FarhanaAleen

Reviewed By: rampitec, arsenm

Subscribers: llvm-commits, AMDGPU

Differential Revision: https://reviews.llvm.org/D44210

llvm-svn: 327153


[GISel]: Add helpers for easy building G_FCONSTANT along with matchers · 91fc4e09

Aditya Nandakumar authored Mar 09, 2018

Added helpers to build G_FCONSTANT, along with matching ConstantFP and
unit tests for the same.

Sample usage:

  auto MIB = Builder.buildFConstant(s32, 0.5); // Build IEEESingle

To match the above:

  const ConstantFP* Tmp;
  mi_match(DstReg, MRI, m_GFCst(Tmp));

https://reviews.llvm.org/D44128
reviewed by: volkan

llvm-svn: 327152


[WebAssembly] Handle weak undefined functions with a synthetic stub · 2e55ee77

Nicholas Wilson authored Mar 09, 2018

This error case is described in Linking.md. The operand for call requires
generation of a synthetic stub.

Differential Revision: https://reviews.llvm.org/D44028

llvm-svn: 327151


[JumpThreading] Don't restrict cast-traversal to i1 · 95d9ccb2

Chad Rosier authored Mar 09, 2018

In r263618, JumpThreading learned to look through simple cast instructions, but
only if the source of those cast instructions was a phi/cmp i1 (in an effort to
limit compile time effects). I think this condition is too restrictive. For
switches with limited value range, InstCombine will readily introduce an extra
trunc instruction to a smaller integer type (e.g. from i8 to i2), leaving us in
the somewhat perverse situation that jump-threading would work before running
instcombine, but not after. Since instcombine produces this pattern, I think we
need to consider it canonical and support it in JumpThreading. In general,
for limiting recursion, I think the existing restriction to phi and cmp nodes
should be sufficient to avoid looking through unprofitable chains of
instructions.

Patch by Keno Fischer!
Differential Revision: https://reviews.llvm.org/D42262

llvm-svn: 327150


[WebAssembly] Refactor order of creation for SyntheticFunction · ebda41f8

Nicholas Wilson authored Mar 09, 2018

Previously we created __wasm_call_ctors with null InputFunction, and
added the InputFunction later. Now we create the SyntheticFunction with
null body, and set the body later.

Differential Revision: https://reviews.llvm.org/D44206

llvm-svn: 327149


Move generic test to the Generic directory · 3e4a82ff

Adrian Prantl authored Mar 09, 2018

llvm-svn: 327148

[AMDGPU] fix test to be independent of FP undef · 56d59c1f

Sanjay Patel authored Mar 09, 2018

llvm-svn: 327147

[WebAssembly] Disallow weak undefined globals in the object format · 15f349f7

Nicholas Wilson authored Mar 09, 2018

This implements https://github.com/WebAssembly/tool-conventions/pull/47

Differential Revision: https://reviews.llvm.org/D44201

llvm-svn: 327146


[sanitizer] Align & pad the allocator structures to the cacheline size · 69df838b

Kostya Kortchinsky authored Mar 09, 2018

Summary:
Both `SizeClassInfo` structures for the 32-bit primary & `RegionInfo`
structures for the 64-bit primary can be used by different threads, and as such
they should be aligned & padded to the cacheline size to avoid false sharing.
The former was padded but the array was not aligned; the latter was not padded,
but we lucked out as the size of the structure was 192 bytes, and it was aligned
by the properties of `mmap`.

I plan on adding a couple of fields to the `RegionInfo`, and some highly
threaded tests pointed out that without proper padding & alignment, performance
was taking a hit, which goes away with proper padding.

This patch makes sure that we are properly padded & aligned for both. I used
a template to avoid padding if the size is already a multiple of the cacheline
size. There might be a better way to do this, I am open to suggestions.
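
One way such a template could look is sketched below. This is not the code from the patch: `Padded`, `Small`, `Exact` and the 64-byte `kCacheLineSize` are assumptions for illustration. The specialization exists because a zero-length padding array is ill-formed, so the already-aligned case must avoid declaring one.

```cpp
#include <cstddef>

// Assumed cache line size; it must be an integer constant expression for
// alignas to accept it.
constexpr std::size_t kCacheLineSize = 64;

// Primary template: T's size is not a multiple of the cache line, so append
// the missing bytes explicitly.
template <typename T, bool NeedsPad = (sizeof(T) % kCacheLineSize != 0)>
struct alignas(kCacheLineSize) Padded : T {
  char Padding[kCacheLineSize - sizeof(T) % kCacheLineSize];
};

// Specialization: T already fills whole cache lines, so only align it.
template <typename T>
struct alignas(kCacheLineSize) Padded<T, false> : T {};

// Hypothetical payloads used to exercise both cases.
struct Small { char Data[40]; };
struct Exact { char Data[128]; };

static_assert(sizeof(Padded<Small>) % kCacheLineSize == 0, "padded to a line");
static_assert(sizeof(Padded<Exact>) == sizeof(Exact), "no extra padding");
```

An array of `Padded<Small>` then places each element on its own cache line, which is the property needed to avoid false sharing between threads.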

Reviewers: alekseyshl, dvyukov

Reviewed By: alekseyshl

Subscribers: kubamracek, delcypher, #sanitizers, llvm-commits

Differential Revision: https://reviews.llvm.org/D44261

llvm-svn: 327145


[InstSimplify] fix FP infinite hex constant values in tests; NFC · 3675b8ce

Sanjay Patel authored Mar 09, 2018

Really should improve this...

llvm-svn: 327144

Revert "[PowerPC] LSR tunings for PowerPC" · ef7c4976

Stefan Pintilie authored Mar 09, 2018

Revert the rest of the LSR tune commit.
It seems that the LSR tune commit breaks internal tests.
Reverting the commit.

llvm-svn: 327143


Revert "[PowerPC] Move test to correct location." · 7f879a84

Stefan Pintilie authored Mar 09, 2018

Revert part of the LSR tune commit.

llvm-svn: 327142

Tidy up comment that was destroyed by clang-format. NFCI. · 033a4167

Simon Pilgrim authored Mar 09, 2018

llvm-svn: 327141

[X86][SSE] createVariablePermute - move index vector canonicalization to top of function. NFCI. · 322c521e

Simon Pilgrim authored Mar 09, 2018

This is to make it easier to return early from the switch statement with custom lowering.

llvm-svn: 327140

Try to fix Windows bot by forcing "rm". · 2a705312

Tim Northover authored Mar 09, 2018

llvm-svn: 327139

[LangRef] make it clear that FP instructions do not have side effects · 3aaf6a02

Sanjay Patel authored Mar 09, 2018

Also, fix the undef vs. UB example to use 'sdiv' because that can trigger div-by-zero UB.

The existing text for the constrained intrinsics says:
"By default, LLVM optimization passes assume that the rounding mode is round-to-nearest 
and that floating point exceptions will not be monitored. Constrained FP intrinsics are 
used to support non-default rounding modes and accurately preserve exception behavior 
without compromising LLVM’s ability to optimize FP code when the default behavior is 
used."
...so the additional text with the normal FP opcodes should make the different modes
clear.

Differential Revision: https://reviews.llvm.org/D44216

llvm-svn: 327138


[dsymutil] Unify error handling and add color · 1dd69783

Jonas Devlieghere authored Mar 09, 2018

We improved the handling of errors and warnings in dwarfdump's verifier
in rL314498. This patch does the same thing for dsymutil.

Differential revision: https://reviews.llvm.org/D44052

llvm-svn: 327137


[OPENMP] Fix the address of the original variable in task reductions. · 21dab124

Alexey Bataev authored Mar 09, 2018

If initialization of the task reductions requires a pointer to the original
variable, which is stored in the threadprivate storage, we used the
address of this pointer instead.

llvm-svn: 327136
