Commits · 39d4c9fd56e353cc6ce55d5e482c8d14fb340ecc · Lorenzo Albano / LLVM bpEVL

Oct 11, 2019

[VPlan] Add moveAfter to VPRecipeBase. · 39d4c9fd

Florian Hahn authored Oct 11, 2019

This patch adds a moveAfter method to VPRecipeBase, which can be used to
move elements after other elements, across VPBasicBlocks, if necessary.

Reviewers: dcaballe, hsaito, rengolin, hfinkel

Reviewed By: dcaballe

Differential Revision: https://reviews.llvm.org/D46825

llvm-svn: 374565

39d4c9fd

[AIX] Use .space instead of .zero in assembly · 033d16ce

David Tenty authored Oct 11, 2019

Summary:
The AIX system assembler does not understand .zero, so we should prefer
emitting .space.

Subscribers: nemanjai, hiraditya, kbarton, MaskRay, jsji, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D68815

llvm-svn: 374564

033d16ce

[AMDGPU][MC][GFX9][GFX10] Corrected number of src operands for ds_[read/write]_addtid_b32 · c4995076

Dmitry Preobrazhensky authored Oct 11, 2019

See https://bugs.llvm.org/show_bug.cgi?id=37941

Reviewers: arsenm, rampitec

Differential Revision: https://reviews.llvm.org/D68787

llvm-svn: 374561

c4995076

gn build: Merge r374558 · b67d3df1
LLVM GN Syncbot authored Oct 11, 2019
```
llvm-svn: 374560
```
b67d3df1

[AMDGPU][MC][GFX6][GFX7][GFX10] Added instructions buffer_atomic_[fcmpswap/fmin/fmax]* · b82fae01

Dmitry Preobrazhensky authored Oct 11, 2019

See https://bugs.llvm.org/show_bug.cgi?id=28232

Reviewers: arsenm, rampitec

Differential Revision: https://reviews.llvm.org/D68788

llvm-svn: 374559

b82fae01

[AMDGPU][MC][GFX10] Enabled null for 64-bit dst operands · 472c6b0a

Dmitry Preobrazhensky authored Oct 11, 2019

See https://bugs.llvm.org/show_bug.cgi?id=43524

Reviewers: arsenm, rampitec

Differential Revision: https://reviews.llvm.org/D68785

llvm-svn: 374557

472c6b0a

[llvm] [ocaml] Support linking against dylib · da2a29a1

Michal Gorny authored Oct 11, 2019

Support linking OCaml modules against LLVM dylib when requested,
rather than against static libs that might not be installed at all.

Differential Revision: https://reviews.llvm.org/D68452

llvm-svn: 374556

da2a29a1

[DAGCombiner] fold vselect-of-constants to shift · 3b581ac8

Sanjay Patel authored Oct 11, 2019

The diffs suggest that we are missing some more basic
analysis/transforms, but this keeps the vector path in
sync with the scalar (rL374397). This is again a
preliminary step for introducing the reverse transform
in IR as proposed in D63382.

llvm-svn: 374555

3b581ac8

Fix compilation warnings. NFC. · 30c855d4
Michael Liao authored Oct 11, 2019
```
llvm-svn: 374554
```
30c855d4

[AMDGPU][MC] Corrected parsing of optional operands · 882c3e3d

Dmitry Preobrazhensky authored Oct 11, 2019

See https://bugs.llvm.org/show_bug.cgi?id=43486

Reviewers: artem.tamazov, arsenm

Differential Revision: https://reviews.llvm.org/D68350

llvm-svn: 374553

882c3e3d

[mips] Follow-up to r374544. Fix test case. · d18e56db
Simon Atanasyan authored Oct 11, 2019
```
llvm-svn: 374548
```
d18e56db

[Tests] Output of od can be lower or upper case (llvm-objcopy/yaml2obj). · 42b7cd58

Kai Nacke authored Oct 11, 2019

The command `od -t x` is used to dump data in hex format.
The LIT tests assume that the hex characters are in lowercase.
However, there are also platforms which use uppercase letters.

To solve this issue the tests are updated to use the new
`--ignore-case` option of FileCheck.
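
The effect of ignoring case can be illustrated with a minimal Python sketch. It is not FileCheck itself: the `matches` helper is hypothetical, and the two dump lines are made-up stand-ins for `od` output on different platforms.

```python
import re

def matches(pattern: str, line: str, ignore_case: bool = False) -> bool:
    """Return True if the literal pattern occurs in line (hypothetical helper)."""
    flags = re.IGNORECASE if ignore_case else 0
    return re.search(re.escape(pattern), line, flags) is not None

# Illustrative od-style lines; real output depends on the platform.
lower = "0000000 deadbeef"
upper = "0000000 DEADBEEF"

print(matches("deadbeef", lower))                    # True: same case
print(matches("deadbeef", upper))                    # False: case differs
print(matches("deadbeef", upper, ignore_case=True))  # True: case ignored
```

A check written against lowercase hex digits therefore passes on both kinds of platform only when case is ignored.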

Reviewers: Bigcheese, jakehehrlich, rupprecht, espindola, alexshap, jhenderson

Differential Revision: https://reviews.llvm.org/D68693

llvm-svn: 374547

42b7cd58

[mips] Fix loading "double" immediate into a GPR and FPR · b051a19a

Simon Atanasyan authored Oct 11, 2019

If the low 32 bits of a "double" (64-bit) value are zero, it's possible to load
such a value into a GP/FP register as an instruction immediate. But currently
the assembler loads only the high 32 bits of the value.

For example, if the target register is a GPR, the `li.d $4, 1.0` instruction
is converted into `lui $4, 16368`. As a result, we get `0x3FF00000`
in the register, while the correct representation of the `1.0` value is
`0x3FF0000000000000`. The patch fixes that.
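
The numbers in this example can be checked with a short Python sketch. It only reproduces the arithmetic; the helper name `double_bits` is an assumption made for illustration and is not part of the patch.

```python
import struct

def double_bits(value: float) -> int:
    """Return the IEEE-754 binary64 bit pattern of value as an unsigned integer."""
    return struct.unpack(">Q", struct.pack(">d", value))[0]

bits = double_bits(1.0)
high32 = bits >> 32           # upper half of the 64-bit pattern
low32 = bits & 0xFFFFFFFF     # lower half, zero for 1.0
lui_imm = high32 >> 16        # lui places its 16-bit immediate in bits 31..16

print(hex(bits))    # 0x3ff0000000000000
print(hex(high32))  # 0x3ff00000
print(low32)        # 0
print(lui_imm)      # 16368
```

The immediate 16368 is `0x3FF0`, so a single `lui` yields only the upper half of the 64-bit pattern.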

Differential Revision: https://reviews.llvm.org/D68776

llvm-svn: 374544

b051a19a

[llvm-readobj] - Remove excessive fields when dumping "Version symbols". · e6e26339

George Rimar authored Oct 11, 2019

This removes a few fields that are not useful:
"Section Name", "Address", "Offset" and "Link"
(they duplicated the information available under
the "Sections [" tag).

Differential revision: https://reviews.llvm.org/D68704

llvm-svn: 374541

e6e26339

Dead Virtual Function Elimination · 9f6a8732

Oliver Stannard authored Oct 11, 2019

Currently, it is hard for the compiler to remove unused C++ virtual
functions, because they are all referenced from vtables, which are referenced
by constructors. This means that if the constructor is called from any live
code, then we keep every virtual function in the final link, even if there
are no call sites which can use it.

This patch allows unused virtual functions to be removed during LTO (and
regular compilation in limited circumstances) by using type metadata to match
virtual function call sites to the vtable slots they might load from. This
information can then be used in the global dead code elimination pass instead
of the references from vtables to virtual functions, to more accurately
determine which functions are reachable.
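
The idea can be sketched with a toy Python model. This is only an illustration under strong assumptions: the class table, the `derived_or_self` and `live_virtual_functions` helpers, and the single call site are all hypothetical, and the model ignores visibility, constructors and everything else the real pass has to handle.

```python
# Toy model: class name -> (base class or None, vtable as a list of functions).
classes = {
    "Base":    (None,   ["Base::f", "Base::g"]),
    "Derived": ("Base", ["Derived::f", "Base::g"]),
}

def derived_or_self(name):
    """Classes whose vtable a call through static type `name` might load from."""
    result = set()
    for cls in classes:
        cur = cls
        while cur is not None:
            if cur == name:
                result.add(cls)
                break
            cur = classes[cur][0]
    return result

def live_virtual_functions(call_sites):
    """call_sites: iterable of (static type, vtable slot index) pairs."""
    live = set()
    for static_type, slot in call_sites:
        for cls in derived_or_self(static_type):
            live.add(classes[cls][1][slot])
    return live

all_funcs = {f for _, vtable in classes.values() for f in vtable}
live = live_virtual_functions([("Base", 0)])  # only slot 0 is ever called
print(sorted(live))              # ['Base::f', 'Derived::f']
print(sorted(all_funcs - live))  # ['Base::g']
```

In this model `Base::g` sits in both vtables, yet no call site can reach its slot, so it is the kind of function the optimisation aims to drop.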

To make this transformation safe, I have changed clang's code-generation to
always load virtual function pointers using the llvm.type.checked.load
intrinsic, instead of regular load instructions. I originally tried writing
this using clang's existing code-generation, which uses the llvm.type.test
and llvm.assume intrinsics after doing a normal load. However, it is possible
for optimisations to obscure the relationship between the GEP, load and
llvm.type.test, causing GlobalDCE to fail to find virtual function call
sites.

The existing linkage and visibility types don't accurately describe the scope
in which a virtual call could be made which uses a given vtable. This is
wider than the visibility of the type itself, because a virtual function call
could be made using a more-visible base class. I've added a new
!vcall_visibility metadata type to represent this, described in
TypeMetadata.rst. The internalization pass and libLTO have been updated to
change this metadata when linking is performed.

This doesn't currently work with ThinLTO, because it needs to see every call
to llvm.type.checked.load in the linkage unit. It might be possible to
extend this optimisation to be able to use the ThinLTO summary, as was done
for devirtualization, but until then that combination is rejected in the
clang driver.

To test this, I've written a fuzzer which generates random C++ programs with
complex class inheritance graphs, and virtual functions called through object
and function pointers of different types. The programs are spread across
multiple translation units and DSOs to test the different visibility
restrictions.

I've also tried doing bootstrap builds of LLVM to test this. This isn't
ideal, because only classes in anonymous namespaces can be optimised with
-fvisibility=default, and some parts of LLVM (plugins and bugpoint) do not
work correctly with -fvisibility=hidden. However, there are only 12 test
failures when building with -fvisibility=hidden (and an unmodified compiler),
and this change does not cause any new failures for either value of
-fvisibility.

On the 7 C++ sub-benchmarks of SPEC2006, this gives a geomean code-size
reduction of ~6%, over a baseline compiled with "-O2 -flto
-fvisibility=hidden -fwhole-program-vtables". The best cases are reductions
of ~14% in 450.soplex and 483.xalancbmk, and there are no code size
increases.

I've also run this on a set of 8 mbed-os examples compiled for Armv7M, which
show a geomean size reduction of ~3%, again with no size increases.

I had hoped that this would have no effect on performance, which would allow
it to always be enabled (when using -fwhole-program-vtables). However, the
changes in clang to use the llvm.type.checked.load intrinsic are causing ~1%
performance regression in the C++ parts of SPEC2006. It should be possible to
recover some of this perf loss by teaching optimisations about the
llvm.type.checked.load intrinsic, which would make it worth turning this on
by default (though it's still dependent on -fwhole-program-vtables).

Differential revision: https://reviews.llvm.org/D63932

llvm-svn: 374539

9f6a8732

[FileCheck] Implement --ignore-case option. · 5b5b2fd2

Kai Nacke authored Oct 11, 2019

The FileCheck utility is enhanced to support a `--ignore-case`
option. This is useful in cases where the output of Unix tools
differs in case (e.g. case not specified by POSIX).

Reviewers: Bigcheese, jakehehrlich, rupprecht, espindola, alexshap, jhenderson, MaskRay

Differential Revision: https://reviews.llvm.org/D68146

llvm-svn: 374538

5b5b2fd2

[SCEV] Add stricter verification option. · 77fbf069

Florian Hahn authored Oct 11, 2019

Currently -verify-scev only fails if there is a constant difference
between two BE counts. This misses a lot of cases.

This patch adds a -verify-scev-strict option, which fails for any
non-zero differences, if used together with -verify-scev.

With the stricter checking, some unit tests fail because
of mismatches, especially around IndVarSimplify.

If there is no reason I am missing for just checking constant deltas, I
am planning on looking into the various failures.

Reviewers: efriedma, sanjoy.google, reames, atrick

Reviewed By: sanjoy.google

Differential Revision: https://reviews.llvm.org/D68592

llvm-svn: 374535

77fbf069

[X86] isFNEG - add recursion depth limit · 6434eac8

Simon Pilgrim authored Oct 11, 2019

Now that it's used by isNegatibleForFree, we should try to avoid costly deep recursion.

llvm-svn: 374534

6434eac8

[llvm-exegesis] Show noise cluster in analysis output. · c8eb0547

Clement Courbet authored Oct 11, 2019

Reviewers: gchatelet

Subscribers: tschuett, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D68780

llvm-svn: 374533

c8eb0547

[Windows] Use information from the PE32 exceptions directory to construct unwind plans · 30c2441a

Aleksandr Urakov authored Oct 11, 2019

This patch adds an implementation of unwinding using PE EH info. It makes it possible to
get almost ideal call stacks on 64-bit Windows systems (except some epilogue
cases, but I believe that they can be fixed with unwind plan disassembly
augmentation in the future).

To achieve this, the CallFrameInfo abstraction was introduced. It is based on the
DWARFCallFrameInfo class interface with a few changes to make it less
DWARF-specific.

To implement the new interface for PECOFF object files, the class PECallFrameInfo
was written. It uses the following helper classes:

- UnwindCodesIterator helps to iterate through UnwindCode structures (and
  processes chained infos transparently);
- EHProgramBuilder uses UnwindCodesIterator to construct an EHProgram;
- EHProgram is, in fact, a vector of EHInstructions. It creates an abstraction
  over the low-level unwind codes and simplifies work with them. It contains
  only the information that is relevant to unwinding, in a unified form. It also
  ensures that the required unwind codes are read from the object file only once;
- EHProgramRange allows taking a range of an EHProgram and building an unwind row
  for it.

So, PECallFrameInfo builds the EHProgram with EHProgramBuilder, takes the ranges
corresponding to every offset in the prologue and builds the rows of the resulting
unwind plan. The resulting plan covers the whole range of the function except the
epilogue.

Reviewers: jasonmolenda, asmith, amccarth, clayborg, JDevlieghere, stella.stamenova, labath, espindola

Reviewed By: jasonmolenda

Subscribers: leonid.mashinskiy, emaste, mgorny, aprantl, arichardson, MaskRay, lldb-commits, llvm-commits

Tags: #lldb

Differential Revision: https://reviews.llvm.org/D67347

llvm-svn: 374528

30c2441a

Insert module constructors in a module pass · b46dd6e9

Vitaly Buka authored Oct 11, 2019

Summary:
If we insert them from a function pass, some analyses may be missing or invalid.
Fixes PR42877.

Reviewers: eugenis, leonardchan

Reviewed By: leonardchan

Subscribers: hiraditya, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D68832



> llvm-svn: 374481
Signed-off-by: Vitaly Buka <vitalybuka@google.com>

llvm-svn: 374527

b46dd6e9

[TableGen] Fix a bug that MCSchedClassDesc is interfered between different SchedModel · bb8d5400

QingShan Zhang authored Oct 11, 2019

Assume that ModelA has a scheduling resource for InstA and ModelB has a scheduling resource for InstB. This is what the llvm::MCSchedClassDesc tables should look like:

llvm::MCSchedClassDesc ModelASchedClasses[] = {
...
InstA, 0, ...
InstB, -1,...
};

llvm::MCSchedClassDesc ModelBSchedClasses[] = {
...
InstA, -1,...
InstB, 0,...
};

Here -1 marks an invalid number of micro-ops, while any value >= 0 is valid. This is what the tables look like now:

llvm::MCSchedClassDesc ModelASchedClasses[] = {
...
InstA, 0, ...
InstB, 0,...
};

llvm::MCSchedClassDesc ModelBSchedClasses[] = {
...
InstA, 0,...
InstB, 0,...
};

The compiler then hits an assertion because the SCDesc is now valid for both InstA and InstB.

Differential Revision: https://reviews.llvm.org/D67950

llvm-svn: 374524

bb8d5400

[X86] Add v8i64->v8i8 ssat/usat/packus truncate tests to min-legal-vector-width.ll · e0cb1cf7

Craig Topper authored Oct 11, 2019

I wonder if we should split the v8i8 stores in order to form
two v4i8 saturating truncating stores. This would remove the
unpckl needed to concatenate the v4i8 results to make a
single store.

llvm-svn: 374519

e0cb1cf7

[ADT][Statistics] Fix test after rL374490 · 404e2128
Kadir Cetinkaya authored Oct 11, 2019
```
llvm-svn: 374518
```
404e2128

Fix modules build for r374337 · 7ff28ce1

Pavel Labath authored Oct 11, 2019

A modules build failed with the following error:
  call to function 'operator&' that is neither visible in the template definition nor found by argument-dependent lookup

Fix that by declaring the appropriate operators in the llvm::minidump
namespace.

llvm-svn: 374517

7ff28ce1

[PowerPC] Remove assertion "Shouldn't overwrite a register before it is killed" · 2fbfb04f

Yi-Hong Lyu authored Oct 11, 2019

The assertion is overzealous and fails tests like:

  renamable $x3 = LI8 0
  STD renamable $x3, 16, $x1
  renamable $x3 = LI8 0

Remove the assertion since the killed flag of $x3 is not mandatory.

Differential Revision: https://reviews.llvm.org/D68344

llvm-svn: 374515

2fbfb04f

[NFC] run specific pass instead of whole -O3 pipeline for popcount recognition testcase. · c6c6f717
Chen Zheng authored Oct 11, 2019
```
llvm-svn: 374514
```
c6c6f717

[InstCombine] recognize popcount. · c17c5864

Chen Zheng authored Oct 11, 2019

This patch recognizes the popcount intrinsic according to the algorithm from the website
http://graphics.stanford.edu/~seander/bithacks.html#CountBitsSetParallel
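
For reference, the parallel bit-counting technique for 32-bit values can be sketched in Python as follows. This illustrates the source-level algorithm only, not the InstCombine pattern matcher; the function name `popcount32` is an assumption, and the explicit masking stands in for 32-bit unsigned wraparound.

```python
def popcount32(v: int) -> int:
    """Count set bits in a 32-bit value using the parallel (SWAR) technique."""
    v &= 0xFFFFFFFF
    v = v - ((v >> 1) & 0x55555555)                  # 2-bit partial sums
    v = (v & 0x33333333) + ((v >> 2) & 0x33333333)   # 4-bit partial sums
    v = (v + (v >> 4)) & 0x0F0F0F0F                  # 8-bit partial sums
    # Multiplying adds all four bytes into the top byte; keep 32 bits only.
    return ((v * 0x01010101) & 0xFFFFFFFF) >> 24

for value in (0, 1, 0xFF, 0x80000001, 0xFFFFFFFF):
    # Cross-check against a straightforward count of '1' digits.
    assert popcount32(value) == bin(value).count("1")
print(popcount32(0xFFFFFFFF))  # 32
```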

Differential Revision: https://reviews.llvm.org/D68189

llvm-svn: 374512

c17c5864

[X86] Add a DAG combine to turn v16i16->v16i8 VTRUNCUS+store into a saturating truncating store. · ccc85ac8
Craig Topper authored Oct 11, 2019
```
llvm-svn: 374509
```
ccc85ac8

[X86] Add test case for trunc_packus_v16i32_v16i8_store to min-legal-vector-width.ll · 4b9947e2
Craig Topper authored Oct 11, 2019
```
We aren't folding the vpmovuswb into the store.

llvm-svn: 374507
```
4b9947e2

[CVP] Remove a masking operation if range information implies it's a noop · 2d5820cd

Philip Reames authored Oct 11, 2019

This is really a known bits style transformation, but known bits isn't context sensitive. The particular case which comes up happens to involve a range which allows range based reasoning to eliminate the mask pattern, so handle that case specifically in CVP.

InstCombine likes to generate the mask-by-low-bits pattern when widening an arithmetic expression which includes a zext in the middle.
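
The underlying reasoning can be shown with a small Python sketch, assuming an unsigned value known to lie in `[0, range_max]`. The helper `mask_is_noop` is hypothetical and says nothing about how CVP actually queries its range information.

```python
def mask_is_noop(range_max: int, low_bits: int) -> bool:
    """True if `x & ((1 << low_bits) - 1)` cannot change any x in [0, range_max]."""
    mask = (1 << low_bits) - 1
    return range_max <= mask

# Exhaustively confirm the claim for a small range.
range_max, low_bits = 200, 8
assert mask_is_noop(range_max, low_bits)
assert all((x & 0xFF) == x for x in range(range_max + 1))

print(mask_is_noop(300, 8))  # False: 300 does not fit in 8 bits
```

When the check succeeds, the `and` with the low-bits mask can be replaced by its operand.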

Differential Revision: https://reviews.llvm.org/D68811

llvm-svn: 374506

2d5820cd

[X86] Add more packus/ssat/usat truncate tests from legal vectors to less than 128-bit vectors. · 32097c26
Craig Topper authored Oct 11, 2019
```
Some of these have sub-optimal codegen for avx512 relative to avx2.

llvm-svn: 374505
```
32097c26

Revert 374481 "[tsan,msan] Insert module constructors in a module pass" · d3833298

Nico Weber authored Oct 11, 2019

CodeGen/sanitizer-module-constructor.c fails on mac and windows, see e.g.
http://lab.llvm.org:8011/builders/clang-x64-windows-msvc/builds/11424

llvm-svn: 374503

d3833298

[JITLink] Disable the MachO/AArch64 testcase while investigating bot failures. · e5c61cee

Lang Hames authored Oct 11, 2019

The windows bots are failing due to a memory layout error. Temporarily disabling
while I investigate whether this can be worked around, or whether the test
should be disabled on Windows.

llvm-svn: 374500

e5c61cee

[JITLink] Fix MachO/arm64 GOTPAGEOFF encoding. · b4535938

Lang Hames authored Oct 11, 2019

The original implementation failed to shift the immediate down.
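
As a rough illustration of what shifting the immediate down means, the sketch below patches a 64-bit AArch64 `LDR` (unsigned offset), whose 12-bit immediate is scaled by the 8-byte access size. This is one reading of the commit message, not code from the patch; the function name and the sample address are assumptions.

```python
def encode_ldr_pageoff12(instr: int, target_addr: int) -> int:
    """Patch a 64-bit LDR (unsigned offset) with the page offset of target_addr."""
    page_off = target_addr & 0xFFF   # offset within the 4 KiB page
    if page_off % 8 != 0:
        raise ValueError("target is assumed to be 8-byte aligned")
    imm12 = page_off >> 3            # shift down: the field counts 8-byte units
    return instr | (imm12 << 10)     # imm12 occupies bits 21..10

LDR_X0_X0 = 0xF9400000               # ldr x0, [x0] with a zero immediate
patched = encode_ldr_pageoff12(LDR_X0_X0, 0x100003018)
print(hex(patched))                  # 0xf9400c00, i.e. ldr x0, [x0, #24]
```

Without the shift, the raw byte offset would be written into the field and the load would address eight times too far into the page.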

This should fix some of the bot failures due to r374476.

llvm-svn: 374499

b4535938

[Attributor][FIX] Do not replace musttail calls with constant · 8fa56c49
Johannes Doerfert authored Oct 11, 2019
```
llvm-svn: 374498
```
8fa56c49

AMDGPU: Move SelectFlatOffset back into AMDGPUISelDAGToDAG · 4227c62b
Matt Arsenault authored Oct 11, 2019
```
llvm-svn: 374495
```
4227c62b

[Stats] Add ALWAYS_ENABLED_STATISTIC enabled regardless of LLVM_ENABLE_STATS. · adb203fe

Volodymyr Sapsai authored Oct 11, 2019

The intended usage is to measure relatively expensive operations, so the
cost of the statistic is negligible compared to the cost of the measured
operation and the statistic can stay enabled all the time without impairing
compilation time.

rdar://problem/55715134

Reviewers: dsanders, bogner, rtereshin

Reviewed By: dsanders

Subscribers: hiraditya, jkorous, dexonsmith, ributzka, cfe-commits, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D68252

llvm-svn: 374490

adb203fe

[X86] Improve the AVX512 bailout in combineTruncateWithSat to allow pack... · b560fd6c

Craig Topper authored Oct 11, 2019

[X86] Improve the AVX512 bailout in combineTruncateWithSat to allow pack instructions in more situations.

If we don't have VLX we won't end up selecting a saturating
truncate for 256-bit or smaller vectors so we should just use
the pack lowering.

llvm-svn: 374487

b560fd6c

[X86] Update trunc_packus_v32i32_v32i8 test in min-legal-vector-width.ll to... · 4dc27c69

Craig Topper authored Oct 11, 2019

[X86] Update trunc_packus_v32i32_v32i8 test in min-legal-vector-width.ll to use a load for the large type and add the min-legal-vector-width attribute.

The attribute is needed to avoid zmm registers. Using memory
avoids argument splitting for large vectors.

llvm-svn: 374486

4dc27c69