Commits · 49d00824bbbb8945b92c0f592c6951a881a6242f · Lorenzo Albano / LLVM bpEVL

Mar 29, 2020

[VPlan] Use one VPWidenRecipe per original IR instruction. (NFC). · 49d00824

Florian Hahn authored Mar 29, 2020

This patch changes VPWidenRecipe to only store a single original IR
instruction. This is the first required step towards modeling its
operands as VPValues and also towards breaking it up into a
VPInstruction.

Discussed as part of D74695.

Reviewers: Ayal, gilr, rengolin

Reviewed By: gilr

Differential Revision: https://reviews.llvm.org/D76988

49d00824

Mar 28, 2020

Revert "[FileCollector] Add a method to add a whole directory and it contents." · 190df4a5
Jonas Devlieghere authored Mar 27, 2020
```
This reverts commit 8913769e because the
unit test is failing on the Windows bot.
```
190df4a5

[FileCollector] Add a method to add a whole directory and it contents. · 8913769e

Jonas Devlieghere authored Mar 27, 2020

Extend the FileCollector's API with addDirectory which adds a directory
and its contents to the VFS mapping.

Differential revision: https://reviews.llvm.org/D76671

8913769e

FunctionRef: Strip cv qualifiers in the converting constructor · cbce88dd

David Blaikie authored Mar 27, 2020

Without this some instances of copy construction would use the
converting constructor & lead to the destination function_ref referring
to the source function_ref instead of the underlying functor.

Discovered in feedback from 857bf5da

Thanks to Johannes Doerfert, Arthur O'Dwyer, and Richard Smith for the
discussion and debugging.

cbce88dd
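
To illustrate the function_ref fix above, here is a minimal sketch with a hypothetical, heavily simplified stand-in (`IntFnRef`, fixed to the signature `int(int)`); it is not the LLVM class itself. It assumes the guard on the converting constructor compares the cv- and reference-stripped argument type against the class. A `const` rvalue source is one case where stripping only the reference would, as far as I can tell, still let the converting constructor win overload resolution, so the copy would refer to the source wrapper rather than to the functor.

```cpp
#include <cstdint>
#include <type_traits>

// Hypothetical, simplified stand-in for a function_ref with signature int(int).
class IntFnRef {
  int (*callback)(std::intptr_t callable, int arg) = nullptr;
  std::intptr_t callable = 0;

  template <typename Callable>
  static int callbackFn(std::intptr_t callable, int arg) {
    return (*reinterpret_cast<Callable *>(callable))(arg);
  }

public:
  // Converting constructor. The guard strips both the reference and the cv
  // qualifiers before comparing against IntFnRef, so `IntFnRef &`,
  // `const IntFnRef &` and `const IntFnRef &&` are all rejected here and
  // fall through to the implicit copy constructor.
  template <typename Callable,
            typename = std::enable_if_t<!std::is_same<
                std::remove_cv_t<std::remove_reference_t<Callable>>,
                IntFnRef>::value>>
  IntFnRef(Callable &&c)
      : callback(callbackFn<std::remove_reference_t<Callable>>),
        callable(reinterpret_cast<std::intptr_t>(&c)) {}

  int operator()(int arg) const { return callback(callable, arg); }

  // Exposed only so the example can inspect what the reference points at.
  std::intptr_t target() const { return callable; }
};
```

Copying from `static_cast<const IntFnRef &&>(a)` should then yield a wrapper whose `target()` equals `a.target()`, meaning it still refers to the original functor.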

Mar 27, 2020

[VirtualFileSystem] Support directory entries in the YAMLVFSWriter · 3ef33e69

Jonas Devlieghere authored Mar 27, 2020

The current implementation of the JSONWriter does not support writing
out directory entries. Earlier today I added a unit test to illustrate
the problem. When an entry is added to the YAMLVFSWriter and the path is
a directory, it will incorrectly emit the directory as a file, and any
files inside that directory will not be found by the VFS.

It's possible to partially work around the issue by only adding "leaf
nodes" (files) to the YAMLVFSWriter. However, this doesn't work for
representing empty directories. This is a problem for clients of the VFS
that want to iterate over a directory. The directory not being there is
not the same as the directory being empty.

This is not just a hypothetical problem. The FileCollector for example
does not differentiate between file and directory paths. I temporarily
worked around the issue for LLDB by ignoring directories, but I suspect
this will prove problematic sooner rather than later.

This patch fixes the issue by extending the JSONWriter to support
writing out directory entries. We store whether an entry should be
emitted as a file or directory.

Differential revision: https://reviews.llvm.org/D76670

3ef33e69

[ORC] Introduce JITSymbolFlags::HasMaterializeSideEffectsOnly flag. · cb84e482

Lang Hames authored Mar 25, 2020

This flag can be used to mark a symbol as existing only for the purpose of
enabling materialization. Such a symbol can be looked up to trigger
materialization with the lookup returning only once materialization is
complete. Symbols with this flag will never resolve however (to avoid
permanently polluting the symbol table), and should only be looked up using
the SymbolLookupFlags::WeaklyReferencedSymbol flag. The primary use case for
this flag is initialization symbols.

cb84e482

[ORC] Don't create MaterializingInfo entries unnecessarily. · d38d06e6
Lang Hames authored Mar 26, 2020

d38d06e6

[ARM][MVE] Add DoubleWidthResult flag · 0e6aa083

Sam Parker authored Mar 27, 2020

Add a flag for those instructions which read from the top/bottom
halves of their inputs and produce a vector of results with double
width elements.

Differential Revision: https://reviews.llvm.org/D76762

0e6aa083

[llvm][TextAPI/MachO] silence clang-tidy warnings, NFC · d26e0bcf
Cyndy Ishida authored Mar 26, 2020
```
* applies only to tests
```
d26e0bcf

Mar 26, 2020

[AMDGPU] Fix PC register mapping in wave32 mode · bd12ecb8

Scott Linder authored Mar 24, 2020

Summary:
The PC_32 DWARF register is for a 32-bit process address space which we
don't implement in AMDGCN; another way of putting this is that the size
of the PC register is not a function of the wavefront size. If we ever
implement a 32-bit process address space we will need to add two more
DwarfFlavours i.e. we will need to represent the product of (wave32,
wave64) x (64-bit address space, 32-bit address space).

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D76732

bd12ecb8

[GlobalISel] add helper function to create arbitrary libcalls · 9fedb690

Dominik Montada authored Mar 26, 2020

Summary:
The existing helper function can only create a libcall to functions available in
RTLIB. Add a helper function that can create a libcall to a given function name
using the provided calling convention.

Reviewers: aditya_nandakumar, t.p.northover, rovka, arsenm, dsanders

Reviewed By: arsenm

Subscribers: wdng, hiraditya, volkan, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D76845

9fedb690

[PATCH] [ARM] ARMv8.6-a command-line + BFloat16 Asm Support · 71ae267d

Ties Stuij authored Mar 26, 2020

Summary:
This patch introduces command-line support for the Armv8.6-a architecture and assembly support for BFloat16. Details can be found at
https://community.arm.com/developer/ip-products/processors/b/processors-ip-blog/posts/arm-architecture-developments-armv8-6-a

in addition to the GCC patch for the 8.6-a CLI:
https://gcc.gnu.org/legacy-ml/gcc-patches/2019-11/msg02647.html

In detail, this patch adds:

- march options for armv8.6-a
- BFloat16 assembly

This is part of a patch series, starting with command-line and Bfloat16
assembly support. The subsequent patches will upstream intrinsics
support for BFloat16, followed by Matrix Multiplication and the
remaining Virtualization features of the armv8.6-a architecture.

Based on work by:
- labrinea
- MarkMurrayARM
- Luke Cheeseman
- Javed Asbar
- Mikhail Maltsev
- Luke Geeson

Reviewers: SjoerdMeijer, craig.topper, rjmccall, jfb, LukeGeeson

Reviewed By: SjoerdMeijer

Subscribers: stuij, kristof.beyls, hiraditya, dexonsmith, danielkiss, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D76062

71ae267d

Mar 25, 2020

[CFG/BasicBlock] Rename succ_const to const_succ. [NFC] · 3abcbf99

Alina Sbirlea authored Mar 10, 2020

Summary:
Rename `succ_const_iterator` to `const_succ_iterator` and
`succ_const_range` to `const_succ_range` for consistency with the
predecessor iterators, and the corresponding iterators in
MachineBasicBlock.

Reviewers: nicholas, dblaikie, nlewycky

Subscribers: hiraditya, bmahjour, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D75952

3abcbf99

[Clang] Fix clang-tidy errors. · df48e392
Alexander Belyaev authored Mar 25, 2020

df48e392

[NFC] Rename function to match Coding Convention and fix typo in KnowledgeRetention · d72c586a
Tyker authored Mar 25, 2020

d72c586a

[ARM][MVE] Add HorizontalReduction flag · e8725020

Sam Parker authored Mar 25, 2020

Add a target flag for instructions that reduce into one, or more,
scalar reg(s), including variants of:
- VADDV
- VABAV
- VMINV/VMAXV
- VMLADAV

Differential Revision: https://reviews.llvm.org/D76683

e8725020

GlobalISel: Introduce bitcast legalize action · 39c55cef

Matt Arsenault authored Feb 13, 2020

For some operations, the type is unimportant and only the number of
bits matters. For example I don't want to treat <4 x s8> as a legal
type, but I also don't want to decompose loads of this into smaller
pieces to get legal register types.

On AMDGPU in SelectionDAG, we legalize a number of operations (most
notably load and store) by coercing all types to vectors of i32. For
GlobalISel, I'm trying very hard to avoid doing this for every type,
but I don't think this strategy can be completely avoided. I'm trying
to avoid bitcasts for any legitimately legal type we can operate on,
since the intervening bitcasts have proven to be a hassle.

For loads, I think I can get away without ever casting the result
type, and handling any arbitrary bitwidth during selection (I will
eventually want new tablegen support to help with this, rather than
having to add every possible type as legal). The unmerge required to
do anything with the value should expand to the expected shifts. This
is trickier for stores, since it would now require handling a wide
array of truncates during selection which I don't want.

A potentially interesting future case is vector indexing, where
sub-dword types should be indexed in s32 pieces.

39c55cef

Mar 24, 2020

[Attributor] Use knowledge retained in llvm.assume (operand bundles) · 5699d08b

Johannes Doerfert authored Feb 20, 2020

This patch integrates operand bundle llvm.assumes [0] with the
Attributor. Most IRAttributes will now look at uses of the associated
value and if there are llvm.assume operand bundle uses with the right
tag we will check if they are in the must-be-executed-context (around
the context instruction). Droppable users, of which llvm::assume is
currently the only one, are now handled specially in some places as well.

[0] http://lists.llvm.org/pipermail/llvm-dev/2019-December/137632.html

Reviewed By: uenoku

Differential Revision: https://reviews.llvm.org/D74888

5699d08b

[ConstantRange] Add initial support for binaryXor. · 7caba339

Florian Hahn authored Mar 24, 2020

The initial implementation just delegates to APInt's implementation of
XOR for single element ranges and conservatively returns the full set
otherwise.

Reviewers: nikic, spatel, lebedev.ri

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D76453

7caba339
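
The strategy described in the binaryXor commit above can be sketched with a toy range type. `ToyRange` and its members are hypothetical and stand in for ConstantRange only loosely (an inclusive unsigned interval rather than LLVM's wrapping ranges). Only the control flow mirrors the description: exact XOR for two single-element ranges, the full set otherwise.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical toy range: every value in [lo, hi], inclusive.
struct ToyRange {
  std::uint32_t lo;
  std::uint32_t hi;

  bool isSingleElement() const { return lo == hi; }

  static ToyRange full() {
    return ToyRange{0, std::numeric_limits<std::uint32_t>::max()};
  }
};

// Conservative XOR of two ranges.
ToyRange binaryXor(const ToyRange &a, const ToyRange &b) {
  // Both operands are known constants, so the result is known exactly.
  if (a.isSingleElement() && b.isSingleElement()) {
    std::uint32_t v = a.lo ^ b.lo;
    return ToyRange{v, v};
  }
  // Otherwise give up and report that any value is possible.
  return ToyRange::full();
}
```

Returning the full set is always sound, just imprecise, which is why it works as a starting point that later refinements can tighten.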

[ARM][MVE] Add target flag for narrowing insts · 6f86e6bf

Sam Parker authored Mar 24, 2020

Add a flag, 'RetainsPreviousHalfElement', for operations that operate
on top/bottom halves of their input and only write to half of their
destination, leaving the other half to retain its previous value.

Differential Revision: https://reviews.llvm.org/D76608

6f86e6bf

Add an algorithm for performing "optimal" layout of a struct. · 49e5a97e

John McCall authored Mar 20, 2020

The algorithm supports both assigning a fixed offset to a field prior to
layout and allowing fields to have sizes that aren't multiples of their
required alignments. This means that the well-known algorithm of sorting
by decreasing alignment isn't always good enough. Still, we start with
that, and only if that leaves padding around do we fall back on a greedy
padding-minimizing algorithm.

There is no known efficient algorithm for producing a guaranteed-minimal
layout in all cases. In fact, allowing arbitrary fixed-offset fields means
there's a straightforward reduction from bin-packing, making this NP-hard.
But as usual with such problems, we can still efficiently produce adequate
solutions to the cases that matter most to us.

I intend to use this in coroutine frame layout, where the retcon lowerings
very badly want to minimize total space usage, and where the switch lowering
can indeed produce a header with interior padding if the promise field is
highly-aligned. But it may be useful in a much wider variety of situations.

49e5a97e
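
As a rough illustration of the "sort by decreasing alignment" baseline mentioned in the struct layout commit above, the sketch below lays out fields in a given order and after sorting. The `Field` type and function names are hypothetical. This is not the committed algorithm: it has no fixed-offset fields and no greedy fallback.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical field description; `offset` is filled in by the layout.
struct Field {
  std::uint64_t size;
  std::uint64_t align;
  std::uint64_t offset;
};

// Round `value` up to the next multiple of `align`.
std::uint64_t alignTo(std::uint64_t value, std::uint64_t align) {
  assert(align > 0 && "alignment must be positive");
  return (value + align - 1) / align * align;
}

// Assign offsets in the given order; return the end offset of the last field.
std::uint64_t layoutInOrder(std::vector<Field> &fields) {
  std::uint64_t end = 0;
  for (Field &f : fields) {
    f.offset = alignTo(end, f.align);
    end = f.offset + f.size;
  }
  return end;
}

// Sort by decreasing alignment (stable, so ties keep their order), then lay out.
std::uint64_t layoutByDecreasingAlignment(std::vector<Field> &fields) {
  std::stable_sort(fields.begin(), fields.end(),
                   [](const Field &a, const Field &b) {
                     return a.align > b.align;
                   });
  return layoutInOrder(fields);
}
```

For fields with (size, align) of (1,1), (8,8), (2,2), (4,4), the declared order ends at offset 24 and the sorted order at 15. With (5,4), (4,4) and three (1,1) fields, sorting alone ends at 15 and leaves a 3-byte gap after the 5-byte field, although 12 is achievable by moving the 1-byte fields into that gap. This is the kind of case where a size is not a multiple of its alignment and the simple sort falls short.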

[VirtualFileSystem] Add unit test for vfs::YAMLVFSWriter · 42df3e29

Jonas Devlieghere authored Mar 23, 2020

Add a unit test for vfs::YAMLVFSWriter.

This patch exposes an issue in the writer: when we call addFileMapping
with a directory, the VFS writer will emit it as a regular file, causing
any of the nested files or directories to not be found.

42df3e29

Mar 23, 2020

Allow replacing intrinsic operands with variables · 43d98a0e

Matt Arsenault authored Feb 13, 2019

Since intrinsics can now specify when an argument is required to be
constant, it is now OK to replace arguments with variables if they
aren't required to be constant. This means intrinsics must now be
accurately marked with immarg.

43d98a0e

Don't export symbols from clang/opt/llc if plugins are disabled. · 896335bf

Eli Friedman authored Mar 20, 2020

The only reason we export symbols from these tools is to support
plugins; if we don't have plugins, exporting symbols just bloats the
executable and makes LTO less effective.

See review of D75879 for the discussion that led to this patch.

Differential Revision: https://reviews.llvm.org/D76527

896335bf

AMDGPU/GlobalISel: Implement computeNumSignBitsForTargetInstr · 2ad5fc1d
Matt Arsenault authored Mar 22, 2020

2ad5fc1d

GlobalISel: Prepare to allow other target unit tests · 58f843a5

Matt Arsenault authored Mar 22, 2020

Currently all GlobalISel unittests use a hardcoded AArch64 target
machine. Factor this out so I can write some AMDGPU-specific known
bits unittests.

58f843a5

[CMake] Fix AMDGPUTests -DBUILD_SHARED_LIBS=on builds and trim dependencies of... · 1b9cd51d

Fangrui Song authored Mar 23, 2020

[CMake] Fix AMDGPUTests -DBUILD_SHARED_LIBS=on builds and trim dependencies of AMDGPUTests and AMDDwarfTests after D76357/G24698e526f619271705fe72bcaa928be9bc82484

FAILED: unittests/Target/AMDGPU/AMDGPUTests
...
ld.lld: error: undefined symbol: llvm::MCRegisterInfo::getLLVMRegNum(unsigned int, bool) const
>>> referenced by DwarfRegMappings.cpp:60 (/usr/local/google/home/maskray/llvm/llvm/unittests/Target/AMDGPU/DwarfRegMappings.cpp:60)
>>>               unittests/Target/AMDGPU/CMakeFiles/AMDGPUTests.dir/DwarfRegMappings.cpp.o:(AMDGPUDwarfRegMappingTests_TestWave64DwarfRegMapping_Test::TestBody())
>>> referenced by DwarfRegMappings.cpp:82 (/usr/local/google/home/maskray/llvm/llvm/unittests/Target/AMDGPU/DwarfRegMappings.cpp:82)
>>>               unittests/Target/AMDGPU/CMakeFiles/AMDGPUTests.dir/DwarfRegMappings.cpp.o:(AMDGPUDwarfRegMappingTests_TestWave32DwarfRegMapping_Test::TestBody())

A -DBUILD_SHARED_LIBS=off build is good because AMDGPUCodeGen pulls in MC.
A -DBUILD_SHARED_LIBS=on build requires all direct dependencies (MC) to be listed because llvm/cmake/modules/HandleLLVMOptions.cmake uses -Wl,-z,defs

1b9cd51d

[Support] Silence warning in Path unittests when compiling with clang-cl · c1f8595f
Alexandre Ganea authored Mar 23, 2020
```
warning: comparison of integers of different signs: 'const unsigned long long' and 'const int' [-Wsign-compare]
```
c1f8595f

Add AMDGPU MC unittests only when AMDGPU target is being built · 0ca19efe
Ram Nalamothu authored Mar 23, 2020
```
Fixes the build failures introduced by 24698e52
```
0ca19efe

[Analysis] simplify code for scaleShuffleMask · ebf83c36

Sanjay Patel authored Mar 23, 2020

This is NFC-ish. The results should be identical, but perf is hopefully
better with the fast-path for no scaling. Added a unit test for that.

The code is adapted from what used to be the DAGCombiner equivalent
function before D76508 (rG0eeee83d7513).

ebf83c36

Implement wave32 DWARF register mapping · 24698e52

Ram Nalamothu authored Mar 23, 2020

Implement the DWARF register mapping described in llvm/docs/AMDGPUUsage.rst.

This enables generating appropriate DWARF register numbers for wave64 and
wave32 modes.

24698e52

[VectorUtils] move x86's scaleShuffleMask to generic VectorUtils · 0eeee83d

Sanjay Patel authored Mar 23, 2020

We have some long-standing missing shuffle optimizations that could
use this transform via VectorCombine now:
https://bugs.llvm.org/show_bug.cgi?id=35454
(and we still don't get that case in the backend either)

This function is apparently templated because there's existing code
in IR that treats mask values as unsigned and backend code that
treats mask values as signed.

The mask values are not endian-dependent (as shown by the existing
bitcast transform from DAGCombiner).

Differential Revision: https://reviews.llvm.org/D76508

0eeee83d
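
For readers unfamiliar with the transform in the scaleShuffleMask commit above, here is a minimal sketch of what scaling a shuffle mask means: each mask element is split into `scale` narrower elements. The function name, the use of `std::vector<int>`, and the convention that negative values are sentinels passed through unchanged are assumptions for illustration. The LLVM function is templated on the element type and has a different signature.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: widen a shuffle mask by an integer factor.
// Mask element M selects wide element M, which covers the narrow elements
// M*scale .. M*scale + scale - 1. Negative elements are treated as
// sentinels (for example "undef") and are repeated unchanged.
std::vector<int> scaleMask(int scale, const std::vector<int> &mask) {
  assert(scale > 0 && "scale must be positive");
  std::vector<int> scaled;
  scaled.reserve(mask.size() * static_cast<std::size_t>(scale));
  for (int elt : mask)
    for (int i = 0; i < scale; ++i)
      scaled.push_back(elt < 0 ? elt : scale * elt + i);
  return scaled;
}
```

With `scale == 2`, the mask `{1, 0, -1}` becomes `{2, 3, 0, 1, -1, -1}`.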

[GlobalISel] support widen unmerge if WideTy > SrcTy · ccf49b9e

Dominik Montada authored Mar 20, 2020

Summary:
Widening G_UNMERGE_VALUES to a type which is larger than the
original source type is the same as widening it to the same
type as the source type: in both cases, G_UNMERGE_VALUES has
to be replaced with bit arithmetic. Although the arithmetic
itself is independent of whether the source type is smaller than
or equal to the widen type, widening the source type to the
widen type should result in fewer artifacts being emitted,
since this is the type that the user explicitly requested.

Reviewers: arsenm, dsanders, aemerson, aditya_nandakumar

Reviewed By: arsenm, dsanders

Subscribers: jvesely, wdng, nhaehnle, rovka, hiraditya, volkan, kerbowa, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D76494

ccf49b9e

Mar 21, 2020

Revert "[ADT] Implement the Waymarking as an independent utility" · 34fd007a
Ehud Katz authored Mar 21, 2020
```
This reverts commit 73cf8abb.
```
34fd007a

[ADT] Implement the Waymarking as an independent utility · 73cf8abb

Ehud Katz authored Mar 21, 2020

This is the Waymarking algorithm implemented as an independent utility.
The utility is operating on a range of sequential elements.
First we "tag" the elements, by calling `fillWaymarks`.
Then we can "follow" the tags from every element inside the tagged
range, and reach the "head" (the first element), by calling
`followWaymarks`.

Differential Revision: https://reviews.llvm.org/D74415

73cf8abb

Mar 20, 2020

unittest: Work around build failure on MSVC builders · 7ec24448

Vedant Kumar authored Mar 20, 2020

MSVC insists on using the deleted move constructor instead of the copy
constructor:

http://lab.llvm.org:8011/builders/lld-x86_64-win7/builds/41203

C:\ps4-buildslave2\lld-x86_64-win7\llvm-project\llvm\unittests\ADT\CoalescingBitVectorTest.cpp(193):
error C2280: 'llvm::CoalescingBitVector<unsigned
int,16>::CoalescingBitVector(llvm::CoalescingBitVector<unsigned int,16>
&&)': attempting to reference a deleted function

7ec24448

[ADT] CoalescingBitVector: Add advanceToLowerBound iterator operation · a3fd1a1c

Vedant Kumar authored Mar 19, 2020

advanceToLowerBound moves an iterator to the first bit set at, or after,
the given index. This can be faster than doing IntervalMap::find.

rdar://60046261

Differential Revision: https://reviews.llvm.org/D76466

a3fd1a1c
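
The lower-bound query described in the advanceToLowerBound commit above can be modeled on a plain sorted list of disjoint, inclusive intervals of set bits. This is a standalone sketch, not the CoalescingBitVector or IntervalMap implementation: the `Interval` alias and `firstSetAtOrAfter` are hypothetical names, and the real operation advances an existing iterator rather than searching from scratch.

```cpp
#include <algorithm>
#include <utility>
#include <vector>

// Hypothetical model: each pair is an inclusive run [first, second] of set
// bits; runs are sorted and do not overlap.
using Interval = std::pair<unsigned, unsigned>;

// Find the first set bit at or after `index`. Returns false if there is none;
// otherwise stores the bit position in `result` and returns true.
bool firstSetAtOrAfter(const std::vector<Interval> &intervals, unsigned index,
                       unsigned &result) {
  // Binary search for the first run that ends at or after `index`.
  auto it = std::lower_bound(
      intervals.begin(), intervals.end(), index,
      [](const Interval &iv, unsigned idx) { return iv.second < idx; });
  if (it == intervals.end())
    return false;
  // Either `index` falls inside the run, or the run starts after it.
  result = std::max(it->first, index);
  return true;
}
```

With runs `[2, 4]` and `[10, 12]`, a query for index 3 yields 3, a query for 5 yields 10, and a query for 13 finds nothing.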

[ADT] CoalescingBitVector: Avoid initial heap allocation, NFC · 4716ebb8

Vedant Kumar authored Mar 19, 2020

Avoid making a heap allocation when constructing a CoalescingBitVector.

This reduces time spent in LiveDebugValues when compiling sqlite3 by
700ms (0.5% of the total User Time).

rdar://60046261

Differential Revision: https://reviews.llvm.org/D76465

4716ebb8

Cleanup the plumbing for DILineInfoSpecifier. [NFC - Try 2] · 5de4ba17
Sterling Augustine authored Mar 19, 2020

5de4ba17

Revert "Cleanup the plumbing for DILineInfoSpecifier. [NFC]" · 6343526d
Sterling Augustine authored Mar 19, 2020
```
This broke lldb. Will fix and resubmit.

This reverts commit 98ff6eb6.
```
6343526d