Commits · 5db37f3bca3d404b0d7fcbe1dc764ee67665e6c2 · Lorenzo Albano / LLVM bpEVL

Mar 26, 2020

Make PS4 use -fno-use-init-array only as the ABI does not support .init_array. · 5db37f3b
Douglas Yung authored Mar 26, 2020
```
Reviewed by Paul Robinson
```
5db37f3b

[Hexagon] Add support for Linux/Musl ABI (part 2) · b0da0949

Sid Manning authored Mar 03, 2020

A continuation of https://reviews.llvm.org/D72701.  This
adds support needed in clang.

Differential Revision: https://reviews.llvm.org/D75638

b0da0949

[AMDGPU] Propagate amdgpu-waves-per-eu to callees · 4c4b7184
Stanislav Mekhanoshin authored Mar 25, 2020
```
Differential Revision: https://reviews.llvm.org/D76868
```
4c4b7184

[OPENMP50]Fix the checks for the nesting of scan directives. · 2a43a161

Alexey Bataev authored Mar 26, 2020

Fixed the check for orphaned scan directives and improved the checks for
parallel for and parallel for simd directives.

2a43a161

[lld][Wasm] Wasm-ld emits invalid .debug_ranges entries for non-live symbols · aff75e1a

Paolo Severini authored Mar 26, 2020

When the debug info contains a relocation against a dead symbol, wasm-ld
may emit spurious range-list terminator entries (entries with Start==0
and End==0). This change fixes this by emitting the WasmRelocation
Addend as End value for a non-live symbol.
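
A minimal sketch of why a dead symbol could cut a range list short, and how using the addend avoids it. This is not wasm-ld's code: `resolve`, `buildRangeList`, and `entriesBeforeTerminator` are hypothetical names, and the sketch assumes the Start relocation carries addend 0 while the End relocation carries the function size as its addend.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

using RangeEntry = std::pair<uint32_t, uint32_t>; // (Start, End)

// Illustrative relocation resolution. A live symbol resolves to its address
// plus the addend. For a dead symbol, the old behaviour is modelled as 0 and
// the new behaviour as the addend itself.
uint32_t resolve(bool live, uint32_t address, uint32_t addend,
                 bool addendForDead) {
  if (live)
    return address + addend;
  return addendForDead ? addend : 0;
}

// Build a range list for one dead function (assumed size 16) followed by one
// live function (assumed address 0x100, size 8), then the (0, 0) terminator.
std::vector<RangeEntry> buildRangeList(bool addendForDead) {
  std::vector<RangeEntry> list;
  list.emplace_back(resolve(false, 0, 0, addendForDead),
                    resolve(false, 0, 16, addendForDead));
  list.emplace_back(resolve(true, 0x100, 0, addendForDead),
                    resolve(true, 0x100, 8, addendForDead));
  list.emplace_back(0, 0);
  return list;
}

// Count the entries a consumer sees before it hits a (0, 0) terminator.
std::size_t entriesBeforeTerminator(const std::vector<RangeEntry> &list) {
  assert(!list.empty() && "a range list always ends with a terminator");
  std::size_t n = 0;
  for (const RangeEntry &e : list) {
    if (e.first == 0 && e.second == 0)
      break;
    ++n;
  }
  return n;
}
```

In this model the old behaviour yields (0, 0) for the dead function, so a consumer stops before reaching the live range. With the addend as End value the entry becomes (0, 16) and both entries stay visible.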

Reviewed by: sbc100, dblaikie

Differential Revision: https://reviews.llvm.org/D74781

aff75e1a

[gn build] Port 9f7d4150 · 19628643
LLVM GN Syncbot authored Mar 26, 2020

19628643

[X86] Move combineLoopMAddPattern and combineLoopSADPattern to an IR pass before SelectionDAG. · 9f7d4150

Craig Topper authored Mar 26, 2020

These transforms rely on a vector reduction flag on the SDNode
set by SelectionDAGBuilder. This flag exists because SelectionDAG
can't see across basic blocks so SelectionDAGBuilder is looking
across and saving the info. X86 is the only target that uses this
flag currently. By removing the X86 code we can remove the flag
and the SelectionDAGBuilder code.

This patch adds a dedicated IR pass for X86 that looks across the
blocks and transforms the IR into a form that the X86 SelectionDAG
can finish.

An advantage of this new approach is that we can enhance it to
shrink the phi nodes and final reduction tree based on the zeroes
that we need to concatenate to bring the partially reduced
reduction back up to the original width.

Differential Revision: https://reviews.llvm.org/D76649

9f7d4150

[clang] Allow -DDEFAULT_SYSROOT to be a relative path · 0731372e

Sam Clegg authored Mar 19, 2020

In this case we interpret the path as relative to the clang driver binary.

This allows SDKs to be built that include clang along with a custom
sysroot without requiring users to specify --sysroot to point to the
directory where they installed the SDK.
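
A minimal sketch of the resolution rule described above, assuming "relative to the driver binary" means relative to the directory that contains it. `resolveDefaultSysroot` is a hypothetical helper written with `std::filesystem` (C++17), not the code in the clang driver.

```cpp
#include <cassert>
#include <filesystem>
#include <string>

namespace fs = std::filesystem;

// Hypothetical helper: absolute (or empty) sysroots are returned unchanged;
// relative ones are joined onto the directory holding the driver binary.
fs::path resolveDefaultSysroot(const fs::path &driverBinary,
                               const std::string &defaultSysroot) {
  assert(!driverBinary.empty() && "driver path must be known");
  fs::path sysroot(defaultSysroot);
  if (sysroot.empty() || sysroot.is_absolute())
    return sysroot;
  // lexically_normal() folds ".." components without touching the disk.
  return (driverBinary.parent_path() / sysroot).lexically_normal();
}
```

For example, on a POSIX system an SDK that ships `bin/clang` and is configured with `../share/sysroot` would resolve to `<install dir>/share/sysroot` wherever it is unpacked.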

See https://github.com/WebAssembly/wasi-sdk/issues/58

Differential Revision: https://reviews.llvm.org/D76653

0731372e

[X86] Prefer PACKUS(AND(),AND()) to SHUFFLE(PSHUFB(),PSHUFB()) on all targets · ad36491e
Simon Pilgrim authored Mar 26, 2020
```
Extends rG9d1721ce3926 to support AVX2+ targets.
```
ad36491e

[AMDGPU] Rename overloaded getMaxWavesPerEU to getWavesPerEUForWorkGroup · 0fe096c4

Jay Foad authored Mar 26, 2020

Summary: I think Max in the name was misleading. NFC.

Reviewers: arsenm, rampitec

Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, kerbowa, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D76860

0fe096c4

[AMDGPU] Remove getMaxWavesPerCU in favour of getWavesPerWorkGroup. · bb9c4fd7

Jay Foad authored Mar 26, 2020

Summary:
These methods were identical. I chose to remove getMaxWavesPerCU because
I think Max in the name was misleading. NFC.

Reviewers: arsenm, rampitec

Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, kerbowa, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D76859

bb9c4fd7

[WebAssembly] Clear frame base vreg in explicit-locals when stack pointer is dead · e110897e

Derek Schuff authored Mar 25, 2020

Having an alloca in a function causes the stack pointer to be generated in the
prolog, but if it's unused other than for debug info, explicit-locals will drop
it and not allocate a local. In this case we need to reset the FrameBaseVreg.

Differential Revision: https://reviews.llvm.org/D76784

e110897e

[X86] lowerV16I8Shuffle - create v8i16 mask for PACKUS(AND(),AND()) patterns. · 39a52a19

Simon Pilgrim authored Mar 26, 2020

We can improve computeKnownBits results by avoiding excess bitcasts.

For this pattern we were doing:

  (v16i8 PACKUS(v8i16 BITCAST(v16i8 AND(V1, MASK)), v8i16 BITCAST(v16i8 AND(V2, MASK))))

By performing the MASK/AND with a v8i16 type and bitcasting V1/V2 directly, we help computeKnownBits see that the mask clears the upper bits, which allows shuffle combining to peek through later on.

This will be necessary to extend rG9d1721ce3926 to AVX2+ targets in a future patch.
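
A scalar sketch of why the upper-bit information matters, assuming a 0x00FF mask on every v8i16 lane. It models PACKUSWB-style unsigned saturation in plain C++; `andMask`, `packus`, and `truncatingPack` are illustrative names, not LLVM functions.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

using V8I16 = std::array<uint16_t, 8>;
using V16I8 = std::array<uint8_t, 16>;

// Lane-wise AND of a v8i16 value with a splatted 16-bit mask.
V8I16 andMask(V8I16 v, uint16_t mask) {
  for (uint16_t &lane : v)
    lane &= mask;
  return v;
}

// Saturate one lane, read as a signed 16-bit value, to the range [0, 255].
uint8_t saturateToU8(uint16_t lane) {
  int16_t s = static_cast<int16_t>(lane);
  if (s < 0)
    return 0;
  if (s > 255)
    return 255;
  return static_cast<uint8_t>(s);
}

// Scalar model of PACKUS: lo fills bytes 0..7, hi fills bytes 8..15.
V16I8 packus(const V8I16 &lo, const V8I16 &hi) {
  V16I8 out{};
  for (std::size_t i = 0; i < 8; ++i) {
    out[i] = saturateToU8(lo[i]);
    out[i + 8] = saturateToU8(hi[i]);
  }
  return out;
}

// PACKUS(AND(V1, MASK), AND(V2, MASK)) with MASK = 0x00FF per lane.
V16I8 truncatingPack(const V8I16 &v1, const V8I16 &v2) {
  V8I16 a = andMask(v1, 0x00FF);
  V8I16 b = andMask(v2, 0x00FF);
  // The mask clears the upper byte, so saturation can never trigger.
  for (std::size_t i = 0; i < 8; ++i)
    assert(a[i] <= 0xFF && b[i] <= 0xFF);
  return packus(a, b);
}
```

Once the upper byte of each lane is known to be zero, the pack behaves as a plain truncation to the low byte, which is the fact a known-bits analysis needs to see.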

39a52a19

Revert "[OPENMP50]Add basic support for inscan reduction modifier." · f9e71f4d
Alexey Bataev authored Mar 26, 2020
```
This reverts commit 8099e0fe to fix the
problems with the Windows-based buildbots.
```
f9e71f4d

[sanitizer][RISCV] Implement SignalContext::GetWriteFlag for RISC-V · ad1466f8

Luís Marques authored Mar 26, 2020

This patch follows the approach also used for MIPS, where we decode the
offending instruction to determine if the fault was caused by a read or
write operation, as that seems to be the only relevant information we have
in the signal context structure to determine that.
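
A minimal sketch of the decoding idea, assuming only standard 32-bit RISC-V encodings are considered. It classifies an instruction word by its major opcode; `classifyAccess` and `AccessKind` are hypothetical names, and compressed (16-bit) and atomic instructions are deliberately left as unknown here, so this is not the sanitizer runtime's implementation.

```cpp
#include <cassert>
#include <cstdint>

enum class AccessKind { Unknown, Read, Write };

// Extract the 7-bit major opcode of a standard 32-bit instruction word.
uint32_t majorOpcode(uint32_t insn) {
  assert((insn & 0x3) == 0x3 && "only 32-bit encodings are modelled");
  return insn & 0x7F;
}

// Classify the memory access performed by the faulting instruction.
AccessKind classifyAccess(uint32_t insn) {
  // Standard 32-bit encodings have both low bits set; anything else
  // (for example a compressed instruction) is not handled in this sketch.
  if ((insn & 0x3) != 0x3)
    return AccessKind::Unknown;
  switch (majorOpcode(insn)) {
  case 0x03: // LOAD
  case 0x07: // LOAD-FP
    return AccessKind::Read;
  case 0x23: // STORE
  case 0x27: // STORE-FP
    return AccessKind::Write;
  default: // includes AMO (0x2F), which both reads and writes
    return AccessKind::Unknown;
  }
}
```

For instance, `0x0005A503` (`lw a0, 0(a1)`) classifies as a read and `0x00A5A023` (`sw a0, 0(a1)`) as a write.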

Differential Revision: https://reviews.llvm.org/D75168

ad1466f8

[AIX] discard the label in the csect of function description and use qualname for linkage · fdfe411e

diggerlin authored Mar 26, 2020

SUMMARY:

For a source file "test.c":

void foo() {};

llc will generate assembly code as (assembly path)
     .globl  foo
     .globl  .foo
     .csect foo[DS]
foo:

        .long   .foo
        .long   TOC[TC0]
        .long   0

   and symbol table as (xcoff object file)
   [4]     m   0x00000004     .data     1  unamex                    foo
   [5]     a4  0x0000000c       0    0     SD       DS    0    0
   [6]     m   0x00000004     .data     1  extern                    foo
   [7]     a4  0x00000004       0    0     LD       DS    0    0

   After first patch, the assembly will be as

        .globl  foo[DS]                 # -- Begin function foo
        .globl  .foo
        .align  2
        .csect foo[DS]
        .long   .foo
        .long   TOC[TC0]
        .long   0

    and the symbol table will be
   [6]     m   0x00000004     .data     1  extern                    foo
   [7]     a4  0x00000004       0    0     DS      DS    0    0
Change the code for the assembly path and the XCOFF object file path for llc.

Reviewers: Jason Liu
Subscribers: wuzish, nemanjai, hiraditya

Differential Revision: https://reviews.llvm.org/D76162

fdfe411e

[libomptarget] Add missing elf_end call in elf_common.c · 856c9954

Jon Chesterfield authored Mar 26, 2020

Summary:
[libomptarget] Add missing elf_end call in elf_common.c
Noticed when reviewing D76843.

Reviewers: simoll, jdoerfert, efocht, AndreyChurbanov, grokos, manorom

Reviewed By: grokos

Subscribers: openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D76874

856c9954

[OPENMP50]Add basic support for inscan reduction modifier. · 8099e0fe
Alexey Bataev authored Mar 25, 2020
```
Added basic support (parsing/sema checks) for the inscan modifier in the
reduction clauses.
```
8099e0fe

[cuda][hip] Add CUDA builtin surface/texture reference support. · 6a9ad5f3

Michael Liao authored Mar 07, 2020

Summary:
- Even though the bindless surface/texture interfaces are promoted,
there is still code using surface/texture references. For example,
[PR#26400](https://bugs.llvm.org/show_bug.cgi?id=26400) reports the
compilation issue for code using `tex2D` with texture references. For
better compatibility, this patch proposes the support of
surface/texture references.
- Due to the absent documentation and magic headers, it's believed that
`nvcc` does use builtins for texture support. From the limited NVVM
documentation[^nvvm] and NVPTX backend texture/surface related
tests[^test], it's believed that surface/texture references are
supported by replacing their reference types, which are annotated with
`device_builtin_surface_type`/`device_builtin_texture_type`, with the
corresponding handle-like object types, `cudaSurfaceObject_t` or
`cudaTextureObject_t`, in the device-side compilation. On the host
side, those global handle variables are registered and will be
established and updated later when corresponding binding/unbinding
APIs are called[^bind]. Surface/texture references are much like
device global variables but represented in different types on the host
and device sides.
- In this patch, the following changes are proposed to support that
behavior:
+ Refine `device_builtin_surface_type` and
`device_builtin_texture_type` attributes to be applied on `Type`
decl only to check whether a variable is of the surface/texture
reference type.
+ Add hooks in code generation to replace those reference types with
the corresponding object types as well as all accesses to them. In
particular, `nvvm.texsurf.handle.internal` should be used to load
object handles from global reference variables[^texsurf] as well as
metadata annotations.
+ Generate host-side registration with proper template argument
parsing.

---
[^nvvm]: https://docs.nvidia.com/cuda/pdf/NVVM_IR_Specification.pdf
[^test]: https://raw.githubusercontent.com/llvm/llvm-project/master/llvm/test/CodeGen/NVPTX/tex-read-cuda.ll
[^bind]: See section 3.2.11.1.2 "Texture reference API" in [CUDA C Programming Guide](https://docs.nvidia.com/cuda/pdf/CUDA_C_Programming_Guide.pdf).
[^texsurf]: According to NVVM IR, `nvvm.texsurf.handle` should be used. But, the current backend doesn't have that supported. We may revise that later.

Reviewers: tra, rjmccall, yaxunl, a.sidorin

Subscribers: cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D76365

6a9ad5f3

[AMDGPU] Fix PC register mapping in wave32 mode · bd12ecb8

Scott Linder authored Mar 24, 2020

Summary:
The PC_32 DWARF register is for a 32-bit process address space which we
don't implement in AMDGCN; another way of putting this is that the size
of the PC register is not a function of the wavefront size. If we ever
implement a 32-bit process address space we will need to add two more
DwarfFlavours i.e. we will need to represent the product of (wave32,
wave64) x (64-bit address space, 32-bit address space).

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D76732

bd12ecb8

Roll otherwise unused subexpressions into an assertion · 9002db05
David Blaikie authored Mar 26, 2020

9002db05

[InstCombine] add shuffle-with-bitcast-operand tests; NFC · 5237262f
Sanjay Patel authored Mar 25, 2020

5237262f

Correctly handle using foo = std::foo inside namespaces. · 6c6fba88

Sterling Augustine authored Mar 25, 2020

Summary:
The gdb pretty printer misprints variables declared via
using declarations of the form:

namespace foo {
using string_view = std::string_view;

string_view bar;
}

This change fixes that, by deferring the decision to ignore
types not inside std until after desugaring.

Reviewers: #libc!

Subscribers: broadwaylamb, libcxx-commits

Tags: #libc

Differential Revision: https://reviews.llvm.org/D76816

6c6fba88

[Alignment][NFC] Use llvmTargetFrameLowering::getStackAlign · b727aabc

Guillaume Chatelet authored Mar 26, 2020

Summary:
This patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Reviewers: courbet

Reviewed By: courbet

Subscribers: wuzish, arsenm, jyknight, nemanjai, jvesely, nhaehnle, hiraditya, kbarton, fedor.sergeev, jrtc27, kerbowa, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D76613

b727aabc

[InstCombine] Fix Incorrect fold of ashr+xor -> lshr w/ vectors · 7a89a5d8
Jon Roelofs authored Mar 25, 2020
```
Fixes https://bugs.llvm.org/show_bug.cgi?id=43665
```
7a89a5d8

[docs][Phabricator] git migration related update · fe025a34

Jinsong Ji authored Mar 26, 2020

1. Add instructions to update the author when committing someone else's patch

We have updated DeveloperPolicy to show how to change author in
https://reviews.llvm.org/D72468

We should also update the Phabricator page to include this information,
in case people follow the steps here and forget to update the author info.

2. Replace `git llvm push` with `git push`

Reviewed By: probinson

Differential Revision: https://reviews.llvm.org/D76718

fe025a34

[WebAssembly] Add test for event section order change · f033f201

Heejin Ahn authored Mar 25, 2020

Summary:
This adds a test for D76752. Now the global section comes after the
event section, and this change makes sure it is satisfied.

Reviewers: sbc100, tlively

Reviewed By: tlively

Subscribers: dschuff, jgravelle-google, sunfish, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D76823

f033f201

[mlir] Rename CMake target MLIRQuantOps to MLIRQuant · 6946ca4b

Marius Brehler authored Mar 26, 2020

With commit 4d60f47b VectorOps was renamed to Vector and the naming of
the CMake target was adjusted. With commit 363dd3f3 QuantOps was
renamed to Quant, but the naming of the CMake target is left
untouched. This renames the CMake target.

6946ca4b

[ASan] Fix issue where system log buffer was not cleared after reporting an issue. · 445b810f

Dan Liew authored Mar 24, 2020

Summary:
When ASan reports an issue the contents of the system log buffer
(`error_message_buffer`) get flushed to the system log (via
`LogFullErrorReport()`). After this happens the buffer is not cleared
but this is usually fine because the process usually exits soon after
reporting the issue.

However, when ASan runs in `halt_on_error=0` mode execution continues
without clearing the buffer. This leads to problems if more ASan
issues are found and reported.

1. Duplicate ASan reports in the system log. The Nth (start counting from 1)
ASan report will be duplicated (M - N) times in the system log if M is the
number of ASan issues reported.

2. Lost ASan reports. Given a sufficient number of reports the buffer will
fill up and consequently cannot be appended to. This means reports can be
lost.

The fix here is to reset `error_message_buffer_pos` to 0 which
effectively clears the system log buffer.
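
A small model of the failure mode and the fix, assuming a fixed-size buffer whose whole contents are flushed as one log entry. `ErrorMessageBuffer`, `append`, and `flushTo` are hypothetical stand-ins, not the ASan runtime's types, and the 64-byte capacity is arbitrary.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

class ErrorMessageBuffer {
public:
  explicit ErrorMessageBuffer(bool resetAfterFlush)
      : resetAfterFlush_(resetAfterFlush) {}

  // Append as much of the text as still fits; the rest is silently lost.
  void append(const std::string &text) {
    assert(pos_ <= kCapacity);
    std::size_t n = std::min(kCapacity - pos_, text.size());
    std::memcpy(buffer_ + pos_, text.data(), n);
    pos_ += n;
  }

  // Write the current buffer contents to the log as a single entry.
  void flushTo(std::vector<std::string> &systemLog) {
    systemLog.emplace_back(buffer_, pos_);
    if (resetAfterFlush_)
      pos_ = 0; // the fix: start the next report from an empty buffer
  }

private:
  static constexpr std::size_t kCapacity = 64;
  char buffer_[kCapacity] = {};
  std::size_t pos_ = 0;
  bool resetAfterFlush_;
};
```

Without the reset, the second flush in this model logs the first report again in front of the second one, and a long enough run of reports exhausts the capacity so that later text is dropped.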

A test case is included but unfortunately it is Darwin specific because
querying the system log is an OS specific activity.

rdar://problem/55986279

Reviewers: kubamracek, yln, vitalybuka, kcc, filcab

Subscribers: #sanitizers, llvm-commits

Tags: #sanitizers

Differential Revision: https://reviews.llvm.org/D76749

445b810f

Allow IndexType inside tensors. · 3dceb6d2

Sean Silva authored Mar 24, 2020

It's common in many dialects to use tensors to hold tensor shapes themselves (for example, when the shape is itself the result of some non-trivial calculation). Currently, such dialects have to use `tensor<?xi64>` or worse (like allowing either i32 or i64 tensors to represent shapes). `tensor<?xindex>` is the natural type to represent this, but is currently disallowed. This patch allows it.

Differential Revision: https://reviews.llvm.org/D76726

3dceb6d2

Test that would have caught recovery-expr crashes in 0788acbc . NFC · 47e7bdb1
Sam McCall authored Mar 26, 2020

47e7bdb1

[mlir] StandardToLLVM: clean up conversion patterns for vector operations · 04ed07bc

Alex Zinenko authored Mar 26, 2020

Summary:
Provide a public VectorConvertToLLVMPattern utility class to implement
conversions with automatic unrolling of operation on multidimensional vectors
to lists of operations on single-dimensional vectors when lowering to the LLVM
dialect. Drop the template-based check on the number of operands since the
actual implementation does not depend on the operand number anymore. This check
only creates spurious concepts (UnaryOpLowering, BinaryOpLowering, etc).
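
A rough sketch of the unrolling idea, assuming an n-D vector is treated as nested 1-D vectors along its last dimension. It only enumerates the positions at which a 1-D operation would be emitted; `unrollPositions` is a hypothetical helper and does not use or mirror the MLIR API.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// For a vector shape such as {2, 3, 4}, return every index tuple over the
// leading dimensions ({0,0}, {0,1}, ... {1,2}). Each tuple identifies one
// 1-D vector of length shape.back() that a 1-D operation would process.
std::vector<std::vector<std::size_t>>
unrollPositions(const std::vector<std::size_t> &shape) {
  assert(!shape.empty() && "a vector type has at least one dimension");
  std::vector<std::vector<std::size_t>> positions{{}};
  for (std::size_t d = 0; d + 1 < shape.size(); ++d) {
    std::vector<std::vector<std::size_t>> next;
    for (const auto &prefix : positions) {
      for (std::size_t i = 0; i < shape[d]; ++i) {
        std::vector<std::size_t> p = prefix;
        p.push_back(i);
        next.push_back(std::move(p));
      }
    }
    positions = std::move(next);
  }
  return positions;
}
```

A shape of {2, 3, 4} gives six positions, so one operation on the 3-D vector becomes six operations on 4-element vectors; a 1-D shape gives a single empty position and needs no unrolling. None of this depends on how many operands the operation has.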

Differential Revision: https://reviews.llvm.org/D76865

04ed07bc

[mlir] StandardToLLVM: make one-to-one conversion pattern publicly available · 987fbae0

Alex Zinenko authored Mar 26, 2020

Summary:
The Standard-to-LLVM dialect conversion has a set of utility classes that
simplify conversions, including patterns that provide one-to-one operation
conversion with optional result packing. Expose these classes in a
public header so that conversions other than Standard-to-LLVM (e.g. vectors, or
LLVM-based intrinsics) could also use them. Since the patterns are implemented
as class templates and in order to keep the code size limited, keep the
implementation private by resorting to op identifiers instead of template-based
builders.

Differential Revision: https://reviews.llvm.org/D76864

987fbae0

[libc++abi] Remove unused lit feature · abcb9bb7
Louis Dionne authored Mar 26, 2020

abcb9bb7

[GWP-ASan] Use functions in backtrace test, not line numbers. · 1216f4c0

Mitch Phillips authored Mar 26, 2020

Summary:
There's no unwinding functionality on Android that allows for line
numbers to be retrieved in-process. As a result, we can't have
this backtrace test run on Android.

Clean up the test to use optnone functions instead, which is more stable
than line numbers anyway.

Reviewers: eugenis

Reviewed By: eugenis

Subscribers: #sanitizers, morehouse, cferris

Tags: #sanitizers

Differential Revision: https://reviews.llvm.org/D76807

1216f4c0

[lldb/CMake] Fix `install` for multi-configuration generators. · 17e4c387

Jonas Devlieghere authored Mar 26, 2020

For multi-configuration generators like MSVC and Xcode, the install source
and destination of the lldb-python-scripts target contain
configuration-dependent paths and therefore need to be substituted.

Differential revision: https://reviews.llvm.org/D76827

17e4c387

[analyzer] Add the Preprocessor to CheckerManager · 4dc84729
Kristóf Umann authored Mar 26, 2020

4dc84729

CUDA: Fix broken test run lines · 40076c14

Matt Arsenault authored Mar 26, 2020

There was a missing space between the -march and --cuda-gpu-arch
arguments, so --cuda-gpu-arch wasn't actually being parsed. I'm not
sure what the intent of the sm_10 run lines was, but they error as an
unsupported architecture. Switch these to something else.

40076c14

[AMDGPU] Make use of divideCeil. NFC. · 0602c20b
Jay Foad authored Mar 26, 2020

0602c20b

[AMDGPU] Remove unused methods. NFC. · 596bed3f
Jay Foad authored Mar 26, 2020

596bed3f