Commits · fe0197e194a64f950602fb50736b6648a9e5b2a9 · Lorenzo Albano / LLVM bpEVL

Oct 07, 2020

[InstCombine] Add checks for and(logicalshift(zext(x),undef),y) cases · fe0197e1
Simon Pilgrim authored Oct 07, 2020
```
Prep work before some cleanup in narrowMaskedBinOp
```
fe0197e1
Add REQUIRES: x86-registered-target to test as it was failing on build bots without x86. · ea274be7
Douglas Yung authored Oct 07, 2020
```
This should fix the failure on http://lab.llvm.org:8011/#/builders/91/builds/30
```
ea274be7

Add a clarifying comment on CastInst::isNoopCast · 42ffba05

Philip Reames authored Oct 07, 2020

I made exactly the mistake described, so document the precondition.  It would be better to have an assert, but there is (currently) no "castIsValid" with purely type arguments.

42ffba05

Fix MSVC "not all control paths return a value" warning. NFCI. · 03280055
Simon Pilgrim authored Oct 07, 2020

03280055
Fix Wdocumentation warnings due to case mismatch. NFCI. · e9af30c3
Simon Pilgrim authored Oct 07, 2020

e9af30c3

[AMDGPU] Add tied operand to d16 scratch loads · 45014ce3

Stanislav Mekhanoshin authored Oct 06, 2020

This is still a no-op because there is no selection for these
opcodes.

Differential Revision: https://reviews.llvm.org/D88927

45014ce3

[test][MC] Use %python in llvm/test/MC/COFF/bigobj.py · dd2f79ed

Edd Dawson authored Oct 07, 2020

... instead of the one on the $PATH.

Reviewed By: hubert.reinterpretcast

Differential Revision: https://reviews.llvm.org/D88986

dd2f79ed

[LAA] Use DL to get element size for bound computation. · a73166a4

Florian Hahn authored Oct 07, 2020

Currently LAA uses getScalarSizeInBits to compute the size of an element
when computing the end bound of an access.

This does not work as expected for pointers to pointers, because
getScalarSizeInBits will return 0 for pointer types.

By using DataLayout to get the size of the element we can also correctly
handle pointer element types.
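
A minimal, self-contained sketch of the problem described above. `ToyType`, `ToyDataLayout` and `endBound` are hypothetical stand-ins invented for illustration, not LLVM's own classes, and the 64-bit pointer width used with them is an assumption:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for an IR type: an N-bit integer or a pointer.
struct ToyType {
  enum Kind { Integer, Pointer } kind;
  uint64_t intBits; // only meaningful for Integer

  // Mirrors the behaviour described in the commit message: the scalar-size
  // query reports 0 for pointer types.
  uint64_t scalarSizeInBits() const { return kind == Integer ? intBits : 0; }
};

// Hypothetical stand-in for a data layout that knows the pointer width.
struct ToyDataLayout {
  uint64_t pointerBits;
  uint64_t typeSizeInBits(const ToyType &t) const {
    return t.kind == ToyType::Pointer ? pointerBits : t.intBits;
  }
};

// End bound of an access: the address just past the last accessed element.
inline uint64_t endBound(uint64_t lastAccessAddr, uint64_t elementBits) {
  return lastAccessAddr + elementBits / 8;
}
```

With a pointer element type, the scalar-size query leaves the end bound equal to the last access address, while the layout-based query advances it by the pointer size.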

Note the changes to the existing test, which seems to also use the wrong
offset for the end.

Fixes PR47751.

Reviewed By: anemet

Differential Revision: https://reviews.llvm.org/D88953

a73166a4

[AMDGPU] Use default zero flag operands in flat scratch · 7361ce73
Stanislav Mekhanoshin authored Oct 06, 2020
```
This is a no-op so far because we do not select these yet.

Differential Revision: https://reviews.llvm.org/D88920
```
7361ce73

Rename the VECREDUCE_STRICT_{FADD,FMUL} SDNodes to VECREDUCE_SEQ_{FADD,FMUL}. · e72cfd93

Amara Emerson authored Oct 03, 2020

The STRICT was causing unnecessary confusion. I think SEQ is a more accurate
name for what they actually do, and the other obvious option of "ORDERED"
has the issue of already having a meaning in FP contexts.
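
As an illustration of what "sequential" means here, the sketch below contrasts an in-order floating-point reduction with one reassociated order. The function names are hypothetical and the code is plain scalar C++, not the SDNode lowering itself; it assumes IEEE-754 single-precision evaluation:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Sequential (in-order) reduction: (((start + v0) + v1) + v2) + ...
inline float reduceSeqFAdd(float start, const std::vector<float> &v) {
  float acc = start;
  for (float x : v)
    acc += x;
  return acc;
}

// Reassociated reduction: sums even and odd lanes separately, then combines.
// This is only one of many orders a reassociating reduction might pick.
inline float reduceReassocFAdd(float start, const std::vector<float> &v) {
  float even = 0.0f, odd = 0.0f;
  for (std::size_t i = 0; i < v.size(); ++i)
    (i % 2 == 0 ? even : odd) += v[i];
  return start + (even + odd);
}
```

For the input `{1e8f, 1.0f, -1e8f, 1.0f}` with a start value of `0.0f`, the two orders round differently, which is why an in-order reduction needs its own node.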

Differential Revision: https://reviews.llvm.org/D88791

e72cfd93

[llvm][mlir] Promote the experimental reduction intrinsics to be first class intrinsics. · 322d0afd

Amara Emerson authored Oct 02, 2020

This change renames the intrinsics to not have "experimental" in the name.

The autoupgrader will handle legacy intrinsics.

Relevant ML thread: http://lists.llvm.org/pipermail/llvm-dev/2020-April/140729.html

Differential Revision: https://reviews.llvm.org/D88787

322d0afd

[NFC] Add contributors names to CREDITS.TXT · 19bc894d
Fanbo Meng authored Oct 07, 2020

19bc894d

[WebAssembly] Rename Emscripten EH functions · 3bba91f6

Heejin Ahn authored Sep 28, 2020

Renaming for some Emscripten EH functions has so far been done in
wasm-emscripten-finalize tool in Binaryen. But recently we decided to
make a compilation/linking path that does not rely on
wasm-emscripten-finalize for modifications, so here we move that
functionality to LLVM.

Invoke wrappers are generated in the LowerEmscriptenEHSjLj pass, but final
wasm types are not available in the IR pass, so we need to rename them at
the end of the pipeline.

This patch also removes uses of `emscripten_longjmp_jmpbuf` in
LowerEmscriptenEHSjLj pass, replacing that with `emscripten_longjmp`.
`emscripten_longjmp_jmpbuf` is lowered to `emscripten_longjmp`, but
previously we generated calls to `emscripten_longjmp_jmpbuf` in
LowerEmscriptenEHSjLj pass because it takes `jmp_buf*` instead of `i32`.
But we were able to use `ptrtoint` to make it use `emscripten_longjmp`
directly here.

Addresses:
https://github.com/WebAssembly/binaryen/issues/3043
https://github.com/WebAssembly/binaryen/issues/3081

Companions:
https://github.com/WebAssembly/binaryen/pull/3191
https://github.com/emscripten-core/emscripten/pull/12399

Reviewed By: dschuff, tlively, sbc100

Differential Revision: https://reviews.llvm.org/D88697

3bba91f6

[json] Provide a means to delegate writing a value to another API · 91a98ec1

Daniel Sanders authored Oct 06, 2020

(Based on D87170 by dsanders)

I recently needed to call out to an external API to emit a JSON object as part
of one that an LLVM tool was emitting. However, our JSON support didn't provide
a way to delegate part of the JSON output to that API.

Add rawValueBegin() and rawValueEnd() to maintain and check the internal state
while something else is writing to the stream. It's the user's responsibility to
ensure that the resulting JSON output is still valid.
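
The hand-off can be sketched with a toy writer. `ToyJsonWriter` and `externalEmit` are hypothetical names; this only models the state tracking around a delegated value and is not LLVM's JSON stream class or its exact signatures:

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical toy writer that tracks comma state and whether an external
// writer currently owns the stream.
class ToyJsonWriter {
  std::ostringstream out;
  bool needComma = false;     // does the next value need a leading ','?
  bool rawInProgress = false; // is an external writer active?

  void beforeValue() {
    assert(!rawInProgress && "finish the raw value first");
    if (needComma)
      out << ',';
    needComma = true;
  }

public:
  void arrayBegin() { beforeValue(); out << '['; needComma = false; }
  void arrayEnd() { assert(!rawInProgress); out << ']'; needComma = true; }
  void value(int v) { beforeValue(); out << v; }

  // Hands the underlying stream to an external writer. The caller must emit
  // exactly one valid JSON value before calling rawValueEnd().
  std::ostream &rawValueBegin() {
    beforeValue();
    rawInProgress = true;
    return out;
  }
  void rawValueEnd() {
    assert(rawInProgress && "rawValueEnd without rawValueBegin");
    rawInProgress = false;
  }

  std::string str() const { return out.str(); }
};

// Stand-in for an external API that writes its own JSON to a stream.
inline void externalEmit(std::ostream &os) { os << "{\"k\":true}"; }
```

The writer still inserts separators around the delegated value, but it cannot check what the external code wrote, which matches the caveat about validity above.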

Differential Revision: https://reviews.llvm.org/D88902

91a98ec1

Reapply [ADT] function_ref's constructor is unavailable if the argument is not callable. · b953a01b

Sam McCall authored Oct 07, 2020

This reverts commit 281703e6.

GCC 5.4 bugs are worked around by avoiding use of variable templates.

Differential Revision: https://reviews.llvm.org/D88977

b953a01b

[MemCpyOpt] Add additional callslot test cases (NFC) · 7a01fc5a
Nikita Popov authored Oct 06, 2020
```
For cases where the destination is captured.
```
7a01fc5a
[NFC][InstCombine] Autogenerate a few tests being affected by upcoming patch · bef27e50
Roman Lebedev authored Oct 07, 2020

bef27e50
[Tests] Precommit test showing gap around load forwarding of vectors in instcombine · 14d5ee63
Philip Reames authored Oct 07, 2020

14d5ee63
[gn build] Port ddf1864a · d6af25e0
LLVM GN Syncbot authored Oct 07, 2020

d6af25e0

BPF: add AdjustOpt IR pass to generate verifier friendly codes · ddf1864a

Yonghong Song authored Aug 06, 2020

Add an IR phase right before main module optimization.
This is to modify IR to restrict certain downward optimizations
in order to generate verifier friendly code.
  > prevent certain instcombine optimizations, handling both
    in-block/cross-block instcombines.
  > avoid speculative code motion if the variable used in
    condition is also used in the later blocks.

Internally, a bpf IR builtin
  result = __builtin_bpf_passthrough(seq_num, result)
is used to enforce ordering. This builtin is only used
during target independent IR optimizations and it will
be removed at the beginning of target dependent IR
optimizations.

For example, removing the following workaround,
  --- a/tools/testing/selftests/bpf/progs/test_sysctl_loop1.c
  +++ b/tools/testing/selftests/bpf/progs/test_sysctl_loop1.c
  @@ -47,7 +47,7 @@ int sysctl_tcp_mem(struct bpf_sysctl *ctx)
          /* a workaround to prevent compiler from generating
           * codes verifier cannot handle yet.
           */
  -       volatile int ret;
  +       int ret;
this patch is able to generate code which passed the verifier.

To disable these optimizations, users need to use the "opt" command as shown below:
  clang -target bpf -O2 -S -emit-llvm -Xclang -disable-llvm-passes test.c
  // disable icmp serialization
  opt -O2 -bpf-disable-serialize-icmp test.ll | llvm-dis > t.ll
  // disable avoid-speculation
  opt -O2 -bpf-disable-avoid-speculation test.ll | llvm-dis > t.ll
  llc t.ll

Differential Revision: https://reviews.llvm.org/D85570

ddf1864a

[AMDGPU] Support disassembly for AMDGPU kernel descriptors · 528057c1

Ronak Chauhan authored Oct 07, 2020

Decode AMDGPU Kernel descriptors as assembler directives.

Reviewed By: scott.linder, jhenderson, kzhuravl

Differential Revision: https://reviews.llvm.org/D80713

528057c1

[SVE] Lower fixed length VECREDUCE_OR operation · 333b2ab6
Cameron McInally authored Oct 07, 2020
```
Differential Revision: https://reviews.llvm.org/D88847
```
333b2ab6
[AMDGPU] Use @LINE for error checking in gfx10.3 assembler tests · fc819b69
Jay Foad authored Oct 07, 2020

fc819b69
Revert "[ADT] function_ref's constructor is unavailable if the argument is not callable." · 281703e6
Sam McCall authored Oct 07, 2020
```
This reverts commit 4cae6228.

Breaks GCC build:
http://lab.llvm.org:8011/#/builders/8/builds/33/steps/6/logs/stdio
```
281703e6
[gn build] (manually) port ce1365f8 · fbce456f
Nico Weber authored Oct 07, 2020

fbce456f

[ADT] function_ref's constructor is unavailable if the argument is not callable. · 4cae6228

Sam McCall authored Oct 06, 2020

This allows overload sets containing function_ref arguments to work correctly.
Otherwise they're ambiguous, as anything "could be" converted to a function_ref.

This matches proposed std::function_ref, absl::function_ref, etc.
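
A simplified sketch of the idea, assuming a hypothetical `ToyFunctionRef` rather than the real `llvm::function_ref` (it omits details such as excluding the copy constructor case). The constructor is removed from overload resolution unless the argument is callable with the expected parameters:

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <type_traits>
#include <utility>

template <typename Fn> class ToyFunctionRef; // hypothetical illustration

template <typename Ret, typename... Params>
class ToyFunctionRef<Ret(Params...)> {
  Ret (*callback)(std::intptr_t, Params...) = nullptr;
  std::intptr_t callable = 0;

  template <typename Callable>
  static Ret callbackFn(std::intptr_t c, Params... params) {
    return (*reinterpret_cast<Callable *>(c))(std::forward<Params>(params)...);
  }

public:
  // Participates in overload resolution only if Callable can be invoked with
  // Params... and its result converts to Ret.
  template <typename Callable,
            typename = typename std::enable_if<std::is_convertible<
                decltype(std::declval<Callable &>()(std::declval<Params>()...)),
                Ret>::value>::type>
  ToyFunctionRef(Callable &&c)
      : callback(callbackFn<typename std::remove_reference<Callable>::type>),
        callable(reinterpret_cast<std::intptr_t>(&c)) {}

  Ret operator()(Params... params) const {
    return callback(callable, std::forward<Params>(params)...);
  }
};

// An overload set that would be ambiguous with an unconstrained constructor.
inline std::string describe(ToyFunctionRef<int(int)>) { return "int(int)"; }
inline std::string describe(ToyFunctionRef<int(std::string)>) {
  return "int(string)";
}
```

Passing a lambda that takes an `int` selects the first `describe` overload, and one that takes a `std::string` selects the second, because the non-matching constructor is no longer viable.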

Differential Revision: https://reviews.llvm.org/D88901

4cae6228

[obj2yaml] - Rename `Group` to `GroupSection`. NFC. · 82311766

Georgii Rymar authored Oct 06, 2020

The `Group` class represents a group section, but it is
named inconsistently with the other section classes, which all have
the "Section" suffix. This is sometimes confusing;
this patch addresses the issue.

Differential revision: https://reviews.llvm.org/D88892

82311766

[llvm-readelf] - Implement --addrsig option. · 55a60af2

Georgii Rymar authored Oct 05, 2020

We have `--addrsig` implemented for `llvm-readobj`.
Usually it is convenient to use a single tool for dumping,
so it seems we might want to implement `--addrsig` for `llvm-readelf` too.

I've selected a simple output format that is similar to the one
used for dumping the symbol table. It looks like:

```
Address-significant symbols section '.llvm_addrsig' contains 2 entries:
   Num: Name
     1: foo
     2: bar
```

Differential revision: https://reviews.llvm.org/D88835

55a60af2

[AMDGPU][MC] Improved diagnostics for instructions with missing features · 4a7e7620
Dmitry Preobrazhensky authored Oct 07, 2020
```
Reviewers: rampitec

Differential Revision: https://reviews.llvm.org/D88887
```
4a7e7620

InstCombine: Negator: don't rely on complexity sorting already being performed (PR47752) · fed0f890

Roman Lebedev authored Oct 07, 2020

In some cases, we can negate an instruction if only one of its operands
negates. Previously, we assumed that constants would have been
canonicalized to RHS already, but that isn't guaranteed to happen,
because of InstCombine worklist visitation order,
as the added test (previously-hanging) shows.

So if we only need to negate a single operand,
we should make sure that we try the constant operand first.
Do that by re-doing the complexity sorting ourselves,
when we actually care about it.

Fixes https://bugs.llvm.org/show_bug.cgi?id=47752

fed0f890

[AMDGPU] Implement hardware bug workaround for image instructions · f71f5f39

Rodrigo Dominguez authored Apr 03, 2020

Summary:
This implements a workaround for a hardware bug in gfx8 and gfx9,
where register usage is not estimated correctly for image_store and
image_gather4 instructions when D16 is used.

Change-Id: I4e30744da6796acac53a9b5ad37ac1c2035c8899

Subscribers: arsenm, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, kerbowa, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D81172

f71f5f39

[InstCombine] Tweak funnel by constant tests for better shl/lshr commutation coverage · dce03e30
Simon Pilgrim authored Oct 07, 2020

dce03e30
[ARM] Regenerate vldlane tests · 6625892d
Simon Pilgrim authored Oct 07, 2020
```
To help make the diffs in D88569 clearer
```
6625892d
[LAA] Add test for PR47751, which currently uses wrong bounds. · 20cfd5fa
Florian Hahn authored Oct 07, 2020

20cfd5fa

[SDag] SimplifyDemandedBits: simplify to FP constant if all bits known · 1aa8e6a5

Jay Foad authored Sep 30, 2020

We were already doing this for integer constants. This patch implements
the same thing for floating point constants.
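
A toy model of the fold, assuming a hypothetical 32-bit `ToyKnownBits` record instead of the SelectionDAG data structures: once every bit is known to be either 0 or 1, the value is a constant and its bit pattern can be reinterpreted as a float.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical known-bits record: a set bit in `zero` means that bit is known
// to be 0; a set bit in `one` means it is known to be 1.
struct ToyKnownBits {
  uint32_t zero;
  uint32_t one;
  bool isConstant() const { return (zero | one) == 0xFFFFFFFFu; }
};

// If every bit is known, reinterpret the known-one mask as an IEEE-754 float
// and return true. Returns false while any bit is still unknown.
inline bool tryFoldToFloat(const ToyKnownBits &k, float &out) {
  if (!k.isConstant())
    return false;
  static_assert(sizeof(float) == sizeof(uint32_t), "expects 32-bit float");
  std::memcpy(&out, &k.one, sizeof(out));
  return true;
}
```

For example, a record whose known-one mask is `0x3F800000` and whose known-zero mask is its complement folds to `1.0f`.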

Differential Revision: https://reviews.llvm.org/D88570

1aa8e6a5

[Test] Add one more test where we can avoid creating trunc · 85a6f8fc
Max Kazantsev authored Oct 07, 2020

85a6f8fc

[Support][unittests] Enforce alignment in ConvertUTFTest · 53b3873c

Rainer Orth authored Oct 07, 2020

`LLVM-Unit :: Support/./SupportTests/ConvertUTFTest.ConvertUTF16LittleEndianToUTF8String`
`FAIL`s on Solaris/sparcv9:

In `llvm/lib/Support/ConvertUTFWrapper.cpp` (`convertUTF16ToUTF8String`)
the `SrcBytes` arg is reinterpreted/accessed as `UTF16` (`unsigned short`,
which requires 2-byte alignment on strict-alignment targets like Sparc)
without anything guaranteeing the alignment, so the access yields a
`SIGBUS`.

This patch avoids this by enforcing the required alignment in the callers.
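
One way a caller can guarantee the alignment is `alignas` on the source buffer. The sketch below is an assumption about the general technique, not the exact change made in the unit test; `isAlignedForUTF16` and `AlignedSrc` are hypothetical names:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Reports whether `p` is suitably aligned to be read as 16-bit code units.
inline bool isAlignedForUTF16(const void *p) {
  return reinterpret_cast<std::uintptr_t>(p) % alignof(std::uint16_t) == 0;
}

// Little-endian UTF-16 bytes: a byte order mark followed by 'A'.
// Without alignas, a char array only guarantees 1-byte alignment.
alignas(std::uint16_t) static const char AlignedSrc[] = {'\xff', '\xfe', 'A',
                                                         '\x00'};
```

A buffer declared this way can be reinterpreted as 16-bit units on strict-alignment targets, whereas a pointer one byte into it cannot.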

Tested on `sparcv9-sun-solaris2.11`.

Differential Revision: https://reviews.llvm.org/D88824

53b3873c

[NFC] Use getZero instead of getConstant(0) · fba42aea
Max Kazantsev authored Oct 07, 2020

fba42aea

[SROA] rewritePartition()/findCommonType(): if uses have conflicting type, try... · 7fa503ef

Roman Lebedev authored Oct 07, 2020

[SROA] rewritePartition()/findCommonType(): if uses have conflicting type, try getTypePartition() before falling back to largest integral use type (PR47592)

And another step towards transforms not introducing inttoptr and/or
ptrtoint casts that weren't there already.

In this case, when load/store uses have conflicting types,
instead of falling back to the iN, we can try to use allocated sub-type.
As discussed, this isn't the best idea overall (we shouldn't rely on
allocated type), but it works fine as a temporary measure.

I've measured, and @ `-O3` as of vanilla llvm test-suite + RawSpeed,
this results in +0.05% more bitcasts, -5.51% less inttoptr
and -1.05% less ptrtoint (at the end of middle-end opt pipeline)

See https://bugs.llvm.org/show_bug.cgi?id=47592

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D88788

7fa503ef

BPF: avoid duplicated globals for CORE relocations · edd71db3

Yonghong Song authored Oct 06, 2020

This patch fixes two issues related to relocation globals.
In LLVM, if a global, e.g. with name "g", is created and
conflicts with another global of the same name, LLVM will
rename the new global, e.g. to "g.2". Since a relocation
global's name has special meaning, we do not want LLVM to
change it, so internally we have logic to check whether
duplication happens. If it does, we just reuse
the previous global.

The first bug is related to non-btf-id relocation
(BPFAbstractMemberAccess.cpp). Commit 54d9f743
("BPF: move AbstractMemberAccess and PreserveDIType passes
to EP_EarlyAsPossible") changed ModulePass to FunctionPass,
i.e., handling one function at a time. But still just
one BPFAbstractMemberAccess object was created, so module
level de-duplication was still possible. Commit 40251fee
("[BPF][NewPM] Make BPFTargetMachine properly adjust NPM optimizer
pipeline") made a change to create a BPFAbstractMemberAccess
object per function, so module level de-duplication is no
longer possible without going through all module globals.
This patch simply makes the map which holds reloc globals
a class static, so it is available to all
BPFAbstractMemberAccess objects for different functions.

The second bug is related to btf-id relocation
(BPFPreserveDIType.cpp). Before commit 54d9f743, the pass
was a ModulePass, so we had a local variable, incremented for
each instance, and it worked fine. But after commit 54d9f743,
the pass became a FunctionPass. A local variable won't work
properly since different functions will start with the same
initial value. Fix the issue by changing the local count variable
to static, so it will be truly unique across the whole module
compilation.
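
The effect of making the counter static can be sketched with a toy class. `ToyPerFunctionPass` and the `reloc.` name prefix are hypothetical and stand in for a pass object that is created once per function:

```cpp
#include <cassert>
#include <string>

// Hypothetical per-function pass object. A fresh instance is created for each
// function, so a plain member counter restarts at 0 every time.
class ToyPerFunctionPass {
  int localCount = 0;     // restarts for every instance
  static int sharedCount; // one counter shared by all instances

public:
  std::string nameWithLocal() {
    return "reloc." + std::to_string(localCount++);
  }
  std::string nameWithShared() {
    return "reloc." + std::to_string(sharedCount++);
  }
};

int ToyPerFunctionPass::sharedCount = 0;
```

Two instances using the per-instance counter both produce the same first name, while the static counter keeps names distinct across instances.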

Differential Revision: https://reviews.llvm.org/D88942

edd71db3