Commits · 21a4faab60c34b8a8c4d09a5ffac50ded8163208 · Roger Ferrer / llvm-epi

Feb 22, 2021

[ARM] Move double vector insert patterns using vins to DAG combine · 21a4faab

David Green authored Feb 22, 2021

This removes the existing patterns for inserting two lanes into an
f16/i16 vector register using VINS, instead using a DAG combine to
pattern match the same code sequences. The tablegen patterns were
already on the large side (foreach LANE = [0, 2, 4, 6]) and were not
handling all the cases they could. Moving that to a DAG combine, whilst
not less code, allows us to better control and expand the selection of
VINSs. Additionally this allows us to remove the AddedComplexity on
VCVTT.

The extra trick that this has learned in the process is to move two
adjacent lanes using a single f32 vmov, allowing some extra
inefficiencies to be removed.

Differential Revision: https://reviews.llvm.org/D96876

21a4faab

[WebAssembly] call_indirect issues table number relocs · 861dbe1a

Andy Wingo authored Feb 12, 2021

If the reference-types feature is enabled, call_indirect will explicitly
reference its corresponding function table via `TABLE_NUMBER`
relocations against a table symbol.

As before, address-taken functions can also cause the function table
to be created; with reference-types they additionally cause a symbol
table entry to be emitted.

We abuse the used-in-reloc flag on symbols to indicate which tables
should end up in the symbol table.  We do this because unfortunately
older wasm-ld will carp if it sees a table symbol.

Differential Revision: https://reviews.llvm.org/D90948

861dbe1a

[clang][CodeComplete] Ensure there are no crashes when completing with ParenListExprs as LHS · f1013739
Kadir Cetinkaya authored Feb 18, 2021
```
Differential Revision: https://reviews.llvm.org/D96950
```
f1013739

[clang][cli] Pass '-Wspir-compat' to cc1 from driver · 820e0c49

Jan Svoboda authored Feb 22, 2021

This patch moves the creation of the '-Wspir-compat' argument from cc1 to the driver.

Without this change, generating command line arguments from `CompilerInvocation` cannot be done reliably: there's no way to distinguish whether '-Wspir-compat' was passed to cc1 on the command line (should be generated), or if it was created within `CompilerInvocation::CreateFromArgs` (should not be generated).

This is also in line with how other '-W' flags are handled.

(This was introduced in D21567.)

Reviewed By: Anastasia

Differential Revision: https://reviews.llvm.org/D97041

820e0c49

[clang][cli] Stop creating '-Wno-stdlibcxx-not-found' in cc1 · bf15697e

Jan Svoboda authored Feb 22, 2021

This patch stops creating the '-Wno-stdlibcxx-not-found' argument in `CompilerInvocation::CreateFromArgs`.

The code was added in 2e7ab55e (a follow-up to D48297). However, D61963 removes relevant tests and starts explicitly passing '-Wno-stdlibcxx-not-found' to the driver. I think it's fair to assume this is dead code.

Reviewed By: dexonsmith

Differential Revision: https://reviews.llvm.org/D97042

bf15697e

[mlir] Mark std.subview as NoSideEffect · 5b20d80a
Tres Popp authored Feb 18, 2021
```
Differential Revision: https://reviews.llvm.org/D96951
```
5b20d80a

[NFC][llvm-dwarfdump] Don't calculate unnecessary stats · 52113451

Djordje Todorovic authored Feb 21, 2021

Small optimization of the code -- No need to calculate any stats
for NULL nodes, and also no need to call the collectStatsForDie()
if it is the CU itself.

Differential Revision: https://reviews.llvm.org/D96871

52113451

[InstrProfiling] Fix instrprof-gc-sections.c test · 97184ab9
Petr Hosek authored Feb 21, 2021
```
After D97110 __llvm_prof_cnts has the nobits type so it's empty.
```
97184ab9
[mlir] Export CUDA and Vulkan runtime wrappers on Windows · 2d62212b
Kern Handa authored Feb 21, 2021
```
Reviewed By: mehdi_amini

Differential Revision: https://reviews.llvm.org/D97140
```
2d62212b
[AArch64][GlobalISel] Fix <16 x s8> G_DUP regbankselect to assign source to gpr. · 6ff09ce0
Amara Emerson authored Feb 21, 2021
```
We can only select this type if the source is on GPR, not FPR.
```
6ff09ce0
[CodeGen] Use range-based for loops (NFC) · ffba9e59
Kazu Hirata authored Feb 21, 2021

ffba9e59
[llvm] Fix header guards (NFC) · 5032b589
Kazu Hirata authored Feb 21, 2021
```
Identified with llvm-header-guard.
```
5032b589
[Analysis] Use ListSeparator (NFC) · 047fc3bf
Kazu Hirata authored Feb 21, 2021

047fc3bf
Revert "[sanitizers] Pass CMAKE_C_FLAGS into TSan buildgo script" · 4b34e0c7
Nico Weber authored Feb 21, 2021
```
This reverts commit ac6c13bf.
Breaks building with PGO, see https://reviews.llvm.org/D96762#2574009
```
4b34e0c7

[mlir] Add simple jupyter kernel · 04c66edd

Jacques Pienaar authored Jan 30, 2021

Simple jupyter kernel using mlir-opt and reproducer to run passes.
Useful for local experimentation & generating examples. The export to
markdown from here is not immediately useful, nor did I define a
CodeMirror syntax to make the HTML output prettier. It only supports one
level of history (e.g., `_`), as I was mostly using it to expand a
pipeline one pass at a time and that was all I needed.

I placed this in utils directory next to editor & debugger utils.

Differential Revision: https://reviews.llvm.org/D95742

04c66edd

[InstrProfiling] Use ELF section groups for counters, data and values · 5ca21175

Petr Hosek authored Jul 13, 2019

__start_/__stop_ references retain C identifier name sections such as
__llvm_prf_*. Putting these into a section group disables this logic.

The ELF section group semantics ensures that group members are retained
or discarded as a unit. When a function symbol is discarded, this
allows the linker to discard counters, data and values associated with that
function symbol as well.

Note that `noduplicates` COMDAT is lowered to zero-flag section group in
ELF. We only set this for functions that aren't already in a COMDAT and
for those that don't have available_externally linkage since we already
use regular COMDAT groups for those.

Differential Revision: https://reviews.llvm.org/D96757

5ca21175

Feb 21, 2021

[KnownBits][RISCV] Improve known bits for srem. · 183bbad1

Craig Topper authored Feb 21, 2021

The result must be less than or equal to the LHS, so any
leading zeros in the LHS must also exist in the result.
This is stronger than the previous behavior where we only considered
the sign bit being 0.

The affected test case used the sign bit being known 0 to change
a sign extend to a zero extend pre type legalization. After type
legalization the types were promoted to i64, but we no longer
knew bit 31 was zero. The shifts are the equivalent of an
AND with 0xffffffff or zext_inreg X, i32. This patch allows us to
see that bit 31 is zero and remove the shifts.
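
The leading-zeros argument can be checked with a small sketch. This is only an illustration in Python, not the KnownBits code itself: the helper names `srem`, `leading_zeros` and `check_srem_leading_zeros` are made up here, and the LHS is assumed to be non-negative (otherwise it has no leading zeros to preserve).

```python
def srem(a: int, b: int) -> int:
    """Signed remainder that truncates toward zero (sign follows the LHS)."""
    r = abs(a) % abs(b)
    return -r if a < 0 else r


def leading_zeros(value: int, width: int = 32) -> int:
    """Leading zero bits of a non-negative value in a `width`-bit word."""
    return width - value.bit_length()


def check_srem_leading_zeros(lhs_values, rhs_values, width: int = 32) -> bool:
    """Check that srem never loses leading zeros of a non-negative LHS."""
    for a in lhs_values:
        for b in rhs_values:
            if b == 0:
                continue  # remainder by zero is undefined, skip it
            r = srem(a, b)
            # For a non-negative LHS the result lies in [0, LHS] ...
            if not 0 <= r <= a:
                return False
            # ... so it has at least as many leading zeros as the LHS.
            if leading_zeros(r, width) < leading_zeros(a, width):
                return False
    return True
```

Running `check_srem_leading_zeros(range(300), range(-20, 21))` exercises both positive and negative divisors.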

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D97124

183bbad1

Implement simple type polymorphism for linalg named ops. · 6c9541d4

Stella Laurenzo authored Feb 20, 2021

* It was decided that this was the end of the line for the existing custom tc parser/generator, and this is the first step to replacing it with a declarative format that maps well to mathy source languages.
* One such source language is implemented here: https://github.com/stellaraccident/mlir-linalgpy/blob/main/samples/mm.py
  * In fact, this is the exact source of the declarative `polymorphic_matmul` in this change.
  * I am working separately to clean this python implementation up and add it to MLIR (probably as `mlir.tools.linalg_opgen` or equiv). The scope of the python side is greater than just generating named ops: the ops are callable and directly emit `linalg.generic` ops fully dynamically, and this is intended to be a feature for frontends like npcomp to define custom linear algebra ops at runtime.
* There is more work required to handle full type polymorphism, especially with respect to integer formulations, since they require more specificity wrt types.
* Followups to this change will bring the new generator to feature parity with the current one and delete the current. Roughly, this involves adding support for interface declarations and attribute symbol bindings.

Differential Revision: https://reviews.llvm.org/D97135

6c9541d4

[X86] Add vector support to sub(C1, xor(X, C2)) -> add(xor(X, ~C2), C1+1) fold. · b568d3d6
Simon Pilgrim authored Feb 21, 2021

b568d3d6

[X86] Replace explicit constant handling in sub(C1, xor(X, C2)) -> add(xor(X,... · 3ab32c94

Simon Pilgrim authored Feb 21, 2021

[X86] Replace explicit constant handling in sub(C1, xor(X, C2)) -> add(xor(X, ~C2), C1+1) fold. NFCI.

NFC cleanup before adding vector support - rely on the SelectionDAG to handle everything for us.
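
For reference, the identity behind this fold can be checked with modular arithmetic. The sketch below is an illustration only, not the X86 combine: the function names are hypothetical and all operands are assumed to already be in the range of a `width`-bit unsigned integer. It relies on `-a == ~a + 1` and `~(X ^ C2) == X ^ ~C2`.

```python
def sub_of_xor(x: int, c1: int, c2: int, width: int = 32) -> int:
    """Original form: sub(C1, xor(X, C2)), wrapped to `width` bits."""
    mask = (1 << width) - 1
    return (c1 - (x ^ c2)) & mask


def add_of_xor(x: int, c1: int, c2: int, width: int = 32) -> int:
    """Folded form: add(xor(X, ~C2), C1 + 1), wrapped to `width` bits."""
    mask = (1 << width) - 1
    not_c2 = ~c2 & mask  # bitwise NOT within the fixed width
    return ((x ^ not_c2) + c1 + 1) & mask
```

Both functions should agree for every input; an exhaustive loop over 8-bit values is cheap enough to confirm that.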

3ab32c94

[X86] Regenerate sub.ll test · e7e35e17
Simon Pilgrim authored Feb 21, 2021

e7e35e17
[X86] Add 'sub C1, (xor X, C2) -> add (xor X, ~C2), C1+1' tests · 9872cfc5
Simon Pilgrim authored Feb 21, 2021
```
This is also in sub.ll but that's for a specific i686 pattern - this adds x86_64 and vector tests
```
9872cfc5
[X86] Add common CHECK check-prefix to sub combine tests · 0b372c02
Simon Pilgrim authored Feb 21, 2021

0b372c02
Revert "[lldb-vscode] Emit the breakpoint changed event on location resolved" · 878d82c4
António Afonso authored Feb 21, 2021
```
This reverts commit 1f21d488.
```
878d82c4

[lldb] [docs] Update platform support status · 7850bb5f

Michał Górny authored Feb 20, 2021

Update supported features on FreeBSD, and supported platform list
on FreeBSD, Linux and NetBSD.

Differential Revision: https://reviews.llvm.org/D97114

7850bb5f

[LLDB] [docs] Update the list of supported architectures on Windows · ae14f3fd
Martin Storsjö authored Feb 17, 2021
```
Differential Revision: https://reviews.llvm.org/D96840
```
ae14f3fd

Reapply "[lldb/test] Automatically find debug servers to test" · 3ca7b2d0

Pavel Labath authored Feb 04, 2021

This reapplies 7df4eaaa/D96202, which was reverted due to issues on
windows. These were caused by problems in the computation of the liblldb
directory, which were fixed by D96779.

The original commit message was:
Our test configuration logic assumes that the tests can be run either
with debugserver or with lldb-server. This is not entirely correct,
since lldb server has two "personalities" (platform server and debug
server) and debugserver is only a replacement for the latter.

A consequence of this is that it's not possible to test the platform
behavior of lldb-server on macos, as it is not possible to get a hold of
the lldb-server binary.

One solution to that would be to duplicate the server configuration
logic to be able to specify both executables. However, that seems
excessively redundant.

A well-behaved lldb should be able to find the debug server on its own,
and testing lldb with a different (lldb-|debug)server does not seem very
useful (even in the out-of-tree debugserver setup, we copy the server
into the build tree to make it appear "real").

Therefore, this patch deletes the configuration altogether and changes
the low-level server retrieval functions to be able to find both
lldb-server and debugserver paths. They do this by consulting the
"support executable" directory of the lldb under test.

Differential Revision: https://reviews.llvm.org/D96202

3ca7b2d0

[SelectionDAG][RISCV] Teach ComputeNumSignBits to handle SREM. · 1a6c1ac6

Craig Topper authored Feb 21, 2021

This also removes a pattern from RISCV that is no longer needed
since the sexti32 on the LHS of the srem in the pattern implies
the result is sign extended so the sign_extend_inreg should be
removed in DAG combine now.
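
The reasoning about the sign-extended LHS can be illustrated with a short check. This is a sketch under assumptions, not the ComputeNumSignBits implementation: `sext32`, `srem`, `fits_in_i32` and `srem_result_is_sign_extended` are names invented for this example, and the 64-bit operands are modelled with plain Python integers.

```python
def sext32(value: int) -> int:
    """Sign-extend the low 32 bits of `value`."""
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value & 0x80000000 else value


def srem(a: int, b: int) -> int:
    """Signed remainder that truncates toward zero (sign follows the LHS)."""
    r = abs(a) % abs(b)
    return -r if a < 0 else r


def fits_in_i32(value: int) -> bool:
    """True if `value` is representable as a signed 32-bit integer."""
    return -(1 << 31) <= value < (1 << 31)


def srem_result_is_sign_extended(lhs: int, rhs: int) -> bool:
    """With a sexti32 LHS, the remainder should already fit in i32,
    which is what makes a later sign_extend_inreg redundant."""
    return fits_in_i32(srem(sext32(lhs), rhs))
```

The property holds because the magnitude of the remainder never exceeds the magnitude of the LHS and its sign follows the LHS.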

Reviewed By: luismarques, RKSimon

Differential Revision: https://reviews.llvm.org/D97133

1a6c1ac6

[X86][AVX] canonicalizeLaneShuffleWithRepeatedOps - remove unnecessary BITCASTs. · bae04a3e

Simon Pilgrim authored Feb 21, 2021

In conjunction with the 'vperm2x128(bitcast(x),bitcast(y),c) -> bitcast(vperm2x128(x,y,c))' fold in combineTargetShuffle, this should remove any unnecessary bitcasts around vperm2x128 lane shuffles.

bae04a3e

Revert "Make sure the interpreter module was loaded before making checks against it" · b19d3b09
António Afonso authored Feb 21, 2021
```
This reverts commit a83a825e.
```
b19d3b09
[NFC] Remove redundant word in comment · 5fe23de5
madhur13490 authored Feb 21, 2021
```
Differential Revision: https://reviews.llvm.org/D97157
```
5fe23de5

[lldb-vscode] Emit the breakpoint changed event on location resolved · 1f21d488

António Afonso authored Feb 19, 2021

VSCode was not being informed whenever a location had been resolved (after being initiated as non-resolved), so even though it was actually resolved, the IDE would show a hollow dot (instead of a red dot) because it didn't know about the change.

Differential Revision: https://reviews.llvm.org/D96680

1f21d488

[Loads] Add optimized FindAvailableLoadedValue() overload (NFCI) · e0615bcd

Nikita Popov authored Feb 21, 2021

FindAvailableLoadedValue() accepts an iterator by reference. If no
available value is found, then the iterator will either be left
at a clobbering instruction or the beginning of the basic block.
This allows using FindAvailableLoadedValue() across multiple blocks.

If this functionality is not needed, as is the case in InstCombine,
then we can use a much more efficient implementation: First try
to find an available value, and only perform clobber checks if
we actually found one. As this function only looks at a very small
number of instructions (6 by default) and usually doesn't find an
available value, this saves many expensive alias analysis queries.
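
The difference between the two strategies can be sketched with a toy model. Everything below is hypothetical and heavily simplified (it is not LLVM's implementation): `Inst`, `CountingOracle`, `find_eager` and `find_lazy` are invented names, only stores are treated as potential clobbers, and the "alias analysis" is a lookup table that counts how often it is asked.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass(frozen=True)
class Inst:
    kind: str        # "load", "store", or "other"
    ptr: str = ""    # symbolic address, empty for "other"
    value: str = ""  # value loaded or stored


class CountingOracle:
    """Stand-in for alias analysis that counts how often it is queried."""

    def __init__(self, aliasing=()):
        self.aliasing = set(aliasing)
        self.queries = 0

    def __call__(self, inst: Inst, ptr: str) -> bool:
        self.queries += 1
        return (inst.ptr, ptr) in self.aliasing


def provides(inst: Inst, ptr: str) -> bool:
    """A load from or store to exactly `ptr` makes its value available."""
    return inst.kind in ("load", "store") and inst.ptr == ptr


def find_eager(block: Sequence[Inst], ptr: str,
               may_clobber: Callable[[Inst, str], bool],
               max_scan: int = 6) -> Optional[str]:
    """Scan backwards, running a clobber check on every store passed."""
    for inst in list(reversed(block))[:max_scan]:
        if provides(inst, ptr):
            return inst.value
        if inst.kind == "store" and may_clobber(inst, ptr):
            return None
    return None


def find_lazy(block: Sequence[Inst], ptr: str,
              may_clobber: Callable[[Inst, str], bool],
              max_scan: int = 6) -> Optional[str]:
    """Look for a candidate first; pay for clobber checks only on a hit."""
    window = list(reversed(block))[:max_scan]
    for i, inst in enumerate(window):
        if provides(inst, ptr):
            skipped = window[:i]
            if any(s.kind == "store" and may_clobber(s, ptr) for s in skipped):
                return None
            return inst.value
    return None
```

In this model both functions return the same answer, but when no candidate exists in the scan window `find_lazy` issues no oracle queries at all, which mirrors the saving described above.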

e0615bcd

[IR] restrict vector reduction intrinsic types · 215bb157

Sanjay Patel authored Feb 21, 2021

The arguments in all cases should be vectors of exactly one of integer or FP.

All of the tests currently pass the verifier because we check for any vector
type regardless of the type of reduction.
This obviously can't work if we mix up integer and FP, and based on current
LangRef text it was not intended to work for pointers either.

The pointer case from https://llvm.org/PR49215 is what led me here. That
example was avoided with 5b250a27.

Differential Revision: https://reviews.llvm.org/D96904

215bb157

Make sure the interpreter module was loaded before making checks against it · a83a825e

António Afonso authored Feb 18, 2021

This issue was introduced in https://reviews.llvm.org/D92187.
The guard I'm changing is supposed to act when linux is loading the linker for the second time (due to differences in paths like symlinks).
This is done by checking `module_sp != m_interpreter_module.lock()`; however, this will also be true when `m_interpreter_module` wasn't initialized, making linux unload the linker module (the most visible result here is that lldb will stop getting notified about new modules loaded by the process, because it can't set the rendezvous breakpoint again after stepping over it once).
`m_interpreter_module` is not initialized when it goes through this path: https://github.com/llvm/llvm-project/blob/dbfdb139f75470a9abc78e7c9faf743fdd963c2d/lldb/source/Plugins/DynamicLoader/POSIX-DYLD/DynamicLoaderPOSIXDYLD.cpp#L332, which happens when lldb was able to read the address from the dynamic section of the executable.

What I'm not sure about, though, is whether we still load the linker twice on linux when we go through this path. If that's the case, then we need to somehow set `m_interpreter_module` instead of applying the fix I provide here. I've only tested this on Android.

Differential Revision: https://reviews.llvm.org/D96637

a83a825e

[Loads] Extract helper function for available load/store (NFC) · 7c706aa0

Nikita Popov authored Feb 21, 2021

This contains the logic for extracting an available load/store
from a given instruction, to be reused in a following patch.

7c706aa0

[ThinLTO] Fix import of multiply defined global variables · e97aab8d

Kristina Bessonova authored Feb 02, 2021

Currently, if there is a module that contains a strong definition of
a global variable and a module that has both a weak definition for
the same global and a reference to it, it may result in an undefined symbol error
while linking with ThinLTO.

It happens because:
* the strong definition becomes internal because it is read-only and can be imported;
* the weak definition gets replaced by a declaration because it's non-prevailing;
* the strong definition fails to be imported because the destination module
  already contains another definition of the global, even though this def is non-prevailing.

The patch adds a check to computeImportForReferencedGlobals() that allows
considering a global variable for being imported even if the module contains
a definition of it in the case this def has an interposable linkage type.

Note that currently the check is based only on the linkage type
(and this seems to be enough at the moment), but it might be worth taking
into account whether the def is prevailing or not.

Reviewed By: tejohnson

Differential Revision: https://reviews.llvm.org/D95943

e97aab8d

[DAG] Match USUBSAT patterns through zext/trunc · 38ab47c8

Simon Pilgrim authored Feb 21, 2021

This patch handles usubsat patterns hidden through zext/trunc and uses the getTruncatedUSUBSAT helper to determine if the USUBSAT can be correctly performed in the truncated form:

zext(x) >= y ? x - trunc(y) : 0 --> usubsat(x,trunc(umin(y,SatLimit)))
zext(x) >  y ? x - trunc(y) : 0 --> usubsat(x,trunc(umin(y,SatLimit)))

Based on original examples:

void foo(unsigned short *p, int max, int n) {
    int i;
    unsigned m;
    for (i = 0; i < n; i++) {
        m = *--p;
        *p = (unsigned short)(m >= max ? m-max : 0);
    }
}
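
The two rewrites above can be sanity-checked with a small model. This is a sketch, not the DAG combine: the function names are hypothetical, `x` is taken to be a 16-bit value and `y` a wider unsigned value, and `SatLimit` is assumed to be the largest unsigned value of the narrow type (`0xFFFF` here).

```python
def trunc(value: int, width: int = 16) -> int:
    """Keep only the low `width` bits."""
    return value & ((1 << width) - 1)


def usubsat(a: int, b: int) -> int:
    """Unsigned saturating subtract: clamps at zero instead of wrapping."""
    return a - b if a >= b else 0


def select_form(x: int, y: int, width: int = 16, strict: bool = False) -> int:
    """zext(x) >= y ? x - trunc(y) : 0   (or '>' when strict is set)."""
    taken = x > y if strict else x >= y
    return trunc(x - trunc(y, width), width) if taken else 0


def usubsat_form(x: int, y: int, width: int = 16) -> int:
    """usubsat(x, trunc(umin(y, SatLimit))), SatLimit assumed to be 2^width - 1."""
    sat_limit = (1 << width) - 1
    return usubsat(x, trunc(min(y, sat_limit), width))
```

Clamping `y` before truncating is what keeps the narrow form correct: without the `umin`, a `y` such as `0x10001` would truncate to `1` and the subtraction would wrongly produce a non-zero result.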

Differential Revision: https://reviews.llvm.org/D25987

38ab47c8

[X86][AVX] Fold concat(extract_subvector(v0,c0), extract_subvector(v1,c1)) -> vperm2x128 · a6a258f1
Simon Pilgrim authored Feb 21, 2021
```
Fixes regression exposed by removing bitcasts across logic-ops in D96206.

Differential Revision: https://reviews.llvm.org/D96206
```
a6a258f1

[X86] Fold bitcast(logic(bitcast(X), Y)) --> logic'(X, bitcast(Y)) for int-int bitcasts · 2885d125

Simon Pilgrim authored Feb 21, 2021

Extend the existing combine that handles bitcasting for fp-logic ops to also help remove logic ops across bitcasts to/from the same integer types.

This helps improve AVX512 predicate handling for D/Q logic ops and also allows DAGCombine's scalarizeExtractedBinop to remove some annoying gpr->simd->gpr transfers.

The concat_vectors regression in pr40891.ll will be addressed in a followup commit on this patch.

Differential Revision: https://reviews.llvm.org/D96206

2885d125