- May 08, 2021
-
Xiang1 Zhang authored
-
Michael Liao authored
-
Arthur Eubanks authored
Printing pass manager invocations is fairly verbose and not super useful. This allows us to remove DebugLogging from pass managers and PassBuilder since all logging (aside from analysis managers) goes through instrumentation now. This has the downside of never being able to print the top level pass manager via instrumentation, but that seems like a minor downside. Reviewed By: ychen Differential Revision: https://reviews.llvm.org/D101797
-
RamNalamothu authored
UnwindTable::parseRows() may return successfully when the CFIProgram has either no CFI instructions or only DW_CFA_nop instructions, in which case the UnwindRow return argument will be empty. But currently, the callers are not checking for this case, which leads to incorrect dumps in the unwind tables in such cases, i.e. CFA=unspecified. Reviewed By: clayborg Differential Revision: https://reviews.llvm.org/D101892
-
River Riddle authored
The current design uses a unique entry for each argument/result attribute, with the name of the entry being something like "arg0". This provides for a somewhat sparse design, but ends up being much more expensive (from a runtime perspective) in practice. The design requires building a string every time we look up the dictionary for a specific arg/result, and also requires N attribute lookups when collecting all of the arg/result attribute dictionaries.

This revision restructures the design to instead have an ArrayAttr that contains all of the attribute dictionaries for arguments and another for results. This design reduces the number of attribute name lookups to 1, and allows for O(1) lookup for individual element dictionaries. The major downside is that we can end up with larger memory usage, as the ArrayAttr contains an entry for each element even if that element has no attributes. If the memory usage becomes too problematic, we can experiment with a more sparse structure that still provides a lot of the wins in this revision.

This dropped the compilation time of a somewhat large TensorFlow model from ~650 seconds to ~400 seconds.

Differential Revision: https://reviews.llvm.org/D102035
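A toy sketch of the tradeoff in plain Python (the dict/list layouts and attribute names here are illustrative only, not the MLIR data structures):

```python
# Old design: one named dictionary entry per argument, keyed by "arg<N>".
old_layout = {"arg0": {"noalias": True}, "arg2": {"byval": True}}

def old_lookup(attrs, index):
    # Every query builds a string key and probes the attribute dictionary;
    # collecting all N argument dictionaries costs N such lookups.
    return attrs.get("arg" + str(index), {})

# New design: a single array holding one dictionary per argument, including
# an empty dictionary for arguments that carry no attributes.
new_layout = [{"noalias": True}, {}, {"byval": True}]

def new_lookup(attrs, index):
    # One lookup to fetch the array, then O(1) indexing per element.
    return attrs[index]

assert old_lookup(old_layout, 2) == new_lookup(new_layout, 2) == {"byval": True}
```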
-
Arthur Eubanks authored
At 61 or over, I see messages like

  File "...\Python\Python39\lib\multiprocessing\connection.py", line 816, in _exhaustive_wait
    res = _winapi.WaitForMultipleObjects(L, False, timeout)
ValueError: need at most 63 handles, got a sequence of length 64

60 seems to work for me. If this causes issues for anybody else, feel free to revert.
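A minimal sketch of the kind of cap this implies, using a hypothetical helper (this is not the actual lit configuration code):

```python
import os

# Illustrative cap only. The ValueError above comes from Windows'
# WaitForMultipleObjects, which refuses to wait on more than 63 handles in
# this code path, so the worker count is held a little below that limit.
WINDOWS_WORKER_CAP = 60

def default_worker_count():
    count = os.cpu_count() or 1
    if os.name == "nt":
        count = min(count, WINDOWS_WORKER_CAP)
    return count
```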
-
River Riddle authored
This provides information when the user hovers over a part of the source .mlir file. This revision adds the following hover behavior:

* Operation:
  - Shows the generic form.
* Operation Result:
  - Shows the parent operation name, result number(s), and type(s).
* Block:
  - Shows the parent operation name, block number, predecessors, and successors.
* Block Argument:
  - Shows the parent operation name, parent block, argument number, and type.

Differential Revision: https://reviews.llvm.org/D101113
-
Arthur Eubanks authored
This reverts commit d319005a. Causing messages like:

  File "...\Python\Python39\lib\multiprocessing\connection.py", line 816, in _exhaustive_wait
    res = _winapi.WaitForMultipleObjects(L, False, timeout)
ValueError: need at most 63 handles, got a sequence of length 74
-
Arthur Eubanks authored
-
thomasraoux authored
The previous change caused another warning in some build configurations: "default label in switch which covers all enumeration values"
-
Amara Emerson authored
We never bothered to have a separate set of combines for -O0 in the prelegalizer before. This results in some minor performance hits for a mode where performance isn't a concern (although not regressing code size significantly is still preferable). This also removes the CSE option since we don't need it for -O0. Through experiments, I've arrived at a set of combines that gets the most code size improvement at -O0, while reducing the amount of time spent in the combiner by around 35% give or take. Differential Revision: https://reviews.llvm.org/D102038
-
Amara Emerson authored
For importing patterns, we only support matching G_LOAD, not G_ZEXTLOAD or G_SEXTLOAD. Differential Revision: https://reviews.llvm.org/D101932
-
Weston Carvalho authored
Differential Revision: https://reviews.llvm.org/D101572
-
Weston Carvalho authored
This will make it possible for more code to use it.
-
Arthur Eubanks authored
We're trying to move DebugLogging into instrumentation, rather than being part of PassManagers/AnalysisManagers. Reviewed By: ychen Differential Revision: https://reviews.llvm.org/D102093
-
- May 07, 2021
-
Jessica Paquette authored
Using `clampScalar` here because we ought to mark s128 as custom eventually. (Right now, it will just fall back.) With this legalization, we get the same code as SDAG: https://godbolt.org/z/TneoPKrKG Differential Revision: https://reviews.llvm.org/D100908
-
Adrian Prantl authored
-
Petr Hosek authored
This addresses an issue introduced in D91559. We would invoke the compiler with -Lpath/to/lib --sysroot=path/to/sysroot where both locations contain libraries with the same name, but we expect the linker to pick up the library in path/to/lib since that version is more specialized. This was the case before D91559, where the sysroot path would be ignored, but after that change the linker would pick up the library from the sysroot, which resulted in unexpected behavior. The sysroot path should always come after any user-provided library paths, followed by compiler runtime paths. We want libraries in user-provided library paths to always take precedence over sysroot libraries. This matches the behavior of other toolchains used with other targets. Differential Revision: https://reviews.llvm.org/D102049
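A small sketch of the intended ordering (hypothetical helper in Python; not the actual clang driver code):

```python
def linker_search_paths(user_lib_paths, sysroot_lib_paths, runtime_paths):
    """Order library search directories as described above: user -L paths
    first, then sysroot library paths, then compiler runtime paths, so a
    library present both in a -L directory and in the sysroot resolves to
    the -L copy."""
    return list(user_lib_paths) + list(sysroot_lib_paths) + list(runtime_paths)

paths = linker_search_paths(["path/to/lib"], ["path/to/sysroot/usr/lib"], [])
assert paths[0] == "path/to/lib"  # the more specialized copy wins
```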
-
Nico Weber authored
Before this, if an inline function was defined in several input files, lld would write each copy of the inline function to the output. With this patch, it only writes one copy.

Reduces the size of Chromium Framework from 378MB to 345MB (compared to 290MB linked with ld64, which also does dead-stripping, which we don't do yet), and makes linking it faster:

        N           Min           Max        Median           Avg        Stddev
    x  10     3.9957051     4.3496981     4.1411121      4.156837    0.10092097
    +  10      3.908154      4.169318     3.9712729     3.9846753   0.075773012
    Difference at 95.0% confidence
            -0.172162 +/- 0.083847
            -4.14165% +/- 2.01709%
            (Student's t, pooled s = 0.0892373)

Implementation-wise, when merging two weak symbols, this sets a "canOmitFromOutput" on the InputSection belonging to the weak symbol not put in the symbol table. We then don't write InputSections that have this set, as long as they are not referenced from other symbols. (This happens e.g. for object files that don't set .subsections_via_symbols or that use .alt_entry.)

Some restrictions:
- not yet done for bitcode inputs
- no "comdat" handling (`kindNoneGroupSubordinate*` in ld64) -- Frame Descriptor Entries (FDEs), Language Specific Data Areas (LSDAs) (that is, catch block unwind information) and Personality Routines associated with weak functions are still not stripped. This is wasteful, but harmless.
- However, this does strip weaks from __unwind_info (which is needed for correctness and not just for size)
- This nopes out on InputSections that are referenced from more than one symbol (e.g. from .alt_entry) for now

Things that work based on symbols Just Work:
- map files (change in MapFile.cpp is a no-op and not needed; I just found it a bit more explicit)
- exports

Things that work with inputSections need to explicitly check if an inputSection is written (e.g. unwind info).

This patch is useful in itself, but it's likely also a useful foundation for dead_strip. I used to have a "canonicalRepresentative" pointer on InputSection instead of just the bool, which would be handy for ICF too. But I ended up not needing it for this patch, so I removed that again for now.

Differential Revision: https://reviews.llvm.org/D102076
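A heavily simplified Python model of the coalescing idea (names are illustrative; the real change works on lld's symbol table and InputSection types, and which copy survives follows lld's weak-symbol resolution rather than simple input order):

```python
def coalesce_weak_definitions(symbols):
    """Keep one weak definition per name; mark the sections of the losing
    copies as omittable so they are not written to the output, unless
    something else still references them."""
    chosen = {}
    for sym in symbols:
        if sym["name"] not in chosen:
            chosen[sym["name"]] = sym
        else:
            # The copy not kept in the symbol table gets canOmitFromOutput
            # set on its InputSection (modelled here as a plain dict field).
            sym["section"]["can_omit_from_output"] = True
    return chosen

inline_a = {"name": "_inline_fn", "section": {"can_omit_from_output": False}}
inline_b = {"name": "_inline_fn", "section": {"can_omit_from_output": False}}
coalesce_weak_definitions([inline_a, inline_b])
assert inline_b["section"]["can_omit_from_output"]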
-
thomasraoux authored
-
thomasraoux authored
Differential Revision: https://reviews.llvm.org/D102091
-
Petr Hosek authored
This reverts commit 6b00b34b.
-
Florian Hahn authored
The comment incorrectly states that the PHI is recorded. That's not accurate; only the recipe for the incoming value is recorded. Suggested post-commit for 4ba8720f.
-
Andrea Di Biagio authored
The register file should always check if the destination register is from a register class that allows move elimination. Before this change, the check on the register class was only performed in a few very specific cases. However, it should have always been performed. This patch fixes the issue. Note that none of the upstream scheduling models is currently affected by this bug, so there is no test for it. The issue was found by Roman while working on the znver3 model. I was able to reproduce the issue locally by tweaking the btver2 model. I then verified that this patch fixes the issue.
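A toy sketch of the rule being enforced (illustrative names only, not the llvm-mca API):

```python
# Simplified register-file model; register class names are made up.
ALLOWS_MOVE_ELIMINATION = {"GPR"}

def try_eliminate_move(dst_reg_class, other_checks_pass):
    # The register-class check must happen unconditionally, not only in a few
    # special cases as before this fix.
    if dst_reg_class not in ALLOWS_MOVE_ELIMINATION:
        return False
    return other_checks_pass

assert try_eliminate_move("GPR", True)
assert not try_eliminate_move("FPR", True)  # class forbids elimination
```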
-
Olivier Goffart authored
Commit 5baea056 set the CurCodeDecl because it was needed to pass the assert in CodeGenFunction::EmitLValueForLambdaField, but this was not right to do, as CodeGenFunction::FinishFunction passes it to EmitEndEHSpec and causes corruption of the EHStack. Revert the part of the commit that changes the CurCodeDecl, and instead adjust the assert to check for a null CurCodeDecl. Differential Revision: https://reviews.llvm.org/D102027
-
Florian Hahn authored
Currently sinking a replicate region into another replicate region is not supported. Add an assert, to make the problem more obvious, should it occur. Discussed post-commit for ccebf7a1.
-
Florian Hahn authored
Adjust the name to make it clearer this is the region containing the target recipe, similar to SinkRegion below. Suggested post-commit for ccebf7a1.
-
peter klausler authored
Implement the reduction transformational intrinsic function NORM2 in the runtime, using infrastructure already in place for MAXVAL & al. Differential Revision: https://reviews.llvm.org/D102024
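For reference, NORM2 is the Euclidean (L2) norm reduction; a minimal sketch of the 1-D case in Python:

```python
import math

def norm2(xs):
    # Fortran's NORM2(X) is SQRT(SUM(X**2)); the real runtime may compute it
    # more carefully (e.g. with scaling) to avoid intermediate overflow.
    return math.sqrt(sum(x * x for x in xs))

assert norm2([3.0, 4.0]) == 5.0
```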
-
Petr Hosek authored
This addresses an issue introduced in D91559. We would invoke the compiler with -Lpath/to/lib --sysroot=path/to/sysroot where both locations contain libraries with the same name, but we expect the linker to pick up the library in path/to/lib since that version is more specialized. This was the case before D91559, where the sysroot path would be ignored, but after that change the linker would pick up the library from the sysroot, which resulted in unexpected behavior. The sysroot path should always come after any user-provided library paths, followed by compiler runtime paths. We want libraries in user-provided library paths to always take precedence over sysroot libraries. This matches the behavior of other toolchains used with other targets. Differential Revision: https://reviews.llvm.org/D102049
-
Hsiangkai Wang authored
We have vector operations that take a double vector and a float scalar. For example, vfwadd.wf is such an instruction:

  vfloat64m1_t vfwadd_wf(vfloat64m1_t op0, float op1, size_t op2);

We should specify both the F and D extensions for it. Differential Revision: https://reviews.llvm.org/D102051
-
Vyacheslav Zakharin authored
I want to start using LLVM component libraries in libomptarget to stop duplicating implementations already available in LLVM (e.g. LLVMObject, LLVMSupport, etc.). Without relying on LLVM in all libomptarget builds, one has to provide a fallback implementation for each LLVM feature used. This is an attempt to stop supporting out-of-llvm-tree builds of libomptarget. I understand that I may need to revert this if it affects downstream projects in a bad way. Differential Revision: https://reviews.llvm.org/D101509
-
Alexander Belyaev authored
Differential Revision: https://reviews.llvm.org/D102088
-
Alexander Belyaev authored
Differential Revision: https://reviews.llvm.org/D102089
-
Emilio Cota authored
b614ada0 ("[mlir] add support for index type in vectors.") removed this limitation. Differential Revision: https://reviews.llvm.org/D102081
-
Arthur Eubanks authored
This reverts commit 0791f968. Causing crashes: https://crbug.com/1206764
-
Florian Hahn authored
I think currently isImpliedViaMerge can incorrectly return true for phis in a loop/cycle, if the found condition involves the previous value of the phi. Consider the case in exit_cond_depends_on_inner_loop.

At some point, we call (modulo simplifications) isImpliedViaMerge(<=, %x.lcssa, -1, %call, -1). The existing code tries to prove IncV <= -1 for all incoming values IncV using the found condition (%call <= -1). At the moment this succeeds, but only because it does not compare the same runtime value. The found condition checks the value of the last iteration, but the incoming value is from the *previous* iteration. Hence we incorrectly determine that the *previous* value was <= -1, which may not be true.

I think we need to be more careful when looking at the incoming values here. In particular, we need to rule out that a found condition refers to any value that may refer to one of the previous iterations. I'm not sure there's a reliable way to do so (that also works with irreducible control flow). So for now this patch adds an additional requirement that the incoming value must properly dominate the phi block. This should ensure the values do not change in a cycle. I am not entirely sure if this will catch all cases and I appreciate a thorough second look in that regard.

Alternatively we could also unconditionally bail out in this case, instead of checking the incoming values.

Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D101829
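A tiny plain-Python illustration of the underlying pitfall (unrelated to the SCEV code itself):

```python
# A fact about the value in the last iteration implies nothing about the
# value one iteration earlier, which is what the phi's incoming value refers to.
values = [5, 3, -2]            # value of %call across successive iterations
assert values[-1] <= -1        # the found condition holds at the final check
assert not (values[-2] <= -1)  # but it did not hold in the previous iteration
```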
-
Thomas Lively authored
To improve hygiene, consistency, and usability, it would be good to replace all the macro intrinsics in wasm_simd128.h with functions. The reason for using macros in the first place was to enforce the use of constants for some arguments using `_Static_assert` with `__builtin_constant_p`. This commit switches to using functions and uses the `__diagnose_if__` attribute rather than `_Static_assert` to enforce constantness. The remaining macro intrinsics cannot be made into functions until the builtin functions they are implemented with can be replaced with normal code patterns because the builtin functions themselves require that their arguments are constants. This commit also fixes a bug with the const_splat intrinsics in which the f32x4 and f64x2 variants were incorrectly producing integer vectors. Differential Revision: https://reviews.llvm.org/D102018
-
Fangrui Song authored
-
Krzysztof Parzyszek authored
This will allow writing propagateMetadata(Inst, collectInterestingValues(...)) without concern about empty lists. In case of an empty list, Inst is returned without any changes.
-
Fangrui Song authored
-