Commits · a6e09391bbe7cb42d591f2a8c169cf4e8980e121 · Lorenzo Albano / LLVM bpEVL

May 03, 2021

[mlir][Linalg] Add a utility method to get reassociations maps for reshape. · a6e09391

MaheshRavishankar authored May 03, 2021

Given the source and destination shapes, if they are static, or if the
expanded/collapsed dimensions are unit-extent, it is possible to
compute the reassociation maps that can be used to reshape one type
into another. Add a utility method to return the reassociation maps
when possible.

This utility function can be used to fuse a sequence of reshape ops,
given the type of the source of the producer and the final result
type. This pattern supersedes a more constrained folding pattern added
to the DropUnitDims pass.
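
To illustrate the idea for the static, collapsing case, here is a minimal sketch in Python. It is not the MLIR utility itself: the function name `collapse_reassociation` and its greedy grouping strategy are assumptions made for illustration, and dynamic dimensions are not handled.

```python
from typing import List, Optional


def collapse_reassociation(source: List[int],
                           target: List[int]) -> Optional[List[List[int]]]:
    """Group consecutive source dims so each group's product is one target dim.

    Hypothetical helper for illustration only; static shapes are assumed.
    Returns None when no such grouping is found.
    """
    groups: List[List[int]] = []
    src = 0
    for tgt_size in target:
        group: List[int] = []
        product = 1
        # Always take at least one source dim, then keep taking dims until
        # the running product reaches the target extent.
        while src < len(source) and (not group or product < tgt_size):
            product *= source[src]
            group.append(src)
            src += 1
        if not group or product != tgt_size:
            return None
        groups.append(group)
    # Fold trailing unit-extent source dims into the last group.
    while groups and src < len(source) and source[src] == 1:
        groups[-1].append(src)
        src += 1
    return groups if src == len(source) else None


print(collapse_reassociation([2, 3, 4], [6, 4]))  # [[0, 1], [2]]
print(collapse_reassociation([1, 4, 1], [4]))     # [[0, 1, 2]]
print(collapse_reassociation([2, 3], [5]))        # None
```

Each inner list names the source dimensions that fold into one result dimension, which is the information a reassociation map carries.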

Differential Revision: https://reviews.llvm.org/D101343

a6e09391

[libcxx][iterator][ranges] adds `bidirectional_iterator` and `bidirectional_range` · 9c5d86aa
Christopher Di Bella authored Apr 12, 2021
```
Implements parts of:
    * P0896R4 The One Ranges Proposal

Depends on D100275.

Differential Revision: https://reviews.llvm.org/D100278
```
9c5d86aa

[mlir][sparse] fixed typo: sparse -> sparse_tensor · 90d18e10

Aart Bik authored May 03, 2021

The test passes either way, but this is the full name of the dialect.

Reviewed By: rriddle

Differential Revision: https://reviews.llvm.org/D101774

90d18e10

Revert "[MC][ELF] Work around R_MIPS_LO16 relocation handling problem" · e1babfc2

Dimitry Andric authored May 03, 2021

This reverts commit ab40c027.

Some additional test cases are influenced by the workaround, and I need
to do a complete test run to identify and check them all.

e1babfc2

[MC][ELF] Work around R_MIPS_LO16 relocation handling problem · ab40c027

Dimitry Andric authored May 03, 2021

This fixes PR49821, and avoids "ld.lld: error: test.o:(.rodata.str1.1):
offset is outside the section" errors when linking MIPS objects with
negative R_MIPS_LO16 implicit addends.

ld.lld handles R_MIPS_HI16/R_MIPS_LO16 separately, not as a whole, so it
doesn't know that an R_MIPS_HI16 with implicit addend 1 and an
R_MIPS_LO16 with implicit addend -32768 represents 32768, which is in
range of a MergeInputSection. We could introduce a new RelExpr member
(like R_RISCV_PC_INDIRECT for R_RISCV_PCREL_HI20 / R_RISCV_PCREL_LO12)
but the complexity is unnecessary given that GNU as keeps the original
symbol for this case as well.
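
The arithmetic behind that example can be checked with a short sketch. The helper names `sign_extend_16` and `combined_addend` are hypothetical; the sketch only assumes the usual pairing rule in which the HI16 addend is shifted left by 16 and the LO16 addend is sign-extended before the two are added.

```python
def sign_extend_16(value: int) -> int:
    """Interpret the low 16 bits of value as a signed 16-bit integer."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def combined_addend(hi16: int, lo16: int) -> int:
    """Combine paired HI16/LO16 implicit addends into one addend."""
    return ((hi16 & 0xFFFF) << 16) + sign_extend_16(lo16)


# HI16 addend 1 with LO16 addend -32768: 65536 - 32768 = 32768.
print(combined_addend(1, -32768))  # 32768
```

Taken alone, the LO16 addend of -32768 looks like an offset before the start of the section, which is why handling the two relocations separately produces the out-of-range error.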

Reviewed By: atanasyan, MaskRay

Differential Revision: https://reviews.llvm.org/D101773

ab40c027

[sanitizer] Set IndentPPDirectives: AfterHash in .clang-format · 2fec8860

Fangrui Song authored May 03, 2021

Code patterns like this are common: `#` at the line beginning
(https://google.github.io/styleguide/cppguide.html#Preprocessor_Directives),
with one space of indentation for if/elif/else directives.
```
#if SANITIZER_LINUX
# if defined(__aarch64__)
# endif
#endif
```

However, currently clang-format wants to reformat the code to
```
#if SANITIZER_LINUX
#if defined(__aarch64__)
#endif
#endif
```

This significantly harms readability in my review.  Use `IndentPPDirectives:
AfterHash` to defeat the diagnostic. clang-format will now suggest:

```
#if SANITIZER_LINUX
#  if defined(__aarch64__)
#  endif
#endif
```

Unfortunately there is no clang-format option to use an indent width of 1
for just preprocessor directives. However, this is still one step forward
from the current behavior.

Reviewed By: #sanitizers, vitalybuka

Differential Revision: https://reviews.llvm.org/D100238

2fec8860

Revert "[CodeGen][ARM] Implement atomicrmw as pseudo operations at -O0" · 9d86095f
Tomas Matheson authored May 03, 2021
```
This reverts commit 75318503.
```
9d86095f

[libcxx][iterator][ranges] adds `forward_iterator` and `forward_range` · fa3e2626

Christopher Di Bella authored Apr 11, 2021

Implements parts of:
    * P0896R4 The One Ranges Proposal

Depends on D100271.

Differential Revision: https://reviews.llvm.org/D100275

fa3e2626

[SimplifyCFG] Look for control flow changes instead of side effects. · ea817d79

Teresa Johnson authored Apr 28, 2021

When passingValueIsAlwaysUndefined scans for an instruction between an
inst with a null or undef argument and its first use, it was checking
for instructions that may have side effects, which is a superset of the
instructions it intended to find (as per the comments, control flow
changing instructions that would prevent reaching the uses). Switch
to using isGuaranteedToTransferExecutionToSuccessor() instead.

Without this change, when enabling -fwhole-program-vtables, which causes
assumes to be inserted by clang, we can get different simplification
decisions. In particular, when building with instrumentation FDO it can
affect the optimization decisions before FDO matching, leading to some
mismatches.

I had to modify d83507-knowledge-retention-bug.ll since this fix enables
more aggressive optimization of that code such that it no longer tested
the original bug it was meant to test. I removed the undef which still
provokes the original failure (confirmed by temporarily reverting the
fix) and also changed it to just invoke the passes of interest to narrow
the testing.

Similarly I needed to adjust code for UnreachableEliminate.ll to avoid
an undef which was causing the function body to get optimized away with
this fix.

Differential Revision: https://reviews.llvm.org/D101507

ea817d79

[WebAssembly] Fixup order of ins variables for table instructions · cd460c4d

Paulo Matos authored May 03, 2021

WebAssembly instructions should have their arguments ordered from
the deepest to the shallowest on the stack.

cd460c4d

[ValueTracking] soften assert for invertible recurrence matching · 15a42339

Sanjay Patel authored May 03, 2021

There's a TODO comment in the code and discussion in D99912
about generalizing this, but I wasn't sure how to implement that,
so just going with a potential minimal fix to avoid crashing.

The test is a reduction beyond useful code (there's no user of
%user...), but it is based on https://llvm.org/PR50191, so this
is asserting on real code.

Differential Revision: https://reviews.llvm.org/D101772

15a42339

[mlir][Linalg] Use rank-reduced versions of subtensor and subtensor insert when possible. · fd15e2b8

MaheshRavishankar authored May 03, 2021

Convert subtensor and subtensor_insert operations to use their
rank-reduced versions to drop unit dimensions.

Differential Revision: https://reviews.llvm.org/D101495

fd15e2b8

[OpenMPIRBuilder] Add createOffloadMaptypes and createOffloadMapnames functions · 63f8226f

Valentin Clement authored May 03, 2021

Add functions to create the offload_maptypes and the offload_mapnames globals. These two functions
are used in clang. They will be used in the Flang/MLIR lowering as well.

Reviewed By: Meinersbur

Differential Revision: https://reviews.llvm.org/D101503

63f8226f

[CodeGen][ARM] Implement atomicrmw as pseudo operations at -O0 · 75318503

Tomas Matheson authored Mar 31, 2021

atomicrmw instructions are expanded by AtomicExpandPass before register allocation
into cmpxchg loops. Register allocation can insert spills between the exclusive loads
and stores, which invalidates the exclusive monitor and can lead to infinite loops.

To avoid this, reimplement atomicrmw operations as pseudo-instructions and expand them
after register allocation.

Floating point legalisation:
f16 ATOMIC_LOAD_FADD(*f16, f16) is legalised to
f32 ATOMIC_LOAD_FADD(*i16, f32) and then eventually
f32 ATOMIC_LOAD_FADD_16(*i16, f32)

Differential Revision: https://reviews.llvm.org/D101164

Originally submitted as 3338290c.
Reverted in c7df6b12.

75318503

[mlir][linalg] Fix vectorization bug in vector transfer indexing map calculation · 9621c1ef

thomasraoux authored May 03, 2021

The current implementation had a bug as it was relying on the target vector
dimension sizes to calculate where to insert broadcast. If several dimensions
have the same size we may insert the broadcast on the wrong dimension. The
correct broadcast cannot be inferred from the type of the source and
destination vector.

Instead when we want to extend transfer ops we calculate an "inverse" map to the
projected permutation and insert broadcast in place of the projected dimensions.
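
The ambiguity in shape-based inference can be seen in a small sketch. It is an illustration only, not the vectorizer's logic: `broadcast_candidates` is a hypothetical helper that lists every way a source shape could sit inside a target shape as an ordered subsequence, with the remaining target dimensions treated as broadcast.

```python
from itertools import combinations
from typing import List, Sequence, Tuple


def broadcast_candidates(source_shape: Sequence[int],
                         target_shape: Sequence[int]) -> List[Tuple[int, ...]]:
    """Return each possible set of broadcast dims, judged by sizes alone."""
    rank = len(source_shape)
    all_dims = range(len(target_shape))
    matches: List[Tuple[int, ...]] = []
    for kept in combinations(all_dims, rank):
        # The kept target dims must match the source sizes in order.
        if all(target_shape[t] == s for t, s in zip(kept, source_shape)):
            matches.append(tuple(d for d in all_dims if d not in kept))
    return matches


print(broadcast_candidates((4,), (8, 4)))  # [(0,)]        one answer
print(broadcast_candidates((4,), (4, 4)))  # [(1,), (0,)]  ambiguous
```

When two target dimensions share a size, the shapes admit more than one placement, so the broadcast position has to come from the indexing map rather than from the types.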

Differential Revision: https://reviews.llvm.org/D101738

9621c1ef

[MLIR][Linalg] Avoid forward declaration in `Loops.cpp` · 456efbc0
Frederik Gossen authored May 03, 2021
```
Differential Revision: https://reviews.llvm.org/D101771
```
456efbc0

[MLIR][Linalg] Lower `linalg.tiled_loop` in a separate pass · ec339163

Frederik Gossen authored May 03, 2021

Add dedicated pass `convert-linalg-tiled-loops-to-scf` to lower
`linalg.tiled_loop`s.

Differential Revision: https://reviews.llvm.org/D101768

ec339163

[AsmParser][SystemZ][z/OS] Implement HLASM location counter syntax ("*") for Z... · ca02fab7

Anirudh Prasad authored May 03, 2021

[AsmParser][SystemZ][z/OS] Implement HLASM location counter syntax ("*") for Z PC-relative instructions.

- This patch attempts to implement the location counter syntax (*) for the HLASM variant for PC-relative instructions.
- In the HLASM variant, for purely constant relocatable values, we expect a * token preceding it, with special support for " *" which is parsed as "<pc-rel-insn 0>"
- For combinations of absolute values and relocatable values, we don't expect the "*" preceding the token.

When you have a " * "  what’s accepted is:

```
*<space>.*{.*} -> <pc-rel-insn> 0
*[+|-][constant-value] -> <pc-rel-insn> [+|-]constant-value
```

When you don’t have a " * " what’s accepted is:

```
brasl  1,func           is allowed (MCSymbolRef type)
brasl  1,func+4         is allowed (MCBinary type)
brasl  1,4+func         is allowed (MCBinary type)
brasl  1,-4+func        is allowed (MCBinary type)
brasl  1,func-4         is allowed (MCBinary type)
brasl  1,*func          is not allowed (* cannot be used for non-MCConstantExprs)
brasl  1,*+func         is not allowed (* cannot be used for non-MCConstantExprs)
brasl  1,*+func+4       is not allowed (* cannot be used for non-MCConstantExprs)
brasl  1,*+4+func       is not allowed (* cannot be used for non-MCConstantExprs)
brasl  1,*-4+8+func     is not allowed (* cannot be used for non-MCConstantExprs)
```

Reviewed By: Kai

Differential Revision: https://reviews.llvm.org/D100987

ca02fab7

[scudo] Don't track free/use stats for transfer batches. · e8f7241e

Mitch Phillips authored May 03, 2021

The Scudo C unit tests are currently non-hermetic. In particular, adding
or removing a transfer batch is a global state of the allocator that
persists between tests. This can cause flakiness in
ScudoWrappersCTest.MallInfo, because the creation or teardown of a batch
causes mallinfo's uordblks or fordblks to move up or down by the size of
a transfer batch on malloc/free.

It's my opinion that uordblks and fordblks should track the statistics
related to the user's malloc() and free() usage, and not the state of
the internal allocator structures. Thus, excluding the transfer batches
from stat collection does the trick and makes these tests pass.

Repro instructions of the bug:
1. ninja ./projects/compiler-rt/lib/scudo/standalone/tests/ScudoCUnitTest-x86_64-Test
2. ./projects/compiler-rt/lib/scudo/standalone/tests/ScudoCUnitTest-x86_64-Test --gtest_filter=ScudoWrappersCTest.MallInfo

Reviewed By: cryptoad

Differential Revision: https://reviews.llvm.org/D101653

e8f7241e

[libc++] Use the internal Lit shell to run the tests · 39bbfb77

Louis Dionne authored Apr 30, 2020

This makes the libc++ tests more portable -- almost all of them should
now work on Windows, except for some tests that assume a shell is
available on the target. We should probably provide a way to exclude
those anyway for the purpose of running tests on embedded targets.

Differential Revision: https://reviews.llvm.org/D89495

39bbfb77

[libc++] Fix template instantiation depth issues with std::tuple · 84f0bb61

Louis Dionne authored May 03, 2021

This fixes the issue by implementing _And using the short-circuiting
SFINAE trick that we previously used only in std::tuple. One thing we
could look into is using the naive recursive implementation for disjunctions
with a small number of arguments, and use that trick with larger numbers
of arguments. It might be the case that the constant overhead for setting
up the SFINAE trick makes it only worth doing for larger packs, but that's
left for further work.

This problem was raised in https://reviews.llvm.org/D96523.

Differential Revision: https://reviews.llvm.org/D101661

84f0bb61

Move MLIR python sources to mlir/python. · 9f3f6d7b

Stella Laurenzo authored Apr 28, 2021

* NFC but has some fixes for CMake glitches discovered along the way (things not cleaning properly, co-mingled depends).
* Includes previously unsubmitted fix in D98681 and a TODO to fix it more appropriately in a smaller followup.

Differential Revision: https://reviews.llvm.org/D101493

9f3f6d7b

[libc++] Disentangle std::pointer_safety · 49e7be2e

Louis Dionne authored Apr 13, 2021

This patch gets rid of technical debt around std::pointer_safety which,
I claim, is entirely unnecessary. I don't think anybody has used
std::pointer_safety in actual code because we do not implement the
underlying garbage collection support. In fact, P2186 even proposes
removing these facilities entirely from a future C++ version. As such,
I think it's entirely fine to get rid of complex workarounds whose goals
were to avoid breaking the ABI back in 2017.

I'm putting this up both to get reviews and to discuss this proposal for
a breaking change. I think we should be comfortable with making these
tiny breaks if we are confident they won't hurt anyone, which I'm fairly
confident is the case here.

Differential Revision: https://reviews.llvm.org/D100410

49e7be2e

[DebuggerTuning] Move a comment to a more useful place. · 1d299252

Paul Robinson authored May 03, 2021

The comment about how to make use of debugger tuning within DwarfDebug
really belongs inside the DwarfDebug declaration, where it will be
easier to find.

1d299252

[mlir][spirv] Add support to convert std.splat op · d51275cb
thomasraoux authored May 03, 2021
```
Differential Revision: https://reviews.llvm.org/D101511
```
d51275cb

[AMDGPU] Change FLAT Scratch SADDR to VADDR form in moveToVALU · 4d6ebe8a

Stanislav Mekhanoshin authored Apr 30, 2021

Extend the legalization of global SADDR loads and stores
with changing to VADDR to the FLAT scratch instructions.

Differential Revision: https://reviews.llvm.org/D101408

4d6ebe8a

[AIX] Remove unused vector registers from allocation order in the default AltiVec ABI · d98e5e02

Zarko Todorovski authored May 03, 2021

The previous implementation of the default AltiVec ABI marked registers V20-V31
as reserved.  This failed to prevent reserved VFRC registers being allocated.
In this patch instead of marking the registers reserved we remove unallowed
registers from the allocation order completely.

This is a slight rework of an implementation by @nemanjai

Reviewed By: jsji

Differential Revision: https://reviews.llvm.org/D100050

d98e5e02

Modules: Remove an extra early return, NFC · 64a390c1

Duncan P. N. Exon Smith authored Apr 30, 2021

Remove an early return from an `else` block that's immediately followed
by an equivalent early return after the `else` block.

Differential Revision: https://reviews.llvm.org/D101671

64a390c1

[mlir][vector] Extend vector transfer unrolling to support permutations and broadcast · f44c76d6
thomasraoux authored May 03, 2021
```
Differential Revision: https://reviews.llvm.org/D101637
```
f44c76d6

[mlir][vector] Add canonicalization for extract/insert -> shapecast · 7417541f
thomasraoux authored May 03, 2021
```
Differential Revision: https://reviews.llvm.org/D101643
```
7417541f

[libFuzzer] Deflake entropic exec-time test. · ac512890
Matt Morehouse authored May 03, 2021

ac512890

[libFuzzer] Fix off-by-one error in ApplyDictionaryEntry · 62e4dca9

Fabian Meumertzheim authored Apr 30, 2021

In the overwrite branch of MutationDispatcher::ApplyDictionaryEntry in
FuzzerMutate.cpp, the index Idx at which W.size() bytes are overwritten
with the word W is chosen uniformly at random in the interval
[0, Size - W.size()). This means that Idx + W.size() will always be
strictly less than Size, i.e., the last byte of the current unit will
never be overwritten.

This is fixed by adding 1 to the exclusive upper bound.
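
The off-by-one can be reproduced by enumerating the possible start indices instead of sampling them. This is a sketch in Python rather than the FuzzerMutate.cpp code; `start_indices` and `can_overwrite_last_byte` are hypothetical names, with `size` standing for Size and `word_len` for W.size().

```python
from typing import List


def start_indices(size: int, word_len: int, fixed: bool) -> List[int]:
    """All start indices drawn from [0, upper), where upper is exclusive."""
    upper = size - word_len + (1 if fixed else 0)
    return list(range(upper))


def can_overwrite_last_byte(size: int, word_len: int, fixed: bool) -> bool:
    """True if some start index makes the word end exactly at the last byte."""
    return any(idx + word_len == size
               for idx in start_indices(size, word_len, fixed))


print(can_overwrite_last_byte(8, 3, fixed=False))  # False
print(can_overwrite_last_byte(8, 3, fixed=True))   # True
```

With an 8-byte unit and a 3-byte word, the old bound allows start indices 0 through 4, so byte 7 is never written; the new bound adds index 5.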

Addresses https://bugs.llvm.org/show_bug.cgi?id=49989.

Reviewed By: morehouse

Differential Revision: https://reviews.llvm.org/D101625

62e4dca9

[AMDGPU] Change FLAT SADDR to VADDR form in moveToVALU · 89a94be1

Stanislav Mekhanoshin authored Apr 26, 2021

Instead of legalizing saddr operand with a readfirstlane
when address is moved from SGPR to VGPR we can just
change the opcode.

Differential Revision: https://reviews.llvm.org/D101405

89a94be1

[OpenMP] Fix non-determinism in clang task codegen · a27ca15d
Giorgis Georgakoudis authored May 02, 2021
```
Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D101739
```
a27ca15d

[mlir] Fix multidimensional lowering from std.select to llvm.select · 96a7900e

Benjamin Kramer authored May 03, 2021

The converter assumed that all operands have the same type, that's not
true for select.

Differential Revision: https://reviews.llvm.org/D101767

96a7900e

[mlir][vector][NFC] split TransposeOp lowering out of contractLowering · be8e2801

thomasraoux authored May 03, 2021

Move TransposeOp lowering into its own populate function, as in some cases
it is better to keep it during ContractOp lowering to better
canonicalize it rather than emitting scalar insert/extract.

Differential Revision: https://reviews.llvm.org/D101647

be8e2801

[docs][NewPM] Add section on analyses · 9779b664
Arthur Eubanks authored Apr 20, 2021
```
Reviewed By: asbirlea, ychen

Differential Revision: https://reviews.llvm.org/D100912
```
9779b664

[MLIR] Fix TestAffineDataCopy for test cases with no load ops · 92153575

Uday Bondhugula authored May 02, 2021

Add missing check in -test-affine-data-copy without which a test case
that has no affine.loads at all would crash this test pass. Fix two
clang-tidy warnings in the file while at it. (Not adding a test case
given the triviality.)

Differential Revision: https://reviews.llvm.org/D101719

92153575

[mlir][Python] Add casting constructor to Type and Attribute. · b57d6fe4

Stella Laurenzo authored May 02, 2021

* This makes them consistent with custom types/attributes, whose constructors will do a type checked conversion. Of course, the base classes can represent everything, so they never error.
* More importantly, this makes it possible to subclass Type and Attribute out of tree in sensible ways.

Differential Revision: https://reviews.llvm.org/D101734

b57d6fe4

[Support/Parallel] Add a special case for 0/1 items to llvm::parallel_for_each. · 5fa9d416

Chris Lattner authored May 01, 2021

This avoids the non-trivial overhead of creating a TaskGroup in these degenerate
cases, but also exposes parallelism. It turns out that the default executor
underlying TaskGroup prevents recursive parallelism - so an instance of a task
group being alive will make nested ones become serial.

This is a big issue in MLIR in some dialects, if they have a single instance of
an outer op (e.g. a firrtl.circuit) that has many parallel ops within it (e.g.
a firrtl.module). This patch side-steps the problem by avoiding creating the
TaskGroup in the unneeded case. See this issue for more details:
https://github.com/llvm/circt/issues/993

Note that this isn't a really great solution for the general case of nested
parallelism. A redesign of the TaskGroup stuff would be better, but would be
a much more invasive change.
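
The shape of the special case can be sketched in Python. This is an analogy, not the llvm::parallel_for_each implementation: a `ThreadPoolExecutor` stands in for the TaskGroup, and the function below is a hypothetical stand-in written for illustration.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, TypeVar

T = TypeVar("T")


def parallel_for_each(items: Iterable[T], fn: Callable[[T], None]) -> None:
    """Apply fn to every item, skipping the pool for 0 or 1 items."""
    work = list(items)
    if not work:
        return  # Nothing to do, so no pool is created.
    if len(work) == 1:
        # Run inline on the calling thread. No pool is alive here, so any
        # parallel loop started inside fn is free to create its own.
        fn(work[0])
        return
    with ThreadPoolExecutor() as pool:
        # Consume the iterator so exceptions raised by fn propagate.
        list(pool.map(fn, work))


results = []
parallel_for_each([10], results.append)       # runs inline
parallel_for_each([1, 2, 3], results.append)  # runs on the pool
print(sorted(results))  # [1, 2, 3, 10]
```

The single-item path matters for the nested case described above: one outer op processed inline leaves the inner loop over its many children as the only loop that creates a pool.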

Differential Revision: https://reviews.llvm.org/D101699

5fa9d416