Commits · 3fffffa882c0c3702e1ce4c6eaf8a380ac4ab065 · Lorenzo Albano / LLVM bpEVL

Oct 27, 2020

[mlir][Pattern] Add a new FrozenRewritePatternList class · 3fffffa8

River Riddle authored Oct 26, 2020

This class represents a rewrite pattern list that has been frozen and is thus immutable. It replaces the uses of OwningRewritePatternList in pattern-driver-related APIs, such as dialect conversion. When PDL becomes more prevalent, this API will allow a set of patterns to be optimized once, rather than on every run of a pass.
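The freeze-once idea can be illustrated with a minimal, self-contained sketch. It does not use the MLIR API: `SketchPattern`, `SketchPatternList` and `SketchFrozenPatternList` are hypothetical stand-ins, and sorting by benefit is only a placeholder for whatever one-time optimization a real frozen list might perform.

```cpp
#include <algorithm>
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a rewrite pattern: a root op name and a benefit.
struct SketchPattern {
  std::string rootName;
  unsigned benefit;
};

// Mutable builder, playing the role of an owning pattern list.
class SketchPatternList {
public:
  void insert(std::string rootName, unsigned benefit) {
    assert(!rootName.empty() && "a pattern needs a root operation name");
    patterns.push_back({std::move(rootName), benefit});
  }
  // Hand the patterns over to the caller, leaving this list empty.
  std::vector<SketchPattern> takePatterns() { return std::move(patterns); }

private:
  std::vector<SketchPattern> patterns;
};

// Immutable list, playing the role of a frozen pattern list. The one-time
// work (here: sorting by decreasing benefit) happens in the constructor.
class SketchFrozenPatternList {
public:
  explicit SketchFrozenPatternList(SketchPatternList &&list) {
    std::vector<SketchPattern> owned = list.takePatterns();
    std::stable_sort(owned.begin(), owned.end(),
                     [](const SketchPattern &a, const SketchPattern &b) {
                       return a.benefit > b.benefit;
                     });
    impl = std::make_shared<const std::vector<SketchPattern>>(std::move(owned));
  }
  // Read-only access; there is deliberately no way to add patterns.
  const std::vector<SketchPattern> &getPatterns() const { return *impl; }

private:
  // Shared so that copies handed to several driver runs reuse the same data.
  std::shared_ptr<const std::vector<SketchPattern>> impl;
};
```

In this sketch a pass would build a `SketchPatternList` once, move it into a `SketchFrozenPatternList`, and then pass cheap copies of the frozen list to each run.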

Differential Revision: https://reviews.llvm.org/D89104

3fffffa8

[mlir][NFC] Move around the code related to PatternRewriting to improve layering · b6eb26fd

River Riddle authored Oct 26, 2020

There are several pieces of pattern rewriting infra in IR/ that really shouldn't be there. This revision moves those pieces to a better location so that they are easier to evolve in the future (e.g. with PDL). More concretely, this revision does the following:

* Create a Transforms/GreedyPatternRewriteDriver.h and move the apply*andFold methods there.
The definitions for these methods are already in Transforms/ so it doesn't make sense for the declarations to be in IR.

* Create a new lib/Rewrite library and move PatternApplicator there.
This new library will be focused on applying rewrites, and will also include compiling rewrites with PDL.

Differential Revision: https://reviews.llvm.org/D89103

b6eb26fd

[mlir][Pattern] Refactor the Pattern class into a "metadata only" class · b99bd771

River Riddle authored Oct 26, 2020

The Pattern class was originally intended to be used solely for matching operations, but that use never materialized. All of the pattern infrastructure uses RewritePattern, and the infrastructure for pure matching (Matchers.h) is implemented inline. This means that this class isn't a useful abstraction at the moment, so this revision refactors it to solely encapsulate the "metadata" of a pattern. The metadata includes the various state describing a pattern: benefit, root operation, etc. The API on PatternApplicator is updated to operate on `Pattern`s, as nothing special from `RewritePattern` is necessary.

This refactoring is also necessary for the upcoming use of PDL patterns alongside C++ rewrite patterns.

Differential Revision: https://reviews.llvm.org/D86258

b99bd771

[mlir] Add a conversion pass between PDL and the PDL Interpreter Dialect · 8a1ca2cd

River Riddle authored Oct 26, 2020

The conversion between PDL and the interpreter is split into several different parts.
** The Matcher:

The matching section of all incoming pdl.pattern operations is converted into a predicate tree and merged. Each pattern is first converted into an ordered list of predicates starting from the root operation. A predicate is composed of three distinct parts:
* Position
  - A position refers to a specific location on the input DAG, i.e. an
    existing MLIR entity being matched. These can be attributes, operands,
    operations, results, and types. Each position also defines a relation to
    its parent. For example, the operand `[0] -> 1` has a parent operation
    position `[0]` (the root).
* Question
  - A question refers to a query on a specific positional value. For
  example, an operation name question checks the name of an operation
  position.
* Answer
  - An answer is the expected result of a question. For example, when
  matching an operation with the name "foo.op", the question would be an
  operation name question, with an expected answer of "foo.op".

After the predicate lists have been created and ordered (based on occurrence of common predicates and other factors), they are formed into a tree of nodes that represents the branching flow of a pattern match. This structure allows for efficient construction and merging of the input patterns. There are currently only 4 simple nodes in the tree:
* ExitNode: Represents the termination of a match
* SuccessNode: Represents a successful match of a specific pattern
* BoolNode/SwitchNode: Branch to a specific child node based on the expected answer to a predicate question.

Once the matcher tree has been generated, this tree is walked to generate the corresponding interpreter operations.
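The merging of ordered predicate lists described above can be sketched as a prefix tree. This is a simplification under stated assumptions, not the code from the revision: positions, questions and answers are plain strings, `SketchPredicate`, `SketchMatcherNode`, `insertPattern` and `countNodes` are hypothetical names, and the Bool/Switch distinction is collapsed into a single map of children.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <memory>
#include <string>
#include <tuple>
#include <vector>

// Hypothetical, string-based stand-ins for the three parts of a predicate.
struct SketchPredicate {
  std::string position; // e.g. "[0]" for the root, "[0] -> 1" for an operand
  std::string question; // e.g. "operation name"
  std::string answer;   // e.g. "foo.op"

  bool operator<(const SketchPredicate &other) const {
    return std::tie(position, question, answer) <
           std::tie(other.position, other.question, other.answer);
  }
};

// One node of the merged matcher tree. A node with a non-empty `matched`
// list plays the role of a success node for those patterns.
struct SketchMatcherNode {
  std::map<SketchPredicate, std::unique_ptr<SketchMatcherNode>> children;
  std::vector<std::string> matched;
};

// Insert one pattern's ordered predicate list, reusing existing branches for
// any prefix it shares with previously inserted patterns.
void insertPattern(SketchMatcherNode &root,
                   const std::vector<SketchPredicate> &predicates,
                   const std::string &patternName) {
  assert(!predicates.empty() && "a pattern needs at least a root predicate");
  SketchMatcherNode *node = &root;
  for (const SketchPredicate &pred : predicates) {
    std::unique_ptr<SketchMatcherNode> &child = node->children[pred];
    if (!child)
      child = std::make_unique<SketchMatcherNode>();
    node = child.get();
  }
  node->matched.push_back(patternName);
}

// Count the nodes in the tree, including the root.
std::size_t countNodes(const SketchMatcherNode &node) {
  std::size_t count = 1;
  for (const auto &entry : node.children)
    count += countNodes(*entry.second);
  return count;
}
```

Two patterns that both start with an operation name check for "foo.op" share that branch in this sketch, which is why ordering the predicates so that common ones come first keeps the tree small.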

 ** The Rewriter:
The rewriter portion of a pattern is generated in a very straightforward manner, similarly to lowerings in other dialects. Each PDL operation that may exist within a rewrite has a mapping into the interpreter dialect. The code for the rewriter is generated within a FuncOp that is invoked by the interpreter on a successful pattern match. Referenced values defined in the matcher become inputs to the generated rewriter function.

An example lowering is shown below:

```mlir
// The following high level PDL pattern:
pdl.pattern : benefit(1) {
  %resultType = pdl.type
  %inputOperand = pdl.input
  %root, %results = pdl.operation "foo.op"(%inputOperand) -> %resultType
  pdl.rewrite %root {
    pdl.replace %root with (%inputOperand)
  }
}

// is lowered to the following:
module {
  // The matcher function takes the root operation as an input.
  func @matcher(%arg0: !pdl.operation) {
    pdl_interp.check_operation_name of %arg0 is "foo.op" -> ^bb2, ^bb1
  ^bb1:
    pdl_interp.return
  ^bb2:
    pdl_interp.check_operand_count of %arg0 is 1 -> ^bb3, ^bb1
  ^bb3:
    pdl_interp.check_result_count of %arg0 is 1 -> ^bb4, ^bb1
  ^bb4:
    %0 = pdl_interp.get_operand 0 of %arg0
    pdl_interp.is_not_null %0 : !pdl.value -> ^bb5, ^bb1
  ^bb5:
    %1 = pdl_interp.get_result 0 of %arg0
    pdl_interp.is_not_null %1 : !pdl.value -> ^bb6, ^bb1
  ^bb6:
    // This operation corresponds to a successful pattern match.
    pdl_interp.record_match @rewriters::@rewriter(%0, %arg0 : !pdl.value, !pdl.operation) : benefit(1), loc([%arg0]), root("foo.op") -> ^bb1
  }
  module @rewriters {
    // The inputs to the rewriter from the matcher are passed as arguments.
    func @rewriter(%arg0: !pdl.value, %arg1: !pdl.operation) {
      pdl_interp.replace %arg1 with(%arg0)
      pdl_interp.return
    }
  }
}
```

Differential Revision: https://reviews.llvm.org/D84580

8a1ca2cd

SourceManager: Use the same fake SLocEntry whenever it fails to load · aab50af8

Duncan P. N. Exon Smith authored Oct 19, 2020

Instead of putting a fake `SLocEntry` at `LoadedSLocEntryTable[Index]`
when it fails to load in `SourceManager::loadSLocEntry`, allocate a fake
one. Unless someone is sniffing the address of the returned `SLocEntry`
(doubtful), this won't be a functionality change. Note that
`SLocEntryLoaded[Index]` wasn't being set to `true` either before or
after this change so no accessor is ever going to look at
`LoadedSLocEntryTable[Index]`.

As a side effect, drop the `mutable` from `LoadedSLocEntryTable`.

Differential Revision: https://reviews.llvm.org/D89748

aab50af8

[NFC] Use [MC]Register in RegAllocPBQP & RegisterCoalescer · 17cdba61

Gaurav Jain authored Oct 22, 2020

Differential Revision: https://reviews.llvm.org/D90008

17cdba61

[lldb][NativePDB] fix test load-pdb.cpp · 779deb97

Zequan Wu authored Oct 26, 2020

779deb97

[clang][NFC] Rearrange Comment Token and Lexer fields to reduce padding · b698ad00

Nathan James authored Oct 27, 2020

Rearrange the fields to reduce the size of the classes
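As an illustration of why field order affects class size, here is a small sketch with hypothetical structs (`SketchTokenBefore`, `SketchTokenAfter`); they are not the actual Clang classes touched by this commit. Exact sizes depend on the target ABI, so the code only compares the two layouts rather than assuming specific numbers.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical layout with small members interleaved between large ones.
struct SketchTokenBefore {
  bool isValid;        // 1 byte, typically followed by padding
  std::uint64_t begin; // 8 bytes
  std::uint8_t kind;   // 1 byte, typically followed by padding
  std::uint64_t end;   // 8 bytes
};

// Same fields, ordered from largest to smallest alignment requirement so the
// small members can share one padded tail.
struct SketchTokenAfter {
  std::uint64_t begin;
  std::uint64_t end;
  std::uint8_t kind;
  bool isValid;
};

// Bytes saved per object by the reordering on the current target.
std::size_t bytesSaved() {
  assert(sizeof(SketchTokenAfter) <= sizeof(SketchTokenBefore));
  return sizeof(SketchTokenBefore) - sizeof(SketchTokenAfter);
}
```

On common 64-bit ABIs the first layout needs padding after each one-byte member, while the second needs it only once at the end.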

Reviewed By: gribozavr2

Differential Revision: https://reviews.llvm.org/D90127

b698ad00

Fix checking for C++98 ICEs in C++11-and-later mode to not consider use · a5c7b468

Richard Smith authored Oct 26, 2020

of a reference to be acceptable.

a5c7b468

[PowerPC] Implement Set Boolean Condition Instructions · 803cc3af

Amy Kwan authored Oct 26, 2020

This patch implements the set boolean condition instructions introduced in
POWER10.

The set boolean condition instructions (set[n]bc[r]) are used during
the following situations:
- sign/zero/any extending i1 to an i32 or i64,
- reg+reg, reg+imm or floating point comparisons being sign/zero extended to i32 or i64,
- spilling CR bits (using the setnbc instruction)

Differential Revision: https://reviews.llvm.org/D87705

803cc3af

[profile] Suppress spurious 'expected profile to require unlock' warning · a77a739a

Vedant Kumar authored Oct 26, 2020

In %c (continuous sync) mode, avoid attempting to unlock an
already-unlocked profile.

The profile is only locked when profile merging is enabled.

a77a739a

[DebugInfo] Expose Fortran array debug info attributes through DIBuilder. · 5b3bf8b4

Adrian Prantl authored Oct 26, 2020

Support for a few debug info attributes specifically for Fortran
arrays has been added to LLVM recently, but there's no way to take
advantage of them through DIBuilder. This patch extends
DIBuilder::createArrayType to enable setting those attributes.

Patch by Chih-Ping Chen!

Differential Revision: https://reviews.llvm.org/D89817

5b3bf8b4

[mlir][Linalg] Miscellaneous enhancements to cover more fusion cases. · 78f37b74

MaheshRavishankar authored Oct 26, 2020

Adds support for
- Dropping unit dimension loops for indexed_generic ops.
- Folding consecutive folding (or expanding) reshapes when the result
  (or src) is a scalar.
- Fixes to indexed_generic -> generic fusion when zero-dim tensors are
  involved.

Differential Revision: https://reviews.llvm.org/D90118

78f37b74

Explicitly check for entry basic block, rather than relying on MachineBasicBlock::pred_empty. · 0b2f4cdf

Rahman Lavaee authored Oct 26, 2020

Sometimes in unoptimized code, we have dangling unreachable basic blocks with no predecessors. Basic block sections should be emitted for those as well. Without this patch, the included test fails with a fatal error in `AsmPrinter::emitBasicBlockEnd`.

Reviewed By: tmsriram

Differential Revision: https://reviews.llvm.org/D89423

0b2f4cdf

Fixed release build after D89170 · d176e13c

Stanislav Mekhanoshin authored Oct 26, 2020

d176e13c

Oct 26, 2020

[mlir] Document 'ParentOneOf' with the HasParent trait · 745c1671

Stephen Neuendorffer authored Oct 26, 2020

Differential Revision: https://reviews.llvm.org/D90197

745c1671

[cmake] Add LLVM_UBSAN_FLAGS, to allow overriding UBSan flags · 905f874c

Vedant Kumar authored Oct 14, 2020

Allow overriding the default set of flags used to enable UBSan when
building llvm.

This can be used to test new checks or opt out of certain checks.

Differential Revision: https://reviews.llvm.org/D89439

905f874c

IR: Clarify ownership of ConstantDataSequentials, NFC · b2b7cf39

Duncan P. N. Exon Smith authored Oct 23, 2020

Change `ConstantDataSequential::Next` to a
`unique_ptr<ConstantDataSequential>` and update `CDSConstants` to a
`StringMap<unique_ptr<ConstantDataSequential>>`, making the ownership
more obvious.
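The ownership structure described here can be sketched without LLVM. The following is an assumption-laden illustration, not the code from the patch: `SketchNode`, `SketchNodeMap`, `getOrCreate` and `chainLength` are hypothetical, `std::map` stands in for `StringMap`, and an integer `typeId` stands in for the type that distinguishes entries sharing the same key.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <memory>
#include <string>

// Hypothetical node type: each node owns the next node in its chain.
struct SketchNode {
  explicit SketchNode(int typeId) : typeId(typeId) {}
  int typeId;
  std::unique_ptr<SketchNode> next;
};

// Hypothetical owner keyed by raw data. The map owns the head of each chain.
using SketchNodeMap = std::map<std::string, std::unique_ptr<SketchNode>>;

// Return the node for (bytes, typeId), creating it at the end of the chain
// stored under `bytes` if it does not exist yet.
SketchNode *getOrCreate(SketchNodeMap &map, const std::string &bytes,
                        int typeId) {
  std::unique_ptr<SketchNode> *slot = &map[bytes];
  while (*slot) {
    if ((*slot)->typeId == typeId)
      return slot->get();
    slot = &(*slot)->next;
  }
  assert(!*slot && "the loop only exits at an empty slot");
  *slot = std::make_unique<SketchNode>(typeId);
  return slot->get();
}

// Length of the chain stored under `bytes` (0 if there is none).
std::size_t chainLength(const SketchNodeMap &map, const std::string &bytes) {
  std::size_t length = 0;
  auto it = map.find(bytes);
  if (it != map.end())
    for (const SketchNode *n = it->second.get(); n; n = n->next.get())
      ++length;
  return length;
}
```

With this shape, destroying the map releases every chain without any manual `delete`, which is the kind of obvious ownership the commit message refers to.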

Differential Revision: https://reviews.llvm.org/D90083

b2b7cf39

[MLIR] Fix AttributeInterface declaration. · db4863ff

Ulysse Beaugnon authored Oct 26, 2020

Substitutes `Type` by `Attribute` in the declaration of AttributeInterface. It
looks like the code was written by copy-pasting the definition of TypeInterface,
but the substitution of Type by Attribute was missing at some places.

Reviewed By: rriddle, ftynse

Differential Revision: https://reviews.llvm.org/D90138

db4863ff

[CodeView] Emit static data members as S_CONSTANTs. · 51597322

Amy Huang authored Oct 07, 2020

We used to only emit static const data members in CodeView as
S_CONSTANTS when they were used; this patch makes it so they are always emitted.

I changed CodeViewDebug.cpp to find the static const members from the
class debug info instead of creating DIGlobalVariables in the IR
whenever a static const data member is used.

Bug: https://bugs.llvm.org/show_bug.cgi?id=47580

Differential Revision: https://reviews.llvm.org/D89072

51597322

[TargetRegisterInfo] Fix a couple of typos in the comments · 78a7941e

Quentin Colombet authored Oct 26, 2020

Spotted by Nicolas Guillemot <nguillemot@apple.com>.

Thanks Nicolas!

NFC

78a7941e

[mlir] Do not print back 0 alignment in LLVM dialect 'alloca' op · 03e6f40c

Alex Zinenko authored Oct 26, 2020

The alignment attribute in the 'alloca' op treats the '0' value as 'unset'.
When parsing the custom form of the 'alloca' op, ignore the alignment attribute
if its value is '0' instead of actually creating it and producing a textually
slightly different yet semantically equivalent form in the output.

Reviewed By: rriddle

Differential Revision: https://reviews.llvm.org/D90179

03e6f40c

[nfc] [lldb] Refactor DWARFUnit::GetDIE · 7611c5bb

Jan Kratochvil authored Oct 26, 2020

Reduce indentation of the code by early returns for failed code paths.

7611c5bb

[NFC] Fixing comment heading for MachineStableHash.h. · 4f98eaf6

Puyan Lotfi authored Oct 26, 2020

Wrong filename and description.

4f98eaf6

[libc++] Remove the reliance of several <random> tests on <iostream> · d1afe2e2

Louis Dionne authored Oct 26, 2020

d1afe2e2

[mlir] NFC: properly align IR in comments · f52b4a65

Lei Zhang authored Oct 26, 2020

Reviewed By: aartbik

Differential Revision: https://reviews.llvm.org/D90164

f52b4a65

[AMDGPU] Use flat scratch instructions where available · 038d884a

Stanislav Mekhanoshin authored Oct 21, 2020

The support is disabled by default. So far there is instruction
selection, spilling, and frame elimination. It also changes SP
from unswizzled to swizzled as used by flat scratch instructions,
so it cannot be mixed with MUBUF stack access.

At the very least missing:

- GlobalISel;
- Some optimizations in frame elimination in between vector
  and scalar ALU;
- It should eventually allow always materializing the frame index
  as an SGPR, but that is not implemented and frame elimination
  cannot handle it yet;
- Unaligned and/or multidword flat scratch shall work, but it
  is legalized now for MUBUF;
- Operand folding cannot optimize FI like with MUBUF yet;
- It will need scaling the value of the SP/FP in the DWARF
  expression to recover the unswizzled scratch address;

Differential Revision: https://reviews.llvm.org/D89170

038d884a

Run test only if X86 target is available · c551ba0e

Kiran Chandramohan authored Oct 26, 2020

This fixes failures in AArch64 buildbots by running the
clang/test/CodeGen/X86/att-inline-asm-prefix.c only when the X86
target is available.

c551ba0e

Prepend "__uniq" to symbol names hash with -funique-internal-linkage-names. · ad1b9daa

Sriraman Tallam authored Oct 26, 2020

Prepend the module name hash with a fixed string ".__uniq." which helps tools
that consume sampled profiles and attribute it to functions to understand
that this symbol belongs to a unique internal linkage type symbol.

Symbols with suffixes can result from various optimizations in the compiler:
function multiversioning, function splitting, parameter constant propagation,
and unique internal linkage names.

External tools like sampled profile aggregators combine profiles from multiple
runs of a binary. They use various heuristics with symbols that have suffixes
to try and attribute the profile to the right function instance. For instance
multi-versioned symbols like foo.avx, foo.sse4.2, etc even though different
should be attributed to the same source function if a single function is
versioned, using attribute target_clones (supported in GCC but yet to land in
LLVM). Similarly, functions that are split (split part having a .cold suffix)
could have profiles for both the original and split symbols but would be
aggregated and attributed to the original function that was split.

Unique internal linkage functions however have different source instances and
the aggregator must not put them together but attribute it to the appropriate
function instance. To be sure that we are dealing with a symbol of a unique
internal linkage function, we would like to prepend the hash with a known
string ".__uniq." which these tools can check to understand the suffix type.
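A profile aggregator could use the marker roughly as sketched below. Only the string ".__uniq." comes from the commit message. The functions `hasUniqueInternalLinkageMarker` and `profileKey` are hypothetical, and the sketch assumes that suffixes are separated by '.' and that the hash following the marker contains no '.' itself.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Marker string taken from the commit message.
const std::string kUniqMarker = ".__uniq.";

// True if the symbol carries the unique-internal-linkage marker.
bool hasUniqueInternalLinkageMarker(const std::string &symbol) {
  return symbol.find(kUniqMarker) != std::string::npos;
}

// Hypothetical aggregation key. Ordinary suffixes (".cold", ".avx", ...) are
// stripped so their profiles merge into the base function, while the
// ".__uniq.<hash>" part is kept so distinct source functions stay separate.
std::string profileKey(const std::string &symbol) {
  assert(!symbol.empty() && "expected a symbol name");
  std::size_t marker = symbol.find(kUniqMarker);
  if (marker == std::string::npos)
    return symbol.substr(0, symbol.find('.'));
  // Assumption: the hash runs up to the next '.' or the end of the name.
  std::size_t hashEnd = symbol.find('.', marker + kUniqMarker.size());
  return symbol.substr(0, hashEnd);
}
```

Under these assumptions, `foo.cold` and `foo.avx` both map to `foo`, whereas `foo.__uniq.123` and `foo.__uniq.456` remain distinct keys.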

Differential Revision: https://reviews.llvm.org/D89617

ad1b9daa

[libunwind] Add -Wno-dll-attribute-on-redeclaration when building for windows · df6d2e8a

Martin Storsjö authored Oct 23, 2020

It's not worth trying to fix these warnings within libunwind, instead
silence them.

Differential Revision: https://reviews.llvm.org/D90075

df6d2e8a

[NFC] Remove max_align.c LIT testcase · 357715ce

Xiangling Liao authored Oct 26, 2020

Since we fixed the definition of `SuitableAlign`[https://reviews.llvm.org/D88659],
`max_align_t` and `__BIGGEST_ALIGNMENT__` are not necessarily the same always.

The original testcase was added here: https://reviews.llvm.org/D59048

Differential Revision: https://reviews.llvm.org/D90187

357715ce

Test to check backtraces with machine function splitting. · 9aa7a721

Sriraman Tallam authored Oct 26, 2020

clang supports option -fsplit-machine-functions and this test checks if the
backtraces are sane when functions are split.

With -fsplit-machine-functions, a function with profiles can get split into 2
parts, the original function containing hot code and a cold part as determined
by the profile info and the cold cutoff threshold. The cold part gets the
".cold" suffix to disambiguate its symbol from the hot part and can be placed
arbitrarily in the address space.

This test checks if the back-trace looks correct when the cold part is executed.

Differential Revision: https://reviews.llvm.org/D90081

9aa7a721

Avoid unnecessary uses of `MDNode::getTemporary`, NFC · d4c667c9

Duncan P. N. Exon Smith authored Oct 23, 2020

This is a long-delayed follow-up to
5e5b8509.

`TempMDNode` includes a bunch of machinery for RAUW, and should only be
used when necessary. RAUW wasn't being used in any of these cases... it
was just a placeholder for a self-reference.

Where the real node was using `MDNode::getDistinct`, just replace the
temporary argument with `nullptr`.

Where the real node was using `MDNode::get`, the `replaceOperandWith`
call was "promoting" the node to a distinct one implicitly due to
self-reference detection in `MDNode::handleChangedOperand`. The
`TempMDNode` was serving a purpose by delaying uniquing, but it's way
simpler to just call `MDNode::getDistinct` in the first place.

Note that using a self-reference at all in these places is a hold-over
from before `distinct` metadata existed. It was an old trick to create
distinct nodes. It would be intrusive to change, including bitcode
upgrades, etc., and it's harmless so I'm not sure there's much value in
removing it from existing schemas. After this commit it still has a tiny
memory cost (in the extra metadata operand) but no more overhead in
construction.

Differential Revision: https://reviews.llvm.org/D90079

d4c667c9

[libc++] Get rid of <iostream> in a filesystem test · 89ec5091

Louis Dionne authored Oct 26, 2020

89ec5091

[MemProf] Decouple memprof build from COMPILER_RT_BUILD_SANITIZERS · ba71a074

Teresa Johnson authored Oct 26, 2020

The MemProf compiler-rt support relies on some of the support only built
when COMPILER_RT_BUILD_SANITIZERS was enabled. This showed up in some
initial bot failures, and I addressed those by making the memprof
runtime build also conditional on COMPILER_RT_BUILD_SANITIZERS
(3ed77ecd). However, this resulted in
another inconsistency with how the tests were set up that was hit by
Chromium:
  https://bugs.chromium.org/p/chromium/issues/detail?id=1142191

Undo the original bot fix and address this with a more comprehensive fix
that enables memprof to be built even when COMPILER_RT_BUILD_SANITIZERS
is disabled, by also building the necessary pieces under
COMPILER_RT_BUILD_MEMPROF.

Tested by configuring with a command similar to the one used in the
failing Chromium configure. I reproduced the Chromium failure, as well
as the original bot failure I tried to fix in
3ed77ecd, with that fix reverted.
Confirmed it now works.

Differential Revision: https://reviews.llvm.org/D90190

ba71a074

[AIX] Also error on -G for link-only step · 3d4aebbb

Xiangling Liao authored Oct 26, 2020

Error on -G on AIX for all modes (preprocess, assemble, compile, link).

Differential Revision: https://reviews.llvm.org/D90063

3d4aebbb

[InstCombine] add folds for icmp+ctpop · 5a6e66ec

Sanjay Patel authored Oct 26, 2020

https://alive2.llvm.org/ce/z/XjFPQJ

  define void @src(i64 %value) {
    %t0 = call i64 @llvm.ctpop.i64(i64 %value)
    %gt = icmp ugt i64 %t0, 63
    %lt = icmp ult i64 %t0, 64
    call void @use(i1 %gt, i1 %lt)
    ret void
  }

  define void @tgt(i64 %value) {
    %eq = icmp eq i64 %value, -1
    %ne = icmp ne i64 %value, -1
    call void @use(i1 %eq, i1 %ne)
    ret void
  }

  declare i64 @llvm.ctpop.i64(i64) #1
  declare void @use(i1, i1)
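The reasoning behind the fold is that a population count can only reach the bit width when every bit is set. The sketch below checks the 8-bit analogue exhaustively in plain C++ and offers a 64-bit helper for spot checks. The function names are hypothetical, and this is a sanity check of the arithmetic, not the InstCombine implementation.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Exhaustively check the 8-bit analogue of the fold:
//   ctpop(x) > 7  <=>  x == 0xFF   and   ctpop(x) < 8  <=>  x != 0xFF
bool ctpopFoldHoldsFor8Bits() {
  for (unsigned value = 0; value <= 0xFF; ++value) {
    std::size_t pop = std::bitset<8>(value).count();
    bool gt = pop > 7;       // analogue of `icmp ugt ctpop(x), 63`
    bool lt = pop < 8;       // analogue of `icmp ult ctpop(x), 64`
    bool eq = value == 0xFF; // analogue of `icmp eq x, -1`
    bool ne = value != 0xFF; // analogue of `icmp ne x, -1`
    if (gt != eq || lt != ne)
      return false;
  }
  return true;
}

// 64-bit form of the left-hand side, for spot checks against `x == -1`.
bool ctpopExceeds63(std::uint64_t value) {
  return std::bitset<64>(value).count() > 63;
}
```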

5a6e66ec

[InstCombine] add tests for ctpop at bitwidth limit; NFC · 05f011b2

Sanjay Patel authored Oct 26, 2020

05f011b2

[InstCombine] reduce code duplication in icmp intrinsic folds; NFC · 437d7551

Sanjay Patel authored Oct 26, 2020

437d7551

[libc++] NFC: Minor refactoring in filesystem_test_helper.h to ease readability · b03ea054

Louis Dionne authored Oct 26, 2020

The variable declarations interleaved with logic were really difficult
to read. Instead, simply have two different implementations for _WIN32
and others.

b03ea054