Commits · d55be79d754890b45352611d04b6b16c4fd3c737 · Lorenzo Albano / LLVM bpEVL

Oct 21, 2021

[RISCV] Expand scalable vector CTTZ/CTLZ/CTPOP. · d55be79d
Craig Topper authored Oct 21, 2021
```
Differential Revision: https://reviews.llvm.org/D112233
```
d55be79d
Revert "[IPT] Restructure cache to allow lazy update following invalidation [NFC]" · 3781a46c
Arthur Eubanks authored Oct 21, 2021
```
This reverts commit baea663a.

Causes crashes, e.g. https://lab.llvm.org/buildbot/#/builders/77/builds/10715.
```
3781a46c
Add the papers that were applied to the latest C2x working draft · 408075ec
Aaron Ballman authored Oct 21, 2021

408075ec

Revert "[CMake] Cache the compiler-rt library search results" · ba4920e9

Petr Hosek authored Oct 21, 2021

This reverts commit 0eed292f because there are compiler-rt build failures
that appear to have been introduced by this change.

ba4920e9

[SLP] Add additional tests which caused crashes with versioning. · a4b8979a
Florian Hahn authored Sep 16, 2021

a4b8979a

[mlir:GreedyPatternRewriter] Add debug logging for pattern rewriter actions · 5652ecc3

River Riddle authored Oct 21, 2021

This effectively mirrors the logging in dialect conversion, which has proven
very useful for understanding the pattern application process.

Differential Revision: https://reviews.llvm.org/D112120

5652ecc3

[NFC] Clean up a few methods within GreedyPatternRewriter · b7144ab7
River Riddle authored Oct 21, 2021
```
Move a few methods out of line and clean up comments.
```
b7144ab7

Avoid infinity arithmetics when computing exp approximations · 21f9e4a1

Ahmed Taei authored Oct 19, 2021

Otherwise this can result in a poison value on some platforms; see https://bugs.llvm.org/show_bug.cgi?id=51204

Reviewed By: ezhulenev

Differential Revision: https://reviews.llvm.org/D112115

21f9e4a1

[test][ORC-RT] Disable x86_64 tests when target arch does not match · 92a6dd6e

Ben Langmuir authored Oct 21, 2021

When cross-compiling, these tests will fail. For now leave the host arch
check that was already there since I don't know why it was added.

92a6dd6e

[fir] Add Character helper · 13c31539

Valentin Clement authored Oct 21, 2021

This patch is extracted from D111337. It introduces the
CharacterExprHelper, which helps deal with characters in FIR.

Reviewed By: schweitz, awarzynski

Differential Revision: https://reviews.llvm.org/D112140



Co-authored-by: Jean Perier <jperier@nvidia.com>
Co-authored-by: Eric Schweitz <eschweitz@nvidia.com>
Co-authored-by: V Donaldson <vdonaldson@nvidia.com>

13c31539

[VectorCombine] fold shuffle-of-binops with common operand · 66d22b4d

Sanjay Patel authored Oct 21, 2021

shuf (bo X, Y), (bo X, W) --> bo (shuf X), (shuf Y, W)

This is motivated by an example in D111800
(although that patch avoids the problem for that particular example).

The pattern is shown in reduced form with:
https://llvm.org/PR52178
https://alive2.llvm.org/ce/z/d8zB4D

There is no difference on the PhaseOrdering test from D111800
because the aarch64 cost model says that the shuffle cost is 3 while
the fadd cost is 2.
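
The fold can be checked on a small model. The following is a minimal sketch, not the VectorCombine implementation: it assumes 4-element integer vectors, uses addition as the binop, and models a two-input shuffle whose mask indexes the concatenation of its operands. The names `shuffle2`, `before_fold`, and `after_fold` are hypothetical.

```cpp
#include <array>
#include <cstddef>

constexpr std::size_t N = 4;
using Vec = std::array<int, N>;
using Mask = std::array<std::size_t, N>;  // each index assumed to be < 2 * N

// Two-input shuffle: mask index i < N selects a[i], otherwise b[i - N].
inline Vec shuffle2(const Vec& a, const Vec& b, const Mask& m) {
    Vec r{};
    for (std::size_t i = 0; i < N; ++i)
        r[i] = m[i] < N ? a[m[i]] : b[m[i] - N];
    return r;
}

// Element-wise binop (integer add stands in for "bo").
inline Vec add(const Vec& a, const Vec& b) {
    Vec r{};
    for (std::size_t i = 0; i < N; ++i) r[i] = a[i] + b[i];
    return r;
}

// Original form: shuf (bo X, Y), (bo X, W)
inline Vec before_fold(const Vec& x, const Vec& y, const Vec& w, const Mask& m) {
    return shuffle2(add(x, y), add(x, w), m);
}

// Folded form: bo (shuf X), (shuf Y, W)
inline Vec after_fold(const Vec& x, const Vec& y, const Vec& w, const Mask& m) {
    // Both shuffle inputs on the left were built from X, so the lane index
    // reduces to a single-input mask on X.
    Mask unary{};
    for (std::size_t i = 0; i < N; ++i) unary[i] = m[i] % N;
    return add(shuffle2(x, x, unary), shuffle2(y, w, m));
}
```

The folded form uses one binop instead of two, at the price of an extra shuffle, which is why the cost model decides whether the transform applies.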

Differential Revision: https://reviews.llvm.org/D111901

66d22b4d

Reland [clang] Pass -clear-ast-before-backend in Clang::ConstructJob() · 19b07ec0

Arthur Eubanks authored Oct 06, 2021

This clears the memory used for the Clang AST before we run LLVM passes.

https://llvm-compile-time-tracker.com/compare.php?from=d0a5f61c4f6fccec87fd5207e3fcd9502dd59854&to=b7437fee79e04464dd968e1a29185495f3590481&stat=max-rss
shows significant memory savings with no slowdown (in fact -O0 slightly speeds up).

For more background, see
https://lists.llvm.org/pipermail/cfe-dev/2021-September/068930.html.

Turn this off for the interpreter since it does codegen multiple times.

Relanding with fix for -print-stats: D111973

Relanding with fix for plugins: D112190

If you'd like to use this even with plugins, consider using the features
introduced in D112096.

This can be turned off with -Xclang -no-clear-ast-before-backend.

Differential Revision: https://reviews.llvm.org/D111270

19b07ec0

[RISCV] Add a test showing incorrect VSETVLI insertion · 92673fad

Fraser Cormack authored Oct 21, 2021

This test case, reduced from an internal test failure, shows how we may
incorrectly skip the insertion of VSETVLI instructions when doing
cross-basic-block analysis.

The entry block ends in a `e32,mf2`. Its single successor, %bb.1, ends with a
`e8,mf8`, but for a mask-type instruction, so is considered compatible.
This means that the info for %bb.1 is merged into its predecessor's and so
produces a `e32,mf2`. When it comes to the last block, which requires a
`e32,mf2`, we skip the insertion of a vsetvli because all predecessors
were determined to preserve the right vtype.

However, when %bb.1 is actually laid out, it does need a
`e8,mf8` vsetvli, since the previous instruction has a different tail
policy. This means that when execution flows from %bb.1 to %bb.3, the
`vadd.vx` is misconfigured.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D112223

92673fad

[IPT] Restructure cache to allow lazy update following invalidation [NFC] · baea663a

Philip Reames authored Oct 21, 2021

This change restructures the cache used in IPT to point not to the first special instruction, but to the first instruction which *could* be special. That is, the cached reference is always equal to the first special, or comes before it in the block.

This avoids expensive block scans when we are removing special instructions from the beginning of the block. At the moment, this case is not heavily used, though it does trigger in GVN when doing CSE of calls. The main motivation was a change I'm no longer planning to move forward with, but the cache optimization seemed worthwhile as a minor perf win at low cost.

Differential Revision: https://reviews.llvm.org/D111768

baea663a

Update the title and encoding for the C++ status page · acfe7d89
Aaron Ballman authored Oct 21, 2021

acfe7d89
Update the C++ and C status pages now that Clang 13 has been released · cfca2ae1
Aaron Ballman authored Oct 21, 2021

cfca2ae1

[clang] Don't clear AST if we have consumers running after the main action · 2dcad775

Arthur Eubanks authored Oct 20, 2021

Downstream users may have Clang plugins. By default these plugins run
after the main action if they are specified on the command line.

Since these plugins are ASTConsumers, presumably they inspect the AST.
So we shouldn't clear it if any plugins run after the main action.

Reviewed By: dblaikie, hans

Differential Revision: https://reviews.llvm.org/D112190

2dcad775

Reapply [ORC-RT] Configure the ORC runtime for more architectures and platforms · b8da5947

Ben Langmuir authored Oct 20, 2021

Reapply 5692ed0c, but with the ORC runtime disabled explicitly on
CrossWinToARMLinux to match the other compiler-rt runtime libraries.

Differential Revision: https://reviews.llvm.org/D112229

---

Enable building the ORC runtime for 64-bit and 32-bit ARM architectures,
and for all Darwin embedded platforms (iOS, tvOS, and watchOS). This
covers building the cross-platform code, but does not add TLV runtime
support for the new architectures, which can be added independently.

Incidentally, stop building the Mach-O TLS support file unnecessarily on
other platforms.

Differential Revision: https://reviews.llvm.org/D112111

b8da5947

[clang] Use StringRef::contains (NFC) · dccfaddc
Kazu Hirata authored Oct 21, 2021

dccfaddc

[DebugInfo] Support typedef with btf_decl_tag attributes · f6811cec

Yonghong Song authored Sep 20, 2021

Clang patch ([1]) added support for btf_decl_tag attributes with typedef
types. This patch added llvm support including dwarf generation.
For example, for typedef
   typedef unsigned * __u __attribute__((btf_decl_tag("tag1")));
   __u u;
the following shows llvm-dwarfdump result:
   0x00000033:   DW_TAG_typedef
                   DW_AT_type      (0x00000048 "unsigned int *")
                   DW_AT_name      ("__u")
                   DW_AT_decl_file ("/home/yhs/work/tests/llvm/btf_tag/t.c")
                   DW_AT_decl_line (1)

   0x0000003e:     DW_TAG_LLVM_annotation
                     DW_AT_name    ("btf_decl_tag")
                     DW_AT_const_value     ("tag1")

   0x00000047:     NULL

  [1] https://reviews.llvm.org/D110127

Differential Revision: https://reviews.llvm.org/D110129

f6811cec

[Clang] Support typedef with btf_decl_tag attributes · b3960102

Yonghong Song authored Sep 20, 2021

Previously, the btf_decl_tag attribute supported record, field, global variable,
function and function parameter ([1], [2]). This patch added support for typedef.
The main reason is that, for a typedef of an anonymous struct/union, we can only apply
the btf_decl_tag attribute to the anonymous struct/union like below:
  typedef struct { ... } __btf_decl_tag target_type
In this case, the __btf_decl_tag attribute applies to anonymous struct,
which increases downstream implementation complexity. But if
typedef with btf_decl_tag attribute is supported, we can have
  typedef struct { ... } target_type __btf_decl_tag
which applies __btf_decl_tag to typedef "target_type", which makes it
easier to directly associate btf_decl_tag with a named type.
This patch permitted btf_decl_tag with typedef types for this reason.

 [1] https://reviews.llvm.org/D106614
 [2] https://reviews.llvm.org/D111588

Differential Revision: https://reviews.llvm.org/D110127

b3960102

[libc++] Use addressof in vector. · 56df1d80

Mark de Wever authored Oct 10, 2021

This addresses the usage of `operator&` in `<vector>`.

I now added tests for the current offending cases. I wonder whether it
would be better to add one addressof test per directory and test all
possible violations. Also to guard against possible future errors?

(Note there are still more headers with the same issue.)
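
As an illustration of why this matters, here is a minimal sketch, assuming a hypothetical element type `Hostile` that overloads unary `operator&`. It is not taken from the libc++ tests; it only shows how `&elem` and `std::addressof(elem)` can disagree.

```cpp
#include <memory>

// Hypothetical type whose overloaded operator& does not return its address.
struct Hostile {
    int value = 0;
    Hostile* operator&() { return nullptr; }
};

// What generic code gets if it writes `&elem`: the overload is called.
inline Hostile* address_via_operator(Hostile& h) { return &h; }

// std::addressof bypasses the overload and yields the real address.
inline Hostile* address_via_addressof(Hostile& h) { return std::addressof(h); }
```

Container code that stores or compares element addresses would misbehave with the first form for such a type, which is why library headers use `std::addressof`.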

Reviewed By: #libc, ldionne

Differential Revision: https://reviews.llvm.org/D111961

56df1d80

[lld-macho] Simplify lc-linker-option.ll and re-enable it on Windows · 77fdc0e5

Jez Ng authored Oct 21, 2021

While attempting to simplify it, I discovered a concerning discrepancy
between our handling of LC_LINKER_OPTION vs ld64's. In particular, ld64
does not appear to check for `-all_load` nor `-ObjC` when processing
those options. Thus, if/when we fix this behavior, no duplicate symbol
error will be expected regardless of the use-after-free. As such, I've
removed the test logic that tries to induce the duplicate symbol error.
We can just rely on ASAN to do the verification.

In order to make the test run on Windows, I've removed the symlink
logic. Both ld64 and LLD handle this un-symlinked framework just fine.

I also capitalized the framework name, since that's the typical
convention.

Reviewed By: #lld-macho, oontvoo

Differential Revision: https://reviews.llvm.org/D112195

77fdc0e5

[ORC-RT] Remove stray printf debugging output. · 5c723231
Lang Hames authored Oct 20, 2021
```
These were accidentally picked up in an earlier commit.
```
5c723231

[mlir][Linalg] Improve conv vectorization for the stride==1 case. · 203accf0

Nicolas Vasilache authored Oct 20, 2021

In the stride == 1 case, conv1d reads contiguous data along the input dimension. This can be advantageously used to bulk memory transfers and compute while avoiding unrolling. Experimentally, this can yield speedups of up to 50%.

Reviewed By: antiagainst

Differential Revision: https://reviews.llvm.org/D112139

203accf0

[libomptarget][DeviceRTL] Generalise and simplify cmakelists · a602c2b5

Jon Chesterfield authored Oct 21, 2021

Step towards building the DeviceRTL for amdgpu.

Mostly replaces cuda-specific toolchain finding logic with the
generic logic currently found in the amdgpu deviceRTL cmake. Also
deletes dead code and changes the default to build on systems
without cuda installed, as the library doesn't use cuda and the
amdgpu-only systems generally won't have cuda installed.

Reviewed By: Meinersbur

Differential Revision: https://reviews.llvm.org/D111983

a602c2b5

[InstCombine] generalize reassociated Demorgan folds · 3888de95

Sanjay Patel authored Oct 21, 2021

This updates the recent D112108 / b92412fb
to handle the flipped logic ('or') sibling:
https://alive2.llvm.org/ce/z/Y2L6Ch

3888de95

[InstCombine] add tests for DeMorgan with reassociation; NFC · 6b560a8e
Sanjay Patel authored Oct 21, 2021
```
These are direct mutations of the tests added for D112108 -
we should handle the sibling folds for 'or'.
```
6b560a8e
Do not downcast uint64_t to unsigned in UniqueID hash computation · 88303693
Kirill Bobyrev authored Oct 21, 2021
```
Context: https://reviews.llvm.org/D110925#inline-1070046
```
88303693

[runtimes] Properly handle the sysroot/triple/gcc-toolchain · 72117f2f

Louis Dionne authored Oct 12, 2021

In 395271ad, I simplified how we handled the target triple for the
runtimes. However, in doing so, we stopped considering the default
in CMAKE_CXX_COMPILER_TARGET, so we'd use the LLVM_DEFAULT_TARGET_TRIPLE
(which is the host triple) even if CMAKE_CXX_COMPILER_TARGET was specified.
This commit fixes that problem and also refactors the code so that it's
easy to see what the default value is.

The fact that nobody seems to have been broken by this makes me think
that perhaps nobody is using CMAKE_CXX_COMPILER_TARGET to specify the
triple -- but it should still work.

Differential Revision: https://reviews.llvm.org/D111672

72117f2f

[SystemZ][z/OS] Initial implementation for lowerCall on z/OS · aa3519f1

Anirudh Prasad authored Oct 21, 2021

- This patch provides the initial implementation for lowering a call on z/OS according to the XPLINK64 calling convention
- A series of changes have been made to SystemZCallingConv.td to account for these additional XPLINK64 changes including adding a new helper function to shadow the stack along with allocation of a register wherever appropriate
- For the cases of copying a f64 to a gr64 and a f128 / 128-bit vector type to a gr64, a `CCBitConvertToType` has been added and has been bitcasted appropriately in the lowering phase
- Support for the ADA register (R5) will be provided in a later patch.

Reviewed By: uweigand

Differential Revision: https://reviews.llvm.org/D111662

aa3519f1

[DAGCombiner] fold bit-hack form of usubsat · d2198771

Sanjay Patel authored Oct 21, 2021

(i8 X ^ 128) & (i8 X s>> 7) --> usubsat X, 128

I haven't found a generalization of this identity:
https://alive2.llvm.org/ce/z/_sriEQ
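
The identity can also be confirmed by brute force. The sketch below is an illustration only, not the DAGCombiner code: it models the i8 arithmetic shift with unsigned operations, and the helper names are hypothetical.

```cpp
#include <cstdint>

// Bit-hack form: (X ^ 128) & (X s>> 7) on an 8-bit value.
inline std::uint8_t bithack_form(std::uint8_t x) {
    // An arithmetic shift right by 7 smears the sign bit across the byte:
    // all ones if bit 7 is set, zero otherwise.
    const std::uint8_t sign_mask = (x & 0x80u) ? 0xFFu : 0x00u;
    return static_cast<std::uint8_t>((x ^ 0x80u) & sign_mask);
}

// Reference form: unsigned saturating subtract of 128, i.e. max(x - 128, 0).
inline std::uint8_t usubsat_128(std::uint8_t x) {
    return static_cast<std::uint8_t>(x >= 128u ? x - 128u : 0u);
}

// Compares both forms over all 256 possible i8 bit patterns.
inline bool forms_agree_for_all_i8() {
    for (unsigned v = 0; v < 256; ++v) {
        const auto x = static_cast<std::uint8_t>(v);
        if (bithack_form(x) != usubsat_128(x)) return false;
    }
    return true;
}
```

For inputs with the top bit set, the xor clears that bit (subtracting 128) and the mask keeps the result; for all other inputs the mask is zero, which matches the saturated result.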

Note: I was actually looking at the first form of the pattern in that link,
but that's part of a long chain of potential missed transforms in codegen
and IR....that I hope ends here!

The predicates for when this is profitable are a bit tricky. This version of
the patch excludes multi-use but includes custom lowering (as opposed to
legal only).

On x86 for example, we have custom lowering for some vector types, and that
uses umax and sub. So to enable that fold, we need to add use checks to avoid
regressions. Even with legal-only lowering, we could see code with extra
reg move instructions for extra uses, so that constraint would have to be
eased very carefully to avoid penalties.

Differential Revision: https://reviews.llvm.org/D112085

d2198771

[SystemZ][z/OS] Additional test coverage for validating dialect instructions for SystemZ · fa111d30

Anirudh Prasad authored Oct 21, 2021

- There are certain instructions, most notably those with extended mnemonics, that are restricted to only the gnu/att variant
- There are also certain instruction aliases/mnemonic aliases that are restricted only to the HLASM variant (see https://reviews.llvm.org/D97581, https://reviews.llvm.org/D94250 and https://reviews.llvm.org/D92185 for reference)
- This patch adds a few tests to check for the behaviour introduced in the above patches. The testing coverage could not be added in at the same time, due to parallel work being done introducing the HLASM syntax

Reviewed By: uweigand, abhina.sreeskantharajan

Differential Revision: https://reviews.llvm.org/D112172

fa111d30

[SLP]Unify vectorization of PHI and store nodes with improved tiny tree vectorization. · 3ea7877c

Alexey Bataev authored Sep 15, 2021

Vectorization of PHIs and stores is very similar, so it might be beneficial to
try to revectorize stores (like PHIs) if the total number of stores with
the same/alternate opcode is less than the vector size but the number of
stores with the same type is larger than the vector size.

Differential Revision: https://reviews.llvm.org/D109831

3ea7877c

[mlir][linalg][bufferize] Fix bufferizesToMemoryWrite for TiledLoopOp · 5f8228d3
Matthias Springer authored Oct 21, 2021
```
This is the same fix as for scf.for.

Differential Revision: https://reviews.llvm.org/D112218
```
5f8228d3
[mlir][linalg][bufferize] Fix bug in getInplaceableOpResult · 94213bc7
Matthias Springer authored Oct 21, 2021
```
Differential Revision: https://reviews.llvm.org/D112123
```
94213bc7
[mlir][linalg][bufferize] Avoid creating copies that are never read · 7a7e93f1
Matthias Springer authored Oct 21, 2021
```
Differential Revision: https://reviews.llvm.org/D111956
```
7a7e93f1

[mlir][linalg][bufferize] Eliminate InitTensorOps of InsertSliceOp sources · c5501a7a

Matthias Springer authored Oct 21, 2021

An InitTensorOp is replaced with an ExtractSliceOp on the InsertSliceOp's destination. This optimization is applied after analysis and only to InsertSliceOps that were decided to bufferize inplace. Another analysis on the new ExtractSliceOp is needed after the rewrite.

Differential Revision: https://reviews.llvm.org/D111955

c5501a7a

Relax assert in ExprConstant to a return None. · 7ff4f48a

Jon Chesterfield authored Oct 21, 2021

Fixes a compiler assert on passing a compile time integer to atomic builtins.

Assert introduced in D61522
Function changed from ->bool to ->Optional in D76646
Simplifies call sites to getIntegerConstantExpr to elide the now-redundant
isValueDependent checks.

Reviewed By: aaron.ballman

Differential Revision: https://reviews.llvm.org/D112159

7ff4f48a

[clang][deps] Make resource directory deduction configurable · b8b14b68

Jan Svoboda authored Oct 21, 2021

The `clang-scan-deps` CLI tool invokes the compiler with `-print-resource-dir` in case the `-resource-dir` argument is missing from the compilation command line. This is to enable running the tool on compilation databases that use a compiler from a different toolchain than `clang-scan-deps` itself. While this doesn't make sense when scanning modular builds (due to the `-cc1` arguments the tool generates), the tool can be used to efficiently scan for file dependencies of non-modular builds too.

This patch stops deducing the resource directory by invoking the compiler by default. This mode can still be enabled by invoking `clang-scan-deps` with `--resource-dir-recipe invoke-compiler`. The new default is `--resource-dir-recipe modify-compiler-path` which relies on the resource directory deduction taking place in `Driver::Driver` which is based on the compiler path. This makes the default more aligned with the intended usage of the tool while still allowing it to serve other use-cases.

Note that this functionality was also influenced by D108979, where the dependency scanner stopped going through `ClangTool::run`. The function tried to deduce the resource directory based on the current executable path, which might not be what the users expect when invoked from within a shared library.

Depends on D108979.

Reviewed By: dexonsmith

Differential Revision: https://reviews.llvm.org/D108366

b8b14b68