Commits · 4388c979dabbbe763275c1f94293f19b56c8ed3d · Lorenzo Albano / LLVM bpEVL

Apr 07, 2022

[VPlan] Use vector.body as header name in VPlan native path. · 4388c979
Florian Hahn authored Apr 07, 2022
```
This brings the VPlan block naming in line with the naming of the
generated basic blocks.
```
4388c979

[RISCV][VP] Add basic RVV codegen for vp.fcmp · 8216255c

Fraser Cormack authored Apr 04, 2022

This patch adds the necessary infrastructure to lower vp.fcmp via
ISD::VP_SETCC to RVV instructions.

Most notably this patch adds cond-code legalization for VP_SETCC,
reusing the existing TargetLowering::LegalizeSetCCCondCode by passing in
additional SDValue parameters for the Mask and EVL. This method then
uses VP operations to legalize the condcode.

There is still a general lack of canonicalization on VP_SETCC as opposed
to SETCC which results in worse code than is theoretically possible.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D123051

8216255c

[x86] Improve select lowering for smin(x, 0) & smax(x, 0) · 842d0bf9

Wei Xiao authored Apr 05, 2022

smin(x, 0):
  (select (x < 0), x, 0) -> ((x >> (size_in_bits(x)-1))) & x

smax(x, 0):
  (select (x > 0), x, 0) -> (~(x >> (size_in_bits(x)-1))) & x
  Because the comparison tests for a positive value, we have to invert the sign
  bit mask, so only do that transform if the target has a bitwise 'and not'
  instruction (the invert is then free).

The transform is performed only when the CMP has a single user, to avoid
increasing the total instruction count.
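The two identities above can be checked on plain 32-bit integers. The following is a minimal sketch, not the x86 lowering code itself; the helper names (`signMask`, `sminZero`, `smaxZero`, `checkAgainstSelect`) are made up for illustration, and it assumes that `>>` on a negative signed value is an arithmetic shift (guaranteed since C++20, and the behavior of mainstream compilers before that).

```cpp
#include <cassert>
#include <cstdint>

// All-ones when x is negative, zero otherwise: x >> (size_in_bits(x) - 1).
// Assumes an arithmetic right shift for negative values.
inline int32_t signMask(int32_t x) { return x >> 31; }

// smin(x, 0): (select (x < 0), x, 0) -> (x >> 31) & x
inline int32_t sminZero(int32_t x) { return signMask(x) & x; }

// smax(x, 0): (select (x > 0), x, 0) -> ~(x >> 31) & x
// The inverted mask is why the transform wants an 'and not' instruction.
inline int32_t smaxZero(int32_t x) { return ~signMask(x) & x; }

// Compares the bitwise forms against the original select forms.
inline void checkAgainstSelect(int32_t x) {
  assert(sminZero(x) == (x < 0 ? x : 0));
  assert(smaxZero(x) == (x > 0 ? x : 0));
}
```

Calling `checkAgainstSelect` on a few boundary values (0, -1, `INT32_MIN`, `INT32_MAX`) is a quick sanity check; the Alive2 links below remain the authoritative proof.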

https://alive2.llvm.org/ce/z/euUnNm
https://alive2.llvm.org/ce/z/37339J

Differential Revision: https://reviews.llvm.org/D123109

842d0bf9

[LoopSink] Use MemorySSA with legacy pass manager · 674ee4d3

Nikita Popov authored Apr 06, 2022

LoopSink with the legacy pass manager still uses AST, because we
can't compute MemorySSA conditionally. I think now that the legacy
pass manager will be removed soon(TM) we don't need to care about
compile-time impact here anymore. Additionally, since MemorySSA is
no longer eagerly optimized, the impact is actually not that high
anymore (~0.2% geomean regression on CTMark).

This just makes legacy PM and new PM behavior line up -- as a
followup I'll drop these options entirely and make MemorySSA use
mandatory.

Differential Revision: https://reviews.llvm.org/D123216

674ee4d3

[AMDGPU] Fix test difference in debug and release. NFC. · 78cb11c8

Stanislav Mekhanoshin authored Apr 06, 2022

Added -disable-gisel-legality-check to a couple of GlobalISel tests
that contain illegal instructions, to avoid differences between
debug and release builds.

78cb11c8

[RISCV] Add CMOV isel pattern for (select (setgt X, Imm), Y, Z) · f8911235
Liqin Weng authored Apr 07, 2022
```
Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D122644
```
f8911235

Reland "[Driver] Default CLANG_DEFAULT_PIE_ON_LINUX to ON"" · 2aca33ba

Fangrui Song authored Apr 06, 2022

(The upgrade of the ppc64le bot and D121257 have fixed compiler-rt failures. Tested by nemanjai.)

Default the option introduced in D113372 to ON to match all(?) major Linux
distros. This matches GCC and improves consistency with Android and linux-musl
which always default to PIE.
Note: CLANG_DEFAULT_PIE_ON_LINUX may be removed in the future.

Differential Revision: https://reviews.llvm.org/D120305

2aca33ba

[CSKY] Fix some Clang warnings. NFC · ef437a7d
Fangrui Song authored Apr 06, 2022
```
Reviewed By: zixuan-wu

Differential Revision: https://reviews.llvm.org/D122872
```
ef437a7d

AMDGPU: Handle private atomics · e6012c8e

Matt Arsenault authored Apr 05, 2022

Use new NotAtomic expansion to turn these into the equivalent
non-atomic operations. Independent lanes cannot access the private
memory of other lanes, so there's no possibility for synchronization.

These don't really appear directly in user code, but
InferAddressSpaces can make these appear after optimizations.

Fixes issues 54693 and 54274.

e6012c8e

AtomicExpand: Add NotAtomic lowering strategy · 7f14a1d4

Matt Arsenault authored Apr 05, 2022

Currently LowerAtomics exists as a separate pass which blindly
replaces all atomics. Add a new lowering strategy option to eliminate
the atomics which the target can control on a per-instruction level.

7f14a1d4

AtomicExpand: Change return type for shouldExpandAtomicStoreInIR · c4ea925f

Matt Arsenault authored Apr 05, 2022

Use the same enum as the other atomic instructions for consistency, in
preparation for addition of another strategy.

Introduce a new "Expand" option, since the store expansion does not
use cmpxchg. Alternatively, the existing CmpXChg strategy could be
renamed to Expand.

c4ea925f

[RISCV] Supplement patterns for vnsrl.wx/vnsra.wx when splat shift is sext or zext · 1b547799
Lian Wang authored Mar 31, 2022
```
Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D122786
```
1b547799
[AMDGPU] Check SI LDS offset bug in the allowsMisalignedMemoryAccesses · a41a676e
Stanislav Mekhanoshin authored Apr 06, 2022
```
Differential Revision: https://reviews.llvm.org/D123268
```
a41a676e
[gn build] Port 39f15686 · 7ac2e30f
LLVM GN Syncbot authored Apr 07, 2022

7ac2e30f

Transforms: Split LowerAtomics into separate Utils and pass · 39f15686

Matt Arsenault authored Apr 05, 2022

This will allow code sharing from AtomicExpandPass. Not entirely sure
why these exist as separate passes though.

39f15686

[mlir:PDL] Expand how native constraint/rewrite functions can be defined · ea64828a

River Riddle authored Mar 19, 2022

This commit refactors the expected form of native constraint and rewrite
functions, and greatly reduces the necessary user complexity required when
defining a native function. Namely, this commit adds in automatic processing
of the necessary PDLValue glue code, and allows for users to define
constraint/rewrite functions using the C++ types that they actually want to
use.

As an example, consider a simple rewrite as it is defined today:

```
static void rewriteFn(PatternRewriter &rewriter, PDLResultList &results,
                      ArrayRef<PDLValue> args) {
  ValueRange operandValues = args[0].cast<ValueRange>();
  TypeRange typeValues = args[1].cast<TypeRange>();
  ...
  // Create an operation at some point and pass it back to PDL.
  Operation *op = rewriter.create<SomeOp>(...);
  results.push_back(op);
}
```

After this commit, that same rewrite could be defined as:

```
static Operation *rewriteFn(PatternRewriter &rewriter, ValueRange operandValues,
                            TypeRange typeValues) {
  ...
  // Create an operation at some point and pass it back to PDL.
  return rewriter.create<SomeOp>(...);
}
```

Differential Revision: https://reviews.llvm.org/D122086

ea64828a

[MIPS] Initial support for MIPS-I load delay slots · 303c1801

Simon Dardis authored Apr 07, 2022

LLVM so far has only supported the MIPS-II and above architectures. MIPS-II is pretty close to MIPS-I, the major difference
being that "load" instructions always take one extra instruction slot to propagate to registers. This patch adds support for
MIPS-I by adding hazard handling for load delay slots, alongside MIPSR6 forbidden slots and FPU slots, inserting a NOP
instruction between a load and any instruction immediately following that reads the load's destination register. I also
included a simple regression test. Since no existing tests target MIPS-I, those all still pass.
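
The load delay hazard described above can be sketched on a toy instruction list. This is only an illustration, not the MIPS backend pass: the `Inst` struct and `fillLoadDelaySlots` are hypothetical, registers are plain integers, and the sketch ignores basic-block boundaries and the other hazard kinds the patch mentions.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy instruction: one defined register, a list of read registers.
struct Inst {
  std::string Op;
  int Def;               // destination register, -1 if none
  std::vector<int> Uses; // registers read by this instruction
  bool IsLoad;
};

// Inserts a NOP after every load whose destination register is read by the
// instruction immediately following it.
inline std::vector<Inst> fillLoadDelaySlots(const std::vector<Inst> &In) {
  std::vector<Inst> Out;
  for (std::size_t I = 0; I < In.size(); ++I) {
    Out.push_back(In[I]);
    if (!In[I].IsLoad || I + 1 == In.size())
      continue;
    const Inst &Next = In[I + 1];
    bool ReadsLoadDest =
        std::find(Next.Uses.begin(), Next.Uses.end(), In[I].Def) !=
        Next.Uses.end();
    if (ReadsLoadDest)
      Out.push_back(Inst{"nop", -1, {}, false});
  }
  return Out;
}

// Small self-check: a dependent pair gets a NOP between its two instructions.
inline void loadDelayDemo() {
  std::vector<Inst> Prog = {{"lw", 2, {4}, true}, {"addu", 3, {2, 5}, false}};
  std::vector<Inst> Fixed = fillLoadDelaySlots(Prog);
  assert(Fixed.size() == 3 && Fixed[1].Op == "nop");
}
```

An independent instruction after the load needs no NOP, so the sketch leaves such sequences unchanged.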

Issue ref: https://github.com/simias/psx-sdk-rs/issues/1

I also tested by building a simple demo app with Clang and running it in an emulator.

Patch by: @impiaaa

Differential Revision: https://reviews.llvm.org/D122427

303c1801

[AMDGPU] Regenerate global isel lds ops test checks. NFC. · 09c2b7c3
Stanislav Mekhanoshin authored Apr 06, 2022

09c2b7c3

[MSSA] Print memory phis when inspecting walker. · 50d41f3e

Alina Sbirlea authored Apr 06, 2022

This makes the MemorySSA and MemorySSA Walker printers consistent.
The invocation `-print<memoryssa-walker>` should also print the MemoryPhis.

50d41f3e

Revert · 08075a7e

Alina Sbirlea authored Mar 29, 2022

Roll-forward 29fada4a.
Issue triggered was due to UB.

Differential Revision: https://reviews.llvm.org/D121987

08075a7e

[mips] Remove stale comment (NFC) · 8e1d9f00
Simon Dardis authored Apr 06, 2022
```
Test commit for my current email address.
```
8e1d9f00
gn build: Fix some tests for host_os to instead check current_os. · 38f92009
Peter Collingbourne authored Apr 06, 2022
```
Should fix Windows build:
http://45.33.8.238/win/55809/step_4.txt
```
38f92009

[AArch64][AMDGPU][WebAssembly] Use static_cast instead of a reinterpret_cast... · 1235aaef

Craig Topper authored Apr 06, 2022

[AArch64][AMDGPU][WebAssembly] Use static_cast instead of a reinterpret_cast to downcast in parseMachineFunctionInfo. NFC

static_cast is a little safer here since the compiler will
ensure we're casting to a class derived from
yaml::MachineFunctionInfo.
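
A small sketch of the difference, using stand-in types rather than the real `yaml::MachineFunctionInfo` hierarchy (`BaseInfoSketch`, `TargetInfoSketch` and `readExtra` are hypothetical names): `static_cast` only compiles for a downcast when the target type actually derives from the source type, while `reinterpret_cast` accepts any reference type.

```cpp
#include <cassert>
#include <type_traits>

// Stand-in for a generic base class.
struct BaseInfoSketch {
  enum KindTy { Generic, Target };
  explicit BaseInfoSketch(KindTy K = Generic) : Kind(K) {}
  virtual ~BaseInfoSketch() = default;
  KindTy Kind;
};

// Stand-in for a target-specific subclass.
struct TargetInfoSketch : BaseInfoSketch {
  TargetInfoSketch() : BaseInfoSketch(Target) {}
  int Extra = 42;
};

struct Unrelated {
  int X = 0;
};

static_assert(std::is_base_of<BaseInfoSketch, TargetInfoSketch>::value,
              "static_cast below relies on this inheritance");

inline int readExtra(const BaseInfoSketch &Info) {
  assert(Info.Kind == BaseInfoSketch::Target && "caller passed the wrong kind");
  // OK: the compiler verifies that TargetInfoSketch derives from BaseInfoSketch.
  const auto &TI = static_cast<const TargetInfoSketch &>(Info);
  // static_cast<const Unrelated &>(Info) would be a compile error, whereas
  // reinterpret_cast<const Unrelated &>(Info) would compile without complaint.
  return TI.Extra;
}
```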

I believe this first appeared on AMDGPU and was copied to the
other two targets.

Spotted when it was being copied to RISCV in D123178.

Differential Revision: https://reviews.llvm.org/D123260

1235aaef

Apr 06, 2022

[demangler] Node precision dumper · 4a4d0985

Nathan Sidwell authored Mar 30, 2022

Add contents to the demangler node dumper's print(Prec) functions.

Reviewed By: dblaikie

Differential Revision: https://reviews.llvm.org/D122740

4a4d0985

Reland "gn build: Fix support for building the builtins for baremetal." · 02a7b175

Peter Collingbourne authored Mar 31, 2022

Our support for building for baremetal was conditional on a default
off arg and would have failed to build if you had somehow arranged
to pass the correct --target flag; presumably nobody noticed because
nobody was turning it on. A better approach is to model baremetal
as a separate "OS" called "baremetal" and build it in the same way
as we cross-compile for other targets. That's what this patch does.
I only hooked up the arm64 target but others can be added.

Relanding after fixing Mac build breakage in D123244.

Differential Revision: https://reviews.llvm.org/D122862

02a7b175

gn build: Use target OS to control whether to use/depend on llvm-ar. · 096477e2

Peter Collingbourne authored Apr 06, 2022

When cross-compiling from Mac to non-Mac, we need to use the just-built
llvm-ar instead of libtool. We're currently doing the right thing
when determining which archiver command to use, but the path to ar
and the toolchain dependencies were being set based on the host OS
(current_os evaluated in host OS toolchain), instead of the target
OS. Fix the problem by looking up current_os inside toolchain_args.

Differential Revision: https://reviews.llvm.org/D123244

096477e2

[demangler][NFC] Rename SwapAndRestore to ScopedOverride · 51f6caf2

Nathan Sidwell authored Feb 28, 2022

The demangler has a utility class 'SwapAndRestore'. That name is
confusing. It's not swapping anything, and the restore part happens at
the object's destruction. What it's actually doing is allowing an
override of some value that is dynamically accessible within the
lifetime of a lexical scope. Thus rename it to ScopedOverride, and
tweak its member variable names.
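
The pattern described here can be sketched as a small RAII class. This is an illustrative version written for this note, not a copy of the demangler's class; the name `ScopedOverrideSketch` and the helper `depthInsideScope` are assumptions.

```cpp
#include <cassert>
#include <utility>

// Overrides a value for the lifetime of a lexical scope and puts the original
// back when the guard object is destroyed.
template <class T> class ScopedOverrideSketch {
  T &Loc;
  T Original;

public:
  ScopedOverrideSketch(T &Loc_, T NewVal)
      : Loc(Loc_), Original(std::move(Loc_)) {
    Loc = std::move(NewVal);
  }
  ~ScopedOverrideSketch() { Loc = std::move(Original); }

  ScopedOverrideSketch(const ScopedOverrideSketch &) = delete;
  ScopedOverrideSketch &operator=(const ScopedOverrideSketch &) = delete;
};

// Returns the value visible while the override is active; the caller's
// variable is back to its original value once the function returns.
inline int depthInsideScope(int &Depth) {
  ScopedOverrideSketch<int> Guard(Depth, Depth + 1);
  assert(Depth > 0 || Depth <= 0); // Depth is readable as usual inside the scope
  return Depth;
}
```

Nothing is swapped between two objects here, which is the naming problem the commit message points out: one location is overridden and later restored.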

Reviewed By: dblaikie

Differential Revision: https://reviews.llvm.org/D122606

51f6caf2

[Support] [BLAKE3] Fix building for Cygwin · 9edee89b

Martin Storsjö authored Apr 06, 2022

Use the windows-gnu assembly files on x86_64 Cygwin too.

This fixes https://github.com/llvm/llvm-project/issues/54685.

Differential Revision: https://reviews.llvm.org/D123187

9edee89b

[AArch64] Fix the upper limit for folded address offsets for COFF · 8d7a17b7

Martin Storsjö authored Apr 06, 2022

In COFF, the immediates in IMAGE_REL_ARM64_PAGEBASE_REL21 relocations
are limited to 21 bit signed, i.e. the offset has to be less than
(1 << 20). The previous limit did intend to cover for this case, but
had missed that the 21 bit field was signed.
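
As a worked check of the range: a signed 21-bit field holds values from -(1 << 20) to (1 << 20) - 1. The helper below is a hypothetical sketch of that bound, not the AArch64 backend code.

```cpp
#include <cassert>
#include <cstdint>

// True if Offset fits in a signed two's-complement field that is Bits wide.
inline bool fitsSignedField(int64_t Offset, unsigned Bits) {
  assert(Bits >= 1 && Bits < 64 && "field width out of range for this sketch");
  const int64_t Limit = int64_t{1} << (Bits - 1); // 1 << 20 for 21 bits
  return Offset >= -Limit && Offset < Limit;
}
```

With `Bits == 21`, the offset `(1 << 20) - 1` fits but `1 << 20` does not, even though both are below the unsigned 21-bit maximum.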

This fixes issue https://github.com/llvm/llvm-project/issues/54753.

Differential Revision: https://reviews.llvm.org/D123160

8d7a17b7

[gn build] Port c03d6257 · ee5fda1f
LLVM GN Syncbot authored Apr 06, 2022

ee5fda1f

[LoopInterchange] Try to achieve the most optimal access pattern after interchange · eac34875

Congzhe Cao authored Apr 06, 2022

Motivated by pr43326 (https://bugs.llvm.org/show_bug.cgi?id=43326), where a slightly
modified case is as follows.

 void f(int e[10][10][10], int f[10][10][10]) {
   for (int a = 0; a < 10; a++)
     for (int b = 0; b < 10; b++)
       for (int c = 0; c < 10; c++)
         f[c][b][a] = e[c][b][a];
 }

The optimal access pattern after running interchange should be the following:

 void f(int e[10][10][10], int f[10][10][10]) {
   for (int c = 0;  c < 10; c++)
     for (int b = 0; b < 10; b++)
       for (int a = 0; a < 10; a++)
         f[c][b][a] = e[c][b][a];
 }

Currently loop interchange is limited to picking up the innermost loop and finding an order
that is locally optimal for it. As a result, the pass fails to produce the globally optimal
loop access order. For more complex examples, what we get could be quite far from the
globally optimal ordering.

This patch proposes performing interchange in a "bubble-sort" fashion.
By comparing neighbors in `LoopList` in each iteration, we can move each loop
to its most appropriate place, so this approach tries to achieve the
globally optimal ordering.
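
The idea can be sketched on the motivating example, using the memory stride of each loop's induction variable as a stand-in cost. This is a simplified illustration under that assumption, not the pass itself: `LoopSketch` and `bubbleInterchange` are hypothetical, and the real pass also has to check legality and profitability before each interchange.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// One loop of the nest; Stride is the distance in elements between
// consecutive iterations (for int[10][10][10]: a = 1, b = 10, c = 100).
struct LoopSketch {
  std::string Name;
  int Stride;
};

// LoopList is ordered outermost first. Adjacent loops are interchanged until
// the largest stride is outermost and the smallest stride is innermost.
inline void bubbleInterchange(std::vector<LoopSketch> &LoopList) {
  for (std::size_t Pass = 0; Pass + 1 < LoopList.size(); ++Pass) {
    bool Changed = false;
    for (std::size_t I = 0; I + 1 < LoopList.size() - Pass; ++I) {
      if (LoopList[I].Stride < LoopList[I + 1].Stride) {
        std::swap(LoopList[I], LoopList[I + 1]); // interchange the neighbors
        Changed = true;
      }
    }
    if (!Changed)
      break; // already in the desired order
  }
}

// The a/b/c nest from the example ends up ordered c, b, a.
inline void interchangeDemo() {
  std::vector<LoopSketch> Nest = {{"a", 1}, {"b", 10}, {"c", 100}};
  bubbleInterchange(Nest);
  assert(Nest[0].Name == "c" && Nest[1].Name == "b" && Nest[2].Name == "a");
}
```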

The motivating example above is added as a test case.

Reviewed By: Meinersbur

Differential Revision: https://reviews.llvm.org/D120386

eac34875

[RISCV] Merge rv32/rv64 test files. NFC · 0d237d1f
Craig Topper authored Apr 06, 2022

0d237d1f
Revert "gn build: Fix support for building the builtins for baremetal." · d384f2b2
Peter Collingbourne authored Apr 06, 2022
```
This reverts commit b02b9b3d.

Broke Mac build: http://45.33.8.238/macm1/32578/step_4.txt
```
d384f2b2
DebugInfo: Make the simplified template names prefix more unique · 6b306233
David Blaikie authored Apr 06, 2022

6b306233
[gn build] Port 9fc45ca0 · 2232d35f
LLVM GN Syncbot authored Apr 06, 2022

2232d35f

[NFC][CodeGen] Add comments for SDNode debug ID · 28cb9081

Daniil Kovalev authored Apr 06, 2022

Normally, we place declarations of fields that serve debugging purposes
under `#if LLVM_ENABLE_ABI_BREAKING_CHECKS`. For `SDNode::PersistentId` and
`SelectionDAG::NextPersistentId`, we do not want to do so because it adds
unneeded complexity without noticeable benefits (see discussion with @thakis
in D120714). This patch adds comments describing why we don't place those
fields under `#if`, to avoid further confusion.

Differential Revision: https://reviews.llvm.org/D123238

28cb9081

Revert "[gn build] (manually) port 83a798d4 (abi_breaking_checks in tests)" · 25b7efc9
Nico Weber authored Apr 06, 2022
```
This reverts commit edddf384.
83a798d4 was reverted in 62a983eb.
```
25b7efc9

gn build: Fix support for building the builtins for baremetal. · b02b9b3d

Peter Collingbourne authored Mar 31, 2022

Differential Revision: https://reviews.llvm.org/D122862

b02b9b3d

[LegalizeTypes][VP] Use LoVT/HiVT when splitting VP operations in SplitVecRes_UnaryOp. · bdb1ab98

Craig Topper authored Apr 06, 2022

The VP path was using the split source VTs instead of the split
destination VTs. This may not be a problem today because the VP
nodes going through this have the same source and dest VTs.
It will be a problem when we start using this function for legalizing
VP cast operations.

bdb1ab98

Add the /nologo flag to llvm-ml · 912551dc

Alan Zhao authored Apr 06, 2022

This flag is present in MSVC's ml.exe to suppress copyright info output.
LLVM doesn't output copyright info, so this flag does nothing in
llvm-ml. We still add this flag though so that when llvm-ml is used as a
drop-in replacement for MSVC ml.exe, we don't get any extra warnings.
Furthermore, this behavior is also consistent with other llvm binaries
for Windows (e.g. clang-cl, llvm-mt, lld-link).

Differential revision: https://reviews.llvm.org/D123068

912551dc