Commits · main · Lorenzo Albano / LLVM bpEVL

Jun 01, 2023

Lorenzo Albano authored Feb 23, 2023

This pass transforms full-length vector instructions or intrinsics calls
to VP ones by recovering the (mask,evl) information from one of the
memory writing VP operations and backpropagating it.

89997d47

May 11, 2023

Enable the use of Vector Predication intrinsics in the loop vectorizer. · 5ebd21b4

Lorenzo Albano authored Feb 15, 2023

Add new VP Recipes for the Explicit Vector Length (EVL) and add support
for VP memory intrinsics (vp.load, vp.store, vp.gather, vp.scatter).

5ebd21b4

May 03, 2023

[SelectionDAG][NFCI] Use common logic for identifying MMI vars · a524f847

Felipe de Azevedo Piovezan authored May 02, 2023

After function argument lowering, but prior to instruction selection,
dbg declares pointing to function arguments are lowered using special
logic.

Later, during instruction selection (both "fast" and regular ISel), this
logic is "repeated" in order to identify which intrinsics have already
been lowered. This is bad for two reasons:

1. The logic is not _really_ repeated, the code is different, which
could lead to duplicate lowering of the intrinsic.
2. Even if the logic were repeated properly, this is still code
duplication.

This patch addresses these issues by storing all preprocessed
dbg.declare intrinsics in a set inside FuncInfo; the set is queried upon
instruction selection.

Differential Revision: https://reviews.llvm.org/D149682

a524f847

[AArch64] Add more efficient bitwise vector reductions. · 8e46ac36

Sp00ph authored May 03, 2023

Improves the codegen for VECREDUCE_{AND,OR,XOR} operations on AArch64.
Currently, these are fully scalarized, except if the vector is a <N x i1>. This
patch improves the codegen down to O(log(N)) where N is the length of the
vector for vectors whose elements are not i1, by repeatedly applying the
bitwise operations to the two halves of the vector. <N x i1> bitwise reductions
are handled using VECREDUCE_{UMAX,UMIN,ADD} instead.
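
The halving strategy described above can be modeled in a few lines. This is only a sketch in Python, not the AArch64 lowering itself: the function name `halving_reduce` and the sample lanes are made up for illustration, and the `<N x i1>` special case is not modeled.

```python
import operator
from functools import reduce

def halving_reduce(lanes, op):
    """Reduce a power-of-two-length list by repeatedly combining its two halves.

    Returns the reduced value and the number of halving steps taken.
    """
    assert lanes and len(lanes) & (len(lanes) - 1) == 0, "length must be a power of two"
    steps = 0
    while len(lanes) > 1:
        half = len(lanes) // 2
        # One lane-wise operation combines the low half with the high half.
        lanes = [op(lo, hi) for lo, hi in zip(lanes[:half], lanes[half:])]
        steps += 1
    return lanes[0], steps

# Hypothetical 8-lane input.
values = [0b1111, 0b1010, 0b0110, 0b1100, 0b1001, 0b0011, 0b0101, 0b1110]
and_result, and_steps = halving_reduce(values, operator.and_)
xor_result, xor_steps = halving_reduce(values, operator.xor)
```

For 8 lanes the loop runs 3 times (log2 of 8), whereas a fully scalarized reduction would combine the lanes one at a time.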

I had to update quite a few codegen tests with these changes, with a general
downward trend in instruction count. Since the vector reductions already have
tests, I haven't added any new tests myself.

Differential Revision: https://reviews.llvm.org/D148185

8e46ac36

[RISCV] Use vslidedown for undef sub-sequences in generic build_vector · 53710b43

Philip Reames authored May 03, 2023

This is a follow-up to D149263 which extends the generic vslide1down handling to use vslidedown (without the one) for undef elements, and in particular for undef sub-sequences. This both removes the domain crossing and, for undef sub-sequences, results in fewer instructions overall.
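
A simplified model of the idea, written in Python as an illustration only. The helper names are hypothetical, lanes vacated by `vslidedown` are modeled as undefined (a simplification of the real instruction), and the actual lowering lives in the RISC-V backend rather than in anything like this loop.

```python
UNDEF = None  # stands in for an undefined lane

def vslide1down(vec, scalar):
    # Every lane moves down one position; the scalar enters at the top lane.
    return vec[1:] + [scalar]

def vslidedown(vec, offset):
    # Lanes move down by `offset`; the vacated top lanes are modeled as UNDEF.
    return vec[offset:] + [UNDEF] * offset

def build_vector(elements):
    """Build `elements` with slides, using one vslidedown per run of UNDEF lanes."""
    vec = [UNDEF] * len(elements)
    ops = 0
    i = 0
    while i < len(elements):
        if elements[i] is UNDEF:
            # Measure the whole undef sub-sequence and skip it with a single slide.
            run = 1
            while i + run < len(elements) and elements[i + run] is UNDEF:
                run += 1
            vec = vslidedown(vec, run)
            i += run
        else:
            vec = vslide1down(vec, elements[i])
            i += 1
        ops += 1
    return vec, ops

result, op_count = build_vector([1, UNDEF, UNDEF, 4])
```

In this model a four-element vector with a two-element undef run needs three slides instead of four, and the undef lanes never require a scalar operand.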

Differential Revision: https://reviews.llvm.org/D149658#inline-1446673

53710b43

[RISCV] Use vslide1down lowering for two element non-constant build_vectors · 9fc5af1b

Philip Reames authored May 03, 2023

When the values are in GPRs, the vslide1down lowering is always better. We need to greatly improve the splat-and-mask cost model to handle constants in a meaningful way, so for now, limit this to non-constant vectors.

This does send the "partially constant" case down the vslide1down path. This could cause some regressions, though I don't see any in practice.

The cost modeling for the general case is annoyingly tricky. We have a great amount of inconsistency around immediate operands, and as a result, the exact constant and exact lowering choice matters a lot. I'm hoping that we get a "good enough" result without modeling this exactly, but we may need to do something analogous to getIntMatCost (i.e. a search w/costing).

Differential Revision: https://reviews.llvm.org/D149667

9fc5af1b

[RISCV] Add MC support of RISCV zcmt Extension · 9f0d7257

WuXinlong authored May 02, 2023

This patch adds the instructions of the zcmt extension
([[ https://github.com/riscv/riscv-code-size-reduction/releases/tag/v1.0.0-RC5.7 | spec is here ]]),
which includes two instructions (cm.jt and cm.jalt) and a CSR register, JVT.

co-author: @Scott Egerton

Reviewed By: kito-cheng, craig.topper

Differential Revision: https://reviews.llvm.org/D133863

9f0d7257

[llvm-objdump][COFF] Keep columns aligned in the console output when exports ordinals are large. · 72f6ea65

Alexandre Ganea authored May 03, 2023

72f6ea65

[LLD][COFF] Fix incorrect pattern in test · 14220fed

Alexandre Ganea authored May 02, 2023

The previous pattern was matching the RVA `0` to the first character of `0x1010`. Make sure now that the entire export entry is matched.
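
The failure mode can be reproduced with ordinary regular expressions. The sketch below uses Python's `re` module rather than the pattern syntax of the actual test, and the export lines are invented for illustration; it only shows why a pattern for the RVA `0` must be anchored to the whole entry.

```python
import re

def rva_is_zero_loose(line):
    # Matches any token that merely starts with "0", including "0x1010".
    return re.search(r"\b0", line) is not None

def rva_is_zero_strict(line):
    # Matches the entire entry: ordinal, RVA, name; then compares the RVA field.
    m = re.fullmatch(r"\s*(\d+)\s+(\S+)\s+(\S+)\s*", line)
    return m is not None and m.group(2) == "0"

# Hypothetical export entries: "<ordinal> <RVA> <name>".
regular_export = "1 0x1010 exportfn1"
zero_rva_export = "2 0 forwardedfn"

loose_on_regular = rva_is_zero_loose(regular_export)
strict_on_regular = rva_is_zero_strict(regular_export)
strict_on_zero = rva_is_zero_strict(zero_rva_export)
```

The loose check accepts the regular export because `0` matches the first character of `0x1010`; the strict check accepts only the entry whose RVA field is exactly `0`.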

14220fed

[AArch64] Combine concat through rshrn · b96967ad

David Green authored May 03, 2023

This tries to push the concat in trunc(concat(rshr, rshr)) into the leaves, so
that we can generate rshrn(concat). This helps improve the codegen for small
types, using the existing rshrn patterns.
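
The rewrite is valid because the shift and the truncation both act lane by lane, so they commute with concatenation. A small Python model, assuming that rshr denotes a rounding shift right and that narrowing keeps the low 8 bits of 16-bit lanes; the helper names and sample values are illustrative only.

```python
def rshr(vec, shift):
    # Rounding shift right: add half of the divisor before shifting.
    return [(x + (1 << (shift - 1))) >> shift for x in vec]

def trunc(vec, bits):
    # Keep only the low `bits` bits of every lane.
    return [x & ((1 << bits) - 1) for x in vec]

def rshrn(vec, shift, bits):
    # Rounding shift right followed by narrowing, as a single step.
    return trunc(rshr(vec, shift), bits)

# Two hypothetical vectors of 16-bit lanes.
a = [0x0123, 0x00FF, 0x7FFF, 0x8000]
b = [0x0001, 0xFFFF, 0x0080, 0x1234]

before = trunc(rshr(a, 4) + rshr(b, 4), 8)   # trunc(concat(rshr, rshr))
after = rshrn(a + b, 4, 8)                   # rshrn(concat)
```

Both forms produce the same lanes; the second one exposes a single rshrn over the concatenated input.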

Differential Revision: https://reviews.llvm.org/D149636

b96967ad

Revert "[clang] Reject flexible array member in a union in C++" · 7178ee19

Mariya Podchishchaeva authored May 03, 2023

This reverts commit 22e2db60.

Broke buildbots on Windows. It seems standard headers on Windows contain
flexible array members in unions.

7178ee19

[mlir][tblgen] Fix emitting wrong index for `either` directive. · 32032cbf

Chia-hung Duan authored May 03, 2023

Reviewed By: jpienaar

Differential Revision: https://reviews.llvm.org/D149152

32032cbf

[clang] Reject flexible array member in a union in C++ · 22e2db60

Mariya Podchishchaeva authored May 03, 2023

It was rejected in C, and in a strange way accepted in C++. However, the
support was never properly tested and fully implemented, so just reject
it in C++ mode as well.

This change also fixes a crash on an attempt to initialize a union with a
flexible array member. Due to a missing check on unions, a null expression
was added to the init list, which caused a crash later.

Fixes https://github.com/llvm/llvm-project/issues/61746

Reviewed By: aaron.ballman

Differential Revision: https://reviews.llvm.org/D147626

22e2db60

[AArch64] Additional tests for rshrn patterns. NFC · 15723e6f

David Green authored May 03, 2023

See D149636

15723e6f

[LLD][ELF] Fix --check-dynamic-relocations for 32-bit targets · ed5dd8e5

Andrew Ng authored Apr 26, 2023

OutputSection::checkDynRelAddends() incorrectly reports an internal
linker error for large addends on 32-bit targets. This is caused by the
lack of sign extension in DynamicReloc::computeAddend() for 32-bit
addends.
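
The arithmetic behind the bug can be shown with a short Python sketch. It does not reproduce the LLD code; `sign_extend` and the sample values are illustrative, and the point is only that a 32-bit addend with its top bit set must be read as a negative number before it is compared with a wider value.

```python
def sign_extend(value, bits):
    """Interpret the low `bits` bits of `value` as a two's-complement integer."""
    sign_bit = 1 << (bits - 1)
    return (value & (sign_bit - 1)) - (value & sign_bit)

# A 32-bit addend field holding -16, read back as an unsigned number.
raw_addend = 0xFFFFFFF0

without_extension = raw_addend
with_extension = sign_extend(raw_addend, 32)
```

Without the extension the addend looks like a large positive value, so a check against the expected negative addend fails even though the stored bits are correct.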

Differential Revision: https://reviews.llvm.org/D149347

ed5dd8e5

Fix MSVC "not all control paths return a value" warning. NFC. · c68e92d9

Simon Pilgrim authored May 03, 2023

c68e92d9

[ShrinkWrap] Use underlying object to rule out stack access. · 4e2b4f97

Florian Hahn authored May 03, 2023

Allow shrink-wrapping past memory accesses that only access globals or
function arguments. This patch uses getUnderlyingObject to try to
identify the accessed object by a given memory operand. If it is a
global or an argument, it does not access the stack of the current
function and should not block shrink wrapping.

Note that the caller's stack may get accessed when passing an argument
via the stack, but not the stack of the current function.

This addresses part of the TODO from D63152.

Reviewed By: thegameg

Differential Revision: https://reviews.llvm.org/D149668

4e2b4f97

[flang][hlfir] Lower vector subscripted RHS designators · 583d492c

Jean Perier authored May 03, 2023

Lower vector subscripted designators as values when they appear outside
of the assignment left-hand side and input IO contexts.

This matches Fortran semantics where vector subscripted designators cannot
be written to outside of the two contexts mentioned above: they are
passed/taken by value where they appear.

This patch uses the added hlfir.elemental_addr to lower vector designators
in lowering. But when reaching the end of the designator lowering, the
hlfir.elemental_addr is turned into an hlfir.elemental when lowering is
not asking for the hlfir.elemental_addr.

This approach allows lowering vector subscripted designators in the same
way while visiting the designator, and only adapting to the context at the
edge.

The part where lowering uses the hlfir.elemental_addr will be
done in a further patch as it requires lowering assignments in the
new hlfir.region_assign op, and there is no codegen yet for these
new operations.

Differential Revision: https://reviews.llvm.org/D149480

583d492c

[flang][hlfir] Add hlfir.elemental_addr for vector subscripted assignment · bdc9914b

Jean Perier authored May 03, 2023

See the operation description in HLFIROps.td.

Depends on D149442

Differential Revision: https://reviews.llvm.org/D149449

bdc9914b

[flang][hlfir] Add hlfir.region_assign and its hlfir.yield terminator · 64b591a8

Jean Perier authored May 03, 2023

hlfir.region_assign is a Region based version of hlfir.assign: the
right-hand side and left-hand-side are evaluated in their own region,
and an optional region can be added to implement user defined
assignment.

This will be used for:
 - assignments inside where and forall
 - user defined assignments
 - assignments to vector subscripted entities.

Rationale:

Forall and Where lowering requires solving an expression/assignment
evaluation scheduling problem based on data dependencies between the
variables being assigned and the one used in the expressions.
Keeping left-hand side and right-hand side in their own region will
make it really easy to analyse the dependency and move around the
expression evaluation as a whole. Operation DAGs are hard to scissor out
when the LHS and RHS evaluation are lowered in the same block. The passes
dealing with further forall/where lowering in HLFIR will need to
succeed. It is not acceptable for them to fail splitting the RHS/LHS
evaluation code. Keeping them in independent blocks is an approach that
cannot fail.

For user defined assignments, having a region allows implementing all
the call details in lowering, and even to allow inlining of the user
assignment, before it is decided if a temporary for the LHS or RHS is
required or not.

The operation description mentions "hlfir.elemental_addr" (the operation that
will be used for vector subscripted LHS) and "ordered assignment trees"
(the concept/interface that will be used to represent forall/where structure
in HLFIR). These will be pushed in follow-up patches, but I do not want to
scissor out the descriptions.

Differential Revision: https://reviews.llvm.org/D149442

64b591a8

[llvm-readobj][AMDGPU] Bypass MD verification for PAL · 415956fe

pvanhout authored Mar 15, 2023

Small split change from D146023.

Migrate elf-notes to v4 and fix llvm-readobj to work with PAL metadata.

Reviewed By: kzhuravl

Differential Revision: https://reviews.llvm.org/D146119

415956fe

Fix MLIR properties generic printing to honor eliding large attributes · dfee17d3

Mehdi Amini authored May 02, 2023

There was a discrepancy where the flag was honored when passed through the
command line, but not when passed through the API, which was leading to a
python test failing.

dfee17d3

Revert "[openmp] [test] Set __COMPAT_LAYER=RunAsInvoker when running tests on Windows" · 1bd3fba8

Martin Storsjö authored Apr 27, 2023

This reverts commit 63f0fdc2.

Since f1431bbf, this environment
variable is always set up by lit itself, so individual test suites
don't need to set it.

Differential Revision: https://reviews.llvm.org/D149356

1bd3fba8

[libc] Use -nolibc -nostdlib++ -nostartfiles for hermetic test link. · db171f2f

Siva Chandra Reddy authored May 02, 2023

We previously used the stricter -nostdlib option, which also removed
compiler-rt/libgcc.

Reviewed By: jhuber6

Differential Revision: https://reviews.llvm.org/D149683

db171f2f

Update BUILD file for bazel. · af202eaa

Wenzhi Cui authored May 03, 2023

llvm::CodeGen was missing, so add it to deps.

Reviewed By: csigg

Differential Revision: https://reviews.llvm.org/D149720

af202eaa

Fix MLIR Python test after · 61d0f803

Mehdi Amini authored May 02, 2023

Some mid-air collision between a change in the generic format and this
new python test.

61d0f803

[mlir][tosa] Fold log(exp) to no-op · 62ccc506

Kai Sasaki authored May 03, 2023

Element-wise log(exp(x)) is the identity on x, so we can fold it into a no-op.
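
As an illustration of the kind of rewrite involved, here is a toy folder over nested tuples in Python. It is not the TOSA folding hook or any MLIR API; the expression encoding and the name `fold_log_exp` are assumptions made for this sketch.

```python
def fold_log_exp(expr):
    """Rewrite ('log', ('exp', x)) to x, recursively.

    An expression is either a leaf (a string) or a tuple (op_name, operand).
    """
    if not isinstance(expr, tuple):
        return expr
    op, operand = expr
    operand = fold_log_exp(operand)  # fold the operand first
    if op == "log" and isinstance(operand, tuple) and operand[0] == "exp":
        return operand[1]            # log(exp(x)) folds to x
    return (op, operand)

simple = fold_log_exp(("log", ("exp", "x")))
nested = fold_log_exp(("log", ("exp", ("log", ("exp", "x")))))
untouched = fold_log_exp(("exp", ("log", "x")))
```

Only the log-of-exp order is folded here, matching the commit title; exp(log(x)) is left alone.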

Reviewed By: eric-k256

Differential Revision: https://reviews.llvm.org/D149632

62ccc506

[docs] Prefer --target= -masm=intel to -target -mllvm -x86-asm-syntax=intel · 081cab0d

Fangrui Song authored May 02, 2023

081cab0d

[RISCV] Return false from isShuffleMaskLegal for i1 vectors. · a070cb5e

Craig Topper authored May 02, 2023

We don't have i1 vector shuffle lowering.

a070cb5e

Revert part of D149033 b/c original code is correct · 3910a9fc

Shengchen Kan authored May 02, 2023

This reverts part of D149033 and rG8f966cedea594d9a91e585e88a80a42c04049e6c. The added test case
is kept to avoid future regression.

Reviewed By: vzakhari, vdonaldson

Differential Revision: https://reviews.llvm.org/D149639

3910a9fc

Adopt Properties to store operations inherent Attributes in the Arith dialect · 9fbe3b51

Mehdi Amini authored Apr 10, 2023

This is part of an on-going migration to adopt Properties inside MLIR.

Differential Revision: https://reviews.llvm.org/D148298

9fbe3b51

Adopt Properties to store operations inherent Attributes in the Func dialect · 7143e333

Mehdi Amini authored Apr 10, 2023

This is part of an on-going migration to adopt Properties inside MLIR.

Differential Revision: https://reviews.llvm.org/D148297

7143e333

Adopt Properties to store operations inherent Attributes in TOSA · a1f55bd3

Mehdi Amini authored Apr 10, 2023

This is part of an on-going migration to adopt Properties inside MLIR.

Differential Revision: https://reviews.llvm.org/D148296

a1f55bd3

Fix a typo in head comment of `CurPPLexer`. · e46aa7f9

Zhouyi Zhou authored May 02, 2023

In head comment of CurPPLexer field of class Preprocessor,
'The current top of the stack what we're lexing from' should be
'The current top of the stack that we're lexing from'.

Differential Revision: https://reviews.llvm.org/D149709

e46aa7f9

[AMDGPU][NFC] Preserve PDTWrapperPass in UnifyDivergentExitNodes · 40ed87a0

Anshil Gandhi authored May 02, 2023

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D149568

40ed87a0

[gn] Actually reformat files after adding CodeGen deps · 3d5932b1

Nico Weber authored May 02, 2023

This should've been part of 8221c316.

3d5932b1

[gn build] Port rest of (LowLevelType->Support) · 8221c316

Nico Weber authored May 02, 2023

This adds all the CodeGen deps all over the place.

I ran

    git show 9cfeba5b > foo2.txt

to get the original patch into a text file and then ran

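    #!/usr/bin/env python3
    import os
    # Walk the patch, tracking which CMakeLists.txt each hunk belongs to.
    in_cmake = False
    for l in open('foo2.txt'):
      if l.startswith('+++ b/'):
        # Path of the file being patched, without the trailing newline.
        cmake = l[len('+++ b/'):-1]
        in_cmake = 'CMakeLists.txt' in cmake
      if not in_cmake:
        continue
      # The matching GN file lives under llvm/utils/gn/secondary/.
      prefix = 'llvm/utils/gn/secondary/'
      gn_file = os.path.join(prefix, os.path.dirname(cmake), 'BUILD.gn')
      if l.startswith('+ '):
        add = l[1:].strip()
        if add == 'CodeGen':
          try:
            with open(gn_file) as f:
              contents = f.read()
          except OSError:
            # No corresponding BUILD.gn; nothing to update.
            print(f'skipping {gn_file}')
            continue
          # Prepend the CodeGen dep to every deps list in the file.
          contents = contents.replace(' deps = [', ' deps = ["//llvm/lib/CodeGen",')
          with open(gn_file, 'w') as f:
            f.write(contents)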

to update all the GN files.

(I manually removed the dep on CodeGen that this added to llvm-min-tblgen.)

Finally, I ran

    git ls-files '*.gn' '*.gni' | xargs llvm/utils/gn/gn.py format

to fix up the formatting.

8221c316

[gn] reformat all gn files · b9e1e6af

Nico Weber authored May 02, 2023

I ran:

    git ls-files '*.gn' '*.gni' | xargs llvm/utils/gn/gn.py format

b9e1e6af

[NFC] Add a test case to make sure EarlyCSE preserves !prof when one · bb4ba96e

Mingming Liu authored Apr 27, 2023

instruction CSE'ed another.

- This should be a part of D148877. Before that patch, !prof was not added to the known-id-set [1], and it turns out that unknown types of metadata are dropped in the implementation [2].
  - This test is mainly added to make sure there won't be regressions for this kind of pattern. The pattern is observed in application code: it looks like the result of an indirect call is initially used as a function argument; after the function is inlined, a load-after-store CSE opportunity is exposed.

  [1] https://github.com/llvm/llvm-project/blob/f478721231bdb71ba8f0f6fb21673b9b7f652add/llvm/lib/Transforms/Utils/Local.cpp#L2727-L2741
  [2] https://github.com/llvm/llvm-project/blob/ade3c6a6a88ed3a9b06c076406f196da9d3cc1b9/llvm/lib/Transforms/Utils/Local.cpp#L2639

Differential Revision: https://reviews.llvm.org/D149396

bb4ba96e

[NFC][EarlyCSE]Modify test case to ensure branch weights are preserved with cse. · 297c10fd

Mingming Liu authored Apr 27, 2023

Differential Revision: https://reviews.llvm.org/D149390

297c10fd