Commits · 991cfd1379f7d5184a3f6306ac10cabec742bbd2 · Lorenzo Albano / LLVM bpEVL

Aug 24, 2022

[docs] Add LICENSE.txt to the root of the mono-repo · 991cfd13

Tobias Hieta authored Aug 24, 2022

This will make it easier to find the LICENSE and some
software also looks in the root to automatically find it.

Reviewed By: kristof.beyls, lattner

Differential Revision: https://reviews.llvm.org/D132018

991cfd13

[AArch64][X86] Add some fixed-order-recurrence tests to check the costmodel of... · e29f9f75
David Green authored Aug 24, 2022
```
[AArch64][X86] Add some fixed-order-recurrence tests to check the costmodel of fixed order recurrences. NFC
```
e29f9f75

[AArch64][SVE] Remove -O1 from SVE intrinsic tests. · 8dc1eee7

David Green authored Aug 17, 2022

This removes -O1 from the SVE ACLE intrinsics tests and replaces it with
-O0 and "opt -mem2reg -instcombine -tailcallelim". InstCombine and
TailCallElim are only added to keep the differences smaller and can be
removed in a follow-up patch. The only remaining differences in the
tests are tbaa nodes not being emitted under -O0, and the removal of
some tailcall flags.

8dc1eee7

[mlir][Bazel] Fix bazel build. · 23b3bcc7

Adrian Kuegel authored Aug 24, 2022

To avoid a dependency cycle, add BytecodeImplementation.h header to the
"IR" target.

23b3bcc7

Fix warning from a7bfdc23 · 10841fca
Mahesh Ravishankar authored Aug 24, 2022

10841fca

[RISCV] Add zihintntl compressed instructions · 07a700f8

Alex authored Aug 22, 2022

Add zihintntl compressed instructions and some files related to zihintntl.
This patch is based on D121670.

Reviewed By: kito-cheng

Differential Revision: https://reviews.llvm.org/D121779

07a700f8

[DAGCombine] Add more tests for cmp to sbb combination; NFC · 72faccc3

Paweł Bylica authored Aug 23, 2022

Add 2 more tests for potential DAG combine of cmp into sbb.

Differential Revision: https://reviews.llvm.org/D132463

72faccc3

[mlir][Linalg] Handle multi-result operations in Elementwise op fusion. · a7bfdc23

Mahesh Ravishankar authored Aug 24, 2022

This drops the artificial requirement of producers having a single
result value to be able to fuse with consumers.

The current default also only fuses producer with consumer when the
producer has a single use. This is a simplifying assumption. There are
legitimate use cases where a producer can be fused with consumer and
the fused op could be used to replace the uses of the producer as
well. This needs to be done with care to avoid use-def violations. To
allow for downstream users to explore more fusion opportunities, the
core transformation method is exposed as a utility function.

This patch also modifies the control function to take just the fused
operand as the argument. This is enough information for the callers to
get the producer and the consumer operations being considered to
fuse. It also provides information about which producer result is used.

Differential Revision: https://reviews.llvm.org/D132301

a7bfdc23

[AIX] use the original name as the input to create the new symbol for TLS symbol. · dfe55cc1

esmeyi authored Aug 24, 2022

Summary: Currently, an error is reported when a thread-local symbol has an invalid name. D100956 creates a new symbol that prefixes the TLS symbol name with a dot, and the error occurs when the symbol is renamed. This patch uses the original symbol name (the name in the symbol table) as the input when creating the symbol for the TOC entry.

Reviewed By: shchenz, lkail

Differential Revision: https://reviews.llvm.org/D132348

dfe55cc1

[RISCV] Handle register spill in branch relaxation · 9c85382a

ZHU Zijia authored Aug 24, 2022

In the branch relaxation pass, `j` instructions with an offset over 1MiB
are relaxed to `jump` pseudo-instructions.

This patch allocates a stack slot for functions with a size greater than
1MiB. If the register scavenger cannot find a scratch register for
`jump`, spill a register to the slot before the jump and restore it
after the jump.

.mbb:
        foo
        j       .dest_bb
        bar
        bar
        bar
.dest_bb:
        baz

The above code will be relaxed to the following code.

.mbb:
        foo
        sd      s11, 0(sp)
        jump    .restore_bb, s11
        bar
        bar
        bar
        j       .dest_bb
.restore_bb:
        ld      s11, 0(sp)
.dest_bb:
        baz
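
As a rough illustration of where the 1MiB limit comes from: `j` is an alias of `jal`, whose immediate encodes a signed, 2-byte-aligned byte offset of about +/-1MiB. The sketch below checks that range. `fitsInJal` is a hypothetical helper written for this note, not a function from the patch or from LLVM.

```cpp
#include <cstdint>

// Hypothetical helper (not LLVM's API): can this byte offset be encoded in
// the immediate of a RISC-V `jal`, and therefore in a plain `j`?
constexpr bool fitsInJal(std::int64_t offsetBytes) {
  constexpr std::int64_t kMin = -(std::int64_t{1} << 20);    // -1 MiB
  constexpr std::int64_t kMax = (std::int64_t{1} << 20) - 2; // +1 MiB - 2
  // Bit 0 of the offset is not encoded, so the target must be 2-byte aligned.
  const bool aligned = (offsetBytes % 2) == 0;
  return aligned && offsetBytes >= kMin && offsetBytes <= kMax;
}
```

Any offset for which this returns false is the kind of jump that has to go through the `jump` pseudo-instruction and therefore needs a scratch register.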

Depends on D129999.

Reviewed By: StephenFan

Differential Revision: https://reviews.llvm.org/D130560

9c85382a

[RISCV][TableGen] Mark MachineInstr with FrameIndex as not compressible · d51581ff

ZHU Zijia authored Aug 24, 2022

If a MachineInstr's operand should be Reg in the compiler's output but is
currently FrameIndex, `isCompressibleInst()` will terminate at
`MachineOperandType::getReg()`.

This patch adds `.isReg()` checks to make `isCompressibleInst()` return
false for these MachineInstr, allowing `getInstSizeInBytes()` to return
a value and `EstimateFunctionSizeInBytes()` to work as intended.
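
The guard pattern described above can be sketched in isolation. Everything here is a hypothetical stand-in: `Operand` is not LLVM's `MachineOperand`, and the x8..x15 rule is only an illustrative compressibility test, not the generated TableGen logic.

```cpp
#include <cassert>

// Hypothetical stand-in for a machine operand; not LLVM's MachineOperand.
class Operand {
public:
  enum class Kind { Register, FrameIndex };

  static Operand reg(unsigned r) { return Operand(Kind::Register, r); }
  static Operand frameIndex(unsigned fi) {
    return Operand(Kind::FrameIndex, fi);
  }

  bool isReg() const { return kind_ == Kind::Register; }

  // Asserts when called on a non-register operand, mirroring the
  // termination described in the commit message.
  unsigned getReg() const {
    assert(isReg() && "getReg() called on a non-register operand");
    return value_;
  }

private:
  Operand(Kind k, unsigned v) : kind_(k), value_(v) {}
  Kind kind_;
  unsigned value_;
};

// Illustrative rule only: treat the instruction as compressible when its
// operand is one of the registers x8..x15.
bool isCompressible(const Operand &op) {
  if (!op.isReg()) // guard first, so a FrameIndex never reaches getReg()
    return false;
  const unsigned r = op.getReg();
  return r >= 8 && r <= 15;
}
```

With the guard in place, a frame-index operand yields `false` rather than tripping the assertion, so a size estimate can still be produced.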

See https://reviews.llvm.org/D129999#3694222 for details.

Reviewed By: luismarques

Differential Revision: https://reviews.llvm.org/D129999

d51581ff

[mlir][math] Lower math.floor,ceil to libm · ad714d5b

Kai Sasaki authored Aug 24, 2022

Lower math.floor and math.ceil to libm

Reviewed By: ftynse

Differential Revision: https://reviews.llvm.org/D131876

ad714d5b

Reland "[MLIR]Extend vector.gather to support n-D result" · f250b972
Che-Yu Wu authored Aug 24, 2022
```
Reviewed By: dcaballe

Differential Revision: https://reviews.llvm.org/D132507
```
f250b972

[MSAN] Handle array alloca with non-i64 size specification · 30d7d74d

Keno Fischer authored Aug 24, 2022

The array size specification of an alloca can be any integer,
so zext or trunc it to intptr before attempting to multiply it
with an intptr constant.

Reviewed By: eugenis

Differential Revision: https://reviews.llvm.org/D131846

30d7d74d

[MSAN] Correct shadow type for atomicrmw instrumentation · 5739d29c

Keno Fischer authored Aug 24, 2022

We were passing the type of `Val` to `getShadowOriginPtr`, rather
than the type of `Val`'s shadow, resulting in broken IR. The fix
is simple.

Reviewed By: eugenis

Differential Revision: https://reviews.llvm.org/D131845

5739d29c

[Polly] Don't use `llvm-config` anymore (in CMake sad path) · 4c511425

John Ericson authored Aug 20, 2022

If `LLVM_BUILD_MAIN_SRC_DIR` is not defined, just assume we are in
regular monorepo layout. Non-standard (and not really supported) layouts
can still be configured manually.

Reviewed By: beanz

Differential Revision: https://reviews.llvm.org/D132314

4c511425

[X86] Emulate _rdrand64_step with two rdrand32 if it is 32bit · 6d8ddf53
Bing1 Yu authored Aug 24, 2022
```
Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D132141
```
6d8ddf53

[DAG] MatchRotate - bail if we fail to match a shl/srl pair · e624f8a3

Simon Pilgrim authored Aug 24, 2022

extractShiftForRotate may fail to return canonicalized shifts due to constant folding or other simplification that can occur in getNode()

Fixes Issue #57283

e624f8a3

[HLSL] Infer language from file extension · 887bafb5
Chris Bieneman authored Aug 23, 2022
```
This allows the language mode for HLSL to be inferred from the file
extension.
```
887bafb5

[NFC] Fix warning · 96169057

Chris Bieneman authored Aug 23, 2022

This change came in a few hours ago and introduced a warning. The fix
is trivial, so I'm providing it. The original change was reviewed here:

https://reviews.llvm.org/D132331

96169057

Revert "[X86] Emulate _rdrand64_step with two rdrand32 if it is 32bit" · 0d8f9520
Bing1 Yu authored Aug 24, 2022
```
This reverts commit 07e34763.
```
0d8f9520
[X86] Emulate _rdrand64_step with two rdrand32 if it is 32bit · 07e34763
Bing1 Yu authored Aug 23, 2022
```
Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D132141
```
07e34763
[mlir:Bytecode] Move variable to inside of the lambda to fix MSVC build · ae97b5ac
River Riddle authored Aug 23, 2022
```
MSVC is not picking up a variable capture somehow, so try moving it inside.
```
ae97b5ac

[BOLT][NFC] Move out handleAArch64IndirectCall · 37cbbea6

Amir Ayupov authored Aug 23, 2022

Move the large lambda out of BinaryFunction::disassemble, reducing its size from
255 to 233 LoC.

Reviewed By: rafauler

Differential Revision: https://reviews.llvm.org/D132104

37cbbea6

[BOLT][NFC] Move out handleIndirectBranch · c844850b

Amir Ayupov authored Aug 23, 2022

Move the large lambda out of BinaryFunction::disassemble, reducing its size from
295 to 255 LoC.

Differential Revision: https://reviews.llvm.org/D132101

c844850b

[BOLT][NFC] Move out handleExternalReference · ec1fbf22

Amir Ayupov authored Aug 23, 2022

Move the large lambda out of BinaryFunction::disassemble, reducing its size from
338 to 295 LoC.

Reviewed By: rafauler

Differential Revision: https://reviews.llvm.org/D132100

ec1fbf22

[BOLT][NFC] Move out handlePCRelOperand · 6cd475f8

Amir Ayupov authored Aug 23, 2022

Move the large lambda out of BinaryFunction::disassemble, reducing its size from
377 to 338 LoC.

Reviewed By: rafauler

Differential Revision: https://reviews.llvm.org/D132099

6cd475f8

[Clang] Avoid using unwind library in the MSVC environment · eca29d4a

Petr Hosek authored Aug 23, 2022

We're seeing the following warnings with --rtlib=compiler-rt:

  lld-link: warning: ignoring unknown argument '--as-needed'
  lld-link: warning: ignoring unknown argument '-lunwind'
  lld-link: warning: ignoring unknown argument '--no-as-needed'

MSVC doesn't use the unwind library, so just omit it.

Differential Revision: https://reviews.llvm.org/D132440

eca29d4a

[mlir:Bytecode] Use UNSUPPORTED instead of XFAIL for s390x · df4e637c

River Riddle authored Aug 23, 2022

Some tests still pass even though we don't claim big-endian support. Using
UNSUPPORTED is a better indicator than XFAIL that we don't guarantee that
the tests work.

df4e637c

[mlir:Bytecode] Add initial support for dialect defined attribute/type encodings · 02c2ecb9

River Riddle authored Aug 23, 2022

Dialects can opt-in to providing custom encodings by implementing the
`BytecodeDialectInterface`. This interface provides hooks, namely
`readAttribute`/`readType` and `writeAttribute`/`writeType`, that will be used
by the bytecode reader and writer. These hooks are provided a reader and writer
implementation that can be used to encode various constructs in the underlying
bytecode format. A unique feature of this interface is that dialects may choose
to only encode a subset of their attributes and types in a custom bytecode
format, which can simplify adding new or experimental components that aren't
fully baked.

Differential Revision: https://reviews.llvm.org/D132498

02c2ecb9

[mlir:Bytecode][NFC] Cleanup Attribute/Type reading · b3449392

River Riddle authored Aug 23, 2022

This moves some parsing functionality from BytecodeReader to
AttrTypeReader, and removes some duplication between the attribute/type
code paths.

Differential Revision: https://reviews.llvm.org/D132497

b3449392

[mlir:Bytecode][NFC] Refactor string section writing and reading · 83dc9999

River Riddle authored Aug 23, 2022

This extracts the string section writer and reader into dedicated
classes, which better separates the logic and will also simplify future
patches that want to interact with the string section.

Differential Revision: https://reviews.llvm.org/D132496

83dc9999

[memprof] Correct max size and access count computations · d10c1b88

Teresa Johnson authored Aug 23, 2022

The existing code resulted in the max size and access counts being equal
to the min. Compute the max instead (max lifetime was already correct).

Differential Revision: https://reviews.llvm.org/D132515

d10c1b88

[analyzer] Process non-POD array element destructors · aac73a31

isuckatcs authored Jul 29, 2022

The constructors of non-POD array elements are evaluated under
certain conditions. This patch makes sure that in such cases
we also evaluate the destructors.
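
For context, the kind of code this concerns looks roughly like the sketch below: an array whose elements have a user-provided constructor and destructor. The names `Element` and `destroyArrayOfThree` are made up for illustration, and the snippet assumes C++17 (inline static members). It shows only the language behaviour the analyzer has to model, not the analyzer's implementation.

```cpp
#include <cassert>
#include <vector>

// A non-POD element type: user-provided constructor and destructor.
struct Element {
  static inline int constructed = 0;           // number of constructor calls
  static inline std::vector<int> destroyedIds; // ids in destruction order
  int id;
  Element() : id(constructed++) {}
  ~Element() { destroyedIds.push_back(id); }
};

// Builds and destroys a three-element array, returning the order in which
// the element destructors ran.
std::vector<int> destroyArrayOfThree() {
  Element::constructed = 0;
  Element::destroyedIds.clear();
  {
    Element arr[3]; // three constructor calls: ids 0, 1, 2
    assert(Element::constructed == 3);
    (void)arr;
  } // three destructor calls, in reverse order of construction
  return Element::destroyedIds;
}
```

Each element's destructor runs when the array goes out of scope, in reverse order of construction.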

Differential Revision: https://reviews.llvm.org/D130737

aac73a31

[flang] Keep original data type for do-variable value. · af7edf15

Slava Zakharin authored Aug 18, 2022

Keep the original data type of integer do-variables
for structured loops. When do-variable's data type
is an integer type shorter than IndexType, processing
the do-variable separately from the DoLoop's iteration index
allows getting rid of type casts, which can make backend
optimizations easier.

For example,
```
  do i = 2, n-1
    do j = 2, n-1
      ... = a(j-1, i)
    end do
  end do
```

If the value of 'j' is computed by casting the DoLoop's iteration
index to 'i32', then Flang will produce the following LLVM IR:
```
  %1 = trunc i64 %iter_index to i32
  %2 = sub i32 %1, 1
  %3 = sext i32 %2 to i64
```

LLVM's InstCombine may try to get rid of the sign extension,
and may transform this into:
```
  %1 = shl i64 %iter_index, 32
  %2 = add i64 %1, -4294967296
  %3 = ashr exact i64 %2, 32
```
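
A minimal C++ sketch of why the two IR sequences above compute the same value. The function names are invented for this illustration, and the code assumes C++20 semantics (modular narrowing conversions and arithmetic right shift of negative values), which mainstream compilers also provide in earlier modes.

```cpp
#include <cassert>
#include <cstdint>

// Form 1: trunc to i32, subtract 1, sign-extend back to i64.
std::int64_t truncSubSext(std::int64_t iterIndex) {
  const std::uint32_t low = static_cast<std::uint32_t>(iterIndex); // trunc
  const std::uint32_t dec = low - 1u;                              // sub, wraps
  return static_cast<std::int64_t>(static_cast<std::int32_t>(dec)); // sext
}

// Form 2: shl by 32, add -4294967296 (that is, -1 placed in the upper half),
// then ashr by 32. The low 32 bits are always zero before the shift, which
// is why the IR can mark the ashr as `exact`.
std::int64_t shlAddAshr(std::int64_t iterIndex) {
  const std::uint64_t shifted = static_cast<std::uint64_t>(iterIndex) << 32;
  const std::uint64_t summed = shifted + 0xFFFFFFFF00000000ull; // -2^32 mod 2^64
  return static_cast<std::int64_t>(summed) >> 32; // arithmetic shift
}

// Convenience check that both forms agree for a given input.
void checkEquivalent(std::int64_t iterIndex) {
  assert(truncSubSext(iterIndex) == shlAddAshr(iterIndex));
}
```

Both forms produce the same result, but the second hides the simple "index minus one" relationship behind shifts.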

The extra computations for the element address applied on top
of this awkward pattern confuse the LLVM vectorizer so that
it does not recognize the unit-strided access of 'a'.

Measured performance improvements on `SPEC CPU2000@IceLake`:
```
168.wupwise:    11.96%
171.swim:       11.22%
172.mgrid:      56.38%
178.galgel:      7.29%
301.apsi:        8.32%
```

Differential Revision: https://reviews.llvm.org/D132176

af7edf15

[libc++] Extend check for non-ASCII characters to src/, test/ and benchmarks/ · 355e0ce3
Louis Dionne authored Aug 18, 2022
```
Differential Revision: https://reviews.llvm.org/D132180
```
355e0ce3
[libc++] Remove trailing whitespace from libcxx includes, source, tests and benchmarks · 89469df8
Louis Dionne authored Aug 18, 2022
```
Differential Revision: https://reviews.llvm.org/D132175
```
89469df8
[libc][Obvious] Fix typo in chmod implementation. · d00e97df
Siva Chandra authored Aug 23, 2022
```
This now allows enabling the chmod function on aarch64.
```
d00e97df

Aug 23, 2022

Print more information when JSON parsing fails for unittests. · ab0574da
Eli Friedman authored Aug 23, 2022
```
Trying to figure out intermittent failure on reverse-iteration buildbot.
```
ab0574da

[SDAG] expand more is-power-of-2 patterns that use popcount · f8dfbea3

Sanjay Patel authored Aug 23, 2022

(ctpop x) == 1 --> (x != 0) && ((x & x-1) == 0)

Adjust the legality check to avoid the poor codegen on AArch64.
We probably only want to use popcount on this pattern when it
is a single instruction.
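
The identity behind this expansion can be checked with a small C++ sketch. The function names are illustrative and are not taken from the patch. `std::bitset::count` stands in for the `ctpop` node.

```cpp
#include <bitset>
#include <cassert>
#include <cstdint>

// Reference form: exactly one bit set, counted with a population count.
bool isPow2ViaPopcount(std::uint32_t x) {
  return std::bitset<32>(x).count() == 1;
}

// Expanded form from the commit message: x is non-zero, and clearing its
// lowest set bit with x & (x - 1) leaves no bits behind.
bool isPow2ViaBitTrick(std::uint32_t x) {
  return x != 0 && (x & (x - 1)) == 0;
}

// Convenience check that both forms agree for a given input.
void checkAgreement(std::uint32_t x) {
  assert(isPow2ViaPopcount(x) == isPow2ViaBitTrick(x));
}
```

The `x != 0` test is needed because `0 & (0 - 1)` is also zero, while the population count of zero is 0 rather than 1.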

fixes #57225

Differential Revision: https://reviews.llvm.org/D132237

f8dfbea3