Commits · 49d00824bbbb8945b92c0f592c6951a881a6242f · Lorenzo Albano / LLVM bpEVL

Mar 29, 2020

[VPlan] Use one VPWidenRecipe per original IR instruction. (NFC). · 49d00824

Florian Hahn authored Mar 29, 2020

This patch changes VPWidenRecipe to only store a single original IR
instruction. This is the first required step towards modeling its
operands as VPValues and also towards breaking it up into a
VPInstruction.

Discussed as part of D74695.

Reviewers: Ayal, gilr, rengolin

Reviewed By: gilr

Differential Revision: https://reviews.llvm.org/D76988

49d00824

[PostOrderIterator] Use SmallVector to store stack; NFC · 6ba63510
Nikita Popov authored Mar 29, 2020
```
We use a SmallPtrSet to track visited nodes, use a SmallVector
of the same size for the stack.
```
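A minimal sketch of the traversal shape this message refers to: an iterative post-order walk that keeps a visited set and an explicit stack. It assumes `std::unordered_set` and `std::vector` as stand-ins for `SmallPtrSet` and `SmallVector`; the `Graph` alias and `postOrder` function are hypothetical and are not LLVM's `po_iterator`.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_set>
#include <utility>
#include <vector>

// Hypothetical adjacency-list graph: G[N] lists the successors of node N.
using Graph = std::vector<std::vector<int>>;

// Iterative post-order walk starting at Entry.
std::vector<int> postOrder(const Graph &G, int Entry) {
  assert(Entry >= 0 && static_cast<std::size_t>(Entry) < G.size());

  std::unordered_set<int> Visited;                // stand-in for SmallPtrSet
  std::vector<std::pair<int, std::size_t>> Stack; // stand-in for SmallVector:
                                                  // (node, next successor index)
  std::vector<int> Order;

  Visited.insert(Entry);
  Stack.push_back({Entry, 0});
  while (!Stack.empty()) {
    // Copy the top entry: push_back below may reallocate the stack.
    int Node = Stack.back().first;
    std::size_t Next = Stack.back().second;
    if (Next < G[Node].size()) {
      ++Stack.back().second; // advance before descending
      int Succ = G[Node][Next];
      if (Visited.insert(Succ).second)
        Stack.push_back({Succ, 0});
    } else {
      // All successors are done, so the node itself is emitted.
      Order.push_back(Node);
      Stack.pop_back();
    }
  }
  return Order;
}
```

Every node enters the stack at most once, and only after it has been inserted into the visited set, so the stack can never hold more entries than the set.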
6ba63510

[X86] X86CallFrameOptimization - generalize slow push code path · a7115d51

Simon Pilgrim authored Mar 29, 2020

Replace the explicit isAtom() || isSLM() test with the more general (and more specific) slowTwoMemOps() check to avoid the use of the PUSHrmm push from memory case.

This is actually very tricky to test in anything but quite complex code, but the atomic-idempotent.ll tests seem to be the most straightforward to use.

Differential Revision: https://reviews.llvm.org/D76239

a7115d51

[AlignmentFromAssumptions] Fix a SCEV assertion resulting from address space differences. · 4bf015c0

Richard Diamond authored Mar 02, 2020

Summary:
On targets with different pointer sizes, -alignment-from-assumptions could attempt to create SCEV expressions which use different effective SCEV types. The provided test illustrates the issue.

In `getNewAlignment`, AASCEV would be the (only) alloca, which would have an effective SCEV type of i32. But PtrSCEV, the GEP in this case, due to being in the flat/default address space, will have an effective SCEV type of i64.

This patch resolves the issue by truncating PtrSCEV to AASCEV's effective type.

Reviewers: hfinkel, jdoerfert

Reviewed By: jdoerfert

Subscribers: jvesely, nhaehnle, hiraditya, javed.absar, kerbowa, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D75471

4bf015c0

[X86] Add cost model test cases for fmin/fmax reduction. · c0aa97b6
Craig Topper authored Mar 28, 2020

c0aa97b6
[MC][PowerPC] Make .reloc support arbitrary relocation types · fc93787d
Fangrui Song authored Mar 28, 2020
```
Generalizes ad7199f3 (R_PPC_NONE/R_PPC64_NONE).
```
fc93787d

Mar 28, 2020

AMDGPU: Make use of default operands · 9564f467
Matt Arsenault authored Sep 07, 2019

9564f467
Put back initializers that were dropped in 0ab5b5b8 · dd030036
Benjamin Kramer authored Mar 28, 2020
```
Found by msan.
```
dd030036
[MDBuilder] Don't use stable sort for sorting integers. · ba2e72c5
Benjamin Kramer authored Mar 28, 2020

ba2e72c5

[InstCombine] Remove unreachable blocks before DCE · 2215dcf1

Nikita Popov authored Mar 28, 2020

Dropping unreachable code may reduce use counts on other instructions,
so it's better to do this earlier rather than later.

NFC-ish, may only impact worklist order.

2215dcf1

[InstCombine] Merge two functions; NFC · 97cc1275

Nikita Popov authored Mar 28, 2020

Merge AddReachableCodeToWorklist() into prepareICWorklistFromFunction().
It's one logical step, and this makes it easier to move code.

97cc1275

[ADT] Automatically forward llvm::sort to array_pod_sort if safe · d3b6e1f1

Benjamin Kramer authored Mar 28, 2020

This is safe if the iterator type is a pointer and the comparator is
stateless. The enable_if pattern I'm adding here only uses
array_pod_sort for the default comparator (std::less).

Using array_pod_sort has a potential performance impact, but I didn't
notice anything when testing clang. Sorting doesn't seem to be on the
hot path anywhere in LLVM.

Shrinks Release+Asserts clang by 73k.
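
The dispatch described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code from the patch: `sortRange`, `SortPath`, `UsePodSort`, and `podCompare` are hypothetical names, `std::qsort` stands in for `array_pod_sort`, and the trivially-copyable check is an extra guard added here to keep the `qsort` path safe.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <functional>
#include <iterator>
#include <type_traits>

// Reports which implementation a call was routed to (for illustration only).
enum class SortPath { PodSort, StdSort };

// qsort-style comparator equivalent to std::less<T>.
template <typename T> int podCompare(const void *L, const void *R) {
  const T &A = *static_cast<const T *>(L);
  const T &B = *static_cast<const T *>(R);
  if (std::less<T>()(A, B))
    return -1;
  if (std::less<T>()(B, A))
    return 1;
  return 0;
}

// True when the iterator is a raw pointer to a trivially copyable type.
template <typename It>
using UsePodSort = std::integral_constant<
    bool, std::is_pointer<It>::value &&
              std::is_trivially_copyable<
                  typename std::iterator_traits<It>::value_type>::value>;

// Default comparator, pointer iterator: forward to the qsort-based path.
template <typename It>
std::enable_if_t<UsePodSort<It>::value, SortPath> sortRange(It Start, It End) {
  using T = typename std::iterator_traits<It>::value_type;
  assert(Start <= End);
  std::qsort(Start, static_cast<std::size_t>(End - Start), sizeof(T),
             podCompare<T>);
  return SortPath::PodSort;
}

// Default comparator, any other iterator: plain std::sort.
template <typename It>
std::enable_if_t<!UsePodSort<It>::value, SortPath> sortRange(It Start, It End) {
  std::sort(Start, End);
  return SortPath::StdSort;
}

// Custom comparator: always std::sort, since the comparator may carry state.
template <typename It, typename Cmp>
SortPath sortRange(It Start, It End, Cmp C) {
  std::sort(Start, End, C);
  return SortPath::StdSort;
}
```

Calling `sortRange(A, A + N)` on an `int` array takes the `PodSort` path, while passing any explicit comparator falls back to `std::sort`. The `qsort` path is one shared routine rather than a fresh `std::sort` instantiation per element type, which is consistent with the size reduction reported above.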

d3b6e1f1

[AMDGPU] Stabilize sort order · 2d24d74b
Benjamin Kramer authored Mar 28, 2020
```
Found by the expensive checks in llvm::sort.
```
2d24d74b

[BPF] support 128bit int explicitly in layout spec · ced0d1f4

Yonghong Song authored Mar 21, 2020

Currently, bpf does not specify 128bit alignment in its
layout spec. So for a structure like
  struct ipv6_key_t {
    unsigned pid;
    unsigned __int128 saddr;
    unsigned short lport;
  };
clang will generate IR type
  %struct.ipv6_key_t = type { i32, [12 x i8], i128, i16, [14 x i8] }
Additional padding is to ensure later IR->MIR can generate correct
stack layout with target layout spec.

But it is common practice for a tracing program to be
first compiled with a target flag (e.g., x86_64 or aarch64) through
clang to generate IR and then go through llc to generate bpf
byte code. Tracing programs often refer to kernel internal
data structures, which need to be compiled with a non-bpf target.

But such a compilation model may cause a problem on aarch64.
The bcc issue https://github.com/iovisor/bcc/issues/2827
reported such a problem.

For the above structure, since aarch64 has "i128:128" in its
layout string, the generated IR will have
  %struct.ipv6_key_t = type { i32, i128, i16 }

Since bpf does not have "i128:128" in its spec string,
the selectionDAG assumes alignment 8 for i128 and
computes the stack storage size for the above as 32 bytes,
which leads to incorrect code later.

The x86_64 target does not have this issue, as it does not have
"i128:128" in its layout spec: it permits i128 to
be aligned at 8 bytes on the stack. Its IR type looks like
  %struct.ipv6_key_t = type { i32, [12 x i8], i128, i16, [14 x i8] }

The fix here is to add i128 support in the layout spec, the same as
aarch64. The only downside is we may have less optimal stack
allocation in certain cases since we require 16byte alignment
for i128 instead of 8. But this is probably fine as i128 is
not used widely and in most cases users should already
have proper alignment.
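
The size mismatch can be reproduced with a small layout calculation. The sketch below is only an illustration of the arithmetic: `Field`, `alignTo`, `structSize`, and `ipv6KeySize` are hypothetical helpers, not LLVM's `DataLayout` API, and the 4-byte and 2-byte alignments for `i32` and `i16` are assumed.

```cpp
#include <cassert>
#include <cstdint>
#include <initializer_list>

// Size and alignment of one struct member, in bytes.
struct Field {
  std::uint64_t Size;
  std::uint64_t Align;
};

// Round Offset up to the next multiple of Align.
constexpr std::uint64_t alignTo(std::uint64_t Offset, std::uint64_t Align) {
  return (Offset + Align - 1) / Align * Align;
}

// Size of a non-packed struct: each field is placed at its alignment, and the
// total is padded to the largest field alignment.
inline std::uint64_t structSize(std::initializer_list<Field> Fields) {
  std::uint64_t Offset = 0, MaxAlign = 1;
  for (const Field &F : Fields) {
    assert(F.Align != 0);
    Offset = alignTo(Offset, F.Align) + F.Size;
    if (F.Align > MaxAlign)
      MaxAlign = F.Align;
  }
  return alignTo(Offset, MaxAlign);
}

// Size of { i32, i128, i16 } for a given i128 alignment.
inline std::uint64_t ipv6KeySize(std::uint64_t I128Align) {
  return structSize({{4, 4}, {16, I128Align}, {2, 2}});
}
```

With 8-byte alignment for i128, `ipv6KeySize(8)` gives the 32 bytes mentioned above. With 16-byte alignment, `ipv6KeySize(16)` gives 48 bytes, which matches the explicitly padded type `{ i32, [12 x i8], i128, i16, [14 x i8] }`.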

Differential Revision: https://reviews.llvm.org/D76587

ced0d1f4

Upgrade some instances of std::sort to llvm::sort. NFC. · 4065e921
Benjamin Kramer authored Mar 28, 2020

4065e921

[CodeGen] Fix sinking local values in lpads with phis · e5bf5037

Reid Kleckner authored Mar 28, 2020

There was already a test case for landingpads to handle this case, but I
had forgotten to consider PHI instructions preceding the EH_LABEL in the
landingpad.

PR45261

e5bf5037

[InstCombine] Use replaceOperand() API in GEP transforms · 30d71210

Nikita Popov authored Mar 28, 2020

To make sure that replaced operands get DCEd. This drops one
iteration from gepphigep.ll, which is still not optimal.

This was the last test case performing more than 3 iterations.

NFC-ish, only worklist order should change.

30d71210

[InstCombine] Reduce code duplication in GEP of PHI transform; NFC · b1f78bae

Nikita Popov authored Mar 28, 2020

The `NewGEP->setOperand(DI, NewPN)` call was duplicated, and the
insertion of NewGEP is the same in both if/else, so we can extract it.

b1f78bae

After , fix build when -DLLVM_ENABLE_THREADS=OFF · 3ab3f3c5

Alexandre Ganea authored Mar 28, 2020

Tested on Linux with Clang 9, and on Windows with Visual Studio 2019 16.5.1 with -DLLVM_ENABLE_THREADS=ON and OFF.

3ab3f3c5

[InstCombine] Fix worklist management in foldXorOfICmps() · 672e8bfb

Nikita Popov authored Feb 21, 2020

Because this code does not use the IC-aware replaceInstUsesWith()
helper, we need to manually push users to the worklist.

This is NFC-ish, in that it may only change worklist order.

672e8bfb

[InstCombine] Change limit-max-iterations test case; NFC · 337b671b
Nikita Popov authored Mar 28, 2020
```
This particular case will stop needing multiple iterations in
a followup change.
```
337b671b

Enna1 authored Mar 28, 2020

This statement

    if (ReplaceWith == S) ReplaceWith = UndefValue::get(S->getType());

was introduced in https://reviews.llvm.org/rG35609d97ae89b8e13f40f4e6b9b056954f8baa83
to fix a case where unreachable code can cause select instruction
simplification to fail. In https://reviews.llvm.org/rGd10480657527ffb44ea213460fb3676a6b1300aa,
we began to perform a depth-first walk of basic blocks. This means
we will not visit unreachable blocks. So we do not need this
special check any more.

Differential Revision: https://reviews.llvm.org/D76753

03bc311a

[AsmPrinter] Emit .weak directive for weak linkage on COFF for symbols without a comdat · e6112a56

Martin Storsjö authored Mar 28, 2020

MC already knows how to emulate the .weak directive (with its ELF
semantics; i.e., an undefined weak symbol resolves to 0, and a defined
weak symbol has lower link precedence than a strong symbol of the same
name) using COFF weak externals. Plumb this through the ASM printer too,
so that definitions marked with __attribute__((weak)) at the language
level (which gets translated to weak linkage at the IR level) have the
corresponding .weak directive emitted. Note that declarations marked
with __attribute__((weak)) at the language level (which translates to
extern_weak at the IR level) already have .weak directives emitted.

Weak*/linkonce* symbols without an associated comdat (in particular, ones
generated with __attribute__((weak)) in C/C++) were earlier emitted as
normal unique globals, as the comdat is required to provide the linkonce
semantics. This change makes sure they are emitted as .weak instead,
allowing other symbols to override them.

Rename the existing coff-weak.ll test to coff-linkonce.ll. I'm not
quite sure what that test covers, since the behavior being tested in it
(the emission of a one_only section) is just a result of passing
-function-sections to llc; the linkonce_odr makes no difference.

Add a new coff-weak.ll which tests the new directive emission.

Based on a previous patch by Shoaib Meenai.

Differential Revision: https://reviews.llvm.org/D44543

e6112a56

[SCCP] Remove LatticeVal alias now that transition is done (NFC). · 81f173ed

Florian Hahn authored Mar 28, 2020

The LatticeVal alias was introduced to reduce the diff size for the
transition to ValueLatticeElement, which is done now.

This patch removes the unnecessary alias and updates some very verbose
type uses with auto.

81f173ed

[SCCP] Remove unused toLatticeValue helper (NFC). · a44bf59c
Florian Hahn authored Mar 28, 2020
```
LatticeVal is an alias for ValueLatticeElement and the function is not
used any longer.
```
a44bf59c
Fix `-Wsign-compare` warning. NFC. · d2dd0fac
Michael Liao authored Mar 28, 2020

d2dd0fac

[llvm-rc] Allow -1 for menu item IDs · 8330dcad

Martin Storsjö authored Mar 27, 2020

This seems to be used in some resource files, e.g.
https://github.com/wxWidgets/wxWidgets/blob/f3217573d7240411e7817c9d76d965b2452987a2/include/wx/msw/wx.rc#L28.

MSVC rc.exe and GNU windres both allow any value here, and silently
just truncate to uint16_t range. This just explicitly allows the
-1 value and errors out on others - the same was done for control
IDs in dialogs in c1a67857.
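
A minimal sketch of the acceptance rule described above, assuming 16-bit menu item IDs. The helper names `isAcceptedMenuItemID` and `encodeMenuItemID` are hypothetical and do not come from the llvm-rc sources.

```cpp
#include <cassert>
#include <cstdint>

// Accept values that fit in uint16_t, plus the special value -1.
inline bool isAcceptedMenuItemID(std::int64_t ID) {
  return ID == -1 || (ID >= 0 && ID <= UINT16_MAX);
}

// Encode an accepted ID as the 16-bit value written to the resource.
// -1 wraps to 0xFFFF, the same result silent truncation would give.
inline std::uint16_t encodeMenuItemID(std::int64_t ID) {
  assert(isAcceptedMenuItemID(ID) && "ID must be validated first");
  return static_cast<std::uint16_t>(ID);
}
```

Under this rule `-1` is accepted and encodes as `0xFFFF`, while other out-of-range values such as `-2` or `65536` are rejected instead of being truncated silently.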

Differential Revision: https://reviews.llvm.org/D76951

8330dcad

[X86][SSE] Add testnzc(~X,Y) -> testnzc(X,Y) test cases · 8c1dbd5c
Simon Pilgrim authored Mar 28, 2020

8c1dbd5c
[X86][SSE] Add original PR38522 test case · d34d2ec2
Simon Pilgrim authored Mar 27, 2020

d34d2ec2
[X86][SSE] Add combine tests for PTEST/TESTPS/TESTPD instructions · 8d85da5f
Simon Pilgrim authored Mar 27, 2020
```
Including some test coverage for PR38522
```
8d85da5f

[docs] Added solutions to slow build under common problems. · 37943e51

Evan LeClercq authored Mar 28, 2020

I added a list of options to configure should someone have issues with
long build time or running out of memory. This was added under common
problems in the getting started section of the documentation.

Reviewed By: Meinersbur, dim, e-leclercq

Differential Revision: https://reviews.llvm.org/D75425

37943e51

[NFC] Attributor comment updates / cast cleanup · 06066c40

Uday Bondhugula authored Mar 28, 2020

Minor update/fixes to comments for the Attributor pass, and dyn_cast -> cast.

Signed-off-by: Uday Bondhugula <uday@polymagelabs.com>

Differential Revision: https://reviews.llvm.org/D76972

06066c40

[FEnv] Constfold some unary constrained operations · f3987391

Serge Pavlov authored Jan 15, 2020

This change implements constant folding for the constrained versions of
the rounding intrinsics: floor, ceil, trunc, round, rint and
nearbyint.
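
For reference, the six operations differ mainly in how they treat halfway cases and negative values. The sketch below uses the C++ standard library functions of the same names to show the values one would expect for a constant operand; it assumes the default round-to-nearest-even mode for `rint` and `nearbyint`, and the `Rounded` struct and `roundAll` function are hypothetical, not part of the patch.

```cpp
#include <cassert>
#include <cmath>

// Results of the six rounding operations applied to one value.
struct Rounded {
  double Floor, Ceil, Trunc, Round, Rint, NearbyInt;
};

// Apply each rounding operation to X. rint and nearbyint follow the current
// floating-point rounding mode (round-to-nearest-even unless changed).
inline Rounded roundAll(double X) {
  return {std::floor(X), std::ceil(X),  std::trunc(X),
          std::round(X), std::rint(X),  std::nearbyint(X)};
}
```

For `2.5`, floor and trunc give `2.0`, ceil gives `3.0`, round gives `3.0` (ties away from zero), and rint and nearbyint give `2.0` (ties to even).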

Differential Revision: https://reviews.llvm.org/D72930

f3987391

Revert "[FileCollector] Add a method to add a whole directory and it contents." · 190df4a5
Jonas Devlieghere authored Mar 27, 2020
```
This reverts commit 8913769e because the
unit test is failing on the Windows bot.
```
190df4a5

[GlobalISel] Fix equality for copies from physregs in matchEqualDefs · 98d05f88

Jessica Paquette authored Mar 26, 2020

When we see this:

```
%a = COPY $physreg
...
SOMETHING implicit-def $physreg
...
%b = COPY $physreg
```

The two copies are not equivalent, and so we shouldn't perform any folding
on them.

When we have two instructions which use a physical register, check that they
define the same virtual register(s) as well.

e.g., if we run into this case

```
%a = COPY $physreg
...
%b = COPY %a
```

we can say that the two copies are the same, and can be folded.

Differential Revision: https://reviews.llvm.org/D76890

98d05f88

[FileCollector] Devirtualize FileCollector (NFC) · a67f057f
Jonas Devlieghere authored Mar 27, 2020
```
This is not (yet) necessary.
```
a67f057f

[FileCollector] Add a method to add a whole directory and it contents. · 8913769e

Jonas Devlieghere authored Mar 27, 2020

Extend the FileCollector's API with addDirectory which adds a directory
and its contents to the VFS mapping.

Differential revision: https://reviews.llvm.org/D76671

8913769e

[RISCV] Support llvm.thread.pointer · aabc24ac

Kamlesh Kumar authored Mar 27, 2020

Fixes https://bugs.llvm.org/show_bug.cgi?id=45303 (clang crashed on __builtin_thread_pointer)

Reviewed By: lenary, MaskRay, luismarques

Differential Revision: https://reviews.llvm.org/D76828

aabc24ac

FunctionRef: Strip cv qualifiers in the converting constructor · cbce88dd

David Blaikie authored Mar 27, 2020

Without this some instances of copy construction would use the
converting constructor & lead to the destination function_ref referring
to the source function_ref instead of the underlying functor.
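
A reduced sketch of the problem and the fix, assuming a simplified class named `FunctionRef` (hypothetical, modelled loosely on `llvm::function_ref`; the `target()` accessor exists only for this illustration). Constructing from a const rvalue deduces `Callable` as `const FunctionRef`, so a constraint that strips only the reference would still select the converting constructor; stripping cv qualifiers as well leaves the copy constructor as the only viable candidate.

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>
#include <utility>

template <typename Fn> class FunctionRef;

template <typename Ret, typename... Params> class FunctionRef<Ret(Params...)> {
  Ret (*Callback)(std::intptr_t, Params...) = nullptr;
  std::intptr_t Callable = 0;

  // Trampoline that casts the stored address back to the callable's type.
  template <typename CallableT>
  static Ret callbackFn(std::intptr_t C, Params... Ps) {
    return (*reinterpret_cast<CallableT *>(C))(std::forward<Params>(Ps)...);
  }

public:
  FunctionRef() = default;

  // Converting constructor. The constraint removes the reference and the cv
  // qualifiers before comparing, so `const FunctionRef` arguments are
  // excluded here and handled by the implicit copy constructor instead.
  template <typename CallableT>
  FunctionRef(CallableT &&Fn,
              std::enable_if_t<!std::is_same<
                  std::remove_cv_t<std::remove_reference_t<CallableT>>,
                  FunctionRef>::value> * = nullptr)
      : Callback(callbackFn<std::remove_reference_t<CallableT>>),
        Callable(reinterpret_cast<std::intptr_t>(&Fn)) {}

  Ret operator()(Params... Ps) const {
    return Callback(Callable, std::forward<Params>(Ps)...);
  }

  // Address of the referenced callable (illustration only).
  std::intptr_t target() const { return Callable; }
};
```

With this constraint, `FunctionRef<int(int)> C(std::move(B))` for a `const FunctionRef<int(int)> B` copies `B`, so `C.target()` equals `B.target()` and `C` refers to the underlying functor rather than to `B` itself.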

Discovered in feedback from 857bf5da

Thanks to Johannes Doerfert, Arthur O'Dwyer, and Richard Smith for the
discussion and debugging.

cbce88dd

[DAGCombine] Fix splitting indexed loads in ForwardStoreValueToDirectLoad() · 48214113

Nemanja Ivanovic authored Mar 27, 2020

In DAGCombiner::visitLOAD() we perform some checks before breaking up an indexed
load. However, we don't do the same checking in ForwardStoreValueToDirectLoad()
which can lead to failures later during combining
(see: https://bugs.llvm.org/show_bug.cgi?id=45301).

This patch just adds the same checks to this function as well.

Fixes: https://bugs.llvm.org/show_bug.cgi?id=45301

Differential revision: https://reviews.llvm.org/D76778

48214113