Commits · 100e797adb433724a17c9b42b6533cd634cb796b · Lorenzo Albano / LLVM bpEVL

Nov 05, 2019

[LV] Apply sink-after & interleave-groups as VPlan transformations (NFC) · 100e797a

Gil Rapaport authored Oct 07, 2019

This recommits 2be17087 (reverted in
d3ec06d2 for heap-use-after-free) with a fix
in IAI's reset() which was not clearing the set of interleave groups after
deleting them.

100e797a

Fix uninitialized variable warning. NFCI. · 95a25d88
Simon Pilgrim authored Nov 05, 2019

95a25d88
[MCObjectFileInfo] Fix uninitialized variable warnings. NFCI. · dec21e44
Simon Pilgrim authored Nov 05, 2019

dec21e44
[MachineOutliner] Fix uninitialized variable warnings. NFCI. · c7f127d9
Simon Pilgrim authored Nov 05, 2019

c7f127d9

[ObjC][ARC] Ignore lifetime markers between *ReturnValue calls · 47d10297

Francis Visoiu Mistrih authored Nov 04, 2019

When eliminating a pair of `llvm.objc.autoreleaseReturnValue` followed by
`llvm.objc.retainAutoreleasedReturnValue`, we need to make sure that the
instructions in between are safe to ignore.

Other than bitcasts and useless GEPs, it's also safe to ignore lifetime
markers for both static allocas (lifetime.start/lifetime.end) and dynamic
allocas (stacksave/stackrestore).

These get added by the inliner as part of the return sequence and can
prevent the transformation from happening in practice.
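
The check described above can be sketched roughly as follows. This is only an illustration under stated assumptions: `InstKind`, `isSafeToIgnore` and `canPairReturnValueCalls` are hypothetical names that model instructions as plain enum values, and they are not the ObjCARC pass's actual data structures or API.

```cpp
#include <cassert>
#include <vector>

// Hypothetical classification of the instructions found between the two
// *ReturnValue calls; real IR instructions carry far more state.
enum class InstKind {
  Bitcast,
  ZeroGEP,        // a "useless" GEP, i.e. one with all-zero indices
  LifetimeStart,  // lifetime.start (static alloca)
  LifetimeEnd,    // lifetime.end (static alloca)
  StackSave,      // stacksave (dynamic alloca)
  StackRestore,   // stackrestore (dynamic alloca)
  Other           // anything else: assumed unsafe to skip
};

// True for the kinds the commit message lists as safe to ignore.
inline bool isSafeToIgnore(InstKind K) {
  switch (K) {
  case InstKind::Bitcast:
  case InstKind::ZeroGEP:
  case InstKind::LifetimeStart:
  case InstKind::LifetimeEnd:
  case InstKind::StackSave:
  case InstKind::StackRestore:
    return true;
  case InstKind::Other:
    return false;
  }
  return false;
}

// The pair may be eliminated only if every instruction in between is ignorable.
inline bool canPairReturnValueCalls(const std::vector<InstKind> &Between) {
  for (InstKind K : Between)
    if (!isSafeToIgnore(K))
      return false;
  return true;
}

// Small usage check of the sketch.
inline void checkReturnValueSketch() {
  assert(canPairReturnValueCalls({InstKind::Bitcast, InstKind::LifetimeEnd}));
  assert(!canPairReturnValueCalls({InstKind::LifetimeEnd, InstKind::Other}));
}
```

In this model, a return sequence containing only lifetime markers and stack save/restore calls no longer blocks the pairing, which is the effect the commit describes.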

Differential Revision: https://reviews.llvm.org/D69833

47d10297

[NFC][ObjC][ARC] Add tests for OptimizeRetainRVCall · 68f39de0
Francis Visoiu Mistrih authored Nov 04, 2019
```
Add tests for bitcasts + zero GEPs, and pre-commit tests for lifetime
markers.
```
68f39de0

[JumpThreading] Factor out common code to update the SSA form (NFC) · 0016c1f4

Kazu Hirata authored Nov 04, 2019

Summary:
This patch factors out common code to update the SSA form in
JumpThreading.cpp -- partly for readability and partly to facilitate
an upcoming patch of my own.

Reviewers: wmi

Subscribers: hiraditya, jfb, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D69811

0016c1f4

[GVN] Fix uninitialized variable warnings. NFCI. · 77debf51
Simon Pilgrim authored Nov 05, 2019

77debf51

Add missing GVN =operator. NFCI. · 1842fe6b

Simon Pilgrim authored Nov 05, 2019

Fixes PVS Studio warning that the 'ValueTable' class implements a copy constructor, but lacks the '=' operator.

1842fe6b

[InstCombine] add tests for shift-logic-shift; NFC · 3ce0c785

Sanjay Patel authored Nov 05, 2019

This is based on existing CodeGen test files for x86 and AArch64.
The corresponding potential transform is shown in:
rL370617

3ce0c785

[AtomicExpandPass] Silence static analyzer warnings about operator priority. NFCI. · 9f294fc4
Dávid Bolvanský authored Nov 05, 2019

9f294fc4

[MachineScheduler] Enable AA in PostRA Machine scheduler · f01b9aa8

David Green authored Nov 05, 2019

This adds AA to Post-RA Machine Scheduling, allowing the pass more
freedom when handling memory operations.

My understanding is that this was just never done, not that it is
inherently incorrect to do so. The older PostRA List scheduler already
makes use of AA, it's just that the MI PostRA Scheduler was never taught
to use it.

Differential Revision: https://reviews.llvm.org/D69814

f01b9aa8

[Docs] Add LangRef documentation for freeze instruction · 2d21068d

Nuno Lopes authored Nov 05, 2019

Summary:
 - Describe the new freeze instruction
 - Make it explicit that branch on undef/poison is UB

Reviewers: chandlerc, majnemer, efriedma, nikic, reames, jdoerfert, lebedev.ri, regehr

Subscribers: fhahn, bollu, lebedev.ri, delcypher, spatel, filcab, llvm-commits, aqjune

Differential Revision: https://reviews.llvm.org/D29121

2d21068d

Fix PR40644: miscompile indexed FP constant store · 646896a4

Thomas Preud'homme authored Oct 03, 2019

Summary:
Functions replaceStoreOfFPConstant() and OptimizeFloatStore() both
replace store of float by a store of an integer unconditionally. However
this generates wrong code when the store that is replaced is an indexed
or truncating store. This commit solves this issue by adding an early
return in these functions when the store being considered is not a
normal store.

Bug was only observed on out of tree targets, hence the lack of testcase
in this commit.

Reviewers: efriedma

Subscribers: hiraditya, arphaman, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D68420

646896a4

[ARM] Always enable UseAA in the arm backend · cf581d79

David Green authored Nov 05, 2019

This feature controls whether AA is used in the backend, and was
previously turned on for certain subtargets to help create less
constrained scheduling graphs. This patch turns it on for all
subtargets, so that they can all make use of the extra information to
produce better code.

Differential Revision: https://reviews.llvm.org/D69796

cf581d79

[Scheduling][ARM] Consistently enable PostRA Machine scheduling · 7d9af03f

David Green authored Nov 05, 2019

In the ARM backend, for historical reasons we have only some targets
using Machine Scheduling. The rest use the old list scheduler as they
are using itineraries and the list scheduler seems to produce better code
(and not crash running out of registers on v6m code). So whether to use
the MIScheduler or not is checked at runtime from the subtarget
features.

This is fine, except for post-ra scheduling. Whether to use the old
post-ra list scheduler or the post-ra machine scheduler is decided as the
pass manager is set up, in ARM's case from a newly constructed subtarget.
Under some situations, like LTO, this won't include the correct cpu so
can pick the wrong option. This can have a surprising effect on
performance.

To fix that, this patch overrides targetSchedulesPostRAScheduling and
addPreSched2 in the ARM backend, adding _both_ post-ra schedulers and
picking at runtime which to execute. To pick between the two I've had to
add an enablePostRAMachineScheduler() method that normally returns
enableMachineScheduler() && enablePostRAScheduler(), which can be
overridden to enable just one of PostRAMachineScheduler vs
PostRAScheduler.
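
The default and the override described above could look roughly like this. It is a minimal sketch: only the three hook names come from the commit message, while `SubtargetSketch`, `BothEnabledSubtarget` and `ListPostRASubtarget` are hypothetical stand-ins, not LLVM's subtarget classes.

```cpp
#include <cassert>

// Hypothetical stand-in for a subtarget interface.
struct SubtargetSketch {
  virtual ~SubtargetSketch() = default;
  virtual bool enableMachineScheduler() const { return false; }
  virtual bool enablePostRAScheduler() const { return false; }
  // Default described in the commit message: use the post-RA machine
  // scheduler when both of the other hooks are enabled.
  virtual bool enablePostRAMachineScheduler() const {
    return enableMachineScheduler() && enablePostRAScheduler();
  }
};

// A subtarget that enables both hooks and keeps the default decision.
struct BothEnabledSubtarget : SubtargetSketch {
  bool enableMachineScheduler() const override { return true; }
  bool enablePostRAScheduler() const override { return true; }
};

// A subtarget that uses the machine scheduler before register allocation
// but overrides the new hook to keep the old list scheduler post-RA.
struct ListPostRASubtarget : BothEnabledSubtarget {
  bool enablePostRAMachineScheduler() const override { return false; }
};

// Small usage check of the sketch.
inline void checkSchedulerSketch() {
  assert(BothEnabledSubtarget().enablePostRAMachineScheduler());
  assert(!ListPostRASubtarget().enablePostRAMachineScheduler());
}
```

Because the decision is a virtual call on the subtarget, it can be evaluated at runtime against the function's real subtarget rather than the one that happened to exist when the pass manager was set up.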

Thanks to David Penry for identifying this problem.

Differential Revision: https://reviews.llvm.org/D69775

7d9af03f

[LoopUnroll] peel-loop-conditions.ll: add some 'is even/odd' peeling tests · 12c4a71c
Roman Lebedev authored Nov 05, 2019

12c4a71c

[InstCombine] dropRedundantMaskingOfLeftShiftInput(): truncation (PR42563) · ccf1a5f4

Roman Lebedev authored Nov 05, 2019

Summary:
That fold keeps growing and growing :(
I think this may be one of the last pieces for it.

Since D67677/D67725, the fold knows the general form
of the pattern - where some masking is needed:
https://rise4fun.com/Alive/F5R
https://rise4fun.com/Alive/gslRa

But there is one more huge piece missing - if you are extracting some bits,
it is not impossible that the origin is wider than the extraction,
i.e. there may be a truncation. And we don't deal with that yet.

But we can, and the generalization remains fully identical:
https://rise4fun.com/Alive/Uar
https://rise4fun.com/Alive/5SW

After a preparatory cleanup I think the diff looks rather clean.

One missing piece is that in some patterns (especially pat. b),
`-1` only needs to be `-1` in final type, but that is for later..

https://bugs.llvm.org/show_bug.cgi?id=42563

Reviewers: spatel, nikic

Reviewed By: spatel

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D69125

ccf1a5f4

[RISCV] Add InstrInfo areMemAccessesTriviallyDisjoint hook · 0d47c7ab

Luís Marques authored Nov 05, 2019

Summary: Introduces the `InstrInfo::areMemAccessesTriviallyDisjoint`
hook. The test could check for instruction reorderings, but to avoid
being brittle it just checks instruction dependencies.

Reviewers: asb, lenary
Reviewed By: lenary
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D67046

0d47c7ab

DWARFDebugLoclists: Make it possible to read relocated addresses · b4c5b8f3

Pavel Labath authored Oct 31, 2019

Summary:
Handling relocations was not needed when the loclists section was a
DWO-only thing. But since DWARF5, it is possible to use it in regular
objects too, and the standard permits embedding addresses into the
section directly. These addresses need to be relocated in unlinked
files.

Reviewers: JDevlieghere, dblaikie, probinson

Subscribers: aprantl, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D68271

b4c5b8f3

Recommit "[HardwareLoops] Optimisation remarks" · 92164cf2

Sjoerd Meijer authored Nov 05, 2019

With a few things fixed:
- initialisation of the optimisation remark pass (this was causing the buildbot
  failures on PPC),
- a test case.

Differential Revision: https://reviews.llvm.org/D69660

92164cf2

[AArch64] Update test checks on merge-store-dependency.ll. NFC · edfb8eea
David Green authored Nov 05, 2019

edfb8eea
[IR] Remove switch's default block that causes clang 8 to raise an error · 92ef101d
aqjune authored Nov 05, 2019

92ef101d

[X86] Lower the cost of avx512 horizontal bool and/or reductions to... · 103968d1

Craig Topper authored Nov 04, 2019

[X86] Lower the cost of avx512 horizontal bool and/or reductions to 2*log2(bitwidth)+1 for legal types.

This better represents the kshift+binop we'd get for each stage
before the final extract. It's likely we'll do even better by
doing a kmov and a cmp with a GPR, but this is a good start.

The default handling was costing a worst case single source
permute shuffle of the vector before the binop. This worst
case assumes the shuffle might have to be emulated with
extracts and inserts. But since we know we're doing a reduction
we can assume we'll get kshift lowering.

There's still some room for improvement here, but this is
much better than it was.
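
As a worked illustration of the formula in the title, the cost can be computed as below. This is a sketch only: `log2Floor` and `boolReductionCost` are hypothetical helpers, and reading "bitwidth" as the number of bits in the mask being reduced is an assumption, not something the commit message spells out.

```cpp
// Hypothetical helper: floor(log2(N)) for N >= 1.
constexpr unsigned log2Floor(unsigned N) {
  return N <= 1 ? 0 : 1 + log2Floor(N >> 1);
}

// 2*log2(bitwidth)+1: one kshift plus one binop per halving stage,
// plus one for the final extract.
constexpr unsigned boolReductionCost(unsigned NumMaskBits) {
  return 2 * log2Floor(NumMaskBits) + 1;
}

// For example, a 16-bit mask has 4 halving stages: 2*4 + 1 = 9.
static_assert(boolReductionCost(16) == 9, "16-bit mask");
```

Under the same assumption, an 8-bit mask would cost 7 and a 64-bit mask 13.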

103968d1

[IR] Add Freeze instruction · 58acbce3

aqjune authored Nov 05, 2019

Summary:
- Define Instruction::Freeze, let it be UnaryOperator
- Add support for freeze to LLLexer/LLParser/BitcodeReader/BitcodeWriter
  The format is `%x = freeze <ty> %v`
- Add support for freeze instruction to llvm-c interface.
- Add m_Freeze in PatternMatch.
- Erase freeze when lowering IR to SelDag.

Reviewers: deadalnix, hfinkel, efriedma, lebedev.ri, nlopes, jdoerfert, regehr, filcab, delcypher, whitequark

Reviewed By: lebedev.ri, jdoerfert

Subscribers: jfb, kristof.beyls, hiraditya, lebedev.ri, steven_wu, dexonsmith, xbolva00, delcypher, spatel, regehr, trentxintong, vsk, filcab, nlopes, mehdi_amini, deadalnix, llvm-commits

Differential Revision: https://reviews.llvm.org/D29011

58acbce3

[BPF] fix a use after free bug · 9f34447f

Yonghong Song authored Nov 04, 2019

Commit fff27212 ("[BPF] Fix CO-RE bugs with bitfields")
fixed CO-RE handling bitfield issues. But the implementation
introduced a use after free bug. The "Base" of the intrinsic
might be freed so later on accessing the Type of "Base"
might access the freed memory. The failed test case,
  CodeGen/BPF/CORE/offset-reloc-middle-chain.ll
is exactly used to test such a case.

Similar to the previous attempt to remember Metadata etc.,
remember the "Base" pointee alignment in advance to avoid
such a use-after-free bug.

9f34447f

[X86] Teach X86MCInstLower to swap operands of commutable instructions to... · f65493a8

Craig Topper authored Nov 04, 2019

[X86] Teach X86MCInstLower to swap operands of commutable instructions to enable 2-byte VEX encoding.

Summary:
The 2 source operands of commutable instructions are encoded in the
VEX.VVVV field and the r/m field of the MODRM byte plus the VEX.B
field.

The VEX.B field is missing from the 2-byte VEX encoding. If the
VEX.VVVV source is 0-7 and the other register is 8-15 we can
swap them to avoid needing the VEX.B field. This works as long as
the VEX.W, VEX.mmmmm, and VEX.X fields are also not needed.
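
The swap condition above can be sketched as a small predicate. This is an illustration under assumptions: `VexOperands` and `swapEnables2ByteVEX` are hypothetical names, registers are modelled as plain numbers 0-15, and the three "needs" flags simply stand for "this instruction requires the corresponding VEX field"; none of this is X86MCInstLower's real interface.

```cpp
// Hypothetical summary of a two-source commutable VEX instruction.
struct VexOperands {
  unsigned VvvvReg;  // register number (0-15) encoded in VEX.VVVV
  unsigned RmReg;    // register number (0-15) encoded in ModRM r/m (+ VEX.B)
  bool NeedsW;       // instruction requires VEX.W
  bool NeedsMmmmm;   // instruction requires the VEX.mmmmm field
  bool NeedsX;       // instruction requires VEX.X
};

// True when commuting the two sources removes the only reason for the
// 3-byte prefix: the r/m register is 8-15 (needs VEX.B) while the VVVV
// register is 0-7, and no other 3-byte-only field is required.
constexpr bool swapEnables2ByteVEX(const VexOperands &Op) {
  return !Op.NeedsW && !Op.NeedsMmmmm && !Op.NeedsX &&
         Op.VvvvReg < 8 && Op.RmReg >= 8 && Op.RmReg < 16;
}

// Register 3 in VVVV and register 9 in r/m: swapping helps.
static_assert(swapEnables2ByteVEX(VexOperands{3, 9, false, false, false}),
              "swap removes the need for VEX.B");
```

After the swap the high register sits in VEX.VVVV, which is four bits wide and can name any of the 16 registers, so VEX.B is no longer needed.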

Fixes PR36706.

Reviewers: RKSimon, spatel

Reviewed By: RKSimon

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D68550

f65493a8

Fix clone_constant_impl to correctly deal with null pointers · 31be9f3f

aqjune authored Nov 05, 2019

Summary:
This patch resolves llvm-c-test's following error

```
LLVM ERROR: LLVMGetValueKind returned incorrect type
```

which arises when the input bitcode contains a null pointer.

Reviewers: jdoerfert, CodaFi, deadalnix

Reviewed By: jdoerfert

Subscribers: llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D68928

31be9f3f

[analyzer] Add test directory for scan-build. · 0aba69eb

Devin Coughlin authored Nov 04, 2019

The static analyzer's scan-build script is critical infrastructure but
is not well tested. To start to address this, add a new test directory under
tests/Analysis for scan-build lit tests and seed it with several tests. The
goal is that future scan-build changes will be accompanied by corresponding
tests.

Differential Revision: https://reviews.llvm.org/D69781

0aba69eb

[BPF] Fix CO-RE bugs with bitfields · fff27212

Yonghong Song authored Oct 30, 2019

Bitfield handling is not robust with the current implementation.
I have seen two issues as described below.

Issue 1:
  struct s {
    long long f1;
    char f2;
    char b1:1;
  } *p;
  The current approach will generate an access bit size of
  56 (from b1 to the end of the structure), which will be
  rejected as it is not a power of 2.

Issue 2:
  struct s {
    char f1;
    char b1:3;
    char b2:5;
    char b3:6;
    char b4:2;
    char f2;
  };
  LLVM will group the 4 bitfields together into 2 bytes. But
  loading 2 bytes is not correct as it violates the alignment
  requirement. Note that sometimes LLVM breaks a large
  bitfield group into multiple groups, but not in this case.

To resolve the above two issues, this patch takes a
different approach. The alignment of the structure is used
to construct the offset of the bitfield access. The memory access
incurred by the bitfield is an aligned access whose alignment and size
both equal the alignment of the structure.
This also simplifies the code.
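
One way to picture the alignment-based approach is the arithmetic below. It is a sketch, not the BPF backend's code: `BitfieldAccess`, `alignDown` and `alignedBitfieldAccess` are hypothetical names, the bit offset of a field is taken as an input, and the sketch assumes the field does not straddle two aligned units.

```cpp
// Hypothetical description of the memory access used to read a bitfield.
struct BitfieldAccess {
  unsigned ByteOffset;   // start of the aligned access within the struct
  unsigned SizeBytes;    // access size, equal to the struct alignment
  unsigned BitInAccess;  // bit offset of the field inside that access
};

// Round V down to a multiple of A (A > 0).
constexpr unsigned alignDown(unsigned V, unsigned A) { return V / A * A; }

// Build an access whose alignment and size both equal the struct alignment.
constexpr BitfieldAccess alignedBitfieldAccess(unsigned BitOffset,
                                               unsigned StructAlignBytes) {
  return BitfieldAccess{
      alignDown(BitOffset / 8, StructAlignBytes), StructAlignBytes,
      BitOffset - 8 * alignDown(BitOffset / 8, StructAlignBytes)};
}

// Issue 1, assuming b1 sits at bit offset 72 and the struct alignment is 8:
// the access covers bytes 8..15 and b1 is at bit 8 of it.
static_assert(alignedBitfieldAccess(72, 8).ByteOffset == 8, "aligned start");
static_assert(alignedBitfieldAccess(72, 8).BitInAccess == 8, "bit position");
```

For Issue 1 this yields an 8-byte access instead of the rejected 56-bit one. For Issue 2, where the struct alignment would be 1 under the same kind of layout assumption, each access is a single byte, so no misaligned 2-byte load is generated.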

This may not be the optimal memory access in terms of memory access
width. But this should be okay since extracting the bitfield value
takes the same amount of work regardless of the memory access
width.

Differential Revision: https://reviews.llvm.org/D69837

fff27212

[CGDebugInfo] Emit subprograms for decls when AT_tail_call is understood · a5c8ec4b

Vedant Kumar authored Nov 01, 2019

Currently, clang emits subprograms for declared functions when the
target debugger or DWARF standard is known to support entry values
(DW_OP_entry_value & the GNU equivalent).

Treat DW_AT_tail_call the same way to allow debuggers to follow cross-TU
tail calls.

Pre-patch debug session with a cross-TU tail call:

```
  * frame #0: 0x0000000100000fa4 main`target at b.c:4:3 [opt]
    frame #1: 0x0000000100000f99 main`main at a.c:8:10 [opt]
```

Post-patch (note that the tail-calling frame, "helper", is visible):

```
  * frame #0: 0x0000000100000fa4 main`target at b.c:4:3 [opt]
    frame #1: 0x0000000100000f80 main`helper [opt] [artificial]
    frame #2: 0x0000000100000f99 main`main at a.c:8:10 [opt]
```

rdar://46577651

Differential Revision: https://reviews.llvm.org/D69743

a5c8ec4b

Nov 04, 2019

[AArch64] Update for Exynos · 4cbe10ef
Evandro Menezes authored Nov 04, 2019
```
Fix the costs of integer division.
```
4cbe10ef

Add more binutils tools to LLVM_INSTALL_TOOLCHAIN_ONLY target · 1cce82ea

Sam Clegg authored Oct 30, 2019

Also add the aliases for these tools so that
LLVM_INSTALL_BINUTILS_SYMLINKS and LLVM_INSTALL_TOOLCHAIN_ONLY can work
together.

Differential Revision: https://reviews.llvm.org/D69635

1cce82ea

[AMDGPU] Added assert in SIFoldOperands before ptr use. NFC. · 1bfcc608
Stanislav Mekhanoshin authored Nov 04, 2019

1bfcc608

[AMDGPU] deduplicate tablegen predicates · 4312c4af

Stanislav Mekhanoshin authored Nov 04, 2019

We are duplicating predicates if several parts of the combined
predicate list contain the same condition. Added code to deduplicate
the list.

We have AssemblerPredicates and AssemblerPredicate in the
PredicateControl, but we never use AssemblerPredicates with an
actual list, so this one is dropped.

This addresses the first part of the llvm bug 43886:
https://bugs.llvm.org/show_bug.cgi?id=43886

Differential Revision: https://reviews.llvm.org/D69815

4312c4af

[demangle] NFC: get rid of NodeOrString · af11f417

Erik Pilkington authored Nov 04, 2019

This class was a bit overengineered, and was triggering some PVS warnings.
Instead, put strings into a NameType and let clients unconditionally treat it
as a Node.

af11f417

[X86] Add support for -mvzeroupper and -mno-vzeroupper to match gcc · b2b6a54f

Craig Topper authored Nov 04, 2019

-mvzeroupper will force the vzeroupper insertion pass to run on
CPUs that normally wouldn't. -mno-vzeroupper disables it on CPUs
where it normally runs.

To support this with the default feature handling in clang, we
need a vzeroupper feature flag in X86.td. Since this flag has
the opposite polarity of the fast-partial-ymm-or-zmm-write we
used to use to disable the pass, we now need to add this new
flag to every CPU except KNL/KNM and BTVER2 to keep identical
behavior.

Remove -fast-partial-ymm-or-zmm-write which is no longer used.

Differential Revision: https://reviews.llvm.org/D69786

b2b6a54f

[SimplifyCFG] Use a (trivially) dominating widenable branch to remove later slow path blocks · 6ff439b5

Philip Reames authored Nov 04, 2019

This transformation is a variation on the GuardWidening transformation we have checked in as its own pass. Instead of focusing on merging (i.e. hoisting and simplifying) two widenable branches, this transform makes the observation that simply removing a second slowpath block (by reusing an existing one) is often a very useful canonicalization. This may lead to later merging, or may not. This is a useful generalization when the intermediate block has loads whose dereferenceability is hard to establish.

As noted in the patch, this can be generalized further, and will be.

Differential Revision: https://reviews.llvm.org/D69689

6ff439b5

[DAGCombine][MSP430] use shift amount threshold in DAGCombine (2/2) · 113181e9

Sanjay Patel authored Nov 04, 2019

Continuation of:
D69116

Contributes to a fix for PR43559:
https://bugs.llvm.org/show_bug.cgi?id=43559

See also D69099 and D69116

Use the TLI hook in DAGCombine.cpp to guard against creating
shift nodes that are not optimal for a target.

Patch by: @joanlluch (Joan LLuch)

Differential Revision: https://reviews.llvm.org/D69120

113181e9

[lit] Move measurement of testing time out of Run.execute · bd14bb42
Julian Lettner authored Feb 25, 2019

bd14bb42