Commits · cee313d288a4faf0355d76fb6e0e927e211d08a5 · Lorenzo Albano / LLVM bpEVL

Apr 17, 2019

Revert "Temporarily Revert "Add basic loop fusion pass."" · cee313d2
Eric Christopher authored Apr 17, 2019
```
The reversion apparently deleted the test/Transforms directory.

Will be re-reverting.

llvm-svn: 358552
```
cee313d2

Temporarily Revert "Add basic loop fusion pass." · a8634351

Eric Christopher authored Apr 17, 2019

As it's causing some bot failures (and per request from kbarton).

This reverts commit r358543/ab70da07.

llvm-svn: 358546

a8634351

Add basic loop fusion pass. · ab70da07

Kit Barton authored Apr 17, 2019

This patch adds a basic loop fusion pass. It will fuse loops that conform to the
following 4 conditions:
  1. Adjacent (no code between them)
  2. Control flow equivalent (if one loop executes, the other loop executes)
  3. Identical bounds (both loops iterate the same number of iterations)
  4. No negative distance dependencies between the loop bodies.

The pass does not make any changes to the IR to create opportunities for fusion.
Instead, it checks if the necessary conditions are met and if so it fuses two
loops together.

The pass has not been added to the pass pipeline yet, and thus is not enabled by
default. It can be run stand alone using the -loop-fusion option.
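
As an illustration of the four conditions, here is a minimal C++ sketch at the source level. It is not taken from the patch; the function names `unfused`, `fused`, and `checkFusion` are hypothetical. The two loops in `unfused` are adjacent, always execute together, share the same bounds, and the only dependence between them (`b[i]` written then read in the same iteration) has distance 0, so it is not negative.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical illustration, not code from the patch.
// Two adjacent loops with identical bounds. The second loop reads b[i],
// which the first loop wrote in the same iteration (distance 0).
std::vector<int> unfused(const std::vector<int> &a) {
  std::vector<int> b(a.size()), c(a.size());
  for (std::size_t i = 0; i < a.size(); ++i)
    b[i] = a[i] * 2;
  for (std::size_t i = 0; i < a.size(); ++i)
    c[i] = b[i] + 1;
  return c;
}

// The shape a fusion pass could produce: one loop, same results.
std::vector<int> fused(const std::vector<int> &a) {
  std::vector<int> b(a.size()), c(a.size());
  for (std::size_t i = 0; i < a.size(); ++i) {
    b[i] = a[i] * 2;
    c[i] = b[i] + 1;
  }
  return c;
}

// Usage check: both forms compute the same values.
void checkFusion() {
  std::vector<int> a{1, 2, 3};
  assert(unfused(a) == fused(a));
}
```

If the second loop instead read `b[i + 1]`, the fused loop would read an element before the first statement had written it; that is the kind of negative distance dependence condition 4 rules out.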

Phabricator: https://reviews.llvm.org/D55851
llvm-svn: 358543

ab70da07

[x86] adjust LEA tests for better coverage; NFC · d5bc5ca3
Sanjay Patel authored Apr 16, 2019
```
The scale can be 1, 2, or 3.

llvm-svn: 358539
```
d5bc5ca3

Apr 16, 2019

[EarlyCSE] detect equivalence of selects with inverse conditions and commuted operands (PR41101) · e08783e2

Sanjay Patel authored Apr 16, 2019

This is 1 of the problems discussed in the post-commit thread for:
rL355741 / http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20190311/635516.html
and filed as:
https://bugs.llvm.org/show_bug.cgi?id=41101

Instcombine tries to canonicalize some of these cases (and there's room for improvement
there independently of this patch), but it can't always do that because of extra uses.
So we need to recognize these commuted operand patterns here in EarlyCSE. This is similar
to how we detect commuted compares and commuted min/max/abs.
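
A small C++ model may help show why two such selects are equivalent. This is only a sketch of the general pattern (inverse condition, swapped operands), not code from the patch, and the names `selectOnLess`, `selectOnGreaterEqual`, and `checkSelectEquivalence` are hypothetical.

```cpp
#include <cassert>

// Hypothetical model of two IR selects, not code from the patch.
// First form: select (a < b), x, y
constexpr int selectOnLess(int a, int b, int x, int y) {
  return (a < b) ? x : y;
}

// Second form, with the inverse condition and commuted operands:
// select (a >= b), y, x
constexpr int selectOnGreaterEqual(int a, int b, int x, int y) {
  return (a >= b) ? y : x;
}

// Compares the two forms over a small range of condition inputs.
void checkSelectEquivalence() {
  for (int a = -2; a <= 2; ++a)
    for (int b = -2; b <= 2; ++b)
      assert(selectOnLess(a, b, 10, 20) == selectOnGreaterEqual(a, b, 10, 20));
}
```

Because both forms always yield the same value, a CSE pass that recognizes the pattern can keep one select and reuse it for the other.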

Differential Revision: https://reviews.llvm.org/D60723

llvm-svn: 358523

e08783e2

[CVP] Simplify umulo and smulo that cannot overflow · 52b24ee9

Nikita Popov authored Apr 16, 2019

If a umul.with.overflow or smul.with.overflow operation cannot
overflow, simplify it to a simple mul nuw / mul nsw. After the
refactoring in D60668 this is just a matter of removing an
explicit check against multiplications.
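
To make the idea concrete, here is a hedged C++ model of an unsigned multiply with an overflow flag. It is not the pass's implementation; `MulResult`, `umulWithOverflow`, and `mulKnownSmall` are hypothetical names. The assumption illustrated is that both operands are known to fit in 16 bits, so the 32-bit product cannot overflow and a plain multiply suffices.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical model, not code from the patch.
struct MulResult {
  std::uint32_t value; // wrapped 32-bit product
  bool overflow;       // true if the exact product did not fit in 32 bits
};

// Models a 32-bit unsigned multiply that also reports overflow.
constexpr MulResult umulWithOverflow(std::uint32_t a, std::uint32_t b) {
  const std::uint64_t wide = static_cast<std::uint64_t>(a) * b;
  return {static_cast<std::uint32_t>(wide),
          wide > std::numeric_limits<std::uint32_t>::max()};
}

// When both operands are below 2^16 the product is below 2^32, so the
// overflow flag is always false and a plain multiply gives the same value.
constexpr std::uint32_t mulKnownSmall(std::uint16_t a, std::uint16_t b) {
  return static_cast<std::uint32_t>(a) * static_cast<std::uint32_t>(b);
}

// Usage check at the largest 16-bit inputs.
void checkNoOverflow() {
  const MulResult r = umulWithOverflow(65535u, 65535u);
  assert(!r.overflow);
  assert(r.value == mulKnownSmall(65535, 65535));
}
```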

Differential Revision: https://reviews.llvm.org/D60791

llvm-svn: 358521

52b24ee9

[SLP] Refactoring of the operand reordering code. · 82ffa88a

Simon Pilgrim authored Apr 16, 2019

This is a refactoring patch which should have all the functionality of the current code. Its goal is twofold:
i. Clean up and simplify the reordering code, and
ii. Generalize reordering so that it will work for an arbitrary number of operands, not just 2.

This is the second patch in a series of patches that will enable operand reordering across chains of operations. An example of this was presented in EuroLLVM'18 https://www.youtube.com/watch?v=gIEn34LvyNo .

Committed on behalf of @vporpo (Vasileios Porpodas)

Differential Revision: https://reviews.llvm.org/D59973

llvm-svn: 358519

82ffa88a

[CVP] Add tests for non-overflowing mulo; NFC · 5a301779
Nikita Popov authored Apr 16, 2019
```
Should be simplified to simple mul.

llvm-svn: 358517
```
5a301779

[X86][AVX] X86ISD::PERMV/PERMV3 node types can never fold index ops · d769bb1e

Simon Pilgrim authored Apr 16, 2019

Improves codegen demonstrated by D60512 - instructions represented by X86ISD::PERMV/PERMV3 can never memory fold the operand used for their index register.

This patch updates the 'isUseOfShuffle' helper into the more capable 'isFoldableUseOfShuffle' that recognises that the op is used for a X86ISD::PERMV/PERMV3 index mask and can't be folded - allowing us to use broadcast/subvector-broadcast ops to reduce the size of the mask constant pool data.

Differential Revision: https://reviews.llvm.org/D60562

llvm-svn: 358516

d769bb1e

[InstCombine] Prune fshl/fshr with masked operands · 5ecd6a48

Nikita Popov authored Apr 16, 2019

If a constant shift amount is used, then only some of the LHS/RHS
operand bits are demanded and we may be able to simplify based on
that. InstCombineSimplifyDemanded already had the necessary support
for that; we just weren't calling it with fshl/fshr as root.

In particular, this allows us to relax some masked funnel shifts
into simple shifts, as shown in the tests.
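
The effect can be sketched with a small C++ model of a 32-bit funnel shift left. This is an illustration under stated assumptions, not the InstCombine code; `fshl32` and `checkMaskedFunnelShift` are hypothetical names. With a constant shift of 8, only the top 8 bits of the second operand are demanded, so masking those bits away leaves a plain left shift of the first operand.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of a 32-bit funnel shift left, not code from the patch:
// concatenate hi:lo, shift left by s (modulo 32), keep the high 32 bits.
constexpr std::uint32_t fshl32(std::uint32_t hi, std::uint32_t lo, unsigned s) {
  s %= 32;
  return s == 0 ? hi : ((hi << s) | (lo >> (32 - s)));
}

// Usage check with a constant shift amount of 8.
void checkMaskedFunnelShift() {
  const std::uint32_t x = 0x12345678u, y = 0xABCDEF01u;
  // Unmasked: the top 8 bits of y (0xAB) are shifted into the low byte.
  assert(fshl32(x, y, 8) == 0x345678ABu);
  // Masked: the demanded bits of y are zero, so this is just x << 8.
  assert(fshl32(x, y & 0x00FFFFFFu, 8) == (x << 8));
}
```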

Patch by Shawn Landden.

Differential Revision: https://reviews.llvm.org/D60660

llvm-svn: 358515

5ecd6a48

[InstCombine] Add tests for fshl/fshr with masked operands; NFC · f700081a

Nikita Popov authored Apr 16, 2019

Baseline tests for D60660.

Patch by Shawn Landden.

Differential Revision: https://reviews.llvm.org/D60688

llvm-svn: 358514

f700081a

[x86] add more tests for LEA formation; NFC · f136c46b
Sanjay Patel authored Apr 16, 2019
```
Promoting the shift to the wider type should allow LEA.

llvm-svn: 358513
```
f136c46b

[Tests] Add branch_weights to latches so that test is not affected by future... · c44b68e2

Philip Reames authored Apr 16, 2019

[Tests] Add branch_weights to latches so that test is not affected by future profitability patch to LoopPredication

llvm-svn: 358506

c44b68e2

[llvm-objdump] Test tabs in disassemble-align.s with a more visible character · 29cca271

Fangrui Song authored Apr 16, 2019

Summary: Apply rupprecht's suggestion in D60376

Reviewers: rupprecht

Reviewed By: rupprecht

Subscribers: llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D60777

llvm-svn: 358504

29cca271

[RISCV] Custom lower SHL_PARTS, SRA_PARTS, SRL_PARTS · 20d24240

Luis Marques authored Apr 16, 2019

When not optimizing for minimum size (-Oz) we custom lower wide shifts
(SHL_PARTS, SRA_PARTS, SRL_PARTS) instead of expanding to a libcall.
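
For readers unfamiliar with the `*_PARTS` nodes, the following C++ sketch shows what a wide left shift computes when a 64-bit value is held as two 32-bit halves. It is not the instruction sequence the backend emits (that is not described here); `Parts`, `shlParts`, and `combine` are hypothetical names, and the shift amount is assumed to be in the range 0 to 63.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model, not code from the patch: a 64-bit value as two halves.
struct Parts {
  std::uint32_t lo;
  std::uint32_t hi;
};

// Shift the 64-bit value hi:lo left by shamt (assumed to be 0..63),
// using only 32-bit operations on the halves.
constexpr Parts shlParts(std::uint32_t lo, std::uint32_t hi, unsigned shamt) {
  if (shamt == 0)
    return {lo, hi};
  if (shamt < 32) // bits leaving lo enter the bottom of hi
    return {lo << shamt, (hi << shamt) | (lo >> (32 - shamt))};
  return {0u, lo << (shamt - 32)}; // everything in lo moves into hi
}

// Reassemble the halves for comparison against a native 64-bit shift.
constexpr std::uint64_t combine(Parts p) {
  return (static_cast<std::uint64_t>(p.hi) << 32) | p.lo;
}

// Usage check against the native 64-bit shift for every shift amount.
void checkShlParts() {
  const std::uint64_t v = 0x0123456789ABCDEFull;
  const std::uint32_t lo = static_cast<std::uint32_t>(v);
  const std::uint32_t hi = static_cast<std::uint32_t>(v >> 32);
  for (unsigned s = 0; s < 64; ++s)
    assert(combine(shlParts(lo, hi, s)) == (v << s));
}
```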

Differential Revision: https://reviews.llvm.org/D59477

llvm-svn: 358498

20d24240

[SystemZ] Add missing intrinsics to intrinsics-immarg.ll · 452060ab

Ulrich Weigand authored Apr 16, 2019

As of r356091, support for the ImmArg intrinsics was added,
including a SystemZ test case.  However, that test case doesn't
actually verify all SystemZ intrinsics with immediate arguments,
only a subset.  The rest of them actually work correctly, there's
just no test for them.  This patch adds all missing intrinsics.

llvm-svn: 358495

452060ab

llvm-undname: Fix nullptr deref on invalid structor names in template args · c035c243

Nico Weber authored Apr 16, 2019

Similar to r358421: A StructorIndentifierNode has a Class field which
is read when printing it, but if the StructorIndentifierNode appears in
a template argument then demangleFullyQualifiedSymbolName() which sets
Class isn't called. Since StructorIndentifierNodes are always leaf
names, we can just reject them as well.

Found by oss-fuzz.

llvm-svn: 358491

c035c243

llvm-undname: Tweak arena allocator · aa18ae86

Nico Weber authored Apr 16, 2019

- Make `allocUnalignedBuffer` look more like `allocArray` and `alloc`.
  No behavior change.
- Change `Head->Used < Head->Capacity` to `Head->Used <= Head->Capacity`
  in `allocArray` and `alloc`. No intended behavior change, might be a
  minuscule memory usage improvement. Noticed this since it was the logic
  used in `allocUnalignedBuffer`.
- Don't let `allocArray` alloc too small buffers for names that have
  more than 512 levels of nesting (in 64-bit builds). Fixes a heap
  buffer overflow found by oss-fuzz.

Differential Revision: https://reviews.llvm.org/D60774

llvm-svn: 358489

aa18ae86

llvm-undname: add a missing CHECK: to a passing test · 5961b020
Nico Weber authored Apr 16, 2019
```
llvm-svn: 358488
```
5961b020
Fix llvm-undname tests after r358485 · ff92e715
Nico Weber authored Apr 16, 2019
```
llvm-svn: 358487
```
ff92e715

Re-commit r357452: SimplifyCFG SinkCommonCodeFromPredecessors: Also sink... · 21eb771d

Hans Wennborg authored Apr 16, 2019

Re-commit r357452: SimplifyCFG SinkCommonCodeFromPredecessors: Also sink function calls without used results (PR41259)

The original commit caused false positives from AddressSanitizer's
use-after-scope checks, which have now been fixed in r358478.

> The code was previously checking that candidates for sinking had exactly
> one use or were a store instruction (which can't have uses). This meant
> we could sink call instructions only if they had a use.
>
> That limitation seemed a bit arbitrary, so this patch changes it to
> "instruction has zero or one use" which seems more natural and removes
> the need to special-case stores.
>
> Differential revision: https://reviews.llvm.org/D59936

llvm-svn: 358483

21eb771d

Asan use-after-scope: don't poison allocas if there were untraced lifetime... · 6ae05777

Hans Wennborg authored Apr 16, 2019

Asan use-after-scope: don't poison allocas if there were untraced lifetime intrinsics in the function (PR41481)

If there are any intrinsics that cannot be traced back to an alloca, we
might have missed the start of a variable's scope, leading to false
error reports if the variable is poisoned at function entry. Instead, if
there are some intrinsics that can't be traced, fail safe and don't
poison the variables in that function.

Differential revision: https://reviews.llvm.org/D60686

llvm-svn: 358478

6ae05777

[llvm-objdump] Align instructions to a tab stop in disassembly output · fa860ff7

Fangrui Song authored Apr 16, 2019

This relands D60376/rL358405, with the difference: sed 'y/\t/ /' -> tr '\t' ' '
BSD sed doesn't support escape characters for the 'y' command.
I didn't use it in rL358405 because it was not listed at
https://llvm.org/docs/GettingStarted.html#software but it
should be available.

Original description:

In GNU objdump, -w/--wide aligns instructions in the disassembly output.
This patch does the same to llvm-objdump. However, we always use the
wide format (-w/--wide is ignored), because the narrow format
(instructions are misaligned) is probably not very useful.

In llvm-readobj, we made a similar decision: always use the wide format,
accept but ignore -W/--wide.

To save some columns, we change the tab before hex bytes (controlled by
--[no-]show-raw-insn) to a space.

llvm-svn: 358474

fa860ff7

[llvm-objdump] Simplify PrintHelpMessage() logic · 051a699e

Fangrui Song authored Apr 16, 2019

This relands rL358418. It missed one test that should also use -macho.
Note that all the other -private-header -exports-trie tests are used
together with -macho.

llvm-svn: 358472

051a699e

Revert r358405: "[llvm-objdump] Align instructions to a tab stop in disassembly output" · d9d0c3e1
Alex Lorenz authored Apr 15, 2019
```
The test fails on darwin due to a sed error:

sed: 1: "y/\t/ /": transform strings are not the same length
llvm-svn: 358459
```
d9d0c3e1

[AArch64][GlobalISel] Don't do extending loads combine for non-pow-2 types. · 02a90ea7

Amara Emerson authored Apr 15, 2019

Since non-pow-2 types are going to get split up into multiple loads anyway,
don't do the [SZ]EXTLOAD combine for those and save us trouble later in
legalization.

llvm-svn: 358458

02a90ea7

[LSR] Rewrite misses some fixup locations if it splits critical edge · fda04268

Quentin Colombet authored Apr 15, 2019

If LSR splits a critical edge while rewriting phi operands and the
phi node has other pending fixup operands, we need to
update those pending fixups. Otherwise the formulae will not be
implemented completely and some instructions will not be eliminated.

llvm.org/PR41445

Differential Revision: https://reviews.llvm.org/D60645

Patch by: Denis Bakhvalov <denis.bakhvalov@intel.com>

llvm-svn: 358457

fda04268

Apr 15, 2019

[EarlyCSE] add more tests for double-negated select condition; NFC · 800a0c3e
Sanjay Patel authored Apr 15, 2019
```
llvm-svn: 358454
```
800a0c3e
[X86] Limit the 'x' inline assembly constraint to zmm0-15 when used for a 512 type. · 0495f29e
Craig Topper authored Apr 15, 2019
```
The 'v' constraint is used to select zmm0-31. This makes 512-bit consistent with 128/256-bit.

llvm-svn: 358450
```
0495f29e

[X86] Fix a stack folding test to have a full xmm2-31 clobber list instead of... · 77439bb1

Craig Topper authored Apr 15, 2019

[X86] Fix a stack folding test to have a full xmm2-31 clobber list instead of stopping at xmm15. Add an additional dependency to keep instruction below inline asm block.

llvm-svn: 358449

77439bb1

AMDGPU: Fix unreachable when counting register usage of SGPR96 · 101abd21
Matt Arsenault authored Apr 15, 2019
```
llvm-svn: 358447
```
101abd21

AMDGPU: Fix printed format of SReg_96 · fbdd2a18

Matt Arsenault authored Apr 15, 2019

These are artificial, so I think this should only come up with inline
asm comments.

llvm-svn: 358446

fbdd2a18

[EarlyCSE] add test for select condition double-negation; NFC · 5ae05d81
Sanjay Patel authored Apr 15, 2019
```
llvm-svn: 358444
```
5ae05d81
Revert r358418: "[llvm-objdump] Simplify PrintHelpMessage() logic" · 16256123
Alex Lorenz authored Apr 15, 2019
```
This reverts commit r358418 as it broke `test/Object/objdump-export-list`
on Darwin.

llvm-svn: 358443
```
16256123
[Tests] Add a few more tests for LoopPredication w/invariant loads · af808ee2
Philip Reames authored Apr 15, 2019
```
Making sure to cover an important legality corner case.

llvm-svn: 358439
```
af808ee2
[X86] Block i32/i64 for 'k' and 'Yk' in getRegForInlineAsmConstraint without avx512bw. · 3d9b47c7
Craig Topper authored Apr 15, 2019
```
32 and 64 bit k-registers require avx512bw. If we don't block this properly, it leads to a crash.

llvm-svn: 358436
```
3d9b47c7
[x86] update test checks; NFC · 8ae68f26
Sanjay Patel authored Apr 15, 2019
```
llvm-svn: 358432
```
8ae68f26

[DEBUGINFO] Prevent Instcombine from dropping debuginfo when removing zexts · 4fe42214

Wolfgang Pieb authored Apr 15, 2019

Zexts can be treated like no-op casts when it comes to assessing whether their
removal affects debug info.

Reviewer: aprantl

Differential Revision: https://reviews.llvm.org/D60641

llvm-svn: 358431

4fe42214

[CommandLineParser] Add DefaultOption flag · b85f74a2

Don Hinton authored Apr 15, 2019

Summary: Add DefaultOption flag to CommandLineParser which provides a
default option or alias, but allows users to override it for some
other purpose as needed.

Also, add `-h` as a default alias to `-help`, which can be seamlessly
overridden by applications like llvm-objdump and llvm-readobj which
use `-h` as an alias for other options.

(relanding after revert, r358414)
Added DefaultOptions.clear() to reset().

Reviewers: alexfh, klimek

Reviewed By: klimek

Subscribers: kristina, MaskRay, mehdi_amini, inglorion, dexonsmith, hiraditya, llvm-commits, jhenderson, arphaman, cfe-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D59746

llvm-svn: 358428

b85f74a2

[X86] Restore the pavg intrinsics. · 8e364c68

Craig Topper authored Apr 15, 2019

The pattern we replaced these with may be too hard to match as demonstrated by
PR41496 and PR41316.

This patch restores the intrinsics and then we can start focusing
on optimizing the intrinsics.

I've mostly reverted the original patch that removed them. Though I modified
the avx512 intrinsics to not have masking built in.

Differential Revision: https://reviews.llvm.org/D60674

llvm-svn: 358427

8e364c68