Commits · 4a1b95bda0c444798a5240fe924dd127b776d12d · Roger Ferrer / llvm-epi

Jan 21, 2019

Fix typos throughout the license files that somehow I and my reviewers · 4a1b95bd

Chandler Carruth authored Jan 21, 2019

all missed!

Thanks to Alex Bradbury for pointing this out, and the fact that I never
added the intended `legacy` anchor to the developer policy. Add that
anchor too. With hope, this will cause the links to all resolve
successfully.

llvm-svn: 351731

4a1b95bd

[X86] Remove and autoupgrade vpmovqd/vpmovwb intrinsics using trunc+select. · f608dc1f
Craig Topper authored Jan 21, 2019
```
llvm-svn: 351729
```
f608dc1f
[NFC] Make getExpressionSize unsigned short · dca1252a
Max Kazantsev authored Jan 21, 2019
```
llvm-svn: 351727
```
dca1252a
[NFC] Fix warnings in unit test of r351725 · 9d45edfa
Max Kazantsev authored Jan 21, 2019
```
llvm-svn: 351726
```
9d45edfa

[SCEV][NFC] Introduces expression sizes estimation · 85c98838

Max Kazantsev authored Jan 21, 2019

This patch introduces the field `ExpressionSize` in SCEV. This field is
calculated only once, on SCEV creation, and it represents the complexity of
the SCEV from an arithmetical point of view (not the number of distinct SCEV
nodes used in the expression). Roughly speaking, it is the number of operands
and operation symbols that appear when the SCEV is printed.

The formal definition is as follows: if SCEV `X` has operands
  `Op1`, `Op2`, ..., `OpN`,
then
  Size(X) = 1 + Size(Op1) + Size(Op2) + ... + Size(OpN).
Size of SCEVConstant and SCEVUnknown is one.
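
The recursive definition above can be sketched on a toy expression tree. This is only an illustration under stated assumptions: `Expr`, `exprSize`, and the example nodes are hypothetical names, not LLVM's SCEV classes, and the sketch recomputes the size on every call, whereas the patch computes it once at creation.

```cpp
#include <cstddef>

// Hypothetical toy node, NOT LLVM's SCEV class. A leaf (standing in for
// SCEVConstant or SCEVUnknown) has no operands.
struct Expr {
  const Expr *const *Ops; // operand list, nullptr for a leaf
  std::size_t NumOps;     // number of operands
};

// Size(X) = 1 + Size(Op1) + ... + Size(OpN); a leaf therefore has size 1.
constexpr std::size_t exprSize(const Expr &X) {
  std::size_t Size = 1;
  for (std::size_t I = 0; I < X.NumOps; ++I)
    Size += exprSize(*X.Ops[I]);
  return Size;
}

// Example tree for ((a + b) * c): three leaves and two operations.
constexpr Expr LeafA{nullptr, 0};
constexpr Expr LeafB{nullptr, 0};
constexpr Expr LeafC{nullptr, 0};
constexpr const Expr *AddOps[] = {&LeafA, &LeafB};
constexpr Expr AddAB{AddOps, 2};
constexpr const Expr *MulOps[] = {&AddAB, &LeafC};
constexpr Expr MulABC{MulOps, 2};
```

For `((a + b) * c)` the sketch yields 5, which matches the informal reading: three operands plus two operation symbols.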

Expression size may be used as a universal way to limit SCEV transformations
for huge SCEVs. Currently, we have a bunch of options that represent various
limits (such as a recursion depth limit) that may not make any sense from the
point of view of an LLVM user who is not familiar with SCEV internals, and all
these different options pursue one goal. A more general rule that may
potentially allow us to get rid of this redundancy in options is "do not make
transformations with SCEVs of huge size". It can apply to all SCEV traversals
and transformations that may need to visit a SCEV node more than once and are
therefore prone to combinatorial explosion.

This patch only introduces the SCEV size calculation as NFC; its use will
be introduced in follow-up patches.

Differential Revision: https://reviews.llvm.org/D35989
Reviewed By: reames

llvm-svn: 351725

85c98838

[RISCV] Add R_RISCV_RELAX relocation to all possible relax candidates. · 5e8798f9

Kito Cheng authored Jan 21, 2019

Summary:
Add R_RISCV_RELAX relocation to all possible relax candidates and
update corresponding testcase.

Reviewers: asb, apazos

Differential Revision: https://reviews.llvm.org/D46677

llvm-svn: 351723

5e8798f9

[AVR] Insert unconditional branch when inserting MBBs between blocks with fallthrough · 5c23410f

Dylan McKay authored Jan 21, 2019

This updates the AVR Select8/Select16 expansion code so that, when
inserting the two basic blocks for true and false conditions, any
existing fallthrough on the previous block is preserved.

Prior to this patch, if the block before the Select pseudo fell through
to the subsequent block, two new basic blocks would be inserted at the
prior fallthrough point, changing the fallthrough destination.

The predecessor or successor lists were not updated, causing the
BranchFolding pass at -O1 and above to rearrange basic blocks, causing
an infinite loop. Not to mention that the unconditional fallthrough to the
true block is incorrect in and of itself.

This patch modifies the Select8/16 expansion so that, if inserting true
and false basic blocks at a fallthrough point, the implicit branch is
preserved by means of an explicit, unconditional branch to the previous
fallthrough destination.

Thanks to Carl Peto for reporting this bug.

This fixes avr-rust bug https://github.com/avr-rust/rust/issues/123.

llvm-svn: 351721

5c23410f

[AVR] Enable emission of debug information · f15cc113

Dylan McKay authored Jan 21, 2019

Prior to this, the code was missing AVR-specific relocation logic in
RelocVisitor.h.

This patch teaches RelocVisitor about R_AVR_16 and R_AVR_32.

Debug information is emitted in the final object file, and understood by
'avr-readelf --debug-dump' from AVR-GCC.

llvm-dwarfdump is yet to understand how to dump AVR DWARF symbols.

llvm-svn: 351720

f15cc113

Revert "[AVR] Insert unconditional branch when inserting MBBs between blocks with fallthrough" · ce0ab063

Dylan McKay authored Jan 21, 2019

This reverts commit r351718.

Carl pointed out that the unit test could be improved.

This patch will be recommitted once the test is made more resilient.

llvm-svn: 351719

ce0ab063

[AVR] Insert unconditional branch when inserting MBBs between blocks with fallthrough · 33acba43

Dylan McKay authored Jan 21, 2019

This updates the AVR Select8/Select16 expansion code so that, when
inserting the two basic blocks for true and false conditions, any
existing fallthrough on the previous block is preserved.

Prior to this patch, if the block before the Select pseudo fell through
to the subsequent block, two new basic blocks would be inserted at the
prior fallthrough point, changing the fallthrough destination.

The predecessor or successor lists were not updated, causing the
BranchFolding pass at -O1 and above to rearrange basic blocks, causing
an infinite loop. Not to mention that the unconditional fallthrough to the
true block is incorrect in and of itself.

This patch modifies the Select8/16 expansion so that, if inserting true
and false basic blocks at a fallthrough point, the implicit branch is
preserved by means of an explicit, unconditional branch to the previous
fallthrough destination.

Thanks to Carl Peto for reporting this bug.

This fixes avr-rust bug https://github.com/avr-rust/rust/issues/123.

llvm-svn: 351718

33acba43

Tentative fix for r351701 and gcc 6.2 build on ubuntu · 836aa270
Serge Guelton authored Jan 20, 2019
```
llvm-svn: 351705
```
836aa270

Jan 20, 2019

Add missing test file · f81edba3
Serge Guelton authored Jan 20, 2019
```
llvm-svn: 351702
```
f81edba3

Replace llvm::isPodLike<...> by llvm::is_trivially_copyable<...> · be88539b

Serge Guelton authored Jan 20, 2019

As noted in https://bugs.llvm.org/show_bug.cgi?id=36651, the specialization for
isPodLike<std::pair<...>> did not match the expectation of
std::is_trivially_copyable which makes the memcpy optimization invalid.

This patch renames the llvm::isPodLike trait into llvm::is_trivially_copyable.
Unfortunately std::is_trivially_copyable is not portable across compiler / STL
versions. So a portable version is provided too.
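
As a rough illustration of what the standard trait reports, the sketch below uses only `std::is_trivially_copyable`, not the `llvm::` trait introduced by the patch. `Plain`, `Custom`, and `PairIsTriviallyCopyable` are hypothetical names chosen for the example.

```cpp
#include <type_traits>
#include <utility>

// Implicitly-defined copy operations: trivially copyable, so memcpy is valid.
struct Plain {
  int X;
  double Y;
};

// A user-provided copy constructor makes the type non-trivially copyable.
struct Custom {
  int X;
  Custom(const Custom &Other) : X(Other.X) {}
};

static_assert(std::is_trivially_copyable<Plain>::value,
              "Plain can be copied with memcpy");
static_assert(!std::is_trivially_copyable<Custom>::value,
              "Custom must be copied through its copy constructor");

// The answer for std::pair depends on the standard library implementation,
// so it is recorded here rather than asserted.
constexpr bool PairIsTriviallyCopyable =
    std::is_trivially_copyable<std::pair<int, int>>::value;
```

A trait specialization that unconditionally reports `std::pair` as memcpy-safe can disagree with `PairIsTriviallyCopyable`, which is the kind of mismatch the commit message describes.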

Note that the following specializations were invalid:

    std::pair<T0, T1>
    llvm::Optional<T>

Tests have been added to assert that the former specializations are respected by
the standard usage of llvm::is_trivially_copyable, and that when a decent version
of std::is_trivially_copyable is available, llvm::is_trivially_copyable is
compared to std::is_trivially_copyable.

As of this patch, llvm::Optional is no longer considered trivially copyable,
even if T is. This is to be fixed in a later patch, as it has an impact on a
long-running bug (see r347004).

Note that GCC warns about this UB, but this got silenced by https://reviews.llvm.org/D50296.

Differential Revision: https://reviews.llvm.org/D54472

llvm-svn: 351701

be88539b

AMDGPU: Legalize more bitcasts · 7ac79ed8
Matt Arsenault authored Jan 20, 2019
```
llvm-svn: 351700
```
7ac79ed8
GlobalISel: Add isPointer legality predicates · a5195829
Matt Arsenault authored Jan 20, 2019
```
llvm-svn: 351699
```
a5195829

AMDGPU/GlobalISel: Really legalize exts from i1 · 46ffe68d

Matt Arsenault authored Jan 20, 2019

A combine was hiding the fact that these tests were
not actually testing what they should be, although
they were producing the expected end result.

llvm-svn: 351698

46ffe68d

[X86] Auto upgrade VPCOM/VPCOMU intrinsics to generic integer comparisons · e1143c13

Simon Pilgrim authored Jan 20, 2019

This causes a couple of changes in the upgrade tests, as signed/unsigned eq/ne are equivalent and we constant fold true/false codes. These changes are the same as what we already do for avx512 cmp/ucmp.
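
The equivalence of signed and unsigned eq/ne can be checked on scalar values. The sketch below is a scalar illustration only, not the vector intrinsics or the autoupgrade code. The helper names are hypothetical, and the signed casts assume two's complement conversion (guaranteed since C++20, and the behavior of mainstream compilers before that).

```cpp
#include <cstdint>

// Equality only asks whether the bit patterns match, so interpreting the
// same bits as signed or unsigned gives the same answer.
constexpr bool eqSigned(std::uint32_t A, std::uint32_t B) {
  return static_cast<std::int32_t>(A) == static_cast<std::int32_t>(B);
}
constexpr bool eqUnsigned(std::uint32_t A, std::uint32_t B) { return A == B; }

// Ordering comparisons do depend on signedness.
constexpr bool ltSigned(std::uint32_t A, std::uint32_t B) {
  return static_cast<std::int32_t>(A) < static_cast<std::int32_t>(B);
}
constexpr bool ltUnsigned(std::uint32_t A, std::uint32_t B) { return A < B; }
```

With `A = 0xFFFFFFFF` and `B = 1`, both equality helpers agree, while `ltSigned` is true (-1 < 1) and `ltUnsigned` is false. This is why only the eq/ne forms collapse to a single generic comparison.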

Noticed while cleaning up vector integer comparison costs for PR40376.

llvm-svn: 351697

e1143c13

GlobalISel: Implement widenScalar for basic FP ops · 745fd9f5
Matt Arsenault authored Jan 20, 2019
```
llvm-svn: 351696
```
745fd9f5
AMDGPU/GlobalISel: Legalize f32->f16 fptrunc · cfd9e7f5
Matt Arsenault authored Jan 20, 2019
```
llvm-svn: 351695
```
cfd9e7f5

AMDGPU/GlobalISel: Fix some crashes in g_unmerge_values/g_merge_values · ff6a9a27

Matt Arsenault authored Jan 20, 2019

This was crashing in the predicate function assuming the value
is a vector.

Copy more of what AArch64 uses. This probably needs more refinement
later, but I don't exactly understand what it means in some cases,
particularly since any legalization for these seems to be missing.

llvm-svn: 351693

ff6a9a27

AMDGPU/GlobalISel: Regbank select for fpext · 2a2086b8
Matt Arsenault authored Jan 20, 2019
```
llvm-svn: 351692
```
2a2086b8
AMDGPU/GlobalISel: Cleanup legality for extensions · 24563ef6
Matt Arsenault authored Jan 20, 2019
```
llvm-svn: 351691
```
24563ef6

[X86] Auto upgrade old style VPCOM/VPCOMU intrinsics to generic integer comparisons · b590e4f7

Simon Pilgrim authored Jan 20, 2019

We were upgrading these to the new style VPCOM/VPCOMU intrinsics (which include the condition code immediate), but we'll be getting rid of those shortly, so convert these to generics first.

Noticed while cleaning up vector integer comparison costs for PR40376.

llvm-svn: 351690

b590e4f7

[X86] Replace VPCOM/VPCOMU with generic integer comparisons (llvm) · 4fd2459c

Simon Pilgrim authored Jan 20, 2019

These intrinsics can always be replaced with generic integer comparisons without any regression in codegen, even for -O0/-fast-isel cases.

Noticed while cleaning up vector integer comparison costs for PR40376.

A future commit will remove/autoupgrade the existing VPCOM/VPCOMU llvm intrinsics.

llvm-svn: 351688

4fd2459c

[CostModel][X86] Add explicit vector select costs · c934d3a0

Simon Pilgrim authored Jan 20, 2019

Prior to SSE41 (and sometimes on AVX1), vector select has to be performed as a ((X & C)|(Y & ~C)) bit select.
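
The bit-select formula can be illustrated on a single 32-bit lane. This is a scalar sketch, assuming the mask `C` is all-ones or all-zeros per selected element as a vector compare would produce. `bitSelect` is a hypothetical helper name, not an LLVM or intrinsic API.

```cpp
#include <cstdint>

// For every bit position, take the bit from X where C is 1 and from Y
// where C is 0: ((X & C) | (Y & ~C)).
constexpr std::uint32_t bitSelect(std::uint32_t C, std::uint32_t X,
                                  std::uint32_t Y) {
  return (X & C) | (Y & ~C);
}
```

An all-ones mask returns `X`, an all-zeros mask returns `Y`, and a mixed mask such as `0xFFFF0000` combines the upper half of `X` with the lower half of `Y`. The AND, ANDN, and OR steps suggest why the sequence costs more than a single blend instruction.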

Exposes a couple of issues with the min/max reduction costs (which only go down to SSE42 for some reason).

The increased pre-SSE41 selection costs also prevent a couple of tests from firing any longer, so I've either tweaked the target or added AVX tests as well to the existing SSE2 tests.

llvm-svn: 351685

c934d3a0

[CostModel][X86] Add explicit fcmp costs for pre-SSE42 targets · 1231904c
Simon Pilgrim authored Jan 20, 2019
```
Typical throughputs: cmpss/cmpps = 1cy and cmpsd/cmppd = 2cy before the Core2 era

llvm-svn: 351684
```
1231904c
[TTI][X86] Reordered getCmpSelInstrCost cost tables in descending ISA order. NFCI. · bf4b7702
Simon Pilgrim authored Jan 20, 2019
```
Minor tidyup to make it clearer what's going on before adding additional costs.

llvm-svn: 351683
```
bf4b7702
[CostModel][X86] Split icmp/fcmp costs tests and test all comparison codes · 60e5a3ac
Simon Pilgrim authored Jan 20, 2019
```
llvm-svn: 351682
```
60e5a3ac
[CostModel][X86] Add masked load/store/gather/scatter tests for SSE2/SSE42/AVX1 targets · 5d7182ec
Simon Pilgrim authored Jan 20, 2019
```
llvm-svn: 351681
```
5d7182ec
[CostModel][X86] Add non-constant vselect cost tests · a8b009fd
Simon Pilgrim authored Jan 20, 2019
```
Also add AVX512 costs at the same time

llvm-svn: 351680
```
a8b009fd

[AVR] Remove unneeded XFAILs from the Generic CodeGen tests · a6241a5d

Dylan McKay authored Jan 20, 2019

These have been in place for quite a while now.

Several bugs have since been fixed, and these tests now pass.

llvm-svn: 351679

a6241a5d

[AVR] Allow AVR to be explicitly set as the default target triple · c115d738

Dylan McKay authored Jan 20, 2019

This extends the CMake cross compilation logic so that AVR can be set as
the default target triple, and thus the generic codegen tests can be
run.

This used to be possible on AVR; the CMake configuration files have
since been changed.

With this patch, 'cmake -DLLVM_DEFAULT_TARGET_TRIPLE=avr-unknown-unknown' can
be passed on the command line, making the `-mcpu` argument redundant to
'llc' and friends.

llvm-svn: 351678

c115d738

[AVR] Replace two references to ARM's 't2_so_imm' type comments · cca7c733

Dylan McKay authored Jan 20, 2019

These were originally introduced in a copy-paste committed in r351526.

The references to 't2_so_imm' have been updated to 'imm_com8' so the
comments are now accurate.

Thanks to Eli Friedman for noticing this.

llvm-svn: 351674

cca7c733

[AVR] Fix codegen bug in 16-bit loads · 6afef286

Dylan McKay authored Jan 20, 2019

Prior to this patch, the AVR::LDWRdPtr instruction was always lowered to
instructions of this pattern:

    ld  $GPR8, [PTR:XYZ]+
    ld  $GPR8, [PTR]+1

This has a problem; the [PTR] is incremented in-place once, but never
decremented.

Future uses of the same pointer will use the now clobbered value,
leading to the pointer being incorrect by an offset of one.

This patch modifies the expansion code of the LDWRdPtr pseudo
instruction so that the pointer variable is not silently clobbered in
future uses in the same live range.

Bug first reported by Keshav Kini.

Patch by Kaushik Phatak.

llvm-svn: 351673

6afef286

Revert "[AVR] Fix codegen bug in 16-bit loads" · 52846ab0

Dylan McKay authored Jan 20, 2019

This reverts commit r351544.

In that commit, I had mistakenly attributed the patch to the issue
submitter rather than to its author, Kaushik Phatak.

The patch will be recommitted immediately with the correct attribution.

llvm-svn: 351672

52846ab0

[ConstantMerge] Factor out check for un-mergeable globals, NFC · 857cacd9
Vedant Kumar authored Jan 20, 2019
```
llvm-svn: 351671
```
857cacd9
make XFAIL, REQUIRES, and UNSUPPORTED support multi-line expressions · 1989f7e0
Eric Fiselier authored Jan 20, 2019
```
llvm-svn: 351668
```
1989f7e0

Jan 19, 2019

[X86] Add masked MCVTSI2P/MCVTUI2P ISD opcodes to model the cvtqq2ps cvtuqq2ps... · 4aa74fff

Craig Topper authored Jan 19, 2019

[X86] Add masked MCVTSI2P/MCVTUI2P ISD opcodes to model the cvtqq2ps cvtuqq2ps nodes that produce less than 128-bits of results.

These nodes zero the upper half of the result and can't be represented with vselect.

llvm-svn: 351666

4aa74fff

[llvm-objcopy] [COFF] Implement --only-section · e8305175
Martin Storsjö authored Jan 19, 2019
```
Differential Revision: https://reviews.llvm.org/D56873

llvm-svn: 351663
```
e8305175
[llvm-objcopy] [COFF] Implement --only-keep-debug · 1868d88b
Martin Storsjö authored Jan 19, 2019
```
Differential Revision: https://reviews.llvm.org/D56840

llvm-svn: 351662
```
1868d88b