Commits · edb12a838a22f212d7bee970ff637192f7ea8576 · Lorenzo Albano / LLVM bpEVL

Oct 15, 2018

[TI removal] Make variables declared as `TerminatorInst` and initialized · edb12a83

Chandler Carruth authored Oct 15, 2018

by `getTerminator()` calls instead be declared as `Instruction`.

This is the biggest remaining chunk of the usage of `getTerminator()`
that insists on the narrow type and so is an easy batch of updates.
Several files saw more extensive updates where this would cascade to
requiring API updates within the file to use `Instruction` instead of
`TerminatorInst`. All of these were trivial in nature (pervasively using
`Instruction` instead just worked).

llvm-svn: 344502

edb12a83

[TI removal] Remove `TerminatorInst` from GVN.h and GVN.cpp. · ae98759e

Chandler Carruth authored Oct 15, 2018

This is the last interesting usage in all of LLVM's headers. The
remaining usages in headers are the core typesystem bits (Core.h,
instruction types, and InstVisitor) and as the return of
`BasicBlock::getTerminator`. The latter is the big remaining API point
that I'll remove after mass updates to user code.

llvm-svn: 344501

ae98759e

[TI removal] Remove `TerminatorInst` from SparsePropagation.h and · ea36937a

Chandler Carruth authored Oct 15, 2018

related code.

This is simple as we just need to replace the type and move to the
concept of visiting a "terminator" rather than a specific instruction
subclass.

llvm-svn: 344500

ea36937a

[TI removal] Remove a dead forward declaration of TerminatorInst. NFC. · effbc5b1
Chandler Carruth authored Oct 15, 2018
```
llvm-svn: 344499
```
effbc5b1

[TI removal] Remove `TerminatorInst` from BasicBlockUtils.h · 4a2d58e1

Chandler Carruth authored Oct 15, 2018

This requires updating a number of .cpp files to adapt to the new API.
I've just systematically updated all uses of `TerminatorInst` within
these files to `Instruction` so that I won't have to touch them again in
the future.

llvm-svn: 344498

4a2d58e1

[TI removal] Just use Instruction in the CFG printer code. NFC. · f21ce5df
Chandler Carruth authored Oct 15, 2018
```
llvm-svn: 344497
```
f21ce5df
[llvm-exegesis] Fix missing std::move. · a3849490
Guillaume Chatelet authored Oct 15, 2018
```
llvm-svn: 344496
```
a3849490
[TI removal] Remove an unnecessary use of `TerminatorInst` from an IR · c5283c9e
Chandler Carruth authored Oct 15, 2018
```
header. NFC.

Part of the removal of `TerminatorInst` from the type hierarchy.

llvm-svn: 344495
```
c5283c9e

[TI removal] Remove TerminatorInst as an input parameter from all public · b99a2468

Chandler Carruth authored Oct 15, 2018

LLVM APIs. There weren't very many.

We still have the instruction visitor, and APIs with TerminatorInst as
a return type or an output parameter.

llvm-svn: 344494

b99a2468

[llvm-exegesis][NFC] Return many CodeTemplates instead of one. · 296a862c

Guillaume Chatelet authored Oct 15, 2018

Summary: This is part one of the change, where I simply changed the signature of the functions. More work needs to be done to actually produce more than one CodeTemplate per instruction.

Reviewers: courbet

Subscribers: tschuett, llvm-commits

Differential Revision: https://reviews.llvm.org/D53209

llvm-svn: 344493

296a862c

[TwoAddressInstructionPass] Replace subregister uses when processing tied operands · 06494435

Bjorn Pettersson authored Oct 15, 2018

Summary:
TwoAddressInstruction pass typically rewrites
  %1:short = foo %0.sub_lo:long
as
  %1:short = COPY %0.sub_lo:long
  %1:short = foo %1:short
when having tied operands.

If there are extra un-tied operands that use the same reg and
subreg, such as the second and third inputs to fie here:
  %1:short = fie %0.sub_lo:long, %0.sub_hi:long, %0.sub_lo:long
then there was a bug which replaced the register %0 also for
the un-tied operand, but without changing the subregister indices.
So we used to get:
  %1:short = COPY %0.sub_lo:long
  %1:short = fie %1, %1.sub_hi:short, %1.sub_lo:short
With this fix we instead get:
  %1:short = COPY %0.sub_lo:long
  %1:short = fie %1, %0.sub_hi:long, %1

Reviewers: arsenm, JesperAntonsson, kparzysz, MatzeB

Reviewed By: MatzeB

Subscribers: bjope, kparzysz, wdng, llvm-commits

Differential Revision: https://reviews.llvm.org/D36224

llvm-svn: 344492

06494435

[X86] Autogenerate checks. NFC · b44b22c6
Craig Topper authored Oct 15, 2018
```
llvm-svn: 344490
```
b44b22c6

[ORC] Simplify naming for JITDylib definition generators. · a5157d6f

Lang Hames authored Oct 15, 2018

Renames:
  JITDylib's setFallbackDefinitionGenerator method to setGenerator.
  DynamicLibraryFallbackGenerator class to DynamicLibrarySearchGenerator.
  ReexportsFallbackDefinitionGenerator to ReexportsGenerator.

llvm-svn: 344489

a5157d6f

[X86] Move promotion of vector and/or/xor from legalization to DAG combine · 06aea172

Craig Topper authored Oct 15, 2018

Summary:
I've noticed that the bitcasts we introduce for these make computeKnownBits and computeNumSignBits not work well in LegalizeVectorOps. LegalizeVectorOps legalizes bottom up while LegalizeDAG legalizes top down. The bottom up strategy for LegalizeVectorOps means operands are legalized before their uses. So we promote and/or/xor before we legalize the nodes that use them, making computeKnownBits/computeNumSignBits in places like LowerTruncate suboptimal. I looked at changing LegalizeVectorOps to be top down as well, but that was more disruptive and caused some regressions. I also looked at just moving promotion of binops to LegalizeDAG, but that had a few issues, one around matching AND, ANDN, OR into VSELECT: I had to create ANDN as vXi64, but the other nodes hadn't been legalized yet. I didn't look too hard at fixing that.

This patch seems to produce better results overall than my other attempts. We now form broadcasts of constants better in some cases. For at least some of them the AND was being introduced in LegalizeDAG, promoted to vXi64, and the BUILD_VECTOR was also legalized there. I think we got bad ordering of that. Now the promotion is out of the legalizer so we handle this better.

In the longer term I think we really should evaluate whether we should be doing this promotion at all. It's really there to reduce isel pattern count, but I'm wondering if we'd be better served just eating the pattern cost or doing C++ based isel for vector and/or/xor in X86ISelDAGToDAG. The masked and/or/xor will definitely be difficult in patterns if a bitcast gets between the vselect and the and/or/xor node. That becomes a lot of permutations to cover.

Reviewers: RKSimon, spatel

Reviewed By: RKSimon

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D53107

llvm-svn: 344487

06aea172

[X86] Add 128 MOVDDUP to the constant pool printing in X86AsmPrinter::EmitInstruction. · 67177945
Craig Topper authored Oct 15, 2018
```
We use this instruction to broadcast a single 64-bit value to a v2i64/v2f64 vector.

llvm-svn: 344486
```
67177945
[X86] Autogenerate complete checks. NFC · b5000974
Craig Topper authored Oct 15, 2018
```
llvm-svn: 344485
```
b5000974

Oct 14, 2018

[InstCombine] Add PR27343 test cases · f988639d
Simon Pilgrim authored Oct 14, 2018
```
llvm-svn: 344484
```
f988639d

[LV] Fix comments reported when not vectorizing single iteration loops; NFC · e567b5b5

Ayal Zaks authored Oct 14, 2018

Landing this as a separate part of https://reviews.llvm.org/D50480
([LV] Vectorizing loops of arbitrary trip count without remainder under opt
for size), since it is a seemingly unrelated change.

llvm-svn: 344483

e567b5b5

[X86][AVX] Enable lowerVectorShuffleAsLanePermuteAndPermute v16i16/v32i8 shuffle lowering · 861cd0ba
Simon Pilgrim authored Oct 14, 2018
```
Extends D53148 from v4f64 now that we have test coverage for v16i16/v32i8 shuffles.

llvm-svn: 344481
```
861cd0ba
[ARM] Regenerate cttz tests · 9afb1e66
Simon Pilgrim authored Oct 14, 2018
```
Improve codegen view as part of PR32655

llvm-svn: 344479
```
9afb1e66
[ORC] Remove XXLayer::add methods that default to using the main JITDylib. · 9d2014c6
Lang Hames authored Oct 14, 2018
```
They're not currently used and may complicate upcoming changes to add's
signature and behavior.

llvm-svn: 344478
```
9d2014c6

[LegalizeDAG] Don't bother with final MUL+SRL stage for byte CTPOP. · a0590a4f

Simon Pilgrim authored Oct 14, 2018

The final stage of CTPOP expansion (v = (v * 0x01010101...) >> (Len - 8)) is completely pointless for the byte (Len = 8) case as it reduces to (v = (v * 0x01...) >> 0), but annoyingly this doesn't always get optimized away.

Found while investigating generic vector CTPOP expansion (PR32655).

llvm-svn: 344477

a0590a4f
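
The reasoning in a0590a4f can be checked with a small model of the bit-parallel popcount sequence. This is a sketch only, not the LegalizeDAG code: `expandCtpop`, `lowMask`, and `splat` are hypothetical names, the value is modelled in a `uint64_t` truncated to `len` bits, and `len` is assumed to be a multiple of 8 between 8 and 64.

```cpp
#include <cstdint>

// Hypothetical helper: mask selecting the low `len` bits (len in 8..64).
constexpr std::uint64_t lowMask(unsigned len) {
  return len >= 64 ? ~std::uint64_t{0} : ((std::uint64_t{1} << len) - 1);
}

// Hypothetical helper: `byte` repeated in every byte of a len-bit value,
// e.g. splat(0x55, 32) == 0x55555555.
constexpr std::uint64_t splat(std::uint8_t byte, unsigned len) {
  return (0x0101010101010101ULL * byte) & lowMask(len);
}

// Model of the classic popcount expansion on a len-bit value.
// `finalStage` toggles the trailing (v * 0x0101...) >> (len - 8) step.
constexpr std::uint64_t expandCtpop(std::uint64_t v, unsigned len,
                                    bool finalStage) {
  v &= lowMask(len);
  // Count bits in each 2-bit field.
  v = v - ((v >> 1) & splat(0x55, len));
  // Sum adjacent 2-bit counts into 4-bit fields.
  v = (v & splat(0x33, len)) + ((v >> 2) & splat(0x33, len));
  // Sum adjacent 4-bit counts into per-byte counts.
  v = (v + (v >> 4)) & splat(0x0F, len);
  if (finalStage) {
    // Add all byte counts into the top byte, then shift it down.
    v = ((v * splat(0x01, len)) & lowMask(len)) >> (len - 8);
  }
  return v;
}
```

With `len == 8` the multiplier is 0x01 and the shift is 0, so `expandCtpop(0xB5, 8, true)` and `expandCtpop(0xB5, 8, false)` both yield 5. For wider types the final stage is still needed: `expandCtpop(0xFFFFFFFF, 32, false)` leaves the per-byte counts 0x08080808, while the full sequence yields 32.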

[InstCombine] combine a shuffle and an extract subvector shuffle · 7181146c

Sanjay Patel authored Oct 14, 2018

This is part of the missing IR-level folding noted in D52912.
This should be ok as a canonicalization because the new shuffle mask can't
be any more complicated than the existing shuffle mask. If there's some
target where the shorter vector shuffle is not legal, it should just end up
expanding to something like the pair of shuffles that we're starting with here.

Differential Revision: https://reviews.llvm.org/D53037

llvm-svn: 344476

7181146c
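
As a rough illustration of why the combined mask in 7181146c is no more complicated than the original one, the sketch below composes an inner shuffle mask with an extract-subvector mask lane by lane. It is a model under stated assumptions, not the InstCombine code: `composedLane`, `kUndef`, and the two example masks are hypothetical, and undef lanes are represented as -1.

```cpp
#include <array>
#include <cstddef>

// Hypothetical marker for an undef mask element.
constexpr int kUndef = -1;

// Lane `i` of the single shuffle equivalent to applying `outer` (a
// one-source extract mask) to the result of a shuffle with mask `inner`.
template <std::size_t N, std::size_t M>
constexpr int composedLane(const std::array<int, N> &inner,
                           const std::array<int, M> &outer, std::size_t i) {
  return outer[i] == kUndef
             ? kUndef
             : inner[static_cast<std::size_t>(outer[i])];
}

// Example masks (made up for illustration): an 8-lane shuffle of X and Y,
// followed by an extract of its low 4 lanes with one undef lane.
constexpr std::array<int, 8> kInnerMask{{0, 9, 2, kUndef, 4, 13, 6, 15}};
constexpr std::array<int, 4> kExtractMask{{0, kUndef, 2, 3}};
```

For these masks the composed lanes are 0, undef, 2, undef: each lane is either copied from the inner mask or undef, so the new mask is a subset of the existing one.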

recommit 344472 after fixing build failure on ARM and PPC. · 38bbf81a
Dorit Nuzman authored Oct 14, 2018
```
llvm-svn: 344475
```
38bbf81a
revert 344472 due to failures. · 5118c68c
Dorit Nuzman authored Oct 14, 2018
```
llvm-svn: 344473
```
5118c68c

[IAI,LV] Add support for vectorizing predicated strided accesses using masked · 81743689

Dorit Nuzman authored Oct 14, 2018

interleave-group

The vectorizer currently does not attempt to create interleave-groups that
contain predicated loads/stores; predicated strided accesses can currently be
vectorized only using masked gather/scatter or scalarization. This patch makes
predicated loads/stores candidates for forming interleave-groups during the
Loop-Vectorizer's analysis, and adds the proper support for masked-interleave-
groups to the Loop-Vectorizer's planning and transformation stages. The patch
also extends the TTI API to allow querying the cost of masked interleave groups
(which each target can control); targets that support masked vector
loads/stores may choose to enable this feature and allow vectorizing predicated
strided loads/stores using masked wide loads/stores and shuffles.

Reviewers: Ayal, hsaito, dcaballe, fhahn, javed.absar

Reviewed By: Ayal

Differential Revision: https://reviews.llvm.org/D53011

llvm-svn: 344472

81743689

[X86] Fix bad indentation. NFC · 20fa085d
Craig Topper authored Oct 14, 2018
```
llvm-svn: 344471
```
20fa085d

[X86] Type legalize v2f32 stores by widening to v4f32, casting to v2f64,... · ec4b75f4

Craig Topper authored Oct 14, 2018

[X86] Type legalize v2f32 stores by widening to v4f32, casting to v2f64, extracting f64 and storing.

Summary: This is similar to what D52528 did for loads. It should match what generic type legalization does in 64-bit mode where it uses a v2i64 cast and an i64 store.

Reviewers: RKSimon, spatel

Reviewed By: RKSimon

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D53173

llvm-svn: 344470

ec4b75f4

Move some helpers from the global namespace into anonymous ones. · c55e9975
Benjamin Kramer authored Oct 13, 2018
```
llvm-svn: 344468
```
c55e9975

Oct 13, 2018

[ORC] During lookup, do not match against hidden symbols in other JITDylibs. · 7899ccbc

Lang Hames authored Oct 13, 2018

This adds two arguments to the main ExecutionSession::lookup method:
MatchNonExportedInJD, and MatchNonExported. These control whether and where
hidden symbols should be matched when searching a list of JITDylibs.

A similar effect could have been achieved by filtering search results, but
this would have involved materializing symbol definitions (since materialization
is triggered on lookup) only to throw the results away, among other issues.

llvm-svn: 344467

7899ccbc

[AARCH64] Regenerate popcnt tests · 2ac03ec2
Simon Pilgrim authored Oct 13, 2018
```
Improve codegen view as part of PR32655

llvm-svn: 344466
```
2ac03ec2
[ARM] Regenerate popcnt tests · 247ea880
Simon Pilgrim authored Oct 13, 2018
```
Improve codegen view as part of PR32655

llvm-svn: 344465
```
247ea880

Pull out repeated variables from SelectionDAGLegalize::ExpandBitCount. · 28a143f7

Simon Pilgrim authored Oct 13, 2018

The CTPOP case has been changed from VT.getSizeInBits to VT.getScalarSizeInBits - but this fits in with future work for vector support (PR32655) and doesn't affect any current (scalar) uses.

llvm-svn: 344461

28a143f7

[LegalizeTypes] Prevent an assertion from PromoteIntRes_BSWAP and... · 189e5b4a

Craig Topper authored Oct 13, 2018

[LegalizeTypes] Prevent an assertion from PromoteIntRes_BSWAP and PromoteIntRes_BITREVERSE if the shift amount is too large for the VT returned by getShiftAmountTy

Summary:
getShiftAmountTy for X86 returns MVT::i8. If a BSWAP or BITREVERSE is created that requires promotion and the difference between the original VT and the promoted VT is more than 255, then we won't be able to create the constant.

This patch adds a check to replace the result from getShiftAmountTy with MVT::i32 if the difference won't fit. This should get legalized later when the shift is ultimately expanded, since it's clearly an illegal type that we're only promoting to make it a power of 2 bit width. Alternatively we could base the decision completely on the largest shift amount the promoted VT could use.

Vectors should be immune here because getShiftAmountTy always returns the incoming VT for vectors. Only the scalar shift amount can be changed by the targets.

Reviewers: eli.friedman, RKSimon, spatel

Reviewed By: RKSimon

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D53232

llvm-svn: 344460

189e5b4a
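
The check described in 189e5b4a can be modelled as below. This is a minimal sketch, not the LegalizeTypes code: `fitsInBits` and `pickShiftAmountBits` are hypothetical names, and types are reduced to plain bit widths (8 standing in for MVT::i8, 32 for MVT::i32).

```cpp
#include <cstdint>

// Hypothetical helper: true if `value` is representable as an unsigned
// integer of `bits` bits.
constexpr bool fitsInBits(std::uint64_t value, unsigned bits) {
  return bits >= 64 || value < (std::uint64_t{1} << bits);
}

// Model of the decision: the promoted BSWAP/BITREVERSE result is shifted by
// (promotedBits - origBits), so that constant must fit in the shift amount
// type the target reports. If it does not, fall back to a 32-bit amount.
constexpr unsigned pickShiftAmountBits(unsigned origBits,
                                       unsigned promotedBits,
                                       unsigned targetShiftBits) {
  return fitsInBits(promotedBits - origBits, targetShiftBits)
             ? targetShiftBits
             : 32;
}
```

For example, promoting a 16-bit value to 32 bits needs a shift by 16, which fits in 8 bits, so `pickShiftAmountBits(16, 32, 8)` keeps 8. Promoting an 8-bit value to 512 bits needs a shift by 504, which exceeds 255, so `pickShiftAmountBits(8, 512, 8)` returns 32.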

[WebAssembly][NFC] Fix signed/unsigned comparison warning · ffde98de
Thomas Lively authored Oct 13, 2018
```
llvm-svn: 344459
```
ffde98de

[InstCombine] fix complexity canonicalization with fake unary vector ops · 47579b21

Sanjay Patel authored Oct 13, 2018

This is a preliminary step to avoid regressions when we add
an actual 'fneg' instruction to IR. See D52934 and D53205.

llvm-svn: 344458

47579b21

[X86][SSE] Remove most of vector CTTZ custom lowering and use LegalizeDAG instead. · c5d7c6e5

Simon Pilgrim authored Oct 13, 2018

There is one remnant - AVX1 custom splitting of 256-bit vectors - which is due to a regression where the X86ISD::ANDNP is still performed as a YMM.

I've also tightened the CTLZ or CTPOP lowering in SelectionDAGLegalize::ExpandBitCount to require a legal CTLZ - it doesn't affect existing users and fixes an issue with AVX512 codegen.

llvm-svn: 344457

c5d7c6e5

[InstCombine] add tests for operand complexity canonicalization; NFC · 3d22fbd7
Sanjay Patel authored Oct 13, 2018
```
The tests with undef vector elements demonstrate a hole in 
the current pattern matching.

llvm-svn: 344456
```
3d22fbd7
[NFC] Fixed duplicated test file · d463e047
David Bolvansky authored Oct 13, 2018
```
llvm-svn: 344455
```
d463e047

[InstCombine] Fixed crash with aliased functions · e8b3bba7

David Bolvansky authored Oct 13, 2018

Summary: Fixes PR39177

Reviewers: spatel, jbuening

Reviewed By: jbuening

Subscribers: jbuening, llvm-commits

Differential Revision: https://reviews.llvm.org/D53129

llvm-svn: 344454

e8b3bba7