Commits · 3fbacd4964edb44bce797de8fe248512a835524c · Lorenzo Albano / LLVM bpEVL

Feb 11, 2019

[NFC][ARM] Simplify loop-indexing codegen test · 3fbacd49

Sam Parker authored Feb 11, 2019

Remove unnecessary offset checks, CHECK-BASE checks and add some
extra -NOT checks and TODO comments.

llvm-svn: 353689

3fbacd49

[ARM] LoadStoreOptimizer: reorder limit · 150ccb88

Sjoerd Meijer authored Feb 11, 2019

The whole design of generating LDMs/STMs is fragile and unreliable: it depends on
rescheduling here in the LoadStoreOptimizer that isn't register pressure aware
and regalloc that isn't aware of generating LDMs/STMs.
This patch adds a (hidden) option to control the total number of instructions that
can be re-ordered. I appreciate this looks only a tiny bit better than a hard-coded
constant, but at least it allows easier experimentation with different values
for now. Ideally we calculate this reorder limit based on some heuristics, and take
register pressure into account. I might be looking into that next.

Differential Revision: https://reviews.llvm.org/D57954

llvm-svn: 353678

150ccb88

Feb 08, 2019

[DAGCombine] Optimize pow(X, 0.75) to sqrt(X) * sqrt(sqrt(X)) · 92a8c367

Nemanja Ivanovic authored Feb 08, 2019

The sqrt case is faster and we already do this for the case where
the exponent is 0.25. This adds the 0.75 case which is also not
sensitive to signed zeros.

Patch by Whitney Tsang (Whitney)

Differential revision: https://reviews.llvm.org/D57434

llvm-svn: 353557

92a8c367
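The rewrite in the commit above rests on the identity x^(3/4) = x^(1/2) * x^(1/4). A minimal Python sketch of that identity follows; it is only a numeric illustration, not the DAGCombine code, and the helper name `pow_three_quarters` is hypothetical.

```python
import math


def pow_three_quarters(x: float) -> float:
    """Compute x**0.75 as sqrt(x) * sqrt(sqrt(x)) for non-negative x."""
    root = math.sqrt(x)            # x^(1/2)
    return root * math.sqrt(root)  # x^(1/2) * x^(1/4) = x^(3/4)


# Compare against the generic pow on a few sample inputs.
for sample in (0.0, 1.0, 2.0, 16.0, 81.0, 1e6):
    assert math.isclose(pow_three_quarters(sample), sample ** 0.75, rel_tol=1e-12)
```

Two square roots and a multiply replace a call to a general pow routine, which is where the speed-up mentioned in the commit message would come from.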

Feb 07, 2019

[LSR] Generate cross iteration indexes · 67756c09

Sam Parker authored Feb 07, 2019

    
Modify GenerateConstantOffsetsImpl to create offsets that can be used
by indexed addressing modes. If formulae can be generated which
result in the constant offset being the same size as the recurrence,
we can generate a pre-indexed access. This allows the pointer to be
updated via the single pre-indexed access so that (hopefully) no
add/subs are required to update it for the next iteration. For small
cores, this can significantly improve the performance of DSP-like loops.

Differential Revision: https://reviews.llvm.org/D55373

llvm-svn: 353403

67756c09

[ARM GlobalISel] Support G_ICMP for Thumb2 · 75a04e2a

Diana Picus authored Feb 07, 2019

Mark as legal and use the t2* equivalents of the arm mode instructions,
e.g. t2CMPrr instead of plain CMPrr.

llvm-svn: 353392

75a04e2a

Feb 05, 2019

[ARM GlobalISel] Support G_GEP for Thumb2 · e24b104a

Diana Picus authored Feb 05, 2019

Same as ARM, but use a different opcode in the instruction selection.

llvm-svn: 353151

e24b104a

GlobalISel: Enforce operand types for constants · 1f795e2c

Matt Arsenault authored Feb 04, 2019

A number of tests were using imm operands, not cimm. Since CSE
relies on the exact ConstantInt* pointer used, and implicit
conversions are generally evil, also enforce the bitsize of the types.

llvm-svn: 353113

1f795e2c

Feb 01, 2019

[CodeGen] Don't scavenge non-saved regs in exception throwing functions · bac11518

Oliver Stannard authored Feb 01, 2019

Previously, LiveRegUnits was assuming that if a block has no successors
and does not return, then no registers are live at the end of it
(because the end of the block is unreachable). This was causing the
register scavenger to use callee-saved registers to materialise stack
frame addresses without saving them in the prologue. This would normally
be fine, because the end of the block is unreachable, but this is not
legal if the block ends by throwing a C++ exception. If this happens,
the scratch register will be modified, but its previous value won't be
preserved, so it doesn't get restored by the exception unwinder.

Differential revision: https://reviews.llvm.org/D57381

llvm-svn: 352844

bac11518

Jan 31, 2019

[ARM] Thumb2: ConstantMaterializationCost · f222259c

Sjoerd Meijer authored Jan 31, 2019

Constants can also be materialised using the negated value and a MVN, and this
case seems to have been missed for Thumb2. To check the constant materialisation
costs, we now call getT2SOImmVal twice, once for the original constant and then
also for its negated value, and this function checks if the constant can be
either splatted or rotated.

This was revealed by a test that optimises for minsize: instead of a LDR
literal pool load and having a literal pool entry, just a MVN with an immediate
is smaller (and also faster).

Differential Revision: https://reviews.llvm.org/D57327

llvm-svn: 352737

f222259c
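The "splatted or rotated" check in the commit above can be sketched in Python. This is a simplified model of the Thumb2 modified-immediate encoding, not the LLVM implementation of getT2SOImmVal. It assumes that "negated" means the bitwise complement, which is what MVN writes. The names `is_t2_modified_imm` and `single_instruction_materialisation` are hypothetical.

```python
MASK32 = 0xFFFFFFFF


def is_t2_modified_imm(value: int) -> bool:
    """Return True if value fits a Thumb2 modified immediate (model only)."""
    v = value & MASK32
    low = v & 0xFF
    if v == low:                            # 0x000000XY
        return True
    if v == (low | (low << 16)):            # 0x00XY00XY
        return True
    high = (v >> 8) & 0xFF
    if v == ((high << 8) | (high << 24)):   # 0xXY00XY00
        return True
    if v == low * 0x01010101:               # 0xXYXYXYXY
        return True
    # Rotated form: an 8-bit value with bit 7 set, rotated right by 8..31.
    for rot in range(8, 32):
        unrotated = ((v << rot) | (v >> (32 - rot))) & MASK32  # rotate left
        if unrotated <= 0xFF and (unrotated & 0x80):
            return True
    return False


def single_instruction_materialisation(value: int):
    """Return 'MOV', 'MVN' or None depending on which single move works."""
    v = value & MASK32
    if is_t2_modified_imm(v):
        return "MOV"
    if is_t2_modified_imm(~v & MASK32):     # assumption: complement, as MVN does
        return "MVN"
    return None
```

For example, 0xFFFFFF00 is neither a splat nor a rotated 8-bit value, but its complement 0xFF is, so a single MVN suffices and no literal pool entry is needed.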

[SelectionDAG] Codesize: don't expand SHIFT to SHIFT_PARTS · f7cc34ca

Sjoerd Meijer authored Jan 31, 2019

And instead just generate a libcall. My motivating example on ARM was a simple:
  
  shl i64 %A, %B

for which the code bloat is quite significant. For other targets that also
accept __int128/i128 such as AArch64 and X86, it is also beneficial for these
cases to generate a libcall when optimising for minsize. On these 64-bit targets,
the 64-bit shifts are of course unaffected because the SHIFT/SHIFT_PARTS
lowering operation action is not set to custom/expand.

Differential Revision: https://reviews.llvm.org/D57386

llvm-svn: 352736

f7cc34ca
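To see why expanding a 64-bit shift on a 32-bit target costs code size, here is a hedged Python sketch of what a two-register left shift has to compute. It models the arithmetic only and is not the sequence LLVM emits; `shl_parts` and `shl64_reference` are hypothetical names.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF


def shl_parts(lo: int, hi: int, amount: int):
    """Shift the 64-bit value hi:lo left by amount using 32-bit pieces."""
    assert 0 <= amount < 64
    if amount == 0:
        return lo, hi
    if amount < 32:
        # Bits shifted out of the low word move into the high word.
        new_hi = ((hi << amount) | (lo >> (32 - amount))) & MASK32
        new_lo = (lo << amount) & MASK32
    else:
        # The low word moves entirely into the high word.
        new_hi = (lo << (amount - 32)) & MASK32
        new_lo = 0
    return new_lo, new_hi


def shl64_reference(value: int, amount: int) -> int:
    """Plain 64-bit shift, used to check the two-register version."""
    return (value << amount) & MASK64
```

Each branch needs several shifts, an OR and a comparison on the shift amount, whereas a libcall is a single call instruction at the call site.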

GlobalISel: Fix creating MMOs with align 0 · 2a64598e

Matt Arsenault authored Jan 31, 2019

llvm-svn: 352712

2a64598e

MIR: Reject non-power-of-2 alignments in MMO parsing · 547a83b4

Matt Arsenault authored Jan 30, 2019

llvm-svn: 352686

547a83b4

Jan 29, 2019

[ARM] Use sub for negative offset load/store in thumb1 · 54b01155

David Green authored Jan 29, 2019

This attempts to optimise negative values used in load/store operands
a little. We currently try to select them as rr, materialising the
negative constant using a MOV/MVN pair. This instead selects ri with
an immediate of 0, forcing the add node to become a simpler sub.

Differential Revision: https://reviews.llvm.org/D57121

llvm-svn: 352475

54b01155

[ARM] Add extra testcases for D57121. NFC · 5c33c5da

David Green authored Jan 29, 2019

llvm-svn: 352472

5c33c5da

Jan 28, 2019

[ARM GlobalISel] Support integer division for Thumb2 · 574e0c5e

Diana Picus authored Jan 28, 2019

Support G_SDIV, G_UDIV, G_SREM and G_UREM.

The only significant difference between arm and thumb mode is that we
need to check a different subtarget feature.

llvm-svn: 352346

574e0c5e

Jan 25, 2019

[ARM GlobalISel] Support shifts for Thumb2 · 8976ad12

Diana Picus authored Jan 25, 2019

Same as ARM.

On this occasion we split some of the instruction select tests for more
complicated instructions into their own files, so we can reuse them for
ARM and Thumb mode. Likewise for the legalizer tests.

llvm-svn: 352188

8976ad12

[GISel]: Change how CSE is enabled by default for each pass · 3ba0d94b

Aditya Nandakumar authored Jan 24, 2019

https://reviews.llvm.org/D57178

Now add a hook in TargetPassConfig to query if CSE needs to be
enabled. By default this hook returns false only for O0 opt level but
this can be overridden by the target.
Because CSE is now enabled by default above O0, a few tests needed to
be updated to not use CSE (by passing -O0 on the run line).

reviewed by: arsenm

llvm-svn: 352126

3ba0d94b

Jan 23, 2019

[ARM][CGP] Check trunc type before replacing · 31bef63b

Sam Parker authored Jan 23, 2019

In the last stage of type promotion, we replace any zext that uses a
new trunc with the operand of the trunc. This was okay when we only
allowed one type to be optimised, but now it's the case that the trunc
may be needed to produce a narrower type than the one we were
optimising for. So we need to check this before doing the replacement.

Differential Revision: https://reviews.llvm.org/D57041

llvm-svn: 351935

31bef63b

[DAGCombine] Enable more pre-indexed stores · 9a2a89d5

Sam Parker authored Jan 23, 2019

    
The current check in CombineToPreIndexedLoadStore is too
conservative, preventing a pre-indexed store when the base pointer
is a predecessor of the value being stored. Instead, we should check
the pointer operand of the store.

Differential Revision: https://reviews.llvm.org/D56719

llvm-svn: 351933

9a2a89d5

Jan 17, 2019

[ARM GlobalISel] Allow calls to varargs functions · d5c2499a

Diana Picus authored Jan 17, 2019

Allow varargs functions to be called, both in arm and thumb mode. This
boils down to choosing the correct calling convention, which we can
easily test by making sure arm_aapcscc is used instead of
arm_aapcs_vfpcc when the callee is variadic.

llvm-svn: 351424

d5c2499a

Jan 16, 2019

[DAGCombine] Fix ReduceLoadWidth for shifted offsets · dd8cd6d2

Sam Parker authored Jan 16, 2019

ReduceLoadWidth can trigger when a shifted mask is used and this
requires that the function return a shl node to correct for the
offset. However, the way that this was implemented meant that the
returned result could be an existing node, which would be incorrect.
This fixes the method of inserting the new node and replacing uses.

Differential Revision: https://reviews.llvm.org/D50432

llvm-svn: 351310

dd8cd6d2
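The shl node mentioned in the commit above exists because a narrowed load reads the interesting bytes at bit position 0, while the masked wide load leaves them at their original position. A small Python model of that equivalence follows, assuming a little-endian layout; the function names are hypothetical and this is not the DAGCombine code.

```python
def wide_load_then_mask(memory: bytes, mask: int) -> int:
    """Load 32 bits (little-endian) and apply a shifted mask."""
    return int.from_bytes(memory[:4], "little") & mask


def narrow_load_then_shift(memory: bytes, byte_offset: int, width_bytes: int) -> int:
    """Load only the bytes the mask keeps, then shift them back into place."""
    narrow = int.from_bytes(memory[byte_offset:byte_offset + width_bytes], "little")
    return narrow << (8 * byte_offset)  # the shl that corrects for the offset


memory = bytes([0x11, 0x22, 0x33, 0x44])
# Mask 0xFF00 keeps byte 1, so a one-byte load at offset 1 shifted left by 8 matches.
assert wide_load_then_mask(memory, 0xFF00) == narrow_load_then_shift(memory, 1, 1)
```

Without the final shift the narrow load would yield 0x22 rather than 0x2200, which is why the combine has to return the shl rather than the load itself.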

Jan 15, 2019

Remove irrelevant references to legacy git repositories from · 693d39dd

James Y Knight authored Jan 15, 2019

compiler identification lines in test-cases.

(Doing so only because it's then easier to search for references which
are actually important and need fixing.)

llvm-svn: 351200

693d39dd

Jan 14, 2019

[ARM GlobalISel] Import MOVi32imm into GlobalISel · 8987d006

Diana Picus authored Jan 14, 2019

Make it possible for TableGen to produce code for selecting MOVi32imm.
This allows reasonably recent ARM targets to select a lot more constants
than before.

We achieve this by adding GISelPredicateCode to arm_i32imm. It's
impossible to use the exact same code for both DAGISel and GlobalISel,
since one uses "Subtarget->" and the other "STI." to refer to the
subtarget. Moreover, in GlobalISel we don't have ready access to the
MachineFunction, so we need to add a bit of code for obtaining it from
the instruction that we're selecting. This is also the reason why it
needs to remain a PatLeaf instead of the more specific IntImmLeaf.

llvm-svn: 351056

8987d006

Replace "no-frame-pointer-*" function attributes with "frame-pointer" · b7cef81f

Francis Visoiu Mistrih authored Jan 14, 2019

Part of the effort to refactor frame pointer code generation. We used
to use two function attributes "no-frame-pointer-elim" and
"no-frame-pointer-elim-non-leaf" to represent three kinds of frame
pointer usage: (all) frames use frame pointer, (non-leaf) frames use
frame pointer, (none) no frames use frame pointer. This CL makes the idea
explicit by using only one enum function attribute "frame-pointer".

Option "-frame-pointer=" replaces "-disable-fp-elim" for tools such as
llc.

"no-frame-pointer-elim" and "no-frame-pointer-elim-non-leaf" are still
supported for easy migration to "frame-pointer".

tests are mostly updated with

// replace command line args ‘-disable-fp-elim=false’ with ‘-frame-pointer=none’
grep -iIrnl '\-disable-fp-elim=false' * | xargs sed -i '' -e "s/-disable-fp-elim=false/-frame-pointer=none/g"

// replace command line args ‘-disable-fp-elim’ with ‘-frame-pointer=all’
grep -iIrnl '\-disable-fp-elim' * | xargs sed -i '' -e "s/-disable-fp-elim/-frame-pointer=all/g"

Patch by Yuanfang Chen (tabloid.adroit)!

Differential Revision: https://reviews.llvm.org/D56351

llvm-svn: 351049

b7cef81f
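The two sed substitutions quoted in the commit above have to run in that order, because the second pattern is a prefix of the first. A minimal Python sketch of the same rewrite, applied to a single run line, is below; the helper name `migrate_run_line` is hypothetical and the script is not part of the patch.

```python
def migrate_run_line(line: str) -> str:
    """Rewrite legacy -disable-fp-elim flags to the -frame-pointer= form."""
    # Handle the explicit "=false" spelling first; otherwise the shorter
    # pattern below would match its prefix and leave a stray "=false".
    line = line.replace("-disable-fp-elim=false", "-frame-pointer=none")
    line = line.replace("-disable-fp-elim", "-frame-pointer=all")
    return line
```

For instance, `migrate_run_line("llc -disable-fp-elim=false < %s")` gives `llc -frame-pointer=none < %s`.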

Jan 11, 2019

[AArch64] Create feature set for Exynos M4 · 06747621

Evandro Menezes authored Jan 11, 2019

Complete the feature set for Exynos M4 and update test cases.

llvm-svn: 350953

06747621

Jan 08, 2019

[ARM] Add missing patterns for DSP muls · 53000a74

Sam Parker authored Jan 08, 2019

Using a PatLeaf for sext_16_node allowed matching smulbb and smlabb
instructions once the operands had been sign extended. But we also
need to use sext_inreg operands along with sext_16_node to catch a
few more cases that enable us to remove the unnecessary sxth.

Differential Revision: https://reviews.llvm.org/D55992

llvm-svn: 350613

53000a74

Jan 07, 2019

[ARM] ComputeKnownBits to handle extract vectors · f192cdb5

Diogo N. Sampaio authored Jan 07, 2019

This patch adds the sign/zero extension done by
vgetlane to ARM computeKnownBitsForTargetNode.

Differential revision: https://reviews.llvm.org/D56098

llvm-svn: 350553

f192cdb5

Regenerate test. · 6aac0ec2

Simon Pilgrim authored Jan 07, 2019

Prep work towards enabling SimplifyDemandedBits vector support for TRUNCATE as discussed on D56118.

llvm-svn: 350514

6aac0ec2

Dec 21, 2018

[ARM] Set Defs = [CPSR] for COPY_STRUCT_BYVAL, as it clobbers CPSR. · 8c9f865e

Florian Hahn authored Dec 21, 2018

Fixes PR35023.

Reviewers: MatzeB, t.p.northover, sunfish, qcolombet, efriedma

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D55909

llvm-svn: 349935

8c9f865e

Dec 19, 2018

[ARM GlobalISel] Support G_CONSTANT for Thumb2 · 6c35a1e5

Diana Picus authored Dec 19, 2018

All we have to do is mark it as legal.

This allows us to select a lot of new patterns handled by TableGen. This
patch adds tests for them and splits up the existing test file for
binary operators into 2 files, one for arithmetic ops and one for
logical ones.

llvm-svn: 349610

6c35a1e5

Dec 17, 2018

ARM: use acquire/release instruction variants when available. · ae3b66b7

Tim Northover authored Dec 17, 2018

These features (fairly) recently got split out into their own feature, so we
should make CodeGen use them when available. The main change here is that the
check used to be based on the triple, but now it's based on CPU features.

llvm-svn: 349355

ae3b66b7

Dec 16, 2018

[DAGCombiner] allow hoisting vector bitwise logic ahead of truncates · f24900b9

Sanjay Patel authored Dec 16, 2018

The transform performs a bitwise logic op in a wider type followed by
truncate when both inputs are truncated from the same source type:
logic_op (truncate x), (truncate y) --> truncate (logic_op x, y)

There are a bunch of other checks that should prevent doing this when 
it might be harmful.

We already do this transform for scalars in this spot. The vector 
limitation was shared with a check for the case when the operands are 
extended. I'm not sure if that limit is needed either, but that would 
be a separate patch.

Differential Revision: https://reviews.llvm.org/D55448

llvm-svn: 349303

f24900b9
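The transform in the commit above is sound because bitwise logic acts on each bit independently, so discarding the high bits before or after the operation gives the same result. A short Python check of that property on unsigned integers follows; the names are hypothetical and it says nothing about the profitability checks the commit mentions.

```python
import operator


def truncate(value: int, bits: int) -> int:
    """Keep only the low `bits` bits of a non-negative integer."""
    return value & ((1 << bits) - 1)


LOGIC_OPS = (operator.and_, operator.or_, operator.xor)


def logic_on_truncated(op, x: int, y: int, bits: int) -> int:
    """logic_op (truncate x), (truncate y)"""
    return op(truncate(x, bits), truncate(y, bits))


def truncate_of_logic(op, x: int, y: int, bits: int) -> int:
    """truncate (logic_op x, y)"""
    return truncate(op(x, y), bits)


for op in LOGIC_OPS:
    assert logic_on_truncated(op, 0x12345678, 0x9ABCDEF0, 16) == \
        truncate_of_logic(op, 0x12345678, 0x9ABCDEF0, 16)
```

The second form performs one truncate instead of two, at the cost of doing the logic op in the wider type.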

Dec 14, 2018

[ARM] make test immune to scalarization improvements; NFC · b7e2d6e4

Sanjay Patel authored Dec 14, 2018

llvm-svn: 349177

b7e2d6e4

[ARM GlobalISel] Thumb2: casts between int and ptr · 02c8343c

Diana Picus authored Dec 14, 2018

Mark as legal and add tests. Nothing special to do.

llvm-svn: 349147

02c8343c

[ARM GlobalISel] Remove duplicate test. NFCI · acca60b4

Diana Picus authored Dec 14, 2018

Fixup for r349026. I forgot to delete these test functions from the
original file when I moved them to arm-legalize-exts.mir.

llvm-svn: 349146

acca60b4

[ARM GlobalISel] Allow simple binary ops in Thumb2 · 14dc3b29

Diana Picus authored Dec 14, 2018

Mark G_ADD, G_SUB, G_MUL, G_AND, G_OR and G_XOR as legal for both ARM
and Thumb2.

Extract the legalizer tests for these opcodes into another file.

Add tests for the instruction selector.

llvm-svn: 349142

14dc3b29

Dec 13, 2018

[ARM GlobalISel] Support exts and truncs for Thumb2 · 99cd644b

Diana Picus authored Dec 13, 2018

Mark G_SEXT, G_ZEXT and G_ANYEXT to 32 bits as legal and add support for
them in the instruction selector. This uses handwritten code again
because the patterns that are generated with TableGen are tuned for what
the DAG combiner would produce and not for simple sext/zext nodes.
Luckily, we only need to update the opcodes to use the Thumb2 variants,
everything else can be reused from ARM.

llvm-svn: 349026

99cd644b

[CodeGen] Allow memcpy/memset to generate small overlapping stores. · 76f4ae10

Clement Courbet authored Dec 13, 2018

Summary:
All targets either just return false here or properly model `Fast`, so I
don't think there is any reason to prevent CodeGen from doing the right
thing here.

Subscribers: nemanjai, javed.absar, eraman, jsji, llvm-commits

Differential Revision: https://reviews.llvm.org/D55365

llvm-svn: 349016

76f4ae10
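As an illustration of what "small overlapping stores" means in the commit above, the Python sketch below copies a size between one and two words using exactly two word-sized accesses whose ranges overlap. The 4-byte word and the function name are assumptions for the example; which sizes and store widths CodeGen actually picks depends on the target.

```python
def copy_with_overlapping_words(dst: bytearray, src: bytes, size: int, word: int = 4) -> None:
    """Copy `size` bytes (word <= size <= 2 * word) with two word-sized stores."""
    assert word <= size <= 2 * word
    # First store covers bytes [0, word).
    dst[0:word] = src[0:word]
    # Second store covers the last `word` bytes and may overlap the first.
    dst[size - word:size] = src[size - word:size]


source = b"abcdefg"          # 7 bytes: stores cover [0, 4) and [3, 7)
target = bytearray(len(source))
copy_with_overlapping_words(target, source, len(source))
assert target == source
```

Byte 3 is written twice with the same value, which is harmless and avoids the 4 + 2 + 1 byte sequence a non-overlapping copy of 7 bytes would need.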

Dec 12, 2018

[ARM GlobalISel] Select load/store for Thumb2 · 59720b42

Diana Picus authored Dec 12, 2018

Unfortunately we can't use TableGen for this because it doesn't yet
support predicates on the source pattern root. Therefore, add a bit of
handwritten code to the instruction selector to handle the most basic
cases.

Also mark them as legal and extract their legalizer test cases to a new
test file.

llvm-svn: 348920

59720b42

Dec 10, 2018

[GlobalISel] Restrict G_MERGE_VALUES capability and replace with new opcodes. · 5ec14604

Amara Emerson authored Dec 10, 2018

This patch restricts the capability of G_MERGE_VALUES, and uses the new
G_BUILD_VECTOR and G_CONCAT_VECTORS opcodes instead in the appropriate places.

This patch also includes AArch64 support for selecting G_BUILD_VECTOR of <4 x s32>
and <2 x s64> vectors.

Differential Revisions: https://reviews.llvm.org/D53629

llvm-svn: 348788

5ec14604