Commits · 9c340ec6fdc78b1611d353e55120c9b972151e36 · Lorenzo Albano / LLVM bpEVL

Aug 04, 2015

ARM: remove horrible printf left over from debugging · 9c340ec6

Tim Northover authored Aug 03, 2015

llvm-svn: 243907

9c340ec6

Aug 03, 2015

Convert some AArch64 code to foreach loops. NFC. · 7be8f8f0

Pete Cooper authored Aug 03, 2015

Also converted a cast<> to dyn_cast while I was working on the same
line of code.

llvm-svn: 243894

7be8f8f0

ARM: prefer allocating VFP regs at stride 4 on Darwin. · 910dde7a

Tim Northover authored Aug 03, 2015

This is necessary for WatchOS support, where the compact unwind format assumes
this kind of layout. For now we only want this on Swift-like CPUs though, where
it's been the Xcode behaviour for ages. Also, since it can expand the prologue
we don't want it at -Oz.

llvm-svn: 243884

910dde7a

[ARM] Make GlobalMerge merge extern globals by default · f3324cf1

John Brawn authored Aug 03, 2015

Enabling merging of extern globals appears to be generally either beneficial or
harmless. On some benchmark suites (on Cortex-M4F, Cortex-A9, and Cortex-A57)
it gives improvements in the 1-5% range, but in the rest the overall effect is
zero.

Differential Revision: http://reviews.llvm.org/D10966

llvm-svn: 243874

f3324cf1

Be less conservative about forming IT blocks. · 6967e5e4

James Molloy authored Aug 03, 2015

In http://reviews.llvm.org/rL215382, IT forming was made more conservative under
the belief that a flag-setting instruction was unpredictable inside an IT block on ARMv6M.

But actually, ARMv6M doesn't even support IT blocks so that's impossible. In the ARMARM for
v7M, v7AR and v8AR it states that the semantics of such an instruction changes inside an
IT block - it doesn't set the flags. So actually it is fine to use one inside an IT block
as long as the flags register is dead afterwards.

This gives significant performance improvements in a variety of MPEG based workloads.

Differential revision: http://reviews.llvm.org/D11680

llvm-svn: 243869

6967e5e4

WebAssembly: implement getScalarShiftAmountTy so we can shift by amount, with type · fda53373

JF Bastien authored Aug 03, 2015

Summary: This currently sets the shift amount RHS to the same type as the LHS, and assumes that the LHS is a simple type. This isn't currently the case e.g. with weird integer sizes, but will eventually be true and will assert if not. That's what you get for having an experimental backend: break it and you get to keep both pieces. Most backends either set the RHS to MVT::i32 or MVT::i64, but WebAssembly is a virtual ISA and tries to have regular-looking binary operations where both operands are the same type (even if a 64-bit RHS shifter is slightly silly, hey it's free!).

Subscribers: llvm-commits, sunfish, jfb

Differential Revision: http://reviews.llvm.org/D11715

llvm-svn: 243860

fda53373
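
As a rough illustration of the choice described in the summary above, the sketch below contrasts a shift-amount type that follows the LHS with a fixed `i32` shift amount. `ValueType`, `isSimple`, and both function names are hypothetical stand-ins invented for this example; they are not LLVM's `MVT` or the real `getScalarShiftAmountTy` signature.

```cpp
#include <cassert>

// Hypothetical stand-in for a value-type enum; `Other` models a non-simple
// type such as an odd-width integer.
enum class ValueType { i32, i64, Other };

bool isSimple(ValueType VT) { return VT != ValueType::Other; }

// Policy described in the commit: the shift amount (RHS) uses the LHS type,
// and a non-simple LHS trips an assertion.
ValueType shiftAmountTypeSameAsLHS(ValueType LHS) {
  assert(isSimple(LHS) && "shift LHS is expected to be a simple type");
  return LHS;
}

// Alternative convention: a fixed shift-amount type regardless of the LHS.
ValueType shiftAmountTypeFixedI32(ValueType /*LHS*/) { return ValueType::i32; }
```

With the first policy a 64-bit shift gets a 64-bit shift amount, which keeps both operands of the binary operation the same type.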

Aug 02, 2015

De-constify pointers to Type since they can't be modified. NFC · e3dcce97

Craig Topper authored Aug 01, 2015

This was already done in most places a while ago. This just fixes the ones that crept in over time.

llvm-svn: 243842

e3dcce97

Aug 01, 2015

[NVPTX] allow register copy between float and int · ffa09be2

Jingyue Wu authored Aug 01, 2015

Summary:
Fixes PR24303. With Bruno's WIP (D11197) on PeepholeOptimizer, across-class
register copying (e.g. i32 to f32) becomes possible. Enhance
NVPTXInstrInfo::copyPhysReg to handle these cases.

Reviewers: jholewinski

Subscribers: eliben, jholewinski, llvm-commits, bruno

Differential Revision: http://reviews.llvm.org/D11622

llvm-svn: 243839

ffa09be2
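
The commit message does not spell out what a cross-class copy does to the value. A plausible reading, assumed here, is that the bits move unchanged between the integer and float register (as opposed to a numeric conversion). The sketch below models that in portable C++; the function names are made up for illustration and are not part of `NVPTXInstrInfo`.

```cpp
#include <cstdint>
#include <cstring>
#include <limits>

static_assert(sizeof(float) == sizeof(std::uint32_t),
              "sketch assumes a 32-bit float");
static_assert(std::numeric_limits<float>::is_iec559,
              "sketch assumes IEEE-754 float");

// Bit-preserving copy from a float "register" to an integer one.
std::uint32_t copyFloatToInt(float F) {
  std::uint32_t Bits;
  std::memcpy(&Bits, &F, sizeof Bits);
  return Bits;
}

// Bit-preserving copy in the other direction.
float copyIntToFloat(std::uint32_t Bits) {
  float F;
  std::memcpy(&F, &Bits, sizeof F);
  return F;
}
```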

-Wdeprecated-clean: Fix cases of violating the rule of 5 in ways that are deprecated in C++11 · 78633802

David Blaikie authored Aug 01, 2015

Remove some unnecessary explicit special members in Hexagon that, once
removed, allow the other implicit special members to be used without
depending on deprecated features.

llvm-svn: 243825

78633802

WebAssembly: handle more than int32 argument/return · 8f9aea08

JF Bastien authored Aug 01, 2015

Summary: Also test 64-bit integers, except shifts for now which are broken because isel dislikes the 32-bit truncate that precedes them.

Reviewers: sunfish

Subscribers: llvm-commits, jfb

Differential Revision: http://reviews.llvm.org/D11699

llvm-svn: 243822

8f9aea08

-Wdeprecated-clean: Fix cases of violating the rule of 5 in ways that are deprecated in C++11 · a5fd382e

David Blaikie authored Aug 01, 2015

Various targets use std::swap on specific MCAsmOperands (ARM and
possibly Hexagon as well). It might be helpful to mark those subclasses
as final, to ensure that the availability of move/copy operations can't
lead to slicing. (same sort of requirements as the non-virtual dtor -
protected or a final class)

llvm-svn: 243820

a5fd382e

AMDGPU/SI: Add implicit register operands in the correct order. · b4d0d6a3

Alex Lorenz authored Jul 31, 2015

This commit fixes a bug in the class 'SIInstrInfo' where the implicit register
machine operands were added to a machine instruction in an incorrect order -
the implicit uses were added before the implicit defs.

I found this bug while working on moving the implicit register operand
verification code from the MIR parser to the machine verifier.

This commit also makes the method 'addImplicitDefUseOperands' in the machine
instruction class public so that it can be reused in the 'SIInstrInfo' class.

Reviewers: Matt Arsenault

Differential Revision: http://reviews.llvm.org/D11689

llvm-svn: 243799

b4d0d6a3
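
The ordering rule behind this fix (implicit defs first, implicit uses second) can be sketched with a toy operand list. `Operand`, `defsPrecedeUses`, and `buildImplicitOperands` are hypothetical names for this example only; the real logic lives in LLVM's machine instruction class and `SIInstrInfo`.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy model of an implicit register operand.
struct Operand {
  std::string Reg;
  bool IsDef;
};

// Returns true when no implicit def appears after an implicit use.
bool defsPrecedeUses(const std::vector<Operand> &Ops) {
  bool SeenUse = false;
  for (const Operand &Op : Ops) {
    if (!Op.IsDef)
      SeenUse = true;
    else if (SeenUse)
      return false;
  }
  return true;
}

// Appends implicit defs before implicit uses, the order the commit restores.
std::vector<Operand> buildImplicitOperands(const std::vector<std::string> &Defs,
                                           const std::vector<std::string> &Uses) {
  std::vector<Operand> Ops;
  for (const std::string &R : Defs)
    Ops.push_back({R, true});
  for (const std::string &R : Uses)
    Ops.push_back({R, false});
  assert(defsPrecedeUses(Ops) && "implicit defs must come before uses");
  return Ops;
}
```

A verifier-style check such as `defsPrecedeUses` is the kind of test that would flag the original uses-before-defs ordering.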

Jul 31, 2015

[NVPTX] convert pointers in byval kernel arguments to global · cf70053b

Jingyue Wu authored Jul 31, 2015

Summary:
For example, in

  struct S {
    int *x;
    int *y;
  };
  __global__ void foo(S s) {
    int *b = s.y;
    // use b
  }

"b" is guaranteed to point to global. NVPTX should emit ld.global/st.global for
accessing "b".

Reviewers: jholewinski

Subscribers: llvm-commits, jholewinski

Differential Revision: http://reviews.llvm.org/D11505

llvm-svn: 243790

cf70053b

WebAssembly: handle `ret void`. · 4a2d5604

JF Bastien authored Jul 31, 2015

Summary:
Use -1 as numoperands for the return SDTypeProfile, denoting that return is variadic. Note that the patterns in InstrControl.td still need to match the inputs, so this isn't an "anything goes" variadic on ret!

The next step will be to handle other local types (not just int32).

Reviewers: sunfish

Subscribers: llvm-commits, jfb

Differential Revision: http://reviews.llvm.org/D11692

llvm-svn: 243783

4a2d5604

x86: check hasOpaqueSPAdjustment in canRealignStack · e71e653a

JF Bastien authored Jul 31, 2015

Summary:
@rnk pointed out in [1] that x86's canRealignStack logic should match that in CantUseSP from hasBasePointer.

  [1]: http://reviews.llvm.org/D11160?id=29713#inline-89350

Reviewers: rnk

Subscribers: rnk, llvm-commits

Differential Revision: http://reviews.llvm.org/D11377

llvm-svn: 243772

e71e653a

WebAssembly: handle unused function arguments. · d7fcc6f9

JF Bastien authored Jul 31, 2015

Subscribers: llvm-commits, sunfish, jfb

Differential Revision: http://reviews.llvm.org/D11684

llvm-svn: 243770

d7fcc6f9

WebAssembly: print basic integer assembly. · 600aee98

JF Bastien authored Jul 31, 2015

Summary:
This prints assembly for int32 integer operations defined in WebAssemblyInstrInteger.td only, with major caveats:

  - The operation names are currently incorrect.
  - Other integer and floating-point types will be added later.
  - The printer isn't factored out to handle recursive AST code yet, since it can't even handle control flow anyways.
  - The assembly format isn't full s-expressions yet either, this will be added later.
  - This currently disables PrologEpilogCodeInserter as well as MachineCopyPropagation because they don't like virtual registers, which WebAssembly likes quite a bit. This will be fixed by factoring out NVPTX's change (currently a fork of PrologEpilogCodeInserter).

Reviewers: sunfish

Subscribers: llvm-commits, jfb

Differential Revision: http://reviews.llvm.org/D11671

llvm-svn: 243763

600aee98

[x86] reassociate integer multiplies using machine combiner pass · 9ff46260

Sanjay Patel authored Jul 31, 2015

Add i16, i32, i64 imul machine instructions to the list of reassociation
candidates.

A new bit of logic is needed to handle integer instructions: they have an
implicit EFLAGS operand, so we have to make sure it's dead in order to do
any reassociation with integer ops.

Differential Revision: http://reviews.llvm.org/D11660

llvm-svn: 243756

9ff46260
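
To show why reassociating a multiply chain can help, the sketch below compares a serial chain with a balanced one. Unsigned arithmetic is used so wraparound is well defined and both forms always agree. `mayReassociateIntegerOp` is a hypothetical helper that only restates the EFLAGS condition from the message above; it is not the machine combiner's actual interface.

```cpp
#include <cstdint>

// Serial form: each multiply depends on the previous one (three dependent
// steps on the critical path).
std::uint32_t mulSerial(std::uint32_t A, std::uint32_t B, std::uint32_t C,
                        std::uint32_t D) {
  return ((A * B) * C) * D;
}

// Reassociated form: A*B and C*D are independent, so only two dependent
// steps remain on the critical path.
std::uint32_t mulReassociated(std::uint32_t A, std::uint32_t B,
                              std::uint32_t C, std::uint32_t D) {
  return (A * B) * (C * D);
}

// Hypothetical legality check: integer ops carry an implicit EFLAGS def,
// which must be dead before the instruction may be reassociated.
bool mayReassociateIntegerOp(bool IsCandidateOpcode, bool EFLAGSDefIsDead) {
  return IsCandidateOpcode && EFLAGSDefIsDead;
}
```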

[AArch64] Favor extended reg patterns for sub · 8a7ef3b2

Geoff Berry authored Jul 31, 2015

Summary:
Favor the extended reg patterns over the shifted reg patterns that match
only the operand shift and not the full sign/zero extend and shift.

Reviewers: jmolloy, t.p.northover

Subscribers: mcrosier, aemerson, llvm-commits, rengolin

Differential Revision: http://reviews.llvm.org/D11569

llvm-svn: 243753

8a7ef3b2

Refactor: Simplify boolean conditional return statements in lib/Target/NVPTX · 4be014ae

Jingyue Wu authored Jul 31, 2015

Summary: Use clang-tidy to simplify boolean conditional return statements

Reviewers: rafael, echristo, chandlerc, bkramer, craig.topper, dexonsmith, chapuni, eliben, jingyue, jholewinski

Subscribers: llvm-commits, jholewinski

Differential Revision: http://reviews.llvm.org/D9983

llvm-svn: 243734

4be014ae
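
The kind of rewrite this refactoring performs looks roughly like the following. The predicate is a made-up example, not code from lib/Target/NVPTX.

```cpp
// Verbose style: a conditional whose branches return boolean literals.
bool fitsInRegisterVerbose(unsigned Bits) {
  if (Bits <= 32)
    return true;
  return false;
}

// Simplified style: return the condition directly.
bool fitsInRegisterSimplified(unsigned Bits) { return Bits <= 32; }
```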

AMDGPU: Fix v16i32 to v16i8 truncstore · e1ce344b

Matt Arsenault authored Jul 31, 2015

llvm-svn: 243731

e1ce344b

AMDGPU/SI: Set DwarfRegNum · ba013379

Matt Arsenault authored Jul 31, 2015

This requires a fix in tablegen for the cast<int> from bits<16>
to work in the list initializer.

llvm-svn: 243723

ba013379

AMDGPU/SI: Remove unused pattern for f32 constant loads · 82325598

Tom Stellard authored Jul 31, 2015

Reviewers: arsenm

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D11603

llvm-svn: 243719

82325598

[ARM] Lower modulo operation to generate __aeabi_divmod on Android · 532a1369

Sumanth Gundapaneni authored Jul 31, 2015

    
For a modulo (remainder) operation,
clang -target armv7-none-linux-gnueabi generates "__modsi3"
clang -target armv7-none-eabi generates "__aeabi_idivmod"
clang -target armv7-linux-androideabi generates "__modsi3"
Android bionic libc doesn't provide a __modsi3, instead it provides a
"__aeabi_idivmod". This patch fixes the LLVM ARMISelLowering to generate
the correct call whenever there is a modulo operation.

Differential Revision: http://reviews.llvm.org/D11661

llvm-svn: 243717

532a1369
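
The libcall selection described above can be summarised as a small table. The sketch below encodes the before and after behaviour using a hypothetical `Env` enum and helper names; the actual change is in ARMISelLowering and is not reproduced here.

```cpp
#include <string>

// Hypothetical model of the three target environments named in the message.
enum class Env { LinuxGNUEABI, BareEABI, Android };

// Behaviour reported before the patch: only the bare EABI target used
// __aeabi_idivmod for a signed 32-bit remainder.
std::string remainderLibcallBefore(Env E) {
  return E == Env::BareEABI ? "__aeabi_idivmod" : "__modsi3";
}

// Behaviour after the patch: Android also uses __aeabi_idivmod, because
// bionic does not provide __modsi3.
std::string remainderLibcallAfter(Env E) {
  return (E == Env::BareEABI || E == Env::Android) ? "__aeabi_idivmod"
                                                   : "__modsi3";
}
```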

Jul 30, 2015

fix memcpy/memset/memmove lowering when optimizing for size · 1166f2ff

Sanjay Patel authored Jul 30, 2015

Fixing MinSize attribute handling was discussed in D11363. 
This is a prerequisite patch to doing that.

The handling of OptSize when lowering mem* functions was broken
on Darwin because it wants to ignore -Os for these cases, but the
existing logic also made it ignore -Oz (MinSize).

The Linux change demonstrates a widespread problem. The backend
doesn't usually recognize the MinSize attribute by itself; it
assumes that if the MinSize attribute exists, then the OptSize 
attribute must also exist. 

Fixing this more generally will be a follow-on patch or two.

Differential Revision: http://reviews.llvm.org/D11568

llvm-svn: 243693

1166f2ff
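
One way to read the problem and the fix is sketched below with plain booleans. Both functions are a simplified model inferred from the description above, with hypothetical names (`FnAttrs`, `lowerForSizeBefore`, `lowerForSizeAfter`); they are not the code from the patch.

```cpp
// Hypothetical view of the two function attributes involved.
struct FnAttrs {
  bool OptSize; // -Os
  bool MinSize; // -Oz
};

// Model of the broken logic: only OptSize is consulted, and Darwin ignores
// it, so MinSize (-Oz) is ignored on Darwin as well.
bool lowerForSizeBefore(FnAttrs A, bool IsDarwin) {
  return A.OptSize && !IsDarwin;
}

// Model of the intended logic: Darwin still ignores -Os but honours -Oz;
// other platforms honour either attribute on its own.
bool lowerForSizeAfter(FnAttrs A, bool IsDarwin) {
  if (IsDarwin)
    return A.MinSize;
  return A.OptSize || A.MinSize;
}
```

The last case in the model also covers the wider issue mentioned above: a function carrying only MinSize is now treated as size-sensitive without relying on OptSize being present.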

AMDGPU: Set SubRegIndex size and offset · 7a0c3a92

Matt Arsenault authored Jul 30, 2015

I'm not sure what reasons the comment here could have
had for not setting these. Without these set, there is
an assertion hit during DWARF emission.

llvm-svn: 243661

7a0c3a92

AMDGPU: Fix unreachable when emitting binary debug info · b39e8583

Matt Arsenault authored Jul 30, 2015

Copy implementation of applyFixup from AArch64 with AArch64 bits
ripped out.

Tests will be included with a later commit. Several other
problems must be fixed before binary debug info emission
will work.

llvm-svn: 243660

b39e8583

AMDGPU/SI: Simplify moveSMRDToVALU() · 4229aa94

Tom Stellard authored Jul 30, 2015

Summary:
Replace the switch on instruction opcode with a switch on register size.
This way we don't need to update the switch statement when we add new
SMRD variants.

Reviewers: arsenm

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D11601

llvm-svn: 243652

4229aa94

AMDGPU/SI: Remove isTriviallyReMaterializable() function from SIInstrInfo · 9d740760

Tom Stellard authored Jul 30, 2015

Summary:
This function is never called.  isReallyTriviallyReMaterializable() is
the function that should be implemented instead.

Reviewers: arsenm

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D11620

llvm-svn: 243651

9d740760

[mips][FastISel] Remove hidden mips-fast-isel option. · 2041b1dd

Vasileios Kalintiris authored Jul 30, 2015

Summary:
This hidden option would disable code generation through FastISel by
default. It was removed from the available options and from the
Fast-ISel tests that required it in order to run the tests.

Reviewers: dsanders

Subscribers: qcolombet, llvm-commits

Differential Revision: http://reviews.llvm.org/D11610

llvm-svn: 243638

2041b1dd

[mips][FastISel] Apply only zero-extension to constants prior to their materialization. · 77fb0a3d

Vasileios Kalintiris authored Jul 30, 2015

Summary:
Previously, we would sign-extend non-boolean negative constants and
zero-extend otherwise. This was problematic for PHI instructions with
negative values that had a type with bitwidth less than that of the
register used for materialization.

More specifically, ComputePHILiveOutRegInfo() assumes the constants
present in a PHI node are zero extended in their container and
afterwards deduces the known bits.

For example, previously we would materialize an i16 -4 with the
following instruction:

  addiu $r, $zero, -4

The register would end up with the 32-bit 2's complement representation
of -4. However, ComputePHILiveOutRegInfo() would generate a constant
with the upper 16-bits set to zero. The SelectionDAG builder would use
that information to generate an AssertZero node that would remove any
subsequent trunc & zero_extend nodes.

In theory, we should modify ComputePHILiveOutRegInfo() to consult
target-specific hooks about the way they prefer to materialize the
given constants. However, git-blame reports that this specific code
has not been touched since 2011 and it seems to be working well for every
target so far.

Reviewers: dsanders

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D11592

llvm-svn: 243636

77fb0a3d
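
The mismatch in the i16 -4 example can be reproduced with ordinary integer conversions. The helper names below are illustrative only and model a 32-bit register holding a 16-bit constant.

```cpp
#include <cstdint>

// Sign-extending materialization, as `addiu $r, $zero, -4` would produce.
std::uint32_t materializeSignExtended(std::int16_t V) {
  return static_cast<std::uint32_t>(static_cast<std::int32_t>(V));
}

// Zero-extending materialization, the form the PHI live-out analysis assumes.
std::uint32_t materializeZeroExtended(std::int16_t V) {
  return static_cast<std::uint32_t>(static_cast<std::uint16_t>(V));
}

// The property the known-bits deduction relies on for an i16 constant.
bool upper16BitsAreZero(std::uint32_t Reg) { return (Reg >> 16) == 0; }
```

For -4 the sign-extended register holds 0xFFFFFFFC while the zero-extended one holds 0x0000FFFC, so only the latter satisfies the assumption that the upper 16 bits are zero.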

[X86] Recognize "flags" as an identifier, not a register in Intel-syntax inline asm · cdb076b8

Michael Kuperstein authored Jul 30, 2015

Patch by: marina.yatsina@intel.com
Differential Revision: http://reviews.llvm.org/D11512

llvm-svn: 243630

cdb076b8

push fast-math check for machine-combiner reassociations into instruction-type check; NFC · 5bfbb36a

Sanjay Patel authored Jul 30, 2015

This makes it simpler to add instruction types that don't depend on fast-math.

llvm-svn: 243596

5bfbb36a

Fix typo "fuction" noticed in comments in AssumptionCache.h, and also all the... · c3890d29

Nick Lewycky authored Jul 29, 2015

Fix typo "fuction" noticed in comments in AssumptionCache.h, and also all the other files that have the same typo. All comments, no functionality change! (Merely a "fuctionality" change.)

Bonus change to remove emacs major mode marker from SystemZMachineFunctionInfo.cpp because emacs already knows it's C++ from the extension. Also fix typo "appeary" in AMDGPUMCAsmInfo.h.

llvm-svn: 243585

c3890d29

Rename hasCompatibleFunctionAttributes->areInlineCompatible based · d566fb12

Eric Christopher authored Jul 29, 2015

on suggestions. Currently the function is only used for inline purposes
and this is more descriptive for the use.

llvm-svn: 243578

d566fb12

Jul 29, 2015

[X86][SSE] Keep 32-bit target i64 vector shifts on SSE unit. · ba10f767

Simon Pilgrim authored Jul 29, 2015

This patch improves the 32-bit target i64 constant matching to detect the shuffle vector splats that are introduced by i64 vector shift vectorization (D8416).

Differential Revision: http://reviews.llvm.org/D11327

llvm-svn: 243577

ba10f767

AArch64: use 32-bit MOV rather than UBFX to truncate registers. · 2a9d801f

Tim Northover authored Jul 29, 2015

It's potentially more efficient on Cyclone, and from the optimization guides &
schedulers looks like it has no effect on Cortex-A53 or A57. In general you'd
expect a MOV to be about the most efficient instruction with its semantics,
even though the official "UXTW" alias is really a UBFX.

llvm-svn: 243576

2a9d801f
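
The equivalence this change relies on can be checked in plain C++: extracting bits 0 to 31 with an unsigned bitfield extract gives the same value as a 32-bit register move, since writing a 32-bit register on AArch64 zeroes the upper half. `ubfx` and `mov32` below are illustrative models of the instructions, not compiler code.

```cpp
#include <cassert>
#include <cstdint>

// Model of an unsigned bitfield extract: take Width bits starting at Lsb and
// zero-extend the result.
std::uint64_t ubfx(std::uint64_t X, unsigned Lsb, unsigned Width) {
  assert(Width >= 1 && Lsb < 64 && Width <= 64 - Lsb &&
         "field must fit in 64 bits");
  std::uint64_t Mask =
      Width == 64 ? ~std::uint64_t{0} : ((std::uint64_t{1} << Width) - 1);
  return (X >> Lsb) & Mask;
}

// Model of a 32-bit MOV: keep the low 32 bits, zero the upper 32.
std::uint64_t mov32(std::uint64_t X) { return static_cast<std::uint32_t>(X); }
```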

[X86][SSE] Vectorize i64 ASHR operations · 86478c69

Simon Pilgrim authored Jul 29, 2015

This patch vectorizes the v2i64/v4i64 ASHR shift operations, the last remaining integer vector shifts that were still being transferred to/from the scalar unit to be completed.

Differential Revision: http://reviews.llvm.org/D11439

llvm-svn: 243569

86478c69
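
The message does not say how the 64-bit arithmetic shift is built from vector operations. One well-known identity, shown here on a single scalar lane purely as an assumption about the general approach, expresses an arithmetic shift through a logical shift, an XOR, and a subtraction: with M = signbit >> Amt, ashr(X, Amt) = ((X >> Amt) ^ M) - M.

```cpp
#include <cassert>
#include <cstdint>

// Arithmetic shift right of a 64-bit lane using only a logical shift, an XOR
// and a subtraction. X holds the two's complement bit pattern of the value.
std::uint64_t ashrViaLshr(std::uint64_t X, unsigned Amt) {
  assert(Amt < 64 && "shift amount must be smaller than the lane width");
  std::uint64_t M = (std::uint64_t{1} << 63) >> Amt; // shifted sign-bit mask
  return ((X >> Amt) ^ M) - M; // XOR then subtract re-creates the sign bits
}
```

For a negative input the XOR clears the shifted sign bit and the subtraction borrows through the upper bits, filling them with ones; for a non-negative input the XOR and the subtraction cancel.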

Roll forward r242871 · 3a04dc6e

Jingyue Wu authored Jul 29, 2015

r242871 missed one place that should be guarded with isPhysicalReg. This patch
fixes that.

llvm-svn: 243555

3a04dc6e

Revert "[PeepholeOptimizer] Look through PHIs to find additional register sources" · 38c02506

Bruno Cardoso Lopes authored Jul 29, 2015

Reported to break some internal tests: PR24303

This reverts commit r243486.

llvm-svn: 243540

38c02506