Commits · 0d28f80bd1f0ec511fe608bfdc58c927d1a6ef9c · Lorenzo Albano / LLVM bpEVL

Aug 05, 2015

Rename all references to old mailing lists to new lists.llvm.org address. · 0d28f80b

Tanya Lattner authored Aug 05, 2015

llvm-svn: 243999

0d28f80b

Aug 04, 2015

wrap OptSize and MinSize attributes for easier and consistent access (NFCI) · 924879ad

Sanjay Patel authored Aug 04, 2015

Create wrapper methods in the Function class for the OptimizeForSize and MinSize
attributes. We want to hide the logic of "or'ing" them together when optimizing
just for size (-Os).

Currently, we are not consistent about this and rely on a front-end to always set
OptimizeForSize (-Os) if MinSize (-Oz) is on. Thus, there are 18 FIXME changes here
that should be added as follow-on patches with regression tests.

This patch is NFC-intended: it just replaces existing direct accesses of the attributes
by the equivalent wrapper call.
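
A minimal sketch of the wrapper idea, assuming a toy attribute set rather than LLVM's real Function class. `ToyFunction` and its method names are hypothetical and are not taken from the patch; the strings "optsize" and "minsize" are the IR attribute spellings for -Os and -Oz.

```cpp
#include <set>
#include <string>

// Toy stand-in for a function's attribute set; this is NOT LLVM's Function class.
class ToyFunction {
public:
  void addAttribute(const std::string &Name) { Attrs.insert(Name); }
  bool hasAttribute(const std::string &Name) const {
    return Attrs.count(Name) != 0;
  }

  // True only for the stricter "minimize size" request (-Oz).
  bool optForMinSize() const { return hasAttribute("minsize"); }

  // True when optimizing for size at all: -Os, or -Oz, which implies it.
  // Callers no longer need to "or" the two attributes themselves.
  bool optForSize() const {
    return hasAttribute("optsize") || optForMinSize();
  }

private:
  std::set<std::string> Attrs;
};
```

With a wrapper like this, a function carrying only "minsize" still reports `optForSize()` as true, so the result no longer depends on a front-end setting both attributes.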

Differential Revision: http://reviews.llvm.org/D11734

llvm-svn: 243994

924879ad

[x86] machine combiner reassociation: mark EFLAGS operand as 'dead' · 75ced278

Sanjay Patel authored Aug 04, 2015

In the commentary for D11660, I wasn't sure if it was alright to create new
integer machine instructions without also creating the implicit EFLAGS operand.
From what I can see, the implicit operand is always created by the MachineInstrBuilder
based on the instruction type, so we don't have to do that explicitly. However, in
reviewing the debug output, I noticed that the operand was not marked as 'dead'.
The machine combiner should do that to preserve future optimization opportunities
that may be checking for that dead EFLAGS operand themselves.

Differential Revision: http://reviews.llvm.org/D11696

llvm-svn: 243990

75ced278

[mips][FastISel] Disable code generation for unsupported targets through FastISel. · 2f12b2ed

Vasileios Kalintiris authored Aug 04, 2015

Summary:
Previously, we would check whether the target is supported or not, only in
fastSelectInstruction(). This means that 64-bit targets could use FastISel too.
We fix this by checking every overridden method of the FastISel class and
by falling back to SelectionDAG if the target isn't supported. This change
should have been committed along with r243638, but somehow I missed it.

Reviewers: dsanders

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D11755

llvm-svn: 243986

2f12b2ed

Revert r229675 - [mips] Avoid redundant sign extension of the result of binary... · 044e1722

Vasileios Kalintiris authored Aug 04, 2015

Revert r229675 - [mips] Avoid redundant sign extension of the result of binary bitwise instructions.

It introduced two regressions on 64-bit big-endian targets running under N32
(MultiSource/Benchmarks/tramp3d-v4/tramp3d-v4, and
MultiSource/Applications/kimwitu++/kc). The issue is that on 64-bit targets
comparisons such as BEQ compare the whole GPR64 but incorrectly tell the
instruction selector that they operate on GPR32's. This leads to the
elimination of i32->i64 extensions that are actually required for
comparisons to work correctly.
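
The effect can be sketched in plain C++, assuming a 64-bit register is modeled as a `uint64_t`. All function names below are hypothetical helpers for illustration and are not part of the MIPS backend.

```cpp
#include <cstdint>

// Widen a 32-bit value the way a sign-extending move would.
inline std::uint64_t signExtend32(std::int32_t V) {
  return static_cast<std::uint64_t>(static_cast<std::int64_t>(V));
}

// Widen a 32-bit value leaving the upper 32 bits zero.
inline std::uint64_t zeroExtend32(std::int32_t V) {
  return static_cast<std::uint64_t>(static_cast<std::uint32_t>(V));
}

// Model of a comparison that looks at the whole 64-bit register.
inline bool fullRegisterEqual(std::uint64_t A, std::uint64_t B) {
  return A == B;
}

// Model of a comparison that only looks at the low 32 bits.
inline bool low32Equal(std::uint64_t A, std::uint64_t B) {
  return static_cast<std::uint32_t>(A) == static_cast<std::uint32_t>(B);
}
```

For the value -1, the two widened forms agree in their low 32 bits but differ as full registers, so a whole-register compare gives the wrong answer once the extension has been dropped.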

There's currently a patch under review that fixes this problem.

llvm-svn: 243984

044e1722

ARM: support windows division routines · 0a2672bb

Saleem Abdulrasool authored Aug 04, 2015

This adds the software division routines for the Windows RTABI.  These are not
expected to be used often though as most modern Windows ARM capable targets
support hardware division.  In the case that the target CPU doesn't support
hardware division, this will be the fallback.
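
For readers unfamiliar with what a software division fallback does, here is a generic shift-and-subtract sketch. It is not the Windows RTABI routine added by this commit; `softUDiv32` and `UDivResult` are hypothetical names, and the divisor is assumed to be non-zero.

```cpp
#include <cstdint>

struct UDivResult {
  std::uint32_t Quot;
  std::uint32_t Rem;
};

// Restoring (shift-and-subtract) unsigned division, one quotient bit per
// iteration. The caller must pass a non-zero Divisor.
inline UDivResult softUDiv32(std::uint32_t Dividend, std::uint32_t Divisor) {
  std::uint32_t Quot = 0;
  std::uint64_t Rem = 0; // 64-bit so the shift below cannot overflow
  for (int Bit = 31; Bit >= 0; --Bit) {
    // Bring down the next dividend bit, most significant first.
    Rem = (Rem << 1) | ((Dividend >> Bit) & 1u);
    if (Rem >= Divisor) {
      Rem -= Divisor;
      Quot |= (1u << Bit);
    }
  }
  return {Quot, static_cast<std::uint32_t>(Rem)};
}
```

The loop always runs 32 iterations, which is why such routines are noticeably slower than a hardware divide instruction.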

llvm-svn: 243952

0a2672bb

ARM: make Darwin libcall registration table driven (NFC) · 67697a7e

Saleem Abdulrasool authored Aug 04, 2015

Make the libcall updating table driven, similar to the approach that the Linux
and Windows code paths use below.  NFC.

llvm-svn: 243951

67697a7e

[AArch64] Rename FP formats to be more consistent. NFC. · 81fda188

Ahmed Bougacha authored Aug 04, 2015

Some are named "FP", others "SD", others still "FP*SD".
Rename all this to just use "FP", which, except for conversions
(which don't use this format naming scheme), implies "SD" anyway.

llvm-svn: 243936

81fda188

[AArch64] Add isel support for f16 indexed LD/ST. · e0e12db8

Ahmed Bougacha authored Aug 04, 2015

llvm-svn: 243935

e0e12db8

[AArch64][v8.1a] The "pan" sysreg isn't MSR-specific. NFCI. · e8ea9ac3

Ahmed Bougacha authored Aug 04, 2015

It's already in SysRegMappings, no need to also have it in MSRMappings:
the latter is only used if we didn't find a match in the former.

llvm-svn: 243933

e8ea9ac3

[AArch64] Remove unnecessary "break". NFC. · 0cbe2efc

Ahmed Bougacha authored Aug 04, 2015

llvm-svn: 243931

0cbe2efc

[AArch64] Use SDValue bool operator. NFC. · 239d635d

Ahmed Bougacha authored Aug 04, 2015

llvm-svn: 243930

239d635d

[AArch64] Vector FCOPYSIGN supports Custom-lowering: mark it as such. · b0ae36f0

Ahmed Bougacha authored Aug 04, 2015

There's a bunch of code in LowerFCOPYSIGN that does smart lowering, and
is actually already vector-aware; let's use it instead of scalarizing!

The only interesting change is that for v2f32, we previously always used
v4i32 as the integer vector type.
Use v2i32 instead, and mark FCOPYSIGN as Custom.
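
The integer-based lowering can be sketched for a single lane as follows, assuming IEEE-754 32-bit floats. `copySignBits` is a hypothetical helper, not code from LowerFCOPYSIGN; a vector lowering would apply the same mask operations to every lane, using an integer vector of matching width.

```cpp
#include <cstdint>
#include <cstring>

// copysign through integer bit operations: keep the magnitude bits of Mag
// and take the sign bit of Sgn. Assumes float is IEEE-754 binary32.
inline float copySignBits(float Mag, float Sgn) {
  std::uint32_t M, S;
  std::memcpy(&M, &Mag, sizeof M); // bit-cast float -> integer
  std::memcpy(&S, &Sgn, sizeof S);
  const std::uint32_t SignMask = 0x80000000u;
  const std::uint32_t R = (M & ~SignMask) | (S & SignMask);
  float Out;
  std::memcpy(&Out, &R, sizeof Out); // bit-cast integer -> float
  return Out;
}
```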

llvm-svn: 243926

b0ae36f0

ARM: remove horrible printf left over from debugging · 9c340ec6

Tim Northover authored Aug 03, 2015

llvm-svn: 243907

9c340ec6

Aug 03, 2015

Convert some AArch64 code to foreach loops. NFC. · 7be8f8f0

Pete Cooper authored Aug 03, 2015

Also converted a cast<> to dyn_cast while I was working on the same
line of code.

llvm-svn: 243894

7be8f8f0

ARM: prefer allocating VFP regs at stride 4 on Darwin. · 910dde7a

Tim Northover authored Aug 03, 2015

This is necessary for WatchOS support, where the compact unwind format assumes
this kind of layout. For now we only want this on Swift-like CPUs though, where
it's been the Xcode behaviour for ages. Also, since it can expand the prologue
we don't want it at -Oz.

llvm-svn: 243884

910dde7a

[ARM] Make GlobalMerge merge extern globals by default · f3324cf1

John Brawn authored Aug 03, 2015

Enabling merging of extern globals appears to be generally either beneficial or
harmless. On some benchmark suites (on Cortex-M4F, Cortex-A9, and Cortex-A57)
it gives improvements in the 1-5% range, but in the rest the overall effect is
zero.

Differential Revision: http://reviews.llvm.org/D10966

llvm-svn: 243874

f3324cf1

Be less conservative about forming IT blocks. · 6967e5e4

James Molloy authored Aug 03, 2015

In http://reviews.llvm.org/rL215382, IT forming was made more conservative under
the belief that a flag-setting instruction was unpredictable inside an IT block on ARMv6M.

But actually, ARMv6M doesn't even support IT blocks so that's impossible. In the ARMARM for
v7M, v7AR and v8AR it states that the semantics of such an instruction change inside an
IT block - it doesn't set the flags. So actually it is fine to use one inside an IT block
as long as the flags register is dead afterwards.

This gives significant performance improvements in a variety of MPEG based workloads.

Differential revision: http://reviews.llvm.org/D11680

llvm-svn: 243869

6967e5e4

WebAssembly: implement getScalarShiftAmountTy so we can shift by amount, with type · fda53373

JF Bastien authored Aug 03, 2015

Summary: This currently sets the shift amount RHS to the same type as the LHS, and assumes that the LHS is a simple type. This isn't currently the case, e.g. with weird integer sizes, but will eventually be true and will assert if not. That's what you get for having an experimental backend: break it and you get to keep both pieces. Most backends set the RHS to either MVT::i32 or MVT::i64, but WebAssembly is a virtual ISA and tries to have regular-looking binary operations where both operands are the same type (even if a 64-bit RHS shifter is slightly silly, hey it's free!).

Subscribers: llvm-commits, sunfish, jfb

Differential Revision: http://reviews.llvm.org/D11715

llvm-svn: 243860

fda53373

Aug 02, 2015

De-constify pointers to Type since they can't be modified. NFC · e3dcce97

Craig Topper authored Aug 01, 2015

This was already done in most places a while ago. This just fixes the ones that crept in over time.

llvm-svn: 243842

e3dcce97

Aug 01, 2015

[NVPTX] allow register copy between float and int · ffa09be2

Jingyue Wu authored Aug 01, 2015

Summary:
Fixes PR24303. With Bruno's WIP (D11197) on PeepholeOptimizer, across-class
register copying (e.g. i32 to f32) becomes possible. Enhance
NVPTXInstrInfo::copyPhysReg to handle these cases.

Reviewers: jholewinski

Subscribers: eliben, jholewinski, llvm-commits, bruno

Differential Revision: http://reviews.llvm.org/D11622

llvm-svn: 243839

ffa09be2

-Wdeprecated-clean: Fix cases of violating the rule of 5 in ways that are deprecated in C++11 · 78633802

David Blaikie authored Aug 01, 2015

Remove some unnecessary explicit special members in Hexagon that, once
removed, allow the other implicit special members to be used without
depending on deprecated features.
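
As a small illustration of the kind of cleanup described, assuming a class whose only explicit special member is an empty destructor: in C++11, implicitly generating the copy operations for a class with a user-declared destructor is deprecated. `OperandBefore` and `OperandAfter` are hypothetical names, not the Hexagon classes touched by this commit.

```cpp
#include <type_traits>

// Hypothetical "before" class: the empty user-declared destructor adds
// nothing, but copying this class now relies on a deprecated feature.
struct OperandBefore {
  int Reg;
  ~OperandBefore() {}
};

// Hypothetical "after" class: with no user-declared special members, all of
// them are implicit and none relies on deprecated behaviour.
struct OperandAfter {
  int Reg;
};

// Both remain copyable; only the second is also trivially destructible.
static_assert(std::is_copy_constructible<OperandBefore>::value,
              "still copyable");
static_assert(std::is_copy_constructible<OperandAfter>::value,
              "still copyable");
static_assert(!std::is_trivially_destructible<OperandBefore>::value,
              "user-provided destructor is not trivial");
static_assert(std::is_trivially_destructible<OperandAfter>::value,
              "implicit destructor is trivial");
```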

llvm-svn: 243825

78633802

WebAssembly: handle more than int32 argument/return · 8f9aea08

JF Bastien authored Aug 01, 2015

Summary: Also test 64-bit integers, except shifts for now which are broken because isel dislikes the 32-bit truncate that precedes them.

Reviewers: sunfish

Subscribers: llvm-commits, jfb

Differential Revision: http://reviews.llvm.org/D11699

llvm-svn: 243822

8f9aea08

-Wdeprecated-clean: Fix cases of violating the rule of 5 in ways that are deprecated in C++11 · a5fd382e

David Blaikie authored Aug 01, 2015

Various targets use std::swap on specific MCAsmOperands (ARM and
possibly Hexagon as well). It might be helpful to mark those subclasses
as final, to ensure that the availability of move/copy operations can't
lead to slicing. (same sort of requirements as the non-virtual dtor -
protected or a final class)

llvm-svn: 243820

a5fd382e

AMDGPU/SI: Add implicit register operands in the correct order. · b4d0d6a3

Alex Lorenz authored Jul 31, 2015

This commit fixes a bug in the class 'SIInstrInfo' where the implicit register
machine operands were added to a machine instruction in an incorrect order -
the implicit uses were added before the implicit defs.
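
The expected ordering can be sketched with a toy operand list, assuming implicit defs must come first and implicit uses second. `ToyOperand` and `buildImplicitOperands` are hypothetical and do not mirror the MachineInstr API.

```cpp
#include <string>
#include <vector>

// Toy implicit register operand: a register name plus a def/use flag.
struct ToyOperand {
  std::string Reg;
  bool IsDef;
};

// Append implicit operands in the order the fix establishes:
// all implicit defs first, then all implicit uses.
inline std::vector<ToyOperand>
buildImplicitOperands(const std::vector<std::string> &Defs,
                      const std::vector<std::string> &Uses) {
  std::vector<ToyOperand> Ops;
  for (const std::string &Reg : Defs)
    Ops.push_back({Reg, true});
  for (const std::string &Reg : Uses)
    Ops.push_back({Reg, false});
  return Ops;
}
```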

I found this bug while working on moving the implicit register operand
verification code from the MIR parser to the machine verifier.

This commit also makes the method 'addImplicitDefUseOperands' in the machine
instruction class public so that it can be reused in the 'SIInstrInfo' class.

Reviewers: Matt Arsenault

Differential Revision: http://reviews.llvm.org/D11689

llvm-svn: 243799

b4d0d6a3

Jul 31, 2015

[NVPTX] convert pointers in byval kernel arguments to global · cf70053b

Jingyue Wu authored Jul 31, 2015

Summary:
For example, in

  struct S {
    int *x;
    int *y;
  };
  __global__ void foo(S s) {
    int *b = s.y;
    // use b
  }

"b" is guaranteed to point to global. NVPTX should emit ld.global/st.global for
accessing "b".

Reviewers: jholewinski

Subscribers: llvm-commits, jholewinski

Differential Revision: http://reviews.llvm.org/D11505

llvm-svn: 243790

cf70053b

WebAssembly: handle `ret void`. · 4a2d5604

JF Bastien authored Jul 31, 2015

Summary:
Use -1 as numoperands for the return SDTypeProfile, denoting that return is variadic. Note that the patterns in InstrControl.td still need to match the inputs, so this isn't an "anything goes" variadic on ret!

The next step will be to handle other local types (not just int32).

Reviewers: sunfish

Subscribers: llvm-commits, jfb

Differential Revision: http://reviews.llvm.org/D11692

llvm-svn: 243783

4a2d5604

x86: check hasOpaqueSPAdjustment in canRealignStack · e71e653a

JF Bastien authored Jul 31, 2015

Summary:
@rnk pointed out in [1] that x86's canRealignStack logic should match that in CantUseSP from hasBasePointer.

  [1]: http://reviews.llvm.org/D11160?id=29713#inline-89350

Reviewers: rnk

Subscribers: rnk, llvm-commits

Differential Revision: http://reviews.llvm.org/D11377

llvm-svn: 243772

e71e653a

WebAssembly: handle unused function arguments. · d7fcc6f9

JF Bastien authored Jul 31, 2015

Subscribers: llvm-commits, sunfish, jfb

Differential Revision: http://reviews.llvm.org/D11684

llvm-svn: 243770

d7fcc6f9

WebAssembly: print basic integer assembly. · 600aee98

JF Bastien authored Jul 31, 2015

Summary:
This prints assembly for int32 integer operations defined in WebAssemblyInstrInteger.td only, with major caveats:

  - The operation names are currently incorrect.
  - Other integer and floating-point types will be added later.
  - The printer isn't factored out to handle recursive AST code yet, since it can't even handle control flow anyways.
  - The assembly format isn't full s-expressions yet either; this will be added later.
  - This currently disables PrologEpilogCodeInserter as well as MachineCopyPropagation because they don't like virtual registers, which WebAssembly likes quite a bit. This will be fixed by factoring out NVPTX's change (currently a fork of PrologEpilogCodeInserter).

Reviewers: sunfish

Subscribers: llvm-commits, jfb

Differential Revision: http://reviews.llvm.org/D11671

llvm-svn: 243763

600aee98

[x86] reassociate integer multiplies using machine combiner pass · 9ff46260

Sanjay Patel authored Jul 31, 2015

Add i16, i32, i64 imul machine instructions to the list of reassociation
candidates.

A new bit of logic is needed to handle integer instructions: they have an
implicit EFLAGS operand, so we have to make sure it's dead in order to do
any reassociation with integer ops.
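
A toy model of both points, assuming unsigned 32-bit arithmetic so that wraparound is well defined. `ToyMul`, `canReassociate`, `serialProduct` and `balancedProduct` are hypothetical names; the real pass works on machine instructions, not C++ expressions.

```cpp
#include <cstdint>

// Toy multiply instruction: records only whether its implicit flags def is dead.
struct ToyMul {
  bool FlagsDead;
};

// Reassociation is only attempted when nothing reads the flags produced by
// either instruction.
inline bool canReassociate(const ToyMul &Prev, const ToyMul &Root) {
  return Prev.FlagsDead && Root.FlagsDead;
}

// ((A*B)*C)*D: three multiplies, each depending on the previous one.
inline std::uint32_t serialProduct(std::uint32_t A, std::uint32_t B,
                                   std::uint32_t C, std::uint32_t D) {
  return ((A * B) * C) * D;
}

// (A*B)*(C*D): the two inner multiplies are independent of each other.
inline std::uint32_t balancedProduct(std::uint32_t A, std::uint32_t B,
                                     std::uint32_t C, std::uint32_t D) {
  return (A * B) * (C * D);
}
```

Both forms produce the same value because multiplication modulo 2^32 is associative; the second form simply has a shorter dependency chain.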

Differential Revision: http://reviews.llvm.org/D11660

llvm-svn: 243756

9ff46260

[AArch64] Favor extended reg patterns for sub · 8a7ef3b2

Geoff Berry authored Jul 31, 2015

Summary:
Favor the extended reg patterns over the shifted reg patterns that match
only the operand shift and not the full sign/zero extend and shift.

Reviewers: jmolloy, t.p.northover

Subscribers: mcrosier, aemerson, llvm-commits, rengolin

Differential Revision: http://reviews.llvm.org/D11569

llvm-svn: 243753

8a7ef3b2

Refactor: Simplify boolean conditional return statements in lib/Target/NVPTX · 4be014ae

Jingyue Wu authored Jul 31, 2015

Summary: Use clang-tidy to simplify boolean conditional return statements
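
The shape of the rewrite is roughly the following; the clang-tidy check involved is likely readability-simplify-boolean-expr, though the commit does not name it. The functions are hypothetical examples, not code from lib/Target/NVPTX.

```cpp
// Before: a conditional whose only job is to return a boolean literal.
inline bool fitsIn32BitsBefore(int Width) {
  if (Width <= 32)
    return true;
  return false;
}

// After: return the condition directly. Behaviour is unchanged.
inline bool fitsIn32BitsAfter(int Width) { return Width <= 32; }
```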

Reviewers: rafael, echristo, chandlerc, bkramer, craig.topper, dexonsmith, chapuni, eliben, jingyue, jholewinski

Subscribers: llvm-commits, jholewinski

Differential Revision: http://reviews.llvm.org/D9983

llvm-svn: 243734

4be014ae

AMDGPU: Fix v16i32 to v16i8 truncstore · e1ce344b

Matt Arsenault authored Jul 31, 2015

llvm-svn: 243731

e1ce344b

AMDGPU/SI: Set DwarfRegNum · ba013379

Matt Arsenault authored Jul 31, 2015

This requires a fix in tablegen for the cast<int> from bits<16>
to work in the list initializer.

llvm-svn: 243723

ba013379

AMDGPU/SI: Remove unused pattern for f32 constant loads · 82325598

Tom Stellard authored Jul 31, 2015

Reviewers: arsenm

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D11603

llvm-svn: 243719

82325598

[ARM] Lower modulo operation to generate __aeabi_divmod on Android · 532a1369

Sumanth Gundapaneni authored Jul 31, 2015

    
For a modulo (remainder) operation,
clang -target armv7-none-linux-gnueabi generates "__modsi3"
clang -target armv7-none-eabi generates "__aeabi_idivmod"
clang -target armv7-linux-androideabi generates "__modsi3"
Android bionic libc doesn't provide a __modsi3; instead it provides a
"__aeabi_idivmod". This patch fixes the LLVM ARMISelLowering to generate
the correct call whenever there is a modulo operation.
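
Conceptually, a combined divide-and-modulo helper returns the quotient and the remainder together, and a modulo operation keeps only the remainder. The sketch below models that in portable C++; `toyIDivMod` and `moduloViaDivMod` are hypothetical stand-ins and do not reproduce the runtime routine's calling convention.

```cpp
// Quotient and remainder returned together, as a combined divmod helper does.
struct DivMod {
  int Quot;
  int Rem;
};

// Stand-in for a runtime divmod routine. D must be non-zero.
inline DivMod toyIDivMod(int N, int D) { return {N / D, N % D}; }

// A modulo lowered onto the combined helper just discards the quotient.
inline int moduloViaDivMod(int N, int D) { return toyIDivMod(N, D).Rem; }
```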

Differential Revision: http://reviews.llvm.org/D11661

llvm-svn: 243717

532a1369

Jul 30, 2015

fix memcpy/memset/memmove lowering when optimizing for size · 1166f2ff

Sanjay Patel authored Jul 30, 2015

Fixing MinSize attribute handling was discussed in D11363. 
This is a prerequisite patch to doing that.

The handling of OptSize when lowering mem* functions was broken
on Darwin because it wants to ignore -Os for these cases, but the
existing logic also made it ignore -Oz (MinSize).

The Linux change demonstrates a widespread problem. The backend
doesn't usually recognize the MinSize attribute by itself; it
assumes that if the MinSize attribute exists, then the OptSize 
attribute must also exist. 

Fixing this more generally will be a follow-on patch or two.

Differential Revision: http://reviews.llvm.org/D11568

llvm-svn: 243693

1166f2ff

AMDGPU: Set SubRegIndex size and offset · 7a0c3a92

Matt Arsenault authored Jul 30, 2015

I'm not sure what reasons the comment here could have
had for not setting these. Without these set, there is
an assertion hit during DWARF emission.

llvm-svn: 243661

7a0c3a92

AMDGPU: Fix unreachable when emitting binary debug info · b39e8583

Matt Arsenault authored Jul 30, 2015

Copy implementation of applyFixup from AArch64 with AArch64 bits
ripped out.

Tests will be included with a later commit. Several other
problems must be fixed before binary debug info emission
will work.

llvm-svn: 243660

b39e8583