Commits · b8bbcbfcc81bb6101322c34fb4c770afb090a68c · Roger Ferrer / llvm-epi-0.8

Sep 27, 2013

Adding intrinsics to the llvm backend for TBM instruction set. · b8bbcbfc
Yunzhong Gao authored Sep 27, 2013
```
Phabricator code review is located here: http://llvm-reviews.chandlerc.com/D1750

llvm-svn: 191539
```
b8bbcbfc

[SystemZ] Rein back the use of block operations · 067817ee

Richard Sandiford authored Sep 27, 2013

The backend tries to use block operations like MVC, NC, OC and XC for
simple scalar operations.  For correctness reasons, it rejects any case
in which the regions might partially overlap.  However, for performance
reasons, it should also reject cases where the regions might be equal,
since the instruction might then not use the fast path.
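
The rule described above can be sketched as a small predicate. This is only an illustration, not the backend's code: the `Region` type and the `may_use_block_op` name are hypothetical, and the sketch assumes both addresses are known integers, whereas the real backend has to reason about regions that merely *might* overlap or be equal.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    """A byte range [start, start + size); hypothetical helper type."""
    start: int
    size: int

    @property
    def end(self) -> int:
        return self.start + self.size


def may_use_block_op(dst: Region, src: Region) -> bool:
    """Return True only when a block operation is both safe and worthwhile."""
    if dst == src:
        # Equal regions: legal, but the instruction might miss its fast path.
        return False
    overlaps = dst.start < src.end and src.start < dst.end
    if overlaps:
        # Partially overlapping regions: rejected for correctness.
        return False
    return True
```

With this sketch, only fully disjoint regions pass; both the partially overlapping case and the identical case fall back to ordinary scalar code.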

This fixes a performance regression seen in bzip2.  We may want to limit
the optimisation even more in future, or even remove it entirely, but I'll
try with this for now.

llvm-svn: 191525

067817ee

[SystemZ] Improve handling of PC-relative addresses · 54b36916

Richard Sandiford authored Sep 27, 2013

The backend previously folded offsets into PC-relative addresses
wherever possible.  That's the right thing to do when the address
can be used directly in a PC-relative memory reference (using things
like LRL).  But if we have a register-based memory reference and need
to load the PC-relative address separately, it's better to use an anchor
point that could be shared with other accesses to the same area of the
variable.

Fixes a FIXME.

llvm-svn: 191524

54b36916

[mips][msa] Implemented insert.d intrinsic. · 6098b335

Daniel Sanders authored Sep 27, 2013

This intrinsic is lowered into an equivalent INSERT_VECTOR_ELT which is
further lowered into a sequence of insert.w's on MIPS32.
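
As a rough model of that lowering, a 64-bit element insert can be expressed as two 32-bit lane inserts. The function names below are hypothetical, and the lane numbering assumes a little-endian layout in which 64-bit element `i` occupies 32-bit lanes `2*i` (low half) and `2*i + 1` (high half); the commit message does not state this.

```python
def insert_w(vec, lane, value):
    """Model of a 32-bit lane insert on a 4 x 32-bit vector."""
    out = list(vec)
    out[lane] = value & 0xFFFFFFFF
    return out


def insert_d_via_insert_w(vec, index, value):
    """Insert a 64-bit value as element `index` (0 or 1) using two insert_w steps.

    Assumes a little-endian lane layout (an assumption of this sketch).
    """
    lo = value & 0xFFFFFFFF
    hi = (value >> 32) & 0xFFFFFFFF
    vec = insert_w(vec, 2 * index, lo)
    return insert_w(vec, 2 * index + 1, hi)
```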

llvm-svn: 191521

6098b335

ARM: Teach assembler to enforce constraints for ARM LDRD destination register operands. · 1aebfa0a

Tilmann Scheller authored Sep 27, 2013

As specified in A8.8.72/A8.8.73/A8.8.74 in the ARM ARM, all variants of the ARM LDRD instruction have the following two constraints:

LDRD<c> <Rt>, <Rt2>, ...

(a) Rt must be even-numbered and not r14
(b) Rt2 must be R(t+1)

If those two constraints are not met, the result of executing the instruction will be unpredictable.

Constraint (b) was already enforced; this commit adds support for constraint (a).
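
The two constraints can be written down as a simple operand check. This is a sketch only: the function name and the diagnostic strings are hypothetical and are not the assembler's actual messages; registers are passed as plain integers (r0 = 0, ..., r15 = 15).

```python
from typing import List


def check_arm_ldrd_dest(rt: int, rt2: int) -> List[str]:
    """Return a list of violated constraints for an ARM-mode LDRD <Rt>, <Rt2>."""
    errors = []
    # Constraint (a): Rt must be even-numbered and not r14.
    if rt % 2 != 0:
        errors.append("Rt must be even-numbered")
    if rt == 14:
        errors.append("Rt must not be r14")
    # Constraint (b): Rt2 must be the register immediately after Rt.
    if rt2 != rt + 1:
        errors.append("Rt2 must be R(t+1)")
    return errors
```

For example, `check_arm_ldrd_dest(0, 1)` reports no violations, while `check_arm_ldrd_dest(1, 2)` reports the odd-numbered `Rt`.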

Fixes rdar://14479793.

llvm-svn: 191520

1aebfa0a

[mips][msa] Implemented fill.d intrinsic. · c72593e6

Daniel Sanders authored Sep 27, 2013

This intrinsic is lowered into an equivalent BUILD_VECTOR which is further
lowered into a sequence of insert.w's on MIPS32.

llvm-svn: 191519

c72593e6

[mips][msa] Implemented copy_[us].d intrinsic. · 7f3d946f

Daniel Sanders authored Sep 27, 2013

This intrinsic is lowered into equivalent copy_s.w instructions during
legalization.

llvm-svn: 191518

7f3d946f

[mips][msa] Rename arguments to MSA_INSERT_DESC_BASE to better match their expected values. · 51287b93
Daniel Sanders authored Sep 27, 2013
```
No functional change.

llvm-svn: 191517
```
51287b93

[mips][msa] Implemented insert_vector_elt for v4f32 and v2f64. · a515070e

Daniel Sanders authored Sep 27, 2013

For v4f32 and v2f64, INSERT_VECTOR_ELT is matched by a pseudo-insn which is
later expanded to appropriate insve.[wd] insns.

llvm-svn: 191515

a515070e

[mips][msa] Implemented extract_vector_elt for v4f32 or v2f64 · 39bb8ba0

Daniel Sanders authored Sep 27, 2013

For v4f32 and v2f64, EXTRACT_VECTOR_ELT is matched by a pseudo-insn which may
be expanded to subregister copies and/or instructions as appropriate.

llvm-svn: 191514

39bb8ba0

[mips][msa] Added support for MSA registers to copyPhysReg · 9ea9ff2d
Daniel Sanders authored Sep 27, 2013
```
llvm-svn: 191512
```
9ea9ff2d
[mips][msa] Added support for matching splati from normal IR (i.e. not intrinsics) · 7e51fe19
Daniel Sanders authored Sep 27, 2013
```
Updated some of the vshf since they (correctly) emit splati's now

llvm-svn: 191511
```
7e51fe19

[mips][msa] Added MSA.txt to describe instruction selection quirks. · 928920ab

Daniel Sanders authored Sep 27, 2013

This file contains notes about the instruction selection for MSA. For example,
it notes that ilvl.d cannot be selected because ilvev.d covers the same
cases and is selected instead of ilvl.d.

llvm-svn: 191507

928920ab

Fix comment. · 041f7176
Tilmann Scheller authored Sep 27, 2013
```
llvm-svn: 191505
```
041f7176

ARM: Teach assembler to enforce constraint for Thumb2 LDRD (literal/immediate)... · 88c8f165

Tilmann Scheller authored Sep 27, 2013

ARM: Teach assembler to enforce constraint for Thumb2 LDRD (literal/immediate) destination register operands.

LDRD<c> <Rt>, <Rt2>, <label>
LDRD<c> <Rt>, <Rt2>, [<Rn>{, #+/-<imm>}]
LDRD<c> <Rt>, <Rt2>, [<Rn>], #+/-<imm>
LDRD<c> <Rt>, <Rt2>, [<Rn>, #+/-<imm>]!

As specified in A8.8.72/A8.8.73 in the ARM ARM, the T1 encoding has a constraint which enforces that Rt != Rt2.

If this constraint is not met, the result of executing the instruction will be unpredictable.

Fixes rdar://14479780.

llvm-svn: 191504

88c8f165

[mips][msa] Tidy up · 84e7caf7

Daniel Sanders authored Sep 27, 2013

lowerMSABinaryIntr, lowerMSABinaryImmIntr, lowerMSABranchIntr,
and lowerMSAUnaryIntr were trivially small functions. Inlined them into
their callers.

lowerMSASplat now takes its caller's SDLoc instead of making a new one.

No functional change.

llvm-svn: 191503

84e7caf7

[mips][msa] MSA requires FR=1 mode (64-bit FPU register file). Report fatal... · 1b1e25b7
Daniel Sanders authored Sep 27, 2013
```
[mips][msa] MSA requires FR=1 mode (64-bit FPU register file). Report fatal error when using it in FR=0 mode.

llvm-svn: 191498
```
1b1e25b7
[mips][msa] Expand all truncstores and loadexts for MSA as well as DSP · 36c671e2
Daniel Sanders authored Sep 27, 2013
```
llvm-svn: 191496
```
36c671e2

[mips][msa] Added missing check in performSRACombine · f4f1a872

Daniel Sanders authored Sep 27, 2013

Reviewers: jacksprat, dsanders

Reviewed By: dsanders

Differential Revision: http://llvm-reviews.chandlerc.com/D1755

llvm-svn: 191495

f4f1a872

Put HasAVX512 predicate on some patterns to properly disable them when AVX512... · dbe8b7d2

Craig Topper authored Sep 27, 2013

Put HasAVX512 predicate on some patterns to properly disable them when AVX512 isn't enabled. Currently it works simply because the SSE and AVX version of the same patterns are checked first in the DAG isel table.

llvm-svn: 191490

dbe8b7d2

Switch HasAVX to UseAVX in one spot to ensure that AVX512 form of VINSERTPS is used in AVX512 mode. · 8f14de8f
Craig Topper authored Sep 27, 2013
```
llvm-svn: 191489
```
8f14de8f
Remove some duplicate patterns. · c6a1aac7
Craig Topper authored Sep 27, 2013
```
llvm-svn: 191488
```
c6a1aac7

Fixing Intel format of the vshufpd instruction. · 4467f33e

Yunzhong Gao authored Sep 27, 2013

Phabricator code review is located at: http://llvm-reviews.chandlerc.com/D1759

llvm-svn: 191481

4467f33e

Sep 26, 2013

[mips][msa] Direct Object Emission for 3RF instructions. · cb8b40b0
Jack Carter authored Sep 26, 2013
```
Patch by Matheus Almeida

llvm-svn: 191461
```
cb8b40b0

[mips][msa] Updates encoding of 3RF instructions to match the latest revision... · 142ec828

Jack Carter authored Sep 26, 2013

[mips][msa] Updates encoding of 3RF instructions to match the latest revision of the MSA spec (1.06).

This does not affect any of the existing output.

Patch by Matheus Almeida

llvm-svn: 191460

142ec828

Fix PR 17372: Emitting PLD for stack address for ARM Thumb2 · 286304a3
Weiming Zhao authored Sep 26, 2013
```
t2PLDi12, t2PLDi8, and t2PLDs were omitted in Thumb2InstrInfo.
This patch fixes it.

llvm-svn: 191441
```
286304a3

[PowerPC] Fix PR17354: Generate nop after local calls for PIC code. · cea15962

Bill Schmidt authored Sep 26, 2013

When generating code for shared libraries, even local calls may be
intercepted, so we need a nop after the call for the linker to fix up the
TOC.  Test case adapted from the one provided in PR17354.

llvm-svn: 191440

cea15962

[Sparc] Implements exception handling in SPARC with DwarfCFI. · 4c0cdd73
Venkatraman Govindaraju authored Sep 26, 2013
```
llvm-svn: 191432
```
4c0cdd73
[ARM] Use the load-acquire/store-release instructions optimally in AArch32. · b4ad2f39
Amara Emerson authored Sep 26, 2013
```
Patch by Artyom Skrobov.

llvm-svn: 191428
```
b4ad2f39

PPC: Allow partial fills in writeNopData() · 7137420d

David Majnemer authored Sep 26, 2013

When asked to pad an irregular number of bytes, we should fill with
zeros.  This is consistent with the behavior specified in the AIX
Assembler Language Reference as well as other LLVM and binutils
assemblers.

N.B. There is a small deviation from binutils' PPC assembler:
when handling pads which are greater than 4 bytes but not a multiple of 4,
binutils will not emit any NOP sequences at all and only use zeros.
This may or may not be a bug, but there is no compelling rationale as to
why that behavior is important to emulate.  If that behavior is needed,
we can change writeNopData() to behave in the same way.
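
The two padding behaviors can be compared with a small model. This is a sketch under stated assumptions, not the code in writeNopData(): the function names are hypothetical, the NOP is taken to be the `ori 0,0,0` encoding `0x60000000` in big-endian byte order, and the zero bytes are assumed to follow the whole NOPs (the commit message does not state the order).

```python
PPC_NOP = bytes.fromhex("60000000")  # "ori 0,0,0", big-endian (assumed)


def nop_data_llvm_style(count):
    """Whole NOPs first, then zero bytes for the 1-3 byte remainder (assumed order)."""
    return PPC_NOP * (count // 4) + b"\x00" * (count % 4)


def nop_data_binutils_style(count):
    """All zeros when the pad is > 4 bytes and not a multiple of 4, per the note above."""
    if count > 4 and count % 4 != 0:
        return b"\x00" * count
    return nop_data_llvm_style(count)
```

For a 6-byte pad, the first function yields one NOP followed by two zero bytes, while the second yields six zero bytes; for pads that are a multiple of 4 the two agree.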

This fixes PR17352.

llvm-svn: 191426

7137420d

Added temp flag -misched-bench for staging in default changes. · 71e8bb6d
Andrew Trick authored Sep 26, 2013
```
llvm-svn: 191423
```
71e8bb6d
PPC: Do not introduce ISD nodes for fctid and fctiw · 08249a31
David Majnemer authored Sep 26, 2013
```
llvm-svn: 191421
```
08249a31

PPC: Add support for fctid and fctiw · 6ad26d33

David Majnemer authored Sep 26, 2013

Encodings were checked against the Power ISA documents and double
checked against binutils.

This fixes PR17350.

llvm-svn: 191419

6ad26d33

[mips][msa] Direct Object Emission for 3R instructions. · 3eb663b0

Jack Carter authored Sep 26, 2013

This is the first set of instructions with a ".b" modifier, so we need to add the code required to disassemble an MSA128B register class.
 
Patch by Matheus Almeida

llvm-svn: 191415

3eb663b0

[mips][msa] Updates encoding of 3R instructions to match the latest revision... · 77551abe

Jack Carter authored Sep 26, 2013

[mips][msa] Updates encoding of 3R instructions to match the latest revision of the MSA spec (1.06).
 
Internal changes only.
 
Patch by Matheus Almeida

llvm-svn: 191414

77551abe

[mips][msa] Direct Object Emission for 2RF instructions. · 33812982
Jack Carter authored Sep 25, 2013
```
 
Patch by Matheus Almeida

llvm-svn: 191413
```
33812982

[mips][msa] Direct Object Emission support for the MSA instruction set. · 5dc8ac92

Jack Carter authored Sep 25, 2013

In more detail, this patch adds the ability to parse, encode and decode MSA registers ($w0-$w31). The format of 2RF instructions (MipsMSAInstrFormat.td) was updated so that we could attach a test case to this patch i.e., the test case parses, encodes and decodes 2 MSA instructions. Following patches will add the remainder of the instructions.
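
To make the register-parsing part concrete, the mapping between the assembly names `$w0`-`$w31` and register numbers can be modeled as below. The helper names are hypothetical and the sketch covers only the name-to-number round trip; it says nothing about how the numbers are placed into instruction encodings.

```python
import re

# Accepts $w0 .. $w99 syntactically; the range check below narrows it to 0-31.
_MSA_REG = re.compile(r"\$w(0|[1-9]\d?)\Z")


def parse_msa_register(name):
    """Map an MSA register name such as "$w5" to its number (0-31)."""
    match = _MSA_REG.match(name)
    if match is None:
        raise ValueError("not an MSA register: %r" % name)
    number = int(match.group(1))
    if number > 31:
        raise ValueError("MSA register out of range: %r" % name)
    return number


def format_msa_register(number):
    """Inverse of parse_msa_register."""
    if not 0 <= number <= 31:
        raise ValueError("MSA register out of range: %r" % number)
    return "$w%d" % number
```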

Note that DecodeMSA128BRegisterClass is missing from MipsDisassembler.td because it's not yet required at this stage and having it would cause a compiler warning (unused function).

Patch by Matheus Almeida

llvm-svn: 191412

5dc8ac92

[mips][msa] Updates encoding of 2RF instructions to match the latest revision... · 56c681eb

Jack Carter authored Sep 25, 2013

[mips][msa] Updates encoding of 2RF instructions to match the latest revision of the MSA spec (1.06).
 
This only changes internal encodings and doesn't affect output.


Patch by Matheus Almeida

llvm-svn: 191411

56c681eb

Fix PR 17368: disable vector mul distribution for square of add/sub for ARM · 2052f484

Weiming Zhao authored Sep 25, 2013

Generally, it is desirable to distribute (a + b) * c to a*c + b*c for
ARM with VMLx forwarding, where a, b and c are vectors.
However, for (a + b)*(a + b), distribution will result in one extra
instruction.
With distribution:
  x = a + b (add)
  y = a * x (mul)
  z = y + b * x (mla)

Without distribution:
  x = a + b (add)
  z = x * x (mul)

This patch checks if a mul is a square of add/sub. If yes, skip
distribution.
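
The check can be illustrated on a toy expression tree. This is a sketch, not the ARM backend's DAG combine: expressions are modeled as hypothetical `(op, lhs, rhs)` tuples with string leaves, and structural equality of the two operands stands in for "both operands are the same node".

```python
def is_square_of_add_sub(expr):
    """True for ('mul', X, X) where X is an ('add', ...) or ('sub', ...) node."""
    if not (isinstance(expr, tuple) and len(expr) == 3 and expr[0] == "mul"):
        return False
    _, lhs, rhs = expr
    return lhs == rhs and isinstance(lhs, tuple) and lhs[0] in ("add", "sub")


def should_distribute(expr):
    """Distribute (a + b) * c unless the multiply is a square of add/sub."""
    if not (isinstance(expr, tuple) and len(expr) == 3 and expr[0] == "mul"):
        return False
    if is_square_of_add_sub(expr):
        return False  # distribution would cost one extra instruction
    _, lhs, rhs = expr
    return any(
        isinstance(op, tuple) and op[0] in ("add", "sub") for op in (lhs, rhs)
    )
```

Here `("mul", ("add", "a", "b"), "c")` is still distributed, while `("mul", ("add", "a", "b"), ("add", "a", "b"))` is left as a single multiply.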

llvm-svn: 191410

2052f484

Sep 25, 2013

Fix a bad typo in the inline assembly code for mips16 pic fp stubs · a6ce797f
Reed Kotler authored Sep 25, 2013
```
and make one cosmetic cleanup to make it look the same as gcc
in this area; adjusting test cases.

llvm-svn: 191400
```
  a6ce797f