  1. Nov 04, 2013
  2. Nov 03, 2013
  3. Nov 02, 2013
    • Fix PR17764 · b638d05e
      Michael Liao authored
      - When selecting BLEND from vselect, the operands need swapping due to the
        difference between vselect and SSE/AVX's BLEND instruction.
      
      llvm-svn: 193900
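The operand swap above can be sketched in plain Python. This is a hypothetical model, not LLVM's actual lowering code: assume vselect takes from its first value operand where the mask bit is set, while the BLEND form takes from its second, so the lowering must swap the value operands to preserve semantics.

```python
# Hypothetical per-lane model of the two select flavors described above.

def vselect(mask, a, b):
    # Take from a where the mask bit is set, else from b.
    return [x if m else y for m, x, y in zip(mask, a, b)]

def blend(mask, a, b):
    # BLEND-style (assumed for illustration): take from the *second*
    # operand where the mask bit is set.
    return [y if m else x for m, x, y in zip(mask, a, b)]

mask = [1, 0, 1, 0]
a = [10, 11, 12, 13]
b = [20, 21, 22, 23]

# Swapping the value operands makes the two forms agree.
assert vselect(mask, a, b) == blend(mask, b, a)
```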
  4. Nov 01, 2013
  5. Oct 31, 2013
    • Fix unused variable warnings. · 3e6f7aff
      Dan Gohman authored
      llvm-svn: 193823
    • Add new calling convention for WebKit Java Script. · a3a11ded
      Andrew Trick authored
      llvm-svn: 193812
    • Add support for stack map generation in the X86 backend. · 153ebe6d
      Andrew Trick authored
      Originally implemented by Lang Hames.
      
      llvm-svn: 193811
    • Use StringRef::startswith_lower. No functionality change. · 29d29108
      Rui Ueyama authored
      llvm-svn: 193796
    • [AArch64] Add support for NEON scalar shift immediate instructions. · 20e1f20d
      Chad Rosier authored
      llvm-svn: 193790
    • SparcV9 doesn't have a rem instruction either. · 2262cfaf
      Roman Divacky authored
      llvm-svn: 193789
    • whitespace · d4d1d9c0
      Andrew Trick authored
      llvm-svn: 193765
    • Remove another unused flag. · 4b102d0e
      Rafael Espindola authored
      llvm-svn: 193756
    • Remove unused flag. · 74e1d0a0
      Rafael Espindola authored
      llvm-svn: 193752
    • Add AVX512 unmasked integer broadcast intrinsics and support. · 394d557f
      Cameron McInally authored
      llvm-svn: 193748
    • AVX-512: Implemented CMOV for 512-bit vectors · 49665690
      Elena Demikhovsky authored
      llvm-svn: 193747
    • [SystemZ] Automatically detect zEC12 and z196 hosts · f834ea19
      Richard Sandiford authored
      As on other hosts, the CPU identification instruction is privileged,
      so we need to look through /proc/cpuinfo.  I copied the PowerPC way of
      handling "generic".
      
      Several tests were implicitly assuming z10 and so failed on z196.
      
      llvm-svn: 193742
    • [AArch64] Make the use of FP instructions optional, but enabled by default. · f80f95fc
      Amara Emerson authored
      This adds a new subtarget feature called FPARMv8 (implied by NEON), and
      predicates the support of the FP instructions and registers on this feature.
      
      llvm-svn: 193739
    • Legalize: Improve legalization of long vector extends. · 72366786
      Jim Grosbach authored
      When an extend more than doubles the size of the elements (e.g., a zext
      from v16i8 to v16i32), the normal legalization method of splitting the
      vectors runs into problems: by the time the destination vector is
      legal, the source vector is illegal. The end result is that the
      operation often becomes scalarized, with typically horrible performance.
      For example, on x86_64, the simple input of:
      define void @bar(<16 x i8> %a, <16 x i32>* %p) nounwind {
        %tmp = zext <16 x i8> %a to <16 x i32>
        store <16 x i32> %tmp, <16 x i32>*%p
        ret void
      }
      
      Generates:
        .section  __TEXT,__text,regular,pure_instructions
        .section  __TEXT,__const
        .align  5
      LCPI0_0:
        .long 255                     ## 0xff
        .long 255                     ## 0xff
        .long 255                     ## 0xff
        .long 255                     ## 0xff
        .long 255                     ## 0xff
        .long 255                     ## 0xff
        .long 255                     ## 0xff
        .long 255                     ## 0xff
        .section  __TEXT,__text,regular,pure_instructions
        .globl  _bar
        .align  4, 0x90
      _bar:
        vpunpckhbw  %xmm0, %xmm0, %xmm1
        vpunpckhwd  %xmm0, %xmm1, %xmm2
        vpmovzxwd %xmm1, %xmm1
        vinsertf128 $1, %xmm2, %ymm1, %ymm1
        vmovaps LCPI0_0(%rip), %ymm2
        vandps  %ymm2, %ymm1, %ymm1
        vpmovzxbw %xmm0, %xmm3
        vpunpckhwd  %xmm0, %xmm3, %xmm3
        vpmovzxbd %xmm0, %xmm0
        vinsertf128 $1, %xmm3, %ymm0, %ymm0
        vandps  %ymm2, %ymm0, %ymm0
        vmovaps %ymm0, (%rdi)
        vmovaps %ymm1, 32(%rdi)
        vzeroupper
        ret
      
      So instead we can check if there are legal types that enable us to split
      more cleverly when the input vector is already legal such that we don't
      turn it into an illegal type. If the extend is such that it's more than
      doubling the size of the input we check if
        - the number of vector elements is even,
        - the source type is legal,
        - the type of a split source is illegal,
        - the type of an extended (by doubling element size) source is legal, and
        - the type of that extended source when split is legal.
      If the conditions are met, instead of just splitting both the
      destination and the source types, we create an extend that only goes up
      one "step" (doubling the element width), and then continue legalizing the
      rest of the operation normally. The result is that this operates as a
      new, more efficient, termination condition for the loop of "split the
      operation until the destination type is legal."
      
      With this change, the above example now compiles to:
      _bar:
        vpxor %xmm1, %xmm1, %xmm1
        vpunpcklbw  %xmm1, %xmm0, %xmm2
        vpunpckhwd  %xmm1, %xmm2, %xmm3
        vpunpcklwd  %xmm1, %xmm2, %xmm2
        vinsertf128 $1, %xmm3, %ymm2, %ymm2
        vpunpckhbw  %xmm1, %xmm0, %xmm0
        vpunpckhwd  %xmm1, %xmm0, %xmm3
        vpunpcklwd  %xmm1, %xmm0, %xmm0
        vinsertf128 $1, %xmm3, %ymm0, %ymm0
        vmovaps %ymm0, 32(%rdi)
        vmovaps %ymm2, (%rdi)
        vzeroupper
        ret
      
      This generalizes a custom lowering that was added a while back to the
      ARM backend. That lowering is no longer necessary, and is removed. The
      testcases for it, however, provide excellent ARM tests for this change
      and so remain.
      
      rdar://14735100
      
      llvm-svn: 193727
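The five-condition check above can be modeled with a small sketch. This is not LLVM's TargetLowering API; it assumes a hypothetical legality model in which only 128- and 256-bit vector types are legal (roughly SSE/AVX register widths), and represents a vector type as a (element-bits, element-count) pair.

```python
# Hypothetical legality model: a vector type is legal iff its total
# width is 128 or 256 bits. Types are (elem_bits, num_elems) pairs.
def is_legal(ty):
    elem_bits, n = ty
    return elem_bits * n in (128, 256)

def should_extend_one_step(src):
    """Check the five conditions from the commit message: even element
    count; legal source; illegal split source; legal one-step-extended
    source; legal split of that extended source."""
    elem_bits, n = src
    if n % 2 != 0:
        return False
    split_src = (elem_bits, n // 2)        # split the source in half
    ext_src = (elem_bits * 2, n)           # extend elements by one step
    split_ext = (elem_bits * 2, n // 2)    # split the extended source
    return (is_legal(src)
            and not is_legal(split_src)
            and is_legal(ext_src)
            and is_legal(split_ext))

# v16i8 -> prefer one step (to v16i16) rather than splitting:
assert should_extend_one_step((8, 16))
# v16i16 splits cleanly (v8i16 is legal), so no extra step is needed:
assert not should_extend_one_step((16, 16))
```

Under this model, legalizing the zext from v16i8 to v16i32 first promotes to v16i16 in one step, after which ordinary splitting takes over, matching the improved codegen shown above.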
    • Fix a few typos · 909d0c06
      Matt Arsenault authored
      llvm-svn: 193723
  6. Oct 30, 2013
    • This commit adds some (but not all) of the x86-64 relocations that are not
      currently supported in the ELF object writer, along with a simple test case. · 04d88fba
      Tom Roeder authored
      
      llvm-svn: 193709
    • R600: Custom lower f32 = uint_to_fp i64 · c947d8ca
      Tom Stellard authored
      llvm-svn: 193701
    • Add #include of raw_ostream.h to MipsSEISelLowering.cpp · 3e9b1c10
      Hans Wennborg authored
      Fixing this Windows build error:
      
      ..\lib\Target\Mips\MipsSEISelLowering.cpp(997) : error C2027: use of undefined type 'llvm::raw_ostream'
      
      llvm-svn: 193696
    • d5f554f0
      Daniel Sanders authored
    • [mips][msa] Added support for matching bmnz, bmnzi, bmz, and bmzi from normal IR (i.e. not intrinsics) · ab94b537
      Daniel Sanders authored
      
      Also corrected the definition of the intrinsics for these instructions (the
      result register is also the first operand), and added intrinsics for bsel and
      bseli to clang (they already existed in the backend).
      
      These four operations are mostly equivalent to bsel, and bseli (the difference
      is which operand is tied to the result). As a result some of the tests changed
      as described below.
      
      bitwise.ll:
      - bsel.v test adapted so that the mask is unknown at compile-time. This stops
        it emitting bmnzi.b instead of the intended bsel.v.
      - The bseli.b test now tests the right thing. Namely the case when one of the
        values is an uimm8, rather than when the condition is a uimm8 (which is
        covered by bmnzi.b)
      
      compare.ll:
      - bsel.v tests now (correctly) emit bmnz.v instead of bsel.v because this
        is the same operation (see MSA.txt).
      
      i8.ll
      - CHECK-DAG-ized test.
      - bmzi.b test now (correctly) emits equivalent bmnzi.b with swapped operands
        because this is the same operation (see MSA.txt).
      - bseli.b still emits bseli.b though because the immediate makes it
        distinguishable from bmnzi.b.
      
      vec.ll:
      - CHECK-DAG-ized test.
      - bmz.v tests now (correctly) emit bmnz.v with swapped operands (see
        MSA.txt).
      - bsel.v tests now (correctly) emit bmnz.v with swapped operands (see
        MSA.txt).
      
      llvm-svn: 193693
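Why swapping operands turns one of these instructions into another can be sketched with Python ints. The semantics below are a simplified model condensed from the description above (the operations differ only in which operand is tied to the result and which mask polarity selects the new bits); MSA.txt is the authoritative reference.

```python
# Simplified single-lane model of the MSA bit-move operations
# (assumed semantics, for illustration only).

def bmnz(dst, src, mask):
    # Move src bits into dst where the mask bit is NOT zero.
    return (src & mask) | (dst & ~mask)

def bmz(dst, src, mask):
    # Move src bits into dst where the mask bit IS zero.
    return (src & ~mask) | (dst & mask)

a, b, m = 0x5A, 0xC3, 0xF0

# select(m, a, b) can be formed either way by choosing which value
# operand is tied to the destination -- hence the swapped-operand
# equivalence the tests now check for:
assert bmnz(dst=b, src=a, mask=m) == bmz(dst=a, src=b, mask=m)
```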
    • be020d03
      Chad Rosier authored
    • [mips][msa] Added support for matching bins[lr]i.[bhwd] from normal IR (i.e. not intrinsics) · d74b130c
      Daniel Sanders authored
      This required correcting the definition of the bins[lr]i intrinsics because
      the result is also the first operand.
      
      It also required removing the (arbitrary) check for 32-bit immediates in
      MipsSEDAGToDAGISel::selectVSplat().
      
      Currently using binsli.d with 2 bits set in the mask doesn't select binsli.d
      because the constant is legalized into a ConstantPool. Similar things can
      happen with binsri.d with more than 10 bits set in the mask. The resulting
      code when this happens is correct but not optimal.
      
      llvm-svn: 193687
    • [mips][msa] Combine binsri-like DAG of AND and OR into equivalent VSELECT · 53fe6c4d
      Daniel Sanders authored
      (or (and $a, $mask), (and $b, $inverse_mask)) => (vselect $mask, $a, $b).
      where $mask is a constant splat. This allows bitwise operations to make use
      of bsel.
      
      It's also a stepping stone towards matching bins[lr], and bins[lr]i from
      normal IR.
      
      Two sets of similar tests have been added in this commit. The bsel_* functions
      test the case where binsri cannot be used. The binsr_*_i functions will
      start to use the binsri instruction in the next commit.
      
      llvm-svn: 193682
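The combine above rests on the bitwise-select identity: (a & mask) | (b & ~mask) picks each bit from a where the mask bit is 1 and from b where it is 0, which is exactly a per-bit vselect. A quick exhaustive-ish check over 8-bit values:

```python
# Verify the identity (a & m) | (b & ~m) == per-bit select(m, a, b)
# over 8-bit lanes. Plain Python ints stand in for vector lanes.

MASK8 = 0xFF  # keep results within 8 bits

def bitwise_select(mask, a, b):
    return (a & mask) | (b & ~mask & MASK8)

def reference_select(mask, a, b):
    # Bit-by-bit reference: take a's bit where the mask bit is set.
    out = 0
    for bit in range(8):
        src = a if (mask >> bit) & 1 else b
        out |= ((src >> bit) & 1) << bit
    return out

assert all(
    bitwise_select(m, a, b) == reference_select(m, a, b)
    for m in range(256)
    for a in (0x00, 0x5A, 0xA5, 0xFF)
    for b in (0x00, 0xC3, 0x3C, 0xFF)
)
```

Because the mask is a constant splat, the same mask bits apply to every lane, which is what lets the AND/OR pair be rewritten as a single vselect (and ultimately a bsel-family instruction).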