Commits · be633d91d047238defce21311b8525509ac09dcd · Roger Ferrer / llvm-epi-0.8

Aug 03, 2010

CMake: Change some target library names: · 371b1b91

Oscar Fuentes authored Aug 03, 2010

XCore->XCoreGen
PIC16->PIC16CodeGen

After updating your working copy, the first build will fail because it
is using the old library dependencies. Start the build again and it
will work fine.

llvm-svn: 110127

371b1b91

Aug 02, 2010
- More SPU v2f32 stuff added: insertelement and shuffle. · 77558b7d
  Kalle Raiskila authored Aug 02, 2010
```
llvm-svn: 110038
```
  77558b7d
- Add preliminary v2f32 support for SPU. Like with v2i32, we just · 68b38866
  Kalle Raiskila authored Aug 02, 2010
```
duplicate the instructions and operate on half vectors. 

Also reorder code in SPUInstrInfo.td for better coherency.

llvm-svn: 110037
```
  68b38866
- Add preliminary v2i32 support for SPU backend. As there are no · 622f8eb9
  Kalle Raiskila authored Aug 02, 2010
```
such registers in SPU, this support boils down to "emulating" 
them by duplicating instructions on the general purpose registers. 

This adds the most basic operations on v2i32: passing parameters,
addition, subtraction, multiplication and a few others.

llvm-svn: 110035
```
  622f8eb9
- PR7781: Fix incorrect shifting in PPCTargetLowering::LowerBUILD_VECTOR. · 7595ce05
  Eli Friedman authored Aug 02, 2010
```
llvm-svn: 109998
```
  7595ce05
Aug 01, 2010
- PR7774: Fix undefined shifts in Alpha backend. As a bonus, this actually · 1b2bc1b8
  Eli Friedman authored Aug 01, 2010
```
improves the generated code in some cases.

llvm-svn: 109985
```
  1b2bc1b8
Jul 31, 2010
- Silence some -Asserts uninitialized variable warnings. · 727be43a
  Daniel Dunbar authored Jul 31, 2010
```
llvm-svn: 109956
```
  727be43a
- MC: Remove HasAbsolutizedSet from WindowsX86AsmBackend. · ed80f361
  Michael J. Spencer authored Jul 31, 2010
```
llvm-svn: 109949
```
  ed80f361
- Move newlines before inline jumptables from the asm strings in .td files to · b128824b
  Bob Wilson authored Jul 31, 2010
```
the jtblock_operand print methods.  This avoids extra newlines in the
disassembler's output.  PR7757.

llvm-svn: 109948
```
  b128824b
- Add relax all support to the COFF object streamer. · 6b4925e2
  Michael J. Spencer authored Jul 31, 2010
```
llvm-svn: 109947
```
  6b4925e2
- Add support for disassembling VMVN (immediate) instructions. PR7747. · cd5fc7be
  Bob Wilson authored Jul 31, 2010
```
llvm-svn: 109946
```
  cd5fc7be
- Add -disable-shifter-op to disable isel of shifter ops. On Cortex-a9 the... · 59069ec7
  Evan Cheng authored Jul 30, 2010
```
Add -disable-shifter-op to disable isel of shifter ops. On Cortex-A9 the shifts cost extra instructions, so it might be better to emit them separately to take advantage of dual issue.

llvm-svn: 109934
```
  59069ec7
- Add a check in the ARM disassembler for NEON instructions that would · eb7b21f3
  Bob Wilson authored Jul 30, 2010
```
reference registers past the end of the NEON register file, and report them
as invalid instead of asserting when trying to print them.  PR7746.

llvm-svn: 109933
```
  eb7b21f3
Jul 30, 2010

PPC doesn't support VLA with large alignment. This was · cf0287e5

Dale Johannesen authored Jul 30, 2010

formerly rejected by the FE, so asserted in the BE; now the FE only
warns, so we treat it as a legitimate fatal error in PPC BE.
This means the test for the feature won't pass, so it's xfail'd.

llvm-svn: 109892

cf0287e5

Add the __TEXT,__StaticInit section to the list of sections emitted at the · 4320e2d1
Bob Wilson authored Jul 30, 2010
```
beginning of ARM Darwin assembly files so that it won't be placed after
debug sections.  Radar 8252813.

llvm-svn: 109879
```
4320e2d1

Support all 128-bit AVX vector intrinsics. Most of them I already · 349165b4

Bruno Cardoso Lopes authored Jul 30, 2010

declared during the addition of the assembler support; the additional
changes are:
- Add missing intrinsics
- Move all SSE conversion instructions in X86InstInfo64.td to the SSE.td file.
- Duplicate some patterns to AVX mode.
- Step into PCMPEST/PCMPIST custom inserter and add AVX versions.

llvm-svn: 109878

349165b4

Fix typo! · 405405bb
Bruno Cardoso Lopes authored Jul 30, 2010
```
llvm-svn: 109877
```
405405bb

Many Thumb2 instructions can reference the full ARM register set (i.e., · d343166a

Jim Grosbach authored Jul 30, 2010

have 4 bits per register in the operand encoding), but have undefined
behavior when the operand value is 13 or 15 (SP and PC, respectively).
The trivial coalescer in linear scan sometimes will merge a copy from
SP into a subsequent instruction which uses the copy, and if that
instruction cannot legally reference SP, we get bad code such as:
  mls r0,r9,r0,sp
instead of:
  mov r2, sp
  mls r0, r9, r0, r2

This patch adds a new register class for use by Thumb2 that excludes
the problematic registers (SP and PC) and is used instead of GPR
for those operands which cannot legally reference PC or SP. The
trivial coalescer explicitly requires that the register class
of the destination for the COPY instruction contain the source
register for the COPY to be considered for coalescing. This prevents
errant instructions like that above.

PR7499

llvm-svn: 109842

d343166a

Add builtins for ssat/usat, similar to RealView's __ssat and __usat intrinsics. · c4a96c0e
Nate Begeman authored Jul 29, 2010
```
llvm-svn: 109813
```
c4a96c0e

Jul 29, 2010

Refactor ARM-specific DAG combining in preparation for adding some more · 728eb292
Bob Wilson authored Jul 29, 2010
```
transformations.

llvm-svn: 109800
```
728eb292

Implement vector constants which are splat of · 2bff5054

Dale Johannesen authored Jul 29, 2010

integers with mov + vdup.  8003375.  This is
currently disabled by default because LICM will
not hoist a VDUP, so it pessimizes the code if
the construct occurs inside a loop (8248029).

llvm-svn: 109799

2bff5054

Don't assert on an unrecognized BrMiscFrm instruction. · a9bf1b14
Bob Wilson authored Jul 29, 2010
```
PR7745.

llvm-svn: 109788
```
a9bf1b14

Add intrinsics __builtin_arm_qadd & __builtin_arm_qsub to allow access to the... · 7010a71a

Nate Begeman authored Jul 29, 2010

Add intrinsics __builtin_arm_qadd & __builtin_arm_qsub to allow access to the QADD & QSUB instructions.
Behave identically to __qadd & __qsub RealView instruction intrinsics.

llvm-svn: 109770

7010a71a
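
The QADD instruction performs a signed 32-bit add that saturates instead of wrapping. As a rough illustration of that arithmetic only (not the builtin's implementation, and ignoring the Q flag that the hardware instruction also sets), a portable sketch could look like this. The function name `saturating_add` is hypothetical.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical portable model of a signed 32-bit saturating add.
// It mirrors the arithmetic result of QADD but does not model the Q flag.
inline std::int32_t saturating_add(std::int32_t a, std::int32_t b) {
    // Widen to 64 bits so the intermediate sum cannot overflow.
    const std::int64_t wide = static_cast<std::int64_t>(a) + b;
    const std::int64_t hi = std::numeric_limits<std::int32_t>::max();
    const std::int64_t lo = std::numeric_limits<std::int32_t>::min();
    if (wide > hi) return static_cast<std::int32_t>(hi);  // clamp on positive overflow
    if (wide < lo) return static_cast<std::int32_t>(lo);  // clamp on negative overflow
    return static_cast<std::int32_t>(wide);
}
```

A saturating subtract follows the same shape with `a - b` in the widened computation.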

Revert r109652, and remove the offending assert in loadRegFromStackSlot instead. · ba0e124a

Jakob Stoklund Olesen authored Jul 29, 2010

We do sometimes load from a too small stack slot when dealing with x86 arguments
(varargs and smaller-than-32-bit args). It looks like we know what we are doing
in those cases, so I am going to remove the assert instead of artificially
enlarging stack slot sizes.

The assert in storeRegToStackSlot stays in. We don't want to write beyond the
bounds of a stack slot.

llvm-svn: 109764

ba0e124a

ARM mode version of r109693. Remove incorrect substitution pattern for UXTB16.... · c445a7d2

Jim Grosbach authored Jul 28, 2010

ARM mode version of r109693. Remove incorrect substitution pattern for UXTB16. It wrongly assumed the input shift was actually a rotate. rdar://8240138

llvm-svn: 109696

c445a7d2

Remove incorrect substitution pattern for UXTB16. It wrongly assumed the input... · 716a596c

Jim Grosbach authored Jul 28, 2010

Remove incorrect substitution pattern for UXTB16. It wrongly assumed the input shift was actually a rotate. rdar://8240138

llvm-svn: 109693

716a596c

Remove dead prototype · de0874a4
Jim Grosbach authored Jul 28, 2010
```
llvm-svn: 109691
```
de0874a4

Jul 28, 2010

Create a fixed stack object for varargs that is as large as any register. · f2234fbe

Jakob Stoklund Olesen authored Jul 28, 2010

The size of this object isn't used for anything - technically it is of variable
size.

This avoids a false positive from the assert in
X86InstrInfo::loadRegFromStackSlot, and fixes PR7735.

llvm-svn: 109652

f2234fbe

Fix this code to avoid decrementing an iterator past the beginning · 1da02dfb
Dan Gohman authored Jul 28, 2010
```
of a std::vector.

llvm-svn: 109597
```
1da02dfb
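
The commit does not show the affected code, so the following is only a generic sketch of the hazard it names: walking a `std::vector` backwards must test against `begin()` before decrementing, because decrementing an iterator equal to `begin()` is undefined. The helper `find_last` is hypothetical and not taken from LLVM.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: index of the last element equal to `value`, or -1.
// The loop compares against begin() *before* each decrement, so the
// iterator never moves past the beginning of the vector.
inline std::ptrdiff_t find_last(const std::vector<int>& v, int value) {
    for (auto it = v.end(); it != v.begin();) {
        --it;  // safe: it != v.begin() was checked just above
        if (*it == value) return it - v.begin();
    }
    return -1;  // also reached for an empty vector, with no decrement at all
}
```
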
Do GEP offset calculations with unsigned math rather than signed math · 32f889e5
Dan Gohman authored Jul 28, 2010
```
to avoid undefined behavior on overflow, noticed by John Regehr.

llvm-svn: 109594
```
32f889e5
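
Signed overflow is undefined behavior in C++, while unsigned arithmetic wraps modulo 2^N. A minimal sketch of the general technique, assuming 64-bit offsets: do the multiply-add in `uint64_t` and convert back at the end. The name `accumulate_offset` and its parameters are hypothetical and do not come from the LLVM source. The final conversion back to a signed type is modular as of C++20 and implementation-defined (two's complement in practice) before that.

```cpp
#include <cstdint>

// Hypothetical helper: base + index * scale, computed with wrapping
// unsigned arithmetic so that overflow is well-defined.
inline std::int64_t accumulate_offset(std::int64_t base,
                                      std::int64_t index,
                                      std::int64_t scale) {
    const std::uint64_t result =
        static_cast<std::uint64_t>(base) +
        static_cast<std::uint64_t>(index) * static_cast<std::uint64_t>(scale);
    // Reinterpret the wrapped result as a signed offset.
    return static_cast<std::int64_t>(result);
}
```
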
Implement a vectorized algorithm for <16 x i8> << <16 x i8> · 53afc8f0
Nate Begeman authored Jul 28, 2010
```
This is about 4x faster and smaller than the existing scalarization.

llvm-svn: 109566
```
53afc8f0

~40% faster vector shl <4 x i32> on SSE 4.1. Larger improvements for smaller... · 269a6da0

Nate Begeman authored Jul 27, 2010

~40% faster vector shl <4 x i32> on SSE 4.1. Larger improvements for smaller types coming in future patches.

For:

define <2 x i64> @shl(<4 x i32> %r, <4 x i32> %a) nounwind readnone ssp {
entry:
  %shl = shl <4 x i32> %r, %a                     ; <<4 x i32>> [#uses=1]
  %tmp2 = bitcast <4 x i32> %shl to <2 x i64>     ; <<2 x i64>> [#uses=1]
  ret <2 x i64> %tmp2
}

We get:

_shl:                                   ## @shl
	pslld	$23, %xmm1
	paddd	LCPI0_0, %xmm1
	cvttps2dq	%xmm1, %xmm1
	pmulld	%xmm1, %xmm0
	ret

Instead of:

_shl:                                   ## @shl
	pshufd	$3, %xmm0, %xmm2
	movd	%xmm2, %eax
	pshufd	$3, %xmm1, %xmm2
	movd	%xmm2, %ecx
	shll	%cl, %eax
	movd	%eax, %xmm2
	pshufd	$1, %xmm0, %xmm3
	movd	%xmm3, %eax
	pshufd	$1, %xmm1, %xmm3
	movd	%xmm3, %ecx
	shll	%cl, %eax
	movd	%eax, %xmm3
	punpckldq	%xmm2, %xmm3
	movd	%xmm0, %eax
	movd	%xmm1, %ecx
	shll	%cl, %eax
	movd	%eax, %xmm2
	movhlps	%xmm0, %xmm0
	movd	%xmm0, %eax
	movhlps	%xmm1, %xmm1
	movd	%xmm1, %ecx
	shll	%cl, %eax
	movd	%eax, %xmm0
	punpckldq	%xmm0, %xmm2
	movdqa	%xmm2, %xmm0
	punpckldq	%xmm3, %xmm0
	ret

llvm-svn: 109549

269a6da0
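
The four-instruction sequence above appears to turn each shift amount into a power of two and then multiply. A scalar sketch of one 32-bit lane follows. It assumes that the constant `LCPI0_0` (not shown in the commit message) holds the bit pattern of 1.0f, `0x3f800000`, in every lane, and that shift amounts are in the range 0 to 31. The function name `shl_via_float_exponent` is hypothetical.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical scalar model of one lane of the vectorized shl.
// Assumes amount is in [0, 31] and that LCPI0_0 is 0x3f800000 (1.0f).
inline std::uint32_t shl_via_float_exponent(std::uint32_t value,
                                            std::uint32_t amount) {
    // pslld $23 + paddd: place the amount in the float exponent field and
    // add the exponent bias, giving the bit pattern of the float 2^amount.
    const std::uint32_t bits = (amount << 23) + 0x3f800000u;
    float pow2;
    std::memcpy(&pow2, &bits, sizeof pow2);
    // cvttps2dq: convert 2^amount to an integer. The unsigned cast keeps
    // the amount == 31 case well-defined in C++.
    const std::uint32_t factor = static_cast<std::uint32_t>(pow2);
    // pmulld: multiplying by 2^amount modulo 2^32 equals a left shift.
    return value * factor;
}
```

For example, `shl_via_float_exponent(3, 4)` builds the float 16.0, converts it to the integer 16, and returns 48, the same as `3 << 4`.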

Jul 27, 2010
- Make MC use Windows COFF on Windows and add tests. · f8270bdb
  Michael J. Spencer authored Jul 27, 2010
```
llvm-svn: 109494
```
  f8270bdb
- The isLoadFromStackSlot and isStoreToStackSlot have no way of reporting · 96a890a7
  Jakob Stoklund Olesen authored Jul 27, 2010
```
subregister operands like this:

%reg1040:sub_32bit<def> = MOV32rm <fi#-2>, 1, %reg0, 0, %reg0, %reg1040<imp-def>; mem:LD4[FixedStack-2](align=8)

Make them return false when subreg operands are present. VirtRegRewriter is
making bad assumptions otherwise.

This fixes PR7713.

llvm-svn: 109489
```
  96a890a7
- Add assertions that expose the PR7713 miscompilation: Accessing a stack slot · c3c05ed0
  Jakob Stoklund Olesen authored Jul 27, 2010
```
with a too-big register class.

llvm-svn: 109488
```
  c3c05ed0
- And a bit more non-ASCII stuff. · f902befe
  Eli Friedman authored Jul 26, 2010
```
llvm-svn: 109458
```
  f902befe
- Drop some non-ascii stuff · 1e0d76bf
  Anton Korobeynikov authored Jul 26, 2010
```
llvm-svn: 109456
```
  1e0d76bf
Jul 26, 2010

On x86, f32 / f64 nodes share the same registers as 128-bit vector values. · d4218b87
Evan Cheng authored Jul 26, 2010
```
llvm-svn: 109450
```
d4218b87
Add a note · b61a6f27
Anton Korobeynikov authored Jul 26, 2010
```
llvm-svn: 109448
```
b61a6f27

Temporary hack to let codegen assert or generate poor code in case · 36c2ea6c

Bruno Cardoso Lopes authored Jul 26, 2010

we are using AVX and no AVX version of the desired instruction is present,
this is better for incremental dev (without fallbacks it's easier to spot
what's missing). Not sure this is the best hack though (we can also disable
all HasSSE* predicates by dynamically marking them 'false' if AVX is present)

llvm-svn: 109434

36c2ea6c