Commits · 9d49056ef00f13fd7e609dfe16c5e7ef053632ad · Roger Ferrer / llvm-epi-0.8

Apr 25, 2013

R600: Initialize BooleanVectorContents · 87047f69
Tom Stellard authored Apr 24, 2013
```
Fixes test/CodeGen/R600/setcc.ll

llvm-svn: 180231
```
87047f69

R600: Use SHT_PROGBITS for the .AMDGPU.config section · 34e4068d

Tom Stellard authored Apr 24, 2013

The libelf implementation that is distributed here:
http://www.mr511.de/software/english.html
will not parse sections that are marked SHT_NULL.

llvm-svn: 180230

34e4068d

Apr 23, 2013
- Hexagon: Use multiclass for combine and STri[bhwd]_shl_V4 instructions. · af2359b9
  Jyotsna Verma authored Apr 23, 2013
```
llvm-svn: 180145
```
  af2359b9
- Hexagon: Define relations for GP-relative instructions. · f00aab98
  Jyotsna Verma authored Apr 23, 2013
```
No functionality change.

llvm-svn: 180144
```
  f00aab98
- Add more tests for r179925 to verify correct handling of signext/zeroext;... · 8118e0b5
  Stephen Lin authored Apr 23, 2013
```
Add more tests for r179925 to verify correct handling of signext/zeroext; strengthen condition check to require actual MVT::i32 virtual register types, just in case (no actual functionality change)

llvm-svn: 180138
```
  8118e0b5
- Lowercase "is" boolean variable prefix for consistency within function, no functionality change. · 4eedb29b
  Stephen Lin authored Apr 23, 2013
```
llvm-svn: 180136
```
  4eedb29b
- Hexagon: Remove assembler mapped instruction definitions. · 89c84821
  Jyotsna Verma authored Apr 23, 2013
```
llvm-svn: 180133
```
  89c84821
- Change commentary for PowerPC Boolean vector contents. · a76bf5a6
  Bill Schmidt authored Apr 23, 2013
```
No functional change intended.

llvm-svn: 180131
```
  a76bf5a6
- [mips] Compare splat value with element size instead of calling isUIntN. · e9d0b318
  Akira Hatanaka authored Apr 23, 2013
```
No intended changes in functionality.

llvm-svn: 180130
```
  e9d0b318
- DAGCombine should not aggressively fold SEXT(VSETCC(...)) into a wider VSETCC... · 2d4cca35
  Owen Anderson authored Apr 23, 2013
```
DAGCombine should not aggressively fold SEXT(VSETCC(...)) into a wider VSETCC without first checking the target's vector boolean contents.
This exposed an issue with PowerPC AltiVec where it appears it was setting the wrong vector boolean contents.  The included change
fixes the PowerPC tests, and was OK'd by Hal.

llvm-svn: 180129
```
  2d4cca35
- R600: Use .AMDGPU.config section to emit stacksize · 117f075f
  Vincent Lejeune authored Apr 23, 2013
```
llvm-svn: 180124
```
  117f075f
- R600: Add CF_END · b6bfe85a
  Vincent Lejeune authored Apr 23, 2013
```
llvm-svn: 180123
```
  b6bfe85a
- Hexagon: Remove duplicate instructions to handle global/immediate values · a696239b
  Jyotsna Verma authored Apr 23, 2013
```
for absolute/absolute-set addressing modes.

llvm-svn: 180120
```
  a696239b
- AArch64: remove unnecessary check that RS is valid · 2ac2d4c5
  Tim Northover authored Apr 23, 2013
```
AArch64 always demands a register-scavenger, so the pointer should never be
NULL. However, in the spirit of paranoia, we'll assert it before use just in
case.

llvm-svn: 180080
```
  2ac2d4c5
- Remove unused DwarfSectionOffsetDirective string · 034ca0fe
  Matt Arsenault authored Apr 22, 2013
```
The value isn't actually used, and setting it emits a COFF specific
directive.

llvm-svn: 180064
```
  034ca0fe
- Move C++ code out of the C headers and into either C++ headers · 04d4e931
  Eric Christopher authored Apr 22, 2013
```
or the C++ files themselves. This enables people to use
just a C compiler to interoperate with LLVM.

llvm-svn: 180063
```
  04d4e931
- [ms-inline asm] Removed this unnecessary check. In the current implementation, · 65dd0399
  Chad Rosier authored Apr 22, 2013
```
Disp will always be one of MCSymbolRefExpr or MCConstantExpr, and never NULL.

llvm-svn: 180059
```
  65dd0399
- [ms-inline asm] Add the OpDecl to the InlineAsmIdentifierInfo struct and in turn · 732b837a
  Chad Rosier authored Apr 22, 2013
```
the MCParsedAsmOperand.
Part of rdar://13663589

llvm-svn: 180054
```
  732b837a
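
A rough illustration of the DAGCombine change in 2d4cca35 above: sign-extending a narrow compare always yields 0 or -1 per lane, whereas a wider compare yields whatever the target declares as its vector "true" value, so the fold is only value-preserving when that value is all ones. The sketch below models a single lane in plain C++. `BoolContents`, `sextOfNarrowSetcc`, `wideSetcc` and `foldIsSafe` are hypothetical names used for illustration only; they are not LLVM APIs.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a target's vector boolean convention.
enum class BoolContents { ZeroOrOne, ZeroOrNegativeOne };

// One lane of SEXT(VSETCC(a < b)): the i1 result is sign-extended,
// so "true" becomes all ones (-1) and "false" becomes 0.
constexpr std::int32_t sextOfNarrowSetcc(std::int32_t a, std::int32_t b) {
  return a < b ? -1 : 0;
}

// One lane of a wide VSETCC: the "true" value depends on the target.
constexpr std::int32_t wideSetcc(std::int32_t a, std::int32_t b,
                                 BoolContents contents) {
  return a >= b ? 0 : (contents == BoolContents::ZeroOrNegativeOne ? -1 : 1);
}

// The fold preserves the lane value only if both forms agree
// for a true compare and for a false compare.
constexpr bool foldIsSafe(BoolContents contents) {
  return wideSetcc(0, 1, contents) == sextOfNarrowSetcc(0, 1) &&
         wideSetcc(1, 0, contents) == sextOfNarrowSetcc(1, 0);
}

// Small self-check showing the two cases.
inline void checkFoldExamples() {
  assert(foldIsSafe(BoolContents::ZeroOrNegativeOne));
  assert(!foldIsSafe(BoolContents::ZeroOrOne));
}
```

Under this model, a target that reports 0/1 vector booleans must not see the fold, which matches the commit's requirement to check the target's vector boolean contents first.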
Apr 22, 2013

Fix unused variable warning. · eeb00349
Chad Rosier authored Apr 22, 2013
```
llvm-svn: 180044
```
eeb00349

80 columns. · d8fb032c
Akira Hatanaka authored Apr 22, 2013
```
llvm-svn: 180040
```
d8fb032c

[mips] In performDSPShiftCombine, check that all elements in the vector are · 0d6964cf
Akira Hatanaka authored Apr 22, 2013
```
shifted by the same amount and the shift amount is smaller than the element
size.

llvm-svn: 180039
```
0d6964cf
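
The condition described in 0d6964cf can be sketched as a standalone predicate. This is a minimal sketch, not the code from the patch: `isUniformInRangeShift` is a hypothetical helper, and the per-lane shift amounts are assumed to be already extracted into a plain vector.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper: returns true when every lane is shifted by the same
// amount and that amount is smaller than the element width in bits.
inline bool isUniformInRangeShift(const std::vector<std::uint64_t> &amounts,
                                  unsigned eltBits) {
  if (amounts.empty())
    return false;
  const std::uint64_t first = amounts.front();
  if (first >= eltBits)  // shift amount must be smaller than the element size
    return false;
  for (std::uint64_t amt : amounts)
    if (amt != first)    // all lanes must use the same shift amount
      return false;
  return true;
}

// Small self-check with 8-bit elements.
inline void checkShiftExamples() {
  assert(isUniformInRangeShift({3, 3, 3, 3}, 8));
  assert(!isUniformInRangeShift({3, 3, 2, 3}, 8));  // lanes differ
  assert(!isUniformInRangeShift({8, 8, 8, 8}, 8));  // amount == element size
}
```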

[ms-inline asm] Remove the identifier parsing logic from the AsmParser. This is · cb78f0d0

Chad Rosier authored Apr 22, 2013

now taken care of by the frontend, which allows us to parse arbitrary C/C++
variables.
Part of rdar://13663589

llvm-svn: 180037

cb78f0d0

[ms-inline asm] Refactor/clean up the SemaLookup interface. No functional · f6675c3d
Chad Rosier authored Apr 22, 2013
```
change intended.
Part of rdar://13663589

llvm-svn: 180028
```
f6675c3d

No really, don't store anything to this since it's unconditionally · cc2cfe42
Eric Christopher authored Apr 22, 2013
```
set below.

llvm-svn: 180015
```
cc2cfe42

Remove variable store that is never read. · 6647fb2c
Eric Christopher authored Apr 22, 2013
```
llvm-svn: 180014
```
6647fb2c

Fix for 5.5 Parameter Passing --> Stage C: · f80f9513

Stepan Dyatkovskiy authored Apr 22, 2013

 -- C.4 and C.5 statements, when NSAA is not equal to SP.
 -- C.1.cp statement for VA functions. Note: There are no VFP CPRCs in a
    variadic procedure.

Before this patch, "NSAA != 0" meant "don't use GPRs anymore". But there are
some exceptions in AAPCS.
1. For non-VA functions: allocate all VFP regs for CPRCs. When all VFPs are allocated,
   CPRCs are sent to the stack, while non-CPRCs may still be allocated in GPRs.
2. Check that for VA functions all params use GPRs and then the stack.
   No exceptions, no CPRCs here.

llvm-svn: 180011

f80f9513
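
The two cases listed in f80f9513 can be pictured with a deliberately simplified model. The sketch below is an assumption-laden toy, not the AAPCS rules in full: every argument is assumed to occupy exactly one register or one stack slot, eight VFP registers (`d0`-`d7`) are assumed available, and `ArgKind` and `assignArgs` are hypothetical names.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Cprc: co-processor register candidate (a VFP candidate in this model).
enum class ArgKind { Integer, Cprc };

// Toy allocator: returns the location chosen for each argument.
inline std::vector<std::string> assignArgs(const std::vector<ArgKind> &args,
                                           bool isVariadic) {
  const unsigned numGprs = 4;  // r0-r3
  const unsigned numVfp = 8;   // assumed: d0-d7, one register per CPRC
  unsigned nextGpr = 0, nextVfp = 0;
  std::vector<std::string> locations;
  for (ArgKind kind : args) {
    // In a variadic function there are no CPRCs: everything is GPRs, then stack.
    const bool wantsVfp = kind == ArgKind::Cprc && !isVariadic;
    if (wantsVfp) {
      if (nextVfp < numVfp)
        locations.push_back("d" + std::to_string(nextVfp++));
      else
        locations.push_back("stack");  // VFP exhausted; GPRs stay usable
    } else if (nextGpr < numGprs) {
      locations.push_back("r" + std::to_string(nextGpr++));
    } else {
      locations.push_back("stack");
    }
  }
  return locations;
}

// Nine CPRCs followed by one integer argument.
inline void checkAssignExamples() {
  std::vector<ArgKind> args(9, ArgKind::Cprc);
  args.push_back(ArgKind::Integer);
  const std::vector<std::string> fixed = assignArgs(args, false);
  assert(fixed[8] == "stack");  // ninth CPRC spills to the stack
  assert(fixed[9] == "r0");     // the integer still gets a GPR
}
```

The point of the example is the last assertion: a CPRC that has already gone to the stack does not prevent a later non-CPRC from being allocated to a GPR.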

Legalize vector truncates by parts rather than just splitting. · 563983c8

Jim Grosbach authored Apr 21, 2013

Rather than just splitting the input type and hoping for the best, apply
a bit more cleverness. Just splitting the types until the source is
legal often leads to an illegal result type, which is then widened and a
scalarization step is introduced which leads to truly horrible code
generation. With the loop vectorizer, these sorts of operations are much
more common, and so it's worth extra effort to do them well.

Add a legalization hook for the operands of a TRUNCATE node, which will
be encountered after the result type has been legalized, but if the
operand type is still illegal. If simple splitting of both types
ends up with the result type of each half still being legal, just
do that (v16i16 -> v16i8 on ARM, for example). If, however, that would
result in an illegal result type (v8i32 -> v8i8 on ARM, for example),
we can get more clever with power-of-two vectors. Specifically,
split the input type, but also widen the result element size, then
concatenate the halves and truncate again. For example, on ARM,
to perform a "%res = v8i8 trunc v8i32 %in" we transform to:
  %inlo = v4i32 extract_subvector %in, 0
  %inhi = v4i32 extract_subvector %in, 4
  %lo16 = v4i16 trunc v4i32 %inlo
  %hi16 = v4i16 trunc v4i32 %inhi
  %in16 = v8i16 concat_vectors v4i16 %lo16, v4i16 %hi16
  %res = v8i8 trunc v8i16 %in16

This allows instruction selection to generate three VMOVN instructions
instead of a sequence of moves, stores and loads.

Update the ARMTargetTransformInfo to take this improved legalization
into account.

Consider the simplified IR:

define <16 x i8> @test1(<16 x i32>* %ap) {
  %a = load <16 x i32>* %ap
  %tmp = trunc <16 x i32> %a to <16 x i8>
  ret <16 x i8> %tmp
}

define <8 x i8> @test2(<8 x i32>* %ap) {
  %a = load <8 x i32>* %ap
  %tmp = trunc <8 x i32> %a to <8 x i8>
  ret <8 x i8> %tmp
}

Previously, we would generate the truly hideous:
	.syntax unified
	.section	__TEXT,__text,regular,pure_instructions
	.globl	_test1
	.align	2
_test1:                                 @ @test1
@ BB#0:
	push	{r7}
	mov	r7, sp
	sub	sp, sp, #20
	bic	sp, sp, #7
	add	r1, r0, #48
	add	r2, r0, #32
	vld1.64	{d24, d25}, [r0:128]
	vld1.64	{d16, d17}, [r1:128]
	vld1.64	{d18, d19}, [r2:128]
	add	r1, r0, #16
	vmovn.i32	d22, q8
	vld1.64	{d16, d17}, [r1:128]
	vmovn.i32	d20, q9
	vmovn.i32	d18, q12
	vmov.u16	r0, d22[3]
	strb	r0, [sp, #15]
	vmov.u16	r0, d22[2]
	strb	r0, [sp, #14]
	vmov.u16	r0, d22[1]
	strb	r0, [sp, #13]
	vmov.u16	r0, d22[0]
	vmovn.i32	d16, q8
	strb	r0, [sp, #12]
	vmov.u16	r0, d20[3]
	strb	r0, [sp, #11]
	vmov.u16	r0, d20[2]
	strb	r0, [sp, #10]
	vmov.u16	r0, d20[1]
	strb	r0, [sp, #9]
	vmov.u16	r0, d20[0]
	strb	r0, [sp, #8]
	vmov.u16	r0, d18[3]
	strb	r0, [sp, #3]
	vmov.u16	r0, d18[2]
	strb	r0, [sp, #2]
	vmov.u16	r0, d18[1]
	strb	r0, [sp, #1]
	vmov.u16	r0, d18[0]
	strb	r0, [sp]
	vmov.u16	r0, d16[3]
	strb	r0, [sp, #7]
	vmov.u16	r0, d16[2]
	strb	r0, [sp, #6]
	vmov.u16	r0, d16[1]
	strb	r0, [sp, #5]
	vmov.u16	r0, d16[0]
	strb	r0, [sp, #4]
	vldmia	sp, {d16, d17}
	vmov	r0, r1, d16
	vmov	r2, r3, d17
	mov	sp, r7
	pop	{r7}
	bx	lr

	.globl	_test2
	.align	2
_test2:                                 @ @test2
@ BB#0:
	push	{r7}
	mov	r7, sp
	sub	sp, sp, #12
	bic	sp, sp, #7
	vld1.64	{d16, d17}, [r0:128]
	add	r0, r0, #16
	vld1.64	{d20, d21}, [r0:128]
	vmovn.i32	d18, q8
	vmov.u16	r0, d18[3]
	vmovn.i32	d16, q10
	strb	r0, [sp, #3]
	vmov.u16	r0, d18[2]
	strb	r0, [sp, #2]
	vmov.u16	r0, d18[1]
	strb	r0, [sp, #1]
	vmov.u16	r0, d18[0]
	strb	r0, [sp]
	vmov.u16	r0, d16[3]
	strb	r0, [sp, #7]
	vmov.u16	r0, d16[2]
	strb	r0, [sp, #6]
	vmov.u16	r0, d16[1]
	strb	r0, [sp, #5]
	vmov.u16	r0, d16[0]
	strb	r0, [sp, #4]
	ldm	sp, {r0, r1}
	mov	sp, r7
	pop	{r7}
	bx	lr

Now, however, we generate the much more straightforward:
	.syntax unified
	.section	__TEXT,__text,regular,pure_instructions
	.globl	_test1
	.align	2
_test1:                                 @ @test1
@ BB#0:
	add	r1, r0, #48
	add	r2, r0, #32
	vld1.64	{d20, d21}, [r0:128]
	vld1.64	{d16, d17}, [r1:128]
	add	r1, r0, #16
	vld1.64	{d18, d19}, [r2:128]
	vld1.64	{d22, d23}, [r1:128]
	vmovn.i32	d17, q8
	vmovn.i32	d16, q9
	vmovn.i32	d18, q10
	vmovn.i32	d19, q11
	vmovn.i16	d17, q8
	vmovn.i16	d16, q9
	vmov	r0, r1, d16
	vmov	r2, r3, d17
	bx	lr

	.globl	_test2
	.align	2
_test2:                                 @ @test2
@ BB#0:
	vld1.64	{d16, d17}, [r0:128]
	add	r0, r0, #16
	vld1.64	{d18, d19}, [r0:128]
	vmovn.i32	d16, q8
	vmovn.i32	d17, q9
	vmovn.i16	d16, q8
	vmov	r0, r1, d16
	bx	lr

llvm-svn: 179989

563983c8
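
The by-parts transformation described in 563983c8 can be checked on plain arrays. The sketch below mirrors the six-step v8i32 -> v8i8 example from the commit message using `std::array` in place of vector registers; `truncDirect` and `truncByParts` are hypothetical names, and nothing here is LLVM code.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

using V8i32 = std::array<std::uint32_t, 8>;
using V8i8 = std::array<std::uint8_t, 8>;

// Reference: truncate each lane directly, keeping the low 8 bits.
inline V8i8 truncDirect(const V8i32 &in) {
  V8i8 res{};
  for (std::size_t i = 0; i < 8; ++i)
    res[i] = static_cast<std::uint8_t>(in[i]);
  return res;
}

// By parts: split the input, truncate each half to 16-bit lanes,
// concatenate the halves, then truncate once more to 8-bit lanes.
inline V8i8 truncByParts(const V8i32 &in) {
  std::array<std::uint32_t, 4> inlo{}, inhi{};
  for (std::size_t i = 0; i < 4; ++i) {
    inlo[i] = in[i];      // extract_subvector %in, 0
    inhi[i] = in[i + 4];  // extract_subvector %in, 4
  }
  std::array<std::uint16_t, 8> in16{};
  for (std::size_t i = 0; i < 4; ++i) {
    in16[i] = static_cast<std::uint16_t>(inlo[i]);      // %lo16
    in16[i + 4] = static_cast<std::uint16_t>(inhi[i]);  // %hi16, concatenated
  }
  V8i8 res{};
  for (std::size_t i = 0; i < 8; ++i)
    res[i] = static_cast<std::uint8_t>(in16[i]);  // final v8i16 -> v8i8
  return res;
}

// Both routes must agree lane for lane.
inline void checkTruncByParts() {
  const V8i32 in = {{0x12345678u, 0xFFFFFFFFu, 0u, 1u,
                     0x100u, 0x1FFu, 0xABCDEF01u, 0x80u}};
  assert(truncByParts(in) == truncDirect(in));
}
```

Truncation only discards high bits, so doing it in two steps with an intermediate 16-bit lane width gives the same result as a single step; that is why the legalizer is free to pick the intermediate type that the target supports.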

Apr 21, 2013

Passing arguments to varargs functions under the SPARC v9 ABI. · 84ebe25d
Jakob Stoklund Olesen authored Apr 21, 2013
```
Arguments after the fixed arguments never use the floating point
registers.

llvm-svn: 179987
```
84ebe25d

Fix the SETHIimm pattern for 64-bit code. · 65d32872
Jakob Stoklund Olesen authored Apr 21, 2013
```
Don't ignore the high 32 bits of the immediate.

llvm-svn: 179985
```
65d32872
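
To see why ignoring the high 32 bits matters in 65d32872, compare two ways of testing whether a 64-bit constant can be produced by a single `sethi`. This is a sketch under assumptions: `sethi` is taken to write a 22-bit immediate shifted left by 10 and to leave all other bits zero, and `isSethiImm32Only` and `isSethiImm64` are hypothetical predicates, not the ones in the SPARC backend.

```cpp
#include <cassert>
#include <cstdint>

// Truncating check: only looks at the low 32 bits, so any garbage in the
// high 32 bits of the immediate goes unnoticed.
constexpr bool isSethiImm32Only(std::uint64_t v) {
  return ((static_cast<std::uint32_t>(v) >> 10) << 10) ==
         static_cast<std::uint32_t>(v);
}

// 64-bit-aware check: only bits 10..31 may be set.
constexpr bool isSethiImm64(std::uint64_t v) {
  return (v & ~UINT64_C(0xFFFFFC00)) == 0;
}

// Small self-check contrasting the two predicates.
inline void checkSethiExamples() {
  assert(isSethiImm64(UINT64_C(0x12345400)));       // fits in bits 10..31
  assert(isSethiImm32Only(UINT64_C(0x100000400)));  // wrongly accepted
  assert(!isSethiImm64(UINT64_C(0x100000400)));     // bit 32 is set
}
```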

ARM: Use ldrd/strd to spill 64-bit pairs when available. · 798697d6

Tim Northover authored Apr 21, 2013

This allows common sp-offsets to be part of the instruction and is
probably faster on modern CPUs too.

llvm-svn: 179977

798697d6

Compile varargs functions for SPARCv9. · a41f91ea

Jakob Stoklund Olesen authored Apr 20, 2013

With a little help from the frontend, it looks like the standard va_*
intrinsics can do the job.

Also clean up an old bitcast hack in LowerVAARG that dealt with
unaligned double loads. Load SDNodes can specify an alignment now.

Still missing: Calling varargs functions with float arguments.

llvm-svn: 179961

a41f91ea

Apr 20, 2013

ARM: don't add FrameIndex offset for LDMIA (has no immediate) · d9d4211f

Tim Northover authored Apr 20, 2013

Previously, when spilling 64-bit paired registers, an LDMIA with both
a FrameIndex and an offset was produced. This kind of instruction
shouldn't exist, and the extra operand was being confused with the
predicate, causing aborts later on.

This removes the invalid 0-offset from the instruction being
produced.

llvm-svn: 179956

d9d4211f

AArch64: remove useless comment · 56862bd6
Tim Northover authored Apr 20, 2013
```
llvm-svn: 179952
```
56862bd6

Remove unused ShouldFoldAtomicFences flag. · 16aba170

Tim Northover authored Apr 20, 2013

I think it's almost impossible to fold atomic fences profitably under
LLVM/C++11 semantics. As a result, this is now unused and just
cluttering up the target interface.

llvm-svn: 179940

16aba170

Remove unused MEMBARRIER DAG node; it's been replaced by ATOMIC_FENCE. · a2b53390
Tim Northover authored Apr 20, 2013
```
llvm-svn: 179939
```
a2b53390

Move PPC getSwappedPredicate for reuse · 0f64e21b

Hal Finkel authored Apr 20, 2013

The getSwappedPredicate function can be used in other places (such as in
improvements to the PPCCTRLoops pass). Instead of trapping it as a static
function in PPCInstrInfo, move it into PPCPredicates with other
predicate-related things.

No functionality change intended.

llvm-svn: 179926

0f64e21b
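
For readers unfamiliar with the idea behind 0f64e21b, a swapped predicate is the condition that holds when the two compare operands are exchanged. The sketch below uses a hypothetical `Pred` enum and `swapped` function; the real PPC predicate enumeration and its values are not reproduced here.

```cpp
#include <cassert>

// Hypothetical predicate set: ordered comparisons plus unordered (UN)
// and not-unordered (NU).
enum class Pred { LT, LE, EQ, GE, GT, NE, UN, NU };

// Exchanging the operands of "a < b" gives "b > a", and so on.
// Symmetric predicates (EQ, NE, UN, NU) are unchanged.
constexpr Pred swapped(Pred p) {
  return p == Pred::LT   ? Pred::GT
         : p == Pred::GT ? Pred::LT
         : p == Pred::LE ? Pred::GE
         : p == Pred::GE ? Pred::LE
                         : p;
}

// Small self-check: swapping twice returns the original predicate.
inline void checkSwappedExamples() {
  assert(swapped(Pred::LT) == Pred::GT);
  assert(swapped(Pred::EQ) == Pred::EQ);
  assert(swapped(swapped(Pred::LE)) == Pred::LE);
}
```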

Add CodeGen support for functions that always return arguments via a new... · b8bd232a

Stephen Lin authored Apr 20, 2013

Add CodeGen support for functions that always return arguments via a new parameter attribute 'returned', which is taken advantage of in target-independent tail call opportunity detection and in ARM call lowering (when placed on an integral first parameter).

llvm-svn: 179925

b8bd232a

Test commit · d36fd2cf
Stephen Lin authored Apr 20, 2013
```
llvm-svn: 179913
```
d36fd2cf

[mips] Instruction selection patterns for DSP-ASE vector shifts. · 1ebb2a1c
Akira Hatanaka authored Apr 19, 2013
```
llvm-svn: 179906
```
1ebb2a1c

Move TryToFoldFastISelLoad to FastISel, where it belongs. In general, I'm · 90dd3e7d

Eli Bendersky authored Apr 19, 2013

trying to move as much FastISel logic as possible out of the main path in
SelectionDAGISel - intermixing them just adds confusion.

llvm-svn: 179902

90dd3e7d