Commits · 7068a93caea194c950a319b125509b84473722f5 · Roger Ferrer / llvm-epi-0.8

May 23, 2006

Better way to check for vararg. · 7068a93c
Evan Cheng authored May 23, 2006
```
llvm-svn: 28440
```
7068a93c
Remove PreprocessCCCArguments and PreprocessFastCCArguments now that · 17e734f0
Evan Cheng authored May 23, 2006
```
FORMAL_ARGUMENTS nodes include a token operand.

llvm-svn: 28439
```
17e734f0

Implement an annoying part of the Darwin/X86 abi: the callee of a struct · 8be5be81

Chris Lattner authored May 23, 2006

return argument pops the hidden struct pointer if present, not the caller.

For example, in this testcase:

struct X { int D, E, F, G; };
struct X bar() {
  struct X a;
  a.D = 0;
  a.E = 1;
  a.F = 2;
  a.G = 3;
  return a;
}
void foo(struct X *P) {
  *P = bar();
}

We used to emit:

_foo:
        subl $28, %esp
        movl 32(%esp), %eax
        movl %eax, (%esp)
        call _bar
        addl $28, %esp
        ret
_bar:
        movl 4(%esp), %eax
        movl $0, (%eax)
        movl $1, 4(%eax)
        movl $2, 8(%eax)
        movl $3, 12(%eax)
        ret

This is correct on Linux/X86 but not Darwin/X86.  With this patch, we now
emit:

_foo:
        subl $28, %esp
        movl 32(%esp), %eax
        movl %eax, (%esp)
        call _bar
***     addl $24, %esp
        ret
_bar:
        movl 4(%esp), %eax
        movl $0, (%eax)
        movl $1, 4(%eax)
        movl $2, 8(%eax)
        movl $3, 12(%eax)
***     ret $4

For the record, GCC emits (which is functionally equivalent to our new code):

_bar:
        movl    4(%esp), %eax
        movl    $3, 12(%eax)
        movl    $2, 8(%eax)
        movl    $1, 4(%eax)
        movl    $0, (%eax)
        ret     $4
_foo:
        pushl   %esi
        subl    $40, %esp
        movl    48(%esp), %esi
        leal    16(%esp), %eax
        movl    %eax, (%esp)
        call    _bar
        subl    $4, %esp
        movl    16(%esp), %eax
        movl    %eax, (%esi)
        movl    20(%esp), %eax
        movl    %eax, 4(%esi)
        movl    24(%esp), %eax
        movl    %eax, 8(%esi)
        movl    28(%esp), %eax
        movl    %eax, 12(%esi)
        addl    $40, %esp
        popl    %esi
        ret

This fixes SingleSource/Benchmarks/CoyoteBench/fftbench with LLC and the
JIT, and fixes the X86-backend portion of PR729.  The CBE still needs to
be updated.

llvm-svn: 28438

8be5be81

May 19, 2006
- CSRet allows varargs · 01dd6df5
  Chris Lattner authored May 19, 2006
```
llvm-svn: 28409
```
  01dd6df5
May 17, 2006
- Should pass by reference. · 8c6b234c
  Evan Cheng authored May 17, 2006
```
llvm-svn: 28357
```
  8c6b234c
May 16, 2006
- Implement the custom lowering hook right, returning values for all of the · c7df70db
  Chris Lattner authored May 16, 2006
```
arguments at once.

llvm-svn: 28327
```
  c7df70db
- Fix a bug I introduced yesterday, which broke functions with *no* arguments. · 7b8b8bbb
  Chris Lattner authored May 16, 2006
```
llvm-svn: 28326
```
  7b8b8bbb
- X86 integer register classes naming changes. Make them consistent with FP, vector classes. · 9fee442e
  Evan Cheng authored May 16, 2006
```
llvm-svn: 28324
```
  9fee442e
- Add a chain to FORMAL_ARGUMENTS. This is a minimal port of the X86 backend, · 3d826996
  Chris Lattner authored May 16, 2006
```
it doesn't currently use/maintain the chain properly.  Also, make the
X86ISelLowering.cpp file 80-col clean.

llvm-svn: 28320
```
  3d826996
May 12, 2006
- Dead variable · 22f95b74
  Chris Lattner authored May 12, 2006
```
llvm-svn: 28265
```
  22f95b74
May 06, 2006
- Teach the X86 backend about non-i32 inline asm register classes. · 6d4a2dc4
  Chris Lattner authored May 06, 2006
```
llvm-svn: 28139
```
  6d4a2dc4
May 05, 2006
- Teach the code generator to use cvtss2sd as extload f32 -> f64 · 44a73e9f
  Chris Lattner authored May 05, 2006
```
llvm-svn: 28131
```
  44a73e9f
May 03, 2006

Refactor TargetMachine, pushing handling of TargetData into the... · 20a631fd

Owen Anderson authored May 03, 2006

Refactor TargetMachine, pushing handling of TargetData into the target-specific subclasses. This has one caller-visible change: getTargetData() now returns a pointer instead of a reference.

This fixes PR 759.

llvm-svn: 28074

20a631fd

Apr 28, 2006
- Initial caller side support (for CCC only, not FastCC) of 128-bit vector · 88decded
  Evan Cheng authored Apr 28, 2006
```
passing by value.

llvm-svn: 28015
```
  88decded
- Implement four-wide shuffle with 2 shufps if no more than two elements come · 3cd4362a
  Evan Cheng authored Apr 28, 2006
```
from each vector. e.g.
        shuffle(G1, G2, 7, 1, 5, 2)
==>
        movaps _G2, %xmm0
        shufps $151, _G1, %xmm0
        shufps $216, %xmm0, %xmm0

llvm-svn: 28011
```
  3cd4362a
- TargetLowering::LowerArguments should return a VBIT_CONVERT of · d43c5c60
  Evan Cheng authored Apr 28, 2006
```
FORMAL_ARGUMENTS SDOperand in the return result vector.

llvm-svn: 28009
```
  d43c5c60
Apr 27, 2006
- Make x86 isel lowering produce tailcall nodes. They are match to normal calls · f4f3f0d2
  Evan Cheng authored Apr 27, 2006
```
for now.

Patch contributed by Alexander Friedman.

llvm-svn: 27994
```
  f4f3f0d2
- Support for passing 128-bit vector arguments via XMM registers. · 89001ad7
  Evan Cheng authored Apr 27, 2006
```
llvm-svn: 27992
```
  89001ad7
- Oops · a0374e1b
  Evan Cheng authored Apr 27, 2006
```
llvm-svn: 27989
```
  a0374e1b
- Bug fix: not updating NumIntRegs. · 24eb3f47
  Evan Cheng authored Apr 27, 2006
```
llvm-svn: 27988
```
  24eb3f47
- - Clean up formal argument lowering code. Prepare for vector pass by value work. · 48940d16
  Evan Cheng authored Apr 27, 2006
```
- Fixed vararg support.

llvm-svn: 27985
```
  48940d16
Apr 26, 2006
- Fix fastcc failures. · 1c399032
  Evan Cheng authored Apr 26, 2006
```
llvm-svn: 27980
```
  1c399032
- Switching over FORMAL_ARGUMENTS mechanism to lower call arguments. · e0bcfbe8
  Evan Cheng authored Apr 26, 2006
```
llvm-svn: 27975
```
  e0bcfbe8
Apr 25, 2006
- Separate LowerOperation() into multiple functions, one per opcode. · a9467aab
  Evan Cheng authored Apr 25, 2006
```
llvm-svn: 27972
```
  a9467aab
- Special case handling two wide build_vector(0, x). · 5c2bfb06
  Evan Cheng authored Apr 24, 2006
```
llvm-svn: 27961
```
  5c2bfb06
Apr 24, 2006
- A little bit more build_vector enhancement for v8i16 cases. · b0461080
  Evan Cheng authored Apr 24, 2006
```
llvm-svn: 27959
```
  b0461080
Apr 23, 2006
- MOVL shuffle (i.e. movd or movss / movsd from memory) of undef, V2 == V2 · b4f31dd1
  Evan Cheng authored Apr 23, 2006
```
llvm-svn: 27953
```
  b4f31dd1
Apr 22, 2006

JumpTable support! What this represents is working asm and jit support for · 4ca2ea5b

Nate Begeman authored Apr 22, 2006

x86 and ppc for 100% dense switch statements when relocations are non-PIC.
This support will be extended and enhanced in the coming days to support
PIC, and less dense forms of jump tables.

llvm-svn: 27947

4ca2ea5b

Don't do all the lowering stuff for 2-wide build_vector's. Also, minor... · e728efdf

Evan Cheng authored Apr 22, 2006

Don't do all the lowering stuff for 2-wide build_vector's. Also, minor optimization for shuffle of undef.

llvm-svn: 27946

e728efdf

Fix a performance regression. Use {p}shuf* when there are only two distinct... · 16ef94f4
Evan Cheng authored Apr 22, 2006
```
Fix a performance regression. Use {p}shuf* when there are only two distinct elements in a build_vector.

llvm-svn: 27945
```
16ef94f4

Revamp build_vector lowering to take advantage of movss and movd instructions. · 14215c36

Evan Cheng authored Apr 21, 2006

movd always clear the top 96 bits and movss does so when it's loading the
value from memory.
The net result is codegen for 4-wide shuffles is much improved. It is near
optimal if one or more elements is a zero. e.g.

__m128i test(int a, int b) {
  return _mm_set_epi32(0, 0, b, a);
}

compiles to

_test:
	movd 8(%esp), %xmm1
	movd 4(%esp), %xmm0
	punpckldq %xmm1, %xmm0
	ret

compare to gcc:

_test:
	subl	$12, %esp
	movd	20(%esp), %xmm0
	movd	16(%esp), %xmm1
	punpckldq	%xmm0, %xmm1
	movq	%xmm1, %xmm0
	movhps	LC0, %xmm0
	addl	$12, %esp
	ret

or icc:

_test:
        movd      4(%esp), %xmm0                                #5.10
        movd      8(%esp), %xmm3                                #5.10
        xorl      %eax, %eax                                    #5.10
        movd      %eax, %xmm1                                   #5.10
        punpckldq %xmm1, %xmm0                                  #5.10
        movd      %eax, %xmm2                                   #5.10
        punpckldq %xmm2, %xmm3                                  #5.10
        punpckldq %xmm3, %xmm0                                  #5.10
        ret                                                     #5.10

There are still room for improvement, for example the FP variant of the above example:

__m128 test(float a, float b) {
  return _mm_set_ps(0.0, 0.0, b, a);
}

_test:
	movss 8(%esp), %xmm1
	movss 4(%esp), %xmm0
	unpcklps %xmm1, %xmm0
	xorps %xmm1, %xmm1
	movlhps %xmm1, %xmm0
	ret

The xorps and movlhps are unnecessary. This will require post legalizer optimization to handle.

llvm-svn: 27939

14215c36

Apr 21, 2006

Now generating perfect (I think) code for "vector set" with a single non-zero · e8b51800

Evan Cheng authored Apr 21, 2006

scalar value.

e.g.
        _mm_set_epi32(0, a, 0, 0);
==>
	movd 4(%esp), %xmm0
	pshufd $69, %xmm0, %xmm0

        _mm_set_epi8(0, 0, 0, 0, 0, a, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0);
==>
	movzbw 4(%esp), %ax
	movzwl %ax, %eax
	pxor %xmm0, %xmm0
	pinsrw $5, %eax, %xmm0

llvm-svn: 27923

e8b51800

Apr 20, 2006
- - Added support to turn "vector clear elements", e.g. pand V, <-1, -1, 0, -1> · 60f0b899
  Evan Cheng authored Apr 20, 2006
```
to a vector shuffle.
- VECTOR_SHUFFLE lowering change in preparation for more efficient codegen
of vector shuffle with zero (or any splat) vector.

llvm-svn: 27875
```
  60f0b899
- Handle v2i64 BUILD_VECTOR custom lowering correctly. v2i64 is a legal type, · 15c264b7
  Evan Cheng authored Apr 20, 2006
```
but i64 is not. If possible, change a i64 op to a f64 (e.g. load, constant)
and then cast it back.

llvm-svn: 27849
```
  15c264b7
- isSplatMask() bug: first element can be an undef. · 4a1b0d32
  Evan Cheng authored Apr 19, 2006
```
llvm-svn: 27847
```
  4a1b0d32
- - Added support to do aribitrary 4 wide shuffle with no more than three · a3caaee5
  Evan Cheng authored Apr 19, 2006
```
  instructions.
- Fixed a commute vector_shuff bug.

llvm-svn: 27845
```
  a3caaee5
Apr 19, 2006
- Commute vector_shuffle to match more movlhps, movlp{s|d} cases. · 7855e4d0
  Evan Cheng authored Apr 19, 2006
```
llvm-svn: 27840
```
  7855e4d0
Apr 18, 2006
- Use movss to insert_vector_elt(v, s, 0). · 5421206c
  Evan Cheng authored Apr 17, 2006
```
llvm-svn: 27782
```
  5421206c
- Use two pinsrw to insert an element into v4i32 / v4f32 vector. · 6e5e2058
  Evan Cheng authored Apr 17, 2006
```
llvm-svn: 27779
```
  6e5e2058
Apr 17, 2006
- Implement v8i16, v16i8 splat using unpckl + pshufd. · 5022b342
  Evan Cheng authored Apr 17, 2006
```
llvm-svn: 27768
```
  5022b342