- Dec 31, 2007
Chris Lattner authored
e.g. MO.isMBB() instead of MO.isMachineBasicBlock(). I don't plan on switching everything over, so new clients should just start using the shorter names. Remove the old long accessors, switching everything over to use the short accessors: getMachineBasicBlock() -> getMBB(), getConstantPoolIndex() -> getIndex(), setMachineBasicBlock() -> setMBB(), etc. llvm-svn: 45464
- Dec 29, 2007
Chris Lattner authored
llvm-svn: 45418
Chris Lattner authored
as:

  _bar:
          pushl %esi
          subl $8, %esp
          movl 16(%esp), %esi
          call L_foo$stub
          fstps (%esi)
          addl $8, %esp
          popl %esi
          #FP_REG_KILL
          ret

instead of:

  _bar:
          pushl %esi
          subl $8, %esp
          movl 16(%esp), %esi
          call L_foo$stub
          fstpl (%esi)
          cvtsd2ss (%esi), %xmm0
          movss %xmm0, (%esi)
          addl $8, %esp
          popl %esi
          #FP_REG_KILL
          ret

llvm-svn: 45401
Chris Lattner authored
if we are just going to store it back anyway. This improves things like:

  double foo();
  void bar(double *P) { *P = foo(); }

llvm-svn: 45399
- Dec 16, 2007
Chris Lattner authored
llvm-svn: 45075
- Dec 15, 2007
Evan Cheng authored
llvm-svn: 45058
- Dec 14, 2007
Evan Cheng authored
Fix ctlz and cttz. The llvm definition requires them to return the number of bits of the src type when the value is zero. llvm-svn: 45029
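A reference model of the semantics this fix enforces, written as plain C rather than LLVM code (the helper names are invented for the illustration): on a zero input, ctlz and cttz return the bit width of the source type instead of an unspecified value.

    #include <stdint.h>

    /* Reference model of i32 ctlz: zero input yields the bit width, 32. */
    static unsigned ctlz32(uint32_t x) {
      if (x == 0) return 32;
      unsigned n = 0;
      while (!(x & 0x80000000u)) { x <<= 1; ++n; }
      return n;
    }

    /* Reference model of i32 cttz: zero input likewise yields 32. */
    static unsigned cttz32(uint32_t x) {
      if (x == 0) return 32;
      unsigned n = 0;
      while (!(x & 1u)) { x >>= 1; ++n; }
      return n;
    }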
Evan Cheng authored
llvm-svn: 45024
- Dec 12, 2007
Dan Gohman authored
Allow vector integer constants to be created with SelectionDAG::getConstant, in the same way as vector floating-point constants. This allows the legalize expansion code for @llvm.ctpop and friends to be usable with vector types. llvm-svn: 44954
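For reference, the legalize expansion of @llvm.ctpop is based on the standard shift-and-mask popcount; the scalar C sketch below shows the same arithmetic, which this change lets the expander apply element-wise once the mask constants can be built as vector constants (the function name is made up for the example).

    #include <stdint.h>

    /* Scalar sketch of the shift-and-mask popcount used by the ctpop expansion. */
    static uint32_t popcount32(uint32_t v) {
      v = v - ((v >> 1) & 0x55555555u);                  /* sum bit pairs   */
      v = (v & 0x33333333u) + ((v >> 2) & 0x33333333u);  /* sum nibbles     */
      v = (v + (v >> 4)) & 0x0F0F0F0Fu;                  /* sum bytes       */
      return (v * 0x01010101u) >> 24;                    /* add up the bytes */
    }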
Evan Cheng authored
llvm-svn: 44929
Evan Cheng authored
Lower a build_vector with all constants into a constpool load unless it can be done with a move to low part. llvm-svn: 44921
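A small intrinsics illustration of the distinction drawn here (function names invented, and the exact instructions a given compiler emits may vary): a constant vector whose upper elements are zero can be materialized with a move to the low part, while an arbitrary all-constant vector is loaded from the constant pool.

    #include <emmintrin.h>

    /* {42, 0, 0, 0}: only the low element is nonzero, so a move to the
       low part (e.g. movd of an immediate-loaded GPR) suffices. */
    __m128i low_part_const(void) { return _mm_set_epi32(0, 0, 0, 42); }

    /* {1, 2, 3, 4}: every element is a distinct constant, so this gets
       lowered as a load from the constant pool. */
    __m128i pooled_const(void)   { return _mm_set_epi32(4, 3, 2, 1); }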
- Dec 11, 2007
Evan Cheng authored
possible before resorting to pextrw and pinsrw.
- Better codegen for v4i32 shuffles masquerading as v8i16 or v16i8 shuffles.
- Improves (i16 extract_vector_element 0) codegen by recognizing (i32 extract_vector_element 0) does not require a pextrw.
llvm-svn: 44836
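As an illustration of the extract_vector_element point (the function name here is made up): pulling lane 0 out as an i16 can go through a 32-bit movd plus a truncate, with no pextrw needed.

    #include <emmintrin.h>

    /* Extract lane 0 of a v8i16 value as a 16-bit integer: move the low
       32 bits to a GPR (_mm_cvtsi128_si32) and truncate; pextrw is not
       required for element 0. */
    static short extract_lane0(__m128i v) {
      return (short)_mm_cvtsi128_si32(v);
    }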
Nate Begeman authored
llvm-svn: 44835
- Dec 07, 2007
Evan Cheng authored
llvm-svn: 44686
Evan Cheng authored
llvm-svn: 44676
- Dec 06, 2007
Evan Cheng authored
Remove a bogus optimization. It's not possible to do a move to the low element of an <8 x i16> or <16 x i8> vector. llvm-svn: 44669
- Nov 27, 2007
Duncan Sands authored
Parameter attributes are no longer part of the function type; instead they belong to functions and function calls. This is an updated and slightly corrected version of Reid Spencer's original patch. The only known problem is that auto-upgrading of bitcode files doesn't seem to work properly (see test/Bitcode/AutoUpgradeIntrinsics.ll). Hopefully a bitcode guru (who might that be? :) ) will fix it. llvm-svn: 44359
- Nov 25, 2007
Chris Lattner authored
The X86 backend would sometimes emit "zero" and "all one" vectors multiple times, for example:

  _test2:
          pcmpeqd %mm0, %mm0
          movq %mm0, _M1
          pcmpeqd %mm0, %mm0
          movq %mm0, _M2
          ret

instead of:

  _test2:
          pcmpeqd %mm0, %mm0
          movq %mm0, _M1
          movq %mm0, _M2
          ret

This patch fixes this by always arranging for zero/one vectors to be defined as v4i32 or v2i32 (SSE/MMX) instead of letting them be any random type. This ensures they get trivially CSE'd on the dag. This fix is also important for LegalizeDAGTypes, as it gets unhappy when the x86 backend wants BUILD_VECTOR(i64 0) to be legal even when 'i64' isn't legal.

This patch makes the following changes:
1) X86TargetLowering::LowerBUILD_VECTOR now lowers 0/1 vectors into their canonical types.
2) The now-dead patterns are removed from the SSE/MMX .td files.
3) All the patterns in the .td file that referred to immAllOnesV or immAllZerosV in the wrong form now use *_bc to match them with a bitcast wrapped around them.
4) X86DAGToDAGISel::SelectScalarSSELoad is generalized to handle bitcast'd zero vectors, which actually simplifies the code.
5) getShuffleVectorZeroOrUndef is updated to generate a shuffle that is legal, instead of generating one that is illegal and expecting a later legalize pass to clean it up.
6) isZeroShuffle is generalized to handle bitcast of zeros.
7) Several other minor tweaks.

This patch is definite goodness, but it has the potential to cause random code quality regressions. Please be on the lookout for these and let me know if they happen.

llvm-svn: 44310
- Nov 24, 2007
Chris Lattner authored
among others. llvm-svn: 44302
Chris Lattner authored
1) Change the interface to TargetLowering::ExpandOperationResult to take and return entire NODES that need a result expanded, not just the value. This allows us to handle things like READCYCLECOUNTER, which returns two values.
2) Implement (extremely limited) support in LegalizeDAG::ExpandOp for MERGE_VALUES.
3) Reimplement custom lowering in LegalizeDAGTypes in terms of the new ExpandOperationResult. This makes the result simpler and fully general.
4) Implement (fully general) expand support for MERGE_VALUES in LegalizeDAGTypes.
5) Implement ExpandOperationResult support for ARM f64->i64 bitconvert and ARM i64 shifts, allowing them to work with LegalizeDAGTypes.
6) Implement ExpandOperationResult support for X86 READCYCLECOUNTER and FP_TO_SINT, allowing them to work with LegalizeDAGTypes.

LegalizeDAGTypes now passes several more X86 codegen tests when enabled and when type legalization in LegalizeDAG is ifdef'd out.

llvm-svn: 44300
- Nov 16, 2007
Anton Korobeynikov authored
llvm-svn: 44183
- Nov 13, 2007
Bill Wendling authored
adjustment fields, and an optional flag. If there is a "dynamic_stackalloc" in the code, make sure that it's bracketed by CALLSEQ_START and CALLSEQ_END. If not, then there is the potential for the stack to be changed while the stack's being used by another instruction (like a call). This can only result in tears... llvm-svn: 44037
- Nov 10, 2007
Arnold Schwaighofer authored
llvm-svn: 43978
- Nov 09, 2007
Evan Cheng authored
Then:

          call "L1$pb"
  "L1$pb":
          popl %eax
          ...
  LBB1_1: # entry
          imull $4, %ecx, %ecx
          leal LJTI1_0-"L1$pb"(%eax), %edx
          addl LJTI1_0-"L1$pb"(%ecx,%eax), %edx
          jmpl *%edx

          .align 2
          .set L1_0_set_3,LBB1_3-LJTI1_0
          .set L1_0_set_2,LBB1_2-LJTI1_0
          .set L1_0_set_5,LBB1_5-LJTI1_0
          .set L1_0_set_4,LBB1_4-LJTI1_0
  LJTI1_0:
          .long L1_0_set_3
          .long L1_0_set_2

Now:

          call "L1$pb"
  "L1$pb":
          popl %eax
          ...
  LBB1_1: # entry
          addl LJTI1_0-"L1$pb"(%eax,%ecx,4), %eax
          jmpl *%eax

          .align 2
          .set L1_0_set_3,LBB1_3-"L1$pb"
          .set L1_0_set_2,LBB1_2-"L1$pb"
          .set L1_0_set_5,LBB1_5-"L1$pb"
          .set L1_0_set_4,LBB1_4-"L1$pb"
  LJTI1_0:
          .long L1_0_set_3
          .long L1_0_set_2

llvm-svn: 43924
- Nov 06, 2007
Rafael Espindola authored
Thanks for the suggestions Bill :-) llvm-svn: 43742
- Nov 04, 2007
Chris Lattner authored
regs on x86-64. llvm-svn: 43669
- Nov 02, 2007
Evan Cheng authored
llvm-svn: 43646
Evan Cheng authored
llvm-svn: 43630
- Oct 31, 2007
Rafael Espindola authored
and by restructuring the X86 version. Now I just have to move this to a common place :-) llvm-svn: 43554
Rafael Espindola authored
Now both subtargets define getMaxInlineSizeThreshold and the expansion uses it. This should not change generated code. llvm-svn: 43552
Dale Johannesen authored
llvm-svn: 43535
- Oct 30, 2007
Dale Johannesen authored
llvm-svn: 43488
- Oct 29, 2007
Evan Cheng authored
transformation. Previously, it was restricted by ensuring that the number of load uses is one. Now the restriction is loosened by allowing setcc uses to be "extended" (e.g. setcc x, c, eq -> setcc sext(x), sext(c), eq). llvm-svn: 43465
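In C terms (an illustration only, not code from the patch), the equivalence being exploited is that sign-extending both sides of an equality compare preserves the result, so the extended form of the compare can fold the extending load.

    #include <stdint.h>

    /* Comparing the narrow load directly ... */
    int narrow_eq(const int8_t *p) { return *p == -1; }

    /* ... is equivalent to comparing the sign-extended load against the
       sign-extended constant: setcc x, c, eq -> setcc sext(x), sext(c), eq. */
    int wide_eq(const int8_t *p)   { return (int32_t)*p == (int32_t)(int8_t)-1; }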
Evan Cheng authored
llvm-svn: 43446
- Oct 26, 2007
Evan Cheng authored
Loosen up iv reuse to allow reuse of the same stride but a larger type when truncating from the larger type to the smaller type is free. For example, this turns the loop:

  LBB1_1: # entry.bb_crit_edge
          xorl %ecx, %ecx
          xorw %dx, %dx
          movw %dx, %si
  LBB1_2: # bb
          movl L_X$non_lazy_ptr, %edi
          movw %si, (%edi)
          movl L_Y$non_lazy_ptr, %edi
          movw %dx, (%edi)
          addw $4, %dx
          incw %si
          incl %ecx
          cmpl %eax, %ecx
          jne LBB1_2 # bb

into:

  LBB1_1: # entry.bb_crit_edge
          xorl %ecx, %ecx
          xorw %dx, %dx
  LBB1_2: # bb
          movl L_X$non_lazy_ptr, %esi
          movw %cx, (%esi)
          movl L_Y$non_lazy_ptr, %esi
          movw %dx, (%esi)
          addw $4, %dx
          incl %ecx
          cmpl %eax, %ecx
          jne LBB1_2 # bb

llvm-svn: 43375
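A hypothetical C source that produces a loop of this shape (the globals X, Y and the function name are made up): the 16-bit stride-1 counter can be replaced by the 32-bit trip counter because truncating i32 to i16 is free on x86.

    extern short X, Y;

    /* Two 16-bit stores per iteration, with strides 1 and 4. After the
       change, the store to X reuses the truncated 32-bit loop counter,
       so the separate 16-bit stride-1 induction variable goes away. */
    void loop(int n) {
      for (int i = 0; i != n; ++i) {
        X = (short)i;
        Y = (short)(i * 4);
      }
    }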
- Oct 21, 2007
Dale Johannesen authored
Fixes 5550319. llvm-svn: 43205
- Oct 19, 2007
Rafael Espindola authored
To do this it is necessary to add an "always inline" argument to the memcpy node. For completeness I have also added this to memmove and memset. I have also added getMem* functions, because the extra argument makes it cumbersome to use getNode and because I get confused by it :-) llvm-svn: 43172
- Oct 17, 2007
Chris Lattner authored
loads instead of 1 x i64 loads. This doesn't change any functionality yet. llvm-svn: 43068
Chris Lattner authored
llvm-svn: 43066
- Oct 16, 2007
Dale Johannesen authored
llvm-svn: 43033