- Apr 12, 2008
Arnold Schwaighofer authored
Optimized x86-64 (and x86) calls so that they work (at least for my test cases). Should fix the following problems:
Problem 1: When I introduced the optimized handling of arguments for tail-called functions (using a sequence of copyto/copyfrom virtual registers instead of always lowering to the top of the stack), I did not handle byval arguments correctly; they did not work at all.
Problem 2: On x86-64, after the arguments of the tail-called function are moved to their registers (which include ESI/RSI etc.), tail call optimization performs byval lowering, which causes the xSI, xDI, and xCX registers to be overwritten. This patch handles that by moving the arguments to virtual registers first; after the byval lowering, the arguments are moved from those virtual registers back to RSI/RDI/RCX. llvm-svn: 49584
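A minimal C-level sketch (not from the commit; names are hypothetical) of the Problem 2 scenario: a tail call that passes a byval aggregate alongside scalar arguments, where the memcpy-like byval lowering itself uses xSI/xDI/xCX and would clobber arguments already placed there.

    // Hypothetical illustration only.
    struct Payload { char data[64]; };      // passed byval at the IR level

    int callee(int a, int b, Payload p);    // a and b travel in argument registers

    int caller(int x, int y, Payload p) {
        // Under tail call optimization, copying 'p' into place uses RSI/RDI/RCX,
        // so x and y are staged in virtual registers and restored afterwards.
        return callee(y, x, p);
    }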
Dan Gohman authored
on any current target and aren't optimized in DAGCombiner. Instead of using intermediate nodes, expand the operations immediately, choosing between simple loads/stores, target-specific code, and library calls. Previously, the code to emit optimized code for these operations was only used at initial SelectionDAG construction time; now it is used at all times. This fixes some cases where rep;movs was being used for small copies where simple loads/stores would be better. This also cleans up the code that checks for alignments less than 4, letting the targets make that decision instead of doing it in target-independent code. This allows x86 to use rep;movs in low-alignment cases. Also, this fixes a bug that resulted in the use of rep;stos for memsets of 0 with non-constant memory size when the alignment was at least 4. It's better to use the library in this case, which can be significantly faster when the size is large. This also preserves more SourceValue information when memory intrinsics are lowered into simple loads/stores. llvm-svn: 49572
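Two hypothetical cases (not from the commit) of the kind described: a small constant-size copy that is better expanded to plain loads/stores, and a variable-size memset of 0 where the library call can beat rep;stos.

    #include <cstring>

    void small_copy(char *dst, const char *src) {
        std::memcpy(dst, src, 8);   // small constant size: a couple of loads/stores win over rep;movs
    }

    void big_clear(char *dst, std::size_t n) {
        std::memset(dst, 0, n);     // non-constant size: the library memset can be much faster
    }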
- Apr 09, 2008
Dan Gohman authored
llvm-svn: 49446
- Mar 21, 2008
Chris Lattner authored
x86-64 return conventions correct, but was never enabled. We can now do the "right thing" with multiple return values. llvm-svn: 48635
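As a hedged illustration (not taken from the commit), the kind of case that multiple-return-value support covers: a small aggregate that the x86-64 convention returns in registers (RAX:RDX) rather than through a hidden sret pointer.

    struct Pair { long first, second; };   // small enough to be returned in two registers

    Pair make_pair(long a, long b) {
        Pair p = { a, b };
        return p;                          // two register copies, no sret slot needed
    }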
- Mar 19, 2008
Arnold Schwaighofer authored
llvm-svn: 48545
- Mar 10, 2008
Chris Lattner authored
copyfromreg/copytoreg instead. llvm-svn: 48174
Scott Michel authored
return ValueType can depend on its operands' ValueType. This is a cosmetic change; no functionality impacted. llvm-svn: 48145
- Mar 09, 2008
Chris Lattner authored
llvm-svn: 48094
Chris Lattner authored
isel'ing value-preserving FP roundings from one FP stack register to another into a noop, instead of stack traffic. llvm-svn: 48093
- Mar 05, 2008
Evan Cheng authored
For x86, if SSE2 is available, it's not a good idea, since cvtss2sd is slower than a movsd load and it prevents load folding. On x87, it's important to shrink FP constants, since fldt is very expensive. llvm-svn: 47931
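A tiny hypothetical example of the trade-off: 0.5 is exactly representable as a float, so its constant-pool entry could be shrunk to 4 bytes and widened at load time. With SSE2 that widening is a cvtss2sd and blocks load folding, so it is not worth it; on x87, avoiding an expensive wide fld of the constant is a win.

    double bias(double x) {
        return x + 0.5;   // 0.5 fits exactly in a float; shrinking the constant
                          // only pays off on x87, not with SSE2
    }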
Andrew Lenharth authored
llvm-svn: 47929
- Mar 01, 2008
Andrew Lenharth authored
llvm-svn: 47798
- Feb 26, 2008
Arnold Schwaighofer authored
llvm-svn: 47635
Arnold Schwaighofer authored
calls. Before, arguments that could overwrite each other were explicitly lowered to a stack slot, not giving the register allocator a chance to optimize. Now a sequence of copyto/copyfrom virtual registers ensures that arguments are loaded into (virtual) registers before they are lowered to the stack slot (where they might overwrite each other). Also, parameter stack slots are marked mutable for (potentially) tail-calling functions. llvm-svn: 47593
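A hypothetical example (not from the commit) of arguments that could overwrite each other in a tail call: the outgoing arguments are a permutation of the incoming ones, so writing them straight into their final slots could clobber a value that is still needed. Staging them in copyto/copyfrom virtual registers lets the register allocator resolve the cycle.

    int callee(int a, int b);

    int caller(int x, int y) {
        return callee(y, x);   // swapped arguments: naive in-place lowering would
                               // overwrite x (or y) before it has been read
    }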
- Feb 19, 2008
Evan Cheng authored
- When the DAG combiner is folding a bit convert into a BUILD_VECTOR, it should check whether it's essentially a SCALAR_TO_VECTOR. Avoid turning (v8i16) <10, u, u, u> into <10, 0, u, u, u, u, u, u>; instead, simply convert it to a SCALAR_TO_VECTOR of the proper type.
- X86 now normalizes SCALAR_TO_VECTOR to (BIT_CONVERT (v4i32 SCALAR_TO_VECTOR)). Get rid of X86ISD::S2VEC. llvm-svn: 47290
- Feb 13, 2008
Dan Gohman authored
to pass the mask APInt by value, not by reference. llvm-svn: 47096
Dan Gohman authored
Add an overload that supports the uint64_t interface for use by clients that haven't been updated yet. llvm-svn: 47039
- Feb 11, 2008
Nate Begeman authored
Add some notes to the README. llvm-svn: 46949
- Feb 10, 2008
Dan Gohman authored
llvm-svn: 46930
- Jan 31, 2008
Dan Gohman authored
with the real FLT_ROUNDS (defined in <float.h>). llvm-svn: 46587
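For reference, a small (hypothetical) use of the real FLT_ROUNDS macro from <float.h>/<cfloat>: it expands to the current rounding mode, where 1 means round-to-nearest.

    #include <cfloat>

    bool rounds_to_nearest() {
        return FLT_ROUNDS == 1;   // 0: toward zero, 1: to nearest, 2: toward +inf, 3: toward -inf
    }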
- Jan 30, 2008
Evan Cheng authored
Even though InsertAtEndOfBasicBlock is an ugly hack, it still deserves a proper name. Rename it to EmitInstrWithCustomInserter, since it does not necessarily insert the instruction at the end. llvm-svn: 46562
- Jan 29, 2008
Evan Cheng authored
Work in progress. This patch *fixes* x86-64 calls which are modelled as StructRet but really should be returned in registers, e.g. _Complex long double and some 128-bit aggregates. This is a short-term solution that is necessary only because llvm, for now, cannot model i128 nor calls with multiple results. Status: this only works for direct calls, and only the caller side is done. Disabled for now. llvm-svn: 46527
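A hedged sketch of one affected signature (not from the commit): a 128-bit aggregate result that the x86-64 convention returns in registers, but that was being modelled as a StructRet (hidden out-pointer) call.

    struct Int128 { unsigned long long lo, hi; };   // one of the 128-bit aggregate cases

    Int128 wide_add(Int128 a, Int128 b);            // should lower to a register return,
                                                    // not to a hidden sret argument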
Dale Johannesen authored
llvm-svn: 46485
- Jan 24, 2008
Evan Cheng authored
Let each target decide byval alignment. For X86, it's 4 bytes unless the aggregate contains SSE vector(s). For x86-64, it's the max of 8 and the alignment of the type. llvm-svn: 46286
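Hypothetical types illustrating the rule (not from the commit): on x86-32 a byval aggregate gets 4-byte alignment unless it contains an SSE vector; on x86-64 it gets the larger of 8 bytes and the type's own alignment.

    #include <xmmintrin.h>

    struct Plain  { int a, b, c; };     // X86: byval slot aligned to 4 bytes
    struct HasVec { __m128 v; int a; }; // contains an SSE vector, so it needs 16-byte alignment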
- Jan 18, 2008
Chris Lattner authored
llvm-svn: 46159
- Jan 16, 2008
Chris Lattner authored
llvm-svn: 46058
Chris Lattner authored
some code. No functionality change. llvm-svn: 46055
- Jan 15, 2008
Chris Lattner authored
llvm-svn: 46015
Anton Korobeynikov authored
as well as PPC codegen llvm-svn: 46001
- Jan 05, 2008
Gordon Henriksen authored
unifying the copied algorithms and saving over 500 LOC. There should be no functionality change, but please test on your favorite x86 target. llvm-svn: 45627
- Dec 29, 2007
Chris Lattner authored
llvm-svn: 45418
- Dec 14, 2007
Evan Cheng authored
llvm-svn: 45024
- Nov 24, 2007
Chris Lattner authored
1) Change the interface to TargetLowering::ExpandOperationResult to take and return entire NODES that need a result expanded, not just the value. This allows us to handle things like READCYCLECOUNTER, which returns two values.
2) Implement (extremely limited) support in LegalizeDAG::ExpandOp for MERGE_VALUES.
3) Reimplement custom lowering in LegalizeDAGTypes in terms of the new ExpandOperationResult. This makes the result simpler and fully general.
4) Implement (fully general) expand support for MERGE_VALUES in LegalizeDAGTypes.
5) Implement ExpandOperationResult support for ARM f64->i64 bitconvert and ARM i64 shifts, allowing them to work with LegalizeDAGTypes.
6) Implement ExpandOperationResult support for X86 READCYCLECOUNTER and FP_TO_SINT, allowing them to work with LegalizeDAGTypes.
LegalizeDAGTypes now passes several more X86 codegen tests when enabled and when type legalization in LegalizeDAG is ifdef'd out. llvm-svn: 44300
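For context on item 6, a hypothetical inline-asm sketch of what READCYCLECOUNTER amounts to on 32-bit x86: the 64-bit counter comes back split across EDX:EAX, which is exactly the kind of illegal-i64 result that the expansion support has to handle.

    unsigned long long read_tsc() {
        unsigned lo, hi;
        __asm__ __volatile__("rdtsc" : "=a"(lo), "=d"(hi));   // result arrives in EAX (low) and EDX (high)
        return ((unsigned long long)hi << 32) | lo;
    }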
- Nov 16, 2007
Anton Korobeynikov authored
llvm-svn: 44183
- Nov 09, 2007
Evan Cheng authored
Then:

    call "L1$pb"
"L1$pb":
    popl %eax
    ...
LBB1_1: # entry
    imull $4, %ecx, %ecx
    leal LJTI1_0-"L1$pb"(%eax), %edx
    addl LJTI1_0-"L1$pb"(%ecx,%eax), %edx
    jmpl *%edx
    .align 2
    .set L1_0_set_3,LBB1_3-LJTI1_0
    .set L1_0_set_2,LBB1_2-LJTI1_0
    .set L1_0_set_5,LBB1_5-LJTI1_0
    .set L1_0_set_4,LBB1_4-LJTI1_0
LJTI1_0:
    .long L1_0_set_3
    .long L1_0_set_2

Now:

    call "L1$pb"
"L1$pb":
    popl %eax
    ...
LBB1_1: # entry
    addl LJTI1_0-"L1$pb"(%eax,%ecx,4), %eax
    jmpl *%eax
    .align 2
    .set L1_0_set_3,LBB1_3-"L1$pb"
    .set L1_0_set_2,LBB1_2-"L1$pb"
    .set L1_0_set_5,LBB1_5-"L1$pb"
    .set L1_0_set_4,LBB1_4-"L1$pb"
LJTI1_0:
    .long L1_0_set_3
    .long L1_0_set_2

llvm-svn: 43924
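A hypothetical C-level switch of the sort that produces a jump table like the one above; in the "Now" form the scaling and the picbase-relative table entries are folded into a single addl.

    int dispatch(int i) {
        switch (i) {
        case 0: return 10;
        case 1: return 20;
        case 2: return 30;
        case 3: return 40;
        default: return -1;
        }
    }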
- Nov 06, 2007
Rafael Espindola authored
Thanks for the suggestions Bill :-) llvm-svn: 43742
- Oct 29, 2007
Evan Cheng authored
transformation. Previously, it was restricted by ensuring that the number of load uses is one. Now the restriction is loosened by allowing setcc uses to be "extended" (e.g. setcc x, c, eq -> setcc sext(x), sext(c), eq). llvm-svn: 43465
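A hypothetical source pattern of the kind affected: a narrow load that feeds both an extension and a comparison. Rewriting the setcc over the extended operands (setcc sext(x), sext(c)) lets the load still be folded into the extend even though it has more than one use.

    int classify(const signed char *p) {
        signed char c = *p;   // narrow load
        int wide = c;         // sign-extend use of the load
        if (c == 7)           // setcc use of the same, un-extended load
            ++wide;
        return wide;
    }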
- Oct 26, 2007
Evan Cheng authored
Loosen up IV reuse to allow reuse of the same stride but a larger type when truncating from the larger type to the smaller type is free. E.g., this turns the loop:

LBB1_1: # entry.bb_crit_edge
    xorl %ecx, %ecx
    xorw %dx, %dx
    movw %dx, %si
LBB1_2: # bb
    movl L_X$non_lazy_ptr, %edi
    movw %si, (%edi)
    movl L_Y$non_lazy_ptr, %edi
    movw %dx, (%edi)
    addw $4, %dx
    incw %si
    incl %ecx
    cmpl %eax, %ecx
    jne LBB1_2 # bb

into

LBB1_1: # entry.bb_crit_edge
    xorl %ecx, %ecx
    xorw %dx, %dx
LBB1_2: # bb
    movl L_X$non_lazy_ptr, %esi
    movw %cx, (%esi)
    movl L_Y$non_lazy_ptr, %esi
    movw %dx, (%esi)
    addw $4, %dx
    incl %ecx
    cmpl %eax, %ecx
    jne LBB1_2 # bb

llvm-svn: 43375
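A hypothetical C loop of the shape shown above: two 16-bit induction variables plus a 32-bit loop counter. Because truncating i32 to i16 is free on x86, the counter (same stride) can be reused and one of the 16-bit induction variables dropped.

    extern short X, Y;

    void store_ivs(int n) {
        short a = 0, b = 0;
        for (int i = 0; i < n; ++i) {
            X = a;      // 'a' has the same stride as the loop counter, so it can reuse i
            Y = b;
            a += 1;
            b += 4;
        }
    }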
- Oct 11, 2007
Arnold Schwaighofer authored
enabled by passing -tailcallopt to llc. The optimization is performed if the following conditions are satisfied:
* caller/callee are fastcc
* elf/pic is disabled OR elf/pic enabled + callee is in module + callee has visibility protected or hidden
llvm-svn: 42870
- Oct 09, 2007
Dan Gohman authored
llvm-svn: 42787