  Feb 26, 2008
    • Change the lowering of arguments for tail call optimized · b01b99ec
      Arnold Schwaighofer authored
      calls. Previously, arguments that could overwrite each other were
      lowered directly to stack slots, giving the register allocator no
      chance to optimize. Now a sequence of copyto/copyfrom virtual
      registers ensures that arguments are loaded into (virtual) registers
      before they are stored to the stack slots (where they might overwrite
      each other). Parameter stack slots are also marked mutable for
      (potentially) tail-calling functions.
      
      llvm-svn: 47593
  Feb 19, 2008
    • When DAG combiner is folding a bit convert into a BUILD_VECTOR, it should... · 6200c225
      Evan Cheng authored
      - When the DAG combiner folds a bit convert into a BUILD_VECTOR, it should check whether the node is essentially a SCALAR_TO_VECTOR. Avoid turning (v8i16) <10, u, u, u> into <10, 0, u, u, u, u, u, u>; instead, simply convert it to a SCALAR_TO_VECTOR of the proper type.
      - X86 now normalizes SCALAR_TO_VECTOR to (BIT_CONVERT (v4i32 SCALAR_TO_VECTOR)). Get rid of X86ISD::S2VEC.
      
      llvm-svn: 47290
  Feb 14, 2008
    • In TargetLowering::LowerCallTo, don't assert that · 4c95dbd6
      Duncan Sands authored
      the return value is zero-extended if it isn't
      sign-extended.  It may also be any-extended.
      Also, if a floating point value was returned
      in a larger floating point type, pass 1 as the
      second operand to FP_ROUND, which tells it
      that all the precision is in the original type.
      I think this is right but I could be wrong.
      Finally, when doing libcalls, set isZExt on
      a parameter if it is "unsigned".  Currently
      isSExt is set when signed, and nothing is
      set otherwise.  This should be right for all
      calls to standard library routines.
      
      llvm-svn: 47122
    • Change how FP immediates are handled. · 53e1b3f9
      Nate Begeman authored
      1) ConstantFP is now Expand by default
      2) ConstantFP is not turned into TargetConstantFP during Legalize
         if it is legal.
      
      This allows ConstantFP to be handled like Constant, allowing for 
      targets that can encode FP immediates as MachineOperands.
      
      As a bonus, fix up Itanium FP constants, which now correctly match,
      and match more constants!  Hooray.
      
      llvm-svn: 47121
    • Assigning an APInt to 0 with plain assignment gives it a one-bit · 9ca025f1
      Dan Gohman authored
      size. Initialize these APInts to properly-sized zero values.
      
      llvm-svn: 47099
  Feb 02, 2008
    • Don't use uninitialized values. Fixes vec_align.ll on X86 Linux. · f5b9938e
      Nick Lewycky authored
      llvm-svn: 46666
    • SDIsel processes llvm.dbg.declare by recording the variable debug information... · efd142a9
      Evan Cheng authored
      SDIsel processes llvm.dbg.declare by recording the variable's debug information descriptor and its corresponding stack frame index in MachineModuleInfo. This only works if the local variable is "homed" in the stack frame; it does not work for byval parameters, etc.
      Added an ISD::DECLARE node type to represent the llvm.dbg.declare intrinsic. The intrinsic calls are now lowered into SDNodes and live on throughout the codegen passes.
      For now, since all debugging information recording is done at isel time, selecting an ISD::DECLARE node has the side effect of also recording the variable. This is a short-term solution that should be fixed in time.
      
      llvm-svn: 46659
  Jan 24, 2008
    • Significantly simplify and improve handling of FP function results on x86-32. · a91f77ea
      Chris Lattner authored
      This case returns the value in ST(0) and then has to convert it to an SSE
      register.  This causes significant codegen ugliness in some cases.  For 
      example in the trivial fp-stack-direct-ret.ll testcase we used to generate:
      
      _bar:
      	subl	$28, %esp
      	call	L_foo$stub
      	fstpl	16(%esp)
      	movsd	16(%esp), %xmm0
      	movsd	%xmm0, 8(%esp)
      	fldl	8(%esp)
      	addl	$28, %esp
      	ret
      
      because we move the result of foo() into an XMM register, then have to
      move it back for the return of bar.
      
      Instead of hacking ever-more special cases into the call result lowering code
      we take a much simpler approach: on x86-32, an fp call result is modeled as 
      always arriving in an f80 register which is then truncated to f32 or f64 as
      needed. Similarly, returning a result is modeled as an extension to f80
      followed by the return.
      
      This exposes the truncate and extensions to the dag combiner, allowing target
      independent code to hack on them, eliminating them in this case.  This gives 
      us this code for the example above:
      
      _bar:
      	subl	$12, %esp
      	call	L_foo$stub
      	addl	$12, %esp
      	ret
      
      The nasty aspect of this is that these conversions are not legal, but we want
      the second pass of dag combiner (post-legalize) to be able to hack on them.
      To handle this, we lie to legalize and say they are legal, then custom expand
      them on entry to the isel pass (PreprocessForFPConvert).  This is gross, but
      less gross than the code it is replacing :)
      
      This also allows us to generate better code in several other cases.  For 
      example on fp-stack-ret-conv.ll, we now generate:
      
      _test:
      	subl	$12, %esp
      	call	L_foo$stub
      	fstps	8(%esp)
      	movl	16(%esp), %eax
      	cvtss2sd	8(%esp), %xmm0
      	movsd	%xmm0, (%eax)
      	addl	$12, %esp
      	ret
      
      where before we produced (incidentally, the old bad code is identical to what
      gcc produces):
      
      _test:
      	subl	$12, %esp
      	call	L_foo$stub
      	fstpl	(%esp)
      	cvtsd2ss	(%esp), %xmm0
      	cvtss2sd	%xmm0, %xmm0
      	movl	16(%esp), %eax
      	movsd	%xmm0, (%eax)
      	addl	$12, %esp
      	ret
      
      Note that we generate slightly worse code on pr1505b.ll due to a scheduling 
      deficiency that is unrelated to this patch.
      
      llvm-svn: 46307