Commits · e14fdfaecd87eb7e588022aa34891b9204367bb1 · Roger Ferrer / llvm-epi-0.8

Feb 03, 2008
- SSE 4.1 Intrinsics and detection · e14fdfae
  Nate Begeman authored Feb 03, 2008
```
llvm-svn: 46681
```
Jan 24, 2008

Significantly simplify and improve handling of FP function results on x86-32. · a91f77ea

Chris Lattner authored Jan 24, 2008

This case returns the value in ST(0) and then has to convert it to an SSE
register.  This causes significant codegen ugliness in some cases.  For 
example in the trivial fp-stack-direct-ret.ll testcase we used to generate:

_bar:
	subl	$28, %esp
	call	L_foo$stub
	fstpl	16(%esp)
	movsd	16(%esp), %xmm0
	movsd	%xmm0, 8(%esp)
	fldl	8(%esp)
	addl	$28, %esp
	ret

because we move the result of foo() into an XMM register, then have to
move it back for the return of bar.

Instead of hacking ever-more special cases into the call result lowering code
we take a much simpler approach: on x86-32, fp return is modeled as always 
returning into an f80 register which is then truncated to f32 or f64 as needed.
Similarly for a result, we model it as an extension to f80 + return.

This exposes the truncate and extensions to the dag combiner, allowing target
independent code to hack on them, eliminating them in this case.  This gives 
us this code for the example above:

_bar:
	subl	$12, %esp
	call	L_foo$stub
	addl	$12, %esp
	ret

The nasty aspect of this is that these conversions are not legal, but we want
the second pass of dag combiner (post-legalize) to be able to hack on them.
To handle this, we lie to legalize and say they are legal, then custom expand
them on entry to the isel pass (PreprocessForFPConvert).  This is gross, but
less gross than the code it is replacing :)

This also allows us to generate better code in several other cases.  For 
example on fp-stack-ret-conv.ll, we now generate:

_test:
	subl	$12, %esp
	call	L_foo$stub
	fstps	8(%esp)
	movl	16(%esp), %eax
	cvtss2sd	8(%esp), %xmm0
	movsd	%xmm0, (%eax)
	addl	$12, %esp
	ret

where before we produced (incidentally, the old bad code is identical to what
gcc produces):

_test:
	subl	$12, %esp
	call	L_foo$stub
	fstpl	(%esp)
	cvtsd2ss	(%esp), %xmm0
	cvtss2sd	%xmm0, %xmm0
	movl	16(%esp), %eax
	movsd	%xmm0, (%eax)
	addl	$12, %esp
	ret

Note that we generate slightly worse code on pr1505b.ll due to a scheduling 
deficiency that is unrelated to this patch.
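
As a rough illustration of why exposing the conversions helps (a toy sketch, not the actual DAG combiner code): once the call result is an f80 value that is rounded to f64 and then extended back to f80 for the return, a generic fold can drop both nodes. The `Node` type, the helper names and the `exact` flag (meaning the round is known not to change the value) are assumptions made for this sketch.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Toy FP types and DAG nodes; these are illustrative, not LLVM's classes.
enum class FPType { F32, F64, F80 };
enum class Op { Value, FPExtend, FPRound };

struct Node {
  Op op;
  FPType type;                    // result type of this node
  std::shared_ptr<Node> operand;  // null for Op::Value
  bool exact;                     // FPRound only: known value-preserving (assumption)
};
using NodePtr = std::shared_ptr<Node>;

NodePtr makeValue(FPType t) {
  return std::make_shared<Node>(Node{Op::Value, t, nullptr, false});
}
NodePtr makeRound(NodePtr x, FPType to, bool exact) {
  return std::make_shared<Node>(Node{Op::FPRound, to, std::move(x), exact});
}
NodePtr makeExtend(NodePtr x, FPType to) {
  return std::make_shared<Node>(Node{Op::FPExtend, to, std::move(x), false});
}

// Fold extend(round(x)) -> x when the round is value-preserving and the
// extension goes back to x's original type. Otherwise return n unchanged.
NodePtr combine(const NodePtr &n) {
  if (n->op != Op::FPExtend) return n;
  const NodePtr &r = n->operand;
  if (r->op == Op::FPRound && r->exact && r->operand->type == n->type)
    return r->operand;
  return n;
}
```

In the _bar example, the result of foo() plays the role of the f80 value, so the return of bar can use it directly and no XMM round trip is needed.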

llvm-svn: 46307


Jan 11, 2008
- add some missing flags. · f4b0c99d
  Chris Lattner authored Jan 11, 2008
```
llvm-svn: 45859
```
Jan 10, 2008

Start inferring side effect information more aggressively, and fix many bugs in the · 317332fc

Chris Lattner authored Jan 10, 2008

x86 backend where instructions were not marked maystore/mayload, and perf issues where
instructions were not marked neverHasSideEffects.  It would be really nice if we could
write patterns for copy instructions.

I have audited all the x86 instructions down to MOVDQAmr.  The flags on others and on
other targets are probably not right in all cases, but no clients that are enabled by
default currently use this info.
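
A minimal sketch of the idea of inferring flags from an instruction's pattern, under assumptions: the `Kind` categories, `PatternNode` and `inferFlags` are hypothetical names for this example, and the rule used for neverHasSideEffects (no store and no opaque node in the pattern) is a simplification, not the actual TableGen logic.

```cpp
#include <cassert>
#include <vector>

// Hypothetical pattern node kinds; real TableGen patterns are richer.
enum class Kind { Load, Store, Pure, Opaque };

struct PatternNode {
  Kind kind;
  std::vector<PatternNode> children;
};

struct InstrFlags {
  bool mayLoad = false;
  bool mayStore = false;
  bool neverHasSideEffects = false;
};

// Walk the pattern tree, recording loads, stores and opaque nodes.
void scan(const PatternNode &n, InstrFlags &f, bool &opaque) {
  if (n.kind == Kind::Load) f.mayLoad = true;
  if (n.kind == Kind::Store) f.mayStore = true;
  if (n.kind == Kind::Opaque) opaque = true;
  for (const PatternNode &c : n.children) scan(c, f, opaque);
}

InstrFlags inferFlags(const PatternNode &root) {
  InstrFlags f;
  bool opaque = false;
  scan(root, f, opaque);
  // Assumption: side-effect free only if the pattern has no store and
  // no node whose behavior is unknown.
  f.neverHasSideEffects = !f.mayStore && !opaque;
  return f;
}
```

An instruction with no pattern at all (such as a copy) gives this kind of inference nothing to walk, which is why its flags still have to be set by hand.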

llvm-svn: 45829


remove explicit sets of 'neverHasSideEffects' that can now be · aca7ca37
Chris Lattner authored Jan 10, 2008
```
inferred from the instr patterns.

llvm-svn: 45824
```

Jan 07, 2008
- rename isLoad -> isSimpleLoad due to Evan's desire to have such a predicate. · a4ce4f69
  Chris Lattner authored Jan 06, 2008
```
llvm-svn: 45667
```
Dec 29, 2007
- Remove attribution from file headers, per discussion on llvmdev. · f3ebc3f3
  Chris Lattner authored Dec 29, 2007
```
llvm-svn: 45418
```
Dec 20, 2007
- Fix JIT encoding for CMPSD as well. · 01c7c198
  Evan Cheng authored Dec 20, 2007
```
llvm-svn: 45268
```
Dec 18, 2007

Add "mayHaveSideEffects" and "neverHasSideEffects" flags to some instructions. I · b3d85a5d

Bill Wendling authored Dec 17, 2007

based what flag to set on whether it was already marked as
"isRematerializable". If there was a further check to determine if it's "really"
rematerializable, then I marked it as "mayHaveSideEffects" and created a check
in the X86 back-end similar to the remat one.

llvm-svn: 45132


Dec 16, 2007

Fix the JIT encoding of cmp*ss, which aborts with this assertion currently: · dab6bd90

Chris Lattner authored Dec 16, 2007

X86CodeEmitter.cpp:378: failed assertion `0 && "Immediate size not set!"'

I *think* this is right, but Evan, please verify.  It also looks like
CMPSDrr and maybe others are missing this info.  Evan, plz investigate.

llvm-svn: 45074


Dec 15, 2007
- Make better use of instructions that clear high bits; fix various 2-wide shuffle bugs. · 23d2d4dc
  Evan Cheng authored Dec 15, 2007
```
llvm-svn: 45058
```
Dec 13, 2007

Implicit def instructions, e.g. X86::IMPLICIT_DEF_GR32, are always re-materializable and they should not be spilled. · 6e68381e

Evan Cheng authored Dec 12, 2007

llvm-svn: 44960


Dec 06, 2007
- Remove a bogus optimization. It's not possible to do a move to low element to a <8 x i16> or <16 x i8> vector. · c829e5cd
  Evan Cheng authored Dec 06, 2007
```

llvm-svn: 44669
```
Nov 25, 2007

Fix a long standing deficiency in the X86 backend: we would · 5728bdd4

Chris Lattner authored Nov 25, 2007

sometimes emit "zero" and "all one" vectors multiple times,
for example:

_test2:
	pcmpeqd	%mm0, %mm0
	movq	%mm0, _M1
	pcmpeqd	%mm0, %mm0
	movq	%mm0, _M2
	ret

instead of:

_test2:
	pcmpeqd	%mm0, %mm0
	movq	%mm0, _M1
	movq	%mm0, _M2
	ret

This patch fixes this by always arranging for zero/one vectors
to be defined as v4i32 or v2i32 (SSE/MMX) instead of letting them be
any random type.  This ensures they get trivially CSE'd on the dag.
This fix is also important for LegalizeDAGTypes, as it gets unhappy
when the x86 backend wants BUILD_VECTOR(i64 0) to be legal even when
'i64' isn't legal.
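
To make the CSE argument concrete, here is a toy sketch (not LLVM code) of a node table keyed by opcode and type. The `ToyDAG` class, its methods and the type strings are assumptions for illustration. When every all-ones vector is requested in one canonical type, the second request finds the first node instead of creating a new one.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <utility>

// Toy CSE table keyed by (opcode, type). All names here are hypothetical.
class ToyDAG {
  std::map<std::pair<std::string, std::string>, int> nodes_;
  int nextId_ = 0;

public:
  // Return the id of an existing identical node, or create a new one.
  int getNode(const std::string &opcode, const std::string &type) {
    auto key = std::make_pair(opcode, type);
    auto it = nodes_.find(key);
    if (it != nodes_.end()) return it->second;
    int id = nextId_++;
    nodes_.emplace(key, id);
    return id;
  }

  // Build an all-ones MMX vector. With canonicalize set, the node is always
  // created as v2i32; a user wanting another type would bitcast the result
  // (the bitcast itself is not modeled here).
  int getAllOnes(bool canonicalize, const std::string &wantedType) {
    return getNode("all_ones", canonicalize ? "v2i32" : wantedType);
  }

  std::size_t size() const { return nodes_.size(); }
};
```

Without canonicalization, requests for v8i8 and v1i64 all-ones vectors produce two distinct nodes, which corresponds to the two pcmpeqd instructions in the first listing.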

This patch makes the following changes:

1) X86TargetLowering::LowerBUILD_VECTOR now lowers 0/1 vectors into
   their canonical types.
2) The now-dead patterns are removed from the SSE/MMX .td files.
3) All the patterns in the .td file that referred to immAllOnesV or
   immAllZerosV in the wrong form now use *_bc to match them with a
   bitcast wrapped around them.
4) X86DAGToDAGISel::SelectScalarSSELoad is generalized to handle 
   bitcast'd zero vectors, which simplifies the code actually.
5) getShuffleVectorZeroOrUndef is updated to generate a shuffle that
   is legal, instead of generating one that is illegal and expecting
   a later legalize pass to clean it up.
6) isZeroShuffle is generalized to handle bitcast of zeros.
7) several other minor tweaks.

This patch is definite goodness, but has the potential to cause random
code quality regressions.  Please be on the lookout for these and let 
me know if they happen.

llvm-svn: 44310


Nov 17, 2007
- Add support for vectors to int <-> float casts. · d4d45c26
  Nate Begeman authored Nov 17, 2007
```
llvm-svn: 44204
```
Oct 30, 2007
- Add missing SSE builtins: CVTPD2PI, CVTPS2PI, · d50c8bce
  Dale Johannesen authored Oct 30, 2007
```
CVTTPD2PI, CVTTPS2PI, CVTPI2PD, CVTPI2PS.

llvm-svn: 43523
```
Oct 12, 2007

Corrected many typing errors. And removed 'nest' parameter handling · 1f0da1fe

Arnold Schwaighofer authored Oct 12, 2007

for fastcc from X86CallingConv.td.  This means that nested functions
are not supported for calling convention 'fastcc'.

llvm-svn: 42934


Oct 11, 2007
- Add missing argument to PALIGNR · 62f65edc
  Dale Johannesen authored Oct 11, 2007
```
llvm-svn: 42874
```
Oct 06, 2007

Added DAG xforms. e.g. · f4b5d491

Evan Cheng authored Oct 06, 2007

(vextract (v4f32 s2v (f32 load $addr)), 0) -> (f32 load $addr) 
(vextract (v4i32 bc (v4f32 s2v (f32 load $addr))), 0) -> (i32 load $addr)
Remove x86 specific patterns.
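
The two xforms can be sketched on a toy node type as follows. This is only an illustration: `Node`, `load`, `unary` and `foldExtract` are hypothetical names, the bitcast case assumes the element size is unchanged (f32 <-> i32, as in the examples above), and a real combine would also have to check things this sketch ignores, such as other users of the load.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

enum class Op { Load, ScalarToVector, Bitcast, VExtract };

struct Node {
  Op op;
  std::string type;               // result type, e.g. "f32" or "v4i32"
  std::shared_ptr<Node> operand;  // null for Load
  std::string addr;               // Load only
  unsigned index;                 // VExtract only
};
using NodePtr = std::shared_ptr<Node>;

NodePtr load(std::string type, std::string addr) {
  return std::make_shared<Node>(
      Node{Op::Load, std::move(type), nullptr, std::move(addr), 0});
}
NodePtr unary(Op op, std::string type, NodePtr x, unsigned index = 0) {
  return std::make_shared<Node>(
      Node{op, std::move(type), std::move(x), "", index});
}

// (vextract (s2v (load addr)), 0)      -> (load addr)
// (vextract (bc (s2v (load addr))), 0) -> (load addr), typed as the extract
NodePtr foldExtract(const NodePtr &n) {
  if (n->op != Op::VExtract || n->index != 0) return n;
  NodePtr v = n->operand;
  if (v->op == Op::Bitcast) v = v->operand;  // look through one bitcast
  if (v->op != Op::ScalarToVector || v->operand->op != Op::Load) return n;
  return load(n->type, v->operand->addr);
}
```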

llvm-svn: 42677


Oct 01, 2007
- Typo. X86comi doesn't read / write chains. · a1b7e950
  Evan Cheng authored Oct 01, 2007
```
llvm-svn: 42492
```
Sep 29, 2007
- Enabling new condition code modeling scheme. · 5fb5a1f3
  Evan Cheng authored Sep 29, 2007
```
llvm-svn: 42459
```
Sep 25, 2007

Added support for new condition code modeling scheme (i.e. physical register dependency). · e95f391e

Evan Cheng authored Sep 25, 2007

These are a bunch of instructions that are duplicated so the x86 backend can support both the old and new schemes at the same time. They will be deleted after all the kinks are worked out.

llvm-svn: 42285


Sep 23, 2007

Fix PR 1681. When X86 target uses +sse -sse2, · e36c4002

Dale Johannesen authored Sep 23, 2007

keep f32 in SSE registers and f64 in x87.  This
is effectively a new codegen mode.
Change addLegalFPImmediate to permit float and
double variants to do different things.
Adjust callers.
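
The resulting choice of register file per FP type can be summarized with a small sketch. The `Features` struct and function names are hypothetical, and the sketch assumes that SSE2 implies SSE.

```cpp
#include <cassert>

// Which unit holds a scalar FP value. Names are illustrative only.
enum class FPUnit { X87, SSE };

struct Features {
  bool sse;   // +sse
  bool sse2;  // +sse2 (assumed to imply sse)
};

// f32 lives in SSE registers as soon as SSE is available.
FPUnit unitForF32(Features f) { return f.sse ? FPUnit::SSE : FPUnit::X87; }

// f64 needs SSE2; with +sse -sse2 it stays on the x87 stack.
FPUnit unitForF64(Features f) { return f.sse2 ? FPUnit::SSE : FPUnit::X87; }
```

With `Features{true, false}` (the +sse -sse2 case from PR 1681), f32 maps to SSE and f64 to x87, which is the new mixed mode.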

llvm-svn: 42246


Sep 14, 2007
- Add implicit def of EFLAGS on those instructions that may modify flags. · 483e1ce1
  Evan Cheng authored Sep 14, 2007
```
llvm-svn: 41962
```
Sep 11, 2007
- Remove (somewhat confusing) Imp<> helper, use let Defs = [], Uses = [] instead. · 3e18e504
  Evan Cheng authored Sep 11, 2007
```
llvm-svn: 41863
```
Sep 07, 2007
- Avoid storing and reloading zeros and other constants from stack slots · a95cbb00
  Dan Gohman authored Sep 07, 2007
```
by flagging the associated instructions as being trivially rematerializable.

llvm-svn: 41775
```
Aug 30, 2007
- Mark load instructions with isLoad = 1. · c2081fe5
  Evan Cheng authored Aug 30, 2007
```
llvm-svn: 41595
```
Aug 11, 2007
- 64-bit SSSE3 ops that use MMX registers don't require 16-byte alignment. · cdbd82ee
  Bill Wendling authored Aug 11, 2007
```
Make a 'memop' pattern just for them.

llvm-svn: 41017
```
Aug 10, 2007
- For kicks, I thought it would be fun to use the correct opcode. · 70146150
  Bill Wendling authored Aug 10, 2007
```
llvm-svn: 40985
```
- Adding SSSE3 intrinsics. · 23772069
  Bill Wendling authored Aug 10, 2007
```
llvm-svn: 40982
```
Aug 02, 2007

Fix the alignment requirements of several unpck and shuf instructions. · 8932bff7

Dan Gohman authored Aug 02, 2007

Generalize isPSHUFDMask and add a unary SHUFPD pattern so that SHUFPD's
memory operand alignment can be tested as well, with a fix to avoid
breaking MMX's use of isPSHUFDMask.

llvm-svn: 40756


Fix pastos in vector arithmetic intrinsics. · 4d436e2b
Dan Gohman authored Aug 02, 2007
```
llvm-svn: 40754
```

Mark the SSE and MMX load instructions that · fa3eeeed

Dan Gohman authored Aug 02, 2007

X86InstrInfo::isReallyTriviallyReMaterializable knows how to handle
with the isReMaterializable flag so that it is given a chance to handle
them. Without hoisting constant-pool loads from loops this isn't very
visible, though it does keep CodeGen/X86/constant-pool-remat-0.ll from
making a copy of the constant pool on the stack.

llvm-svn: 40736


Aug 01, 2007
- Missing Requires. · da549ece
  Evan Cheng authored Aug 01, 2007
```
llvm-svn: 40691
```
Jul 31, 2007

Change the x86 assembly output to use tab characters to separate the · 54ec4bfa

Dan Gohman authored Jul 31, 2007

mnemonics from their operands instead of single spaces. This makes the
assembly output a little more consistent with various other compilers
(e.g. GCC), and slightly easier to read. Also, update the regression
tests accordingly.

llvm-svn: 40648


Redo and generalize previously removed opt for pinsrw: (vextract (v4i32 bc (v4f32 s2v (f32 load ))), 0) -> (i32 load ) · 12c6be84

Evan Cheng authored Jul 31, 2007

llvm-svn: 40628


Jul 27, 2007

Re-apply 40504, but with a fix for the segfault it caused in oggenc: · 4788552d

Dan Gohman authored Jul 27, 2007

Make the alignedload and alignedstore patterns always require 16-byte
alignment. This way when they are used in the "Fs" instructions, in which
a vector instruction is used for a scalar purpose, they can still require
the full vector alignment. And add a regression test for this.
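
A sketch of the fixed predicate, under assumptions: `MemRef`, `kVectorAlign` and `isAlignedMemOp` are names invented for this example. The point is that the check is against a fixed 16 bytes rather than anything derived from the loaded type, so a scalar-typed operand folded into a vector instruction is still held to vector alignment.

```cpp
#include <cassert>

// Alignment the SSE memory-operand forms need, in bytes.
constexpr unsigned kVectorAlign = 16;

// Hypothetical description of a memory reference.
struct MemRef {
  unsigned sizeInBytes;  // size of the loaded or stored type
  unsigned alignment;    // known alignment of the address, in bytes
};

// Matches only when the address is known to be 16-byte aligned,
// regardless of how small the accessed type is.
bool isAlignedMemOp(const MemRef &m) { return m.alignment >= kVectorAlign; }
```

For example, a 4-byte f32 access with 4-byte alignment is rejected, so it cannot be folded into an "Fs" instruction that would access the full 16 bytes.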

llvm-svn: 40555


Reverting 40504 for now. It's breaking oggenc. · 931de40a
Evan Cheng authored Jul 27, 2007
```
llvm-svn: 40547
```

Jul 26, 2007
- Fix a whitespace difference between CMPSSrr and CMPSDrr. · cecd4b37
  Dan Gohman authored Jul 26, 2007
```
llvm-svn: 40528
```
- Remove X86ISD::LOAD_PACK and X86ISD::LOAD_UA and associated code from the · 8455bd3f
  Dan Gohman authored Jul 26, 2007
```
x86 target, replacing them with the new alignment attributes on memory
references.

llvm-svn: 40504
```