- Apr 10, 2008
Chris Lattner authored
MOVZQI2PQIrr. This would be better handled as a dag combine (with the goal of eliminating the bitconvert), but I don't know how to do that safely. Thoughts welcome. llvm-svn: 49463
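As a rough sketch of the kind of IR this pattern matches (the function below is illustrative, not from the commit): inserting a scalar i64 into lane 0 of an all-zero vector is what a zero-extending gr64-to-xmm movq produces.

    define <2 x i64> @zext_to_vec(i64 %x) {
      ; lane 0 gets %x, the upper lane stays zero; on x86-64 this is
      ; a single movq from a general-purpose register into an xmm
      %v = insertelement <2 x i64> zeroinitializer, i64 %x, i32 0
      ret <2 x i64> %v
    }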
- Apr 05, 2008
Evan Cheng authored
llvm-svn: 49244
- Mar 26, 2008
Evan Cheng authored
llvm-svn: 48815
- Mar 24, 2008
Evan Cheng authored
- SSE4.1 extractps extracts an f32 into a gr32 register. Very useful! Not. Fix the instruction specification and teach the lowering code to use it only when the only use is a store instruction. llvm-svn: 48746
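A hedged sketch of the store-only case the lowering now looks for (the function and names are illustrative, not from the commit):

    define void @extract_to_mem(<4 x float> %v, float* %p) {
      ; the extracted lane's only use is the store, the one case where
      ; routing the f32 through extractps (gr32 / memory destination)
      ; pays off instead of shuffling it into another xmm register
      %e = extractelement <4 x float> %v, i32 2
      store float %e, float* %p
      ret void
    }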
- Mar 16, 2008
Nate Begeman authored
llvm-svn: 48430
- Mar 15, 2008
Evan Cheng authored
Replace all target-specific implicit def instructions with a target-independent one: TargetInstrInfo::IMPLICIT_DEF. llvm-svn: 48380
- Mar 14, 2008
Evan Cheng authored
llvm-svn: 48361
Evan Cheng authored
Fix a number of encoding bugs. SSE 4.1 instructions MPSADBWrri, PINSRDrr, etc. have an 8-bit immediate field (ImmT == Imm8). llvm-svn: 48360
- Mar 12, 2008
Evan Cheng authored
X86 lowering normalizes vector zero to v4i32. However, DAGCombine can fold (sub x, x) -> 0 after legalization, which can create a zero vector of an unexpected type (e.g. v8i16). We don't want to disable the optimization, since leaving a (sub x, x) around is really bad. Add isel patterns for the other vector-zero types to ensure correctness; this is highly unlikely to happen outside of bugpoint-reduced test cases. llvm-svn: 48279
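For illustration (hypothetical function, not from the commit), the fold that produces the unexpected zero type looks like this at the IR level:

    define <8 x i16> @sub_self(<8 x i16> %x) {
      ; DAGCombine folds this to an all-zeros v8i16 after legalization,
      ; a zero-vector type the old isel patterns did not cover
      %z = sub <8 x i16> %x, %x
      ret <8 x i16> %z
    }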
- Mar 08, 2008
Evan Cheng authored
Implement x86 support for @llvm.prefetch. It corresponds to the prefetcht{0|1|2} and prefetchnta instructions. llvm-svn: 48042
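A minimal sketch of using the intrinsic, in the three-operand form it had in this era (address, rw, locality); later LLVM versions add a fourth cache-type operand. The locality-to-instruction mapping in the comments is the conventional one, inferred rather than quoted from the commit:

    declare void @llvm.prefetch(i8*, i32, i32)

    define void @warm(i8* %p) {
      ; rw = 0 (read); locality 3 = keep in all cache levels -> prefetcht0
      call void @llvm.prefetch(i8* %p, i32 0, i32 3)
      ; locality 2 -> prefetcht1, locality 1 -> prefetcht2
      call void @llvm.prefetch(i8* %p, i32 0, i32 1)
      ; locality 0 = non-temporal -> prefetchnta
      call void @llvm.prefetch(i8* %p, i32 0, i32 0)
      ret void
    }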
- Mar 05, 2008
Evan Cheng authored
llvm-svn: 47941
Evan Cheng authored
llvm-svn: 47940
- Feb 19, 2008
Evan Cheng authored
- When the DAG combiner folds a bit convert into a BUILD_VECTOR, it should check whether the result is essentially a SCALAR_TO_VECTOR. Avoid turning (v8i16) <10, u, u, u> into <10, 0, u, u, u, u, u, u>; instead, simply convert it to a SCALAR_TO_VECTOR of the proper type.
- X86 now normalizes SCALAR_TO_VECTOR to (BIT_CONVERT (v4i32 SCALAR_TO_VECTOR)). Get rid of X86ISD::S2VEC. llvm-svn: 47290
- Feb 16, 2008
Andrew Lenharth authored
llvm-svn: 47204
- Feb 12, 2008
Nate Begeman authored
Move formats into the formats file. llvm-svn: 47035
- Feb 11, 2008
Nate Begeman authored
Add some notes to the README. llvm-svn: 46949
- Feb 10, 2008
Nate Begeman authored
llvm-svn: 46931
Nate Begeman authored
pabs{b,w,d} are not two-address. Fix extract-to-mem SSE4 ops. Add SSE4 vector sign-extend nodes. llvm-svn: 46915
- Feb 09, 2008
Nate Begeman authored
llvm-svn: 46902
- Feb 04, 2008
Nate Begeman authored
Evan's help with the rest. llvm-svn: 46697
Nate Begeman authored
llvm-svn: 46696
- Feb 03, 2008
Nate Begeman authored
llvm-svn: 46681
- Jan 24, 2008
Chris Lattner authored
This case returns the value in ST(0) and then has to convert it to an SSE register. This causes significant codegen ugliness in some cases. For example in the trivial fp-stack-direct-ret.ll testcase we used to generate:

    _bar:
        subl $28, %esp
        call L_foo$stub
        fstpl 16(%esp)
        movsd 16(%esp), %xmm0
        movsd %xmm0, 8(%esp)
        fldl 8(%esp)
        addl $28, %esp
        ret

because we move the result of foo() into an XMM register, then have to move it back for the return of bar.

Instead of hacking ever-more special cases into the call result lowering code, we take a much simpler approach: on x86-32, fp return is modeled as always returning into an f80 register which is then truncated to f32 or f64 as needed. Similarly for a result, we model it as an extension to f80 + return.

This exposes the truncate and extensions to the dag combiner, allowing target-independent code to hack on them, eliminating them in this case. This gives us this code for the example above:

    _bar:
        subl $12, %esp
        call L_foo$stub
        addl $12, %esp
        ret

The nasty aspect of this is that these conversions are not legal, but we want the second pass of dag combiner (post-legalize) to be able to hack on them. To handle this, we lie to legalize and say they are legal, then custom expand them on entry to the isel pass (PreprocessForFPConvert). This is gross, but less gross than the code it is replacing :)

This also allows us to generate better code in several other cases. For example on fp-stack-ret-conv.ll, we now generate:

    _test:
        subl $12, %esp
        call L_foo$stub
        fstps 8(%esp)
        movl 16(%esp), %eax
        cvtss2sd 8(%esp), %xmm0
        movsd %xmm0, (%eax)
        addl $12, %esp
        ret

where before we produced (incidentally, the old bad code is identical to what gcc produces):

    _test:
        subl $12, %esp
        call L_foo$stub
        fstpl (%esp)
        cvtsd2ss (%esp), %xmm0
        cvtss2sd %xmm0, %xmm0
        movl 16(%esp), %eax
        movsd %xmm0, (%eax)
        addl $12, %esp
        ret

Note that we generate slightly worse code on pr1505b.ll due to a scheduling deficiency that is unrelated to this patch. llvm-svn: 46307
- Jan 11, 2008
Chris Lattner authored
llvm-svn: 45859
- Jan 10, 2008
Chris Lattner authored
x86 backend where instructions were not marked maystore/mayload, and perf issues where instructions were not marked neverHasSideEffects. It would be really nice if we could write patterns for copy instructions. I have audited all the x86 instructions down to MOVDQAmr. The flags on others and on other targets are probably not right in all cases, but no clients that are enabled by default currently use this info. llvm-svn: 45829
Chris Lattner authored
inferred from the instr patterns. llvm-svn: 45824
- Jan 07, 2008
Chris Lattner authored
llvm-svn: 45667
- Dec 29, 2007
Chris Lattner authored
llvm-svn: 45418
- Dec 20, 2007
Evan Cheng authored
llvm-svn: 45268
- Dec 18, 2007
Bill Wendling authored
based what flag to set on whether it was already marked as "isRematerializable". If there was a further check to determine if it's "really" rematerializable, then I marked it as "mayHaveSideEffects" and created a check in the X86 back-end similar to the remat one. llvm-svn: 45132
- Dec 16, 2007
Chris Lattner authored
X86CodeEmitter.cpp:378: failed assertion `0 && "Immediate size not set!"' I *think* this is right, but Evan, please verify. It also looks like CMPSDrr and maybe others are missing this info. Evan, plz investigate. llvm-svn: 45074
- Dec 15, 2007
Evan Cheng authored
llvm-svn: 45058
- Dec 13, 2007
Evan Cheng authored
Implicit def instructions, e.g. X86::IMPLICIT_DEF_GR32, are always re-materializable and they should not be spilled. llvm-svn: 44960
- Dec 06, 2007
Evan Cheng authored
Remove a bogus optimization. It's not possible to do a move to the low element of an <8 x i16> or <16 x i8> vector. llvm-svn: 44669
- Nov 25, 2007
Chris Lattner authored
sometimes emit "zero" and "all one" vectors multiple times, for example:

    _test2:
        pcmpeqd %mm0, %mm0
        movq %mm0, _M1
        pcmpeqd %mm0, %mm0
        movq %mm0, _M2
        ret

instead of:

    _test2:
        pcmpeqd %mm0, %mm0
        movq %mm0, _M1
        movq %mm0, _M2
        ret

This patch fixes this by always arranging for zero/one vectors to be defined as v4i32 or v2i32 (SSE/MMX) instead of letting them be any random type, which ensures they get trivially CSE'd on the dag. This fix is also important for LegalizeDAGTypes, as it gets unhappy when the x86 backend wants BUILD_VECTOR(i64 0) to be legal even when 'i64' isn't legal.

This patch makes the following changes:

1) X86TargetLowering::LowerBUILD_VECTOR now lowers 0/1 vectors into their canonical types.
2) The now-dead patterns are removed from the SSE/MMX .td files.
3) All the patterns in the .td file that referred to immAllOnesV or immAllZerosV in the wrong form now use *_bc to match them with a bitcast wrapped around them.
4) X86DAGToDAGISel::SelectScalarSSELoad is generalized to handle bitcast'd zero vectors, which actually simplifies the code.
5) getShuffleVectorZeroOrUndef is updated to generate a legal shuffle, instead of generating an illegal one and expecting a later legalize pass to clean it up.
6) isZeroShuffle is generalized to handle bitcasts of zeros.
7) Several other minor tweaks.

This patch is definite goodness, but has the potential to cause random code quality regressions. Please be on the lookout for these and let me know if they happen. llvm-svn: 44310
- Nov 17, 2007
Nate Begeman authored
llvm-svn: 44204
- Oct 30, 2007
Dale Johannesen authored
CVTTPD2PI, CVTTPS2PI, CVTPI2PD, CVTPI2PS. llvm-svn: 43523
- Oct 12, 2007
Arnold Schwaighofer authored
for fastcc from X86CallingConv.td. This means that nested functions are not supported for calling convention 'fastcc'. llvm-svn: 42934
- Oct 11, 2007
Dale Johannesen authored
llvm-svn: 42874
- Oct 06, 2007
Evan Cheng authored
(vextract (v4f32 s2v (f32 load $addr)), 0) -> (f32 load $addr)
(vextract (v4i32 bc (v4f32 s2v (f32 load $addr))), 0) -> (i32 load $addr)
Remove x86-specific patterns. llvm-svn: 42677
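In IR terms, the combined pattern looks roughly like this (hypothetical function, typed-pointer syntax; newer LLVM uses opaque ptr):

    define float @low_elt(float* %p) {
      ; scalar load -> scalar_to_vector -> extract lane 0; the dag
      ; combine collapses the whole chain back to the plain f32 load
      %s = load float, float* %p
      %v = insertelement <4 x float> undef, float %s, i32 0
      %e = extractelement <4 x float> %v, i32 0
      ret float %e
    }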