- Aug 23, 2010
-
-
Bruno Cardoso Lopes authored
llvm-svn: 111837
-
Anton Korobeynikov authored
it's COFF emitter which does not support differences of two symbols (and needs to be fixed). GAS is pretty fine with code produced. llvm-svn: 111801
-
Michael J. Spencer authored
llvm-svn: 111792
-
- Aug 21, 2010
-
-
Bruno Cardoso Lopes authored
llvm-svn: 111704
-
Bruno Cardoso Lopes authored
general idea here is to have a group of x86 target specific nodes which are going to be selected during lowering and then directly matched in isel. The commit includes the addition of those specific nodes and a *bunch* of patterns, and incrementally we're going to switch between them and what we have right now. Both the patterns and target specific nodes can change as we move forward with this work. llvm-svn: 111691
-
- Aug 17, 2010
-
-
Anton Korobeynikov authored
- Do not clobber al during variadic calls, this is AMD64 ABI-only feature - Emit wincall64, where necessary Patch by Cameron Esfahani! llvm-svn: 111289
-
- Aug 14, 2010
-
-
Eric Christopher authored
encoding is correct for the built-in assembler. Based on a patch from Chris. llvm-svn: 111083
-
Chris Lattner authored
llvm-svn: 111073
-
- Aug 13, 2010
-
-
Bruno Cardoso Lopes authored
llvm-svn: 111022
-
- Aug 12, 2010
-
-
Bruno Cardoso Lopes authored
term goal here is to be able to match enough of vector_shuffle and build_vector so all avx intrinsics which aren't mapped to their own built-ins but to shufflevector calls can be codegen'd. This is the first (baby) step, support building zeroed vectors. llvm-svn: 110897
-
- Aug 11, 2010
-
-
Dan Gohman authored
avoids trouble if the return type of TD->getPointerSize() is changed to something which doesn't promote to a signed type, and is simpler anyway. Also, use getCopyFromReg instead of getRegister to read a physical register's value. llvm-svn: 110835
-
Bruno Cardoso Lopes authored
Apply the same approach of SSE4.1 ptest intrinsics but create a new x86 node "testp" since AVX introduces vtest{ps}{pd} instructions which set ZF and CF depending on sign bit AND and ANDN of packed floating-point sources. This is slightly different from what the "ptest" does. Tests comming with the other 256 intrinsics tests. llvm-svn: 110744
-
- Aug 10, 2010
-
-
Bruno Cardoso Lopes authored
llvm-svn: 110645
-
- Aug 06, 2010
-
-
Bruno Cardoso Lopes authored
Support very basic (doesn't include ABI support in the front-end, varags, ...) 256-bit argument passing and return for AVX llvm-svn: 110394
-
- Aug 05, 2010
-
-
Eric Christopher authored
uses. llvm-svn: 110274
-
- Jul 30, 2010
-
-
Bruno Cardoso Lopes authored
declared during the addition of the assembler support, the additional changes are: - Add missing intrinsics - Move all SSE conversion instructions in X86InstInfo64.td to the SSE.td file. - Duplicate some patterns to AVX mode. - Step into PCMPEST/PCMPIST custom inserter and add AVX versions. llvm-svn: 109878
-
- Jul 29, 2010
-
-
Jakob Stoklund Olesen authored
We do sometimes load from a too small stack slot when dealing with x86 arguments (varargs and smaller-than-32-bit args). It looks like we know what we are doing in those cases, so I am going to remove the assert instead of artifically enlarging stack slot sizes. The assert in storeRegToStackSlot stays in. We don't want to write beyond the bounds of a stack slot. llvm-svn: 109764
-
- Jul 28, 2010
-
-
Jakob Stoklund Olesen authored
The size of this object isn't used for anything - technically it is of variable size. This avoids a false positive from the assert in X86InstrInfo::loadRegFromStackSlot, and fixes PR7735. llvm-svn: 109652
-
Nate Begeman authored
This is about 4x faster and smaller than the existing scalarization. llvm-svn: 109566
-
Nate Begeman authored
~40% faster vector shl <4 x i32> on SSE 4.1 Larger improvements for smaller types coming in future patches. For: define <2 x i64> @shl(<4 x i32> %r, <4 x i32> %a) nounwind readnone ssp { entry: %shl = shl <4 x i32> %r, %a ; <<4 x i32>> [#uses=1] %tmp2 = bitcast <4 x i32> %shl to <2 x i64> ; <<2 x i64>> [#uses=1] ret <2 x i64> %tmp2 } We get: _shl: ## @shl pslld $23, %xmm1 paddd LCPI0_0, %xmm1 cvttps2dq %xmm1, %xmm1 pmulld %xmm1, %xmm0 ret Instead of: _shl: ## @shl pshufd $3, %xmm0, %xmm2 movd %xmm2, %eax pshufd $3, %xmm1, %xmm2 movd %xmm2, %ecx shll %cl, %eax movd %eax, %xmm2 pshufd $1, %xmm0, %xmm3 movd %xmm3, %eax pshufd $1, %xmm1, %xmm3 movd %xmm3, %ecx shll %cl, %eax movd %eax, %xmm3 punpckldq %xmm2, %xmm3 movd %xmm0, %eax movd %xmm1, %ecx shll %cl, %eax movd %eax, %xmm2 movhlps %xmm0, %xmm0 movd %xmm0, %eax movhlps %xmm1, %xmm1 movd %xmm1, %ecx shll %cl, %eax movd %eax, %xmm0 punpckldq %xmm0, %xmm2 movdqa %xmm2, %xmm0 punpckldq %xmm3, %xmm0 ret llvm-svn: 109549
-
- Jul 26, 2010
-
-
Evan Cheng authored
llvm-svn: 109450
-
- Jul 24, 2010
-
-
Evan Cheng authored
appropriate for targets without detailed instruction iterineries. The scheduler schedules for increased instruction level parallelism in low register pressure situation; it schedules to reduce register pressure when the register pressure becomes high. On x86_64, this is a win for all tests in CFP2000. It also sped up 256.bzip2 by 16%. llvm-svn: 109300
-
- Jul 23, 2010
-
-
Dale Johannesen authored
SSE, so we can't return floating point values if this is disabled. Detect this error for clang. With SSE1 only, f64 is a problem; it can be done, but neither llvm-gcc nor clang has ever generated correct code for it. Since nobody noticed this I think it's OK to treat it as an error for now. This also handles SSE-sized vectors of floating point. 8207686, 8204109. llvm-svn: 109201
-
- Jul 22, 2010
-
-
Eric Christopher authored
for lowering without sse2. Add a couple of new testcases. Fixes a few libgomp tests and latent bugs. Remove a few todos. llvm-svn: 109078
-
Eric Christopher authored
llvm-svn: 109070
-
- Jul 21, 2010
-
-
Nate Begeman authored
1) all registers were spilled as xmm, regardless of actual size 2) win64 abi doesn't do the varargs-size-in-%al thing Still to look into: xmm6-15 are marked as clobbered by call instructions on win64 even though they aren't. llvm-svn: 109035
-
Eric Christopher authored
the wrong directory. llvm-svn: 109005
-
Eric Christopher authored
Fixes a pile of libgomp failures in the llvm-gcc testsuite due to the libcall not existing. llvm-svn: 109004
-
- Jul 16, 2010
-
-
Evan Cheng authored
-enable-no-nans-fp-math and -enable-no-infs-fp-math. All of the current codegen fp math optimizations only care whether the fp arithmetics arguments and results can never be NaN. llvm-svn: 108465
-
- Jul 15, 2010
-
-
Jakob Stoklund Olesen authored
lowering atomics. This will allow those copies to still be coalesced after TII::isMoveInstr is removed. llvm-svn: 108385
-
- Jul 14, 2010
-
-
Evan Cheng authored
address cannot be allocated a register is in 32-bit mode where the first three arguments are marked inreg. In that case EAX, EDX, and ECX will be used for argument passing. This fixes PR7610. llvm-svn: 108327
-
- Jul 10, 2010
-
-
Dan Gohman authored
- Check getBytesToPopOnReturn(). - Eschew ST0 and ST1 for return values. - Fix the PIC base register initialization so that it doesn't ever fail to end up the top of the entry block. llvm-svn: 108039
-
Jakob Stoklund Olesen authored
it is popped, even if it is ununsed. A CopyFromReg node is too weak to represent the required sideeffect, so insert an FpGET_ST0 instruction directly instead. This will matter when CopyFromReg gets lowered to a generic COPY instruction. llvm-svn: 108037
-
- Jul 09, 2010
-
-
Bob Wilson authored
U utils/TableGen/FastISelEmitter.cpp --- Reverse-merging r107943 into '.': U test/CodeGen/X86/fast-isel.ll U test/CodeGen/X86/fast-isel-loads.ll U include/llvm/Target/TargetLowering.h U include/llvm/Support/PassNameParser.h U include/llvm/CodeGen/FunctionLoweringInfo.h U include/llvm/CodeGen/CallingConvLower.h U include/llvm/CodeGen/FastISel.h U include/llvm/CodeGen/SelectionDAGISel.h U lib/CodeGen/LLVMTargetMachine.cpp U lib/CodeGen/CallingConvLower.cpp U lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp U lib/CodeGen/SelectionDAG/FunctionLoweringInfo.cpp U lib/CodeGen/SelectionDAG/FastISel.cpp U lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp U lib/CodeGen/SelectionDAG/ScheduleDAGSDNodes.cpp U lib/CodeGen/SelectionDAG/InstrEmitter.cpp U lib/CodeGen/SelectionDAG/TargetLowering.cpp U lib/Target/XCore/XCoreISelLowering.cpp U lib/Target/XCore/XCoreISelLowering.h U lib/Target/X86/X86ISelLowering.cpp U lib/Target/X86/X86FastISel.cpp U lib/Target/X86/X86ISelLowering.h llvm-svn: 107987
-
Dan Gohman authored
llvm-svn: 107948
-
Dan Gohman authored
a DBG_VALUE after a terminator, or emitting any instructions before an EH_LABEL. llvm-svn: 107943
-
Chris Lattner authored
like all other instructions, even though a segment is not allowed. This resolves a bunch of gross hacks in the encoder and makes LEA more consistent with the rest of the instruction set. No functionality change. llvm-svn: 107934
-
Chris Lattner authored
X86 memory operand. llvm-svn: 107925
-
- Jul 08, 2010
-
-
Dan Gohman authored
Debug info intrinsics win for now. llvm-svn: 107850
-
Evan Cheng authored
llvm-svn: 107820
-