- Aug 11, 2010
-
-
Dan Gohman authored
avoids trouble if the return type of TD->getPointerSize() is changed to something which doesn't promote to a signed type, and is simpler anyway. Also, use getCopyFromReg instead of getRegister to read a physical register's value. llvm-svn: 110835
-
Bruno Cardoso Lopes authored
Apply the same approach as the SSE4.1 ptest intrinsics, but create a new x86 node "testp", since AVX introduces vtest{ps}{pd} instructions which set ZF and CF depending on the sign-bit AND and ANDN of packed floating-point sources. This is slightly different from what "ptest" does. Tests are coming with the other 256 intrinsics tests. llvm-svn: 110744
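The difference between the two tests can be sketched with a small model. This is an illustration only, not the LLVM lowering: it treats each 128-bit operand as four 32-bit lanes, and the function names `ptest_flags` and `vtestps_flags` are hypothetical. It assumes the flag semantics documented for these instructions: ZF is set when the AND of the operands is zero and CF when the ANDN is zero, with ptest looking at every bit and vtestps only at the sign bit of each lane.

```python
SIGN = 0x80000000      # sign bit of a 32-bit lane
MASK32 = 0xFFFFFFFF    # keep ~d within 32 bits


def ptest_flags(dest, src):
    """Model of ptest: every bit of every lane participates."""
    zf = int(all((s & d) == 0 for d, s in zip(dest, src)))
    cf = int(all((s & ~d & MASK32) == 0 for d, s in zip(dest, src)))
    return zf, cf


def vtestps_flags(dest, src):
    """Model of vtestps: only the sign bit of each lane participates."""
    zf = int(all((s & d & SIGN) == 0 for d, s in zip(dest, src)))
    cf = int(all((s & ~d & SIGN) == 0 for d, s in zip(dest, src)))
    return zf, cf


# Same operands, different ZF: the low bit matters to ptest but not to vtestps.
low_bit = [0x00000001, 0, 0, 0]
print(ptest_flags(low_bit, low_bit))
print(vtestps_flags(low_bit, low_bit))
```

Because the two instructions can disagree on the same inputs, reusing the existing "ptest" node for vtestps would not be correct, which is consistent with the commit introducing a separate node.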
-
- Aug 10, 2010
-
-
Bruno Cardoso Lopes authored
llvm-svn: 110645
-
- Aug 06, 2010
-
-
Bruno Cardoso Lopes authored
Support very basic (doesn't include ABI support in the front-end, varargs, ...) 256-bit argument passing and return for AVX. llvm-svn: 110394
-
- Aug 05, 2010
-
-
Eric Christopher authored
uses. llvm-svn: 110274
-
- Jul 30, 2010
-
-
Bruno Cardoso Lopes authored
declared during the addition of the assembler support, the additional changes are:
  - Add missing intrinsics
  - Move all SSE conversion instructions in X86InstInfo64.td to the SSE.td file.
  - Duplicate some patterns to AVX mode.
  - Step into PCMPEST/PCMPIST custom inserter and add AVX versions.
llvm-svn: 109878
-
- Jul 29, 2010
-
-
Jakob Stoklund Olesen authored
We do sometimes load from a too-small stack slot when dealing with x86 arguments (varargs and smaller-than-32-bit args). It looks like we know what we are doing in those cases, so I am going to remove the assert instead of artificially enlarging stack slot sizes. The assert in storeRegToStackSlot stays in. We don't want to write beyond the bounds of a stack slot. llvm-svn: 109764
-
- Jul 28, 2010
-
-
Jakob Stoklund Olesen authored
The size of this object isn't used for anything - technically it is of variable size. This avoids a false positive from the assert in X86InstrInfo::loadRegFromStackSlot, and fixes PR7735. llvm-svn: 109652
-
Nate Begeman authored
This is about 4x faster and smaller than the existing scalarization. llvm-svn: 109566
-
Nate Begeman authored
~40% faster vector shl <4 x i32> on SSE 4.1
Larger improvements for smaller types coming in future patches.

For:

  define <2 x i64> @shl(<4 x i32> %r, <4 x i32> %a) nounwind readnone ssp {
  entry:
    %shl = shl <4 x i32> %r, %a ; <<4 x i32>> [#uses=1]
    %tmp2 = bitcast <4 x i32> %shl to <2 x i64> ; <<2 x i64>> [#uses=1]
    ret <2 x i64> %tmp2
  }

We get:

  _shl: ## @shl
    pslld $23, %xmm1
    paddd LCPI0_0, %xmm1
    cvttps2dq %xmm1, %xmm1
    pmulld %xmm1, %xmm0
    ret

Instead of:

  _shl: ## @shl
    pshufd $3, %xmm0, %xmm2
    movd %xmm2, %eax
    pshufd $3, %xmm1, %xmm2
    movd %xmm2, %ecx
    shll %cl, %eax
    movd %eax, %xmm2
    pshufd $1, %xmm0, %xmm3
    movd %xmm3, %eax
    pshufd $1, %xmm1, %xmm3
    movd %xmm3, %ecx
    shll %cl, %eax
    movd %eax, %xmm3
    punpckldq %xmm2, %xmm3
    movd %xmm0, %eax
    movd %xmm1, %ecx
    shll %cl, %eax
    movd %eax, %xmm2
    movhlps %xmm0, %xmm0
    movd %xmm0, %eax
    movhlps %xmm1, %xmm1
    movd %xmm1, %ecx
    shll %cl, %eax
    movd %eax, %xmm0
    punpckldq %xmm0, %xmm2
    movdqa %xmm2, %xmm0
    punpckldq %xmm3, %xmm0
    ret

llvm-svn: 109549
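The short sequence appears to build 2^a in each lane through the float exponent field and then multiply. The sketch below models that idea lane by lane. It is an assumption-laden illustration: the listing does not show the contents of LCPI0_0, so the constant is assumed here to be the bit pattern of 1.0f (0x3F800000) in every lane, and the helper names are hypothetical. Shift amounts are limited to 0..31, since larger amounts have no defined result for a 32-bit shl.

```python
import struct

ONE_F32_BITS = 0x3F800000  # assumed value of LCPI0_0 per lane: bits of 1.0f


def bits_to_float(bits):
    """Reinterpret a 32-bit pattern as an IEEE-754 single."""
    return struct.unpack("<f", struct.pack("<I", bits & 0xFFFFFFFF))[0]


def shl_lane(r, a):
    """Compute (r << a) mod 2**32 the way the four-instruction sequence does."""
    assert 0 <= a <= 31
    # pslld $23 moves the shift amount into the exponent field;
    # paddd adds the exponent bias, giving the float 2.0**a.
    pow2_bits = (a << 23) + ONE_F32_BITS
    # cvttps2dq truncates the float to an integer. For a == 31 the hardware
    # result is 0x80000000, the same bit pattern as the 2**31 used here.
    pow2 = int(bits_to_float(pow2_bits))
    # pmulld keeps the low 32 bits of the product.
    return (r * pow2) & 0xFFFFFFFF


def shl_v4i32(r, a):
    """Apply the per-lane shift to four lanes."""
    return [shl_lane(x, s) for x, s in zip(r, a)]


print(shl_v4i32([1, 3, 0xFFFFFFFF, 5], [0, 4, 31, 1]))
```

The point of the trick is that SSE4.1 has a packed 32-bit multiply (pmulld) but no per-lane variable shift, so turning the shift into a multiply avoids extracting and reinserting each element.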
-
- Jul 26, 2010
-
-
Evan Cheng authored
llvm-svn: 109450
-
- Jul 24, 2010
-
-
Evan Cheng authored
appropriate for targets without detailed instruction itineraries. The scheduler schedules for increased instruction-level parallelism in low register pressure situations; it schedules to reduce register pressure when the register pressure becomes high. On x86_64, this is a win for all tests in CFP2000. It also sped up 256.bzip2 by 16%. llvm-svn: 109300
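The two-mode policy described above can be illustrated with a toy selection function. This is a sketch of the general idea only and not the scheduler added by this commit: the `Node` fields, the `reg_limit` threshold, and the tie-breaking rules are all hypothetical choices made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    name: str
    height: int  # hypothetical: length of the longest dependence chain through the node
    defs: int    # hypothetical: registers that become live when it is scheduled
    kills: int   # hypothetical: registers that are freed when it is scheduled


def pick_next(ready, live_regs, reg_limit):
    """Pick the next node from the ready list under a two-mode policy."""
    if live_regs >= reg_limit:
        # High pressure: prefer the node that frees the most registers on net.
        return min(ready, key=lambda n: (n.defs - n.kills, -n.height))
    # Low pressure: prefer the node on the longest chain, to expose parallelism.
    return max(ready, key=lambda n: (n.height, n.kills - n.defs))


ready = [Node("load", height=5, defs=1, kills=0),
         Node("add", height=2, defs=1, kills=2)]
print(pick_next(ready, live_regs=2, reg_limit=8).name)
print(pick_next(ready, live_regs=8, reg_limit=8).name)
```

With few live registers the long-latency chain is started first; once the limit is reached, the node that releases registers wins instead.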
-
- Jul 23, 2010
-
-
Dale Johannesen authored
SSE, so we can't return floating point values if this is disabled. Detect this error for clang. With SSE1 only, f64 is a problem; it can be done, but neither llvm-gcc nor clang has ever generated correct code for it. Since nobody noticed this I think it's OK to treat it as an error for now. This also handles SSE-sized vectors of floating point. 8207686, 8204109. llvm-svn: 109201
-
- Jul 22, 2010
-
-
Eric Christopher authored
for lowering without sse2. Add a couple of new testcases. Fixes a few libgomp tests and latent bugs. Remove a few todos. llvm-svn: 109078
-
Eric Christopher authored
llvm-svn: 109070
-
- Jul 21, 2010
-
-
Nate Begeman authored
1) all registers were spilled as xmm, regardless of actual size
2) win64 abi doesn't do the varargs-size-in-%al thing
Still to look into: xmm6-15 are marked as clobbered by call instructions on win64 even though they aren't.
llvm-svn: 109035
-
Eric Christopher authored
the wrong directory. llvm-svn: 109005
-
Eric Christopher authored
Fixes a pile of libgomp failures in the llvm-gcc testsuite due to the libcall not existing. llvm-svn: 109004
-
- Jul 16, 2010
-
-
Evan Cheng authored
-enable-no-nans-fp-math and -enable-no-infs-fp-math. All of the current codegen fp math optimizations only care whether the fp arithmetic arguments and results can never be NaN. llvm-svn: 108465
-
- Jul 15, 2010
-
-
Jakob Stoklund Olesen authored
lowering atomics. This will allow those copies to still be coalesced after TII::isMoveInstr is removed. llvm-svn: 108385
-
- Jul 14, 2010
-
-
Evan Cheng authored
address cannot be allocated a register is in 32-bit mode where the first three arguments are marked inreg. In that case EAX, EDX, and ECX will be used for argument passing. This fixes PR7610. llvm-svn: 108327
-
- Jul 10, 2010
-
-
Dan Gohman authored
  - Check getBytesToPopOnReturn().
  - Eschew ST0 and ST1 for return values.
  - Fix the PIC base register initialization so that it doesn't ever fail to end up at the top of the entry block.
llvm-svn: 108039
-
Jakob Stoklund Olesen authored
it is popped, even if it is unused. A CopyFromReg node is too weak to represent the required side effect, so insert an FpGET_ST0 instruction directly instead. This will matter when CopyFromReg gets lowered to a generic COPY instruction. llvm-svn: 108037
-
- Jul 09, 2010
-
-
Bob Wilson authored
U utils/TableGen/FastISelEmitter.cpp
--- Reverse-merging r107943 into '.':
U test/CodeGen/X86/fast-isel.ll
U test/CodeGen/X86/fast-isel-loads.ll
U include/llvm/Target/TargetLowering.h
U include/llvm/Support/PassNameParser.h
U include/llvm/CodeGen/FunctionLoweringInfo.h
U include/llvm/CodeGen/CallingConvLower.h
U include/llvm/CodeGen/FastISel.h
U include/llvm/CodeGen/SelectionDAGISel.h
U lib/CodeGen/LLVMTargetMachine.cpp
U lib/CodeGen/CallingConvLower.cpp
U lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
U lib/CodeGen/SelectionDAG/FunctionLoweringInfo.cpp
U lib/CodeGen/SelectionDAG/FastISel.cpp
U lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp
U lib/CodeGen/SelectionDAG/ScheduleDAGSDNodes.cpp
U lib/CodeGen/SelectionDAG/InstrEmitter.cpp
U lib/CodeGen/SelectionDAG/TargetLowering.cpp
U lib/Target/XCore/XCoreISelLowering.cpp
U lib/Target/XCore/XCoreISelLowering.h
U lib/Target/X86/X86ISelLowering.cpp
U lib/Target/X86/X86FastISel.cpp
U lib/Target/X86/X86ISelLowering.h
llvm-svn: 107987
-
Dan Gohman authored
llvm-svn: 107948
-
Dan Gohman authored
a DBG_VALUE after a terminator, or emitting any instructions before an EH_LABEL. llvm-svn: 107943
-
Chris Lattner authored
like all other instructions, even though a segment is not allowed. This resolves a bunch of gross hacks in the encoder and makes LEA more consistent with the rest of the instruction set. No functionality change. llvm-svn: 107934
-
Chris Lattner authored
X86 memory operand. llvm-svn: 107925
-
- Jul 08, 2010
-
-
Dan Gohman authored
Debug info intrinsics win for now. llvm-svn: 107850
-
Evan Cheng authored
llvm-svn: 107820
-
- Jul 07, 2010
-
-
Dan Gohman authored
a bunch of stuff, to allow the target-independent calling convention logic to be employed. llvm-svn: 107800
-
Dan Gohman authored
instance, rather than pointers to all of FunctionLoweringInfo's members. This eliminates an NDEBUG ABI sensitivity. llvm-svn: 107789
-
Dan Gohman authored
code can do calling-convention queries. This obviates OutputArgReg. llvm-svn: 107786
-
Dale Johannesen authored
print the (%rip) only if the 'a' modifier is present. PR 7528. llvm-svn: 107727
-
Dan Gohman authored
SelectBasicBlock doesn't need its BasicBlock argument. llvm-svn: 107712
-
Devang Patel authored
llvm-svn: 107710
-
- Jul 06, 2010
-
-
Dan Gohman authored
the block before calling the expansion hook. And don't put EFLAGS in a mbb's live-in list twice. llvm-svn: 107691
-
Dan Gohman authored
llvm-svn: 107668
-
Dan Gohman authored
the pseudo instruction is not at the end of the block. llvm-svn: 107655
-
Eric Christopher authored
registers. Split out testcases per architecture and os now. Patch from Nelson Elhage. llvm-svn: 107640
-