- Aug 13, 2012
- Craig Topper authored
Tidy up VSETCC lowering code a bit more by adding an llvm_unreachable and putting a couple of if conditions in a better order. llvm-svn: 161746
- Craig Topper authored
llvm-svn: 161745
- Craig Topper authored
llvm-svn: 161743
- Craig Topper authored
Remove the LowerMMXCONCAT_VECTORS function. It could never execute because there are no legal 64-bit vector types that could be used as inputs to a 128-bit concat_vectors. Remove a target specific SDNode and its patterns that become unused as a result. llvm-svn: 161742
- Aug 12, 2012
- Craig Topper authored
llvm-svn: 161738
- Craig Topper authored
Remove an unnecessary call to setOperationAction for SETCC of v2i64 under SSE4.2. It was already called for the same type under SSE2. llvm-svn: 161737
- Craig Topper authored
llvm-svn: 161734
- Craig Topper authored
Use MVT.isXBitVector instead of EVT.isXBitVector when setting up operation actions. Compiles to smaller code. llvm-svn: 161733
- Michael Liao authored
  - FCMOV only supports a subset of X86 conditions; skip boolean simplification if the X86 condition is not valid for FCMOV.
  - Add a minimal test case for PR13577.
  llvm-svn: 161732
- Craig Topper authored
Move setOperationAction for CONCAT_VECTORS of 256-bit vectors into the loop, since all 256-bit types are supported. llvm-svn: 161730
- Aug 10, 2012
- Michael Liao authored
  - If a boolean test (X86ISD::CMP or X86ISD::SUB) checks a boolean value generated from X86ISD::SETCC, try to simplify the boolean value generation and checking by reusing the original EFLAGS with the proper condition code (see the sketch below).
  - Add hooks to the X86-specific SETCC/BRCOND/CMOV, the three major places consuming EFLAGS.
  Part of the patches fixing PR12312. llvm-svn: 161687
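A minimal sketch in C, assuming a typical shape of the pattern (the function and variable names are illustrative, not from the patch):

```c
/* Hypothetical example: the comparison materializes a boolean through
   X86ISD::SETCC, and the branch immediately re-tests that boolean with
   another CMP/SUB. After the simplification, the branch can reuse the
   EFLAGS of the original comparison with the proper condition code. */
int clamp_nonnegative(int x) {
    int is_negative = (x < 0);   /* boolean produced by a SETCC */
    if (is_negative)             /* boolean test of that SETCC result */
        return 0;
    return x;
}
```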
- Michael Liao authored
llvm-svn: 161664
- Joerg Sonnenberger authored
llvm-svn: 161657
- Aug 08, 2012
- Manman Ren authored
  We perform the following:
  1. Use SUB instead of CMP for i8, i16, i32 and i64 in ISel lowering.
  2. Modify MachineCSE to correctly handle implicit defs.
  3. Convert SUB back to CMP if possible at the peephole stage.
  Removed pattern matching of (a > b) ? (a - b) : 0 and the like, since they are handled by the peephole now (see the sketch below). rdar://11873276 llvm-svn: 161462
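A minimal sketch of that pattern, with an illustrative function name: the SUB that computes a - b already sets the flags the a > b test needs, so the peephole can drop the separate CMP.

```c
/* Hypothetical example of the (a > b) ? (a - b) : 0 pattern: the subtraction
   and the comparison share operands, so once the compare is lowered as a SUB
   its flags can be reused instead of emitting a second CMP. */
unsigned saturating_sub(unsigned a, unsigned b) {
    return (a > b) ? (a - b) : 0;
}
```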
- Evan Cheng authored
do so when the high bits are known zero. This caused a subtle miscompilation. rdar://12027825 llvm-svn: 161451
- Aug 06, 2012
- Craig Topper authored
Implement proper handling for pcmpistri/pcmpestri intrinsics. Requires custom handling in DAGISelToDAG due to limitations in TableGen's implicit def handling. Fixes PR11305. llvm-svn: 161318
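For context, a hedged illustration of the SSE4.2 string-compare intrinsics involved (not code from the patch): _mm_cmpistri yields its index in a general-purpose register while also defining EFLAGS, which is what makes the implicit-def handling awkward for TableGen.

```c
#include <nmmintrin.h>   /* SSE4.2 intrinsics; compile with -msse4.2 */

/* Hypothetical example: find the first occurrence of `needle` in a 16-byte
   buffer. This compiles to PCMPISTRI, which returns the index in ECX and
   sets EFLAGS at the same time. */
int find_byte(const char *buf16 /* at least 16 readable bytes */, char needle) {
    __m128i haystack = _mm_loadu_si128((const __m128i *)buf16);
    __m128i set = _mm_set1_epi8(needle);
    /* index of the first matching byte, or 16 if none is found */
    return _mm_cmpistri(set, haystack,
                        _SIDD_UBYTE_OPS | _SIDD_CMP_EQUAL_ANY |
                        _SIDD_LEAST_SIGNIFICANT);
}
```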
- Aug 05, 2012
- Craig Topper authored
llvm-svn: 161306
- Craig Topper authored
Use a COPY node instead of an explicit MOVA opcode in the custom inserter for pcmpestrm/pcmpistrm. This allows the register allocator to handle it better and prevents wasted identity moves. llvm-svn: 161305
- Aug 03, 2012
- Bob Wilson authored
Fast isel doesn't currently have support for translating builtin function calls to target instructions. For embedded environments where the library functions are not available, this is a matter of correctness and not just optimization. Most of this patch is just arranging to make the TargetLibraryInfo available in fast isel. <rdar://problem/12008746> llvm-svn: 161232
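A hedged illustration (names and scenario assumed, not from the patch) of how a builtin library call can appear even when the source contains no explicit call:

```c
/* Hypothetical example: copying a struct may be emitted as a call to the
   memcpy builtin. In a freestanding/embedded build with no libc, the backend
   must either expand such builtins to target instructions or know, via
   TargetLibraryInfo, which library functions are actually available. */
struct packet { unsigned char payload[64]; };

void forward(struct packet *dst, const struct packet *src) {
    *dst = *src;   /* may be lowered through the memcpy builtin */
}
```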
- Aug 01, 2012
- Chad Rosier authored
llvm-svn: 161122
- Elena Demikhovsky authored
llvm-svn: 161110
- Jul 25, 2012
- Rafael Espindola authored
to pop. llvm-svn: 160725
- Jul 23, 2012
- Sylvestre Ledru authored
llvm-svn: 160621
- Jul 17, 2012
- Evan Cheng authored
llvm-svn: 160387
- Evan Cheng authored
llvm-svn: 160354
- Evan Cheng authored
large immediates. Add DAG combine logic to recover in case the large immediate doesn't fit in the cmp immediate operand field. For

    int foo(unsigned long l) { return (l >> 47) == 1; }

we produce

    %shr.mask = and i64 %l, -140737488355328
    %cmp = icmp eq i64 %shr.mask, 140737488355328
    %conv = zext i1 %cmp to i32
    ret i32 %conv

which codegens to

    movq    $0xffff800000000000, %rax
    andq    %rdi, %rax
    movq    $0x0000800000000000, %rcx
    cmpq    %rcx, %rax
    sete    %al
    movzbl  %al, %eax
    ret

TargetLowering::SimplifySetCC would transform (X & -256) == 256 -> (X >> 8) == 1 if the immediate fails the isLegalICmpImmediate() test. For x86, that's immediates which are not signed 32-bit values. Based on a patch by Eli Friedman. PR10328 rdar://9758774 llvm-svn: 160346
- Jul 16, 2012
- Evan Cheng authored
For code like

    uint32_t hi(uint64_t res) {
      uint32_t hi = res >> 32;
      return !hi;
    }

the LLVM IR looks like this:

    define i32 @hi(i64 %res) nounwind uwtable ssp {
    entry:
      %lnot = icmp ult i64 %res, 4294967296
      %lnot.ext = zext i1 %lnot to i32
      ret i32 %lnot.ext
    }

The optimizer has optimized away the right shift and truncate, but the resulting constant is too large to fit in the 32-bit immediate field, so the x86 code is worse as a result:

    movabsq $4294967296, %rax  ## imm = 0x100000000
    cmpq    %rax, %rdi
    sbbl    %eax, %eax
    andl    $1, %eax

This patch teaches the x86 lowering code to handle ult against a large immediate with trailing zeros. It will issue a right shift and a truncate followed by a comparison against a shifted immediate:

    shrq    $32, %rdi
    testl   %edi, %edi
    sete    %al
    movzbl  %al, %eax

It also handles a ugt comparison against a large immediate with trailing bits set, i.e. X > 0x0ffffffff -> (X >> 32) >= 1. rdar://11866926 llvm-svn: 160312
- Jul 15, 2012
- Nadav Rotem authored
llvm-svn: 160234
- Nadav Rotem authored
AVX: Fix a bug in getTargetVShiftNode. The shift amount has to be a 128-bit vector with the same element type as the input vector. This is needed because of the patterns we have for the VP[SLL/SRA/SRL][W/D/Q] instructions. llvm-svn: 160222
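For background, a hedged user-level view of the same constraint (not code from the patch): the variable-shift instructions take their count in the low bits of an XMM register, so the shift-amount operand is itself a 128-bit vector.

```c
#include <emmintrin.h>   /* SSE2 intrinsics */

/* Hypothetical example: _mm_sll_epi32 maps to PSLLD with the count supplied
   in the low 64 bits of an XMM register; all four 32-bit lanes are shifted
   by that same amount. The AVX VPSLLD form follows the same pattern. */
__m128i shift_lanes_left(__m128i v, __m128i count) {
    return _mm_sll_epi32(v, count);
}
```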
- Jul 12, 2012
- Benjamin Kramer authored
Give the rdrand instructions a SideEffect flag and a chain so MachineCSE and MachineLICM don't touch them. I already had the necessary things in place for the IR-level passes but missed the machine passes. llvm-svn: 160137
- Benjamin Kramer authored
The rdrand/cmov sequence is the same as the one emitted by both GCC and ICC. Fixes PR13284. llvm-svn: 160117
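A hedged sketch of how the underlying intrinsic is typically used (the retry loop is illustrative, not part of the commit): RDRAND reports success in the carry flag, and the cmov cleans up the failure case.

```c
#include <immintrin.h>   /* requires a compiler/CPU with RDRAND, e.g. -mrdrnd */

/* Hypothetical example: _rdrand32_step stores a hardware random number and
   returns nonzero on success; the compiler lowers it to RDRAND plus the
   cmov-based fixup of the failure case mentioned above. */
unsigned int next_random(void) {
    unsigned int value = 0;
    while (!_rdrand32_step(&value)) {
        /* the hardware was temporarily unable to deliver entropy; retry */
    }
    return value;
}
```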
- Jul 11, 2012
- Nadav Rotem authored
When ext-loading and trunc-storing vectors to memory on x86 32-bit systems, allow loads/stores of 64-bit values from xmm registers. llvm-svn: 160044
- Jul 10, 2012
- Nadav Rotem authored
Improve the loading of load-anyext vectors by allowing the codegen to load multiple scalars and insert them into a vector. Next, we shuffle the elements into the correct places, as before. Also fix a small dagcombine bug in SimplifyBinOpWithSameOpcodeHands, where the migration of bitcasts happened too late in the SelectionDAG process. llvm-svn: 159991
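A hedged illustration of the kind of widening load affected (assumed shape, not from the patch): each narrow element is loaded as a scalar, inserted into a vector, and then shuffled into place.

```c
/* Hypothetical example: four i8 elements are extended to i32. The codegen
   can now build the vector by loading the scalars and inserting them,
   instead of relying on a single wide (and possibly illegal) vector load. */
void widen4(const unsigned char *src, unsigned int *dst) {
    for (int i = 0; i < 4; ++i)
        dst[i] = src[i];   /* each i8 element is extended to i32 */
}
```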
- Jul 05, 2012
- Jakob Stoklund Olesen authored
Function argument and return value registers aren't part of the encoding, so they should be implicit operands. llvm-svn: 159728
- Jul 04, 2012
- Jakob Stoklund Olesen authored
The CopyToReg nodes that set up the argument registers before a call must be glued to the call instruction. Otherwise, the scheduler may emit the physreg copies long before the call, causing long live ranges for the fixed registers. Besides disabling good register allocation, that can also expose problems when EmitInstrWithCustomInserter() splits a basic block during the live range of a physreg. llvm-svn: 159721
- Jul 01, 2012
- Elena Demikhovsky authored
llvm-svn: 159504
- Jun 29, 2012
- Rafael Espindola authored
Before this patch, in PIC 32-bit code we would add the global base register but not load from that address. This is a really old bug, but before the introduction of the TLS attributes we would never select initial-exec for PIC code. llvm-svn: 159409
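A hedged example of the kind of access involved (the variable and attribute are illustrative): with -fPIC and the initial-exec model, 32-bit code fetches the variable's offset from the GOT through the global base register and must then perform the actual thread-local load.

```c
/* Hypothetical example: force the initial-exec TLS model, as the TLS
   attributes mentioned above now allow even in PIC code. */
__attribute__((tls_model("initial-exec"))) __thread int counter;

int read_counter(void) {
    return counter;   /* GOT-indirect offset fetch, then the load itself */
}
```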
- Jun 26, 2012
- Elena Demikhovsky authored
llvm-svn: 159197
- Bill Wendling authored
llvm-svn: 159196
- Elena Demikhovsky authored
The current patch optimizes frequently used shuffle patterns and gives the following reduction in the instruction sequence.

Before:

    vshufps      $-35, %xmm1, %xmm0, %xmm2  ## xmm2 = xmm0[1,3],xmm1[1,3]
    vpermilps    $-40, %xmm2, %xmm2         ## xmm2 = xmm2[0,2,1,3]
    vextractf128 $1, %ymm1, %xmm1
    vextractf128 $1, %ymm0, %xmm0
    vshufps      $-35, %xmm1, %xmm0, %xmm0  ## xmm0 = xmm0[1,3],xmm1[1,3]
    vpermilps    $-40, %xmm0, %xmm0         ## xmm0 = xmm0[0,2,1,3]
    vinsertf128  $1, %xmm0, %ymm2, %ymm0

After:

    vshufps   $13, %ymm0, %ymm1, %ymm1  ## ymm1 = ymm1[1,3],ymm0[0,0],ymm1[5,7],ymm0[4,4]
    vshufps   $13, %ymm0, %ymm0, %ymm0  ## ymm0 = ymm0[1,3,0,0,5,7,4,4]
    vunpcklps %ymm1, %ymm0, %ymm0       ## ymm0 = ymm0[0],ymm1[0],ymm0[1],ymm1[1],ymm0[4],ymm1[4],ymm0[5],ymm1[5]

llvm-svn: 159188