- Jul 19, 2012
-
-
Bill Wendling authored
llvm-svn: 160477
-
- Jul 18, 2012
-
-
Manman Ren authored
Updated OptimizeCompare in peephole to remove redundant cmp against zero. We only remove Compare if CF and OF are not used. rdar://11855129 llvm-svn: 160454
-
Preston Gurd authored
when run on an Intel Atom processor. The failures have arisen due to changes elsewhere in the trunk over the past 8 weeks or so. These failures were not detected by the Atom buildbot because the CPU on the Atom buildbot was not being detected as an Atom CPU. The fix for this problem is in Host.cpp and X86Subtarget.cpp, but shall remain commented out until the current set of Atom test failures are fixed. Patch by Andy Zhang and Tyler Nowicki! llvm-svn: 160451
-
Nadav Rotem authored
load source operand is used by multiple nodes. The v2i64 broadcast was emulated by shuffling the two lower i32 elements to the upper two. We had a bug in the immediate used for the broadcast. Replacing 0 to 0x44. 0x44 means [01|00|01|00] which corresponds to the correct lane. Patch by Michael Kuperstein. llvm-svn: 160430
-
Craig Topper authored
llvm-svn: 160425
-
Craig Topper authored
llvm-svn: 160423
-
Craig Topper authored
Make x86 asm parser to check for xmm vs ymm for index register in gather instructions. Also fix Intel syntax for gather instructions to use 'DWORD PTR' or 'QWORD PTR' to match gas. llvm-svn: 160420
-
- Jul 17, 2012
-
-
Evan Cheng authored
llvm-svn: 160387
-
Evan Cheng authored
llvm-svn: 160354
-
Evan Cheng authored
large immediates. Add dag combine logic to recover in case the large immediates doesn't fit in cmp immediate operand field. int foo(unsigned long l) { return (l>> 47) == 1; } we produce %shr.mask = and i64 %l, -140737488355328 %cmp = icmp eq i64 %shr.mask, 140737488355328 %conv = zext i1 %cmp to i32 ret i32 %conv which codegens to movq $0xffff800000000000,%rax andq %rdi,%rax movq $0x0000800000000000,%rcx cmpq %rcx,%rax sete %al movzbl %al,%eax ret TargetLowering::SimplifySetCC would transform (X & -256) == 256 -> (X >> 8) == 1 if the immediate fails the isLegalICmpImmediate() test. For x86, that's immediates which are not a signed 32-bit immediate. Based on a patch by Eli Friedman. PR10328 rdar://9758774 llvm-svn: 160346
-
- Jul 16, 2012
-
-
Evan Cheng authored
uint32_t hi(uint64_t res) { uint_32t hi = res >> 32; return !hi; } llvm IR looks like this: define i32 @hi(i64 %res) nounwind uwtable ssp { entry: %lnot = icmp ult i64 %res, 4294967296 %lnot.ext = zext i1 %lnot to i32 ret i32 %lnot.ext } The optimizer has optimize away the right shift and truncate but the resulting constant is too large to fit in the 32-bit immediate field. The resulting x86 code is worse as a result: movabsq $4294967296, %rax ## imm = 0x100000000 cmpq %rax, %rdi sbbl %eax, %eax andl $1, %eax This patch teaches the x86 lowering code to handle ult against a large immediate with trailing zeros. It will issue a right shift and a truncate followed by a comparison against a shifted immediate. shrq $32, %rdi testl %edi, %edi sete %al movzbl %al, %eax It also handles a ugt comparison against a large immediate with trailing bits set. i.e. X > 0x0ffffffff -> (X >> 32) >= 1 rdar://11866926 llvm-svn: 160312
-
Chad Rosier authored
llvm-svn: 160293
-
Nadav Rotem authored
undef virtual register. The problem is that ProcessImplicitDefs removes the definition of the register and marks all uses as undef. If we lose the undef marker then we get a register which has no def, is not marked as undef. The live interval analysis does not collect information for these virtual registers and we crash in later passes. Together with Michael Kuperstein <michael.m.kuperstein@intel.com> llvm-svn: 160260
-
Alexey Samsonov authored
It is intended to fix PR11468. Old prologue and epilogue looked like this: push %rbp mov %rsp, %rbp and $alignment, %rsp push %r14 push %r15 ... pop %r15 pop %r14 mov %rbp, %rsp pop %rbp The problem was to reference the locations of callee-saved registers in exception handling: locations of callee-saved had to be re-calculated regarding the stack alignment operation. It would take some effort to implement this in LLVM, as currently MachineLocation can only have the form "Register + Offset". Funciton prologue and epilogue are now changed to: push %rbp mov %rsp, %rbp push %14 push %15 and $alignment, %rsp ... lea -$size_of_saved_registers(%rbp), %rsp pop %r15 pop %r14 pop %rbp Reviewed by Chad Rosier. llvm-svn: 160248
-
- Jul 15, 2012
-
-
Nadav Rotem authored
llvm-svn: 160234
-
Nadav Rotem authored
Allow the folding of vbroadcastRR to vbroadcastRM, where the memory operand is a spill slot. PR12782. Together with Michael Kuperstein <michael.m.kuperstein@intel.com> llvm-svn: 160230
-
Nadav Rotem authored
AVX: Fix a bug in getTargetVShiftNode. The shift amount has to be a 128bit vector with the same element type as the input vector. This is needed because of the patterns we have for the VP[SLL/SRA/SRL][W/D/Q] instructions. llvm-svn: 160222
-
- Jul 13, 2012
-
-
Benjamin Kramer authored
llvm-svn: 160173
-
Craig Topper authored
llvm-svn: 160162
-
- Jul 12, 2012
-
-
Benjamin Kramer authored
Give the rdrand instructions a SideEffect flag and a chain so MachineCSE and MachineLICM don't touch it. I already had the necessary things in place for IR-level passes but missed the machine passes. llvm-svn: 160137
-
Benjamin Kramer authored
The rdrand/cmov sequence is the same that is emitted by both GCC and ICC. Fixes PR13284. llvm-svn: 160117
-
Craig Topper authored
llvm-svn: 160110
-
- Jul 11, 2012
-
-
Chad Rosier authored
comments. llvm-svn: 160069
-
Manman Ren authored
When Movr0 is between sub and cmp, we move Movr0 before sub if it enables removal of Cmp. llvm-svn: 160066
-
Chad Rosier authored
to Selection DAG isel. Patch by Andrew Kaylor <andrew.kaylor@intel.com>. llvm-svn: 160055
-
Nadav Rotem authored
When ext-loading and trunc-storing vectors to memory, on x86 32bit systems, allow loads/stores of 64bit values from xmm registers. llvm-svn: 160044
-
- Jul 10, 2012
-
-
Chad Rosier authored
X86MachineFunctionInfo as this is currently only used by X86. If this ever becomes an issue on another arch (e.g., ARM) then we can hoist it back out. llvm-svn: 160009
-
Chad Rosier authored
X86. Basically, this is a reapplication of r158087 with a few fixes. Specifically, (1) the stack pointer is restored from the base pointer before popping callee-saved registers and (2) in obscure cases (see comments in patch) we must cache the value of the original stack adjustment in the prologue and apply it in the epilogue. rdar://11496434 llvm-svn: 160002
-
Nadav Rotem authored
Improve the loading of load-anyext vectors by allowing the codegen to load multiple scalars and insert them into a vector. Next, we shuffle the elements into the correct places, as before. Also fix a small dagcombine bug in SimplifyBinOpWithSameOpcodeHands, when the migration of bitcasts happened too late in the SelectionDAG process. llvm-svn: 159991
-
Craig Topper authored
llvm-svn: 159983
-
- Jul 09, 2012
-
-
Manman Ren authored
getCondFromSETOpc, getCondFromCMovOpc, getSETFromCond, getCMovFromCond No functional change intended. If we want to update the condition code of CMOV|SET|Jcc, we first analyze the opcode to get the condition code, then update the condition code, finally synthesize the new opcode form the new condition code. llvm-svn: 159955
-
- Jul 07, 2012
-
-
Andrew Trick authored
subtarget CPU descriptions and support new features of MachineScheduler. MachineModel has three categories of data: 1) Basic properties for coarse grained instruction cost model. 2) Scheduler Read/Write resources for simple per-opcode and operand cost model (TBD). 3) Instruction itineraties for detailed per-cycle reservation tables. These will all live side-by-side. Any subtarget can use any combination of them. Instruction itineraries will not change in the near term. In the long run, I expect them to only be relevant for in-order VLIW machines that have complex contraints and require a precise scheduling/bundling model. Once itineraries are only actively used by VLIW-ish targets, they could be replaced by something more appropriate for those targets. This tablegen backend rewrite sets things up for introducing MachineModel type #2: per opcode/operand cost model. llvm-svn: 159891
-
Manman Ren authored
It is safe if EFLAGS is killed or re-defined. When we are done with the basic block, check whether EFLAGS is live-out. Do not optimize away cmp if EFLAGS is live-out. llvm-svn: 159888
-
- Jul 06, 2012
-
-
Manman Ren authored
For each Cmp, we check whether there is an earlier Sub which make Cmp redundant. We handle the case where SUB operates on the same source operands as Cmp, including the case where the two source operands are swapped. llvm-svn: 159838
-
- Jul 05, 2012
-
-
Jakob Stoklund Olesen authored
Function argument and return value registers aren't part of the encoding, so they should be implicit operands. llvm-svn: 159728
-
- Jul 04, 2012
-
-
Jakob Stoklund Olesen authored
The CopyToReg nodes that set up the argument registers before a call must be glued to the call instruction. Otherwise, the scheduler may emit the physreg copies long before the call, causing long live ranges for the fixed registers. Besides disabling good register allocation, that can also expose problems when EmitInstrWithCustomInserter() splits a basic block during the live range of a physreg. llvm-svn: 159721
-
Jakob Stoklund Olesen authored
Implement the TII hooks needed by EarlyIfConversion to create cmov instructions and estimate their latency. Early if-conversion is still not enabled by default. llvm-svn: 159695
-
- Jul 03, 2012
-
-
Craig Topper authored
llvm-svn: 159647
-
Craig Topper authored
llvm-svn: 159646
-
Craig Topper authored
Add aliases for pblendvb, blendvpd, and blendvps instructions with the implicit xmm0 operand specified. Fixes PR13252. llvm-svn: 159644
-