- Jul 28, 2012
-
-
Craig Topper authored
llvm-svn: 160922
-
Craig Topper authored
llvm-svn: 160921
-
Manman Ren authored
Machine CSE and other optimizations can remove instructions so folding is possible at peephole while not possible at ISel. rdar://10554090 and rdar://11873276 llvm-svn: 160919
-
Craig Topper authored
llvm-svn: 160914
-
Craig Topper authored
llvm-svn: 160913
-
Manman Ren authored
It is possible that an instruction can use and update EFLAGS. When checking the safety, we should check the usage of EFLAGS first before declaring it is safe to optimize due to the update. llvm-svn: 160912
-
- Jul 27, 2012
-
-
Jakob Stoklund Olesen authored
llvm-svn: 160833
-
Jakob Stoklund Olesen authored
I'll remove these two sub-register indexes shortly. llvm-svn: 160831
-
Jakob Stoklund Olesen authored
The (COPY_TO_REGCLASS GR32:$src, VR128) pattern looks odd, but copyPhysReg does the right thing with it. (The old pattern would eventually produce the same cross-class copy). llvm-svn: 160830
-
Jakob Stoklund Olesen authored
This gets rid of some more INSERT_SUBREG - IMPLICIT_DEF patterns, simplifying the emitted code a bit. llvm-svn: 160820
-
Jakob Stoklund Olesen authored
The SUBREG_TO_REG instruction has magic semantics asserting that the source value was defined by an instruction that cleared the high half of the register. Those semantics are never actually exploited for xmm registers. llvm-svn: 160818
-
- Jul 26, 2012
-
-
Jakob Stoklund Olesen authored
These idempotent sub-register indices don't do anything --- They simply map XMM registers to themselves. They no longer affect register classes either since the SubRegClasses field has been removed from Target.td. This patch replaces XMM->XMM EXTRACT_SUBREG and INSERT_SUBREG patterns with COPY_TO_REGCLASS patterns which simply become COPY instructions. The number of IMPLICIT_DEF instructions before register allocation is reduced, and that is the cause of the test case changes. llvm-svn: 160816
-
Craig Topper authored
llvm-svn: 160775
-
- Jul 25, 2012
-
-
Rafael Espindola authored
llvm-svn: 160731
-
Rafael Espindola authored
to pop. llvm-svn: 160725
-
Rafael Espindola authored
change. llvm-svn: 160724
-
- Jul 24, 2012
-
-
Kevin Enderby authored
if Condition Is Met instuctions that was not correctly determining the target instruction. So for a jne rel32 instruction: % cat x.s .byte 0x0f, 0x85, 0x09, 0x00, 0x00, 0x00 % as x.s it was incorrectly deterining the target: % otool -q -tv a.out a.out: (__TEXT,__text) section 0000000000000000 jne 0xd and with the fix it gets this correct as: % otool -q -tv a.out a.out: (__TEXT,__text) section 0000000000000000 jne 0xf rdar://11505997 llvm-svn: 160694
-
David Chisnall authored
are targeting an ELF platform. Only fold gs-relative (and fs-relative) loads if it is actually sensible to do so for the target platform. This fixes PR13438. llvm-svn: 160687
-
- Jul 23, 2012
-
-
Sylvestre Ledru authored
llvm-svn: 160621
-
- Jul 20, 2012
-
-
Craig Topper authored
Don't use implicit register operands to calculate L-bit for AVX instructions. Needed because super reg defs and kills are added as implicit operands on 128-bit instructions. Fixes PR13349. Patch by Jose Fonseca. llvm-svn: 160543
-
- Jul 19, 2012
-
-
Preston Gurd authored
Atom buildbot will auto-detect Atom. llvm-svn: 160521
-
Bill Wendling authored
llvm-svn: 160479
-
Bill Wendling authored
llvm-svn: 160477
-
- Jul 18, 2012
-
-
Manman Ren authored
Updated OptimizeCompare in peephole to remove redundant cmp against zero. We only remove Compare if CF and OF are not used. rdar://11855129 llvm-svn: 160454
-
Preston Gurd authored
when run on an Intel Atom processor. The failures have arisen due to changes elsewhere in the trunk over the past 8 weeks or so. These failures were not detected by the Atom buildbot because the CPU on the Atom buildbot was not being detected as an Atom CPU. The fix for this problem is in Host.cpp and X86Subtarget.cpp, but shall remain commented out until the current set of Atom test failures are fixed. Patch by Andy Zhang and Tyler Nowicki! llvm-svn: 160451
-
Nadav Rotem authored
load source operand is used by multiple nodes. The v2i64 broadcast was emulated by shuffling the two lower i32 elements to the upper two. We had a bug in the immediate used for the broadcast. Replacing 0 to 0x44. 0x44 means [01|00|01|00] which corresponds to the correct lane. Patch by Michael Kuperstein. llvm-svn: 160430
-
Craig Topper authored
llvm-svn: 160425
-
Craig Topper authored
llvm-svn: 160423
-
Craig Topper authored
Make x86 asm parser to check for xmm vs ymm for index register in gather instructions. Also fix Intel syntax for gather instructions to use 'DWORD PTR' or 'QWORD PTR' to match gas. llvm-svn: 160420
-
- Jul 17, 2012
-
-
Evan Cheng authored
llvm-svn: 160387
-
Evan Cheng authored
llvm-svn: 160354
-
Evan Cheng authored
large immediates. Add dag combine logic to recover in case the large immediates doesn't fit in cmp immediate operand field. int foo(unsigned long l) { return (l>> 47) == 1; } we produce %shr.mask = and i64 %l, -140737488355328 %cmp = icmp eq i64 %shr.mask, 140737488355328 %conv = zext i1 %cmp to i32 ret i32 %conv which codegens to movq $0xffff800000000000,%rax andq %rdi,%rax movq $0x0000800000000000,%rcx cmpq %rcx,%rax sete %al movzbl %al,%eax ret TargetLowering::SimplifySetCC would transform (X & -256) == 256 -> (X >> 8) == 1 if the immediate fails the isLegalICmpImmediate() test. For x86, that's immediates which are not a signed 32-bit immediate. Based on a patch by Eli Friedman. PR10328 rdar://9758774 llvm-svn: 160346
-
- Jul 16, 2012
-
-
Evan Cheng authored
uint32_t hi(uint64_t res) { uint_32t hi = res >> 32; return !hi; } llvm IR looks like this: define i32 @hi(i64 %res) nounwind uwtable ssp { entry: %lnot = icmp ult i64 %res, 4294967296 %lnot.ext = zext i1 %lnot to i32 ret i32 %lnot.ext } The optimizer has optimize away the right shift and truncate but the resulting constant is too large to fit in the 32-bit immediate field. The resulting x86 code is worse as a result: movabsq $4294967296, %rax ## imm = 0x100000000 cmpq %rax, %rdi sbbl %eax, %eax andl $1, %eax This patch teaches the x86 lowering code to handle ult against a large immediate with trailing zeros. It will issue a right shift and a truncate followed by a comparison against a shifted immediate. shrq $32, %rdi testl %edi, %edi sete %al movzbl %al, %eax It also handles a ugt comparison against a large immediate with trailing bits set. i.e. X > 0x0ffffffff -> (X >> 32) >= 1 rdar://11866926 llvm-svn: 160312
-
Chad Rosier authored
llvm-svn: 160293
-
Nadav Rotem authored
undef virtual register. The problem is that ProcessImplicitDefs removes the definition of the register and marks all uses as undef. If we lose the undef marker then we get a register which has no def, is not marked as undef. The live interval analysis does not collect information for these virtual registers and we crash in later passes. Together with Michael Kuperstein <michael.m.kuperstein@intel.com> llvm-svn: 160260
-
Alexey Samsonov authored
It is intended to fix PR11468. Old prologue and epilogue looked like this: push %rbp mov %rsp, %rbp and $alignment, %rsp push %r14 push %r15 ... pop %r15 pop %r14 mov %rbp, %rsp pop %rbp The problem was to reference the locations of callee-saved registers in exception handling: locations of callee-saved had to be re-calculated regarding the stack alignment operation. It would take some effort to implement this in LLVM, as currently MachineLocation can only have the form "Register + Offset". Funciton prologue and epilogue are now changed to: push %rbp mov %rsp, %rbp push %14 push %15 and $alignment, %rsp ... lea -$size_of_saved_registers(%rbp), %rsp pop %r15 pop %r14 pop %rbp Reviewed by Chad Rosier. llvm-svn: 160248
-
- Jul 15, 2012
-
-
Nadav Rotem authored
llvm-svn: 160234
-
Nadav Rotem authored
Allow the folding of vbroadcastRR to vbroadcastRM, where the memory operand is a spill slot. PR12782. Together with Michael Kuperstein <michael.m.kuperstein@intel.com> llvm-svn: 160230
-
Nadav Rotem authored
AVX: Fix a bug in getTargetVShiftNode. The shift amount has to be a 128bit vector with the same element type as the input vector. This is needed because of the patterns we have for the VP[SLL/SRA/SRL][W/D/Q] instructions. llvm-svn: 160222
-
- Jul 13, 2012
-
-
Benjamin Kramer authored
llvm-svn: 160173
-