  1. Jul 19, 2012
  2. Jul 18, 2012
  3. Jul 17, 2012
  4. Jul 16, 2012
    • For something like · 75315b87
      Evan Cheng authored
      uint32_t hi(uint64_t res)
      {
              uint32_t hi = res >> 32;
              return !hi;
      }
      
      llvm IR looks like this:
      define i32 @hi(i64 %res) nounwind uwtable ssp {
      entry:
        %lnot = icmp ult i64 %res, 4294967296
        %lnot.ext = zext i1 %lnot to i32
        ret i32 %lnot.ext
      }
      
      The optimizer has optimized away the right shift and truncate, but the resulting
      constant is too large to fit in the 32-bit immediate field. The generated x86
      code is worse as a result:
              movabsq $4294967296, %rax       ## imm = 0x100000000
              cmpq    %rax, %rdi
              sbbl    %eax, %eax
              andl    $1, %eax
      
      This patch teaches the x86 lowering code to handle ult against a large immediate
      with trailing zeros. It will issue a right shift and a truncate followed by
      a comparison against a shifted immediate.
              shrq    $32, %rdi
              testl   %edi, %edi
              sete    %al
              movzbl  %al, %eax
      
      It also handles a ugt comparison against a large immediate with trailing bits
      set, e.g. X > 0x0ffffffff -> (X >> 32) >= 1
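
The two rewrites described in this message rest on simple unsigned identities. The following is a minimal standalone sketch of those identities, not the x86 lowering code itself; the helper names `ultViaShift` and `ugtViaShift` are hypothetical and chosen only for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helpers illustrating the identities; this is not LLVM code.

// X <u C, where the low `shift` bits of C are all zero,
// is equivalent to (X >> shift) <u (C >> shift).
bool ultViaShift(uint64_t x, uint64_t c, unsigned shift) {
  assert(shift > 0 && shift < 64);
  uint64_t mask = (uint64_t{1} << shift) - 1;
  assert((c & mask) == 0 && "low bits of C must be zero");
  return (x >> shift) < (c >> shift);
}

// X >u C, where the low `shift` bits of C are all one,
// is equivalent to (X >> shift) >u (C >> shift).
bool ugtViaShift(uint64_t x, uint64_t c, unsigned shift) {
  assert(shift > 0 && shift < 64);
  uint64_t mask = (uint64_t{1} << shift) - 1;
  assert((c & mask) == mask && "low bits of C must be one");
  return (x >> shift) > (c >> shift);
}
```

With a shift of 32 the shifted constant fits in a 32-bit immediate, which is the point of the transformation: `ultViaShift(x, 0x100000000, 32)` compares `x >> 32` against 1, and `ugtViaShift(x, 0xffffffff, 32)` compares `x >> 32` against 0.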
      
      rdar://11866926
      
      llvm-svn: 160312
    • With r160248 in place this code is no longer needed. · 10e8207c
      Chad Rosier authored
      llvm-svn: 160293
    • Fix a bug in the 3-address conversion of LEA when one of the operands is an · 4968e45b
      Nadav Rotem authored
      undef virtual register. The problem is that ProcessImplicitDefs removes the
      definition of the register and marks all uses as undef. If we lose the undef
      marker then we get a register which has no def and is not marked as undef. The
      live interval analysis does not collect information for these virtual
      registers and we crash in later passes.
      
      Together with Michael Kuperstein <michael.m.kuperstein@intel.com>
      
      llvm-svn: 160260
    • This CL changes the function prologue and epilogue emitted on X86 when stack needs realignment. · dcc1291d
      Alexey Samsonov authored
      It is intended to fix PR11468.
      
      Old prologue and epilogue looked like this:
      push %rbp
      mov %rsp, %rbp
      and $alignment, %rsp
      push %r14
      push %r15
      ...
      pop %r15
      pop %r14
      mov %rbp, %rsp
      pop %rbp
      
      The problem was referencing the locations of callee-saved registers in exception handling:
      their locations had to be recalculated to account for the stack alignment operation. It would
      take some effort to implement this in LLVM, as currently MachineLocation can only have the form
      "Register + Offset". Function prologue and epilogue are now changed to:
      
      push %rbp
      mov %rsp, %rbp
      push %r14
      push %r15
      and $alignment, %rsp
      ...
      lea -$size_of_saved_registers(%rbp), %rsp
      pop %r15
      pop %r14
      pop %rbp
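
The difference between the two layouts can be seen by modelling only the stack pointer arithmetic. The sketch below is a toy model under stated assumptions: 8-byte pushes, the `and $alignment, %rsp` step modelled as masking with a power-of-two alignment, and hypothetical names (`ToyFrame`, `toyOldPrologue`, `toyNewPrologue`). It is not code from the patch.

```cpp
#include <cassert>
#include <cstdint>

// Toy model of the stack pointer arithmetic; all names are hypothetical.
struct ToyFrame {
  uint64_t rbp;      // frame pointer after "mov %rsp, %rbp"
  uint64_t r14Slot;  // address where %r14 was saved
  uint64_t rsp;      // stack pointer at the end of the prologue
};

// Old order: realign first, then push the callee-saved registers.
ToyFrame toyOldPrologue(uint64_t rsp, uint64_t alignment) {
  assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
  rsp -= 8;                  // push %rbp
  uint64_t rbp = rsp;        // mov %rsp, %rbp
  rsp &= ~(alignment - 1);   // and $alignment, %rsp (modelled as a mask)
  rsp -= 8;                  // push %r14
  uint64_t r14Slot = rsp;
  rsp -= 8;                  // push %r15
  return {rbp, r14Slot, rsp};
}

// New order: push the callee-saved registers first, then realign.
ToyFrame toyNewPrologue(uint64_t rsp, uint64_t alignment) {
  assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
  rsp -= 8;                  // push %rbp
  uint64_t rbp = rsp;        // mov %rsp, %rbp
  rsp -= 8;                  // push %r14
  uint64_t r14Slot = rsp;
  rsp -= 8;                  // push %r15
  rsp &= ~(alignment - 1);   // and $alignment, %rsp (modelled as a mask)
  return {rbp, r14Slot, rsp};
}
```

In this model, `rbp - r14Slot` depends on the incoming stack pointer under the old order but is always 8 under the new order, so the saved register can be described as "Register + Offset" relative to %rbp.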
      
      Reviewed by Chad Rosier.
      
      llvm-svn: 160248
  5. Jul 15, 2012
  6. Jul 13, 2012
  7. Jul 12, 2012
  8. Jul 11, 2012
  9. Jul 10, 2012
  10. Jul 09, 2012
    • X86: implement functions to analyze & synthesize CMOV|SET|Jcc · 5f6fa428
      Manman Ren authored
      getCondFromSETOpc, getCondFromCMovOpc, getSETFromCond, getCMovFromCond
      
      No functional change intended.
      If we want to update the condition code of CMOV|SET|Jcc, we first analyze the
      opcode to get the condition code, then update the condition code, and finally
      synthesize the new opcode from the new condition code.
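
The analyze, update, synthesize flow can be sketched with a toy opcode table. The enums and function names below are hypothetical and cover only four conditions; they mirror the naming in this message but are not the signatures of the real helpers.

```cpp
#include <cassert>

// Toy condition codes and opcodes; a small, hypothetical subset.
enum class ToyCond { E, NE, B, AE, Invalid };
enum class ToyOpc { SETE, SETNE, SETB, SETAE,
                    CMOVE, CMOVNE, CMOVB, CMOVAE, Other };

// Analyze: SETcc opcode -> condition code.
ToyCond toyCondFromSETOpc(ToyOpc opc) {
  switch (opc) {
  case ToyOpc::SETE:  return ToyCond::E;
  case ToyOpc::SETNE: return ToyCond::NE;
  case ToyOpc::SETB:  return ToyCond::B;
  case ToyOpc::SETAE: return ToyCond::AE;
  default:            return ToyCond::Invalid;
  }
}

// Analyze: CMOVcc opcode -> condition code.
ToyCond toyCondFromCMovOpc(ToyOpc opc) {
  switch (opc) {
  case ToyOpc::CMOVE:  return ToyCond::E;
  case ToyOpc::CMOVNE: return ToyCond::NE;
  case ToyOpc::CMOVB:  return ToyCond::B;
  case ToyOpc::CMOVAE: return ToyCond::AE;
  default:             return ToyCond::Invalid;
  }
}

// Synthesize: condition code -> SETcc opcode.
ToyOpc toySETFromCond(ToyCond c) {
  switch (c) {
  case ToyCond::E:  return ToyOpc::SETE;
  case ToyCond::NE: return ToyOpc::SETNE;
  case ToyCond::B:  return ToyOpc::SETB;
  case ToyCond::AE: return ToyOpc::SETAE;
  default: assert(false && "invalid condition"); return ToyOpc::Other;
  }
}

// Synthesize: condition code -> CMOVcc opcode.
ToyOpc toyCMovFromCond(ToyCond c) {
  switch (c) {
  case ToyCond::E:  return ToyOpc::CMOVE;
  case ToyCond::NE: return ToyOpc::CMOVNE;
  case ToyCond::B:  return ToyOpc::CMOVB;
  case ToyCond::AE: return ToyOpc::CMOVAE;
  default: assert(false && "invalid condition"); return ToyOpc::Other;
  }
}

// Update step used in this sketch: logical inversion of the condition.
ToyCond toyInvert(ToyCond c) {
  switch (c) {
  case ToyCond::E:  return ToyCond::NE;
  case ToyCond::NE: return ToyCond::E;
  case ToyCond::B:  return ToyCond::AE;
  case ToyCond::AE: return ToyCond::B;
  default:          return ToyCond::Invalid;
  }
}

// Analyze, update, synthesize. Unknown opcodes are returned unchanged.
ToyOpc toyInvertOpcode(ToyOpc opc) {
  ToyCond c = toyCondFromSETOpc(opc);
  if (c != ToyCond::Invalid) return toySETFromCond(toyInvert(c));
  c = toyCondFromCMovOpc(opc);
  if (c != ToyCond::Invalid) return toyCMovFromCond(toyInvert(c));
  return opc;
}
```

Keeping analysis and synthesis as separate table lookups means any update on the condition code (inversion here, operand swapping elsewhere) is written once rather than once per opcode family.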
      
      llvm-svn: 159955
  11. Jul 07, 2012
    • I'm introducing a new machine model to simultaneously allow simple · 87255e34
      Andrew Trick authored
      subtarget CPU descriptions and support new features of
      MachineScheduler.
      
      MachineModel has three categories of data:
      1) Basic properties for coarse-grained instruction cost model.
      2) Scheduler Read/Write resources for simple per-opcode and operand cost model (TBD).
      3) Instruction itineraries for detailed per-cycle reservation tables.
      
      These will all live side-by-side. Any subtarget can use any
      combination of them. Instruction itineraries will not change in the
      near term. In the long run, I expect them to only be relevant for
      in-order VLIW machines that have complex constraints and require a
      precise scheduling/bundling model. Once itineraries are only actively
      used by VLIW-ish targets, they could be replaced by something more
      appropriate for those targets.
      
      This tablegen backend rewrite sets things up for introducing
      MachineModel type #2: per opcode/operand cost model.
      
      llvm-svn: 159891
    • X86: Fix optimizeCompare to correctly check safe condition. · bb360740
      Manman Ren authored
      It is safe if EFLAGS is killed or re-defined.
      When we are done with the basic block, check whether EFLAGS is live-out.
      Do not optimize away cmp if EFLAGS is live-out.
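
The safety rule above can be sketched as a forward scan over the instructions that follow the compare. This is a simplified illustration with hypothetical types (`ToyInstr`, `toyCanRemoveCmp`); it leaves out the check that each EFLAGS reader's condition code can be rewritten, and it takes the live-out information as a precomputed flag.

```cpp
#include <cassert>
#include <vector>

// Hypothetical, simplified instruction record for this sketch.
struct ToyInstr {
  bool readsFlags;    // instruction uses EFLAGS
  bool definesFlags;  // instruction re-defines EFLAGS
  bool killsFlags;    // this use is the last use of EFLAGS
};

// `after` holds the instructions following the cmp in its basic block.
// `flagsLiveOut` says whether EFLAGS is live out of the block.
bool toyCanRemoveCmp(const std::vector<ToyInstr> &after, bool flagsLiveOut) {
  for (const ToyInstr &instr : after) {
    // Once EFLAGS is killed or re-defined, nothing later can observe
    // the flags produced by the cmp, so the scan can stop.
    if (instr.killsFlags || instr.definesFlags)
      return true;
  }
  // Reached the end of the block without a kill or re-definition:
  // the cmp must stay if a successor may still read EFLAGS.
  return !flagsLiveOut;
}
```

The end-of-block check is the part this commit adds: without it, a scan that simply runs out of instructions would wrongly conclude the compare is dead.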
      
      llvm-svn: 159888
  12. Jul 06, 2012
    • X86: peephole optimization to remove cmp instruction · c9656737
      Manman Ren authored
      For each Cmp, we check whether there is an earlier Sub which makes Cmp
      redundant. We handle the case where SUB operates on the same source operands as
      Cmp, including the case where the two source operands are swapped.
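
The operand matching can be sketched as follows. The types and names (`ToySub`, `ToyCmp`, `toyReuseSubFlags`) are hypothetical, operands are modelled as plain register numbers, only signed conditions are shown, and the sketch assumes nothing between the Sub and the Cmp clobbers EFLAGS.

```cpp
#include <cassert>

// Toy condition codes as seen by the instruction that reads the flags.
enum class ToyCond { E, NE, L, GE, G, LE };

struct ToySub { int lhs; int rhs; };  // flags reflect lhs - rhs
struct ToyCmp { int lhs; int rhs; };  // flags reflect lhs - rhs

// Condition that holds for (b, a) exactly when `c` holds for (a, b).
ToyCond toySwapOperands(ToyCond c) {
  switch (c) {
  case ToyCond::L:  return ToyCond::G;
  case ToyCond::G:  return ToyCond::L;
  case ToyCond::GE: return ToyCond::LE;
  case ToyCond::LE: return ToyCond::GE;
  default:          return c;  // E and NE are symmetric
  }
}

// Returns true if the Sub's flags can replace the Cmp's flags, and sets
// `rewritten` to the condition the flag reader must use afterwards.
bool toyReuseSubFlags(ToySub sub, ToyCmp cmp, ToyCond user,
                      ToyCond &rewritten) {
  if (sub.lhs == cmp.lhs && sub.rhs == cmp.rhs) {
    rewritten = user;  // same operand order: flags are identical
    return true;
  }
  if (sub.lhs == cmp.rhs && sub.rhs == cmp.lhs) {
    rewritten = toySwapOperands(user);  // swapped: adjust the reader
    return true;
  }
  return false;  // different operands: the Cmp is not redundant
}
```

For example, a reader testing "less than" after `cmp a, b` must test "greater than" when the flags instead come from `sub b, a`.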
      
      llvm-svn: 159838
  13. Jul 05, 2012
  14. Jul 04, 2012
    • Ensure CopyToReg nodes are always glued to the call instruction. · 2dee8124
      Jakob Stoklund Olesen authored
      The CopyToReg nodes that set up the argument registers before a call
      must be glued to the call instruction. Otherwise, the scheduler may emit
      the physreg copies long before the call, causing long live ranges for
      the fixed registers.
      
      Besides disabling good register allocation, that can also expose
      problems when EmitInstrWithCustomInserter() splits a basic block during
      the live range of a physreg.
      
      llvm-svn: 159721
    • Add early if-conversion support to X86. · 49e4d4b3
      Jakob Stoklund Olesen authored
      Implement the TII hooks needed by EarlyIfConversion to create cmov
      instructions and estimate their latency.
      
      Early if-conversion is still not enabled by default.
      
      llvm-svn: 159695
  15. Jul 03, 2012