  1. Apr 29, 2009
    • spillPhysRegAroundRegDefsUses() may have invalidated iterators stored in... · 9cce299c
      Evan Cheng authored
      spillPhysRegAroundRegDefsUses() may have invalidated iterators stored in fixed_ IntervalPtrs. Reset them.
      
      llvm-svn: 70378
      9cce299c
    • Second attempt: · 084669a1
      Bill Wendling authored
      Massive check in. This changes the "-fast" flag to "-O#" in llc. If you want to
      use the old behavior, the flag is -O0. This change allows for finer-grained
      control over which optimizations are run at different -O levels.
      
      Most of this work was pretty mechanical. The majority of the fixes came from
      verifying that a "fast" variable wasn't used anymore. The JIT still uses a
      "Fast" flag. I'll change the JIT with a follow-up patch.
      
      llvm-svn: 70343
      084669a1
  2. Apr 28, 2009
  3. Apr 27, 2009
  4. Apr 25, 2009
    • Do not share a single unknown val# for all the live ranges merged into a... · 362acf8a
      Evan Cheng authored
      Do not share a single unknown val# for all the live ranges merged into a physical sub-register live interval. When the coalescer merges a clobbered virtual register live interval into a physical register live interval, give each virtual register val# a separate val# in the physical register live interval. Otherwise, the coalescer would lose track of the definition information it needs to make correct coalescing decisions.
      
      llvm-svn: 70026
      362acf8a
  5. Apr 24, 2009
  6. Apr 23, 2009
  7. Apr 22, 2009
    • It has finally happened. Spiller is now using live interval info. · 1a99a5f5
      Evan Cheng authored
      This fixes a very subtle bug. A virtual register defined by an implicit_def is allowed to overlap with any register, since the implicit_def doesn't actually modify anything. However, if it's used as a two-address use, its live range can be extended and it can be spilled. The spiller must take care not to emit a reload for the value number that's defined by the implicit_def. This is both a correctness and a performance issue.
      
      llvm-svn: 69743
      1a99a5f5
  8. Apr 20, 2009
    • Added a linearscan register allocation optimization. When the register... · d67efaa8
      Evan Cheng authored
      Added a linearscan register allocation optimization. When the register allocator spills an interval with multiple uses in the same basic block, it creates a different virtual register for each of the reloads. e.g.
      
              %reg1498<def> = MOV32rm %reg1024, 1, %reg0, 12, %reg0, Mem:LD(4,4) [sunkaddr39 + 0]
              %reg1506<def> = MOV32rm %reg1024, 1, %reg0, 8, %reg0, Mem:LD(4,4) [sunkaddr42 + 0]
              %reg1486<def> = MOV32rr %reg1506
              %reg1486<def> = XOR32rr %reg1486, %reg1498, %EFLAGS<imp-def,dead>
              %reg1510<def> = MOV32rm %reg1024, 1, %reg0, 4, %reg0, Mem:LD(4,4) [sunkaddr45 + 0]
      
      =>
      
              %reg1498<def> = MOV32rm %reg2036, 1, %reg0, 12, %reg0, Mem:LD(4,4) [sunkaddr39 + 0]
              %reg1506<def> = MOV32rm %reg2037, 1, %reg0, 8, %reg0, Mem:LD(4,4) [sunkaddr42 + 0]
              %reg1486<def> = MOV32rr %reg1506
              %reg1486<def> = XOR32rr %reg1486, %reg1498, %EFLAGS<imp-def,dead>
              %reg1510<def> = MOV32rm %reg2038, 1, %reg0, 4, %reg0, Mem:LD(4,4) [sunkaddr45 + 0]
      
      From linearscan's point of view, reg2036, reg2037, and reg2038 are separate registers, and each is "killed" after a single use. The reloaded register is then available, and it's often clobbered right away. For example, in this case reg1498 is allocated EAX while reg2036 is allocated RAX. This means we end up with multiple reloads from the same stack slot in the same basic block.
      
      Now linearscan recognizes that there are other reloads from the same stack slot (SS) in the same basic block (BB). So it'll "downgrade" RAX (and its aliases) after reg2036 is allocated until the next reload (reg2037) is done. This greatly increases the likelihood that reloads from the stack slot are reused.
      
      This speeds up sha1 from OpenSSL by 5.8%. It is also an across the board win for SPEC2000 and 2006.
      
      llvm-svn: 69585
      d67efaa8
  9. Apr 18, 2009
  10. Apr 17, 2009
    • For general dynamic TLS access we must use · 355fe12c
      Rafael Espindola authored
      leaq	foo@TLSGD(%rip), %rdi
      
      as part of the instruction sequence. Using a register other than %rdi and then
      copying it to %rdi is not valid.
      
      llvm-svn: 69350
      355fe12c
    • Teach spiller to unfold instructions which modref spill slot when a scratch · b96a1082
      Evan Cheng authored
      register is available and when it's profitable.
      
      e.g.
           xorq  %r12<kill>, %r13
           addq  %rax, -184(%rbp)
           addq  %r13, -184(%rbp)
      ==>
           xorq  %r12<kill>, %r13
           movq  -184(%rbp), %r12
           addq  %rax, %r12
           addq  %r13, %r12
           movq  %r12, -184(%rbp)
      
      Two more instructions, but fewer memory accesses. It can also open up
      opportunities for more optimizations.
      
      llvm-svn: 69341
      b96a1082
  11. Apr 16, 2009
    • fix PR3995. A scale must be 1, 2, 4 or 8. · 5e42177a
      Rafael Espindola authored
      llvm-svn: 69284
      5e42177a
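
      The constraint in the commit above comes from the x86 SIB byte, which stores the index scale in a 2-bit field. A minimal sketch of the rule follows; the helper names isLegalX86Scale and encodeX86Scale are hypothetical and are not taken from the LLVM sources or from the actual fix.

```cpp
#include <cassert>

// Hypothetical helper (not LLVM's code): true when `scale` is one of the
// index scale factors an x86 addressing mode can encode.
constexpr bool isLegalX86Scale(unsigned scale) {
  return scale == 1 || scale == 2 || scale == 4 || scale == 8;
}

// Hypothetical helper: the 2-bit SIB "scale" field for a legal scale
// (1 -> 0, 2 -> 1, 4 -> 2, 8 -> 3). Callers are expected to check
// isLegalX86Scale() first; any other value falls through to 3 here.
constexpr unsigned encodeX86Scale(unsigned scale) {
  return scale == 1 ? 0u : scale == 2 ? 1u : scale == 4 ? 2u : 3u;
}
```

      For instance, a scale of 3 cannot be encoded directly, which is why an address matcher has to reject it or rewrite the expression.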
    • Expand GEPs in ScalarEvolution expressions. SCEV expressions can now · 0a40ad93
      Dan Gohman authored
      have pointer types, though in contrast to C pointer types, SCEV
      addition is never implicitly scaled. This not only eliminates the
      need for special code like IndVars' EliminatePointerRecurrence
      and LSR's own GEP expansion code, it also does a better job because
      it lets the normal optimizations handle pointer expressions just
      like integer expressions.
      
      Also, since LLVM IR GEPs can't directly index into multi-dimensional
      VLAs, moving the GEP analysis out of client code and into the SCEV
      framework makes it easier for clients to handle multi-dimensional
      VLAs the same way as other arrays.
      
      Some existing regression tests show improved optimization.
      test/CodeGen/ARM/2007-03-13-InstrSched.ll in particular improved to
      the point where if-conversion started kicking in; I turned it off
      for this test to preserve the intent of the test.
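
      The phrase "never implicitly scaled" can be illustrated outside of LLVM. The sketch below is plain C++, not SCEV code: the function names and the kDemo array are hypothetical. It contrasts C pointer addition, which multiplies the index by the element size behind the scenes, with a byte-level addition where the multiplication is written out, which is roughly the form a pointer-typed SCEV expression would take.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical demo data used only for illustration.
const std::int32_t kDemo[4] = {10, 20, 30, 40};

// C-style pointer arithmetic: `base + i` is implicitly scaled by
// sizeof(std::int32_t).
inline const std::int32_t *cStyleElement(const std::int32_t *base,
                                         std::size_t i) {
  return base + i;
}

// Unscaled arithmetic: the element size is multiplied in explicitly and
// the addition is done in bytes, so nothing is hidden in the "+".
inline const std::int32_t *unscaledElement(const std::int32_t *base,
                                           std::size_t i) {
  const char *bytes = reinterpret_cast<const char *>(base);
  return reinterpret_cast<const std::int32_t *>(
      bytes + i * sizeof(std::int32_t));
}
```

      Both functions compute the same address; the second simply exposes the multiply as an ordinary integer operation that generic optimizations can see.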
      
      llvm-svn: 69258
      0a40ad93
  12. Apr 15, 2009
  13. Apr 14, 2009
  14. Apr 13, 2009
    • PR3934: Fix a bogus two-address pass assertion. · f0843803
      Evan Cheng authored
      llvm-svn: 68979
      f0843803
    • Implement x86 h-register extract support. · 57d6bd36
      Dan Gohman authored
       - Add patterns for h-register extract, which avoids a shift and mask,
         and in some cases a temporary register.
       - Add address-mode matching for turning (X>>(8-n))&(255<<n), where
         n is a valid address-mode scale value, into an h-register extract
         and a scaled-offset address.
       - Replace X86's MOV32to32_ and related instructions with the new
         target-independent COPY_TO_SUBREG instruction.
      
      On x86-64 there are complicated constraints on h registers, and
      CodeGen doesn't currently provide a high-level way to express all of them,
      so they are handled with a bunch of special code. This code currently only
      supports extracts where the result is used by a zero-extend or a store,
      though these are fairly common.
      
      These transformations are not always beneficial; since there are only
      4 h registers, they sometimes require extra move instructions, and
      this sometimes increases register pressure because it can force out
      values that would otherwise be in one of those registers. However,
      this appears to be relatively uncommon.
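
      The address-mode pattern above rests on a bit identity: (X>>(8-n))&(255<<n) selects bits 8..15 of X (the byte an h register such as %ah holds) and leaves them shifted left by n. The sketch below checks that identity, assuming n is the shift amount 0..3 corresponding to scales 1, 2, 4 and 8. The function names are hypothetical and this is not the matching code from the commit.

```cpp
#include <cassert>
#include <cstdint>

// The form from the commit message: (X >> (8 - n)) & (255 << n).
constexpr std::uint32_t shiftMaskForm(std::uint32_t x, unsigned n) {
  return (x >> (8 - n)) & (std::uint32_t{255} << n);
}

// The equivalent form: extract bits 8..15 first, then scale by 2^n.
// The extract is what an h-register read provides; the shift is what a
// scaled-index address mode provides.
constexpr std::uint32_t extractThenScale(std::uint32_t x, unsigned n) {
  return ((x >> 8) & 0xFFu) << n;
}

// Hypothetical check: both forms agree for every assumed shift amount.
inline bool formsAgree(std::uint32_t x) {
  for (unsigned n = 0; n <= 3; ++n)
    if (shiftMaskForm(x, n) != extractThenScale(x, n))
      return false;
  return true;
}
```

      With x = 0x12345678 and n = 2, both forms yield 0x56 << 2, that is 0x158.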
      
      llvm-svn: 68962
      57d6bd36
    • X86-64 TLS support for local exec and initial exec. · 6d6c6043
      Rafael Espindola authored
      llvm-svn: 68947
      6d6c6043
    • In X86DAGToDAGISel::MatchWrapper, if base or index are set, avoid matching · 7186f20a
      Rafael Espindola authored
      only if symbolic addresses are RIP-relative.
      
      llvm-svn: 68924
      7186f20a
  15. Apr 12, 2009
  16. Apr 10, 2009
  17. Apr 09, 2009