  1. Jun 02, 2009
  2. May 26, 2009
    • LiveVariables::VarInfo contains an AliveBlocks BitVector, which has as many · 7d287cb7
      Jeffrey Yasskin authored
      entries as there are basic blocks in the function.  LiveVariables::getVarInfo
      creates a VarInfo struct for every register in the function, leading to
      quadratic space use.  This patch changes the BitVector to a SparseBitVector,
      which doesn't help the worst-case memory use but does reduce the actual use in
      very long functions with short-lived variables.
      
      llvm-svn: 72426
      7d287cb7
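The dense-versus-sparse trade-off described in this commit can be illustrated with a toy sketch. This is not LLVM's BitVector or SparseBitVector; the class names `DenseBits` and `SparseBits` and the `std::map`-based storage are assumptions made purely for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <vector>

// Dense bit set: storage is proportional to the universe size
// (here, the number of basic blocks), however few bits are set.
class DenseBits {
  std::vector<std::uint64_t> Words;

public:
  explicit DenseBits(std::size_t NumBits) : Words((NumBits + 63) / 64, 0) {}
  void set(std::size_t I) {
    assert(I / 64 < Words.size() && "bit index out of range");
    Words[I / 64] |= std::uint64_t(1) << (I % 64);
  }
  bool test(std::size_t I) const {
    return I / 64 < Words.size() && ((Words[I / 64] >> (I % 64)) & 1);
  }
  std::size_t wordsAllocated() const { return Words.size(); }
};

// Toy sparse bit set: only 64-bit words containing a set bit are stored.
// If every word has a set bit, it stores as many words as the dense form,
// so the worst case is no better.
class SparseBits {
  std::map<std::size_t, std::uint64_t> Words;

public:
  void set(std::size_t I) { Words[I / 64] |= std::uint64_t(1) << (I % 64); }
  bool test(std::size_t I) const {
    auto It = Words.find(I / 64);
    return It != Words.end() && ((It->second >> (I % 64)) & 1);
  }
  std::size_t wordsAllocated() const { return Words.size(); }
};
```

For a short-lived variable that is alive in only a few neighboring blocks of a very long function, the sparse form stores a handful of words while the dense form stores one bit per block in the function.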
  3. May 03, 2009
    • In some rare cases, the register allocator can spill registers but end up not... · 210fc62a
      Evan Cheng authored
      In some rare cases, the register allocator can spill registers but end up not using the freed registers at all. The fundamental problem is that linearscan's backtracking can end up freeing more than one allocated register. However, reloads and restores might be folded into uses / defs, so the freed registers might not be used at all.
      
      VirtRegMap keeps track of allocations so it knows what's not used. As a horrible hack, the stack coloring can color spill slots with *free* registers. That is, it replaces reloads and spills with copies from and to the free register. It unfolds instructions that load and store the spill slot and replaces them with register-using variants.
      
      Not yet enabled. This is part 1. More coming.
      
      llvm-svn: 70787
      210fc62a
  4. Apr 27, 2009
  5. Apr 22, 2009
    • It has finally happened. Spiller is now using live interval info. · 1a99a5f5
      Evan Cheng authored
      This fixes a very subtle bug. A vr defined by an implicit_def is allowed to overlap with any register since it doesn't actually modify anything. However, if it's used as a two-address use, its live range can be extended and it can be spilled. The spiller must take care not to emit a reload for the value number that's defined by the implicit_def. This is both a correctness and a performance issue.
      
      llvm-svn: 69743
      1a99a5f5
  6. Apr 20, 2009
    • Added a linearscan register allocation optimization. When the register... · d67efaa8
      Evan Cheng authored
      Added a linearscan register allocation optimization. When the register allocator spills an interval with multiple uses in the same basic block, it creates a different virtual register for each of the reloads. e.g.
      
              %reg1498<def> = MOV32rm %reg1024, 1, %reg0, 12, %reg0, Mem:LD(4,4) [sunkaddr39 + 0]
              %reg1506<def> = MOV32rm %reg1024, 1, %reg0, 8, %reg0, Mem:LD(4,4) [sunkaddr42 + 0]
              %reg1486<def> = MOV32rr %reg1506
              %reg1486<def> = XOR32rr %reg1486, %reg1498, %EFLAGS<imp-def,dead>
              %reg1510<def> = MOV32rm %reg1024, 1, %reg0, 4, %reg0, Mem:LD(4,4) [sunkaddr45 + 0]
      
      =>
      
              %reg1498<def> = MOV32rm %reg2036, 1, %reg0, 12, %reg0, Mem:LD(4,4) [sunkaddr39 + 0]
              %reg1506<def> = MOV32rm %reg2037, 1, %reg0, 8, %reg0, Mem:LD(4,4) [sunkaddr42 + 0]
              %reg1486<def> = MOV32rr %reg1506
              %reg1486<def> = XOR32rr %reg1486, %reg1498, %EFLAGS<imp-def,dead>
              %reg1510<def> = MOV32rm %reg2038, 1, %reg0, 4, %reg0, Mem:LD(4,4) [sunkaddr45 + 0]
      
      From linearscan's point of view, reg2036, reg2037, and reg2038 are separate registers, each "killed" after a single use. The reloaded register is then available and is often clobbered right away. For example, in this case reg1498 is allocated EAX while reg2036 is allocated RAX. This means we end up with multiple reloads from the same stack slot in the same basic block.
      
      Now linearscan recognizes that there are other reloads from the same SS in the same BB, so it will "downgrade" RAX (and its aliases) after reg2036 is allocated until the next reload (reg2037) is done. This greatly increases the likelihood that reloads from the SS are reused.
      
      This speeds up sha1 from OpenSSL by 5.8%. It is also an across the board win for SPEC2000 and 2006.
      
      llvm-svn: 69585
      d67efaa8
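One way to picture the "downgrade" idea is as a preference order when picking a free register. The sketch below is only an illustration under that assumption: `pickFreeReg` and the string-based register names are hypothetical and do not reflect how linearscan represents registers or implements the heuristic.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Hypothetical helper: choose a free register, preferring registers that are
// not currently "downgraded" (i.e. not holding a reloaded value that a later
// reload from the same stack slot in this block could reuse).
std::string pickFreeReg(const std::vector<std::string> &FreeRegs,
                        const std::set<std::string> &Downgraded) {
  assert(!FreeRegs.empty() && "caller must handle the spill case");
  for (const std::string &R : FreeRegs)
    if (!Downgraded.count(R))
      return R;
  // Every free register is downgraded; fall back to the first one.
  return FreeRegs.front();
}
```

With EAX downgraded after the first reload, an unrelated interval such as reg1498 would be steered to another free register, which leaves the reloaded value intact for the next reload.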
  7. Apr 13, 2009
  8. Apr 09, 2009
    • Fix pr3954. The register scavenger asserts for inline assembly with · 51856173
      Bob Wilson authored
      register destinations that are tied to source operands.  The
      TargetInstrDesc::findTiedToSrcOperand method silently fails for inline
      assembly.  The existing MachineInstr::isRegReDefinedByTwoAddr was very
      close to doing what is needed, so this revision makes a few changes to
      that method and also renames it to isRegTiedToUseOperand (for consistency
      with the very similar isRegTiedToDefOperand and because it handles both
      two-address instructions and inline assembly with tied registers).
      
      llvm-svn: 68714
      51856173
  9. Apr 08, 2009
    • Implement support for modeling implicit zero-extension on x86-64 · ad3e549a
      Dan Gohman authored
      with SUBREG_TO_REG, teach SimpleRegisterCoalescing to coalesce
      SUBREG_TO_REG instructions (which are similar to INSERT_SUBREG
      instructions), and teach the DAGCombiner to take advantage of this on
      targets which support it. This eliminates many redundant
      zero-extension operations on x86-64.
      
      This adds a new TargetLowering hook, isZExtFree. It's similar to
      isTruncateFree, except it only applies to actual definitions, and not
      no-op truncates which may not zero the high bits.
      
      Also, this adds a new optimization to SimplifyDemandedBits: transform
      operations like x+y into (zext (add (trunc x), (trunc y))) on targets
      where all the casts are no-ops. In contexts where the high part of the
      add is explicitly masked off, this allows the mask operation to be
      eliminated. Fix the DAGCombiner to avoid undoing these transformations
      to eliminate casts on targets where the casts are no-ops.
      
      Also, this adds a new two-address lowering heuristic. Since
      two-address lowering runs before coalescing, it helps to be able to
      look through copies when deciding whether commuting and/or
      three-address conversion are profitable.
      
      Also, fix a bug in LiveInterval::MergeInClobberRanges. It didn't handle
      the case that a clobber range extended both before and beyond an
      existing live range. In that case, multiple live ranges need to be
      added. This was exposed by the new subreg coalescing code.
      
      Remove 2008-05-06-SpillerBug.ll. It was bugpoint-reduced, and the
      spiller behavior it was looking for no longer occurs with the new
      instruction selection.
      
      llvm-svn: 68576
      ad3e549a
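The arithmetic fact behind the SimplifyDemandedBits transform in this commit can be checked with a small sketch: when only the low 32 bits of a 64-bit add are demanded, adding in 32 bits and zero-extending gives the same result as adding in 64 bits and masking. The function names below are hypothetical, and the sketch shows only the identity, not the DAGCombiner code.

```cpp
#include <cassert>
#include <cstdint>

// x + y computed in 64 bits, with the high 32 bits explicitly masked off.
std::uint64_t maskedAdd64(std::uint64_t X, std::uint64_t Y) {
  return (X + Y) & 0xFFFFFFFFu;
}

// zext (add (trunc x), (trunc y)): truncate both operands to 32 bits,
// add with 32-bit wraparound, then zero-extend back to 64 bits.
std::uint64_t narrowAddThenZext(std::uint64_t X, std::uint64_t Y) {
  std::uint32_t Lo =
      static_cast<std::uint32_t>(X) + static_cast<std::uint32_t>(Y);
  return static_cast<std::uint64_t>(Lo);
}
```

Because the zero-extension already clears the high bits, the explicit mask becomes redundant on targets where the truncate and zero-extend are no-ops.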
  10. Mar 26, 2009
  11. Mar 23, 2009
  12. Mar 20, 2009
    • Fix the Win32 VS2008 build: · 8d5baa09
      Sebastian Redl authored
       - Make type declarations match the struct/class keyword of the definition.
       - Move AddSignalHandler into the namespace where it belongs.
       - Correctly call functions from template base.
       - Some other small changes.
      With this patch, LLVM and Clang should build properly and with far less noise under VS2008.
      
      llvm-svn: 67347
      8d5baa09
  13. Mar 19, 2009
  14. Mar 05, 2009
  15. Feb 08, 2009
  16. Jan 29, 2009
  17. Jan 20, 2009
  18. Jan 07, 2009
    • The coalescer does not coalesce a virtual register to a physical register if... · f6768bd9
      Evan Cheng authored
      The coalescer does not coalesce a virtual register to a physical register if any of the physical register's sub-register live intervals overlaps with the virtual register. This is overly conservative. It prevents an extract_subreg from being coalesced away:
      
      v1024 = EDI  // not killed
            =
            = EDI
      
      One possible solution is for the coalescer to examine the sub-register live intervals in the same manner as the physical register. Another possibility is to examine defs and uses (when needed) of sub-registers. Both solutions are too expensive. For now, look for "short virtual intervals" and scan instructions to look for conflict instead.
      
      This is a small win on x86-64. For example, it shaves ~80 instructions off 403.gcc.
      
      llvm-svn: 61847
      f6768bd9
  19. Dec 19, 2008
    • Fix PR3149. If an early clobber def is a physical register and it is tied to... · 0869f785
      Evan Cheng authored
      Fix PR3149. If an early clobber def is a physical register and it is tied to an input operand, it effectively extends the live range of the physical register. Currently we do not have a good way to represent this.
      
      172     %ECX<def> = MOV32rr %reg1039<kill>
      180     INLINEASM <es:subl $5,$1
              sbbl $3,$0>, 10, %EAX<def>, 14, %ECX<earlyclobber,def>, 9, %EAX<kill>,
      36, <fi#0>, 1, %reg0, 0, 9, %ECX<kill>, 36, <fi#1>, 1, %reg0, 0
      188     %EAX<def> = MOV32rr %EAX<kill>
      196     %ECX<def> = MOV32rr %ECX<kill>
      204     %ECX<def> = MOV32rr %ECX<kill>
      212     %EAX<def> = MOV32rr %EAX<kill>
      220     %EAX<def> = MOV32rr %EAX
      228     %reg1039<def> = MOV32rr %ECX<kill>
      
      The early clobber operand ties ECX input to the ECX def.
      
      The live interval of ECX is represented as this:
      %reg20,inf = [46,47:1)[174,230:0)  0@174-(230) 1@46-(47)
      
      The right way to represent this is something like
      %reg20,inf = [46,47:2)[174,182:1)[181,230:0)  0@174-(182) 1@181-230 2@46-(47)
      
      Of course that won't work since that means overlapping live ranges defined by two val#.
      
      The workaround for now is to add a bit to val# which says the val# is redefined by an early clobber def somewhere. This prevents the move at 228 from being optimized away by SimpleRegisterCoalescing::AdjustCopiesBackFrom.
      
      llvm-svn: 61259
      0869f785
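The objection that the "right" representation would not work can be illustrated with a disjointness check over half-open ranges. This is a minimal sketch assuming a simplified model: the `LiveRange` struct and `rangesAreDisjoint` are hypothetical stand-ins, not LLVM's LiveInterval classes.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Half-open range [Start, End) tagged with the value number that defines it.
struct LiveRange {
  unsigned Start, End, ValNo;
};

// Returns true if no two ranges overlap. Assumes Ranges is sorted by Start,
// so checking each range against its predecessor is sufficient.
bool rangesAreDisjoint(const std::vector<LiveRange> &Ranges) {
  for (std::size_t I = 1; I < Ranges.size(); ++I) {
    assert(Ranges[I - 1].Start <= Ranges[I].Start && "ranges must be sorted");
    if (Ranges[I].Start < Ranges[I - 1].End)
      return false;
  }
  return true;
}
```

The interval as currently represented, [46,47) and [174,230), passes this check. The proposed form fails it because [174,182) and [181,230) share the slot at 181, which is the overlap between two val# that the commit message rules out.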
  20. Dec 08, 2008
  21. Dec 05, 2008
  22. Dec 03, 2008
  23. Nov 26, 2008
  24. Nov 21, 2008
  25. Nov 13, 2008
  26. Nov 12, 2008
  27. Oct 29, 2008
  28. Oct 27, 2008
    • · b00b267b
      David Greene authored
      Fix PR2634.  Create new virtual registers from spills early so that we
      can give them the same stack slot as the spilled interval if it is folded.
      This prevents the fold/unfold code from pointing to the wrong register.
      
      llvm-svn: 58255
      b00b267b
  29. Oct 24, 2008
  30. Oct 18, 2008
  31. Oct 07, 2008
  32. Oct 03, 2008
  33. Oct 01, 2008
  34. Sep 30, 2008