  1. Feb 16, 2012
    • Enable register mask operands for x86 calls. · 8a450cb2
      Jakob Stoklund Olesen authored
      Call instructions no longer have a list of 43 call-clobbered registers.
      Instead, they get a single register mask operand with a bit vector of
      call-preserved registers.
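
      As a rough illustration of the idea, the sketch below models a register mask as a bit vector with one bit per physical register, where a set bit means the register is preserved across the call. The `RegMask` type, its method names, and the register numbers are hypothetical stand-ins chosen for this example, not the classes used in the patch.

```cpp
#include <cstdint>
#include <vector>

// Toy model of a register mask operand (illustrative, not LLVM's type).
// One bit per physical register; a set bit means "preserved by the call".
struct RegMask {
    std::vector<uint32_t> words;

    // Allocate enough 32-bit words to hold one bit per register.
    explicit RegMask(unsigned numRegs) : words((numRegs + 31) / 32, 0) {}

    // Mark a register as call-preserved.
    void setPreserved(unsigned reg) { words[reg / 32] |= 1u << (reg % 32); }

    // A register is clobbered by the call exactly when its bit is clear.
    bool clobbers(unsigned reg) const {
        return !(words[reg / 32] & (1u << (reg % 32)));
    }
};
```

      With this layout, a single operand answers "is register R clobbered?" for every register, so the call does not need one implicit-def operand per clobbered register.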
      
      This saves a lot of memory, 42 x 32 bytes = 1344 bytes per call
      instruction, and it speeds up building call instructions because those
      43 imp-def operands no longer need to be added to use-def lists. (And
      removed and shifted and re-added for every explicit call operand).
      
      Passes like LiveVariables, LiveIntervals, RAGreedy, PEI, and
      BranchFolding are significantly faster because they can deal with call
      clobbers in bulk.
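
      One way to read "in bulk" is sketched below: rather than visiting each clobbered register as a separate operand, a pass can walk only the registers it currently tracks as live and test each against the mask. The helper name `killClobbered` and the use of `std::set` are assumptions made for this example; the real passes use their own data structures.

```cpp
#include <cstdint>
#include <set>
#include <vector>

// Hypothetical helper: drop every live physical register that a call clobbers.
// `preservedMask` holds one bit per register; a set bit means "preserved".
// Assumes every register number in `liveRegs` fits inside the mask.
inline void killClobbered(std::set<unsigned> &liveRegs,
                          const std::vector<uint32_t> &preservedMask) {
    for (auto it = liveRegs.begin(); it != liveRegs.end();) {
        unsigned reg = *it;
        bool preserved = (preservedMask[reg / 32] & (1u << (reg % 32))) != 0;
        if (preserved)
            ++it;                    // survives the call, keep tracking it
        else
            it = liveRegs.erase(it); // clobbered, no longer live
    }
}
```

      The work here scales with the number of live registers at the call, not with the number of registers the calling convention clobbers.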
      
      Overall, clang -O2 is between 0% and 8% faster, uniformly distributed
      depending on call density in the compiled code.  Debug builds using
      clang -O0 are 0% - 3% faster.
      
      I have verified that this patch doesn't change the assembly generated
      for the LLVM nightly test suite when building with -disable-copyprop
      and -disable-branch-fold.
      
      Branch folding behaves slightly differently in a few cases because call
      instructions have different hash values now.
      
      Copy propagation flushes its data structures when it crosses a register
      mask operand. This causes it to leave a few dead copies behind, on the
      order of 20 instructions across the entire nightly test suite, including
      SPEC. Fixing this properly would require the pass to use different data
      structures.
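
      The conservative behavior described above can be pictured with a toy tracker that forgets every recorded copy when it crosses a register mask. The `CopyTracker` type and its members are hypothetical and much simpler than the actual pass; the sketch only shows why flushing loses information about copies whose registers the call preserves.

```cpp
#include <map>

// Toy copy tracker (illustrative only): maps a destination register
// to the source register it was most recently copied from.
struct CopyTracker {
    std::map<unsigned, unsigned> availCopies;

    void recordCopy(unsigned dst, unsigned src) { availCopies[dst] = src; }

    // Conservative handling of a register mask operand: forget everything,
    // including copies between registers that the call actually preserves.
    void crossRegMask() { availCopies.clear(); }

    bool hasCopy(unsigned dst) const { return availCopies.count(dst) != 0; }
};
```

      A more precise variant would drop only the entries that mention a clobbered register, which requires looking entries up by mask rather than by individual register.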
      
      llvm-svn: 150638
  2. Feb 15, 2012
  3. Feb 14, 2012
  4. Feb 13, 2012
  5. Feb 12, 2012
  6. Feb 11, 2012
  7. Feb 10, 2012
    • Revert r150222, as the clang driver now handles this properly. · 1c9dd297
      Jim Grosbach authored
      Now that the clang driver passes the CPU and feature information to
      the backend when processing assembly files (r150273), this isn't necessary.
      
      llvm-svn: 150274
    • Make valgrind happy. · c7f48417
      Jason W Kim authored
      llvm-svn: 150251
    • unnecessary include · f08915ca
      Andrew Trick authored
      llvm-svn: 150228
    • PTX no longer needs to provide its own backend. · f4ff2343
      Andrew Trick authored
      llvm-svn: 150227
    • RegAlloc superpass: includes phi elimination, coalescing, and scheduling. · d3f8fe81
      Andrew Trick authored
      Creates a configurable regalloc pipeline.
      
      Ensure specific llc options do what they say and nothing more: -regalloc=... has no effect other than selecting the allocator pass itself. This patch introduces a new umbrella flag, "-optimize-regalloc", to enable/disable the optimizing regalloc "superpass". This allows, for example, testing coalescing and scheduling under -O0 or vice versa.
      
      When a CodeGen pass requires the MachineFunction to have a particular property, we need to explicitly define that property so it can be directly queried rather than naming a specific Pass. For example, to check for SSA, use MRI->isSSA, not addRequired<PHIElimination>.
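
      The pattern can be sketched with stand-in types: the property lives on the function-level state and is queried directly, so no pass has to name another pass to learn whether the code is still in SSA form. `ToyRegInfo` and `canRunSSAOnlyPass` are hypothetical names for this example; only the `isSSA` query itself comes from the text above.

```cpp
// Stand-in for per-function register info (illustrative, not LLVM's class).
struct ToyRegInfo {
    bool ssa = true;                 // functions start out in SSA form
    bool isSSA() const { return ssa; }
    void leaveSSA() { ssa = false; } // called by whichever pass ends SSA form
};

// A pass that needs SSA checks the property itself instead of declaring
// a dependency on the specific pass that eliminates phis.
inline bool canRunSSAOnlyPass(const ToyRegInfo &mri) { return mri.isSSA(); }
```

      Because the check depends on the property rather than on a pass name, the pipeline can be reordered or a pass swapped out without touching the passes that only care about SSA.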
      
      CodeGen transformation passes are never "required" as an analysis.
      
      ProcessImplicitDefs does not require LiveVariables.
      
      We have a plan to massively simplify some of the early passes within the regalloc superpass.
      
      llvm-svn: 150226