  1. Jul 28, 2012
  2. Jul 27, 2012
  3. Jul 26, 2012
    • Use an otherwise unused variable. · 35400b1d
      Jakob Stoklund Olesen authored
      llvm-svn: 160798
    • Start scaffolding for a MachineTraceMetrics analysis pass. · f9029fef
      Jakob Stoklund Olesen authored
      This is still a work in progress.
      
      Out-of-order CPUs usually execute instructions from multiple basic
      blocks simultaneously, so it is necessary to look at longer traces when
      estimating the performance effects of code transformations.
      
      The MachineTraceMetrics analysis will pick a typical trace through a
      given basic block and provide performance metrics for the trace. Metrics
      will include:
      
      - Instruction count through the trace.
      - Issue count per functional unit.
      - Critical path length, and per-instruction 'slack'.
      
      These metrics can be used to determine the performance limiting factor
      when executing the trace, and how it will be affected by a code
      transformation.
      
      Initially, this will be used by the early if-conversion pass.
      
      llvm-svn: 160796
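The critical path and slack metrics listed in this commit message can be illustrated with a small standalone model. This is only a sketch, not the MachineTraceMetrics implementation: the function name `trace_metrics`, the latency table, and the dependency lists are all hypothetical, and issue counts per functional unit are left out.

```python
def trace_metrics(latency, deps):
    """Model of per-trace metrics (hypothetical, not LLVM code).

    latency: dict mapping instruction name -> cycles, listed in trace order.
    deps:    dict mapping instruction name -> earlier instructions it uses.
    Returns (instruction count, critical path length, per-instruction slack).
    """
    order = list(latency)

    # Depth: earliest cycle at which each instruction can issue.
    depth = {}
    for name in order:
        depth[name] = max(
            (depth[d] + latency[d] for d in deps.get(name, [])), default=0
        )

    # Height: cycles from issuing an instruction to the end of the trace.
    height = {name: latency[name] for name in order}
    for name in reversed(order):
        for d in deps.get(name, []):
            height[d] = max(height[d], latency[d] + height[name])

    critical = max(depth[n] + height[n] for n in order)
    # Slack: how far an instruction can slip without lengthening the trace.
    slack = {n: critical - (depth[n] + height[n]) for n in order}
    return len(order), critical, slack


# Assumed latencies: a 4-cycle load feeding an add and a store,
# plus an independent 1-cycle increment that the store also uses.
latency = {"load": 4, "inc": 1, "add": 1, "store": 1}
deps = {"add": ["load"], "store": ["add", "inc"]}
count, critical, slack = trace_metrics(latency, deps)
```

In this model the load, add, and store form the critical path, while the increment has slack because it could issue several cycles later without delaying the store.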
    • Add a floor intrinsic. · 0b3d7829
      Dan Gohman authored
      llvm-svn: 160791
  4. Jul 25, 2012
    • Disable rematerialization in TwoAddressInstructionPass. · cc1dc6dc
      Manman Ren authored
      It is redundant; RegisterCoalescer will do the remat if it can't eliminate
      the copy. Collected instruction counts before and after this. A few extra
      instructions are generated due to spilling but it is normal to see these kinds
      of changes with almost any small codegen change, according to Jakob.
      
      This also fixed rdar://11830760 where xor is expected instead of movi0.
      
      llvm-svn: 160749
    • Preserve 2-addr constraints in ConnectedVNInfoEqClasses. · cef9a618
      Jakob Stoklund Olesen authored
      When a live range splits into multiple connected components, we would
      arbitrarily assign <undef> uses to component 0. This is wrong when the
      use is tied to a def that gets assigned to a different component:
      
        %vreg69<def> = ADD8ri %vreg68<undef>, 1
      
      The use and def must get the same virtual register.
      
      Fix this by assigning <undef> uses to the same component as the value
      defined by the instruction, if any:
      
        %vreg69<def> = ADD8ri %vreg69<undef>, 1
      
      This fixes PR13402. The PR has a test case which I am not including
      because it is unlikely to keep exposing this behavior in the future.
      
      llvm-svn: 160739
    • Verify two-address constraints more carefully. · c6fd3dee
      Jakob Stoklund Olesen authored
      Include <undef> operands and virtual registers after leaving SSA form.
      
      llvm-svn: 160734
  5. Jul 24, 2012
  6. Jul 23, 2012
  7. Jul 21, 2012
  8. Jul 20, 2012
    • Avoid folding loads that are unsafe to move. · e2cfd0d4
      Jakob Stoklund Olesen authored
      LiveRangeEdit::foldAsLoad() can eliminate a register by folding a load
      into its only use. Only do that when the load is safe to move, and it
      won't extend any live ranges.
      
      This fixes PR13414.
      
      llvm-svn: 160575
    • Split loop exiting edges more aggressively. · f62c07f1
      Jakob Stoklund Olesen authored
      PHIElimination splits critical edges when it predicts it can resolve
      interference and eliminate copies. It doesn't split the edge if the
      interference wouldn't be resolved because the phi-use register is
      live in the critical edge anyway.
      
      Teach PHIElimination to split loop exiting edges with interference, even
      if it wouldn't resolve the interference. This removes the necessary
      copies from the loop, which is still an improvement over injecting the
      copies into the loop.
      
      The test case demonstrates the improvement. Before:
      
      LBB0_1:
        cmpb  $0, (%rdx)
        leaq  1(%rdx), %rdx
        movl  %esi, %eax
        je  LBB0_1
      
      After:
      
      LBB0_1:
        cmpb  $0, (%rdx)
        leaq  1(%rdx), %rdx
        je  LBB0_1
      
        movl  %esi, %eax
      
      llvm-svn: 160571
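For context, a critical edge runs from a block with several successors to a block with several predecessors, so a copy cannot be placed at either end without affecting another path. The sketch below assumes a CFG stored as a plain dict of successor lists; the block names and helper functions are hypothetical and are not PHIElimination's API. It finds the critical edges and keeps those that leave a given loop, which are the ones this commit splits more aggressively.

```python
def critical_edges(succs):
    """Return edges whose source has >1 successor and target has >1 predecessor."""
    preds = {}
    for block, targets in succs.items():
        for target in targets:
            preds.setdefault(target, []).append(block)
    return [
        (block, target)
        for block, targets in succs.items()
        for target in targets
        if len(targets) > 1 and len(preds[target]) > 1
    ]


def exiting_critical_edges(succs, loop_blocks):
    """Keep only critical edges that go from inside the loop to outside it."""
    return [
        (src, dst)
        for src, dst in critical_edges(succs)
        if src in loop_blocks and dst not in loop_blocks
    ]


# Hypothetical CFG: a single-block loop that exits to a block which is
# also reachable from elsewhere.
cfg = {
    "entry": ["loop"],
    "loop": ["loop", "exit"],
    "other": ["exit"],
    "exit": [],
}
all_critical = critical_edges(cfg)
to_split = exiting_critical_edges(cfg, {"loop"})
```

Splitting the `loop` to `exit` edge creates a new block on that edge, which gives the copy a home outside the loop body.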
  9. Jul 19, 2012
  10. Jul 18, 2012
    • Fix a somewhat nasty crasher in PR13378. · 985454e0
      Chandler Carruth authored
      This crashes inside of LiveIntervals due to the two-addr pass
      generating bogus MI code.
      
      The crux of the issue was a loop nesting problem. The intent of the code
      which attempts to transform instructions before converting them to
      two-addr form is to defer and reprocess any transformed instructions as
      the second processing is likely to have more opportunities to coalesce
      copies, etc. Unfortunately, there was one section of processing that was
      not deferred -- the INSERT_SUBREG rewriting. Due to quirks of how this
      rewriting proceeded, not only did it occur early, it removed the bits of
      information needed for the deferred processing to correctly generate the
      necessary two address form (specifically inserting a copy), but didn't
      trigger any immediate assertions and produced what appeared to be
      already valid two-address form code. Thus, the assertion only fired much
      later in the pipeline.
      
      The fix is to hoist the transformation logic up a layer to where it can
      more firmly defer all further processing, and to teach the normal
      processing to handle an edge case previously handled as part of the
      transformation logic. This edge case (already matched tied register
      operands) needs to *not* defer any steps.
      
      As has been brought up repeatedly in the process: wow does this code
      need refactoring. I *may* squeeze in some time to at least bring sanity
      to this loop... but wow... =]
      
      Thanks to Jakob for helpful hints on the way here, and the review.
      
      llvm-svn: 160443
    • Nuno Lopes's avatar
      2151497d
  11. Jul 17, 2012
  12. Jul 16, 2012
    • Minor cleanup and docs. · 60f7904d
      Nadav Rotem authored
      llvm-svn: 160311
    • Make ComputeDemandedBits return a deterministic result when computing an AssertZext value. · 839a06e9
      Nadav Rotem authored
      In the added testcase the constant 55 was behind an AssertZext of type i1, and ComputeDemandedBits
      reported that some of the bits were both known to be one and known to be zero.
      
      Together with Michael Kuperstein <michael.m.kuperstein@intel.com>
      
      llvm-svn: 160305
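The contradiction described in this commit message can be reproduced with plain bit masks. The sketch below is a standalone model, not LLVM's known-bits code: the helper names are hypothetical, and the 8-bit width is an assumption chosen for illustration (the message does not state the width). It shows only how the conflict arises, not how the commit resolves it.

```python
def known_bits_of_constant(value, width):
    """For a constant, every bit is known: either one or zero."""
    mask = (1 << width) - 1
    known_one = value & mask
    known_zero = ~value & mask
    return known_zero, known_one


def assert_zext_known_zero(from_bits, width):
    """An AssertZext from `from_bits` claims all higher bits are zero."""
    mask = (1 << width) - 1
    return mask & ~((1 << from_bits) - 1)


WIDTH = 8  # assumed width, for illustration only

_, const_one = known_bits_of_constant(55, WIDTH)  # 55 == 0b00110111
zext_zero = assert_zext_known_zero(1, WIDTH)      # i1: bits 1..7 are zero

# Bits claimed to be both known-one and known-zero.
conflict = const_one & zext_zero
```

A non-zero `conflict` mask means the two sources of information disagree, which is the inconsistent state the message describes.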
  13. Jul 15, 2012
    • Fix a bug in the scalarization of BUILD_VECTOR. · 3050e071
      Nadav Rotem authored
      BUILD_VECTOR elements may be wider than the output element type. Make sure to truncate them if needed.
      
      Together with Michael Kuperstein <michael.m.kuperstein@intel.com>
      
      llvm-svn: 160235
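The truncation step mentioned in this commit message amounts to keeping only the low bits of an operand that is wider than the element type. The sketch below models that step alone with integers; the function name `truncate_element` and the example widths are hypothetical and are not taken from the legalizer code.

```python
def truncate_element(value, elt_bits):
    """Keep only the low `elt_bits` bits of a possibly wider operand."""
    return value & ((1 << elt_bits) - 1)


# Assumed example: a 32-bit operand feeding a vector with 8-bit elements.
wide_operand = 0x1234
element = truncate_element(wide_operand, 8)

# An operand that already fits is left unchanged.
unchanged = truncate_element(0x7F, 8)
```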
    • Refactor the code that checks that all operands of a node are UNDEFs. · a62368c9
      Nadav Rotem authored
      Add a micro-optimization to getNode of CONCAT_VECTORS when both operands are undefs.
      Can't find a testcase for this because VECTOR_SHUFFLE already handles undef operands, but Duncan suggested that we add this.
      
      Together with Michael Kuperstein <michael.m.kuperstein@intel.com>
      
      llvm-svn: 160229
    • Reapply r160194, switching to use LV information for finding local kills. · db5536f0
      Chandler Carruth authored
      The notable fix is to look at any dependencies attached to the kill
      instruction (or other instructions between MI and the kill) where the
      dependencies are specific to the register in question.
      
      The old code implicitly handled this by rejecting the transform if *any*
      other uses were found within the block, but after the start point. The
      new code directly finds the kill, and has to re-use the existing
      dependency scan to check for non-kill uses.
      
      This was caught by self-host, but I found the bug via inspection and use
      of absurd assert scaffolding to compute the kills in two ways and
      compare them. So I have no useful testcase for this other than
      "bootstrap". I'd work harder to reduce a test case if this particular
      code were likely to live for a long time.
      
      Thanks to Benjamin Kramer for reviewing the fix itself.
      
      llvm-svn: 160228
  14. Jul 14, 2012