  1. Feb 15, 2012
  2. Feb 13, 2012
  3. Jan 20, 2012
  4. Jan 12, 2012
    • Switch all of the uses of my InsertDAGNode helper to follow the exact · eb21da06
      Chandler Carruth authored
      same pattern. We already had this pattern in a few places, but others
      tried to make a rough approximation of an actual DAG structure. As not
      everywhere went to this trouble, nothing could rely on this being done.
      In fact, I've checked all references to these node Ids, and the ones
      that are using the topo-sort properties are actually satisfied with
      a strict-weak-ordering. The requirement appears to be that Use >= Def.
      
      I've added a big blurb of comments to this bit of the transform to
      clarify why the order is so important for the next reader of the code.
      
      I'm starting with this change as it is very small, and trivially
      reverted if something breaks or the >= above really does need to be >.
      If that proves the case, we can hide the problem by reverting this
      patch, but the problem exists elsewhere as well, and so a more
      comprehensive solution will be needed.
      
      llvm-svn: 148001
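The ordering requirement described above (Use >= Def on node Ids) can be illustrated with a small, self-contained check. This is only a sketch under stated assumptions: the `Node` struct and `idsSatisfyUseGeDef` are hypothetical names invented for illustration and are not part of the SelectionDAG API.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for a DAG node: an Id plus the indices of the
// nodes it uses as operands (its "defs").
struct Node {
  int id;
  std::vector<int> operands; // indices into the enclosing node vector
};

// Returns true if every use has an Id greater than or equal to the Id of
// each node it uses. Equal Ids are allowed, which is the weaker property
// the commit message says the existing users appear to need.
bool idsSatisfyUseGeDef(const std::vector<Node> &nodes) {
  for (const Node &use : nodes)
    for (int opIdx : use.operands)
      if (use.id < nodes[opIdx].id)
        return false;
  return true;
}
```

A newly inserted node that simply copies the Id of the node it feeds satisfies this check, whereas it would fail a strict Use > Def check.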
  5. Jan 11, 2012
  6. Jan 09, 2012
    • Don't rely on the fact that shift values are never very large, and thus · c16622da
      Chandler Carruth authored
      this subtraction will result in small negative numbers at worst which
      become very large positive numbers on assignment and are thus caught by
      the <=4 check on the next line. The >0 check was clearly intended to
      catch these as negative numbers.
      
      Spotted by inspection, and impossible to trigger given the shift widths
      that can be used.
      
      llvm-svn: 147773
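The hazard described above can be reproduced in isolation. The sketch below is not the LLVM code; `inRangeUnsigned` and `inRangeSigned` are hypothetical helpers that only assume the shape of the check mentioned in the message (a difference that must be > 0 and <= 4).

```cpp
#include <cassert>
#include <cstdint>

// Unsigned variant: a "negative" difference wraps to a large positive
// value, so the > 0 test never rejects it. Only the <= 4 test does, and
// only as long as the wrapped value happens to be large.
bool inRangeUnsigned(uint32_t a, uint32_t b) {
  uint32_t diff = a - b; // wraps modulo 2^32 when a < b
  return diff > 0 && diff <= 4;
}

// Signed variant: the difference is computed in a wider signed type, so
// the > 0 test rejects negative differences directly.
bool inRangeSigned(uint32_t a, uint32_t b) {
  int64_t diff = static_cast<int64_t>(a) - static_cast<int64_t>(b);
  return diff > 0 && diff <= 4;
}
```

For small operands the two variants agree, which matches the note that the problem cannot be triggered with real shift widths. They diverge only for extreme inputs such as `a = 3, b = 0xFFFFFFFF`, where the unsigned difference wraps to 4.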
  7. Nov 16, 2011
  8. Nov 15, 2011
  9. Nov 03, 2011
  10. Oct 29, 2011
  11. Oct 28, 2011
    • Reapply r143177 and r143179 (reverting r143188), with scheduler · 73057ad2
      Dan Gohman authored
      fixes: Use a separate register, instead of SP, as the
      calling-convention resource, to avoid spurious conflicts with
      actual uses of SP. Also, fix unscheduling of calling sequences,
      which can be triggered by pseudo-two-address dependencies.
      
      llvm-svn: 143206
    • Speculatively disable Dan's commits 143177 and 143179 to see if · 225a7037
      Duncan Sands authored
      it fixes the dragonegg self-host (it looks like gcc is miscompiled).
      Original commit messages:
      Eliminate LegalizeOps' LegalizedNodes map and have it just call RAUW
      on every node as it legalizes them. This makes it easier to use
      hasOneUse() heuristics, since unneeded nodes can be removed from the
      DAG earlier.
      
      Make LegalizeOps visit the DAG in an operands-last order. It previously
      used operands-first, because LegalizeTypes has to go operands-first, and
      LegalizeTypes used to be part of LegalizeOps, but they're now split.
      The operands-last order is more natural for several legalization tasks.
      For example, it allows lowering code for nodes with floating-point or
      vector constants to see those constants directly instead of seeing the
      lowered form (often constant-pool loads). This makes some things
      somewhat more complicated today, though it ought to allow things to be
      simpler in the future. It also fixes some bugs exposed by Legalizing
      using RAUW aggressively.
      
      Remove the part of LegalizeOps that attempted to patch up invalid chain
      operands on libcalls generated by LegalizeTypes, since it doesn't work
      with the new LegalizeOps traversal order. Instead, define what
      LegalizeTypes is doing to be correct, and transfer the responsibility
      of keeping calls from having overlapping calling sequences into the
      scheduler.
      
      Teach the scheduler to model callseq_begin/end pairs as having a
      physical register definition/use to prevent calls from having
      overlapping calling sequences. This is also somewhat complicated, though
      there are ways it might be simplified in the future.
      
      This addresses rdar://9816668, rdar://10043614, rdar://8434668, and others.
      Please direct high-level questions about this patch to management.
      
      Delete #if 0 code accidentally left in.
      
      llvm-svn: 143188
    • Eliminate LegalizeOps' LegalizedNodes map and have it just call RAUW · 4db3f7dd
      Dan Gohman authored
      on every node as it legalizes them. This makes it easier to use
      hasOneUse() heuristics, since unneeded nodes can be removed from the
      DAG earlier.
      
      Make LegalizeOps visit the DAG in an operands-last order. It previously
      used operands-first, because LegalizeTypes has to go operands-first, and
      LegalizeTypes used to be part of LegalizeOps, but they're now split.
      The operands-last order is more natural for several legalization tasks.
      For example, it allows lowering code for nodes with floating-point or
      vector constants to see those constants directly instead of seeing the
      lowered form (often constant-pool loads). This makes some things
      somewhat more complicated today, though it ought to allow things to be
      simpler in the future. It also fixes some bugs exposed by Legalizing
      using RAUW aggressively.
      
      Remove the part of LegalizeOps that attempted to patch up invalid chain
      operands on libcalls generated by LegalizeTypes, since it doesn't work
      with the new LegalizeOps traversal order. Instead, define what
      LegalizeTypes is doing to be correct, and transfer the responsibility
      of keeping calls from having overlapping calling sequences into the
      scheduler.
      
      Teach the scheduler to model callseq_begin/end pairs as having a
      physical register definition/use to prevent calls from having
      overlapping calling sequences. This is also somewhat complicated, though
      there are ways it might be simplified in the future.
      
      This addresses rdar://9816668, rdar://10043614, rdar://8434668, and others.
      Please direct high-level questions about this patch to management.
      
      llvm-svn: 143177
  12. Oct 08, 2011
    • Add TEST8ri_NOREX pseudo to constrain sub_8bit_hi copies. · 729abd36
      Jakob Stoklund Olesen authored
      In 64-bit mode, sub_8bit_hi sub-registers can only be used by NOREX
      instructions. The COPY created from the EXTRACT_SUBREG DAG node cannot
      target all GR8 registers, only those in GR8_NOREX.
      
      To enforce this, we ensure that all instructions using the
      EXTRACT_SUBREG are GR8_NOREX constrained.
      
      This fixes PR11088.
      
      llvm-svn: 141499
  13. Aug 01, 2011
  14. Jul 13, 2011
  15. Jul 02, 2011
  16. Jun 30, 2011
  17. May 20, 2011
  18. May 17, 2011
  19. May 11, 2011
  20. Apr 23, 2011
  21. Apr 22, 2011
    • X86: Try to use a smaller encoding by transforming (X << C1) & C2 into (X &... · 4c816247
      Benjamin Kramer authored
      X86: Try to use a smaller encoding by transforming (X << C1) & C2 into (X & (C2 >> C1)) << C1. (Part of PR5039)
      
      This tends to happen a lot with bitfield code generated by clang. A simple example for x86_64 is
      uint64_t foo(uint64_t x) { return (x&1) << 42; }
      which used to compile into bloated code:
      	shlq	$42, %rdi               ## encoding: [0x48,0xc1,0xe7,0x2a]
      	movabsq	$4398046511104, %rax    ## encoding: [0x48,0xb8,0x00,0x00,0x00,0x00,0x00,0x04,0x00,0x00]
      	andq	%rdi, %rax              ## encoding: [0x48,0x21,0xf8]
      	ret                             ## encoding: [0xc3]
      
      With this patch we can fold the immediate into the and:
      	andq	$1, %rdi                ## encoding: [0x48,0x83,0xe7,0x01]
      	movq	%rdi, %rax              ## encoding: [0x48,0x89,0xf8]
      	shlq	$42, %rax               ## encoding: [0x48,0xc1,0xe0,0x2a]
      	ret                             ## encoding: [0xc3]
      
      It's possible to save another byte by using 'andl' instead of 'andq' but I currently see no way of doing
      that without making this code even more complicated. See the TODOs in the code.
      
      llvm-svn: 129990
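The rewrite in this commit relies on masking before the shift giving the same result as masking after it, once the mask is shifted down by the same amount. A minimal sketch of that equivalence on 64-bit values follows; the function names are made up for illustration, and `c1` is assumed to be less than 64.

```cpp
#include <cassert>
#include <cstdint>

// Original form: shift first, then mask with the (possibly large) constant.
uint64_t shiftThenMask(uint64_t x, unsigned c1, uint64_t c2) {
  return (x << c1) & c2;
}

// Rewritten form: mask with the shifted-down constant, then shift.
// The low c1 bits of (x << c1) are always zero, so dropping the low c1
// bits of c2 does not change the result.
uint64_t maskThenShift(uint64_t x, unsigned c1, uint64_t c2) {
  return (x & (c2 >> c1)) << c1;
}
```

With `c1 = 42` and `c2 = 4398046511104` (that is, `1 << 42`), the shifted-down mask is just 1, which fits in an 8-bit immediate instead of requiring a 64-bit `movabsq`.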
  22. Feb 16, 2011
  23. Feb 13, 2011
    • Enhance ComputeMaskedBits to know that aligned frameindexes · 46c01a30
      Chris Lattner authored
      have their low bits set to zero.  This allows us to optimize
      out explicit stack alignment code like in stack-align.ll:test4 when
      it is redundant.
      
      Doing this causes the code generator to start turning FI+cst into
      FI|cst all over the place, which is general goodness (that is the
      canonical form) except that various pieces of the code generator
      don't handle OR aggressively.  Fix this by introducing a new
      SelectionDAG::isBaseWithConstantOffset predicate, and using it
      in places that are looking for ADD(X,CST).  The ARM backend in
      particular was missing a lot of addressing mode folding opportunities
      around OR.
      
      llvm-svn: 125470
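The reason FI+cst can become FI|cst is that adding and OR-ing agree whenever the two operands share no set bits, which holds when the base is aligned and the constant is smaller than the alignment. The sketch below shows that arithmetic fact only; `lowBitsAreZero` and `addEqualsOr` are hypothetical helpers and do not reproduce `SelectionDAG::isBaseWithConstantOffset`.

```cpp
#include <cassert>
#include <cstdint>

// True if the low log2Align bits of addr are zero, i.e. addr is aligned
// to 2^log2Align bytes. Assumes log2Align < 64.
bool lowBitsAreZero(uint64_t addr, unsigned log2Align) {
  uint64_t mask = (uint64_t{1} << log2Align) - 1;
  return (addr & mask) == 0;
}

// True if base and offset have no set bits in common. In that case no
// carries occur, so base + offset == (base | offset).
bool addEqualsOr(uint64_t base, uint64_t offset) {
  return (base & offset) == 0;
}
```

For a 16-byte-aligned base such as `0x1000` and an offset of 8, both `0x1000 + 8` and `0x1000 | 8` give `0x1008`. Code that only pattern-matches ADD(X,CST) would miss the OR form, which is the gap the new predicate is described as closing.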
  24. Jan 27, 2011
  25. Jan 16, 2011
  26. Jan 14, 2011