  1. Aug 07, 2013
  2. Aug 06, 2013
    • Tim Northover
      Refactor isInTailCallPosition handling · a4415854
      This change came about primarily because of two issues in the existing code.
      Neither of:
      
      define i64 @test1(i64 %val) {
        %in = trunc i64 %val to i32
        tail call i32 @ret32(i32 returned %in)
        ret i64 %val
      }
      
      define i32 @test2(i64 %val) {
        tail call i32 @ret32(i32 returned undef)
        ret i32 42
      }
      
      should be tail calls, and the function sameNoopInput is responsible. The main
      problem is that it is completely symmetric in the "tail call" and "ret" value,
      but in reality different things are allowed on each side.
      
      For these cases:
      1. Any truncation should lead to a larger value being generated by "tail call"
         than needed by "ret".
      2. Undef should only be allowed as a source for ret, not as a result of the
         call.
      
      Along the way I noticed that a mismatch between what this function treats as a
      valid truncation and what the backends see can lead to invalid calls as well
      (see x86-32 test case).
      
      This patch refactors the code so that instead of being based primarily on
      values which it recurses into when necessary, it starts by inspecting the type
      and considers each fundamental slot that the backend will see in turn. For
      example, given a pathological function that returned {{}, {{}, i32, {}}, i32}
      we would consider each "real" i32 in turn, and ask if it passes through
      unchanged. This is much closer to what the backend sees as a result of
      ComputeValueVTs.
      
      Aside from the bug fixes, this eliminates the recursion that's going on and, I
      believe, makes the bulk of the code significantly easier to understand. The
      trade-off is the nasty iterators needed to find the real types inside a
      returned value.
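      The flattening described above can be sketched as follows. This is a toy model, not the actual ComputeValueVTs code: the `Ty` struct and `flatten` function are illustrative names, standing in for LLVM's type hierarchy and value-type expansion.

      ```cpp
      #include <cassert>
      #include <string>
      #include <vector>

      // Toy stand-in for an LLVM type: either a scalar leaf (named, e.g. "i32")
      // or a struct of member types. Flattening yields the ordered list of
      // "real" fundamental slots the backend will see.
      struct Ty {
        std::string Scalar;      // non-empty => leaf type
        std::vector<Ty> Members; // used when Scalar is empty (a struct)
      };

      static void flatten(const Ty &T, std::vector<std::string> &Slots) {
        if (!T.Scalar.empty()) {
          Slots.push_back(T.Scalar);
          return;
        }
        for (const Ty &M : T.Members)
          flatten(M, Slots); // empty structs contribute no slots
      }

      int main() {
        // The pathological {{}, {{}, i32, {}}, i32} from the message above.
        Ty Empty{};
        Ty I32{"i32", {}};
        Ty Inner{"", {Empty, I32, Empty}};
        Ty Outer{"", {Empty, Inner, I32}};

        std::vector<std::string> Slots;
        flatten(Outer, Slots);
        // Only the two "real" i32 slots survive; each is then checked in turn.
        assert(Slots.size() == 2 && Slots[0] == "i32" && Slots[1] == "i32");
        return 0;
      }
      ```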
      
      llvm-svn: 187787
    • NAKAMURA Takumi
      e359e856
    • Eric Christopher
      Recommit previous cleanup with a fix for c++98 ambiguity. · 0062f2ed
      llvm-svn: 187752
    • Tom Stellard
      TargetLowering: Add getVectorIdxTy() function v2 · d42c5949
      This virtual function can be implemented by targets to specify the type
      to use for the index operand of INSERT_VECTOR_ELT, EXTRACT_VECTOR_ELT,
      INSERT_SUBVECTOR, EXTRACT_SUBVECTOR.  The default implementation returns
      the result from TargetLowering::getPointerTy().
      
      The previous code was using TargetLowering::getPointerTy() for vector
      indices, because this is guaranteed to be legal on all targets.  However,
      using TargetLowering::getPointerTy() can be a problem for targets with
      pointer sizes that differ across address spaces.  On such targets,
      when vectors need to be loaded or stored to an address space other than the
      default 'zero' address space (which is the address space assumed by
      TargetLowering::getPointerTy()), having an index that
      is a different size than the pointer can lead to inefficient
      pointer calculations (e.g. 64-bit adds for a 32-bit address space).
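      A minimal sketch of the hook's shape, using a toy class rather than the real TargetLowering API (the names `ToyTargetLowering`, `getVectorIdxWidth`, and the bit-width return type are all illustrative simplifications):

      ```cpp
      #include <cassert>

      // Toy model: a target picks the integer width used for vector-index
      // operands. The base-class default falls back to the pointer width,
      // mirroring the commit's default of TargetLowering::getPointerTy().
      struct ToyTargetLowering {
        virtual ~ToyTargetLowering() = default;
        virtual unsigned getPointerSizeInBits() const { return 64; }
        virtual unsigned getVectorIdxWidth() const {
          return getPointerSizeInBits(); // default: same as pointer type
        }
      };

      // A hypothetical target whose non-default address spaces use 32-bit
      // pointers overrides the hook so index arithmetic stays 32-bit.
      struct NarrowAddrSpaceTarget : ToyTargetLowering {
        unsigned getVectorIdxWidth() const override { return 32; }
      };

      int main() {
        ToyTargetLowering Default;
        NarrowAddrSpaceTarget Narrow;
        assert(Default.getVectorIdxWidth() == 64); // unchanged behavior
        assert(Narrow.getVectorIdxWidth() == 32);  // avoids 64-bit adds
        return 0;
      }
      ```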
      
      There is no intended functionality change with this patch.
      
      llvm-svn: 187748
    • Eric Christopher
      Revert "Use existing builtin hashing functions to make this routine more" · 432c99af
      This reverts commit r187745.
      
      llvm-svn: 187747
    • Eric Christopher
      Use existing builtin hashing functions to make this routine more simple. · d728355a
      
      llvm-svn: 187745
  3. Aug 05, 2013
  4. Aug 02, 2013
  5. Aug 01, 2013
  6. Jul 31, 2013
    • Eric Christopher
      Fix crashing on invalid inline asm with matching constraints. · e6656ac8
      For a testcase like the following:
      
       typedef unsigned long uint64_t;
      
       typedef struct {
         uint64_t lo;
         uint64_t hi;
       } blob128_t;
      
       void add_128_to_128(const blob128_t *in, blob128_t *res) {
         asm ("PAND %1, %0" : "+Q"(*res) : "Q"(*in));
       }
      
      where we'll fail to allocate the register for the output constraint,
      our matching input constraint will not find a register to match,
      and could try to search past the end of the current operands array.
      
      On the idea that we'd like to attempt to keep compilation going
      to find more errors in the module, change the error cases when
      we're visiting inline asm IR to return immediately and avoid
      trying to create a node in the DAG. This leaves us with only
      a single error message per inline asm instruction, but allows us
      to safely keep going in the general case.
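      The error-path pattern this describes can be sketched in miniature; this is a toy illustration, not the actual SelectionDAG visitor (the `Operand` struct and `matchInputToOutput` function are invented for the example):

      ```cpp
      #include <cassert>
      #include <string>
      #include <vector>

      struct Operand { std::string Constraint; bool HasRegister; };

      // Returns true on success. On failure it reports one diagnostic and
      // returns immediately -- the behavior the fix adopts -- instead of
      // building a node or indexing past the end of the operand array.
      static bool matchInputToOutput(const std::vector<Operand> &Ops,
                                     size_t MatchedIdx, std::string &Error) {
        if (MatchedIdx >= Ops.size()) { // would have walked off the array
          Error = "invalid matching constraint";
          return false;
        }
        if (!Ops[MatchedIdx].HasRegister) { // output never got a register
          Error = "could not allocate register for constraint";
          return false; // single error message, no DAG node created
        }
        return true;
      }

      int main() {
        std::vector<Operand> Ops = {{"+Q", /*HasRegister=*/false}};
        std::string Error;
        assert(!matchInputToOutput(Ops, 0, Error)); // unallocated output
        assert(!Error.empty());
        assert(!matchInputToOutput(Ops, 1, Error)); // out-of-range handled
        return 0;
      }
      ```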
      
      llvm-svn: 187470
    • Eric Christopher
      Reflow this to be easier to read. · 029af150
      llvm-svn: 187459
  7. Jul 30, 2013
  8. Jul 29, 2013
    • Nico Rieck
      Use proper section suffix for COFF weak symbols · 7fdaee8f
      32-bit symbols have "_" as global prefix, but when forming the name of
      COMDAT sections this prefix is ignored. The current behavior assumes that
      this prefix is always present which is not the case for 64-bit and names
      are truncated.
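      The idea behind the fix can be sketched as follows; this is an illustrative toy (the `comdatSuffix` helper is invented here), not the actual MC code:

      ```cpp
      #include <cassert>
      #include <string>

      // Form the COMDAT section suffix for a weak symbol: drop the "_" global
      // prefix only when the target actually uses one (32-bit COFF), instead
      // of unconditionally stripping the first character.
      static std::string comdatSuffix(const std::string &Sym,
                                      bool HasUnderscorePrefix) {
        if (HasUnderscorePrefix && !Sym.empty() && Sym[0] == '_')
          return Sym.substr(1);
        return Sym; // 64-bit COFF: no prefix, keep the full name
      }

      int main() {
        // 32-bit: "_foo" -> section suffix "foo".
        assert(comdatSuffix("_foo", /*HasUnderscorePrefix=*/true) == "foo");
        // 64-bit: "foo" stays "foo"; the old code would have truncated it.
        assert(comdatSuffix("foo", /*HasUnderscorePrefix=*/false) == "foo");
        return 0;
      }
      ```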
      
      llvm-svn: 187356
  9. Jul 27, 2013
  10. Jul 26, 2013
  11. Jul 25, 2013
    • Andrew Trick
      RegAllocGreedy comment. · f4b1ee34
      llvm-svn: 187141
    • Andrew Trick
      Evict local live ranges if they can be reassigned. · 8bb0a251
      The previous change to local live range allocation also suppressed
      eviction of local ranges. In rare cases, this could result in more
      expensive register choices. This commit actually revives a feature
      that I added long ago: check if live ranges can be reassigned before
      eviction. But now it only happens in rare cases of evicting a local
      live range because another local live range wants a cheaper register.
      
      The benefit is improved code size for some benchmarks on x86 and armv7.
      
      I measured no significant compile time increase and performance
      changes are noise.
      
      llvm-svn: 187140
    • Andrew Trick
      Allocate local registers in order for optimal coloring. · 8485257d
      Also avoid locals evicting locals just because they want a cheaper register.
      
      Problem: MI Sched knows exactly how many registers we have and assumes
      they can be colored. In cases where we have large blocks, usually from
      unrolled loops, greedy coloring fails. This is a source of
      "regressions" from the MI Scheduler on x86. I noticed this issue on
      x86 where we have long chains of two-address defs in the same live
      range. It's easy to see this in matrix multiplication benchmarks like
      IRSmk and even the unit test misched-matmul.ll.
      
      A fundamental difference between the LLVM register allocator and
      conventional graph coloring is that in our model a live range can't
      discover its neighbors, it can only verify its neighbors. That's why
      we initially went for greedy coloring and added eviction to deal with
      the hard cases. However, for singly defined and two-address live
      ranges, we can optimally color without visiting neighbors simply by
      processing the live ranges in instruction order.
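      The in-order idea can be sketched with a linear-scan-flavored toy; this is not the actual RegAllocGreedy implementation, just an illustration of why processing ranges by start point colors simple interval structures greedily:

      ```cpp
      #include <algorithm>
      #include <cassert>
      #include <climits>
      #include <vector>

      struct LiveRange { int Start, End, Reg = -1; };

      // Assign registers in instruction (start) order, taking the lowest
      // register that is free at each range's start; returns spill count.
      static int allocateInOrder(std::vector<LiveRange> &Ranges, int NumRegs) {
        std::sort(Ranges.begin(), Ranges.end(),
                  [](const LiveRange &A, const LiveRange &B) {
                    return A.Start < B.Start;
                  });
        std::vector<int> FreeAt(NumRegs, INT_MIN); // point each reg frees up
        int Spills = 0;
        for (LiveRange &LR : Ranges) {
          int Found = -1;
          for (int R = 0; R < NumRegs; ++R)
            if (FreeAt[R] <= LR.Start) { Found = R; break; }
          if (Found < 0) { ++Spills; continue; } // no register available
          LR.Reg = Found;
          FreeAt[Found] = LR.End;
        }
        return Spills;
      }

      int main() {
        std::vector<LiveRange> Ranges = {{0, 4}, {1, 3}, {4, 6}, {2, 5}};
        assert(allocateInOrder(Ranges, 3) == 0); // three registers suffice
        std::vector<LiveRange> Tight = {{0, 4}, {1, 3}, {2, 5}};
        assert(allocateInOrder(Tight, 2) == 1);  // one range cannot be colored
        return 0;
      }
      ```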
      
      Other beneficial side effects:
      
      It is much easier to understand and debug regalloc for large blocks
      when the live ranges are allocated in order. Yes, global allocation is
      still very confusing, but it's nice to be able to comprehend what
      happened locally.
      
      Heuristics could be added to bias register assignment based on
      instruction locality (think late register pairing, banks...).
      
      Intuitively this will make some test cases that are on the threshold
      of register pressure more stable.
      
      llvm-svn: 187139
    • Adrian Prantl
      typo. · e4daf52a
      llvm-svn: 187135
    • Andrew Trick
      MI Sched: Register pressure heuristics. · 401b6959
      Consider which set is being increased or decreased before comparing.
      
      llvm-svn: 187110