  1. Mar 06, 2013
    • [mips] Custom-legalize BR_JT. · 0f693a8a
      Akira Hatanaka authored
      In N64 static mode, the GOT address is needed to compute the branch
      address.
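
      For illustration only (this example and the function name are invented,
      not from the patch): a dense switch like the one below typically lowers
      to a jump table (ISD::BR_JT), and it is the branch through that table
      whose address computation needs the GOT under N64 static:

      int classify(int v) {
        switch (v) { /* dense cases usually become a jump table (BR_JT) */
        case 0: return 10;
        case 1: return 21;
        case 2: return 33;
        case 3: return 47;
        case 4: return 52;
        case 5: return 68;
        default: return -1;
        }
      }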
      
      llvm-svn: 176580
    • Fix PR15355 · da22b30b
      Michael Liao authored
      - Clear the 'mayStore' flag on the load from the atomic variable emitted
        before the spin loop.
      - Clear the kill flag from registers forming the address of that atomic
        variable, since they go from having one use to having multiple uses.
      - Don't use a physical register as a live-in register in a basic block
        that is neither the entry block nor a landing pad; copy it into a
        virtual register instead.
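
      A minimal C sketch (not the PR's exact reproducer) of the kind of
      operation involved: atomic nand has no single x86 instruction, so the
      backend expands it into a load of the atomic variable followed by a
      cmpxchg spin loop, the expansion whose flags are fixed here.

      #include <stdio.h>

      int main(void) {
        int x = 0xF0;
        /* __atomic_fetch_nand is a GCC/Clang builtin; it becomes an
           'atomicrmw nand' in LLVM IR and is lowered on x86 as a load
           plus a cmpxchg spin loop. */
        int old = __atomic_fetch_nand(&x, 0xFF, __ATOMIC_SEQ_CST);
        printf("old=%#x new=%#x\n", (unsigned)old, (unsigned)x);
        return 0;
      }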
      
      (patch by Cameron Zwarich)
      
      llvm-svn: 176538
    • [mips] Remove android calling convention. · 1454ed8a
      Akira Hatanaka authored
      This calling convention was added just to handle functions which return
      a vector of floats. The fix committed in r165585 solves the problem.
      
      llvm-svn: 176530
  2. Mar 02, 2013
    • ARM: Creating a vector from a lane of another. · a3c5c769
      Jim Grosbach authored
      The VDUP instruction's source register doesn't allow a non-constant lane
      index, so make sure we don't construct an ARM::VDUPLANE node asking it
      to do so.
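
      At the intrinsic level (an illustration, not from the patch; the
      function name is invented), the lane index is likewise required to be a
      compile-time constant, mirroring the instruction:

      #include <arm_neon.h>

      /* vdupq_lane_f32 duplicates one lane of a 64-bit vector across a
         128-bit vector; the lane index must be an immediate, since
         VDUP (lane) has no runtime-index form. */
      float32x4_t splat_lane1(float32x2_t v) {
        return vdupq_lane_f32(v, 1);
      }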
      
      rdar://13328063
      http://llvm.org/bugs/show_bug.cgi?id=13963
      
      llvm-svn: 176413
    • Clean up code format a bit. · c6f1914e
      Jim Grosbach authored
      llvm-svn: 176412
    • Tidy up. Trailing whitespace. · 54efea0a
      Jim Grosbach authored
      llvm-svn: 176411
    • ARM NEON: Fix v2f32 float intrinsics · 99cba969
      Arnold Schwaighofer authored
      Mark them as Expand; they are not legal, as our backend does not match
      them.
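
      For intuition, a C analogue of what Expand means here (illustrative
      only, not the patch itself): with no legal v2f32 node, legalization
      scalarizes the operation into per-lane libcalls, e.g. a v2f32 fsin
      becomes two sinf calls.

      #include <math.h>

      typedef struct { float v[2]; } v2f32;

      /* roughly the effect of marking FSIN as Expand for v2f32 */
      v2f32 fsin_expanded(v2f32 a) {
        v2f32 r;
        r.v[0] = sinf(a.v[0]);
        r.v[1] = sinf(a.v[1]);
        return r;
      }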
      
      llvm-svn: 176410
    • X86 cost model: Adjust cost for custom lowered vector multiplies · 20ef54f4
      Arnold Schwaighofer authored
      This matters, for example, in the following matrix multiply:
      
      int **mmult(int rows, int cols, int **m1, int **m2, int **m3) {
        int i, j, k, val;
        for (i = 0; i < rows; i++) {
          for (j = 0; j < cols; j++) {
            val = 0;
            for (k = 0; k < cols; k++) {
              val += m1[i][k] * m2[k][j];
            }
            m3[i][j] = val;
          }
        }
        return m3;
      }
      
      Taken from the test-suite benchmark Shootout.
      
      We estimate the cost of the multiply to be 2, while we actually generate
      9 instructions for it and end up quite a bit slower than the scalar
      version (48% on my machine).
      
      Also, properly differentiate between AVX1 and AVX2. On AVX1 we still
      split the vector into two 128-bit halves and handle the subvector
      multiplies as above, with 9 instructions each.
      Only on AVX2 do we get a cost of 9 for v4i64.
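
      Purely for illustration, a sketch of the shape of such a cost query
      (the function name and numbers are invented for this example, not
      LLVM's actual tables):

      /* AVX1 must split a 256-bit integer multiply into two 128-bit halves,
         so it pays the 9-instruction lowering twice; only AVX2 gets the
         single 9-instruction path for types like v4i64. */
      unsigned vector_mul_cost(int has_avx2, int is_256bit_int) {
        const unsigned lowered_128 = 9; /* 9-instruction custom lowering */
        if (!is_256bit_int)
          return lowered_128;
        return has_avx2 ? lowered_128 : 2 * lowered_128;
      }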
      
      I changed the test case in test/Transforms/LoopVectorize/X86/avx1.ll to
      use an add instead of a mul because with a mul we now no longer
      vectorize. I did verify that the mul would indeed be more expensive when
      vectorized, using 3 kernels:

      for (i ...)
        r += a[i] * 3;
      for (i ...)
        m1[i] = m1[i] * 3; // This matches the test case in avx1.ll

      and a matrix multiply.
      
      In each case the vectorized version was considerably slower.
      
      radar://13304919
      
      llvm-svn: 176403
    • Added FIXME for future Hexagon cleanup. · 63474629
      Andrew Trick authored
      llvm-svn: 176400
  3. Mar 01, 2013
    • [mips] Fix inefficient code generation. · ece459bb
      Akira Hatanaka authored
      This patch eliminates the need to emit a constant move instruction when this
      pattern is matched:
      
      (select (setgt a, Constant), T, F)
      
      The pattern above effectively turns into this:
      
      (conditional-move (setlt a, Constant + 1), F, T)
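
      The equivalence in C form (an illustration; the function name and the
      constants 5 and 6 are invented, and the rewrite assumes Constant + 1
      does not overflow). MIPS has an immediate set-less-than (slti) but no
      immediate set-greater-than, so the rewritten compare avoids
      materializing the constant in a register:

      int pick(int a, int t, int f) {
        /* same value as (a > 5) ? t : f, since a > 5  <=>  !(a < 6) */
        return (a < 6) ? f : t;
      }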
      
      llvm-svn: 176384
    • Fix indentation. · a4c03415
      Akira Hatanaka authored
      llvm-svn: 176380
    • Fix PR10475 · 6af16fc3
      Michael Liao authored
      - ISD::SHL/SRL/SRA must have either both scalar or both vector operands,
        but TLI.getShiftAmountTy() so far only returns a scalar type. As a
        result, backend logic assuming that invariant breaks.
      - Rename the original TLI.getShiftAmountTy() to
        TLI.getScalarShiftAmountTy() and re-define TLI.getShiftAmountTy() to
        return the target-specific scalar type or the same vector type as the
        1st operand; a minimal sketch of the split follows this list.
      - Fix most target-independent code generation (TICG) logic that assumed
        TLI.getShiftAmountTy() returns a simple scalar type.
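
      A minimal sketch of the new split (an assumed shape, not the verbatim
      patch; the struct below is a toy stand-in for LLVM's EVT):

      typedef struct { int is_vector; } EVT;

      /* the old behavior under its new name: always a scalar type */
      static EVT getScalarShiftAmountTy(EVT lhs_ty) {
        EVT scalar = { 0 };
        (void)lhs_ty;
        return scalar;
      }

      /* re-defined query: a vector shift gets the 1st operand's vector type */
      static EVT getShiftAmountTy(EVT lhs_ty) {
        return lhs_ty.is_vector ? lhs_ty : getScalarShiftAmountTy(lhs_ty);
      }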
      
      llvm-svn: 176364
    • Add support for using non-pic code for arm and thumb1 when emitting the sjlj · 9660343b
      Chad Rosier authored
      dispatch code. As far as I can tell, the thumb2 code is behaving as
      expected. I was able to compile and run the associated test case for
      both arm and thumb1.
      rdar://13066352
      
      llvm-svn: 176363