  1. Oct 18, 2016
    • [X86][SSE] Add lowering to cvttpd2dq/cvttps2dq for fptosi v2f64/2f32 to 2i32 · 4ddc92b6
      Simon Pilgrim authored
      As discussed on PR28461 we currently miss the chance to lower "fptosi <2 x double> %arg to <2 x i32>" to cvttpd2dq due to its use of illegal types.
      
      This patch adds support for fptosi to 2i32 from both 2f64 and 2f32.
      
      It also recognises that cvttpd2dq zeroes the upper 64 bits of the xmm result (similar to D23797). We still don't do this for the cvttpd2dq/cvttps2dq intrinsics; that can be done in a future patch.
      
      Differential Revision: https://reviews.llvm.org/D23808
      
      llvm-svn: 284459
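      A minimal IR sketch of the pattern in question (illustrative only; the function name is invented and this is not taken from the commit or its tests):

        ; With this patch, an SSE2 target can select cvttpd2dq for this
        ; cast instead of scalarizing through illegal types; the upper
        ; 64 bits of the xmm result are known to be zeroed.
        define <2 x i32> @fptosi_2f64_2i32(<2 x double> %a) {
          %r = fptosi <2 x double> %a to <2 x i32>
          ret <2 x i32> %r
        }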
  2. Aug 05, 2016
    • [LV, X86] Be more optimistic about vectorizing shifts. · 3ceac2bb
      Michael Kuperstein authored
      Shifts with a uniform but non-constant count were considered very expensive to
      vectorize, because the splat of the uniform count and the shift would tend to
      appear in different blocks. That made the splat invisible to ISel, and we'd
      scalarize the shift at codegen time.
      
      Since r201655, CodeGenPrepare sinks those splats to be next to their use, and we
      are able to select the appropriate vector shifts. This updates the cost model
      to take this into account by making shifts by a uniform value cheap again.
      
      Differential Revision: https://reviews.llvm.org/D23049
      
      llvm-svn: 277782
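      For illustration (a hypothetical example, not from the commit), this is the shape of a shift by a uniform but non-constant count; the splat and the shift may start in different blocks, but CodeGenPrepare now sinks the splat next to its use so ISel can emit a single vector shift:

        ; Sketch: %amt is splatted to all lanes, then used as a uniform
        ; shift count. Post-r201655 this selects one vector shift
        ; instruction rather than four scalar shifts.
        define <4 x i32> @shl_uniform(<4 x i32> %v, i32 %amt) {
          %ins = insertelement <4 x i32> undef, i32 %amt, i32 0
          %splat = shufflevector <4 x i32> %ins, <4 x i32> undef, <4 x i32> zeroinitializer
          %r = shl <4 x i32> %v, %splat
          ret <4 x i32> %r
        }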
  3. Jul 06, 2016
    • [x86] fix cost of SINT_TO_FP for i32 --> float (PR21356, PR28434) · 04b3496d
      Sanjay Patel authored
      This is "cvtdq2ps" which does not appear to be particularly slow on any CPU
      according to Agner's tables. Choosing "5" as a cost here as suggested in:
      https://llvm.org/bugs/show_bug.cgi?id=21356
      ...but it seems very conservative given that the instruction is fully pipelined,
      and I think these costs are supposed to model throughput.
      
      Note that related costs are also most likely too high, but this fixes PR21356
      and partly fixes PR28434.
      
      llvm-svn: 274658
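      The cast being costed, as a sketch (the function name is invented for illustration):

        ; With SSE2 this is a single cvtdq2ps, which is fully pipelined,
        ; so even a cost of 5 is probably conservative for throughput.
        define <4 x float> @sitofp_4i32(<4 x i32> %v) {
          %r = sitofp <4 x i32> %v to <4 x float>
          ret <4 x float> %r
        }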
    • [TTI] The cost model should not assume vector casts get completely scalarized · aa71bdd3
      Michael Kuperstein authored
      The cost model should not assume vector casts get completely scalarized, since
      on targets that have vector support, the common case is a partial split up to
      the legal vector size. So, when a vector cast gets split, the resulting casts
      end up legal and cheap.
      
      Instead of pessimistically assuming scalarization, base TTI can use the costs
      the concrete TTI provides for the split vector, plus a fudge factor to account
      for the cost of the split itself. This fudge factor is currently 1 by default,
      except on AMDGPU where inserts and extracts are considered free.
      
      Differential Revision: http://reviews.llvm.org/D21251
      
      llvm-svn: 274642
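      A sketch of the split case (illustrative only, assuming a target whose widest legal vector is 128 bits):

        ; Legalization splits this into two legal <4 x i32> -> <4 x float>
        ; casts, so the cost should be about twice the legal-cast cost
        ; plus a small fudge factor for the split, not eight scalar casts.
        define <8 x float> @sitofp_8i32(<8 x i32> %v) {
          %r = sitofp <8 x i32> %v to <8 x float>
          ret <8 x float> %r
        }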
  4. Jun 28, 2016
    • Support arbitrary addrspace pointers in masked load/store intrinsics · 7ad95ec2
      Artur Pilipenko authored
      This is a resubmission of the 263158 change after fixing the existing problem with intrinsics mangling (see the "LTO and intrinsics mangling" llvm-dev thread for details).
      
      This patch fixes the problem which occurs when loop-vectorize tries to use the @llvm.masked.load/store intrinsics for a non-default addrspace pointer. It fails with a "Calling a function with a bad signature!" assertion in the CallInst constructor because it tries to pass a non-default addrspace pointer to the pointer argument, which has the default addrspace.
      
      The fix is to add pointer type as another overloaded type to @llvm.masked.load/store intrinsics.
      
      Reviewed By: reames
      
      Differential Revision: http://reviews.llvm.org/D17270
      
      llvm-svn: 274043
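      A sketch of the overloaded form this enables, using the typed-pointer IR syntax and intrinsic-name mangling of this era (illustrative; the function name, alignment, and passthru are invented):

        ; The p1v4i32 suffix mangles the addrspace(1) pointer type into
        ; the intrinsic name, so non-default address spaces now work.
        declare <4 x i32> @llvm.masked.load.v4i32.p1v4i32(<4 x i32> addrspace(1)*, i32, <4 x i1>, <4 x i32>)

        define <4 x i32> @load_as1(<4 x i32> addrspace(1)* %p, <4 x i1> %m) {
          %r = call <4 x i32> @llvm.masked.load.v4i32.p1v4i32(<4 x i32> addrspace(1)* %p, i32 4, <4 x i1> %m, <4 x i32> zeroinitializer)
          ret <4 x i32> %r
        }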
  5. Jun 27, 2016
    • Revert -r273892 "Support arbitrary addrspace pointers in masked load/store... · 72f76b88
      Artur Pilipenko authored
      Revert -r273892 "Support arbitrary addrspace pointers in masked load/store intrinsics" since some of the clang tests don't expect to see the updated signatures. 
      
      llvm-svn: 273895
    • Support arbitrary addrspace pointers in masked load/store intrinsics · a36aa415
      Artur Pilipenko authored
      This is a resubmission of the 263158 change after fixing the existing problem with intrinsics mangling (see the "LTO and intrinsics mangling" llvm-dev thread for details).
      
      This patch fixes the problem which occurs when loop-vectorize tries to use the @llvm.masked.load/store intrinsics for a non-default addrspace pointer. It fails with a "Calling a function with a bad signature!" assertion in the CallInst constructor because it tries to pass a non-default addrspace pointer to the pointer argument, which has the default addrspace.
      
      The fix is to add pointer type as another overloaded type to @llvm.masked.load/store intrinsics.
      
      Reviewed By: reames
      
      Differential Revision: http://reviews.llvm.org/D17270
      
      llvm-svn: 273892