  1. Nov 18, 2016
  2. Nov 17, 2016
  3. Nov 16, 2016
  4. Nov 15, 2016
  5. Nov 14, 2016
  6. Nov 13, 2016
  7. Nov 12, 2016
  8. Nov 11, 2016
  9. Nov 10, 2016
  10. Nov 08, 2016
    • [AMDGPU] Allow hoisting of comparisons out of a loop and eliminate condition copies · 92e01ee9
      Stanislav Mekhanoshin authored
      Codegen prepare sinks comparisons close to a user if we have only one register
      for conditions. For AMDGPU we have many SGPRs capable of holding vector conditions.
      Changed the backend to report that we have many condition registers. That way the
      IR LICM pass hoists an invariant comparison out of a loop and codegen prepare does
      not sink it.
      
      With that done, a condition is calculated in one block and used in another.
      The current behavior is to store a workitem's condition in a VGPR using v_cndmask
      and then restore it with yet another v_cmp instruction from that v_cndmask's
      result. To mitigate the issue, forward propagation of a v_cmp 64-bit result
      to a user is implemented. An additional side effect is that we may
      consume fewer VGPRs at the cost of more SGPRs when multiple conditions
      need to be held, and that is a clear win in most cases.
      
      llvm-svn: 286171
  11. Nov 07, 2016
  12. Nov 04, 2016
  13. Nov 03, 2016
  14. Nov 02, 2016
  15. Nov 01, 2016
    • AMDGPU: Default to using scalar mov to materialize immediate · 3d463193
      Matt Arsenault authored
      This is the conservatively correct way because it's easy to
      move or replace a scalar immediate. This was incorrect in the case
      when the register class wasn't known from the static instruction
      definition, but still needed to be an SGPR. The main example of this
      is when inlineasm has an SGPR constraint.
      
      Also start verifying the register classes of inlineasm operands.
      
      llvm-svn: 285762
    • [AMDGPU] Check if type transforms to i16 (VI+) when getting AMDGPUISD::FFBH_U32 · d971a112
      Konstantin Zhuravlyov authored
      This will prevent the following regressions when enabling i16 support (D18049):
      
      test/CodeGen/AMDGPU/ctlz.ll
      test/CodeGen/AMDGPU/ctlz_zero_undef.ll
      
      Differential Revision: https://reviews.llvm.org/D25802
      
      llvm-svn: 285716
    • AMDGPU: Implement expansion of f16 = FP_TO_FP16 f64 · 94c21bc0
      Tom Stellard authored
      I wanted to implement this as a target-independent expansion; however, when
      targets say they want to expand FP_TO_FP16, what they actually want is
      the unsafe math expansion when possible and expansion to a libcall in all
      other cases.
      
      The only way to make this work as a target-independent expansion would be to
      add logic to each target's TargetLowering construction to mark these nodes as
      Expand when LegalizeDAG can use the unsafe expansion and as LibCall when it
      cannot. I think this would be possible, but too fragile and complex, as it
      would require targets to keep their expansion logic up to date with the code
      in LegalizeDAG.
      
      Reviewers: bogner, ab, t.p.northover, arsenm
      
      Subscribers: wdng, llvm-commits, nhaehnle
      
      Differential Revision: https://reviews.llvm.org/D25999
      
      llvm-svn: 285704
    • [AMDGPU] Expand vector mulhu/mulhs · 8a89d366
      Valery Pykhtin authored
      Differential Revision: https://reviews.llvm.org/D26077
      
      llvm-svn: 285684
  16. Oct 29, 2016
  17. Oct 28, 2016