  1. Jul 26, 2013
  2. Jul 24, 2013
  3. Jul 16, 2013
    • [X86] Use min/max to optimize unsigned vector comparison on X86 · 3d527d80
      Juergen Ributzka authored
      Use PMIN/PMAX for UGE/ULE vector comparisons to reduce the number of required
      instructions. This trick also works for UGT/ULT, but there is no advantage in
      doing so. It wouldn't reduce the number of instructions and it would actually
      reduce performance.
      
      Reviewer: Ben
      
      radar:5972691
      
      llvm-svn: 186432
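
The identity behind the min/max trick can be sketched in portable C++. This is a lane-by-lane emulation, not the LLVM lowering code itself: the `Lanes` type and all function names are hypothetical, and the loops only stand in for the unsigned byte max/min and compare-equal instructions (PMAXUB, PMINUB, PCMPEQB).

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical lane type for illustration: 16 unsigned bytes, like one SSE register.
using Lanes = std::array<std::uint8_t, 16>;

// Lane-wise unsigned maximum (the role an unsigned PMAX plays).
Lanes lane_max(const Lanes& a, const Lanes& b) {
    Lanes r{};
    for (std::size_t i = 0; i < r.size(); ++i) r[i] = std::max(a[i], b[i]);
    return r;
}

// Lane-wise unsigned minimum (the role an unsigned PMIN plays).
Lanes lane_min(const Lanes& a, const Lanes& b) {
    Lanes r{};
    for (std::size_t i = 0; i < r.size(); ++i) r[i] = std::min(a[i], b[i]);
    return r;
}

// Lane-wise equality: all ones where equal, all zeros otherwise.
Lanes lane_eq(const Lanes& a, const Lanes& b) {
    Lanes r{};
    for (std::size_t i = 0; i < r.size(); ++i) r[i] = (a[i] == b[i]) ? 0xFF : 0x00;
    return r;
}

// a >= b (unsigned) holds exactly when max(a, b) == a.
Lanes cmp_uge(const Lanes& a, const Lanes& b) { return lane_eq(lane_max(a, b), a); }

// a <= b (unsigned) holds exactly when min(a, b) == a.
Lanes cmp_ule(const Lanes& a, const Lanes& b) { return lane_eq(lane_min(a, b), a); }

// Small self-check: 200 vs 100 would compare the wrong way as signed bytes.
void demo_unsigned_compare() {
    Lanes a{};
    Lanes b{};
    a[0] = 200; b[0] = 100;
    a[1] = 7;   b[1] = 7;
    const Lanes ge = cmp_uge(a, b);
    const Lanes le = cmp_ule(a, b);
    assert(ge[0] == 0xFF && le[0] == 0x00);
    assert(ge[1] == 0xFF && le[1] == 0xFF);
}
```

Each comparison costs one min or max plus one equality test. One way to get the strict forms (UGT/ULT) from these is to invert the result, which needs an extra operation; that is consistent with the commit's remark that the trick brings no advantage there.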
  4. Jul 15, 2013
  5. Jul 14, 2013
  6. Jul 12, 2013
  7. Jul 09, 2013
    • AArch64/PowerPC/SystemZ/X86: This patch fixes the interface, usage, and all · 73de7bf5
      Stephen Lin authored
      in-tree implementations of TargetLoweringBase::isFMAFasterThanMulAndAdd in
      order to resolve the following issues with fmuladd (i.e. optional FMA)
      intrinsics:
      
      1. On X86(-64) targets, ISD::FMA nodes are formed when lowering fmuladd
      intrinsics even if the subtarget does not support FMA instructions, leading
      to laughably bad code generation in some situations.
      
      2. On AArch64 targets, ISD::FMA nodes are formed for operations on fp128,
      resulting in a call to a software fp128 FMA implementation.
      
      3. On PowerPC targets, FMAs are not generated from fmuladd intrinsics on types
      like v2f32, v8f32, v4f64, etc., even though they promote, split, scalarize,
      etc. to types that support hardware FMAs.
      
      The function has also been slightly renamed for consistency and to force a
      merge/build conflict for any out-of-tree target implementing it. To resolve,
      see comments and fixed in-tree examples.
      
      llvm-svn: 185956
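
For background on why fmuladd is an "optional" FMA, the sketch below contrasts the two forms a code generator may pick: a multiply followed by an add, or a single fused operation. It is plain C++ with hypothetical function names, not LLVM code, and the input values were chosen only to make the rounding difference visible.

```cpp
#include <cassert>
#include <cmath>

// Unfused form: the product is rounded to double before the addition.
// `volatile` keeps the compiler from contracting the two steps into one FMA.
double mul_then_add(double a, double b, double c) {
    volatile double product = a * b;
    return product + c;
}

// Fused form: std::fma computes a*b + c with a single rounding at the end.
double fused_mul_add(double a, double b, double c) {
    return std::fma(a, b, c);
}

// Illustrative inputs: a*a = 1 + 2^-26 + 2^-54 exactly, and the 2^-54 term
// is lost when the product is rounded to double on its own.
void demo_fmuladd_rounding() {
    const double a = 1.0 + std::ldexp(1.0, -27);
    const double c = -(1.0 + std::ldexp(1.0, -26));
    assert(mul_then_add(a, a, c) == 0.0);
    assert(fused_mul_add(a, a, c) == std::ldexp(1.0, -54));
}
```

Because either result is acceptable for fmuladd, the choice between the two forms is purely a question of speed on the target. When no hardware FMA exists for the type, the fused form falls back to software, which is the situation issues 1 and 2 describe.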
  8. Jul 08, 2013
  9. Jul 07, 2013
  10. Jul 06, 2013
  11. Jul 04, 2013
  12. Jul 03, 2013
  13. Jun 26, 2013
  14. Jun 22, 2013
  15. Jun 07, 2013
  16. May 30, 2013
    • Order CALLSEQ_START and CALLSEQ_END nodes. · ad6d08ac
      Andrew Trick authored
      Fixes PR16146: gdb.base__call-ar-st.exp fails after
      pre-RA-sched=source fixes.
      
      Patch by Xiaoyi Guo!
      
      This also fixes an unsupported dbg.value test case. Codegen was
      previously incorrect but the test was passing by luck.
      
      llvm-svn: 182885
  17. May 25, 2013
  18. May 22, 2013
  19. May 21, 2013
  20. May 18, 2013
  21. May 17, 2013
    • X86: Make shuffle -> shift conversion more aggressive about undefs. · fc33e1d9
      Benjamin Kramer authored
      Shuffles that only move an element into position 0 of the vector are common in
      the output of the loop vectorizer and often generate suboptimal code when SSSE3
      is not available. Lower them to vector shifts if possible.
      
      We still prefer palignr over psrldq because it has higher throughput on
      Sandy Bridge.
      
      llvm-svn: 182102
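
Why such a shuffle can become a shift is easiest to see in a byte-level model. The sketch below emulates a whole-register right shift by bytes (the behaviour of PSRLDQ) in portable C++; the `Bytes` type and helper names are hypothetical, and four 32-bit little-endian lanes are assumed for the demonstration.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical model of one 128-bit register as 16 bytes.
using Bytes = std::array<std::uint8_t, 16>;

// Whole-register logical right shift by n bytes, shifting in zeros.
Bytes byte_shift_right(const Bytes& v, std::size_t n) {
    Bytes r{};
    for (std::size_t i = 0; i + n < v.size(); ++i) r[i] = v[i + n];
    return r;
}

// Pack four 32-bit lanes into the register, little-endian.
Bytes from_lanes(const std::array<std::uint32_t, 4>& lanes) {
    Bytes v{};
    for (std::size_t i = 0; i < 4; ++i)
        for (std::size_t b = 0; b < 4; ++b)
            v[4 * i + b] = static_cast<std::uint8_t>(lanes[i] >> (8 * b));
    return v;
}

// Read one 32-bit lane back out of the register.
std::uint32_t lane32(const Bytes& v, std::size_t lane) {
    std::uint32_t x = 0;
    for (std::size_t b = 0; b < 4; ++b)
        x |= static_cast<std::uint32_t>(v[4 * lane + b]) << (8 * b);
    return x;
}

// A shuffle <2, undef, undef, undef> only needs lane 2 to land in lane 0,
// so shifting the register right by 2 * 4 bytes is a valid lowering.
void demo_shuffle_as_shift() {
    const std::array<std::uint32_t, 4> lanes = {0x11111111u, 0x22222222u,
                                                0x33333333u, 0x44444444u};
    const Bytes shifted = byte_shift_right(from_lanes(lanes), 8);
    assert(lane32(shifted, 0) == 0x33333333u);
}
```

The remaining lanes are undef in the shuffle, so whatever the shift leaves there (here lane 3 followed by zeros) is acceptable.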
  22. May 05, 2013
  23. May 02, 2013
  24. Apr 20, 2013
  25. Apr 19, 2013
  26. Apr 18, 2013
  27. Apr 11, 2013
    • Optimize vector select from all 0s or all 1s · 55658d42
      Michael Liao authored
      As packed comparisons in AVX/SSE produce all 0s or all 1s in each SIMD lane,
      vector select can be simplified to AND/OR, or removed entirely, when one or
      both of the values being selected are all 0s or all 1s.
      
      llvm-svn: 179267
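
The simplification rests on a few bitwise identities, sketched here in portable C++. The `Lanes32` type and function names are hypothetical, and the sketch assumes every mask lane is either all 0s or all 1s, as a packed comparison would produce.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical model of a register with four 32-bit lanes.
using Lanes32 = std::array<std::uint32_t, 4>;

// General select: take x where the mask lane is all 1s, y where it is all 0s.
Lanes32 lane_select(const Lanes32& mask, const Lanes32& x, const Lanes32& y) {
    Lanes32 r{};
    for (std::size_t i = 0; i < r.size(); ++i)
        r[i] = (mask[i] & x[i]) | (~mask[i] & y[i]);
    return r;
}

// Lane-wise AND.
Lanes32 and_lanes(const Lanes32& a, const Lanes32& b) {
    Lanes32 r{};
    for (std::size_t i = 0; i < r.size(); ++i) r[i] = a[i] & b[i];
    return r;
}

// Lane-wise OR.
Lanes32 or_lanes(const Lanes32& a, const Lanes32& b) {
    Lanes32 r{};
    for (std::size_t i = 0; i < r.size(); ++i) r[i] = a[i] | b[i];
    return r;
}

void demo_select_simplification() {
    const Lanes32 mask = {0xFFFFFFFFu, 0u, 0xFFFFFFFFu, 0u};
    const Lanes32 x = {10u, 20u, 30u, 40u};
    const Lanes32 zeros = {0u, 0u, 0u, 0u};
    const Lanes32 ones = {0xFFFFFFFFu, 0xFFFFFFFFu, 0xFFFFFFFFu, 0xFFFFFFFFu};
    // select(mask, x, 0) is just mask AND x.
    assert(lane_select(mask, x, zeros) == and_lanes(mask, x));
    // select(mask, all-ones, x) is just mask OR x.
    assert(lane_select(mask, ones, x) == or_lanes(mask, x));
    // select(mask, all-ones, 0) is the mask itself, so the select disappears.
    assert(lane_select(mask, ones, zeros) == mask);
}
```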
    • Enhance bool simplification in X86 to handle more cases · f7bf8705
      Michael Liao authored
      This patch is a revision of a patch from Victor Umansky
      <victor.umansky@intel.com>. X86's bool simplification now handles more
      cases, namely:
      - SETCC_CARRY
      - value is truncated to i1 with AND
      
      As a by-product, PR5443 is also fixed.
      
      llvm-svn: 179265