  1. Jan 30, 2016
  2. Jan 29, 2016
  3. Jan 27, 2016
  4. Jan 26, 2016
  5. Jan 23, 2016
  6. Jan 22, 2016
  7. Jan 19, 2016
  8. Jan 08, 2016
  9. Jan 01, 2016
  10. Dec 31, 2015
  11. Dec 28, 2015
  12. Dec 20, 2015
  13. Dec 17, 2015
    • [CUDA] runtime wrapper header tweaks · 8e9ba042
      Artem Belevich authored
      * Pull in host-only implementations of a few CUDA-specific math functions.
      * #include <cmath> early to prevent its inclusion from CUDA headers after
        they've messed with the __THROW macro.
      
      llvm-svn: 255933
      8e9ba042
  14. Dec 16, 2015
  15. Dec 08, 2015
  16. Dec 07, 2015
  17. Dec 02, 2015
  18. Dec 01, 2015
    • [X86] Improve codegen for AVX2 gather with an all 1s mask. · 5ec97a7b
      Craig Topper authored
      Use undefined instead of setzero as the pass-through input, since it's going to be fully overwritten. Use cmpeq of two zero vectors to produce the all 1s vector, because casting -1 to a double and vectorizing causes a constant load of a -1.0 floating-point value.
      
      llvm-svn: 254389
      5ec97a7b
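The distinction in the commit above can be illustrated with a scalar sketch. It is not the header's code: the helper names are hypothetical, and it assumes an IEEE-754 binary64 `double`. It shows that converting -1 to `double` yields the value -1.0, whose bit pattern is not all ones, while an integer compare-equal of two zero lanes does yield an all-ones lane.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: bit pattern of a double, copied out via memcpy
   to avoid aliasing problems. Assumes a 64-bit IEEE-754 double. */
static uint64_t bits_of_double(double d) {
    uint64_t u;
    assert(sizeof d == sizeof u);
    memcpy(&u, &d, sizeof u);
    return u;
}

/* Scalar stand-in for one 64-bit lane of an integer compare-equal:
   all ones when the operands match, zero otherwise. */
static uint64_t cmpeq_lane(uint64_t a, uint64_t b) {
    return (a == b) ? UINT64_MAX : 0;
}

/* Mask lane built the way the commit describes: compare two zero lanes. */
static uint64_t all_ones_mask_lane(void) {
    return cmpeq_lane(0, 0);
}
```

Under these assumptions, `bits_of_double((double)-1)` is the encoding of -1.0 rather than a full mask, which is why a value conversion is the wrong way to build the mask.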
  19. Nov 29, 2015
  20. Nov 20, 2015
  21. Nov 17, 2015
    • [CUDA] Added a wrapper header for inclusion of stock CUDA headers. · c29db844
      Artem Belevich authored
      Header files that come with CUDA are assuming split host/device
      compilation and are not usable by clang out of the box.
      With a bit of preprocessor magic it's possible to twist them
      into something clang can use.
      
      This wrapper always includes CUDA headers exactly the same way during
      host and device compilation passes and produces identical preprocessed
      content during host and device side compilation for sm_35 GPUs. Device
      compilation passes for older GPUs will see a smaller subset of device
      functions supported by particular GPU.
      
      The wrapper assumes specific contents of CUDA header files and works
      only with CUDA 7.0 and 7.5.
      
      Differential Revision: http://reviews.llvm.org/D13171
      
      llvm-svn: 253388
      c29db844
    • bmiintrin.h: Allow using the tzcnt intrinsics for non-BMI targets · 1acf955a
      Hans Wennborg authored
      The tzcnt intrinsics are used on non-BMI targets by code (e.g. ffmpeg)
      that uses them as a potentially faster BSF.
      
      The TZCNT instruction is special in that it's encoded in a
      backward-compatible way and behaves as BSF on non-BMI targets.
      
      Differential Revision: http://reviews.llvm.org/D14748
      
      llvm-svn: 253358
      1acf955a
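As a rough illustration of the relationship between the two instructions described above, here is a portable scalar model. The function names are hypothetical and nothing here comes from bmiintrin.h. It relies on the documented behaviour that TZCNT returns the operand width for a zero input, while BSF leaves its destination undefined in that case, so the model only accepts nonzero inputs for BSF.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of 32-bit TZCNT: number of trailing zero bits,
   with a zero input defined to return the operand width (32). */
static unsigned model_tzcnt32(uint32_t x) {
    unsigned n = 0;
    if (x == 0)
        return 32;
    while ((x & 1u) == 0) {
        x >>= 1;
        ++n;
    }
    return n;
}

/* Hypothetical model of 32-bit BSF: index of the lowest set bit.
   The result for a zero input is undefined on hardware, so the model
   rejects it instead of inventing a value. */
static unsigned model_bsf32(uint32_t x) {
    assert(x != 0);
    return model_tzcnt32(x);
}
```

For every nonzero input the two models agree, which is the sense in which code can treat tzcnt as a drop-in BSF.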
  22. Nov 16, 2015
    • [ARM,AArch64] Fix __rev16l and __rev16ll intrinsics · 7aa90f57
      Oliver Stannard authored
      These two intrinsics are defined in arm_acle.h.
      
      __rev16l needs to rotate by 16 bits, but it was actually rotating by 2 bits.
      For AArch64, where long is 64 bits, this would still be wrong.
      
      __rev16ll was incorrect: it reversed the bytes in each 32-bit word, rather than
      each 16-bit halfword. The correct implementation is to apply __rev16 to the top
      and bottom words of the 64-bit value.
      
      For AArch32 targets, these get compiled down to the hardware rev16 instruction
      at -O1 and above. For AArch64 targets, the 64-bit ones get compiled to two
      32-bit rev16 instructions, because there is not currently a pattern for the
      64-bit rev16 instruction.
      
      Differential Revision: http://reviews.llvm.org/D14609
      
      llvm-svn: 253211
      7aa90f57
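The intended semantics described in the commit above can be sketched in portable C. This is not the arm_acle.h source: the `sketch_` names are hypothetical, and the 32-bit operation is modelled as a full byte reverse followed by a rotate by 16 bits, with the 64-bit variant applying it to the top and bottom words separately.

```c
#include <assert.h>
#include <stdint.h>

/* Reverse all four bytes of a 32-bit word. */
static uint32_t sketch_rev32(uint32_t x) {
    return ((x & 0x000000FFu) << 24) | ((x & 0x0000FF00u) << 8) |
           ((x >> 8) & 0x0000FF00u) | (x >> 24);
}

/* Rotate right by n bits, 0 <= n < 32. */
static uint32_t sketch_ror32(uint32_t x, unsigned n) {
    assert(n < 32);
    if (n == 0)
        return x;
    return (x >> n) | (x << (32 - n));
}

/* Swap the two bytes inside each 16-bit halfword of a 32-bit word:
   a byte reverse followed by a rotate by 16 bits. */
static uint32_t sketch_rev16(uint32_t x) {
    return sketch_ror32(sketch_rev32(x), 16);
}

/* 64-bit variant: apply the 32-bit operation to the top and bottom
   words independently, as the commit message describes. */
static uint64_t sketch_rev16ll(uint64_t x) {
    uint32_t hi = sketch_rev16((uint32_t)(x >> 32));
    uint32_t lo = sketch_rev16((uint32_t)x);
    return ((uint64_t)hi << 32) | lo;
}
```

For example, `sketch_rev16(0x11223344)` gives `0x22114433`: each halfword keeps its position and only the bytes inside it swap.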
  23. Nov 11, 2015
  24. Nov 10, 2015
  25. Nov 03, 2015
  26. Oct 27, 2015
    • Handle target builtin options that are all required rather than · 99af5b2e
      Eric Christopher authored
      only one of a group of possibilities.
      
      This changes the syntax in the builtin files to represent:
      
      , as the and operator
      | as the or operator
      
      The former syntax matches how the backend tablegen files represent
      multiple subtarget features being required.
      
      Updated the builtin and intrinsic headers accordingly for the new
      syntax.
      
      llvm-svn: 251388
      99af5b2e
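A minimal sketch of how a requirement string in this syntax could be evaluated follows. It is not clang's implementation: the function names are hypothetical, and it assumes that `|` binds tighter than `,`, so a string is a comma-separated list of groups that must all hold, each group being a set of `|`-separated alternatives. The commit message does not state the precedence.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* True when name[0..len) matches an entry of the NULL-terminated
   list of enabled feature names. */
static bool feature_enabled(const char *name, size_t len,
                            const char *const *enabled) {
    for (; *enabled; ++enabled)
        if (strlen(*enabled) == len && strncmp(*enabled, name, len) == 0)
            return true;
    return false;
}

/* Evaluate a non-empty requirement string.
   Assumed grammar: ',' joins groups that must ALL hold;
   '|' separates alternatives inside a group, ANY of which suffices. */
static bool features_satisfied(const char *req, const char *const *enabled) {
    assert(req != NULL && enabled != NULL);
    const char *p = req;
    for (;;) {
        bool group_ok = false;
        for (;;) {
            /* Length of the next feature name, up to ',' or '|' or end. */
            size_t len = strcspn(p, ",|");
            if (feature_enabled(p, len, enabled))
                group_ok = true;
            p += len;
            if (*p != '|')
                break;      /* end of this group */
            ++p;            /* skip '|' and try the next alternative */
        }
        if (!group_ok)
            return false;   /* a required group has no enabled member */
        if (*p == '\0')
            return true;    /* every group was satisfied */
        ++p;                /* skip ',' and move to the next group */
    }
}
```

With `avx` and `avx2` enabled, `"avx,avx2"` and `"fma|avx2"` are satisfied while `"avx,fma"` is not. The feature names here are only example inputs.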
  27. Oct 20, 2015
    • [x86] Fix maskload/store intrinsic definitions in avxintrin.h · 8bb12d0a
      Andrea Di Biagio authored
      According to the Intel documentation, the mask operand of the maskload and
      maskstore intrinsics is always a vector of packed integer/long integer values.
      This patch introduces the following two changes:
       1. It fixes the avx maskload/store intrinsic definitions in avxintrin.h.
       2. It changes BuiltinsX86.def to match the correct gcc definitions for avx
          maskload/store (see D13861 for more details).
      
      Differential Revision: http://reviews.llvm.org/D13861
      
      llvm-svn: 250816
      8bb12d0a
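To show why an integer mask type fits these intrinsics, here is a portable scalar model of a masked float load. The function name and signature are hypothetical and are not the avxintrin.h definitions. It relies on the documented behaviour that the most significant bit of each mask element selects the lane, and that unselected lanes are zeroed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scalar model of a masked load over n float lanes.
   Lane i is read from src when the sign bit of mask[i] is set;
   otherwise src[i] is not touched and the lane is set to zero. */
static void model_maskload_ps(const float *src, const int32_t *mask,
                              float *dst, size_t n) {
    assert(src != NULL && mask != NULL && dst != NULL);
    for (size_t i = 0; i < n; ++i)
        dst[i] = (mask[i] < 0) ? src[i] : 0.0f;
}
```

Because only the top bit of each element matters, the mask is naturally an integer vector: a mask element of 1 leaves the lane unselected, while -1 or `INT32_MIN` selects it.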
  28. Oct 16, 2015
  29. Oct 15, 2015
  30. Oct 14, 2015
  31. Oct 13, 2015