- Jan 30, 2016
-
-
Ekaterina Romanova authored
The doxygen comments are automatically generated based on Sony's intrinsics document. Differential Revision: http://reviews.llvm.org/D16562 llvm-svn: 259275
-
- Jan 29, 2016
-
-
Ekaterina Romanova authored
This patch adds doxygen comments for the intrinsics in the header file __wmmintrin_pclmul.h. The doxygen comments are automatically generated based on Sony's intrinsics document. Differential Revision: http://reviews.llvm.org/D15999 llvm-svn: 259239
-
- Jan 27, 2016
-
-
Artem Belevich authored
CUDA expects math functions in the std:: namespace to work on the device side. In order to make this work with clang without allowing device-side code generation for functions without appropriate target attributes, this patch provides device-side implementations for <cmath> functions. Most of them call global-scope math functions provided by CUDA headers. In a few cases we use clang builtins. Tested out-of-tree by compiling and running thrust's unit_tests. https://github.com/thrust/thrust/tree/master/testing Differential Revision: http://reviews.llvm.org/D16593 llvm-svn: 258880
-
- Jan 26, 2016
-
-
Chris Bieneman authored
Summary: This patch is provided in preparation for removing autoconf on 1/26. The proposal to remove autoconf on 1/26 was discussed on the llvm-dev thread here: http://lists.llvm.org/pipermail/llvm-dev/2016-January/093875.html "This is the way [autoconf] ends Not with a bang but a whimper." -T.S. Eliot Reviewers: chandlerc, grosbach, bob.wilson, echristo Subscribers: klimek, cfe-commits Differential Revision: http://reviews.llvm.org/D16472 llvm-svn: 258862
-
- Jan 23, 2016
-
-
Justin Lebar authored
Summary: The code in CGCUDACall is largely based on a patch written by Eli Bendersky: http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20140324/210218.html That patch implemented an LLVM pass lowering printf to vprintf; this one does something similar, but in Clang codegen. Reviewers: echristo Subscribers: cfe-commits, jhen, tra, majnemer Differential Revision: http://reviews.llvm.org/D16372 llvm-svn: 258642
-
- Jan 22, 2016
-
-
Ekaterina Romanova authored
Differential Revision: http://reviews.llvm.org/D16177 llvm-svn: 258492
-
- Jan 19, 2016
-
-
Adam Nemet authored
Hal noticed that the double/float got mixed up on the parameters for these. llvm-svn: 258108
-
- Jan 08, 2016
-
-
Kyle Butt authored
Add long long/double support for vec_cts, vec_ctu and vec_ctf. Similar to this change in GCC: https://gcc.gnu.org/ml/gcc-patches/2014-08/msg02653.html Patch by Tim Shen. llvm-svn: 257135
-
- Jan 01, 2016
-
-
David Majnemer authored
Lean on LLVM to provide this functionality now that it provides the necessary intrinsics. llvm-svn: 256686
-
- Dec 31, 2015
-
-
Asaf Badouh authored
Differential Revision: http://reviews.llvm.org/D15837 llvm-svn: 256672
-
- Dec 28, 2015
-
-
Eric Christopher authored
llvm-svn: 256508
-
- Dec 20, 2015
-
-
Michael Kuperstein authored
Define the 64-bit equivalents of _m_to_int and _m_from_int. Differential Revision: http://reviews.llvm.org/D15572 llvm-svn: 256122
-
Michael Kuperstein authored
The Intel manual documents both an unsigned form (_mm_popcnt_u32) and a signed form (_popcnt32) of the intrinsic. Add the missing signed form. Differential Revision: http://reviews.llvm.org/D15568 llvm-svn: 256121
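The relationship between the two forms can be modelled in portable C. This is only a sketch: `sketch_popcnt_u32` and `sketch_popcnt32` are hypothetical names, not the header's definitions, and the loop stands in for whatever the real intrinsics lower to.

```c
#include <stdint.h>

/* Hypothetical helper: portable population count of a 32-bit value. */
static int sketch_popcnt_u32(uint32_t x) {
    int count = 0;
    while (x != 0) {
        x &= x - 1; /* clear the lowest set bit */
        ++count;
    }
    return count;
}

/* Hypothetical signed form: same bit pattern, only the parameter type differs. */
static int sketch_popcnt32(int32_t x) {
    return sketch_popcnt_u32((uint32_t)x);
}
```

Under this model the signed form is just a thin wrapper, so a caller passing `-1` gets a count of 32 from either entry point.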
-
- Dec 17, 2015
-
-
Artem Belevich authored
* Pull in host-only implementations of a few CUDA-specific math functions. * #include <cmath> early to prevent its inclusion from CUDA headers after they've messed with the __THROW macro. llvm-svn: 255933
-
- Dec 16, 2015
-
-
Artem Belevich authored
Currently it's easy to break CUDA compilation by passing "-isystem /path/to/cuda/include" to the compiler, which leads to the compiler including the real cuda_runtime.h from there instead of the wrapper we need. Renaming the wrapper ensures that we can include the wrapper regardless of user-specified include paths and files. Differential Revision: http://reviews.llvm.org/D15534 llvm-svn: 255802
-
- Dec 08, 2015
-
-
Asaf Badouh authored
Differential Revision: http://reviews.llvm.org/D15328 llvm-svn: 255012
-
- Dec 07, 2015
-
-
Asaf Badouh authored
Rename the gcc intrinsics suffix: _mask -> _round Differential Revision: http://reviews.llvm.org/D15284 llvm-svn: 254906
-
- Dec 02, 2015
-
-
Paul Robinson authored
This more closely matches their locations as described by Intel documentation, and lets us remove a pair of redundant typedefs. Differential Revision: http://reviews.llvm.org/D15127 llvm-svn: 254528
-
- Dec 01, 2015
-
-
Craig Topper authored
Use undefined instead of setzero as the pass through input since it's going to be fully overwritten. Use cmpeq of two zero vectors to produce the all 1s vector. Casting -1 to a double and vectorizing causes a constant load of a -1.0 floating point value. llvm-svn: 254389
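The difference between the two ways of building the mask can be checked in scalar C. This is a sketch under stated assumptions: it assumes IEEE-754 binary64 doubles, models a single 64-bit lane rather than a real vector, and `sketch_bits_of_double` and `sketch_cmpeq_lane` are hypothetical helpers, not builtins.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: raw bits of a double (assumes IEEE-754 binary64). */
static uint64_t sketch_bits_of_double(double d) {
    uint64_t bits;
    memcpy(&bits, &d, sizeof bits);
    return bits;
}

/* Hypothetical scalar model of one compare-equal lane: all ones when equal. */
static uint64_t sketch_cmpeq_lane(uint64_t a, uint64_t b) {
    return (a == b) ? UINT64_MAX : 0;
}
```

Converting the integer -1 to double yields the value -1.0, whose bit pattern is 0xBFF0000000000000 rather than all ones, while comparing two zero lanes for equality yields the all-ones pattern directly.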
-
- Nov 29, 2015
-
-
Craig Topper authored
llvm-svn: 254270
-
Craig Topper authored
llvm-svn: 254247
-
- Nov 20, 2015
-
-
Argyrios Kyrtzidis authored
llvm-svn: 253636
-
- Nov 17, 2015
-
-
Artem Belevich authored
Header files that come with CUDA assume split host/device compilation and are not usable by clang out of the box. With a bit of preprocessor magic it's possible to twist them into something clang can use. This wrapper always includes CUDA headers exactly the same way during host and device compilation passes and produces identical preprocessed content during host and device side compilation for sm_35 GPUs. Device compilation passes for older GPUs will see a smaller subset of device functions supported by the particular GPU. The wrapper assumes specific contents of CUDA header files and works only with CUDA 7.0 and 7.5. Differential Revision: http://reviews.llvm.org/D13171 llvm-svn: 253388
-
Hans Wennborg authored
The tzcnt intrinsics are used on non-BMI targets by code (e.g. ffmpeg) that uses them as a potentially faster BSF. The TZCNT instruction is special in that it's encoded in a backward-compatible way and behaves as BSF on non-BMI targets. Differential Revision: http://reviews.llvm.org/D14748 llvm-svn: 253358
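For readers unfamiliar with the instruction, a portable model of the 32-bit trailing-zero count may help. This is a sketch, with `sketch_tzcnt32` a hypothetical name. It follows the TZCNT convention of returning the operand width for a zero input; BSF leaves its destination undefined in that case, so code relying on the BSF fallback should presumably not pass zero.

```c
#include <stdint.h>

/* Hypothetical helper: count trailing zero bits, returning 32 for x == 0. */
static unsigned sketch_tzcnt32(uint32_t x) {
    if (x == 0) {
        return 32; /* TZCNT convention; BSF gives no defined result here */
    }
    unsigned n = 0;
    while ((x & 1u) == 0) {
        x >>= 1;
        ++n;
    }
    return n;
}
```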
-
- Nov 16, 2015
-
-
Oliver Stannard authored
These two intrinsics are defined in arm_acle.h. __rev16l needs to rotate by 16 bits, but it was actually rotating by 2 bits. For AArch64, where long is 64 bits, this would still be wrong. __rev16ll was incorrect, it reversed the bytes in each 32-bit word, rather than each 16-bit halfword. The correct implementation is to apply __rev16 to the top and bottom words of the 64-bit value. For AArch32 targets, these get compiled down to the hardware rev16 instruction at -O1 and above. For AArch64 targets, the 64-bit ones get compiled to two 32-bit rev16 instructions, because there is not currently a pattern for the 64-bit rev16 instruction. Differential Revision: http://reviews.llvm.org/D14609 llvm-svn: 253211
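The fix described above can be sketched in portable C. The `sketch_` names are hypothetical stand-ins for the ACLE intrinsics, and the code is an illustration of the described approach (byte-reverse, rotate by 16, apply to each 32-bit word), not the text of arm_acle.h.

```c
#include <stdint.h>

/* Hypothetical helper: reverse the four bytes of a 32-bit word. */
static uint32_t sketch_rev32(uint32_t x) {
    return (x >> 24) | ((x >> 8) & 0x0000FF00u) |
           ((x << 8) & 0x00FF0000u) | (x << 24);
}

/* Hypothetical helper: rotate right by n bits, safe for n == 0. */
static uint32_t sketch_ror32(uint32_t x, unsigned n) {
    n %= 32;
    return n == 0 ? x : (x >> n) | (x << (32 - n));
}

/* Swap the bytes within each 16-bit halfword: byte-reverse, then rotate by 16. */
static uint32_t sketch_rev16(uint32_t x) {
    return sketch_ror32(sketch_rev32(x), 16);
}

/* 64-bit form: apply the 32-bit operation to the top and bottom words. */
static uint64_t sketch_rev16ll(uint64_t x) {
    uint32_t hi = (uint32_t)(x >> 32);
    uint32_t lo = (uint32_t)x;
    return ((uint64_t)sketch_rev16(hi) << 32) | sketch_rev16(lo);
}
```

With this model, 0x11223344 becomes 0x22114433, and 0x1122334455667788 becomes 0x2211443366558877: every halfword has its two bytes swapped, and no bytes cross a halfword boundary.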
-
- Nov 11, 2015
-
-
Craig Topper authored
[X86] Add 'pause' builtin that's already in llvm and use it instead of inline assembly to implement _mm_pause. llvm-svn: 252712
-
Craig Topper authored
[X86] Use __builtin_ia32_paddq and __builtin_ia32_psubq to implement a couple intrinsics that were supposed to operate on MMX registers. Otherwise we end up operating on GPRs. Throw in a test for _mm_mul_su32 while I was there. llvm-svn: 252711
-
Craig Topper authored
llvm-svn: 252710
-
Craig Topper authored
[X86] Add missing typecasts in intrinsic macros. This should make them more robust against inputs that aren't already the right type. llvm-svn: 252700
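A toy example of why a cast inside a macro matters, in plain C. Both macros are hypothetical and unrelated to any real intrinsic; they only show how a macro body that trusts the caller's argument type can silently compute something different.

```c
/* Hypothetical macro without a cast: the result type follows the argument. */
#define SKETCH_HALF_NOCAST(x) ((x) / 2)

/* Hypothetical macro with a cast: the argument is converted first. */
#define SKETCH_HALF_CAST(x) ((double)(x) / 2)
```

Passing the integer 3 gives 1 from the first macro (integer division) and 1.5 from the second, which is the kind of input-type sensitivity an explicit cast removes.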
-
Craig Topper authored
[X86] Change pointer type in AVX2 gather builtins to be the scalar type instead of the vector type. This matches gcc and removes extra casts. llvm-svn: 252697
-
- Nov 10, 2015
-
-
Craig Topper authored
llvm-svn: 252587
-
Craig Topper authored
[X86] Remove temporary variables from macros in x86 intrinsic headers. Prevents duplicate names appearing from multiple macro expansions. NFC llvm-svn: 252586
-
Craig Topper authored
llvm-svn: 252585
-
- Nov 03, 2015
-
-
Craig Topper authored
llvm-svn: 251900
-
- Oct 27, 2015
-
-
Eric Christopher authored
only one of a group of possibilities. This changes the syntax in the builtin files to represent "," as the and operator and "|" as the or operator. The former syntax matches how the backend tablegen files represent multiple subtarget features being required. Updated the builtin and intrinsic headers accordingly for the new syntax. llvm-svn: 251388
-
- Oct 20, 2015
-
-
Andrea Di Biagio authored
According to the Intel documentation, the mask operand of the maskload and maskstore intrinsics is always a vector of packed integer/long integer values. This patch introduces the following two changes: 1. It fixes the avx maskload/store intrinsic definitions in avxintrin.h. 2. It changes BuiltinsX86.def to match the correct gcc definitions for avx maskload/store (see D13861 for more details). Differential Revision: http://reviews.llvm.org/D13861 llvm-svn: 250816
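Why an integer mask is the natural type can be seen from a scalar model of a masked load. This is a sketch, not the avxintrin.h definition: `sketch_maskload_ps` and `SKETCH_LANES` are hypothetical, and the model covers four float lanes only. It relies on the documented behaviour that the most significant bit of each mask element selects the lane and that unselected lanes read as zero.

```c
#include <stdint.h>

#define SKETCH_LANES 4

/* Hypothetical scalar model of a masked load of four floats.
 * A lane is loaded only when the sign bit of its mask element is set. */
static void sketch_maskload_ps(float dst[SKETCH_LANES],
                               const float src[SKETCH_LANES],
                               const int32_t mask[SKETCH_LANES]) {
    for (int i = 0; i < SKETCH_LANES; ++i) {
        /* src[i] is only read for selected lanes */
        dst[i] = (mask[i] < 0) ? src[i] : 0.0f;
    }
}
```

Only the sign bit is inspected, so a mask element of 1 leaves its lane at zero while -1 or INT32_MIN selects it.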
-
- Oct 16, 2015
-
-
Craig Topper authored
llvm-svn: 250498
-
- Oct 15, 2015
-
-
Peter Collingbourne authored
llvm-svn: 250388
-
- Oct 14, 2015
-
-
Hans Wennborg authored
llvm-svn: 250301
-
- Oct 13, 2015
-
-
Eric Christopher authored
llvm-svn: 250202
-