Commits · 0e19cf2dd88e36974bbbfaa2333e13f9eb83864d · Roger Ferrer / llvm-epi

Jan 30, 2016

This patch adds doxygen comments for the intrinsics in the header file __wmmintrin_aes.h. · 0e19cf2d

Ekaterina Romanova authored Jan 29, 2016

The doxygen comments are automatically generated based on Sony's intrinsics document.

Differential Revision: http://reviews.llvm.org/D16562

llvm-svn: 259275

0e19cf2d

Jan 29, 2016

This patch adds doxygen comments for the intrinsics in the header file... · deec50a3

Ekaterina Romanova authored Jan 29, 2016

This patch adds doxygen comments for the intrinsics in the header file __wmmintrin_pclmul.h. The doxygen comments are automatically generated based on Sony's intrinsics document.

Differential Revision: http://reviews.llvm.org/D15999

llvm-svn: 259239

deec50a3

Jan 27, 2016

[CUDA] Implemented device-side support functions in <cmath>. · c5f41a34

Artem Belevich authored Jan 26, 2016

CUDA expects math functions in std:: namespace to work on device side.
In order to make it work with clang without allowing device-side code
generation for functions w/o appropriate target attributes, this patch
provides device-side implementations for <cmath> functions. Most of
them call global-scope math functions provided by CUDA headers. In a few
cases we use clang builtins.

Tested out-of-tree by compiling and running thrust's unit_tests.
https://github.com/thrust/thrust/tree/master/testing

Differential Revision: http://reviews.llvm.org/D16593

llvm-svn: 258880

c5f41a34

Jan 26, 2016

Remove autoconf support · 2bf68c6c

Chris Bieneman authored Jan 26, 2016

Summary:
This patch is provided in preparation for removing autoconf on 1/26. The proposal to remove autoconf on 1/26 was discussed on the llvm-dev thread here: http://lists.llvm.org/pipermail/llvm-dev/2016-January/093875.html

    "This is the way [autoconf] ends
    Not with a bang but a whimper."
    -T.S. Eliot

Reviewers: chandlerc, grosbach, bob.wilson, echristo

Subscribers: klimek, cfe-commits

Differential Revision: http://reviews.llvm.org/D16472

llvm-svn: 258862

2bf68c6c

Jan 23, 2016

[CUDA] Make printf work. · 3039a593

Justin Lebar authored Jan 23, 2016

Summary:
The code in CGCUDACall is largely based on a patch written by Eli
Bendersky:
http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20140324/210218.html

That patch implemented an LLVM pass lowering printf to vprintf; this
one does something similar, but in Clang codegen.

Reviewers: echristo

Subscribers: cfe-commits, jhen, tra, majnemer

Differential Revision: http://reviews.llvm.org/D16372

llvm-svn: 258642

3039a593

Jan 22, 2016
- 2 missing intrinsics _cvtss_sh and _mm_cvtps_ph were added to the intrinsics header f16cintrin.h · 08d1f243
  Ekaterina Romanova authored Jan 22, 2016
```
Differential Revision: http://reviews.llvm.org/D16177

llvm-svn: 258492
```
  08d1f243
Jan 19, 2016

[AVX512] Fix typo in r226298 · e7087471

Adam Nemet authored Jan 19, 2016

Hal noticed that the double/float got mixed up on the parameters for
these.

llvm-svn: 258108

e7087471

Jan 08, 2016

[PPC] Add long long/double support for vec_cts, vec_ctu and vec_ctf · 436ff85b

Kyle Butt authored Jan 08, 2016

Add long long/double support for vec_cts, vec_ctu and vec_ctf.

Similar to this change in GCC:
https://gcc.gnu.org/ml/gcc-patches/2014-08/msg02653.html

Patch by Tim Shen.

llvm-svn: 257135

436ff85b

Jan 01, 2016
- Reimplement __readeflags and __writeeflags on top of intrinsics · 30f9bfd5
  David Majnemer authored Jan 01, 2016
```
Lean on LLVM to provide this functionality now that it provides the
necessary intrinsics.

llvm-svn: 256686
```
  30f9bfd5
Dec 31, 2015
- [X86][PKU] add clang intrinsic for {RD|WR}PKRU · a9d1e18f
  Asaf Badouh authored Dec 31, 2015
```
Differential Revision: http://reviews.llvm.org/D15837

llvm-svn: 256672
```
  a9d1e18f
Dec 28, 2015
- Fix up comment in header. · 7f7d9bea
  Eric Christopher authored Dec 28, 2015
```
llvm-svn: 256508
```
  7f7d9bea
Dec 20, 2015

[X86] Add missing m64/int64 conversions · 591278c0

Michael Kuperstein authored Dec 20, 2015

Define the 64-bit equivalents of _m_to_int and _m_from_int.

Differential Revision: http://reviews.llvm.org/D15572

llvm-svn: 256122

591278c0

[X86] Add signed aliases for popcnt intrinsics · beae0267

Michael Kuperstein authored Dec 20, 2015

The Intel manual documents both an unsigned form (_mm_popcnt_u32)
and a signed form (_popcnt32) of the intrinsic. Add the missing signed form.
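
The relationship between the two forms can be sketched in portable C. This is only an illustration under stated assumptions: the names `my_popcnt_u32` and `my_popcnt32` are hypothetical stand-ins, not the header's definitions, and the GCC/Clang builtin `__builtin_popcount` is used in place of whatever the header actually calls.

```c
#include <stdint.h>

/* Hypothetical stand-in for the unsigned form: counts the set bits in a
   32-bit value. __builtin_popcount is a GCC/Clang builtin. */
static inline int my_popcnt_u32(uint32_t x) {
    return __builtin_popcount(x);
}

/* Hypothetical signed alias: same bit count, but the parameter is signed.
   Converting to uint32_t keeps the two's-complement bit pattern, so a
   negative input simply counts its set bits. */
static inline int my_popcnt32(int32_t x) {
    return my_popcnt_u32((uint32_t)x);
}
```

With this sketch, `my_popcnt32(-1)` counts all 32 bits, the same result as `my_popcnt_u32(0xFFFFFFFFu)`.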

Differential Revision: http://reviews.llvm.org/D15568

llvm-svn: 256121

beae0267

Dec 17, 2015

[CUDA] runtime wrapper header tweaks · 8e9ba042

Artem Belevich authored Dec 17, 2015

* Pull in host-only implementations of a few CUDA-specific math functions.
* #include <cmath> early to prevent its inclusion from CUDA headers after
  they've messed with the __THROW macro.

llvm-svn: 255933

8e9ba042

Dec 16, 2015

[CUDA] renamed cuda_runtime.h wrapper to __cuda_runtime.h · 7fda3c9f

Artem Belevich authored Dec 16, 2015

Currently it's easy to break CUDA compilation by passing
"-isystem /path/to/cuda/include" to the compiler, which leads to
the compiler including the real cuda_runtime.h from there instead
of the wrapper we need.

Renaming the wrapper ensures that we can include the wrapper
regardless of user-specified include paths and files.

Differential Revision: http://reviews.llvm.org/D15534

llvm-svn: 255802

7fda3c9f

Dec 08, 2015
- [x86][avx512] more changes in intrinsics to be aligned with gcc format · 5e4248b4
  Asaf Badouh authored Dec 08, 2015
```
Differential Revision: http://reviews.llvm.org/D15328

llvm-svn: 255012
```
  5e4248b4
Dec 07, 2015

[avx512] rename gcc intrinsics to be aligned with gcc format · 3e5111e3

Asaf Badouh authored Dec 07, 2015

Rename the gcc intrinsics suffix: _mask -> _round

Differential Revision: http://reviews.llvm.org/D15284

llvm-svn: 254906

3e5111e3

Dec 02, 2015

Move _mm256_cvtps_ph and _mm256_cvtph_ps to immintrin.h. · 941bc915

Paul Robinson authored Dec 02, 2015

This more closely matches their locations as described by Intel
documentation, and lets us remove a pair of redundant typedefs.

Differential Revision: http://reviews.llvm.org/D15127

llvm-svn: 254528

941bc915

Dec 01, 2015

[X86] Improve codegen for AVX2 gather with an all 1s mask. · 5ec97a7b

Craig Topper authored Dec 01, 2015

Use undefined instead of setzero as the pass-through input since it's going to be fully overwritten. Use cmpeq of two zero vectors to produce the all 1s vector. Casting -1 to a double and vectorizing causes a constant load of a -1.0 floating point value.
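
A scalar sketch of why the cast does not give an all 1s mask, assuming IEEE-754 doubles. The helper names `double_bits` and `cmpeq_lane` are hypothetical and only model a single 64-bit lane; they are not taken from the header.

```c
#include <stdint.h>
#include <string.h>

/* Returns the raw bit pattern of a double (assumes a 64-bit IEEE-754 double). */
static uint64_t double_bits(double d) {
    uint64_t bits;
    memcpy(&bits, &d, sizeof bits);
    return bits;
}

/* Scalar model of one lane of a vector compare-equal: all ones when the
   operands are equal, all zeros otherwise. */
static uint64_t cmpeq_lane(uint64_t a, uint64_t b) {
    return a == b ? UINT64_MAX : 0;
}
```

Here `double_bits((double)-1)` is `0xBFF0000000000000`, the encoding of -1.0, whereas `cmpeq_lane(0, 0)` yields `0xFFFFFFFFFFFFFFFF`, which is the pattern the mask needs.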

llvm-svn: 254389

5ec97a7b

Nov 29, 2015
- [X86] _mm256_permutevar8x32_ps should take an integer vector for its shuffle index input. · e20b8c68
  Craig Topper authored Nov 29, 2015
```
llvm-svn: 254270
```
  e20b8c68
- [X86] Remove temporary variables from intrinsic macros. NFC · 3a71f35a
  Craig Topper authored Nov 29, 2015
```
llvm-svn: 254247
```
  3a71f35a
Nov 20, 2015
- [CMake] Add a specific 'install-clang-headers' target. · dcb56535
  Argyrios Kyrtzidis authored Nov 20, 2015
```
llvm-svn: 253636
```
  dcb56535
Nov 17, 2015

[CUDA] Added a wrapper header for inclusion of stock CUDA headers. · c29db844

Artem Belevich authored Nov 17, 2015

Header files that come with CUDA are assuming split host/device
compilation and are not usable by clang out of the box.
With a bit of preprocessor magic it's possible to twist them
into something clang can use.

This wrapper always includes CUDA headers exactly the same way during
host and device compilation passes and produces identical preprocessed
content during host and device side compilation for sm_35 GPUs. Device
compilation passes for older GPUs will see a smaller subset of device
functions supported by the particular GPU.

The wrapper assumes specific contents of CUDA header files and works
only with CUDA 7.0 and 7.5.

Differential Revision: http://reviews.llvm.org/D13171

llvm-svn: 253388

c29db844

bmiintrin.h: Allow using the tzcnt intrinsics for non-BMI targets · 1acf955a

Hans Wennborg authored Nov 17, 2015

The tzcnt intrinsics are used on non-BMI targets by code (e.g. ffmpeg)
that uses it as a potentially faster BSF.

The TZCNT instruction is special in that it's encoded in a
backward-compatible way and behaves as BSF on non-BMI targets.
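
One caveat is worth illustrating: TZCNT is defined for a zero input (it returns the operand width), while BSF leaves its destination undefined in that case, so the two only agree for non-zero inputs. The portable model below is a sketch of the TZCNT result; `model_tzcnt32` is a hypothetical name and is not how the header implements the intrinsic.

```c
#include <stdint.h>

/* Software model of a 32-bit trailing-zero count with TZCNT semantics. */
static unsigned model_tzcnt32(uint32_t x) {
    unsigned n = 0;
    if (x == 0)
        return 32;            /* TZCNT returns the operand width for zero */
    while ((x & 1u) == 0) {   /* shift out zero bits until a set bit appears */
        x >>= 1;
        ++n;
    }
    return n;
}
```

Code that relies on the BSF fallback should therefore avoid depending on the result for a zero input.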

Differential Revision: http://reviews.llvm.org/D14748

llvm-svn: 253358

1acf955a

Nov 16, 2015

[ARM,AArch64] Fix __rev16l and __rev16ll intrinsics · 7aa90f57

Oliver Stannard authored Nov 16, 2015

These two intrinsics are defined in arm_acle.h.

__rev16l needs to rotate by 16 bits, but it was actually rotating by 2 bits.
For AArch64, where long is 64 bits, this would still be wrong.

__rev16ll was incorrect: it reversed the bytes in each 32-bit word, rather than
each 16-bit halfword. The correct implementation is to apply __rev16 to the top
and bottom words of the 64-bit value.

For AArch32 targets, these get compiled down to the hardware rev16 instruction
at -O1 and above. For AArch64 targets, the 64-bit ones get compiled to two
32-bit rev16 instructions, because there is not currently a pattern for the
64-bit rev16 instruction.
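
The intended semantics can be sketched in portable C. The `model_*` names are hypothetical and this is not the arm_acle.h source; `__builtin_bswap32` is a GCC/Clang builtin used here only to show the "byte-swap, then rotate by 16" formulation.

```c
#include <stdint.h>

/* Swap the two bytes inside each 16-bit halfword of a 32-bit word. */
static uint32_t model_rev16(uint32_t x) {
    return ((x & 0x00FF00FFu) << 8) | ((x & 0xFF00FF00u) >> 8);
}

/* Equivalent formulation: reverse all four bytes, then rotate by 16 bits. */
static uint32_t model_rev16_rot(uint32_t x) {
    uint32_t b = __builtin_bswap32(x);
    return (b << 16) | (b >> 16);
}

/* 64-bit variant: apply the 32-bit operation to the top and bottom words. */
static uint64_t model_rev16ll(uint64_t x) {
    uint64_t hi = model_rev16((uint32_t)(x >> 32));
    uint64_t lo = model_rev16((uint32_t)x);
    return (hi << 32) | lo;
}
```

For example, `model_rev16(0x11223344)` gives `0x22114433`: each halfword has its bytes swapped, but the halfwords stay in place.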

Differential Revision: http://reviews.llvm.org/D14609

llvm-svn: 253211

7aa90f57

Nov 11, 2015

[X86] Add 'pause' builtin that's already in llvm and use it instead of inline... · fb79b5f2
Craig Topper authored Nov 11, 2015
```
[X86] Add 'pause' builtin that's already in llvm and use it instead of inline assembly to implement _mm_pause.

llvm-svn: 252712
```
fb79b5f2

[X86] Use __builtin_ia32_paddq and __builtin_ia32_psubq to implement a couple... · a5455524

Craig Topper authored Nov 11, 2015

[X86] Use __builtin_ia32_paddq and __builtin_ia32_psubq to implement a couple intrinsics that were supposed to operate on MMX registers. Otherwise we end up operating on GPRs. Throw in a test for _mm_mul_su32 while I was there.

llvm-svn: 252711

a5455524

[X86] Header formatting fixes. NFC · 880f60b7
Craig Topper authored Nov 11, 2015
```
llvm-svn: 252710
```
880f60b7

[X86] Add missing typecasts in intrinsic macros. This should make them more... · d619eaaa

Craig Topper authored Nov 11, 2015

[X86] Add missing typecasts in intrinsic macros. This should make them more robust against inputs that aren't already the right type.

llvm-svn: 252700

d619eaaa

[X86] Change pointer type in AVX2 gather builtins to be the scalar type... · 19744ee6

Craig Topper authored Nov 11, 2015

[X86] Change pointer type in AVX2 gather builtins to be the scalar type instead of the vector type. This matches gcc and removes extra casts.

llvm-svn: 252697

19744ee6

Nov 10, 2015
- [X86] Use setzero instead of set1(0) in a few places in intrinsic headers. · fd778eeb
  Craig Topper authored Nov 10, 2015
```
llvm-svn: 252587
```
  fd778eeb
- [X86] Remove temporary variables from macros in x86 intrinsic headers.... · 71481667
  Craig Topper authored Nov 10, 2015
```
[X86] Remove temporary variables from macros in x86 intrinsic headers. Prevents duplicate names appearing from multiple macro expansions. NFC

llvm-svn: 252586
```
  71481667
- [X86] Fix bad intrinsic header comment. NFC. · 166f8b20
  Craig Topper authored Nov 10, 2015
```
llvm-svn: 252585
```
  166f8b20
Nov 03, 2015
- Fix a couple intrinsic header comments. NFC · 991d4994
  Craig Topper authored Nov 03, 2015
```
llvm-svn: 251900
```
  991d4994
Oct 27, 2015

Handle target builtin options that are all required rather than · 99af5b2e

Eric Christopher authored Oct 27, 2015

only one of a group of possibilities.

This changes the syntax in the builtin files to represent:

, as the and operator
| as the or operator

The former syntax matches how the backend tablegen files represent
multiple subtarget features being required.

Updated the builtin and intrinsic headers accordingly for the new
syntax.
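
A minimal sketch of how such a feature string could be evaluated. Everything here is an assumption for illustration: the function names are hypothetical, this is not Clang's implementation, and the sketch assumes `|` binds tighter than `,` (each comma-separated group is satisfied when any of its `|` alternatives is enabled), which the commit message does not spell out.

```c
#include <stddef.h>
#include <string.h>

/* Returns 1 if the name spanning [s, s + len) appears in the enabled list. */
static int feature_enabled(const char *s, size_t len,
                           const char *const *enabled, size_t n) {
    for (size_t i = 0; i < n; ++i)
        if (strlen(enabled[i]) == len && strncmp(enabled[i], s, len) == 0)
            return 1;
    return 0;
}

/* Evaluates a spec such as "a,b" (both required) or "a|b" (either one).
   Assumption: '|' binds tighter than ','. An empty spec requires nothing. */
static int spec_satisfied(const char *spec,
                          const char *const *enabled, size_t n) {
    const char *p = spec;
    if (*spec == '\0')
        return 1;
    for (;;) {
        int any = 0;
        const char *q = p;
        for (;;) {                       /* walk the '|' alternatives */
            size_t alt_len = strcspn(q, "|,");
            if (feature_enabled(q, alt_len, enabled, n))
                any = 1;
            q += alt_len;
            if (*q != '|')
                break;
            ++q;                         /* skip '|' */
        }
        if (!any)
            return 0;                    /* a required group is unmet */
        if (*q == '\0')
            return 1;                    /* every group was satisfied */
        p = q + 1;                       /* skip ',' and check next group */
    }
}
```

With only `avx512vl` and `fma` enabled, the spec `"avx512vl,avx512bw"` is rejected while `"avx512vl|avx512bw"` is accepted.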

llvm-svn: 251388

99af5b2e

Oct 20, 2015

[x86] Fix maskload/store intrinsic definitions in avxintrin.h · 8bb12d0a

Andrea Di Biagio authored Oct 20, 2015

According to the Intel documentation, the mask operand of the maskload and
maskstore intrinsics is always a vector of packed integer/long integer values.
This patch introduces the following two changes:
 1. It fixes the avx maskload/store intrinsic definitions in avxintrin.h.
 2. It changes BuiltinsX86.def to match the correct gcc definitions for avx
    maskload/store (see D13861 for more details).

Differential Revision: http://reviews.llvm.org/D13861

llvm-svn: 250816

8bb12d0a

Oct 16, 2015
- [X86] Add fxsr feature name for fxsave/fxrestore builtins. · e33f51fa
  Craig Topper authored Oct 16, 2015
```
llvm-svn: 250498
```
  e33f51fa
Oct 15, 2015
- Headers: Switch some headers to LF line endings for consistency. · e919b0f9
  Peter Collingbourne authored Oct 15, 2015
```
llvm-svn: 250388
```
  e919b0f9
Oct 14, 2015
- Intrin.h: implement __emul and __emulu · 4ca00afd
  Hans Wennborg authored Oct 14, 2015
```
llvm-svn: 250301
```
  4ca00afd
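
For readers unfamiliar with these MSVC intrinsics: `__emul` and `__emulu` multiply two 32-bit operands and return the full 64-bit product (signed and unsigned respectively). A portable sketch of that behavior follows; the `model_*` names are hypothetical and this is not the Intrin.h source.

```c
#include <stdint.h>

/* Signed 32 x 32 -> 64 multiply: widen first so the product cannot overflow. */
static int64_t model_emul(int32_t a, int32_t b) {
    return (int64_t)a * (int64_t)b;
}

/* Unsigned 32 x 32 -> 64 multiply. */
static uint64_t model_emulu(uint32_t a, uint32_t b) {
    return (uint64_t)a * (uint64_t)b;
}
```

Widening before the multiplication is the key point: `model_emul(65536, 65536)` yields 4294967296, which would not fit in a 32-bit result.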
Oct 13, 2015
- Add subtarget feature support for 3dnowa to the 3dnowa intrinsics. · 525334cf
  Eric Christopher authored Oct 13, 2015
```
llvm-svn: 250202
```
  525334cf