Commits · b6a11acb533132bd1da786c73962ed8307b50ff7 · Roger Ferrer / llvm-epi

Sep 09, 2016

Albert Gutowski authored Sep 08, 2016

Reviewers: thakis, Prazek, compnerd, rnk

Subscribers: majnemer, cfe-commits

Differential Revision: https://reviews.llvm.org/D24311

llvm-svn: 280997

b6a11acb

Aug 19, 2016

AMDGPU: Add clang builtin for ds_swizzle. · 03bdd8f7

Changpeng Fang authored Aug 18, 2016

Summary:
  int __builtin_amdgcn_ds_swizzle (int a, int imm);
while imm is a constant.

Differential Revision:
  http://reviews.llvm.org/D23682

llvm-svn: 279165

03bdd8f7

Aug 16, 2016
- Revert "[X86] Add xgetbv xsetbv intrinsics to non-windows platforms" · 66e7717b
  Reid Kleckner authored Aug 16, 2016
```
This reverts commit r278783.  It breaks usage of _xgetbv on Windows.

llvm-svn: 278814
```
  66e7717b
- [X86] Add xgetbv xsetbv intrinsics to non-windows platforms · 197b65f8
  Marina Yatsina authored Aug 16, 2016
```
commit on behalf of guyblank

Differential Revision: https://reviews.llvm.org/D21959

llvm-svn: 278783
```
  197b65f8
Aug 10, 2016

[x86] Fix a really nasty bug introduced in r276417 where alignment · 4c5e8ccf

Chandler Carruth authored Aug 10, 2016

constraints were added to _mm256_broadcast_{pd,ps} intel intrinsics.

The spec for these intrinsics is ... pretty much silent on alignment.
This is especially frustrating considering the amount of discussion of
alignment in the load and store intrinsics. So I was forced to rely on
the specification for the VBROADCASTF128 instruction.

That instruction's spec is *also* completely silent on alignment.
Fortunately, when it comes to the instruction's spec, silence is enough.
There is no #GP fault option for an underaligned address so this
instruction, and by inference the intrinsic, can read any alignment.

As it happens, the old code worked exactly this way and in fact we have
plenty of code that hands pointers with less than 16-byte alignment to
these intrinsics. This code broke pretty spectacularly with this commit.

Fortunately, the fix is super simple! Change a 16 to a 1, and ta da!

Anyways, a lot of debugging for a really boring fix. =]

llvm-svn: 278202

4c5e8ccf
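
The behaviour this commit message describes (read 128 bits from an address of any alignment, then replicate them into both halves of a 256-bit value) can be modelled in portable C. This is only a sketch of the semantics, not the code in the intrinsic header: `vec4d_sketch` and `broadcast_pd_sketch` are hypothetical names, and `memcpy` stands in for an alignment-1 load.

```c
#include <string.h>

/* Hypothetical stand-in for a 256-bit vector of four doubles. */
typedef struct { double lane[4]; } vec4d_sketch;

/* Load 128 bits (two doubles) from an address with no alignment
   requirement and copy them into both 128-bit halves of the result.
   memcpy keeps the read valid for any alignment. */
static vec4d_sketch broadcast_pd_sketch(const void *src)
{
    double lo[2];
    vec4d_sketch out;
    memcpy(lo, src, sizeof lo);            /* alignment-1 load   */
    memcpy(&out.lane[0], lo, sizeof lo);   /* lower 128 bits     */
    memcpy(&out.lane[2], lo, sizeof lo);   /* upper 128 bits     */
    return out;
}
```

Passing a pointer that is deliberately offset by one byte from an aligned buffer should give the same four lanes as an aligned pointer, which is the property the fix restores.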

Aug 05, 2016
- AMDGPU: Add Clang builtin intrinsics for compare with the full · 91c84509
  Wei Ding authored Aug 05, 2016
```
wavefront result.

Differential Revision: http://reviews.llvm.org/D22934

llvm-svn: 277824
```
  91c84509
Aug 04, 2016

[OpenCL] Added underscores to the names of 'to_addr' OpenCL built-ins. · d8162326

Alexey Bader authored Aug 04, 2016

Summary:
In order to re-define OpenCL built-in functions
'to_{private,local,global}' in OpenCL run-time library LLVM names must
be different from the clang built-in function names.

Reviewers: yaxunl, Anastasia

Subscribers: cfe-commits

Differential Revision: https://reviews.llvm.org/D23120

llvm-svn: 277743

d8162326

Jul 22, 2016

[X86][AVX] Added support for lowering to VBROADCASTF128/VBROADCASTI128 with generic IR · 2d851730

Simon Pilgrim authored Jul 22, 2016

As discussed on D22460, I've updated the vbroadcastf128 pd256/ps256 builtins to map directly to generic IR - load+splat a 128-bit vector to both lanes of a 256-bit vector.

Fix for PR28657.

llvm-svn: 276417

2d851730

Jul 15, 2016
- AMDGPU: Remove legacy ldexp builtin · c7536a5d
  Matt Arsenault authored Jul 15, 2016
```
llvm-svn: 275623
```
  c7536a5d
- AMDGPU: Update for rsq intrinsic changes · c86671da
  Matt Arsenault authored Jul 15, 2016
```
llvm-svn: 275622
```
  c86671da
- AMDGPU: Add Clang Builtin for v_lerp_u8 · ea41f356
  Wei Ding authored Jul 15, 2016
```
Differential Revision: http://reviews.llvm.org/D22380

llvm-svn: 275577
```
  ea41f356
Jul 11, 2016

AMDGPU: Export workitem builtins · d7e03a5b

Jan Vesely authored Jul 10, 2016

Reviewers: tstellardAMD

Differential Revision: http://reviews.llvm.org/D20299

llvm-svn: 275030

d7e03a5b

Jul 08, 2016
- [CodeGen] Use llvm::Type::getVectorNumElements instead of casting to... · f2f1a099
  Craig Topper authored Jul 08, 2016
```
[CodeGen] Use llvm::Type::getVectorNumElements instead of casting to llvm::VectorType and calling getNumElements. This is equivalent and shorter.

llvm-svn: 274823
```
  f2f1a099
- [X86] Reuse existing lambda and remove unnecessary argument from vector cmp builtin handling. NFC · 0160063a
  Craig Topper authored Jul 08, 2016
```
llvm-svn: 274821
```
  0160063a
- [X86] Remove a couple calls to create V2F64 and V4F32 types for builtin... · 925ef0a1
  Craig Topper authored Jul 08, 2016
```
[X86] Remove a couple calls to create V2F64 and V4F32 types for builtin handling. Just get the type from the operand of the builtin instead. NFC

llvm-svn: 274820
```
  925ef0a1
Jul 06, 2016
- [X86] Use native IR for immediate values 0-7 of packed fp cmp builtins. This... · 425d02d3
  Craig Topper authored Jul 06, 2016
```
[X86] Use native IR for immediate values 0-7 of packed fp cmp builtins. This makes them the same as what is done when using the SSE builtins for these same encodings.

llvm-svn: 274608
```
  425d02d3
- [AVX512] Use the generic ctlz intrinsic to implement the vplzcntd/q builtins. · 46e7555d
  Craig Topper authored Jul 06, 2016
```
llvm-svn: 274603
```
  46e7555d
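
For the packed fp cmp change above, the eight immediates 0-7 are the compare predicates shared with the SSE encodings. A scalar model of one lane might look like the sketch below. The predicate meanings follow Intel's instruction reference rather than the commit message, and `fp_cmp_lane_sketch` is a hypothetical name used only for illustration.

```c
#include <math.h>
#include <stdbool.h>

/* Scalar model of one lane of a packed fp compare for immediates 0-7.
   Returns true where the instruction would set the lane to all ones. */
static bool fp_cmp_lane_sketch(double a, double b, unsigned imm)
{
    bool unordered = isnan(a) || isnan(b);
    switch (imm & 7u) {
    case 0:  return !unordered && a == b;   /* EQ:    ordered equal          */
    case 1:  return !unordered && a <  b;   /* LT:    ordered less-than      */
    case 2:  return !unordered && a <= b;   /* LE:    ordered less-or-equal  */
    case 3:  return unordered;              /* UNORD: either operand is NaN  */
    case 4:  return unordered || a != b;    /* NEQ:   unordered or not equal */
    case 5:  return unordered || !(a <  b); /* NLT:   unordered or a >= b    */
    case 6:  return unordered || !(a <= b); /* NLE:   unordered or a > b     */
    default: return !unordered;             /* ORD:   neither is NaN         */
    }
}
```

Each case corresponds to a single ordered or unordered floating-point comparison, which is why these encodings can be expressed as plain compare operations instead of a target-specific call.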
Jul 05, 2016

[OpenCL] An implementation of device side enqueue (DSE) from OpenCL v2.0 s6.13.17. · db7a31cc

Anastasia Stulova authored Jul 05, 2016

- Added new Builtins: enqueue_kernel, get_kernel_work_group_size
and get_kernel_preferred_work_group_size_multiple.

These Builtins use a custom check to diagnose parameters of the passed Blocks,
i.e. a variable number of 'local void*' type params, and check the different
overloads specified in Table 6.31 of OpenCL v2.0.

- IR is generated as an internal library call for each OpenCL Builtin,
reusing ObjC Block implementation.

Review: http://reviews.llvm.org/D20249
llvm-svn: 274540

db7a31cc

Jul 04, 2016

[OpenCL] Make OpenCL Builtins added according to the right version. · 7f8d6dc0

Anastasia Stulova authored Jul 04, 2016

Currently we only have OpenCL 2.0 Builtins i.e. pipes or address space conversions.

They have to be added only in the version 2.0 compilation mode to make the identifiers
available for use in the other versions.

Review: http://reviews.llvm.org/D20249
llvm-svn: 274509

7f8d6dc0

[AVX512] Modify what indices we emit for the zero vector we use for zero... · ac1823f6

Craig Topper authored Jul 04, 2016

[AVX512] Modify what indices we emit for the zero vector we use for zero extension of the result of a v2i1 or v4i1 masked compare. This way we emit something that the backend easily interprets as a concatenation rather than a true shuffle. This delivers slightly better codegen with the current backend capabilities.

llvm-svn: 274484

ac1823f6

Jul 01, 2016

Emit more intrinsics for builtin functions · f652caea

Matt Arsenault authored Jul 01, 2016

This is important for building libclc. Since r273039 tests are failing
due to now emitting calls to these functions instead of emitting the
DAG node. The libm function names are implemented for OpenCL, and should
call the locally defined versions, so -fno-builtin is used. The IR
Some functions use the __builtins and expect the intrinsics to be
emitted. Without this we end up with nobuiltin calls to intrinsics
or to unsupported library calls.

llvm-svn: 274370

f652caea

Jun 29, 2016
- [AVX512] Zero extend cmp intrinsic return value. · 2c880cf9
  Igor Breger authored Jun 29, 2016
```
Differential Revision: http://reviews.llvm.org/D21746

llvm-svn: 274110
```
  2c880cf9
Jun 28, 2016
- AMDGPU: Add builtin to read exec mask · 64665bc5
  Matt Arsenault authored Jun 28, 2016
```
llvm-svn: 273965
```
  64665bc5
Jun 22, 2016
- [AVX512] Replace masked integer cmp and ucmp builtins with native IR. · d1691c70
  Craig Topper authored Jun 22, 2016
```
llvm-svn: 273378
```
  d1691c70
Jun 17, 2016

[X86][SSE4A] Use native IR for mask movntsd/movntss intrinsics. · d39d0263
Simon Pilgrim authored Jun 17, 2016
```
Depends on llvm side commit r273002.

llvm-svn: 273003
```
d39d0263

[ARM] Add mrrc/mrrc2 intrinsics and update existing mcrr/mcrr2 intrinsics. · ca2b3e7b

Ranjeet Singh authored Jun 17, 2016

Reapplying patch in r272777 which was reverted
because the llvm patch which added support
for generating the mcrr/mcrr2 instructions
from the intrinsic was causing an assertion
failure. This has now been fixed in llvm.

llvm-svn: 272983

ca2b3e7b

Jun 16, 2016

[x86] generate IR for AVX2 integer min/max builtins · dbd68dd0
Sanjay Patel authored Jun 16, 2016
```
Sibling patch to r272932:
http://reviews.llvm.org/rL272932

llvm-svn: 272933
```
dbd68dd0

[Builtin] Make __builtin_thread_pointer target-independent. · a46fade6

Marcin Koscielnicki authored Jun 16, 2016

This is now supported for ARM, AArch64, PowerPC, SystemZ, SPARC, Mips.

Differential Revision: http://reviews.llvm.org/D19589

llvm-svn: 272893

a46fade6

Jun 15, 2016

[x86] translate SSE packed FP comparison builtins to IR · 280cfd1a

Sanjay Patel authored Jun 15, 2016

As noted in the code comment, a potential follow-on would be to remove
the builtins themselves. Other than ord/unord, this already works as
expected. E.g.:

  typedef float v4sf __attribute__((__vector_size__(16)));
  v4sf fcmpgt(v4sf a, v4sf b) { return a > b; }

Differential Revision: http://reviews.llvm.org/D21268

llvm-svn: 272840

280cfd1a

[x86] generate IR for SSE integer min/max builtins · 7495ec02
Sanjay Patel authored Jun 15, 2016
```
Sibling patch to r272806:
http://reviews.llvm.org/rL272806

llvm-svn: 272807
```
7495ec02

Reverting r272777 because one of the tests · d48760da
Ranjeet Singh authored Jun 15, 2016
```
added in the llvm patch is causing an assertion
to fail.

llvm-svn: 272790
```
d48760da

[AVX512] Use native IR for mask pcmpeq/pcmpgt intrinsics. · a54c21e7
Craig Topper authored Jun 15, 2016
```
llvm-svn: 272787
```
a54c21e7

[ARM] Add mrrc/mrrc2 intrinsics and update existing mcrr/mcrr2 intrinsics. · 8d5ad5bd

Ranjeet Singh authored Jun 15, 2016

Patch adds intrinsics for mrrc/mrrc2. The
intrinsics for mrrc/mrrc2 return a single
uint64_t to represent two 32 bit values.

The mcrr/mcrr2 intrinsic was changed to
accept a single uint64_t instead of two
32 bit values as the input for consistency.

Differential Revision: http://reviews.llvm.org/D21179

llvm-svn: 272777

8d5ad5bd
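
The idea of carrying two 32-bit register values in one `uint64_t`, as this commit describes for mrrc/mrrc2 and mcrr/mcrr2, can be sketched in portable C. The helper names are hypothetical, and the choice of which register lands in the low half is an assumption made for illustration; the commit message does not state it.

```c
#include <stdint.h>

/* Combine two 32-bit register values into one 64-bit value.
   ASSUMPTION: the first register occupies the low 32 bits. */
static uint64_t pack_pair_sketch(uint32_t first, uint32_t second)
{
    return ((uint64_t)second << 32) | first;
}

/* Split a 64-bit value back into its two 32-bit halves,
   using the same (assumed) layout as pack_pair_sketch. */
static void unpack_pair_sketch(uint64_t value, uint32_t *first, uint32_t *second)
{
    *first  = (uint32_t)(value & 0xFFFFFFFFu);
    *second = (uint32_t)(value >> 32);
}
```

Using the same 64-bit representation on both the read and the write side is what lets a value read by one intrinsic be passed straight to the other without repacking.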

Jun 13, 2016

Fix unused variable warning · 532de1ce
Simon Pilgrim authored Jun 13, 2016
```
llvm-svn: 272541
```
532de1ce

[Clang][X86] Convert non-temporal store builtins to generic __builtin_nontemporal_store in headers · beca5f29

Simon Pilgrim authored Jun 13, 2016

We can now use __builtin_nontemporal_store instead of target specific builtins for naturally aligned nontemporal stores which avoids the need for handling in CGBuiltin.cpp

The scalar integer nontemporal (unaligned) store builtins will have to wait as __builtin_nontemporal_store currently assumes natural alignment and doesn't accept the 'packed struct' trick that we use for normal unaligned load/stores.

The nontemporal loads require further backend support before we can safely convert them to __builtin_nontemporal_load

Differential Revision: http://reviews.llvm.org/D21272

llvm-svn: 272540

beca5f29
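
A minimal sketch of calling the generic builtin this commit refers to, for a naturally aligned `int`. The wrapper name `stream_store_int_sketch` is hypothetical, and the plain-store fallback is an assumption added so the snippet also builds on compilers that lack `__builtin_nontemporal_store`.

```c
/* __has_builtin is not available on every compiler, so provide a fallback. */
#ifndef __has_builtin
#define __has_builtin(x) 0
#endif

/* Store `value` to `dst`, hinting that the data will not be reused soon.
   Falls back to an ordinary store where the builtin is unavailable. */
static void stream_store_int_sketch(int *dst, int value)
{
#if __has_builtin(__builtin_nontemporal_store)
    __builtin_nontemporal_store(value, dst);  /* dst must be naturally aligned */
#else
    *dst = value;                             /* plain store, no hint */
#endif
}
```

The observable result is the same as an ordinary store; the non-temporal form only changes the cache hint attached to it.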

Jun 12, 2016

[CodeGen] Update to use an ArrayRef of uint32_t instead of int in calls to... · d1cb4cea

Craig Topper authored Jun 12, 2016

[CodeGen] Update to use an ArrayRef of uint32_t instead of int in calls to CreateShuffleVector to match llvm interface change.

llvm-svn: 272492

d1cb4cea

Jun 09, 2016
- [X86] Handle AVX2 pslldqi and psrldqi intrinsics shufflevector creation... · 2769bb57
  Craig Topper authored Jun 09, 2016
```
[X86] Handle AVX2 pslldqi and psrldqi intrinsics shufflevector creation directly in the header file instead of in CGBuiltin.cpp. Simplify the sse2 equivalents as well.

llvm-svn: 272246
```
  2769bb57
- [X86] Reuse the EmitX86Select routine to handle the select for masked palignr too. · c1442973
  Craig Topper authored Jun 09, 2016
```
llvm-svn: 272245
```
  c1442973
Jun 08, 2016

[AVX512] Emit select instruction instead of using x86 specific intrinsics. · aadb8762

Igor Breger authored Jun 08, 2016

This will allow us to remove the x86 intrinsics from the backend.

Differential Revision: http://reviews.llvm.org/D21060

llvm-svn: 272141

aadb8762
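
The per-element select that replaces the target-specific intrinsics can be modelled as follows. This is a scalar sketch of the semantics only: `mask_select_sketch` is a hypothetical name, and the 16-element limit simply reflects the width of the mask type chosen here.

```c
#include <stddef.h>
#include <stdint.h>

/* Per-element select driven by a bit mask: if bit i of `mask` is set,
   out[i] takes a[i]; otherwise it takes b[i]. Handles up to 16 elements. */
static void mask_select_sketch(uint16_t mask, const int32_t *a,
                               const int32_t *b, int32_t *out, size_t n)
{
    for (size_t i = 0; i < n && i < 16; ++i)
        out[i] = ((mask >> i) & 1u) ? a[i] : b[i];
}
```

With `b` acting as the pass-through operand, a masked operation becomes "compute the full result, then select", which is expressible without any target-specific call.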

Jun 06, 2016

[AVX512] Convert masked palignr builtins directly to native IR similar to the... · f51cc077

Craig Topper authored Jun 06, 2016

[AVX512] Convert masked palignr builtins directly to native IR similar to the other palignr builtins, but with a select to handle masking.

llvm-svn: 271873

f51cc077