Commits · a4451d88ee456304c26d552749aea6a7f5154bde · Lorenzo Albano / LLVM bpEVL

Jan 18, 2020

Consolidate internal denormal flushing controls · a4451d88

Matt Arsenault authored Nov 01, 2019

Currently there are 4 different mechanisms for controlling denormal
flushing behavior, and about as many equivalent frontend controls.

- AMDGPU uses the fp32-denormals and fp64-f16-denormals subtarget features
- NVPTX uses the nvptx-f32ftz attribute
- ARM directly uses the denormal-fp-math attribute
- Other targets indirectly use denormal-fp-math in one DAGCombine
- cl-denorms-are-zero has a corresponding denorms-are-zero attribute

AMDGPU wants a distinct control for f32 flushing from f16/f64, and as
far as I can tell the same is true for NVPTX (based on the attribute
name).

Work on consolidating these into the denormal-fp-math attribute and a
new type-specific denormal-fp-math-f32 variant. Only ARM seems to
support the two different flush modes, so this is overkill for the
other use cases. Ideally we would error on the unsupported
positive-zero mode on other targets from somewhere.
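
As a rough illustration of the modes named above, the following sketch models how a single f32 value would be treated under ieee, preserve-sign and positive-zero. It is a Python model for intuition only, not code from this patch; the helper names (`to_f32`, `is_f32_denormal`, `flush_f32`) are hypothetical.

```python
import math
import struct

# Smallest positive normal binary32 value, 2**-126.
F32_MIN_NORMAL = 2.0 ** -126


def to_f32(x):
    """Round a Python float (binary64) to the nearest binary32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def is_f32_denormal(x):
    """True if x is nonzero and smaller in magnitude than any normal f32."""
    return x != 0.0 and abs(x) < F32_MIN_NORMAL


def flush_f32(x, mode):
    """Model one f32 value under a denormal mode (hypothetical helper)."""
    x = to_f32(x)
    if mode == "ieee" or not is_f32_denormal(x):
        return x                      # denormals are kept as-is
    if mode == "preserve-sign":
        return math.copysign(0.0, x)  # flush to zero, keep the sign bit
    if mode == "positive-zero":
        return 0.0                    # flush to +0.0 regardless of sign
    raise ValueError("unknown denormal mode: " + mode)
```

For example, `flush_f32(-2.0**-140, "preserve-sign")` yields negative zero, while the same input under "positive-zero" yields positive zero and under "ieee" is returned unchanged.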

Move the logic for selecting the flush mode into the compiler driver,
instead of handling it in cc1. denormal-fp-math/denormal-fp-math-f32
are now both cc1 flags, but denormal-fp-math-f32 is not yet exposed as
a user flag.

-cl-denorms-are-zero, -fcuda-flush-denormals-to-zero and
-fno-cuda-flush-denormals-to-zero will be mapped to
-fdenormal-fp-math-f32=ieee or preserve-sign rather than the old
attributes.
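
The mapping described above could be sketched roughly as follows. This is a Python model, not the driver code; the function name is hypothetical, and the "last flag wins" rule and the `None` default are assumptions made for illustration.

```python
# Flags that request flushing of f32 denormals.
FLUSH_FLAGS = ("-cl-denorms-are-zero", "-fcuda-flush-denormals-to-zero")
# Flag that explicitly requests IEEE denormal handling.
NO_FLUSH_FLAGS = ("-fno-cuda-flush-denormals-to-zero",)


def denormal_fp_math_f32_mode(driver_flags):
    """Return the f32 denormal mode implied by the given driver flags.

    Returns None when no relevant flag is present (assumption: the
    mode is then left at whatever default applies).
    """
    mode = None
    for flag in driver_flags:
        if flag in FLUSH_FLAGS:
            mode = "preserve-sign"
        elif flag in NO_FLUSH_FLAGS:
            mode = "ieee"
    return mode
```

With this model, `["-cl-denorms-are-zero"]` selects preserve-sign and `["-fno-cuda-flush-denormals-to-zero"]` selects ieee.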

Stop emitting the denorms-are-zero attribute for the OpenCL flag. It
has no in-tree users. The meaning would also be target dependent, such
as the AMDGPU choice to treat this as only meaning allow flushing of
f32 and not f16 or f64. The naming is also potentially confusing,
since DAZ in other contexts refers to instructions implicitly treating
input denormals as zero, not necessarily flushing output denormals to
zero.
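
To make the input/output distinction concrete, here is a small Python model of an f32 multiply with two separate switches: one that treats denormal inputs as zero and one that flushes denormal results. The names `daz` and `ftz` and the helper functions are hypothetical and only serve the illustration.

```python
import math
import struct

F32_MIN_NORMAL = 2.0 ** -126  # smallest positive normal binary32 value


def to_f32(x):
    """Round a Python float (binary64) to the nearest binary32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def flush_if_denormal(x):
    """Replace an f32 denormal with a zero of the same sign."""
    if x != 0.0 and abs(x) < F32_MIN_NORMAL:
        return math.copysign(0.0, x)
    return x


def mul_f32(a, b, daz=False, ftz=False):
    """Model an f32 multiply with optional input and output flushing."""
    a, b = to_f32(a), to_f32(b)
    if daz:  # inputs are implicitly treated as zero
        a, b = flush_if_denormal(a), flush_if_denormal(b)
    result = to_f32(a * b)
    if ftz:  # only the result is flushed
        result = flush_if_denormal(result)
    return result
```

Multiplying the smallest denormal `2.0**-149` by `2.0**30` gives a normal result under output flushing alone, but gives zero once denormal inputs are treated as zero, which is why the two meanings should not be conflated.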

This also does not attempt to change the behavior for the current
attribute. The LangRef now states that the default is ieee behavior,
but this is inaccurate for the current implementation. The clang
handling is slightly hacky to avoid touching the existing
denormal-fp-math uses. Fixing this will be left for a future patch.

AMDGPU is still using the subtarget feature to control the denormal
mode, but the new attributes are now emitted. A future change will
switch this and remove the subtarget features.

a4451d88

Jan 17, 2020
- AMDGPU: Update clang test · 9b549f26
  Matt Arsenault authored Jan 16, 2020
  
  9b549f26
Dec 03, 2019

[OpenCL] Fix mangling of single-overload builtins · 6713670b

Sven van Haastregt authored Dec 03, 2019

Commit 9a8d477a ("[OpenCL] Add builtin function attribute
handling", 2019-11-05) stopped Clang from mangling single-overload
builtins, which is incorrect.

6713670b

Nov 05, 2019

[OpenCL] Add builtin function attribute handling · 9a8d477a

Sven van Haastregt authored Nov 05, 2019

Add handling for the "pure", "const" and "convergent" function
attributes for OpenCL builtin functions.

Patch by Pierre Gondois and Sven van Haastregt.

Differential Revision: https://reviews.llvm.org/D64319

9a8d477a

Sep 05, 2019
- AMDGPU: Add builtins for is_shared/is_private · 281f2e2c
  Matt Arsenault authored Sep 05, 2019
```
llvm-svn: 371010
```
  281f2e2c
Aug 27, 2019

AMDGPU: Always emit amdgpu-flat-work-group-size · eac783a9

Matt Arsenault authored Aug 27, 2019

The backend default maximum should be the hardware maximum, so the
frontend should set the implementation defined default maximum.

llvm-svn: 370101

eac783a9

Aug 06, 2019

Builtins: Start adding half versions of math builtins · acd0a53c

Matt Arsenault authored Aug 06, 2019

The implementation of the OpenCL builtin library currently uses 2
different hacks to get to the corresponding IR intrinsics from the
source. This will allow removal of those.

This is the set that is currently used (minus a few vector ones).

llvm-svn: 367973

acd0a53c

Aug 05, 2019
- [OpenCL] Fix vector literal test broken in rL367675. · ab4a5d14
  Anastasia Stulova authored Aug 05, 2019
```
Avoid unnecessarily checking alignment, which is not portable
among targets.

llvm-svn: 367823
```
  ab4a5d14
Aug 03, 2019

IR: print value numbers for unnamed function arguments · a009a60a

Tim Northover authored Aug 03, 2019

For consistency with normal instructions and clarity when reading IR,
it's best to print the %0, %1, ... names of function arguments in
definitions.

Also modifies the parser to accept IR in that form for obvious reasons.
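
A minimal sketch of the printing rule, assuming that only unnamed arguments consume a value number and that numbering starts at 0. This is a Python model for illustration, not the LLVM printer; `format_define` is a hypothetical helper.

```python
def format_define(ret_type, func_name, params):
    """Render a function header, numbering unnamed arguments %0, %1, ...

    params is a list of (type, name) pairs; name is None for an
    unnamed argument.
    """
    next_slot = 0
    rendered = []
    for ty, name in params:
        if name is None:
            # Unnamed values take the next sequential number.
            name = str(next_slot)
            next_slot += 1
        rendered.append(f"{ty} %{name}")
    return f"define {ret_type} @{func_name}({', '.join(rendered)})"
```

For a function with parameters `(i32, i32 %x, float)`, the model prints `define i32 @f(i32 %0, i32 %x, float %1)`.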

llvm-svn: 367755

a009a60a

Aug 02, 2019

[OpenCL] Allow OpenCL C style vector initialization in C++ · 8d99a5c0

Anastasia Stulova authored Aug 02, 2019

Allow creating vector literals from other vectors.

 float4 a = (float4)(1.0f, 2.0f, 3.0f, 4.0f);
 float4 v = (float4)(a.s23, a.s01);
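
The component order produced by the second literal can be modelled as below. This is a Python sketch of the flattening only; `make_vector` is a hypothetical helper, and the single-scalar form is not modelled.

```python
def make_vector(width, *parts):
    """Build a vector from scalars and smaller vectors, left to right."""
    components = []
    for part in parts:
        if isinstance(part, (list, tuple)):
            components.extend(part)   # a vector contributes all its lanes
        else:
            components.append(part)   # a scalar contributes one lane
    if len(components) != width:
        raise ValueError("expected %d components, got %d"
                         % (width, len(components)))
    return tuple(float(c) for c in components)


a = make_vector(4, 1.0, 2.0, 3.0, 4.0)
# a.s23 selects lanes 2 and 3; a.s01 selects lanes 0 and 1.
v = make_vector(4, a[2:4], a[0:2])
```

Here `v` ends up as `(3.0, 4.0, 1.0, 2.0)`.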

Differential revision: https://reviews.llvm.org/D65286

llvm-svn: 367675

8d99a5c0

Jul 31, 2019
- AMDGPU: Add missing builtin declarations · 64d7af09
  Matt Arsenault authored Jul 31, 2019
```
llvm-svn: 367431
```
  64d7af09
Jul 25, 2019

[OpenCL] Rename lang mode flag for C++ mode · 88ed70e2

Anastasia Stulova authored Jul 25, 2019

Rename lang mode flag to -cl-std=clc++/-cl-std=CLC++
or -std=clc++/-std=CLC++.

This aligns with OpenCL C convention and removes ambiguity
with OpenCL C++.

Differential Revision: https://reviews.llvm.org/D65102

llvm-svn: 367008

88ed70e2

Jul 22, 2019

Updated the signature for some stack related intrinsics (CLANG) · 8c5e6fa6

Christudasan Devadasan authored Jul 22, 2019

Modified the intrinsics int_addressofreturnaddress,
int_frameaddress and int_sponentry.
This commit depends on the changes in rL366679.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D64563

llvm-svn: 366683

8c5e6fa6

Jul 17, 2019
- AMDGPU: Add some missing builtins · e56865d4
  Matt Arsenault authored Jul 17, 2019
```
llvm-svn: 366286
```
  e56865d4
Jul 16, 2019

[OpenCL] Fixing sampler initialisations for C++ mode. · 8ece3b67

Neil Hickey authored Jul 16, 2019

Allow conversions between integer and sampler type.

Differential Revision: https://reviews.llvm.org/D64791

llvm-svn: 366212

8ece3b67

Jul 10, 2019

[clang] Preserve names of addrspacecast'ed values. · de811d1f
Vyacheslav Zakharin authored Jul 10, 2019
```
Differential Revision: https://reviews.llvm.org/D63846

llvm-svn: 365666
```
de811d1f

[AMDGPU] Increased the number of implicit argument bytes for both OpenCL and HIP (CLANG). · 18ba9d60

Christudasan Devadasan authored Jul 10, 2019

To enable a new implicit kernel argument,
increased the number of argument bytes from 48 to 56.

Reviewed By: yaxunl

Differential Revision: https://reviews.llvm.org/D63756

llvm-svn: 365643

18ba9d60

Jul 09, 2019

Use the Itanium C++ ABI for the pipe_builtin.cl test · 9b28d9c3

Reid Kleckner authored Jul 09, 2019

Certain OpenCL constructs cannot yet be mangled in the MS C++ ABI.
Add a FIXME for it if anyone cares to implement it.

llvm-svn: 365557

9b28d9c3

[AMDGPU] gfx908 clang target · 0cfd75a0
Stanislav Mekhanoshin authored Jul 09, 2019
```
Differential Revision: https://reviews.llvm.org/D64430

llvm-svn: 365528
```
0cfd75a0

[OpenCL][Sema] Fix builtin rewriting · b00d5f73

Marco Antognini authored Jul 09, 2019

This patch ensures built-in functions are rewritten using the proper
parent declaration.

Existing tests are modified to run in C++ mode to ensure the
functionality works also with C++ for OpenCL while not increasing the
testing runtime.

llvm-svn: 365499

b00d5f73

Jul 08, 2019

Add nofree attribute to CodeGenOpenCL/convergent.cl test · e6ba2254

Brian Homerding authored Jul 08, 2019

The revision at https://reviews.llvm.org/rL365336 added inference of the nofree
attribute.  This revision updates the test to reflect this.

Differential Revision: https://reviews.llvm.org/D49165

llvm-svn: 365341

e6ba2254

Jun 25, 2019
- AMDGPU: Fix missing declaration for mbcnt builtins · 5495f781
  Matt Arsenault authored Jun 24, 2019
```
llvm-svn: 364251
```
  5495f781
Jun 24, 2019

[clang][NewPM] Add RUNS for tests that produce slightly different IR under new PM · f336eb34

Leonard Chan authored Jun 24, 2019

For CodeGenOpenCL/convergent.cl, the new PM produced a slightly different for
loop, but this still checks for no loop unrolling as intended. This is
committed separately from D63174.

llvm-svn: 364202

f336eb34

Jun 22, 2019
- AMDGPU: Fix target builtins for gfx10 · fc849252
  Matt Arsenault authored Jun 22, 2019
```
This wasn't setting some of the features from older generations.

llvm-svn: 364123
```
  fc849252
Jun 20, 2019
- AMDGPU: Add DS GWS sema builtins · bcdbc9a1
  Matt Arsenault authored Jun 20, 2019
```
llvm-svn: 363986
```
  bcdbc9a1
Jun 19, 2019
- Reapply "r363684: AMDGPU: Add GWS instruction builtins" · f46f4141
  Matt Arsenault authored Jun 19, 2019
```
llvm-svn: 363871
```
  f46f4141
- Revert rL363684 : AMDGPU: Add GWS instruction builtins · 6828bc56
  Simon Pilgrim authored Jun 19, 2019
```
........
Depends on rL363678 which was reverted at rL363797

llvm-svn: 363824
```
  6828bc56
Jun 18, 2019
- AMDGPU: Add GWS instruction builtins · 2acc7176
  Matt Arsenault authored Jun 18, 2019
```
llvm-svn: 363684
```
  2acc7176
Jun 14, 2019
- [AMDGPU] gfx1011/gfx1012 clang support · cafccd7a
  Stanislav Mekhanoshin authored Jun 14, 2019
```
Differential Revision: https://reviews.llvm.org/D63308

llvm-svn: 363345
```
  cafccd7a
- [AMDGPU] gfx1010 wave32 clang support · 8a8131a3
  Stanislav Mekhanoshin authored Jun 13, 2019
```
Differential Revision: https://reviews.llvm.org/D63209

llvm-svn: 363341
```
  8a8131a3
Jun 05, 2019

LLVM IR: Generate new-style byval-with-Type from Clang · c46827c7

Tim Northover authored Jun 05, 2019

LLVM IR recently added a Type parameter to the byval Attribute, so that
when pointers become opaque and no longer have an element type the
information will still be present in IR.

For now the Type parameter is optional (which is why Clang didn't need
this change at the time), but it will become mandatory soon.

llvm-svn: 362652

c46827c7

May 30, 2019

Reapply: LLVM IR: update Clang tests for byval being a typed attribute. · fcb00d4a

Tim Northover authored May 30, 2019

Since byval is now a typed attribute it gets sorted slightly differently by
LLVM when the order of attributes is being canonicalized. This updates the few
Clang tests that depend on the old order.

Clang patch is unchanged.

llvm-svn: 362129

fcb00d4a

[OpenCL] Fix OpenCL/SPIR version metadata in C++ mode. · f61b5481

Anastasia Stulova authored May 30, 2019

C++ for OpenCL is derived from OpenCL C v2.0, so set the versions
identically.

Differential Revision: https://reviews.llvm.org/D62657

llvm-svn: 362102

f61b5481

[OpenCL] Support logical vector operators in C++ mode · ce127bb6

Sven van Haastregt authored May 30, 2019

Support logical operators on vectors in C++ for OpenCL mode, to
preserve backwards compatibility with OpenCL C.
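
For reference, OpenCL C evaluates logical operators on vectors component-wise and yields -1 (all bits set) for true and 0 for false. The sketch below models that behaviour on integer vectors in Python; the function names are hypothetical and this is not the Clang implementation.

```python
def logical_and(lhs, rhs):
    """Component-wise &&: -1 where both lanes are nonzero, else 0."""
    return tuple(-1 if (x != 0 and y != 0) else 0 for x, y in zip(lhs, rhs))


def logical_or(lhs, rhs):
    """Component-wise ||: -1 where either lane is nonzero, else 0."""
    return tuple(-1 if (x != 0 or y != 0) else 0 for x, y in zip(lhs, rhs))


def logical_not(vec):
    """Component-wise !: -1 where the lane is zero, else 0."""
    return tuple(-1 if x == 0 else 0 for x in vec)
```

For instance, `logical_and((1, 0, 5, 0), (1, 1, 0, 0))` gives `(-1, 0, 0, 0)`.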

Differential Revision: https://reviews.llvm.org/D62588

llvm-svn: 362087

ce127bb6

May 29, 2019

Revert "LLVM IR: update Clang tests for byval being a typed attribute." · 4b281755
Tim Northover authored May 29, 2019
```
The underlying LLVM change couldn't cope with llvm-link and broke LTO builds.

llvm-svn: 362028
```
4b281755

LLVM IR: update Clang tests for byval being a typed attribute. · 45e8cc66

Tim Northover authored May 29, 2019

Since byval is now a typed attribute it gets sorted slightly differently by
LLVM when the order of attributes is being canonicalized. This updates the few
Clang tests that depend on the old order.

llvm-svn: 362013

45e8cc66

May 27, 2019

[OpenCL] Fix file-scope const sampler variable for 2.0 · a53d48b7

Yaxun Liu authored May 27, 2019

OpenCL spec v2.0 s6.13.14:

Samplers can also be declared as global constants in the program
source using the following syntax.

   const sampler_t <sampler name> = <value>

This works fine for OpenCL 1.2 but fails for 2.0, because clang deduces
the address space of a file-scope const sampler variable to be the global
address space, whereas spec v2.0 s6.9.b forbids file-scope sampler variables
in the global address space.

The fix is not to deduce address space for file-scope sampler variables.

Differential Revision: https://reviews.llvm.org/D62197

llvm-svn: 361757

a53d48b7

May 24, 2019

[OpenCL] Add support for the cl_arm_integer_dot_product extensions · aa7754cc

Kevin Petit authored May 24, 2019

The specification is available in the Khronos OpenCL registry:

https://www.khronos.org/registry/OpenCL/extensions/arm/cl_arm_integer_dot_product.txt

Signed-off-by: Kevin Petit <kevin.petit@arm.com>
llvm-svn: 361641

aa7754cc

May 17, 2019
- [NFC] Fix line endings in OpenCL tests · 151d4f88
  Sven van Haastregt authored May 17, 2019
```
llvm-svn: 361004
```
  151d4f88
May 14, 2019
- [AMDGPU] gfx1010 clang target · 91792f1b
  Stanislav Mekhanoshin authored May 13, 2019
```
Differential Revision: https://reviews.llvm.org/D61875

llvm-svn: 360634
```
  91792f1b