- Aug 08, 2018
-
-
Matt Arsenault authored
Fast FMAF is not a sufficient condition to enable denormals. Before VI, enabling denormals caused F32 instructions to run at F64 speeds. llvm-svn: 339278
-
Scott Linder authored
NFC refactor of code to generate debug info for OpenCL 2.X blocks. Differential Revision: https://reviews.llvm.org/D50099 llvm-svn: 339265
-
- Aug 07, 2018
-
-
Douglas Yung authored
llvm-svn: 339188
-
Douglas Yung authored
Make test more robust by not checking hard coded debug info values, but instead check the relationships between them. llvm-svn: 339185
-
Scott Linder authored
Always emit alloca in entry block for enqueue_kernel builtin. Ensures the statically sized alloca is not converted to DYNAMIC_STACKALLOC later because it is not in the entry block. llvm-svn: 339150
-
Matt Arsenault authored
llvm-svn: 339110
-
Matt Arsenault authored
llvm-svn: 339109
-
- Aug 03, 2018
-
-
Vlad Tsyrklevich authored
This reverts commit r338899, it was causing ASan test failures on sanitizer-x86_64-linux-fast. llvm-svn: 338904
-
Scott Linder authored
Ensures the statically sized alloca is not converted to DYNAMIC_STACKALLOC later because it is not in the entry block. Differential Revision: https://reviews.llvm.org/D50104 llvm-svn: 338899
-
- Aug 02, 2018
-
-
Matt Arsenault authored
llvm-svn: 338754
-
Matt Arsenault authored
The way address space declarations for builtins currently work is nearly useless. The code assumes the address spaces used for builtins is a confusingly named "target address space" from user code using __attribute__((address_space(N))) that matches the builtin declaration. There's no way to use this to declare a builtin that returns a language specific address space. The terminology used is highly cofusing since it has nothing to do with the the address space selected by the target to use for a language address space. This feature is essentially unused as-is. AMDGPU and NVPTX are the only in-tree targets attempting to use this. The AMDGPU builtins certainly do not behave as intended (i.e. all of the builtins returning pointers can never compile because the numbered address space never matches the expected named address space). The NVPTX builtins are missing tests for some, and the others seem to rely on an implicit addrspacecast. Change the used address space for builtins based on a target hook to allow using a language address space for a builtin. This allows the same builtin declaration to be used for multiple languages with similarly purposed address spaces (e.g. the same AMDGPU builtin can be used in OpenCL and CUDA even though the constant address spaces are arbitarily different). This breaks the possibility of using arbitrary numbered address spaces alongside the named address spaces for builtins. If this is an issue we probably need to introduce another builtin declaration character to distinguish language address spaces from so-called "target address spaces". llvm-svn: 338707
-
- Aug 01, 2018
-
-
Konstantin Zhuravlyov authored
Differential Revision: https://reviews.llvm.org/D50011 llvm-svn: 338471
-
- Jul 30, 2018
-
-
Scott Linder authored
OpenCL block literal structs have different fields which are now correctly identified in the debug info. Differential Revision: https://reviews.llvm.org/D49930 llvm-svn: 338299
-
- Jul 13, 2018
-
-
JF Bastien authored
Summary: Automatic variable initialization was generating default-aligned stores (which are deprecated) instead of using the known alignment from the alloca. Further, they didn't specify inbounds. Subscribers: dexonsmith, cfe-commits Differential Revision: https://reviews.llvm.org/D49209 llvm-svn: 337041
-
- May 21, 2018
-
-
Daniil Fukalov authored
1. added restrictions to memory scope, order and volatile parameters 2. added custom processing for these builtins - currently is not used code, needed to switch off GCCBuiltin link to the builtins (ongoing change to llvm tree) 3. builtins renamed as requested Differential Revision: https://reviews.llvm.org/D43281 llvm-svn: 332848
-
- May 16, 2018
-
-
Sanjay Patel authored
There shouldn't be any tests that run the entire optimizer here, but the last test in this file is definitely going to break with a change in LLVM IR canonicalization. Change that part to check the unoptimized IR because that's the real intent of this file. llvm-svn: 332473
-
- May 09, 2018
-
-
Yaxun Liu authored
Two typos: vaarg => vararg get_kernel_preferred_work_group_multiple => get_kernel_preferred_work_group_size_multiple Differential Revision: https://reviews.llvm.org/D46601 llvm-svn: 331895
-
Anastasia Stulova authored
Added string literal helper function to obtain the type attributed by a constant address space. Also fixed predefind __func__ expr to use the helper to constract the string literal correctly. Differential Revision: https://reviews.llvm.org/D46049 llvm-svn: 331877
-
- May 01, 2018
-
-
Erich Keane authored
Half-type mangling is accomplished following the method introduced by Erich Keane for mangling _Float16. Updated the half.cl LIT test to cover this particular case. Patch By: vbridgers Differential Revision: https://reviews.llvm.org/D46131 llvm-svn: 331263
-
- Apr 30, 2018
-
-
Matt Arsenault authored
Changes by Matt Arsenault Konstantin Zhuravlyov llvm-svn: 331216
-
- Apr 27, 2018
-
-
Sven van Haastregt authored
SPIR-V encodes the read_only and write_only access qualifiers of pipes, so separate LLVM IR types are required to target SPIR-V. Other backends may also find this useful. These new types are `opencl.pipe_ro_t` and `opencl.pipe_wo_t`, which replace `opencl.pipe_t`. This replaces __get_pipe_num_packets(...) and __get_pipe_max_packets(...) which took a read_only pipe with separate versions for read_only and write_only pipes, namely: * __get_pipe_num_packets_ro(...) * __get_pipe_num_packets_wo(...) * __get_pipe_max_packets_ro(...) * __get_pipe_max_packets_wo(...) These separate versions exist to avoid needing a bitcast to one of the two qualified pipe types. Patch by Stuart Brady. Differential Revision: https://reviews.llvm.org/D46015 llvm-svn: 331026
-
- Apr 20, 2018
-
-
Hans Wennborg authored
llvm-svn: 330441
-
Alexey Sotkin authored
Summary: Generate attribute 'denorms-are-zero'='true' if '-cl-denorms-are-zero' compile option was specified and 'denorms-are-zero'='false' otherwise. Patch by krisb Reviewers: Anastasia, yaxunl Reviewed By: yaxunl Subscribers: cfe-commits Differential Revision: https://reviews.llvm.org/D45808 llvm-svn: 330404
-
- Apr 06, 2018
-
-
Alexander Kornienko authored
Found via codespell -q 3 -I ../clang-whitelist.txt Where whitelist consists of: archtype cas classs checkk compres definit frome iff inteval ith lod methode nd optin ot pres statics te thru Patch by luzpaz! (This is a subset of D44188 that applies cleanly with a few files that have dubious fixes reverted.) Differential revision: https://reviews.llvm.org/D44188 llvm-svn: 329399
-
- Mar 27, 2018
-
-
Matt Arsenault authored
llvm-svn: 328657
-
- Mar 23, 2018
-
-
Yaxun Liu authored
Need to override convertConstraint to recognise amdgpu specific register names. Differential Revision: https://reviews.llvm.org/D44533 llvm-svn: 328359
-
Tony Tye authored
Add two additional implicit arguments for OpenCL for the AMDGPU target using the AMDHSA runtime to support device enqueue. Differential Revision: https://reviews.llvm.org/D44696 llvm-svn: 328350
-
Tony Tye authored
[AMDGPU] Remove use of OpenCL triple environment and replace with function attribute for AMDGPU (CLANG) - Remove use of the opencl and amdopencl environment member of the target triple for the AMDGPU target. - Use a function attribute to communicate to the AMDGPU backend. Differential Revision: https://reviews.llvm.org/D43735 llvm-svn: 328347
-
- Mar 15, 2018
-
-
Yaxun Liu authored
llvm-svn: 327634
-
- Mar 10, 2018
-
-
Richard Smith authored
llvm-svn: 327195
-
- Mar 07, 2018
-
-
Yaxun Liu authored
The indirect function argument is in alloca address space in LLVM IR. However, during Clang codegen for C++, the address space of indirect function argument should match its address space in the source code, i.e., default addr space, even for indirect argument. This is because destructor of the indirect argument may be called in the caller function, and address of the indirect argument may be taken, in either case the indirect function argument is expected to be in default addr space, not the alloca address space. Therefore, the indirect function argument should be mapped to the temp var casted to default address space. The caller will cast it to alloca addr space when passing it to the callee. In the callee, the argument is also casted to the default address space and used. CallArg is refactored to facilitate this fix. Differential Revision: https://reviews.llvm.org/D34367 llvm-svn: 326946
-
Yaxun Liu authored
OpenCL runtime tracks the invoke function emitted for any block expression. Due to restrictions on blocks in OpenCL (v2.0 s6.12.5), it is always possible to know the block invoke function when emitting call of block expression or __enqueue_kernel builtin functions. Since __enqueu_kernel already has an argument for the invoke function, it is redundant to have invoke function member in the llvm block literal structure. This patch removes invoke function from the llvm block literal structure. It also removes the bitcast of block invoke function to the generic block literal type which is useless for OpenCL. This will save some space for the kernel argument, and also eliminate some store instructions. Differential Revision: https://reviews.llvm.org/D43783 llvm-svn: 326937
-
- Feb 23, 2018
-
-
Rafael Espindola authored
The tests that failed on a windows host have been fixed. Original message: Start setting dso_local for COFF. With this there are still some GVs where we don't set dso_local because setGVProperties is never called. I intend to fix that in followup commits. This is just the bare minimum to teach shouldAssumeDSOLocal what it should do for COFF. llvm-svn: 325940
-
- Feb 22, 2018
-
-
Alexey Sotkin authored
Summary: OpenCL 2.0 specification defines '-cl-uniform-work-group-size' option, which requires that the global work-size be a multiple of the work-group size specified to clEnqueueNDRangeKernel and allows optimizations that are made possible by this restriction. The patch introduces the support of this option. To keep information about whether an OpenCL kernel has uniform work group size or not, clang generates 'uniform-work-group-size' function attribute for every kernel: - "uniform-work-group-size"="true" for OpenCL 1.2 and lower, - "uniform-work-group-size"="true" for OpenCL 2.0 and higher if '-cl-uniform-work-group-size' option was specified, - "uniform-work-group-size"="false" for OpenCL 2.0 and higher if no '-cl-uniform-work-group-size' options was specified. If the function is not an OpenCL kernel, 'uniform-work-group-size' attribute isn't generated. Patch by: krisb Reviewers: yaxunl, Anastasia, b-sumner Reviewed By: yaxunl, Anastasia Subscribers: nhaehnle, yaxunl, Anastasia, cfe-commits Differential Revision: https://reviews.llvm.org/D43570 llvm-svn: 325771
-
- Feb 15, 2018
-
-
Yaxun Liu authored
Differential Revision: https://reviews.llvm.org/D43340 llvm-svn: 325279
-
Yaxun Liu authored
The following test case causes issue with codegen of __enqueue_block void (^block)(void) = ^{ callee(id, out); }; enqueue_kernel(queue, 0, ndrange, block); Clang first does codegen for block expression in the first line and deletes its block info. Clang then tries to do codegen for the same block expression again for the second line, and fails because the block info is gone. The fix is to do normal codegen for both lines. Introduce an API to OpenCL runtime to record llvm block invoke function and llvm block literal emitted for each AST block expression, and use the recorded information for generating the wrapper kernel. The EmitBlockLiteral APIs are cleaned up to minimize changes to the normal codegen of blocks. Another minor issue is that some clean up AST expression is generated for block with captures, which can be stripped by IgnoreImplicit. Differential Revision: https://reviews.llvm.org/D43240 llvm-svn: 325264
-
- Feb 13, 2018
-
-
Yaxun Liu authored
Differential Revision: https://reviews.llvm.org/D43171 llvm-svn: 325031
-
- Feb 09, 2018
-
-
Matt Arsenault authored
llvm-svn: 324748
-
- Feb 08, 2018
-
-
Matt Arsenault authored
llvm-svn: 324641
-
- Feb 04, 2018
-
-
Daniil Fukalov authored
Fixed asserts in tests. llvm-svn: 324201
-