- May 23, 2018
-
-
Philip Pfaffe authored
statement naming - A recent ppcg/isl update caused the grid/block size upper bounds to deviate by one from the oracle. This is not an effect that's visible at runtime. - Statement naming changed in polly. Update the testcases. llvm-svn: 333090
-
- May 10, 2018
-
-
Tobias Grosser authored
Rename variable to retainedNodes. This unbreaks the Polly builds. llvm-svn: 331960
-
- Dec 01, 2017
-
-
Philip Pfaffe authored
Using numeric registers is flaky, since as soon as one additional instruction is generated by us, all the tests need to be adapted. llvm-svn: 319544
-
- Oct 29, 2017
-
-
Philip Pfaffe authored
Add missing %loadPolly directive to support out of tree builds. One of the changes is somewhat bigger, because the directive turns on LLVM names, and the testcase deosn't use those. llvm-svn: 316870
-
- Oct 04, 2017
-
-
Tobias Grosser authored
We make sure that the final reload of an invariant scalar memory access uses the same stack slot into which the invariant memory access was stored originally. Earlier, this was broken as we introduce a new stack slot aside of the preload stack slot, which remained uninitialized and caused our escaping loads to contain garbage. This happened due to us clearing the pre-populated values in EscapeMap after kernel code generation. We address this issue by preserving the original host values and restoring them after kernel code generation. EscapeMap is not expected to be used during kernel code generation, hence we clear it during kernel generation to make sure that any unintended uses are noticed. llvm-svn: 314894
-
- Oct 01, 2017
-
-
Tobias Grosser authored
llvm-svn: 314625
-
Tobias Grosser authored
This matches the behavior we already have in lib/Codegen/CodeGeneration.cpp and makes sure that we fall back to the original code. It seems when invariant load hoisting was introduced to the GPGPU backend we missed to reset the RTC flag, such that kernels where invariant load hoisting failed executed the 'optimized' SCoP, which however is set to a simple 'unreachable'. Unsurprisingly, this results in hard to debug issues that are a lot of fun to debug. llvm-svn: 314624
-
- Sep 01, 2017
-
-
Siddharth Bhat authored
In Polly, we specifically add a paramter to represent the outermost dimension size of fortran arrays. We do this because this information is statically available from the fortran metadata generated by dragonegg. However, we were only materializing these parameters (meaning, creating an llvm::Value to back the isl_id) from *memory accesses*. This is wrong, we should materialize parameters from *scop array info*. It is wrong because if there is a case where we detect 2 fortran arrays, but only one of them is accessed, we may not materialize the other array's dimensions at all. This is incorrect. We fix this by looping over all `polly::ScopArrayInfo` in a scop, rather that just all `polly::MemoryAccess`. Differential Revision: https://reviews.llvm.org/D37379 llvm-svn: 312350
-
- Aug 31, 2017
-
-
Siddharth Bhat authored
This is useful when we face certain intrinsics such as `llvm.exp.*` which cannot be lowered by the NVPTX backend while other intrinsics can. So, we would need to keep blacklists of intrinsics that cannot be handled by the NVPTX backend. It is much simpler to try and promote all intrinsics to libdevice versions. This patch makes function/intrinsic very uniform, and will always try to use a libdevice version if it exists. Differential Revision: https://reviews.llvm.org/D37056 llvm-svn: 312239
-
- Aug 22, 2017
-
-
Tobias Grosser authored
llvm-svn: 311423
-
- Aug 21, 2017
-
-
Tobias Grosser authored
These intrinsics are used in COSMO. llvm-svn: 311324
-
Tobias Grosser authored
These two functions are used in COSMO llvm-svn: 311322
-
- Aug 20, 2017
-
-
Tobias Grosser authored
We still see some issues with parameter space mismatches. Revert this to get a clean baseline. We will recommit after these issues have been resolved. This reverts commit 0e360a14194f722ded7aa2bc9d4be2ed2efeeb49. llvm-svn: 311268
-
- Aug 19, 2017
-
-
Tobias Grosser authored
Summary: This information is necessary for PPCG to perform correct life range reordering. With these changes applied we can live-range reorder some of the important kernels in COSMO. We also update and rename one test case, which previously could not be optimized and now is optimized thanks to live-range reordering. To preserve test coverage we add a new test case scalar-writes-in-scop-requires-abort.ll, which exercises our automatic abort in case of scalar writes in the kernel. Reviewers: Meinersbur, bollu, singam-sanjay Subscribers: nemanjai, pollydev, llvm-commits, kbarton Tags: #polly Differential Revision: https://reviews.llvm.org/D36929 llvm-svn: 311259
-
Philipp Schaad authored
Kernel argument sizes now only get appended to the kernel launch parameter list if the OpenCL runtime is selected, not if CUDA runtime is chosen. Differential revision: D36925 llvm-svn: 311248
-
Tobias Grosser authored
When using -polly-ignore-integer-wrapping and -polly-acc-codegen-managed-memory we add parameter dimensions lazily to the domains, which results in PPCG not including parameter dimensions that are only used in memory accesses in the kernel space. To make sure these parameters are still passed to the kernel, we collect these parameter dimensions and align the kernel's parameter space before code-generating it. llvm-svn: 311239
-
- Aug 18, 2017
-
-
Tobias Grosser authored
Summary: Drop unused parameter dimensions to reduce the size of the sets we are working with. Especially the computed dependences tend to accumulate a lot of parameters that are present in the input memory accesses, but often not necessary to express the actual dependences. As isl represents maps and sets with dense matrices, reducing the dimensionality of isl sets commonly reduces code generation performance. This reduces compile time from 17 to 11 seconds for our test case. While this is not impressive, this patch helped me to identify the previous two performance improvements and additionally also increases readability of the isl data structures we use. Reviewers: Meinersbur, bollu, singam-sanjay Reviewed By: bollu Subscribers: nemanjai, pollydev, llvm-commits, kbarton Tags: #polly Differential Revision: https://reviews.llvm.org/D36869 llvm-svn: 311161
-
- Aug 17, 2017
-
-
Siddharth Bhat authored
Reuse the machinery built for replacing global arrays to replace malloc/free as well. Example replacement that was missed earlier: ``` call void \ bitcast (void (i8*)* @free to void (%custom_type*)*) (%custom_type* %13) ``` - Since the `bitcast` is a `ConstantExpr`, `replaceAllUsesWith` would miss this. We don't miss this anymore. Differential Revision: https://reviews.llvm.org/D36825 llvm-svn: 311121
-
Tobias Grosser authored
In release builds LLVM may not pass along LLVM names consistently. We make the test cases independent of the LLVM-IR names to avoid spurious test case failures. llvm-svn: 311118
-
Siddharth Bhat authored
- If we have global arrays, we would like to rewrite them to global pointers which are allocated using `cudaMallocManaged`. - If we have allocas in a function, we would like to rewrite them to heap-allocations with `cudaMallocManaged` and `cudaFree`. - With these rewrite mechanisms, we can offload _any_ function to the GPU with no code rewrite whatsover. Differential Revision: https://reviews.llvm.org/D36516 llvm-svn: 311080
-
Tobias Grosser authored
llvm-svn: 311046
-
- Aug 16, 2017
-
-
Tobias Grosser authored
Before this change kernels that used invariant loads would have resulted in invalid PTX code. llvm-svn: 311042
-
- Aug 10, 2017
-
-
Philip Pfaffe authored
llvm-svn: 310595
-
Tobias Grosser authored
llvm-svn: 310555
-
Tobias Grosser authored
This is necessary for partial writes (as used by delicm) to work. llvm-svn: 310553
-
- Aug 09, 2017
-
-
Siddharth Bhat authored
We do not need to keep `malloc` and `free` around since they are replaced by `polly_{malloc,free}Managed.` llvm-svn: 310504
-
Siddharth Bhat authored
llvm-svn: 310473
-
Siddharth Bhat authored
This pass is useful to automatically convert a codebase that uses malloc/free to use their managed memory counterparts. Currently, rewrite malloc and free to the `polly_{malloc,free}Managed` variants. A future patch will teach ManagedMemoryRewrite to rewrite global arrays as pointers to globally allocated managed memory. Differential Revision: https://reviews.llvm.org/D36513 llvm-svn: 310471
-
Siddharth Bhat authored
Previously, we used to compute this with `elementSizeInBits / 8`. This would yield an element size of 0 when the array had element size < 8 in bits. To fix this, ask data layout what the size in bytes should be. Differential Revision: https://reviews.llvm.org/D36459 llvm-svn: 310448
-
- Aug 08, 2017
-
-
Siddharth Bhat authored
llvm-svn: 310354
-
Siddharth Bhat authored
To do this, we replicate what `CodeGeneration` does. We expose `markNodeUnreachable` from `CodeGeneration` to `PPCGCodeGeneration`. Differential Revision: https://reviews.llvm.org/D36457 llvm-svn: 310350
-
- Aug 06, 2017
-
-
Tobias Grosser authored
llvm-svn: 310197
-
Tobias Grosser authored
Summary: This resolves some "instruction does not dominate use" errors, as we used to prepare the arrays at the location of the first kernel, which not necessarily dominated all other kernel calls. Reviewers: Meinersbur, bollu, singam-sanjay Subscribers: nemanjai, pollydev, llvm-commits, kbarton Differential Revision: https://reviews.llvm.org/D36372 llvm-svn: 310196
-
Tobias Grosser authored
llvm-svn: 310194
-
Siddharth Bhat authored
A Scop with a loop outside it is not handled currently by PPCGCodeGeneration. The test case is such that the Scop has only one inner loop that is detected. This currently breaks codegen. The fix is to reuse the existing mechanism in `IslNodeBuilder` within `GPUNodeBuilder. Differential Revision: https://reviews.llvm.org/D36290 llvm-svn: 310193
-
- Aug 03, 2017
-
-
Tobias Grosser authored
llvm-svn: 309943
-
Tobias Grosser authored
Summary: In case the option -polly-ignore-parameter-bounds is set, not all parameters will be added to context and domains. This is useful to keep the size of the sets and maps we work with small. Unfortunately, for AST generation it is necessary to ensure all parameters are part of the schedule tree. Hence, we modify the GPGPU code generation to make sure this is the case. To obtain the necessary information we expose a new function Scop::getFullParamSpace(). We also make a couple of functions const to be able to make SCoP::getFullParamSpace() const. Reviewers: Meinersbur, bollu, gareevroman, efriedma, huihuiz, sebpop, simbuerg Subscribers: nemanjai, kbarton, pollydev, llvm-commits Tags: #polly Differential Revision: https://reviews.llvm.org/D36243 llvm-svn: 309939
-
Michael Kruse authored
llvm-svn: 309938
-
Siddharth Bhat authored
[PPCGCodeGeneration] Construct `isl_multi_pw_aff` of PPCGArray.bounds even when polly-ignore-parameter-bounds is turned on. When we have `-polly-ignore-parameter-bounds`, `Scop::Context` does not contain all the paramters present in the program. The construction of the `isl_multi_pw_aff` requires all the indivisual `pw_aff` to have the same parameter dimensions. To achieve this, we used to realign every `pw_aff` with `Scop::Context`. However, in conjunction with `-polly-ignore-parameter-bounds`, this is now incorrect, since `Scop::Context` does not contain all parameters. We set this up correctly by creating a space that has all the parameters used by all the `isl_pw_aff`. Then, we realign all `isl_pw_aff` to this space. llvm-svn: 309934
-
- Aug 02, 2017
-
-
Singapuram Sanjay Srivallabh authored
Summary: **Remove debug metadata from instruction to be copied to prevent the source file's debug metadata being copied into GPUModule and eventually failing Module verification and ASM string codegeneration.** When copying the instruction onto the Module meant for the GPU, debug metadata attached to an instruction causes all related metadata to be pulled into the Module, including the DICompileUnit, which is not listed in llvm.dbg.cu of the Module. This fails the verification of the Module and generation of the ASM string. The only debug metadata of the instruction, the DebugLoc, is unset by this patch. This patch reattempts https://reviews.llvm.org/D35630 by targeting only those instructions that are to end up in a Module meant for the GPU. Reviewers: grosser, bollu Reviewed By: grosser Subscribers: pollydev Tags: #polly Differential Revision: https://reviews.llvm.org/D36161 llvm-svn: 309822
-