Skip to content
Commit 6217e18a authored by Tobias Grosser's avatar Tobias Grosser
Browse files

Add preliminary implementation for GPGPU code generation.

Translate the selected parallel loop body into a ptx string and run it with the
cuda driver API. We limit this preliminary implementation to target the
following special test cases:

  - Support only 2-dimensional parallel loops with or without only one innermost
    non-parallel loop.
  - Support write memory access to only one array in a SCoP.

The patch was committed with smaller changes to the build system:

There is now a flag to enable gpu code generation explictly. This was required
as we need the llvm.codegen() patch applied on the llvm sources, to compile this
feature correctly. Also, enabling gpu code generation does not require cuda.
This requirement was removed to allow 'make polly-test' runs, even without an
installed cuda runtime.

Contributed by:  Yabin Hu  <yabin.hwu@gmail.com>

llvm-svn: 161239
parent 7555b540
Loading
Loading
Loading
Loading
0% Loading or .
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment