Commits · 161e2b422316cd57c0285aafb027aa38bd0e4e45 · Roger Ferrer / llvm-epi

Apr 18, 2017
- AMDGPU: Make MFI fields private · 161e2b42
  Matt Arsenault authored Apr 18, 2017
```
llvm-svn: 300596
```
  161e2b42
Apr 12, 2017

AMDGPU: Refactor argument lowering · e622dc38

Matt Arsenault authored Apr 11, 2017

Split into smaller functions and prepare for handling
non-entry functions.

llvm-svn: 299998

e622dc38

Apr 10, 2017

AMDGPU: Fix crash when disassembling VOP3 mac · 678e111e

Matt Arsenault authored Apr 10, 2017

The unused dummy src2_modifiers is missing, so it crashes
when trying to print it.

I tried to fully remove src2_modifiers, but there are some
irritations in the places where it is converted to mad since
it starts to require modifying use lists while iterating over
them.

llvm-svn: 299861

678e111e

Feb 21, 2017

AMDGPU: Don't use stack space for SGPR->VGPR spills · e0bf7d02

Matt Arsenault authored Feb 21, 2017

Before frame offsets are calculated, try to eliminate the
frame indexes used by SGPR spills. Then we can delete them
after.

I think for now we can be sure that no other instruction
will be re-using the same frame indexes. It should be easy
to notice if this assumption ever breaks since everything
asserts if it tries to use a dead frame index later.

The unused emergency stack slot seems to still be left behind,
so an additional 4 bytes is still wasted.

llvm-svn: 295753

e0bf7d02

Jan 25, 2017

AMDGPU: Add support for spilling to a user-SGPR-pointed buffer · 2f3f9855

Tom Stellard authored Jan 25, 2017

Summary:
This lets you select which sort of spilling you want, either s[0:1] or 64-bit loads from s[0:1].

Patch By: Dave Airlie

Reviewers: nhaehnle, arsenm, tstellarAMD

Reviewed By: arsenm

Subscribers: mareko, llvm-commits, kzhuravl, wdng, yaxunl, tony-tye

Differential Revision: https://reviews.llvm.org/D25428

llvm-svn: 293000

2f3f9855

Jan 21, 2017
- [AMDGPU] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC). · 6620376d
  Eugene Zelenko authored Jan 21, 2017
```
llvm-svn: 292688
```
  6620376d
Dec 20, 2016

AMDGPU/SI: Make a function const · bb138886
Tom Stellard authored Dec 20, 2016
```
llvm-svn: 290185
```
bb138886

AMDGPU/SI: Add a MachineMemOperand when lowering llvm.amdgcn.buffer.load.* · 6f9ef14b

Tom Stellard authored Dec 20, 2016

Reviewers: arsenm, nhaehnle, mareko

Subscribers: kzhuravl, wdng, yaxunl, llvm-commits, tony-tye

Differential Revision: https://reviews.llvm.org/D27834

llvm-svn: 290184

6f9ef14b

AMDGPU/SI: Add a MachineMemOperand to MIMG instructions · 244891d1

Tom Stellard authored Dec 20, 2016

Summary:
Without a MachineMemOperand, the scheduler was assuming MIMG instructions
were ordered memory references, so no loads or stores could be reordered
across them.

Reviewers: arsenm

Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, tony-tye

Differential Revision: https://reviews.llvm.org/D27536

llvm-svn: 290179

244891d1

Sep 06, 2016

[AMDGPU] Wave and register controls · 1d65026c

Konstantin Zhuravlyov authored Sep 06, 2016

- Implemented amdgpu-flat-work-group-size attribute
- Implemented amdgpu-num-active-waves-per-eu attribute
- Implemented amdgpu-num-sgpr attribute
- Implemented amdgpu-num-vgpr attribute
- Dynamic LDS constraints are in a separate patch

Patch by Tom Stellard and Konstantin Zhuravlyov

Differential Revision: https://reviews.llvm.org/D21562

llvm-svn: 280747

1d65026c

Aug 29, 2016
- AMDGPU: fix mismatch tags, NFC · 43e5fe3f
  Saleem Abdulrasool authored Aug 29, 2016
```
llvm-svn: 280006
```
  43e5fe3f
Aug 11, 2016
- AMDGPU: Remove unused tracking of flat instructions · 69fd2c11
  Matt Arsenault authored Aug 11, 2016
```
llvm-svn: 278361
```
  69fd2c11
Jul 26, 2016

AMDGPU: Make AMDGPUMachineFunction fields private · 52ef4019

Matt Arsenault authored Jul 26, 2016

ABIArgOffset is a problem because properly setting the
KernArgSize requires that the reserved area before the
real kernel arguments be correctly aligned, which requires
fixing clover.

llvm-svn: 276766

52ef4019

Jul 22, 2016
- AMDGPU: Add HSA dispatch id intrinsic · 8d718dcf
  Matt Arsenault authored Jul 22, 2016
```
llvm-svn: 276437
```
  8d718dcf
Jul 13, 2016

AMDGPU/SI: Emit the number of SGPR and VGPR spills · 0532c190

Marek Olsak authored Jul 13, 2016

Summary:
v2: don't count SGPRs spilled to scratch twice

I think this is sufficient. It doesn't count private memory usage, which
happens often and uses scratch but isn't technically a spill. The private
memory usage can be computed by:
  [scratch_per_thread - vgpr_spills - a random multiple of SGPR spills].

The fact that SGPR spills add very high numbers to the scratch size makes that
computation a guessing game, but I don't have a solution to that.

Reviewers: tstellarAMD

Subscribers: arsenm, kzhuravl

Differential Revision: http://reviews.llvm.org/D22197

llvm-svn: 275288

0532c190

Jun 27, 2016
- SIMachineFunctionInfo.cpp: Appease msc18 to use std::array. · 5cbd41e0
  NAKAMURA Takumi authored Jun 27, 2016
```
llvm-svn: 273860
```
  5cbd41e0
- Reformat blank lines. · d377ad80
  NAKAMURA Takumi authored Jun 27, 2016
```
llvm-svn: 273858
```
  d377ad80
Jun 25, 2016

[AMDGPU] Emit debugger prologue and emit the rest of the debugger fields in the kernel code header · f2f3d147

Konstantin Zhuravlyov authored Jun 25, 2016

Debugger prologue is emitted if -mattr=+amdgpu-debugger-emit-prologue.

Debugger prologue writes work group IDs and work item IDs to scratch memory at fixed location in the following format:
  - offset 0: work group ID x
  - offset 4: work group ID y
  - offset 8: work group ID z
  - offset 16: work item ID x
  - offset 20: work item ID y
  - offset 24: work item ID z
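
A minimal sketch of the layout listed above, written as a C++ struct so the offsets can be checked at compile time. The struct and field names are hypothetical and do not come from the patch; only the byte offsets are taken from the commit message. The 4 bytes at offset 12 are not described in the message, so treating them as unused padding is an assumption.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical mnemonic for the debugger prologue's scratch layout.
// Only the byte offsets are taken from the commit message.
struct DebuggerScratchLayout {
  std::uint32_t WorkGroupIDX; // offset 0
  std::uint32_t WorkGroupIDY; // offset 4
  std::uint32_t WorkGroupIDZ; // offset 8
  std::uint32_t Unused;       // offset 12: not listed; assumed to be padding
  std::uint32_t WorkItemIDX;  // offset 16
  std::uint32_t WorkItemIDY;  // offset 20
  std::uint32_t WorkItemIDZ;  // offset 24
};

// Compile-time checks that the struct matches the documented offsets.
static_assert(offsetof(DebuggerScratchLayout, WorkGroupIDX) == 0, "group x");
static_assert(offsetof(DebuggerScratchLayout, WorkGroupIDY) == 4, "group y");
static_assert(offsetof(DebuggerScratchLayout, WorkGroupIDZ) == 8, "group z");
static_assert(offsetof(DebuggerScratchLayout, WorkItemIDX) == 16, "item x");
static_assert(offsetof(DebuggerScratchLayout, WorkItemIDY) == 20, "item y");
static_assert(offsetof(DebuggerScratchLayout, WorkItemIDZ) == 24, "item z");
```
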

Set
  - amd_kernel_code_t::debug_wavefront_private_segment_offset_sgpr to scratch wave offset reg
  - amd_kernel_code_t::debug_private_segment_buffer_sgpr to scratch rsrc reg
  - amd_kernel_code_t::is_debug_supported to true if all debugger features are enabled

Differential Revision: http://reviews.llvm.org/D20335

llvm-svn: 273769

f2f3d147

May 24, 2016
- [AMDGPU][NFC] Rename ReserveTrapVGPRs -> ReserveRegs · 29ddd2b2
  Konstantin Zhuravlyov authored May 24, 2016
```
Differential Revision: http://reviews.llvm.org/D20081

llvm-svn: 270594
```
  29ddd2b2
Apr 26, 2016

[AMDGPU] Move reserved vgpr count for trap handler usage to... · 71515e57

Konstantin Zhuravlyov authored Apr 26, 2016

[AMDGPU] Move reserved vgpr count for trap handler usage to SIMachineFunctionInfo + minor commenting changes

Differential Revision: http://reviews.llvm.org/D19537

llvm-svn: 267573

71515e57

Apr 25, 2016
- AMDGPU: Implement addrspacecast · 99c14524
  Matt Arsenault authored Apr 25, 2016
```
llvm-svn: 267452
```
  99c14524
Apr 14, 2016

AMDGPU: allow specifying a workgroup size that needs to fit in a compute unit · 79a1fd71

Tom Stellard authored Apr 14, 2016

Summary:
For GL_ARB_compute_shader we need to support workgroup sizes of at least 1024. However, if we want to allow large workgroup sizes, we may need to use fewer registers, as we have to run more waves per SIMD.

This patch adds an attribute to specify the maximum work group size the compiled program needs to support. It defaults to 256, as that has no wave restrictions.

Reducing the number of registers available is done similarly to how the registers were reserved for chips with the sgpr init bug.
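
The arithmetic behind the register restriction can be sketched roughly as follows. This is not the code from the patch. It assumes a 64-lane wavefront, 4 SIMDs per compute unit, and 256 VGPRs per SIMD lane shared among resident waves; all constant and function names are hypothetical, and register allocation granularity is ignored.

```cpp
// Assumed hardware parameters (not stated in the commit message).
constexpr unsigned WavefrontSize = 64;
constexpr unsigned SIMDsPerCU = 4;
constexpr unsigned VGPRsPerSIMDLane = 256;

constexpr unsigned divideCeil(unsigned A, unsigned B) {
  return (A + B - 1) / B;
}

// Waves each SIMD must hold so that a whole work group fits in one
// compute unit. Precondition: MaxWorkGroupSize > 0.
constexpr unsigned wavesPerSIMD(unsigned MaxWorkGroupSize) {
  return divideCeil(divideCeil(MaxWorkGroupSize, WavefrontSize), SIMDsPerCU);
}

// Upper bound on VGPRs per wave under the assumptions above.
constexpr unsigned vgprBudget(unsigned MaxWorkGroupSize) {
  return VGPRsPerSIMDLane / wavesPerSIMD(MaxWorkGroupSize);
}
```

Under these assumptions a work group of 256 needs one wave per SIMD and leaves the full 256 VGPRs available, which matches the "no wave restrictions" remark, while a work group of 1024 needs four waves per SIMD and leaves 64 VGPRs per wave.
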

Reviewers: mareko, arsenm, tstellarAMD, nhaehnle

Subscribers: FireBurn, kerberizer, llvm-commits, arsenm

Differential Revision: http://reviews.llvm.org/D18340

Patch By: Bas Nieuwenhuizen

llvm-svn: 266337

79a1fd71

AMDGPU/SI: Use the correct scratch wave offset register for shaders. · f110f8f9

Tom Stellard authored Apr 14, 2016



Summary:
The code previously always used s1 as it was using the user + system SGPR
information for compute kernels. This is incorrect for Mesa shaders, though.

The register should be the next SGPR after all user and system SGPRs.
We rely on Mesa adding arguments for all input and system SGPRs and
take the next available SGPR for the scratch wave offset register.

Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>

Reviewers: mareko, arsenm, nhaehnle, tstellarAMD

Subscribers: qcolombet, arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D18941

Patch By: Bas Nieuwenhuizen

llvm-svn: 266336

f110f8f9

Mar 11, 2016

AMDGPU: R600 code splitting cleanup · 6b6a2c37

Matt Arsenault authored Mar 11, 2016

Move a few functions only used by R600 to R600 specific code,
fix header macros to stop using R600, mark classes as final.

llvm-svn: 263204

6b6a2c37

Mar 04, 2016

AMDGPU/SI: Add support for spilling SGPRs to scratch buffer · 649b5db5

Tom Stellard authored Mar 04, 2016

Summary:
This is necessary for when we run out of VGPRs and can no
longer use v_{read,write}_lane for spilling SGPRs.
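
For context on the lane-based path this falls back from: v_writelane stores one 32-bit scalar value into a single lane of a VGPR, so one VGPR can hold many spilled SGPRs. A minimal sketch of that mapping, assuming 64 lanes per VGPR and a simple sequential assignment; the type and function names are hypothetical and not taken from the patch.

```cpp
// Assumed wavefront width: one VGPR holds one 32-bit value per lane.
constexpr unsigned LanesPerVGPR = 64;

// Hypothetical description of where a spilled SGPR lives.
struct SpillLane {
  unsigned VGPRIndex; // which spill VGPR holds the value
  unsigned Lane;      // which lane of that VGPR
};

// Map the N-th spilled 32-bit SGPR to a (VGPR, lane) pair, filling
// each VGPR's lanes in order before moving on to the next VGPR.
constexpr SpillLane laneForSpilledSGPR(unsigned N) {
  return SpillLane{N / LanesPerVGPR, N % LanesPerVGPR};
}
```

Once no VGPR is free to serve as the spill target, this mapping has nowhere to put the value, which is the case the scratch-buffer path covers.
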

Reviewers: arsenm

Subscribers: arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D17592

llvm-svn: 262732

649b5db5

Feb 12, 2016

AMDGPU: Set flat_scratch from flat_scratch_init reg · 296b8491

Matt Arsenault authored Feb 12, 2016

This was hardcoded to the static private size, but this
would be missing the offset and additional size for someday
when we have dynamic sizing.

Also stops always initializing flat_scratch even when unused.

In the future we should stop emitting this unless flat instructions
are used to access private memory. For example this will initialize
it almost always on VI because flat is used for global access.

llvm-svn: 260658

296b8491

Jan 13, 2016

AMDGPU/SI: Add s_waitcnt at the end of non-void functions · 8e9cc63b

Marek Olsak authored Jan 13, 2016

Summary:
v2: Make ReturnsVoid private, so that I can add another 8 lines of code and
    look more productive.

Reviewers: tstellarAMD, arsenm

Subscribers: arsenm

Differential Revision: http://reviews.llvm.org/D16034

llvm-svn: 257622

8e9cc63b

AMDGPU/SI: Add new target attribute InitialPSInputAddr · fccabaf5

Marek Olsak authored Jan 13, 2016

Summary:
This allows Mesa to pass initial SPI_PS_INPUT_ADDR to LLVM.
The register assigns VGPR locations to PS inputs, while the ENA register
determines whether or not they are loaded.

Mesa needs to set some inputs as not-movable, so that a pixel shader prolog
binary appended at the beginning can assume where some inputs are.

v2: Make PSInputAddr private, because there are never enough silly getters
    and setters for people to read.

Reviewers: tstellarAMD, arsenm

Subscribers: arsenm

Differential Revision: http://reviews.llvm.org/D16030

llvm-svn: 257591

fccabaf5

Nov 30, 2015

AMDGPU: Rework how private buffer is passed for HSA · 26f8f3db

Matt Arsenault authored Nov 30, 2015

If we know we have stack objects, we reserve the registers
that the private buffer resource and wave offset are passed in
and use them directly.

If not, reserve the last 5 SGPRs just in case we need to spill.
After register allocation, try to pick the next available registers
instead of the last SGPRs, and then insert copies from the inputs
to the reserved registers in the prologue.

This also only selectively enables all of the input registers
which are really required instead of always enabling them.

llvm-svn: 254331

26f8f3db

Nov 25, 2015
- AMDGPU: Check feature attributes in SIMachineFunctionInfo · 49affb84
  Matt Arsenault authored Nov 25, 2015
```
llvm-svn: 254091
```
  49affb84
Nov 05, 2015
- AMDGPU: Also track whether SGPRs were spilled · 5b22dfa6
  Matt Arsenault authored Nov 05, 2015
```
llvm-svn: 252145
```
  5b22dfa6
Jun 13, 2015
- R600 -> AMDGPU rename · 45bb48ea
  Tom Stellard authored Jun 13, 2015
```
llvm-svn: 239657
```
  45bb48ea
Jan 20, 2015
- R600/SI: Add subtarget feature to enable VGPR spilling for all shader types · e99fb65d
  Tom Stellard authored Jan 20, 2015
```
This is disabled by default, but can be enabled with the subtarget
feature: 'vgpr-spilling'

llvm-svn: 226597
```
  e99fb65d
Jan 14, 2015
- R600/SI: Spill VGPRs to scratch space for compute shaders · 42fb60e1
  Tom Stellard authored Jan 14, 2015
```
llvm-svn: 225988
```
  42fb60e1
Sep 24, 2014

R600/SI: Implement VGPR register spilling for compute at -O0 v3 · 96468903

Tom Stellard authored Sep 24, 2014

VGPRs are spilled to LDS.  This still needs more testing, but
we need to at least enable it at -O0, because the fast register
allocator spills all registers that are live at the end of blocks
and without this some future commits will break the
flat-address-space.ll test.

v2: Only calculate thread id once

v3: Move insertion of spill instructions to
    SIRegisterInfo::eliminateFrameIndex()

llvm-svn: 218348

96468903

Aug 21, 2014

R600/SI: Remove unused SGPR spilling code · 8e52375b
Tom Stellard authored Aug 21, 2014
```
llvm-svn: 216218
```
8e52375b

R600/SI: Use eliminateFrameIndex() to expand SGPR spill pseudos · c5cf2f04

Tom Stellard authored Aug 21, 2014

This will simplify the SGPR spilling and also allow us to use
MachineFrameInfo for calculating offsets, which should be more
reliable than our custom code.

This fixes a crash in some cases where a register would be spilled
in a branch such that the VGPR defined for spilling did not dominate
all the uses when restoring.

This fixes a crash in an ocl conformance test.  The test requires
register spilling and is too big to include.

llvm-svn: 216217

c5cf2f04

Aug 13, 2014

Canonicalize header guards into a common format. · a7c40ef0

Benjamin Kramer authored Aug 13, 2014

Add header guards to files that were missing guards. Remove #endif comments
as they don't seem common in LLVM (we can easily add them back if we decide
they're useful)

Changes made by clang-tidy with minor tweaks.

llvm-svn: 215558

a7c40ef0

Jul 21, 2014
- R600/SI: Use scratch memory for large private arrays · b02094e1
  Tom Stellard authored Jul 21, 2014
```
llvm-svn: 213551
```
  b02094e1
May 02, 2014

R600/SI: Only create one instruction when spilling/restoring register v3 · eba61071

Tom Stellard authored May 02, 2014

The register spiller assumes that only one new instruction is created
when spilling and restoring registers, so we need to emit pseudo
instructions for vector register spills and lower them after
register allocation.

v2:
  - Fix calculation of lane index
  - Extend VGPR liveness to end of program.

v3:
  - Use SIMM16 field of S_NOP to specify multiple NOPs.

https://bugs.freedesktop.org/show_bug.cgi?id=75005

llvm-svn: 207843

eba61071