- Apr 18, 2017
-
-
Matt Arsenault authored
llvm-svn: 300596
-
- Apr 12, 2017
-
-
Matt Arsenault authored
Split into smaller functions and prepare for handling non-entry functions. llvm-svn: 299998
-
- Apr 10, 2017
-
-
Matt Arsenault authored
The unused dummy src2_modifiers is missing, so it crashes when trying to print it. I tried to fully remove src2_modifiers, but there are some irritations in the places where it is converted to mad since it starts to require modifying use lists while iterating over them. llvm-svn: 299861
-
- Feb 21, 2017
-
-
Matt Arsenault authored
Before frame offsets are calculated, try to eliminate the frame indexes used by SGPR spills. Then we can delete them after. I think for now we can be sure that no other instruction will be re-using the same frame indexes. It should be easy to notice if this assumption ever breaks since everything asserts if it tries to use a dead frame index later. The unused emergency stack slot seems to still be left behind, so an additional 4 bytes is still wasted. llvm-svn: 295753
-
- Jan 25, 2017
-
-
Tom Stellard authored
Summary: This lets you select which sort of spilling you want, either s[0:1] or 64-bit loads from s[0:1]. Patch By: Dave Airlie Reviewers: nhaehnle, arsenm, tstellarAMD Reviewed By: arsenm Subscribers: mareko, llvm-commits, kzhuravl, wdng, yaxunl, tony-tye Differential Revision: https://reviews.llvm.org/D25428 llvm-svn: 293000
-
- Jan 21, 2017
-
-
Eugene Zelenko authored
llvm-svn: 292688
-
- Dec 20, 2016
-
-
Tom Stellard authored
llvm-svn: 290185
-
Tom Stellard authored
Reviewers: arsenm, nhaehnle, mareko Subscribers: kzhuravl, wdng, yaxunl, llvm-commits, tony-tye Differential Revision: https://reviews.llvm.org/D27834 llvm-svn: 290184
-
Tom Stellard authored
Summary: Without a MachineMemOperand, the scheduler was assuming MIMG instructions were ordered memory references, so no loads or stores could be reordered across them. Reviewers: arsenm Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, tony-tye Differential Revision: https://reviews.llvm.org/D27536 llvm-svn: 290179
-
- Sep 06, 2016
-
-
Konstantin Zhuravlyov authored
- Implemented amdgpu-flat-work-group-size attribute - Implemented amdgpu-num-active-waves-per-eu attribute - Implemented amdgpu-num-sgpr attribute - Implemented amdgpu-num-vgpr attribute - Dynamic LDS constraints are in a separate patch Patch by Tom Stellard and Konstantin Zhuravlyov Differential Revision: https://reviews.llvm.org/D21562 llvm-svn: 280747
-
- Aug 29, 2016
-
-
Saleem Abdulrasool authored
llvm-svn: 280006
-
- Aug 11, 2016
-
-
Matt Arsenault authored
llvm-svn: 278361
-
- Jul 26, 2016
-
-
Matt Arsenault authored
ABIArgOffset is a problem because properly fsetting the KernArgSize requires that the reserved area before the real kernel arguments be correctly aligned, which requires fixing clover. llvm-svn: 276766
-
- Jul 22, 2016
-
-
Matt Arsenault authored
llvm-svn: 276437
-
- Jul 13, 2016
-
-
Marek Olsak authored
Summary: v2: don't count SGPRs spilled to scratch twice I think this is sufficient. It doesn't count private memory usage, which happens often and uses scratch but isn't technically a spill. The private memory usage can be computed by: [scratch_per_thread - vgpr_spills - a random multiple of SGPR spills]. The fact SGPR spills add very high numbers to the scratch size make that computation a guessing game, but I don't have a solution to that. Reviewers: tstellarAMD Subscribers: arsenm, kzhuravl Differential Revision: http://reviews.llvm.org/D22197 llvm-svn: 275288
-
- Jun 27, 2016
-
-
NAKAMURA Takumi authored
llvm-svn: 273860
-
NAKAMURA Takumi authored
llvm-svn: 273858
-
- Jun 25, 2016
-
-
Konstantin Zhuravlyov authored
Debugger prologue is emitted if -mattr=+amdgpu-debugger-emit-prologue. Debugger prologue writes work group IDs and work item IDs to scratch memory at fixed location in the following format: - offset 0: work group ID x - offset 4: work group ID y - offset 8: work group ID z - offset 16: work item ID x - offset 20: work item ID y - offset 24: work item ID z Set - amd_kernel_code_t::debug_wavefront_private_segment_offset_sgpr to scratch wave offset reg - amd_kernel_code_t::debug_private_segment_buffer_sgpr to scratch rsrc reg - amd_kernel_code_t::is_debug_supported to true if all debugger features are enabled Differential Revision: http://reviews.llvm.org/D20335 llvm-svn: 273769
-
- May 24, 2016
-
-
Konstantin Zhuravlyov authored
Differential Revision: http://reviews.llvm.org/D20081 llvm-svn: 270594
-
- Apr 26, 2016
-
-
Konstantin Zhuravlyov authored
[AMDGPU] Move reserved vgpr count for trap handler usage to SIMachineFunctionInfo + minor commenting changes Differential Revision: http://reviews.llvm.org/D19537 llvm-svn: 267573
-
- Apr 25, 2016
-
-
Matt Arsenault authored
llvm-svn: 267452
-
- Apr 14, 2016
-
-
Tom Stellard authored
Summary: For GL_ARB_compute_shader we need to support workgroup sizes of at least 1024. However, if we want to allow large workgroup sizes, we may need to use less registers, as we have to run more waves per SIMD. This patch adds an attribute to specify the maximum work group size the compiled program needs to support. It defaults, to 256, as that has no wave restrictions. Reducing the number of registers available is done similarly to how the registers were reserved for chips with the sgpr init bug. Reviewers: mareko, arsenm, tstellarAMD, nhaehnle Subscribers: FireBurn, kerberizer, llvm-commits, arsenm Differential Revision: http://reviews.llvm.org/D18340 Patch By: Bas Nieuwenhuizen llvm-svn: 266337
-
Tom Stellard authored
Summary: The code previously always used s1 as it was using the user + system SGPR information for compute kernels. This is incorrect for Mesa shaders though, The register should be the next SGPR after all user and system SGPR's. We use that Mesa adds arguments for all input and system SGPR's and take the next available SGPR for the scratch wave offset register. Signed-off-by:
Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewers: mareko, arsenm, nhaehnle, tstellarAMD Subscribers: qcolombet, arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D18941 Patch By: Bas Nieuwenhuizen llvm-svn: 266336
-
- Mar 11, 2016
-
-
Matt Arsenault authored
Move a few functions only used by R600 to R600 specific code, fix header macros to stop using R600, mark classes as final. llvm-svn: 263204
-
- Mar 04, 2016
-
-
Tom Stellard authored
Summary: This is necessary for when we run out of VGPRs and can no longer use v_{read,write}_lane for spilling SGPRs. Reviewers: arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D17592 llvm-svn: 262732
-
- Feb 12, 2016
-
-
Matt Arsenault authored
This was hardcoded to the static private size, but this would be missing the offset and additional size for someday when we have dynamic sizing. Also stops always initializing flat_scratch even when unused. In the future we should stop emitting this unless flat instructions are used to access private memory. For example this will initialize it almost always on VI because flat is used for global access. llvm-svn: 260658
-
- Jan 13, 2016
-
-
Marek Olsak authored
Summary: v2: Make ReturnsVoid private, so that I can another 8 lines of code and look more productive. Reviewers: tstellarAMD, arsenm Subscribers: arsenm Differential Revision: http://reviews.llvm.org/D16034 llvm-svn: 257622
-
Marek Olsak authored
Summary: This allows Mesa to pass initial SPI_PS_INPUT_ADDR to LLVM. The register assigns VGPR locations to PS inputs, while the ENA register determines whether or not they are loaded. Mesa needs to set some inputs as not-movable, so that a pixel shader prolog binary appended at the beginning can assume where some inputs are. v2: Make PSInputAddr private, because there is never enough silly getters and setters for people to read. Reviewers: tstellarAMD, arsenm Subscribers: arsenm Differential Revision: http://reviews.llvm.org/D16030 llvm-svn: 257591
-
- Nov 30, 2015
-
-
Matt Arsenault authored
If we know we have stack objects, we reserve the registers that the private buffer resource and wave offset are passed and use them directly. If not, reserve the last 5 SGPRs just in case we need to spill. After register allocation, try to pick the next available registers instead of the last SGPRs, and then insert copies from the inputs to the reserved registers in the progloue. This also only selectively enables all of the input registers which are really required instead of always enabling them. llvm-svn: 254331
-
- Nov 25, 2015
-
-
Matt Arsenault authored
llvm-svn: 254091
-
- Nov 05, 2015
-
-
Matt Arsenault authored
llvm-svn: 252145
-
- Jun 13, 2015
-
-
Tom Stellard authored
llvm-svn: 239657
-
- Jan 20, 2015
-
-
Tom Stellard authored
This is disabled by default, but can be enabled with the subtarget feature: 'vgpr-spilling' llvm-svn: 226597
-
- Jan 14, 2015
-
-
Tom Stellard authored
llvm-svn: 225988
-
- Sep 24, 2014
-
-
Tom Stellard authored
VGPRs are spilled to LDS. This still needs more testing, but we need to at least enable it at -O0, because the fast register allocator spills all registers that are live at the end of blocks and without this some future commits will break the flat-address-space.ll test. v2: Only calculate thread id once v3: Move insertion of spill instructions to SIRegisterInfo::eliminateFrameIndex() llvm-svn: 218348
-
- Aug 21, 2014
-
-
Tom Stellard authored
llvm-svn: 216218
-
Tom Stellard authored
This will simplify the SGPR spilling and also allow us to use MachineFrameInfo for calculating offsets, which should be more reliable than our custom code. This fixes a crash in some cases where a register would be spilled in a branch such that the VGPR defined for spilling did not dominate all the uses when restoring. This fixes a crash in an ocl conformance test. The test requries register spilling and is too big to include. llvm-svn: 216217
-
- Aug 13, 2014
-
-
Benjamin Kramer authored
Add header guards to files that were missing guards. Remove #endif comments as they don't seem common in LLVM (we can easily add them back if we decide they're useful) Changes made by clang-tidy with minor tweaks. llvm-svn: 215558
-
- Jul 21, 2014
-
-
Tom Stellard authored
llvm-svn: 213551
-
- May 02, 2014
-
-
Tom Stellard authored
The register spiller assumes that only one new instruction is created when spilling and restoring registers, so we need to emit pseudo instructions for vector register spills and lower them after register allocation. v2: - Fix calculation of lane index - Extend VGPR liveness to end of program. v3: - Use SIMM16 field of S_NOP to specify multiple NOPs. https://bugs.freedesktop.org/show_bug.cgi?id=75005 llvm-svn: 207843
-