- Apr 28, 2016
-
-
Matt Arsenault authored
llvm-svn: 267922
-
- Apr 26, 2016
-
-
Konstantin Zhuravlyov authored
[AMDGPU] Move reserved vgpr count for trap handler usage to SIMachineFunctionInfo + minor commenting changes Differential Revision: http://reviews.llvm.org/D19537 llvm-svn: 267573
-
Konstantin Zhuravlyov authored
Differential Revision: http://reviews.llvm.org/D19235 llvm-svn: 267563
-
- Apr 19, 2016
-
-
Nicolai Haehnle authored
Summary: A shader stored the live mask (initial exec mask) in an SGPR which was then spilled during register allocation. The allocator quite reasonably turned the spill into the pair v_writelane_b32 %vgpr, exec_lo, N and v_writelane_b32 %vgpr, exec_hi, N+1 at the beginning of the shader, confusing the SGPR accounting. No test case, because si-sgpr-spill.ll together with an upcoming patch for WQM handling exhibits the problem. Reviewers: arsenm, tstellarAMD Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D19199 llvm-svn: 266824
-
- Apr 15, 2016
-
-
Matt Arsenault authored
llvm-svn: 266382
-
- Apr 13, 2016
-
-
Artem Tamazov authored
Tests added along with the implemented feature. Note that a small, unnecessary MI scheduling issue remains (more info in the review). CodeGen/AMDGPU/salu-to-valu.ll updated to fix the false regression. TODO: Support for TTMP quads, comma-separated syntax in "[]" and more. Differential Revision: http://reviews.llvm.org/D17825 llvm-svn: 266205
-
- Apr 06, 2016
-
-
Nicolai Haehnle authored
This makes it possible to distinguish between mesa shaders and other kernels even in the presence of compute shaders. Patch By: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Differential Revision: http://reviews.llvm.org/D18559 llvm-svn: 265589
-
- Apr 05, 2016
-
-
Konstantin Zhuravlyov authored
Differential Revision: http://reviews.llvm.org/D18726 llvm-svn: 265408
-
- Mar 30, 2016
-
-
Aaron Ballman authored
Silencing warnings from MSVC 2015 Update 2. All of these changes silence "C4334 '<<': result of 32-bit shift implicitly converted to 64 bits (was 64-bit shift intended?)". NFC. llvm-svn: 264929
-
- Mar 01, 2016
-
-
Matt Arsenault authored
llvm-svn: 262297
-
- Feb 12, 2016
-
-
Matt Arsenault authored
Introduce a subtarget feature for this, and leave the default with the current behavior which assumes up to 16-byte loads/stores can be used. The field also seems to have the ability to be set to 2 bytes, but I'm not sure what that would be used for. llvm-svn: 260651
-
- Jan 28, 2016
-
-
Matt Arsenault authored
llvm-svn: 259088
-
- Jan 14, 2016
-
-
Rui Ueyama authored
llvm-svn: 257804
-
- Jan 13, 2016
-
-
Marek Olsak authored
Summary: This allows Mesa to pass the initial SPI_PS_INPUT_ADDR to LLVM. The register assigns VGPR locations to PS inputs, while the ENA register determines whether or not they are loaded. Mesa needs to set some inputs as not-movable, so that a pixel shader prolog binary appended at the beginning can assume where some inputs are. v2: Make PSInputAddr private, because there are never enough silly getters and setters for people to read. Reviewers: tstellarAMD, arsenm Subscribers: arsenm Differential Revision: http://reviews.llvm.org/D16030 llvm-svn: 257591
-
- Jan 12, 2016
-
-
Tom Stellard authored
Reviewers: arsenm, echristo Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D16010 llvm-svn: 257488
-
- Jan 08, 2016
-
-
Tom Stellard authored
Reviewers: arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D15952 llvm-svn: 257173
-
- Jan 07, 2016
-
-
Nicolai Haehnle authored
Summary: Somehow, I first interpreted the docs as saying space for xnack_mask is only reserved when XNACK is enabled via SH_MEM_CONFIG. I felt uneasy about this and went back to actually test what is happening, and it turns out that xnack_mask is always reserved at least on Tonga and Carrizo, in the sense that flat_scr is always fixed below the SGPRs that are used to implement xnack_mask, whether or not they are actually used. I confirmed this by writing a shader using inline assembly to tease out the aliasing between flat_scratch and regular SGPRs. For example, on Tonga, where we fix the number of SGPRs to 80, s[74:75] aliases flat_scratch (so xnack_mask is s[76:77] and vcc is s[78:79]). This patch changes both the calculation of the total number of SGPRs and the various register reservations to account for this. It ought to be possible to use the gap left by xnack_mask when the feature isn't used, but this patch doesn't try to do that. (Note that the same applies to vcc.) Note that previously, even before my earlier change in r256794, the SGPRs that alias to xnack_mask could end up being used as well when flat_scr was unused and the total number of SGPRs happened to fall on the right alignment (e.g. highest regular SGPR being used s29 and VCC used would lead to number of SGPRs being 32, where s28 and s29 alias with xnack_mask). So if there were some conflict due to such aliasing, we should have noticed that already. Reviewers: arsenm, tstellarAMD Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D15898 llvm-svn: 257073
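The register arithmetic in the message above can be checked with a small sketch. It assumes, as described for Tonga, that vcc, xnack_mask and flat_scratch occupy three consecutive SGPR pairs at the very top of the allocated range; the function name `special_sgpr_layout` is hypothetical and is not part of LLVM.

```python
def special_sgpr_layout(total_sgprs):
    """Return the (low, high) SGPR indices aliased by each special register.

    Assumption (taken from the commit message, not from LLVM source):
    counting down from the top of the allocated SGPR range, the pairs are
    vcc, then xnack_mask, then flat_scratch.
    """
    if total_sgprs < 6 or total_sgprs % 2 != 0:
        raise ValueError("need an even SGPR count of at least 6")
    top = total_sgprs
    return {
        "vcc": (top - 2, top - 1),
        "xnack_mask": (top - 4, top - 3),
        "flat_scratch": (top - 6, top - 5),
    }


# Tonga case from the message: the SGPR count is fixed to 80.
tonga = special_sgpr_layout(80)

# Earlier-behaviour case from the message: s0..s29 in use plus VCC gives 32.
small = special_sgpr_layout(32)
```

With 80 SGPRs this places flat_scratch at s[74:75], xnack_mask at s[76:77] and vcc at s[78:79]; with 32 SGPRs, xnack_mask lands on s[28:29], which is the overlap with regular SGPRs that the message points out.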
-
- Jan 05, 2016
-
-
Nicolai Haehnle authored
Summary: Enabling this feature will account for the two SGPRs used by the hardware to store the XNACK_MASK physically. The hardware only requires this reservation when the XNACK feature is explicitly enabled. At some point, HSA will probably want to do that, but it does increase SGPR register pressure, so leave it disabled by default for now (but do add a small test). Reviewers: arsenm, tstellarAMD Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D15869 llvm-svn: 256794
-
- Dec 17, 2015
-
-
Tom Stellard authored
Reviewers: tstellarAMD Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D15583 Patch by: Changpeng Fang llvm-svn: 255908
-
- Dec 16, 2015
-
-
Tom Stellard authored
Reviewers: arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D15493 llvm-svn: 255702
-
- Dec 15, 2015
-
-
Tom Stellard authored
Summary: I'm not sure how things worked before without this. Reviewers: arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D15492 llvm-svn: 255692
-
Tom Stellard authored
Reviewers: arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D15426 llvm-svn: 255689
-
- Dec 10, 2015
-
-
Tom Stellard authored
Summary: This allows us to remove the END_OF_TEXT_LABEL hack we had been using and simplifies the fixups used to compute the address of constant arrays. Reviewers: arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D15257 llvm-svn: 255204
-
- Dec 03, 2015
-
-
Tom Stellard authored
Summary: This is done only when targeting HSA. Reviewers: arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D13807 llvm-svn: 254587
-
- Dec 02, 2015
-
-
Tom Stellard authored
Differential Revision: http://reviews.llvm.org/D14508 llvm-svn: 254540
-
Tom Stellard authored
Summary: Only global or readonly segment variables should appear in object files. Reviewers: arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D15111 llvm-svn: 254519
-
- Nov 30, 2015
-
-
Matt Arsenault authored
llvm-svn: 254332
-
Matt Arsenault authored
If we know we have stack objects, we reserve the registers in which the private buffer resource and wave offset are passed and use them directly. If not, reserve the last 5 SGPRs just in case we need to spill. After register allocation, try to pick the next available registers instead of the last SGPRs, and then insert copies from the inputs to the reserved registers in the prologue. This also selectively enables only the input registers which are really required instead of always enabling all of them. llvm-svn: 254331
-
- Nov 26, 2015
-
-
Tom Stellard authored
Summary: This returns a pointer to the dispatch packet, which can be used to load information about the kernel dispatch. Reviewers: arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D14898 llvm-svn: 254116
-
- Nov 11, 2015
-
-
Matt Arsenault authored
llvm-svn: 252677
-
- Nov 06, 2015
-
-
Tom Stellard authored
Reviewers: arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D13804 llvm-svn: 252291
-
- Nov 05, 2015
-
-
Matt Arsenault authored
This doesn't quite match how SC prints it, which doesn't put it in a comment. llvm-svn: 252144
-
- Oct 01, 2015
-
-
Matt Arsenault authored
llvm-svn: 249082
-
- Sep 22, 2015
-
-
NAKAMURA Takumi authored
llvm-svn: 248264
-
NAKAMURA Takumi authored
llvm-svn: 248263
-
- Aug 15, 2015
-
-
Matt Arsenault authored
The comments at the bottom would all report 0 if amdhsa was used. llvm-svn: 245135
-
- Aug 12, 2015
-
-
Matt Arsenault authored
llvm-svn: 244728
-
- Jun 26, 2015
-
-
Tom Stellard authored
Reviewers: arsenm Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D10772 llvm-svn: 240839
-
Tom Stellard authored
Reviewers: arsenm Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D10757 llvm-svn: 240831
-
Tom Stellard authored
Summary: This way the function symbol points to the start of amd_kernel_code_t rather than the start of the function. Reviewers: arsenm Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D10705 llvm-svn: 240829
-