- Mar 17, 2014
Tom Stellard authored
The type of the immediates should not matter as long as the encoding is equivalent to the encoding of one of the legal inline constants. Tested-by: Michel Dänzer <michel.daenzer@amd.com> llvm-svn: 204056
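For context, SI's inline constants are the integers -16 through 64 and a small set of floating-point values (±0.5, ±1.0, ±2.0, ±4.0). Below is a minimal standalone sketch of the kind of encoding check the commit describes; it is not the in-tree SIInstrInfo code, and the helper name is invented.

```cpp
// Minimal sketch, not the in-tree implementation: decide whether a 32-bit
// immediate can be encoded as one of SI's inline constants by looking only
// at its bit pattern, not at its nominal type.
#include <cstdint>

static bool isEncodableAsInlineConstant(uint32_t Bits) {
  // Integer inline constants cover -16..64.
  int32_t Signed = static_cast<int32_t>(Bits);
  if (Signed >= -16 && Signed <= 64)
    return true;

  // Floating-point inline constants, matched by bit pattern.
  switch (Bits) {
  case 0x3F000000: // 0.5
  case 0xBF000000: // -0.5
  case 0x3F800000: // 1.0
  case 0xBF800000: // -1.0
  case 0x40000000: // 2.0
  case 0xC0000000: // -2.0
  case 0x40800000: // 4.0
  case 0xC0800000: // -4.0
    return true;
  default:
    return false;
  }
}
```

The point of the commit is that an operand whose bits happen to match one of these encodings is legal regardless of whether it was written as an integer or a float.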
Tom Stellard authored
Added checks for number of operands and operand register classes. Tested-by: Michel Dänzer <michel.daenzer@amd.com> llvm-svn: 204054
- Mar 14, 2014
Owen Anderson authored
Changed operator* on the by-operand iterators to return a MachineOperand& rather than a MachineInstr&. At this point they almost behave like normal iterators! Again, this requires making some existing loops more verbose, but it should pave the way for the big range-based for-loop cleanups in the future. llvm-svn: 203865
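A small sketch of what the more verbose loops look like under the new semantics, assuming the post-change MachineRegisterInfo use iterators; the helper and the setIsKill() call are illustrative only.

```cpp
// Illustrative only: with the by-operand use iterators, operator* now
// yields the MachineOperand itself; the owning MachineInstr has to be
// fetched explicitly through getParent().
#include "llvm/CodeGen/MachineInstr.h"
#include "llvm/CodeGen/MachineRegisterInfo.h"

using namespace llvm;

static void clearKillFlagsOnUses(MachineRegisterInfo &MRI, unsigned Reg) {
  for (MachineRegisterInfo::use_iterator UI = MRI.use_begin(Reg),
                                         UE = MRI.use_end();
       UI != UE; ++UI) {
    MachineOperand &UseMO = *UI;             // a MachineOperand&, not a MachineInstr&
    MachineInstr *UseMI = UseMO.getParent(); // the instruction that owns it
    (void)UseMI;                             // e.g. inspect or update the user here
    UseMO.setIsKill(false);                  // operate directly on the operand
  }
}
```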
- Mar 11, 2014
Matt Arsenault authored
llvm-svn: 203517
- Feb 10, 2014
Tom Stellard authored
DS instructions that access local memory can only use addresses that are less than or equal to the value of M0. When M0 is uninitialized, we experience undefined behavior. This patch also changes the behavior to emit S_WQM_B64 on pixel shaders no matter what kind of DS instruction is used. llvm-svn: 201097
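A hedged sketch of the implied fix: write an all-ones mask into M0 at function entry so every local-memory address is covered before any DS instruction runs. The insertion point and helper are assumptions, not the patch itself, and the AMDGPU::* enums come from the target's own headers.

```cpp
// Sketch only: initialize M0 to 0xFFFFFFFF at the top of the entry block so
// that DS addresses (which must be <= M0) never hit uninitialized state.
// Requires the target-internal SIInstrInfo/AMDGPU headers in addition to
// the generic CodeGen ones included here.
#include "llvm/CodeGen/MachineFunction.h"
#include "llvm/CodeGen/MachineInstrBuilder.h"

using namespace llvm;

static void initM0ForLDS(MachineFunction &MF, const TargetInstrInfo &TII) {
  MachineBasicBlock &Entry = MF.front();
  MachineBasicBlock::iterator I = Entry.begin();
  DebugLoc DL; // synthesized code, no source location

  BuildMI(Entry, I, DL, TII.get(AMDGPU::S_MOV_B32), AMDGPU::M0)
      .addImm(0xFFFFFFFF); // M0 bounds all DS accesses
}
```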
- Dec 17, 2013
Andrew Trick authored
Allow MachineCSE to coalesce trivial subregister copies the same way that it coalesces normal copies. Without this, MachineCSE is powerless to handle redundant operations with truncated source operands. This required fixing the 2-addr pass to handle tied subregisters. It isn't clear what combinations of subregisters can legally be tied, but the simple case of truncated source operands is now safely handled:
%vreg11<def> = COPY %vreg1:sub_32bit; GR32:%vreg11 GR64:%vreg1
%vreg12<def> = COPY %vreg2:sub_32bit; GR32:%vreg12 GR64:%vreg2
%vreg13<def,tied1> = ADD32rr %vreg11<tied0>, %vreg12<kill>, %EFLAGS<imp-def>
Test case: cse-add-with-overflow.ll. This exposed an existing bug in PPCInstrInfo::commuteInstruction. Thanks to Rafael for the test case: PowerPC/crash.ll. llvm-svn: 197465
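For orientation, the pattern above is roughly what source like the following produces on x86-64, where two 64-bit values are truncated before a 32-bit add; this is only an illustration of "truncated source operands", not the cse-add-with-overflow.ll test case.

```cpp
#include <cstdint>

// Each truncation becomes a COPY from the low sub_32bit sub-register of a
// 64-bit virtual register (GR64 -> GR32), feeding a 32-bit ADD32rr; this is
// the shape of copy that MachineCSE can now coalesce like any other.
int32_t add_truncated(int64_t a, int64_t b) {
  int32_t lo_a = static_cast<int32_t>(a); // COPY %vregA:sub_32bit
  int32_t lo_b = static_cast<int32_t>(b); // COPY %vregB:sub_32bit
  return lo_a + lo_b;                     // ADD32rr
}
```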
- Nov 27, 2013
Tom Stellard authored
SGPRs are spilled into VGPRs using the {READ,WRITE}LANE_B32 instructions.
v2:
- Fix encoding of Lane Mask
- Use correct register flags, so we don't overwrite the low dword when restoring multi-dword registers.
v3:
- Register spilling seems to hang the GPU, so replace all shaders that need spilling with a dummy shader.
v4:
- Fix *LANE definitions
- Change destination reg class for 32-bit SMRD instructions
v5:
- Remove small optimization that was crashing Serious Sam 3.
https://bugs.freedesktop.org/show_bug.cgi?id=68224
https://bugs.freedesktop.org/show_bug.cgi?id=71285
NOTE: This is a candidate for the 3.4 branch. llvm-svn: 195880
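As a rough mental model of the {READ,WRITE}LANE_B32 scheme (a conceptual sketch, not compiler or hardware code; the 64-lane width is SI's wavefront size):

```cpp
// Conceptual model: a VGPR holds one 32-bit value per lane of the 64-wide
// wavefront. V_WRITELANE_B32 parks a scalar (SGPR) value in a chosen lane,
// and V_READLANE_B32 reads it back, which is what lets SGPRs be spilled
// into VGPRs without touching memory.
#include <array>
#include <cstdint>

struct VGPR {
  std::array<uint32_t, 64> lanes{}; // one dword per wavefront lane
};

// Model of V_WRITELANE_B32: save an SGPR value into lane `lane`.
inline void writeLane(VGPR &v, uint32_t sgprValue, unsigned lane) {
  v.lanes[lane] = sgprValue;
}

// Model of V_READLANE_B32: restore the SGPR value from lane `lane`.
inline uint32_t readLane(const VGPR &v, unsigned lane) {
  return v.lanes[lane];
}
```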
- Nov 18, 2013
Matt Arsenault authored
Moving into a VSrc doesn't always work, since it could be replaced with an SGPR later. llvm-svn: 195042
Matt Arsenault authored
No other SGPR operands are allowed, so if VCC is used, move the other to a VGPR. llvm-svn: 195041
Matt Arsenault authored
llvm-svn: 195034
Matt Arsenault authored
When replacing scalar operations with vector ones, the wrong implicit output register was used. llvm-svn: 195033
- Nov 15, 2013
Matt Arsenault authored
llvm-svn: 194858
- Nov 14, 2013
Matt Arsenault authored
llvm-svn: 194688
Matt Arsenault authored
llvm-svn: 194684
Tom Stellard authored
llvm-svn: 194632
Tom Stellard authored
Private address space is emulated using the register file with MOVRELS and MOVRELD instructions. llvm-svn: 194626
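A conceptual model of what "emulated using the register file" means here, assuming SI's M0-indexed MOVREL semantics; the struct and member names are invented for illustration.

```cpp
// Conceptual model, not compiler code: a block of registers stands in for
// private memory, and MOVRELD/MOVRELS access it indirectly, with M0
// supplying the dynamic index, much like a runtime-indexed array.
#include <array>
#include <cstdint>

struct RegisterFilePrivateMemory {
  std::array<uint32_t, 256> regs{}; // register block reserved as "private memory"
  uint32_t m0 = 0;                  // dynamic index register

  // Model of MOVRELD: write the register selected by base + M0.
  void storeIndirect(unsigned base, uint32_t value) { regs[base + m0] = value; }

  // Model of MOVRELS: read the register selected by base + M0.
  uint32_t loadIndirect(unsigned base) const { return regs[base + m0]; }
};
```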
Tom Stellard authored
All shift operations will be selected as SALU instructions and then, if necessary, lowered to VALU instructions in the SIFixSGPRCopies pass. This allows us to do more operations on the SALU, which will improve performance and is also required for implementing private memory using indirect addressing, since the private memory pointers must stay in the scalar registers. This patch includes some fixes from Matt Arsenault. llvm-svn: 194625
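A hedged sketch of the policy this describes: select everything as an SALU instruction and demote to the VALU only when an input actually lives in a vector register, since the scalar ALU cannot read VGPRs. This is an illustration, not the in-tree SIFixSGPRCopies pass; the isVGPRClass callback stands in for the target's register-class query.

```cpp
// Illustration only: decide whether an instruction selected for the SALU
// has to be rewritten as its VALU form because one of its register inputs
// is (or will be) a VGPR.
#include "llvm/CodeGen/MachineInstr.h"
#include "llvm/CodeGen/MachineRegisterInfo.h"

using namespace llvm;

static bool mustMoveToVALU(const MachineInstr &MI,
                           const MachineRegisterInfo &MRI,
                           bool (*isVGPRClass)(const TargetRegisterClass *)) {
  for (unsigned i = 0, e = MI.getNumOperands(); i != e; ++i) {
    const MachineOperand &MO = MI.getOperand(i);
    if (!MO.isReg() || !MO.isUse() ||
        !TargetRegisterInfo::isVirtualRegister(MO.getReg()))
      continue;
    if (isVGPRClass(MRI.getRegClass(MO.getReg())))
      return true; // a VGPR input forces the VALU form
  }
  return false;
}
```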
- Oct 28, 2013
NAKAMURA Takumi authored
llvm-svn: 193510
- Oct 22, 2013
Tom Stellard authored
llvm-svn: 193183
Tom Stellard authored
llvm-svn: 193180
Tom Stellard authored
The AMDGPUIndirectAddressing pass was previously responsible for lowering private loads and stores to indirect addressing instructions. However, this pass was buggy and way too complicated. The only advantage it had over the new simplified code was that it saved one instruction per direct write to private memory. This optimization likely has a minimal impact on performance, and we may be able to duplicate it using some other transformation. For the private address space, we now:
1. Lower private loads/stores to Register(Load|Store) instructions.
2. Reserve part of the register file as 'private memory'.
3. After regalloc, lower the Register(Load|Store) instructions to MOV instructions that use indirect addressing.
llvm-svn: 193179
Tom Stellard authored
llvm-svn: 193178
- Oct 16, 2013
Vincent Lejeune authored
llvm-svn: 192743
- Oct 10, 2013
Tom Stellard authored
The function is used by the machine verifier and checks that VOP* instructions have legal operands. llvm-svn: 192367
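A simplified illustration of the two kinds of checks mentioned in the Mar 17, 2014 entry above (operand count and operand register classes); it is written as a free function against the LLVM 3.x-era C++ API and is not the in-tree SIInstrInfo::verifyInstruction.

```cpp
// Sketch of verifier-style operand checks: every declared operand must be
// present, and each virtual-register operand must belong to a register
// class compatible with the one the instruction description expects.
#include "llvm/ADT/StringRef.h"
#include "llvm/CodeGen/MachineInstr.h"
#include "llvm/CodeGen/MachineRegisterInfo.h"
#include "llvm/MC/MCInstrDesc.h"

using namespace llvm;

static bool verifyOperands(const MachineInstr &MI,
                           const TargetRegisterInfo &TRI, StringRef &ErrInfo) {
  const MCInstrDesc &Desc = MI.getDesc();

  // Every operand the instruction description declares must be present.
  if (MI.getNumExplicitOperands() < Desc.getNumOperands()) {
    ErrInfo = "missing explicit operands";
    return false;
  }

  const MachineRegisterInfo &MRI = MI.getParent()->getParent()->getRegInfo();
  for (unsigned i = 0, e = Desc.getNumOperands(); i != e; ++i) {
    const MachineOperand &MO = MI.getOperand(i);
    if (!MO.isReg() || !TargetRegisterInfo::isVirtualRegister(MO.getReg()))
      continue;
    int RCID = Desc.OpInfo[i].RegClass;
    if (RCID < 0)
      continue; // no register-class constraint on this operand
    const TargetRegisterClass *Expected = TRI.getRegClass(RCID);
    if (!Expected->hasSubClassEq(MRI.getRegClass(MO.getReg()))) {
      ErrInfo = "operand register class does not match the instruction";
      return false;
    }
  }
  return true; // instruction looks well-formed
}
```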
- Aug 18, 2013
Dmitri Gribenko authored
llvm-svn: 188626
- Aug 16, 2013
Michel Danzer authored
The logic in SIInsertWaits::getHwCounts() only really made sense for SMRD instructions, and trying to shoehorn it into handling DS_WRITE_B32 caused it to corrupt the encoding of that instruction by clobbering the first operand with the second one. Undo that damage and only apply that logic to SMRD instructions. Fixes some derivatives-related piglit regressions with radeonsi. Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 188558
- Aug 15, 2013
Tom Stellard authored
The previous code declared the operand as unknown:$vaddr, which made it possible for scalar registers to be used instead of vector registers. llvm-svn: 188425
- Jul 15, 2013
Craig Topper authored
llvm-svn: 186307
- Jun 07, 2013
Bill Wendling authored
The internals of TargetMachine could change. No functionality change intended. llvm-svn: 183561
- Apr 10, 2013
Christian Konig authored
Depending on the number of bits set in the writemask. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> llvm-svn: 179166
- Mar 27, 2013
Christian Konig authored
Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Tested-by: Michel Dänzer <michel.daenzer@amd.com> llvm-svn: 178127
- Mar 26, 2013
Christian Konig authored
Prevent loading M0 multiple times. Signed-off-by: Christian König <christian.koenig@amd.com> llvm-svn: 178023
- Mar 01, 2013
Christian Konig authored
v2: based on Michel's patch, but now allows copying of all register sizes. Signed-off-by: Michel Dänzer <michel.daenzer@amd.com> Signed-off-by: Christian König <christian.koenig@amd.com> llvm-svn: 176346
- Feb 26, 2013
Christian Konig authored
Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 176102
- Feb 16, 2013
Christian Konig authored
Seems to be a lot simpler, and also paves the way for further improvements. v2: rebased on master, use 0 in BUFFER_LOAD_FORMAT_XYZW, use VGPR0 in dummy EXP, avoid compiler warning, break after encoding the first literal. v3: correctly use V_ADD_F32_e64. This is a candidate for the stable branch. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 175354
- Feb 07, 2013
Tom Stellard authored
Allows nexuiz to run with radeonsi. Patch by: Michel Dänzer Signed-off-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 174655
- Feb 06, 2013
Tom Stellard authored
Only implemented for R600 so far. SI is missing implementations of a few callbacks used by the Indirect Addressing pass and needs code to handle frame indices. At the moment R600 only supports array sizes of 16 dwords or less. Register packing of vector types is currently disabled, which means that a vec4 is stored in T0_X, T1_X, T2_X, T3_X, rather than T0_XYZW. In order to correctly pack registers in all cases, we will need to implement an analysis pass for R600 that determines the correct vector width for each array.
v2:
- Add support for i8 zext load from stack.
- Coding style fixes
v3:
- Don't reserve registers for indirect addressing when it isn't being used.
- Fix bug caused by LLVM limiting the number of SubRegIndex declarations.
v4:
- Fix 64-bit defines
llvm-svn: 174525
- Jan 02, 2013
Chandler Carruth authored
Resorted the #include lines with the utils/sort_includes.py script. Most of these changes are updating the new R600 target and fixing up a few regressions that have crept in since the last time I sorted the includes. llvm-svn: 171362
- Dec 20, 2012
NAKAMURA Takumi authored
llvm-svn: 170620
- Dec 11, 2012
Tom Stellard authored
A new backend supporting AMD GPUs: Radeon HD2XXX - HD7XXX llvm-svn: 169915