- Apr 07, 2013
-
-
Jakob Stoklund Olesen authored
Integer return values are sign or zero extended by the callee, and structs up to 32 bytes in size can be returned in registers.
The CC_Sparc64 CallingConv definition is shared between LowerFormalArguments_64 and LowerReturn_64. Function arguments and return values are passed in the same registers.
The inreg flag is also used for return values. This is required to handle C functions returning structs containing floats and ints:
    struct ifp { int i; float f; };
    struct ifp f(void);
LLVM IR:
    define inreg { i32, float } @f() {
      ...
      ret { i32, float } %retval
    }
The ABI requires that %retval.i is returned in the high bits of %i0 while %retval.f goes in %f1.
Without the inreg return value attribute, %retval.i would go in %i0 and %retval.f would go in %f3, which is a more efficient way of returning multiple values, but it is not ABI compliant for returning C structs.
llvm-svn: 178966
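The register placement in the example above can be modeled with a little arithmetic. This is only a sketch: the helper names are hypothetical, and the two floating point register formulas are an assumed reading of the %f1 and %f3 results quoted in the message, not code from the patch.

```cpp
#include <cstdint>

// Image of a 64-bit integer register holding an i32 in its high half,
// as described for %retval.i in %i0.
constexpr std::uint64_t packHighHalf(std::uint32_t Value) {
  return static_cast<std::uint64_t>(Value) << 32;
}

// Assumed model: with inreg, a float at ByteOffset inside the struct lands in
// the single-precision register %f(ByteOffset / 4).
constexpr unsigned floatRegForStructFragment(unsigned ByteOffset) {
  return ByteOffset / 4;
}

// Assumed model: without inreg, a float owns a whole 8-byte slot and sits in
// the odd half of that slot, %f(2 * Slot + 1).
constexpr unsigned floatRegForWholeSlot(unsigned Slot) {
  return 2 * Slot + 1;
}
```

For struct ifp, the float member is at byte offset 4, so the first formula gives %f1. Treated as a separate second return value, it would occupy slot 1, and the second formula gives %f3.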
-
- Apr 06, 2013
-
-
Jakob Stoklund Olesen authored
64-bit SPARC v9 processes use biased stack and frame pointers, so the current function's stack frame is located at %sp+BIAS .. %fp+BIAS where BIAS = 2047. This makes more local variables directly accessible via [%fp+simm13] addressing. llvm-svn: 178965
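A small sketch of why the bias extends the reach of [%fp+simm13] addressing. The helper names are hypothetical; only BIAS = 2047 comes from the message, and the signed 13-bit immediate range of -4096..4095 is assumed from the simm13 name.

```cpp
#include <cstdint>

// Stack bias stated in the commit message.
constexpr std::int64_t StackBias = 2047;

// Actual memory address designated by a biased %sp or %fp value.
constexpr std::uint64_t actualAddress(std::uint64_t BiasedReg) {
  return BiasedReg + static_cast<std::uint64_t>(StackBias);
}

// True if Offset is encodable as a signed 13-bit immediate.
constexpr bool fitsSimm13(std::int64_t Offset) {
  return Offset >= -4096 && Offset <= 4095;
}

// Immediate to add to the biased %fp to reach a local that lives
// BytesBelowFrame bytes below the actual frame address (%fp + BIAS).
constexpr std::int64_t fpOffsetForLocal(std::int64_t BytesBelowFrame) {
  return StackBias - BytesBelowFrame;
}
```

Under these assumptions a local 6143 bytes below the frame address is still reachable with a single immediate, whereas an unbiased frame pointer would only reach 4096 bytes down.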
-
Hal Finkel authored
There are certain PPC instructions into which we can fold a zero immediate operand. We can detect such cases by looking at the register class required by the using operand (so long as it is not otherwise constrained). llvm-svn: 178961
-
Hal Finkel authored
llvm-svn: 178960
-
Hal Finkel authored
This comment documents the current behavior of the ARM implementation of this callback, and also the soon-to-be-committed PPC version. llvm-svn: 178959
-
Jakob Stoklund Olesen authored
All arguments are formally assigned to stack positions and then promoted to floating point and integer registers. Since there are more floating point registers than integer registers, this can cause situations where floating point arguments are assigned to registers after integer arguments that were assigned to the stack. Use the inreg flag to indicate 32-bit fragments of structs containing both float and int members. The three-way shadowing between stack, integer, and floating point registers requires custom argument lowering. The good news is that return values are passed in the exact same way, and we can share the code.
Still missing:
- Update LowerReturn to handle structs returned in registers.
- LowerCall.
- Variadic functions.
llvm-svn: 178958
-
Nadav Rotem authored
llvm-svn: 178949
-
Rafael Espindola authored
llvm-svn: 178948
-
Rafael Espindola authored
This also required not using the RegisterStringTable API, which is also a good thing. llvm-svn: 178947
-
Rafael Espindola authored
llvm-svn: 178946
-
Rafael Espindola authored
llvm-svn: 178945
-
Rafael Espindola authored
llvm-svn: 178943
-
Manman Ren authored
We used to do "SmallString += CUID", which is incorrect, since CUID will be truncated to a char. rdar://problem/13573833 llvm-svn: 178941
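The same pitfall can be reproduced with std::string, used here as a stand-in for SmallString (an assumption; the patch itself is not shown). Both function names are hypothetical.

```cpp
#include <string>

// Buggy pattern: operator+= picks the `char` overload, so the integer is
// converted to a single character instead of being formatted as text.
std::string appendTruncating(std::string Name, unsigned CUID) {
  Name += CUID;
  return Name;
}

// One possible fix: format the number as decimal digits before appending.
std::string appendDecimal(std::string Name, unsigned CUID) {
  Name += std::to_string(CUID);
  return Name;
}
```

With CUID = 65 the first version appends the single character 'A', while the second appends the two characters "65".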
-
Michael Gottesman authored
llvm-svn: 178932
-
Tom Stellard authored
v2: - Use the ADDR64 bit Reviewed-by: Christian König <christian.koenig@amd.com> llvm-svn: 178931
-
Tom Stellard authored
The code emitter knows how to encode operands whose name matches one of the encoding fields. If there is no match, the code emitter relies on the order of the operand and field definitions to determine how operands should be encoded. Matching by order makes it easy to accidentally break the instruction encodings, so we prefer to match by name. Reviewed-by: Christian König <christian.koenig@amd.com> llvm-svn: 178930
-
Tom Stellard authored
This is an R600 GPU with double support. Reviewed-by: Christian König <christian.koenig@amd.com> llvm-svn: 178929
-
Tom Stellard authored
Reviewed-by: Christian König <christian.koenig@amd.com> llvm-svn: 178928
-
Tom Stellard authored
SITargetLowering::analyzeImmediate() was converting 64-bit values to 32-bit and then checking whether they were inline immediates. For some values the truncated result passed this check, which produced S_MOV instructions with 64-bit immediates; these are illegal. v2: - Clean up logic Reviewed-by: Christian König <christian.koenig@amd.com> llvm-svn: 178927
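The failure mode described above, truncating before checking, can be sketched as follows. The function names are hypothetical, the -16..64 inline range is an illustrative assumption, and the range guard is just one way to avoid the problem, not necessarily what the patch does.

```cpp
#include <cstdint>
#include <limits>

// Assumption for illustration: small integers in [-16, 64] count as inline.
bool isInlineInt32(std::int32_t Value) {
  return Value >= -16 && Value <= 64;
}

// Buggy pattern: drop the upper 32 bits first, then check what is left.
bool buggyIsInline(std::int64_t Imm) {
  std::uint32_t Low = static_cast<std::uint32_t>(Imm);
  return Low <= 64u && isInlineInt32(static_cast<std::int32_t>(Low));
}

// Guarded version: reject anything that does not fit in 32 bits before
// performing the inline check.
bool guardedIsInline(std::int64_t Imm) {
  if (Imm < std::numeric_limits<std::int32_t>::min() ||
      Imm > std::numeric_limits<std::int32_t>::max())
    return false;
  return isInlineInt32(static_cast<std::int32_t>(Imm));
}
```

A value such as 0x100000001 truncates to 1, so the buggy check accepts it even though the full 64-bit value is not a small constant.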
-
Hal Finkel authored
On cores for which we know the misprediction penalty, and we have the isel instruction, we can profitably perform early if conversion. This enables us to replace some small branch sequences with selects and avoid the potential stalls from mispredicting the branches. Enabling this feature required implementing canInsertSelect and insertSelect in PPCInstrInfo; isel code in PPCISelLowering was refactored to use these functions as well. llvm-svn: 178926
-
Hal Finkel authored
The manual states that there is a minimum of 13 cycles from when the mispredicted branch is issued to when the correct branch target is issued. llvm-svn: 178925
-
Michael Gottesman authored
This is the counterpart to commit r160637, except it performs the action in the bottom-up portion of the data flow analysis. llvm-svn: 178922
-
Michael Gottesman authored
The normal dataflow sequence in the ARC optimizer consists of the following states:
    Retain -> CanRelease -> Use -> Release
Before this patch, the optimizer stored the uses that determine the lifetime of the retainable object pointer when it hit a retain bottom-up or a release top-down. This is correct for the imprecise lifetime scenario, since what we are trying to do is remove retains/releases while making sure that no "CanRelease" (which is usually a call) deallocates the given pointer before we get to the "Use" (since that would cause a segfault).
In the precise lifetime scenario, though, this is not correct. There we *DO* care about the previous sequence, but additionally we wish to track the uses resulting from the following incomplete sequences:
    Retain -> CanRelease -> Release (TopDown)
    Retain <- Use <- Release (BottomUp)
*NOTE* This patch looks large, but most of it consists of updating test cases. Additionally, this fix exposed another bug. I removed the test case that expressed said bug and will recommit it with the fix in a little bit.
llvm-svn: 178921
-
Hal Finkel authored
This fixes PEI as previously described, but correctly handles the case where the instruction defining the virtual register to be scavenged is the first in the block. Arnold provided me with a bugpoint-reduced test case, but even that seems too large to use as a regression test. If I'm successful in cleaning it up then I'll commit that as well.
Original commit message: This change fixes a bug that I introduced in r178058. After a register is scavenged using one of the available spill slots, the instruction defining the virtual register needs to be moved to after the spill code. The scavenger has already processed the defining instruction so that registers killed by that instruction are available for definition in that same instruction. Unfortunately, after this, the scavenger needs to iterate through the spill code and then visit, again, the instruction that defines the now-scavenged register. In order to avoid confusion, the register scavenger needs the ability to 'back up' through the spill code so that it can again process the instructions in the appropriate order. Prior to this fix, once the scavenger reached the just-moved instruction, it would assert if it killed any registers because, having already processed the instruction, it believed they were undefined. Unfortunately, I don't yet have a small test case. Thanks to Pranav Bhandarkar for diagnosing the problem and testing this fix.
llvm-svn: 178919
-
- Apr 05, 2013
-
-
Bill Wendling authored
During LTO, the target options on functions within the same Module may change. This would necessitate resetting some of the back-end. Do this for X86, because it's a Friday afternoon. llvm-svn: 178917
-
Hal Finkel authored
Reverting because this breaks one of the LTO builders.
Original commit message: This change fixes a bug that I introduced in r178058. After a register is scavenged using one of the available spill slots, the instruction defining the virtual register needs to be moved to after the spill code. The scavenger has already processed the defining instruction so that registers killed by that instruction are available for definition in that same instruction. Unfortunately, after this, the scavenger needs to iterate through the spill code and then visit, again, the instruction that defines the now-scavenged register. In order to avoid confusion, the register scavenger needs the ability to 'back up' through the spill code so that it can again process the instructions in the appropriate order. Prior to this fix, once the scavenger reached the just-moved instruction, it would assert if it killed any registers because, having already processed the instruction, it believed they were undefined. Unfortunately, I don't yet have a small test case. Thanks to Pranav Bhandarkar for diagnosing the problem and testing this fix.
llvm-svn: 178916
-
Jim Grosbach authored
llvm-svn: 178915
-
Shuxin Yang authored
This optimization is unstable at the moment: 1) it blocks us on a very important application, 2) PR15200, 3) test6 and test7 in test/Transforms/ScalarRepl/dynamic-vector-gep.ll (the CHECK commands compare the output against the wrong result). I personally believe this optimization should not have any impact on autovectorized code, as the auto-vectorizer is supposed to place gather/scatter in the "right" way. Although in theory downstream optimizers might reveal some gather/scatter optimization opportunities, the chance is quite slim. For hand-crafted vectorized code, in terms of redundancy elimination, load-CSE, copy-propagation and DSE can collectively achieve the same result in a much simpler way. In addition, these optimizers are able to improve the code incrementally; in contrast, SROA is an all-or-nothing approach. However, SROA might win slightly on stack size, as it tries to find a stretch of memory that tightly covers the area accessed by the dynamic index. rdar://13174884 PR15200 llvm-svn: 178912
-
Akira Hatanaka authored
llvm-mips-linux green. llvm-mips-linux runs on a big endian machine. This test passes if I change 'e' to 'E' in the target data layout string. llvm-svn: 178910
-
Douglas Gregor authored
It's possible for the lock file to disappear and the owning process to return before we're able to see the generated file. Spin for a little while to see if it shows up before failing. rdar://problem/13551789 llvm-svn: 178909
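A bounded polling loop of the kind described can be sketched like this. The function name, attempt count, and delay are all hypothetical, and the sketch probes the file by trying to open it rather than using whatever mechanism the patch relies on.

```cpp
#include <chrono>
#include <fstream>
#include <string>
#include <thread>

// Polls for Path up to MaxAttempts times, sleeping Delay between probes.
// Returns true as soon as the file can be opened for reading.
bool waitForFile(const std::string &Path, int MaxAttempts,
                 std::chrono::milliseconds Delay) {
  for (int Attempt = 0; Attempt < MaxAttempts; ++Attempt) {
    if (std::ifstream(Path).good())
      return true;
    std::this_thread::sleep_for(Delay);
  }
  return false; // Gave up: the file never showed up.
}
```

Bounding the number of attempts keeps a genuinely missing file from turning into an indefinite hang.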
-
Douglas Gregor authored
If the directory that will contain the unique file doesn't exist when we first try to create the file, but another process creates that directory before we get a chance to create it ourselves, we would bail out rather than try to create the unique file. rdar://problem/13551789 llvm-svn: 178908
-
Michael J. Spencer authored
llvm-svn: 178905
-
Rafael Espindola authored
llvm-svn: 178904
-
Rafael Espindola authored
These should really be templated like ELF, but this is a start. llvm-svn: 178896
-
Michael Gottesman authored
llvm-svn: 178895
-
Rafael Espindola authored
llvm-svn: 178894
-
Michael Gottesman authored
llvm-svn: 178893
-
Timur Iskhodzhanov authored
llvm-svn: 178885
-
Renato Golin authored
llvm-svn: 178883
-
Chad Rosier authored
memory operands. Essentially, this layers an infix calculator on top of the parsing state machine. The scale on the index register is still expected to be an immediate
    __asm mov eax, [eax + ebx*4]
and will not work with more complex expressions. For example,
    __asm mov eax, [eax + ebx*(2*2)]
The plus and minus binary operators assume the numeric value of a register is zero so as to not change the displacement. Register operands should never be an operand for a multiply or divide operation; the scale*indexreg expression is always replaced with a zero on the operand stack to prevent such a case.
rdar://13521380 llvm-svn: 178881
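The "registers count as zero" rule can be illustrated with a tiny expression evaluator. This sketch uses recursive descent rather than the operand stack and state machine the message describes, and every name in it is hypothetical. It folds only the displacement and does not model scale or index register extraction.

```cpp
#include <cctype>
#include <cstddef>
#include <cstdint>
#include <string>
#include <utility>

struct DisplacementParser {
  std::string Text;
  std::size_t Pos;

  explicit DisplacementParser(std::string T) : Text(std::move(T)), Pos(0) {}

  // Skips whitespace, then reports whether the next character is C.
  bool atChar(char C) {
    while (Pos < Text.size() &&
           std::isspace(static_cast<unsigned char>(Text[Pos])))
      ++Pos;
    return Pos < Text.size() && Text[Pos] == C;
  }

  // primary := '(' expr ')' | register-name | decimal-number
  std::int64_t parsePrimary() {
    if (atChar('(')) {
      ++Pos;
      std::int64_t Value = parseExpr();
      if (atChar(')'))
        ++Pos;
      return Value;
    }
    if (Pos < Text.size() &&
        std::isalpha(static_cast<unsigned char>(Text[Pos]))) {
      while (Pos < Text.size() &&
             std::isalnum(static_cast<unsigned char>(Text[Pos])))
        ++Pos;
      return 0; // A register contributes nothing to the displacement.
    }
    std::int64_t Value = 0;
    while (Pos < Text.size() &&
           std::isdigit(static_cast<unsigned char>(Text[Pos]))) {
      Value = Value * 10 + (Text[Pos] - '0');
      ++Pos;
    }
    return Value;
  }

  // term := primary (('*' | '/') primary)*
  std::int64_t parseTerm() {
    std::int64_t Value = parsePrimary();
    for (;;) {
      if (atChar('*')) {
        ++Pos;
        Value *= parsePrimary();
      } else if (atChar('/')) {
        ++Pos;
        std::int64_t Rhs = parsePrimary();
        Value = Rhs != 0 ? Value / Rhs : 0; // Sketch: divide by zero yields 0.
      } else {
        return Value;
      }
    }
  }

  // expr := term (('+' | '-') term)*
  std::int64_t parseExpr() {
    std::int64_t Value = parseTerm();
    for (;;) {
      if (atChar('+')) {
        ++Pos;
        Value += parseTerm();
      } else if (atChar('-')) {
        ++Pos;
        Value -= parseTerm();
      } else {
        return Value;
      }
    }
  }
};

// Folds the constant part of a bracketed memory expression.
std::int64_t evalDisplacement(const std::string &Expr) {
  return DisplacementParser(Expr).parseExpr();
}
```

Because a register evaluates to zero, a term such as ebx*4 also folds to zero, which mirrors the note that scale*indexreg is replaced by a zero so it never perturbs the displacement.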
-