- Apr 06, 2013
-
Jakob Stoklund Olesen authored
64-bit SPARC v9 processes use biased stack and frame pointers, so the current function's stack frame is located at %sp+BIAS .. %fp+BIAS where BIAS = 2047. This makes more local variables directly accessible via [%fp+simm13] addressing. llvm-svn: 178965
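For illustration, a minimal sketch of the bias arithmetic; the constant and helper names are illustrative, not taken from the LLVM Sparc backend:

```cpp
// Sketch of the v9 stack-bias arithmetic. On SPARC v9, %sp and %fp hold
// the true address minus 2047, so every frame access must add the bias back.
#include <cassert>
#include <cstdint>

constexpr int64_t StackBias = 2047;

// Immediate for a [%fp + simm13] access to a local at byte offset
// FrameOffset relative to the unbiased frame pointer.
int64_t biasedFrameOffset(int64_t FrameOffset) {
  int64_t Imm = FrameOffset + StackBias;
  assert(Imm >= -4096 && Imm <= 4095 && "offset must fit in simm13");
  return Imm;
}
```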
-
Hal Finkel authored
There are certain PPC instructions into which we can fold a zero immediate operand. We can detect such cases by looking at the register class required by the using operand (so long as it is not otherwise constrained). llvm-svn: 178961
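As a rough illustration of the check (toy types only, not the real PPCInstrInfo interfaces):

```cpp
// Toy model of the folding test: a zero immediate can be folded only if
// the using operand's register class still admits the register that
// hardware reads as the constant 0, and the operand is not otherwise
// constrained (e.g. tied to another operand).
enum ToyRegClass { GPR_WithZero, GPR_NoZero };

struct ToyUseOperand {
  ToyRegClass RequiredClass;   // class required by the using instruction
  bool OtherwiseConstrained;
};

bool canFoldZeroImmediate(const ToyUseOperand &Op) {
  return Op.RequiredClass == GPR_WithZero && !Op.OtherwiseConstrained;
}
```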
-
Hal Finkel authored
llvm-svn: 178960
-
Jakob Stoklund Olesen authored
All arguments are formally assigned to stack positions and then promoted to floating point and integer registers. Since there are more floating point registers than integer registers, this can cause situations where floating point arguments are assigned to registers after integer arguments that were assigned to the stack. Use the inreg flag to indicate 32-bit fragments of structs containing both float and int members. The three-way shadowing between stack, integer, and floating point registers requires custom argument lowering. The good news is that return values are passed in exactly the same way, and we can share the code. Still missing:
- Update LowerReturn to handle structs returned in registers.
- LowerCall.
- Variadic functions.
llvm-svn: 178958
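A hypothetical example of the shadowing described above:

```cpp
// Hypothetical example: on SPARC64, the fragments of this struct are
// formally assigned consecutive stack positions and then promoted; 'f'
// goes to a floating-point register while 'i' goes to an integer
// register, so both register files shadow the same stack slots, and a
// later integer argument can land on the stack while FP registers are
// still free.
struct Mixed {
  int i;   // 32-bit fragment, promoted to an integer register
  float f; // 32-bit fragment (inreg), promoted to an FP register
};

double consume(Mixed m, double d0, double d1, int tail);
```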
-
Tom Stellard authored
v2:
- Use the ADDR64 bit
Reviewed-by: Christian König <christian.koenig@amd.com> llvm-svn: 178931
-
Tom Stellard authored
The code emitter knows how to encode operands whose name matches one of the encoding fields. If there is no match, the code emitter relies on the order of the operand and field definitions to determine how operands should be encoded. Matching by order makes it easy to accidentally break the instruction encodings, so we prefer to match by name. Reviewed-by: Christian König <christian.koenig@amd.com> llvm-svn: 178930
-
Tom Stellard authored
This is an R600 GPU with double support. Reviewed-by: Christian König <christian.koenig@amd.com> llvm-svn: 178929
-
Tom Stellard authored
Reviewed-by: Christian König <christian.koenig@amd.com> llvm-svn: 178928
-
Tom Stellard authored
SITargetLowering::analyzeImmediate() was converting the 64-bit values to 32-bit and then checking if they were an inline immediate. Some of these conversions caused this check to succeed and produced S_MOV instructions with 64-bit immediates, which are illegal.
v2:
- Clean up logic
Reviewed-by: Christian König <christian.koenig@amd.com> llvm-svn: 178927
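A plain C++ sketch of the bug and the fix (illustrative; the real check lives in SITargetLowering::analyzeImmediate() and also covers the floating-point inline constants):

```cpp
#include <cstdint>

// SI integer inline immediates cover only a small range.
bool isIntegerInlineImmediate(int64_t Imm) {
  return Imm >= -16 && Imm <= 64;
}

// Buggy: truncating first makes e.g. 0x100000040 look like the inline
// immediate 64, producing an illegal S_MOV with a 64-bit immediate.
bool buggyCheck(int64_t Imm) {
  return isIntegerInlineImmediate(static_cast<int32_t>(Imm));
}

// Fixed: check the full 64-bit value.
bool fixedCheck(int64_t Imm) {
  return isIntegerInlineImmediate(Imm);
}
```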
-
Hal Finkel authored
On cores for which we know the misprediction penalty and which have the isel instruction, we can profitably perform early if-conversion. This enables us to replace some small branch sequences with selects and avoid the potential stalls from mispredicting the branches. Enabling this feature required implementing canInsertSelect and insertSelect in PPCInstrInfo; the isel code in PPCISelLowering was refactored to use these functions as well. llvm-svn: 178926
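The kind of branch sequence this targets, with the resulting select noted as illustrative pseudo-assembly:

```cpp
// With early if-conversion, this short, possibly mispredicted branch can
// be replaced by a PPC isel (integer select) on cores that have it.
int clamp(int v, int limit) {
  if (v > limit) // small branch sequence ...
    v = limit;   // ... becomes roughly: isel v, limit, v, gt
  return v;
}
```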
-
Hal Finkel authored
The manual states that there is a minimum of 13 cycles from when the mispredicted branch is issued to when the correct branch target is issued. llvm-svn: 178925
-
- Apr 05, 2013
-
Bill Wendling authored
During LTO, the target options on functions within the same Module may change. This would necessitate resetting some of the back-end. Do this for X86, because it's a Friday afternoon. llvm-svn: 178917
-
Renato Golin authored
llvm-svn: 178883
-
Chad Rosier authored
memory operands. Essentially, this layers an infix calculator on top of the parsing state machine. The scale on the index register is still expected to be an immediate:

  __asm mov eax, [eax + ebx*4]

and will not work with more complex expressions, for example:

  __asm mov eax, [eax + ebx*(2*2)]

The plus and minus binary operators assume the numeric value of a register is zero, so as not to change the displacement. Register operands should never be an operand of a multiply or divide operation; the scale*indexreg expression is always replaced with a zero on the operand stack to prevent such a case. rdar://13521380 llvm-svn: 178881
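A toy model of the infix-calculator idea (illustrative only; the real logic lives in the X86 asm parser):

```cpp
#include <cstdint>
#include <vector>

// Left-to-right evaluation of an already-tokenized expression. Registers
// inside + / - are pushed as 0 so they do not perturb the computed
// displacement, mirroring the rule described above; the real parser also
// records the base/index registers and handles operator precedence.
struct DispCalc {
  std::vector<int64_t> Vals;
  std::vector<char> Ops;

  void pushImm(int64_t V) { Vals.push_back(V); }
  void pushReg() { Vals.push_back(0); } // a register contributes 0
  void pushOp(char Op) { Ops.push_back(Op); }

  int64_t displacement() const {
    if (Vals.empty()) return 0;
    int64_t Acc = Vals[0];
    for (size_t I = 0; I < Ops.size(); ++I) {
      switch (Ops[I]) {
      case '+': Acc += Vals[I + 1]; break;
      case '-': Acc -= Vals[I + 1]; break;
      case '*': Acc *= Vals[I + 1]; break;
      case '/': Acc /= Vals[I + 1]; break;
      }
    }
    return Acc;
  }
};
```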
-
Stepan Dyatkovskiy authored
llvm-svn: 178854
-
Stepan Dyatkovskiy authored
Fix for PR14824: "Optimization arm_ldst_opt inserts newly generated instruction vldmia at incorrect position". The patch introduces memory operand tracking in ARMLoadStoreOpt::LoadStoreMultipleOpti: for each register it keeps the order of load operations as it was before the optimization pass. This is a deeper version of the fix proposed by Hao (http://llvm.org/bugs/show_bug.cgi?id=14824#c4), and it also tracks conflicts between different register classes (e.g. D2 and S5). For more details see:
Bug description: http://llvm.org/bugs/show_bug.cgi?id=14824
LLVM Commits discussion:
http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20130311/167936.html
http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20130318/168688.html
http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20130325/169376.html
http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20130401/170238.html
llvm-svn: 178851
-
Hal Finkel authored
llvm-svn: 178850
-
Hal Finkel authored
llvm-svn: 178848
-
Arnold Schwaighofer authored
descriptions for compares llvm-svn: 178844
-
Arnold Schwaighofer authored
llvm-svn: 178842
-
Arnold Schwaighofer authored
SSE2 has efficient support for shifts by a scalar. My previous change, which made shifts expensive, did not take this into account and marked all shifts as expensive. This would prevent vectorization where it is actually beneficial. With this change we differentiate between shifts by constants and other shifts. radar://13576547 llvm-svn: 178808
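For example (the mnemonics in the comments are the usual SSE2/AVX2 instruction names):

```cpp
// SSE2 shifts every lane by the same amount, so a uniform shift amount is
// cheap; a per-element amount has no direct SSE2 instruction and stays
// expensive until AVX2's variable shifts.
void shiftUniform(int *w, const int *v, int n) {
  for (int i = 0; i < n; ++i)
    w[i] = v[i] << 2;    // uniform constant amount: maps to pslld
}

void shiftVariable(int *w, const int *v, const int *s, int n) {
  for (int i = 0; i < n; ++i)
    w[i] = v[i] << s[i]; // per-element amount: scalarized pre-AVX2
}
```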
-
Arnold Schwaighofer authored
On certain architectures we can support an efficient vectorized version of an instruction if the operand value is uniform (splat) or a constant scalar. An example of this is a vector shift on x86. We can efficiently support

  for (i = 0; i < n; i += 4)
    w[0:3] = v[0:3] << <2, 2, 2, 2>

but not

  for (i = 0; i < n; i += 4)
    w[0:3] = v[0:3] << x[0:3]

This patch adds a parameter to getArithmeticInstrCost to further qualify operand values as uniform or uniform constant. Targets can then choose to return a different cost for instructions with such operand values. A follow-up commit will test this feature on x86. radar://13576547 llvm-svn: 178807
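A sketch of how a target might use the new operand-kind parameter; the enum paraphrases the TargetTransformInfo interface of this period (consult include/llvm/Analysis/TargetTransformInfo.h for the authoritative declaration), and the cost values are made up for the example:

```cpp
enum OperandValueKind {
  OK_AnyValue,            // operand can be any value
  OK_UniformValue,        // splat of a scalar that may vary at runtime
  OK_UniformConstantValue // splat of a compile-time constant scalar
};

// Illustrative target hook: a vector shift is cheap when the amount is
// uniform, cheaper still when it is a uniform constant, and expensive
// (scalarized) otherwise.
unsigned vectorShiftCost(OperandValueKind AmountKind) {
  switch (AmountKind) {
  case OK_UniformConstantValue: return 1;
  case OK_UniformValue:         return 2;
  case OK_AnyValue:             return 10;
  }
  return 10;
}
```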
-
Hal Finkel authored
BCL is normally a conditional branch-and-link instruction, but has an unconditional form (which is used in the SjLj code, for example). To make clear that this BCL instruction definition is specifically the special unconditional form (which does not meaningfully take a condition-register input), rename it to BCLalways. No functionality change intended. llvm-svn: 178803
-
Hal Finkel authored
The DAGCombine logic that recognized a/sqrt(b) and transformed it into a multiplication by the reciprocal sqrt did not handle cases where the sqrt and the division were separated by an fpext or fptrunc. llvm-svn: 178801
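The kind of code now covered, under the fast-math conditions the combine already required:

```cpp
#include <cmath>

// a / sqrt(b) with an fptrunc between the sqrt and the division; with
// fast-math enabled this can now become a multiply by the reciprocal
// square root as well.
float scale(float a, double b) {
  return a / static_cast<float>(std::sqrt(b));
}
```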
-
- Apr 04, 2013
-
Jyotsna Verma authored
It fixes the following tests for Hexagon:
CodeGen/Generic/2003-07-29-BadConstSbyte.ll
CodeGen/Generic/2005-10-21-longlonggtu.ll
CodeGen/Generic/2009-04-28-i128-cmp-crash.ll
CodeGen/Generic/MachineBranchProb.ll
CodeGen/Generic/builtin-expect.ll
CodeGen/Generic/pr12507.ll
llvm-svn: 178794
-
Richard Osborne authored
llvm-svn: 178783
-
Richard Osborne authored
At the time the XCore backend was added there were some issues with overlapping register classes, but these all seem to be fixed now. Describing the register classes correctly allows us to get rid of a codegen-only instruction (LDAWSP_lru6_RRegs) and means we can disassemble ru6 instructions that use registers above r11. llvm-svn: 178782
-
Jakob Stoklund Olesen authored
The Thumb2SizeReduction pass avoids false CPSR dependencies, except it still aggressively creates tMOVi8 instructions because they are so common. Avoid creating false CPSR dependencies even for tMOVi8 instructions when the CPSR flags are known to have high latency. This allows integer computation to overlap floating point computations. Also process blocks in reverse post-order and propagate high-latency flags to successors. <rdar://problem/13468102> llvm-svn: 178773
-
Vincent Lejeune authored
llvm-svn: 178763
-
Vincent Lejeune authored
llvm-svn: 178762
-
Vincent Lejeune authored
llvm-svn: 178761
-
Jakob Stoklund Olesen authored
This requires v9 cmov instructions using the %xcc flags instead of the %icc flags. Still missing:
- Select floats on %xcc flags.
- Select i64 on %fcc flags.
llvm-svn: 178737
-
- Apr 03, 2013
-
Arnold Schwaighofer authored
The default logic does not correctly identify costs of casts because they are marked as custom on x86. For some cases where the shift amount is a scalar, we would be able to generate better code. Unfortunately, when this is the case the value (the splat) will get hoisted out of the loop, thereby making it invisible to ISel. radar://13130673 radar://13537826 llvm-svn: 178703
-
Vincent Lejeune authored
llvm-svn: 178675
-
Hal Finkel authored
Incorporating review feedback from Bill Schmidt on r178617. No functionality change intended. llvm-svn: 178672
-
Vincent Lejeune authored
llvm-svn: 178667
-
Vincent Lejeune authored
llvm-svn: 178665
-
Vincent Lejeune authored
Mesa does not override llvm behavior wrt KILLGT anymore, so llvm has to handle KILLGT on its own. llvm-svn: 178664
-
Hal Finkel authored
I discussed this with Bill Schmidt on IRC, and it was decided that this is a safe and reasonable default. llvm-svn: 178659
-
Hal Finkel authored
llvm-svn: 178658
-