- Apr 06, 2013
-
Jakob Stoklund Olesen authored
64-bit SPARC v9 processes use biased stack and frame pointers, so the current function's stack frame is located at %sp+BIAS .. %fp+BIAS where BIAS = 2047. This makes more local variables directly accessible via [%fp+simm13] addressing. llvm-svn: 178965
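For illustration, a minimal sketch of the bias arithmetic; the constant and helper names are illustrative, not taken from the LLVM Sparc backend:

```cpp
// Sketch of the v9 stack-bias arithmetic. On SPARC v9, %sp and %fp hold
// the true address minus 2047, so every frame access must add the bias back.
#include <cassert>
#include <cstdint>

constexpr int64_t StackBias = 2047;

// Immediate for a [%fp + simm13] access to a local at byte offset
// FrameOffset relative to the unbiased frame pointer.
int64_t biasedFrameOffset(int64_t FrameOffset) {
  int64_t Imm = FrameOffset + StackBias;
  assert(Imm >= -4096 && Imm <= 4095 && "offset must fit in simm13");
  return Imm;
}
```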
-
Hal Finkel authored
There are certain PPC instructions into which we can fold a zero immediate operand. We can detect such cases by looking at the register class required by the using operand (so long as it is not otherwise constrained). llvm-svn: 178961
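As a rough illustration of the check (toy types only, not the real PPCInstrInfo interfaces):

```cpp
// Toy model of the folding test: a zero immediate can be folded only if
// the using operand's register class still admits the register that
// hardware reads as the constant 0, and the operand is not otherwise
// constrained (e.g. tied to another operand).
enum ToyRegClass { GPR_WithZero, GPR_NoZero };

struct ToyUseOperand {
  ToyRegClass RequiredClass;   // class required by the using instruction
  bool OtherwiseConstrained;
};

bool canFoldZeroImmediate(const ToyUseOperand &Op) {
  return Op.RequiredClass == GPR_WithZero && !Op.OtherwiseConstrained;
}
```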
-
Hal Finkel authored
llvm-svn: 178960
-
Jakob Stoklund Olesen authored
All arguments are formally assigned to stack positions and then promoted to floating point and integer registers. Since there are more floating point registers than integer registers, this can cause situations where floating point arguments are assigned to registers after integer arguments that were assigned to the stack. Use the inreg flag to indicate 32-bit fragments of structs containing both float and int members. The three-way shadowing between stack, integer, and floating point registers requires custom argument lowering. The good news is that return values are passed in exactly the same way, and we can share the code. Still missing:
- Update LowerReturn to handle structs returned in registers.
- LowerCall.
- Variadic functions.
llvm-svn: 178958
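A hypothetical example of the shadowing described above:

```cpp
// Hypothetical example: on SPARC64, the fragments of this struct are
// formally assigned consecutive stack positions and then promoted; 'f'
// goes to a floating-point register while 'i' goes to an integer
// register, so both register files shadow the same stack slots, and a
// later integer argument can land on the stack while FP registers are
// still free.
struct Mixed {
  int i;   // 32-bit fragment, promoted to an integer register
  float f; // 32-bit fragment (inreg), promoted to an FP register
};

double consume(Mixed m, double d0, double d1, int tail);
```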
-
Tom Stellard authored
v2:
- Use the ADDR64 bit
Reviewed-by: Christian König <christian.koenig@amd.com> llvm-svn: 178931
-
Tom Stellard authored
The code emitter knows how to encode operands whose name matches one of the encoding fields. If there is no match, the code emitter relies on the order of the operand and field definitions to determine how operands should be encoded. Matching by order makes it easy to accidentally break the instruction encodings, so we prefer to match by name. Reviewed-by: Christian König <christian.koenig@amd.com> llvm-svn: 178930
-
Tom Stellard authored
This is an R600 GPU with double support. Reviewed-by: Christian König <christian.koenig@amd.com> llvm-svn: 178929
-
Tom Stellard authored
Reviewed-by: Christian König <christian.koenig@amd.com> llvm-svn: 178928
-
Tom Stellard authored
SITargetLowering::analyzeImmediate() was converting the 64-bit values to 32-bit and then checking if they were an inline immediate. Some of these conversions caused this check to succeed and produced S_MOV instructions with 64-bit immediates, which are illegal.
v2:
- Clean up logic
Reviewed-by: Christian König <christian.koenig@amd.com> llvm-svn: 178927
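A plain C++ sketch of the bug and the fix (illustrative; the real check lives in SITargetLowering::analyzeImmediate() and also covers the floating-point inline constants):

```cpp
#include <cstdint>

// SI integer inline immediates cover only a small range.
bool isIntegerInlineImmediate(int64_t Imm) {
  return Imm >= -16 && Imm <= 64;
}

// Buggy: truncating first makes e.g. 0x100000040 look like the inline
// immediate 64, producing an illegal S_MOV with a 64-bit immediate.
bool buggyCheck(int64_t Imm) {
  return isIntegerInlineImmediate(static_cast<int32_t>(Imm));
}

// Fixed: check the full 64-bit value.
bool fixedCheck(int64_t Imm) {
  return isIntegerInlineImmediate(Imm);
}
```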
-
Hal Finkel authored
On cores for which we know the misprediction penalty and which have the isel instruction, we can profitably perform early if-conversion. This enables us to replace some small branch sequences with selects and avoid the potential stalls from mispredicting the branches. Enabling this feature required implementing canInsertSelect and insertSelect in PPCInstrInfo; the isel code in PPCISelLowering was refactored to use these functions as well. llvm-svn: 178926
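The kind of branch sequence this targets, with the resulting select noted as illustrative pseudo-assembly:

```cpp
// With early if-conversion, this short, possibly mispredicted branch can
// be replaced by a PPC isel (integer select) on cores that have it.
int clamp(int v, int limit) {
  if (v > limit) // small branch sequence ...
    v = limit;   // ... becomes roughly: isel v, limit, v, gt
  return v;
}
```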
-
Hal Finkel authored
The manual states that there is a minimum of 13 cycles from when the mispredicted branch is issued to when the correct branch target is issued. llvm-svn: 178925
-
- Apr 05, 2013
-
Bill Wendling authored
During LTO, the target options on functions within the same Module may change. This would necessitate resetting some of the back-end. Do this for X86, because it's a Friday afternoon. llvm-svn: 178917
-
Renato Golin authored
llvm-svn: 178883
-
Chad Rosier authored
memory operands. Essentially, this layers an infix calculator on top of the parsing state machine. The scale on the index register is still expected to be an immediate:

  __asm mov eax, [eax + ebx*4]

and will not work with more complex expressions, for example:

  __asm mov eax, [eax + ebx*(2*2)]

The plus and minus binary operators assume the numeric value of a register is zero, so as not to change the displacement. Register operands should never be an operand of a multiply or divide operation; the scale*indexreg expression is always replaced with a zero on the operand stack to prevent such a case. rdar://13521380 llvm-svn: 178881
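A toy model of the infix-calculator idea (illustrative only; the real logic lives in the X86 asm parser):

```cpp
#include <cstdint>
#include <vector>

// Left-to-right evaluation of an already-tokenized expression. Registers
// inside + / - are pushed as 0 so they do not perturb the computed
// displacement, mirroring the rule described above; the real parser also
// records the base/index registers and handles operator precedence.
struct DispCalc {
  std::vector<int64_t> Vals;
  std::vector<char> Ops;

  void pushImm(int64_t V) { Vals.push_back(V); }
  void pushReg() { Vals.push_back(0); } // a register contributes 0
  void pushOp(char Op) { Ops.push_back(Op); }

  int64_t displacement() const {
    if (Vals.empty()) return 0;
    int64_t Acc = Vals[0];
    for (size_t I = 0; I < Ops.size(); ++I) {
      switch (Ops[I]) {
      case '+': Acc += Vals[I + 1]; break;
      case '-': Acc -= Vals[I + 1]; break;
      case '*': Acc *= Vals[I + 1]; break;
      case '/': Acc /= Vals[I + 1]; break;
      }
    }
    return Acc;
  }
};
```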
-
Stepan Dyatkovskiy authored
llvm-svn: 178854
-
Stepan Dyatkovskiy authored
Fix for PR14824: "Optimization arm_ldst_opt inserts newly generated instruction vldmia at incorrect position". The patch introduces memory operand tracking in ARMLoadStoreOpt::LoadStoreMultipleOpti: for each register it keeps the order of load operations as it was before the optimization pass. This is a deeper version of the fix proposed by Hao (http://llvm.org/bugs/show_bug.cgi?id=14824#c4), and it also tracks conflicts between different register classes (e.g. D2 and S5). For more details see:
Bug description: http://llvm.org/bugs/show_bug.cgi?id=14824
LLVM Commits discussion:
http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20130311/167936.html
http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20130318/168688.html
http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20130325/169376.html
http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20130401/170238.html
llvm-svn: 178851
-
Hal Finkel authored
llvm-svn: 178850
-
Hal Finkel authored
llvm-svn: 178848
-
Arnold Schwaighofer authored
descriptions for compares llvm-svn: 178844
-
Arnold Schwaighofer authored
llvm-svn: 178842
-
Arnold Schwaighofer authored
SSE2 has efficient support for shifts by a scalar. My previous change, which made shifts expensive, did not take this into account and marked all shifts as expensive. This would prevent vectorization where it is actually beneficial. With this change we differentiate between shifts by constants and other shifts. radar://13576547 llvm-svn: 178808
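For example (the mnemonics in the comments are the usual SSE2/AVX2 instruction names):

```cpp
// SSE2 shifts every lane by the same amount, so a uniform shift amount is
// cheap; a per-element amount has no direct SSE2 instruction and stays
// expensive until AVX2's variable shifts.
void shiftUniform(int *w, const int *v, int n) {
  for (int i = 0; i < n; ++i)
    w[i] = v[i] << 2;    // uniform constant amount: maps to pslld
}

void shiftVariable(int *w, const int *v, const int *s, int n) {
  for (int i = 0; i < n; ++i)
    w[i] = v[i] << s[i]; // per-element amount: scalarized pre-AVX2
}
```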
-
Arnold Schwaighofer authored
On certain architectures we can support an efficient vectorized version of an instruction if the operand value is uniform (splat) or a constant scalar. An example of this is a vector shift on x86. We can efficiently support

  for (i = 0; i < n; i += 4)
    w[0:3] = v[0:3] << <2, 2, 2, 2>

but not

  for (i = 0; i < n; i += 4)
    w[0:3] = v[0:3] << x[0:3]

This patch adds a parameter to getArithmeticInstrCost to further qualify operand values as uniform or uniform constant. Targets can then choose to return a different cost for instructions with such operand values. A follow-up commit will test this feature on x86. radar://13576547 llvm-svn: 178807
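A sketch of how a target might use the new operand-kind parameter; the enum paraphrases the TargetTransformInfo interface of this period (consult include/llvm/Analysis/TargetTransformInfo.h for the authoritative declaration), and the cost values are made up for the example:

```cpp
enum OperandValueKind {
  OK_AnyValue,            // operand can be any value
  OK_UniformValue,        // splat of a scalar that may vary at runtime
  OK_UniformConstantValue // splat of a compile-time constant scalar
};

// Illustrative target hook: a vector shift is cheap when the amount is
// uniform, cheaper still when it is a uniform constant, and expensive
// (scalarized) otherwise.
unsigned vectorShiftCost(OperandValueKind AmountKind) {
  switch (AmountKind) {
  case OK_UniformConstantValue: return 1;
  case OK_UniformValue:         return 2;
  case OK_AnyValue:             return 10;
  }
  return 10;
}
```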
-
Hal Finkel authored
BCL is normally a conditional branch-and-link instruction, but has an unconditional form (which is used in the SjLj code, for example). To make clear that this BCL instruction definition is specifically the special unconditional form (which does not meaningfully take a condition-register input), rename it to BCLalways. No functionality change intended. llvm-svn: 178803
-
Hal Finkel authored
The DAGCombine logic that recognized a/sqrt(b) and transformed it into a multiplication by the reciprocal sqrt did not handle cases where the sqrt and the division were separated by an fpext or fptrunc. llvm-svn: 178801
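The kind of code now covered, under the fast-math conditions the combine already required:

```cpp
#include <cmath>

// a / sqrt(b) with an fptrunc between the sqrt and the division; with
// fast-math enabled this can now become a multiply by the reciprocal
// square root as well.
float scale(float a, double b) {
  return a / static_cast<float>(std::sqrt(b));
}
```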
-
- Apr 04, 2013
-
Jyotsna Verma authored
It fixes the following tests for Hexagon:
CodeGen/Generic/2003-07-29-BadConstSbyte.ll
CodeGen/Generic/2005-10-21-longlonggtu.ll
CodeGen/Generic/2009-04-28-i128-cmp-crash.ll
CodeGen/Generic/MachineBranchProb.ll
CodeGen/Generic/builtin-expect.ll
CodeGen/Generic/pr12507.ll
llvm-svn: 178794
-
Richard Osborne authored
llvm-svn: 178783
-
Richard Osborne authored
At the time the XCore backend was added there were some issues with overlapping register classes, but these all seem to be fixed now. Describing the register classes correctly allows us to get rid of a codegen-only instruction (LDAWSP_lru6_RRegs) and means we can disassemble ru6 instructions that use registers above r11. llvm-svn: 178782
-
Jakob Stoklund Olesen authored
The Thumb2SizeReduction pass avoids false CPSR dependencies, except it still aggressively creates tMOVi8 instructions because they are so common. Avoid creating false CPSR dependencies even for tMOVi8 instructions when the CPSR flags are known to have high latency. This allows integer computation to overlap floating point computations. Also process blocks in reverse post-order and propagate high-latency flags to successors. <rdar://problem/13468102> llvm-svn: 178773
-
Vincent Lejeune authored
llvm-svn: 178763
-
Vincent Lejeune authored
llvm-svn: 178762
-
Vincent Lejeune authored
llvm-svn: 178761
-
Jakob Stoklund Olesen authored
This requires v9 cmov instructions using the %xcc flags instead of the %icc flags. Still missing:
- Select floats on %xcc flags.
- Select i64 on %fcc flags.
llvm-svn: 178737
-
- Apr 03, 2013
-
Arnold Schwaighofer authored
The default logic does not correctly identify costs of casts because they are marked as custom on x86. For some cases where the shift amount is a scalar, we would be able to generate better code. Unfortunately, when this is the case the value (the splat) will get hoisted out of the loop, thereby making it invisible to ISel. radar://13130673 radar://13537826 llvm-svn: 178703
-
Vincent Lejeune authored
llvm-svn: 178675
-
Hal Finkel authored
Incorporating review feedback from Bill Schmidt on r178617. No functionality change intended. llvm-svn: 178672
-
Vincent Lejeune authored
llvm-svn: 178667
-
Vincent Lejeune authored
llvm-svn: 178665
-
Vincent Lejeune authored
Mesa does not override llvm behavior wrt KILLGT anymore, so llvm has to handle KILLGT on its own. llvm-svn: 178664
-
Hal Finkel authored
I discussed this with Bill Schmidt on IRC, and it was decided that this is a safe and reasonable default. llvm-svn: 178659
-
Hal Finkel authored
llvm-svn: 178658
-