Commits · 419d4917473564dad3e8ca660bbd79c32a6322c2 · Roger Ferrer / llvm-epi-0.8

Apr 05, 2013

MachineScheduler: format DEBUG output. · 419d4917

Andrew Trick authored Apr 05, 2013

I'm getting more serious about tuning and enabling on x86/ARM. Start
by making the trace readable.

llvm-svn: 178821

419d4917

LoopVectorizer: Pass OperandValueKind information to the cost model · df6f67ed

Arnold Schwaighofer authored Apr 04, 2013

Pass down the fact that an operand is going to be a vector of constants.

This should bring the performance of MultiSource/Benchmarks/PAQ8p/paq8p on x86
back. It had degraded to scalar performance due to my previous shift cost change
that made all shifts expensive on x86.

radar://13576547

llvm-svn: 178809

df6f67ed

X86 cost model: Differentiate cost for vector shifts of constants · 44f902ed

Arnold Schwaighofer authored Apr 04, 2013

SSE2 has efficient support for shifts by a scalar. My previous change of making
shifts expensive did not take this into account, marking all shifts as expensive.
This would prevent vectorization from happening where it is actually beneficial.

With this change we differentiate between shifts of constants and other shifts.

radar://13576547

llvm-svn: 178808

44f902ed

CostModel: Add parameter to instruction cost to further classify operand values · b9773871

Arnold Schwaighofer authored Apr 04, 2013

On certain architectures we can support efficient vectorized versions of
instructions if the operand value is uniform (splat) or a constant scalar.
An example of this is a vector shift on x86.

We can efficiently support

for (i = 0 ; i < ; i += 4)
  w[0:3] = v[0:3] << <2, 2, 2, 2>

but not

for (i = 0; i < ; i += 4)
  w[0:3] = v[0:3] << x[0:3]

This patch adds a parameter to getArithmeticInstrCost to further qualify operand
values as uniform or uniform constant.

Targets can then choose to return a different cost for instructions with such
operand values.

A follow-up commit will test this feature on x86.

radar://13576547

llvm-svn: 178807

b9773871
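
As a rough illustration of the idea in b9773871, the sketch below shows a cost hook that takes an extra operand-kind argument. The names `OperandKind` and `vectorShiftCost` and the cost numbers are hypothetical and chosen only for this example; they are not the signature or values of `getArithmeticInstrCost`.

```cpp
#include <cassert>

// Hypothetical classification of an operand value, mirroring the
// "uniform" / "uniform constant" distinction described in the commit.
enum class OperandKind {
  AnyValue,        // nothing is known about the operand
  Uniform,         // same value in every lane (splat)
  UniformConstant  // splat of a compile-time constant
};

// Toy cost hook for a vector shift. The numbers are illustrative only.
inline unsigned vectorShiftCost(unsigned NumElts, OperandKind ShiftAmountKind) {
  assert(NumElts > 0 && "vector must have at least one element");
  if (ShiftAmountKind == OperandKind::UniformConstant)
    return 1;            // e.g. w[0:3] = v[0:3] << <2, 2, 2, 2>
  // In this sketch every other operand kind gets the same pessimistic
  // cost that grows with the number of lanes, e.g. v[0:3] << x[0:3].
  return 2 * NumElts;
}
```

A caller such as a vectorizer would classify the shift amount first and then pass the kind along, so a target is free to return a lower cost for the constant-splat case.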

Debug Info: revert 178722 for now. · bdcb4464

Manman Ren authored Apr 04, 2013

There is a difference for FORM_ref_addr between DWARF 2 and DWARF 3+.
Since Eric is against guarding DWARF 2 ref_addr with DarwinGDBCompat, we are
still in discussion on how to handle this.

The correct solution is to update our header to say version 4 instead of version
2 and update tool chains as well.

rdar://problem/13559431

llvm-svn: 178806

bdcb4464
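
The version difference mentioned in bdcb4464 can be sketched as a small size function. In DWARF 2, `DW_FORM_ref_addr` is as wide as a target address; from DWARF 3 onward it is as wide as a section offset (4 bytes in the 32-bit DWARF format, 8 bytes in the 64-bit format). The function name `refAddrSize` and its parameters are hypothetical and are not taken from LLVM.

```cpp
#include <cassert>

// Size in bytes of a DW_FORM_ref_addr value.
//   DwarfVersion : DWARF version stated in the compile unit header
//   AddrSize     : size of an address on the target, in bytes
//   Is64BitDwarf : true for the 64-bit DWARF format (DWARF 3 and later)
inline unsigned refAddrSize(unsigned DwarfVersion, unsigned AddrSize,
                            bool Is64BitDwarf) {
  assert(DwarfVersion >= 2 && "DWARF 1 is not handled in this sketch");
  if (DwarfVersion == 2)
    return AddrSize;            // DWARF 2: same size as a target address
  return Is64BitDwarf ? 8 : 4;  // DWARF 3+: same size as a section offset
}
```

On a 64-bit target this gives 8 bytes under a version 2 header but 4 bytes under a version 4 header in the 32-bit format, which is why the header version and the emitted size have to agree.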

typo · 322f41d0
Adrian Prantl authored Apr 04, 2013
```
llvm-svn: 178804
```
322f41d0

Rename the current PPC BCL definition to BCLalways · e5680b3c

Hal Finkel authored Apr 04, 2013

BCL is normally a conditional branch-and-link instruction, but has
an unconditional form (which is used in the SjLj code, for example).
To make clear that this BCL instruction definition is specifically
the special unconditional form (which does not meaningfully take
a condition-register input), rename it to BCLalways.

No functionality change intended.

llvm-svn: 178803

e5680b3c

PPC: Improve code generation for mixed-precision reciprocal sqrt · f96c18e3

Hal Finkel authored Apr 04, 2013

The DAGCombine logic that recognized a/sqrt(b) and transformed it into
a multiplication by the reciprocal sqrt did not handle cases where the
sqrt and the division were separated by an fpext or fptrunc.

llvm-svn: 178801

f96c18e3
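
The shape of the mixed-precision case in f96c18e3 can be shown at source level. This is only a sketch of the algebra: `rsqrtF32` is a hypothetical stand-in that computes an exact reciprocal square root rather than a hardware estimate, and the conversion to `double` plays the role of the `fpext` that sits between the square root and the division.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical stand-in for a single-precision reciprocal square root.
inline float rsqrtF32(float B) {
  assert(B > 0.0f && "sketch only handles positive inputs");
  return 1.0f / std::sqrt(B);
}

// Original form: a double divided by an extended single-precision sqrt.
inline double divBySqrt(double A, float B) {
  return A / static_cast<double>(std::sqrt(B));
}

// Rewritten form: multiply by the extended reciprocal sqrt instead.
inline double mulByRsqrt(double A, float B) {
  return A * static_cast<double>(rsqrtF32(B));
}
```

The two forms agree only up to rounding, so a rewrite like this is normally acceptable only when relaxed floating-point semantics are permitted.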

Apr 04, 2013

Hexagon: Expand br_cc. · a929ab58

Jyotsna Verma authored Apr 04, 2013

This fixes the following tests for Hexagon:

CodeGen/Generic/2003-07-29-BadConstSbyte.ll
CodeGen/Generic/2005-10-21-longlonggtu.ll
CodeGen/Generic/2009-04-28-i128-cmp-crash.ll
CodeGen/Generic/MachineBranchProb.ll
CodeGen/Generic/builtin-expect.ll
CodeGen/Generic/pr12507.ll

llvm-svn: 178794

a929ab58

Reassociate: Avoid iterator invalidation. · dd67654a

Benjamin Kramer authored Apr 04, 2013

OpndPtrs stored pointers into the Opnd vector that became invalid when the
vector grew. Store indices instead. Sadly I only have a large testcase that
only triggers under valgrind, so I didn't include it.

llvm-svn: 178793

dd67654a
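
The hazard fixed in dd67654a is a general property of `std::vector`: growing the vector may reallocate its storage, after which pointers to elements dangle, while indices remain valid. A minimal sketch, with a hypothetical helper name that is not part of Reassociate:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Reads element Idx after the vector has grown by one element.
inline int readAfterGrowth(std::vector<int> &Opnds, std::size_t Idx, int Extra) {
  assert(Idx < Opnds.size() && "index must refer to an existing element");
  // A pointer such as &Opnds[Idx] taken at this point could dangle after
  // the push_back below if the vector reallocates. The index does not.
  Opnds.push_back(Extra);
  return Opnds[Idx];
}
```

Because the failure depends on whether a particular `push_back` triggers a reallocation, bugs of this kind tend to show up only under tools such as valgrind, which matches the note about the missing testcase.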

[XCore] Add bru instruction. · 0c12d185
Richard Osborne authored Apr 04, 2013
```
llvm-svn: 178783
```
0c12d185

[XCore] The RRegs register class is a superset of GRRegs. · f18d95f7

Richard Osborne authored Apr 04, 2013

At the time when the XCore backend was added there were some issues
with overlapping register classes but these all seem to be fixed now.
Describing the register classes correctly allows us to get rid of a
codegen only instruction (LDAWSP_lru6_RRegs) and it means we can
disassemble ru6 instructions that use registers above r11.

llvm-svn: 178782

f18d95f7

Avoid high-latency false CPSR dependencies even for tMOVSi. · 299475e0

Jakob Stoklund Olesen authored Apr 04, 2013

The Thumb2SizeReduction pass avoids false CPSR dependencies, except it
still aggressively creates tMOVi8 instructions because they are so
common.

Avoid creating false CPSR dependencies even for tMOVi8 instructions when
the CPSR flags are known to have high latency. This allows integer
computation to overlap floating point computations.

Also process blocks in a reverse post-order and propagate high-latency
flags to successors.

<rdar://problem/13468102>

llvm-svn: 178773

299475e0
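
The traversal order mentioned in 299475e0 can be sketched on a toy control-flow graph. Everything here is hypothetical: the graph is a plain adjacency list, and the "high latency" state is reduced to a single boolean per block that is never cleared. A reverse post-order visits each block before its successors (back edges aside), so one forward sweep can hand the flag to successors.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

using Graph = std::vector<std::vector<int>>;  // successors of each block

inline void postOrder(const Graph &G, int N, std::vector<bool> &Seen,
                      std::vector<int> &Out) {
  Seen[N] = true;
  for (int S : G[N])
    if (!Seen[S])
      postOrder(G, S, Seen, Out);
  Out.push_back(N);  // emitted after all reachable successors
}

inline std::vector<int> reversePostOrder(const Graph &G, int Entry) {
  assert(Entry >= 0 && Entry < static_cast<int>(G.size()));
  std::vector<bool> Seen(G.size(), false);
  std::vector<int> Order;
  postOrder(G, Entry, Seen, Order);
  std::reverse(Order.begin(), Order.end());
  return Order;
}

// Defines[B] is true if block B ends with high-latency flags (toy input).
// Returns, per block, whether it is entered with high-latency flags.
inline std::vector<bool> propagateHighLatency(const Graph &G, int Entry,
                                              const std::vector<bool> &Defines) {
  std::vector<bool> LiveIn(G.size(), false);
  for (int B : reversePostOrder(G, Entry)) {
    bool Out = Defines[B] || LiveIn[B];
    if (Out)
      for (int S : G[B])
        LiveIn[S] = true;
  }
  return LiveIn;
}
```

A real pass would also have to clear the state when a block redefines the flags with a low-latency instruction; this sketch leaves that out.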

Formatting · fc186358
Eli Bendersky authored Apr 04, 2013
```
llvm-svn: 178771
```
fc186358

R600: Use a mask for offsets when encoding instructions · bcbb13d6
Vincent Lejeune authored Apr 04, 2013
```
llvm-svn: 178763
```
bcbb13d6

R600: Fix wrong address when substituting ENDIF · 8e377fdb
Vincent Lejeune authored Apr 04, 2013
```
llvm-svn: 178762
```
8e377fdb

R600: Take export into account when computing cf address · c44fa997
Vincent Lejeune authored Apr 04, 2013
```
llvm-svn: 178761
```
c44fa997

Add SPARC v9 support for select on 64-bit compares. · 8cfaffaa

Jakob Stoklund Olesen authored Apr 04, 2013

This requires v9 cmov instructions using the %xcc flags instead of the
%icc flags.

Still missing:
- Select floats on %xcc flags.
- Select i64 on %fcc flags.

llvm-svn: 178737

8cfaffaa

Debug Info: according to DWARF 2, FORM_ref_addr is the same size as an address on · 5a15c9ed

Manman Ren authored Apr 04, 2013

the target system.

It was hard-coded to 4 bytes before. I can't get llvm to generate a
ref_addr on a reasonably sized test case.

rdar://problem/13559431

llvm-svn: 178722

5a15c9ed

Refactored out the helper method FindPredecessorAutoreleaseWithSafePath from... · 21a4ed32

Michael Gottesman authored Apr 03, 2013

Refactored out the helper method FindPredecessorAutoreleaseWithSafePath from ObjCARCOpt::OptimizeReturns.

Now ObjCARCOpt::OptimizeReturns is easy to read and reason about.

llvm-svn: 178715

21a4ed32

Refactored out the helper function FindPredecessorRetainWithSafePath from... · 6908db14
Michael Gottesman authored Apr 03, 2013
```
Refactored out the helper function FindPredecessorRetainWithSafePath from ObjCARCOpt::OptimizeReturns.

llvm-svn: 178714
```
6908db14

Small cleanups. · c2d5bf5c

Michael Gottesman authored Apr 03, 2013

Cleaned up trailing whitespace and added extra slashes in front of a
function level comment so that it follows the convention of having 3
slashes.

llvm-svn: 178712

c2d5bf5c

Refactored out a part of ObjCARCOpt::OptimizeReturns into its own method... · 54dc7fde
Michael Gottesman authored Apr 03, 2013
```
Refactored out a part of ObjCARCOpt::OptimizeReturns into its own method HasSafePathToPredecessorCall.

llvm-svn: 178710
```
54dc7fde

Removed an old comment. · 0a1748bb
Michael Gottesman authored Apr 03, 2013
```
llvm-svn: 178709
```
0a1748bb

Clean up arc annotations by moving the top/bottom BB annotations into... · 43e7e00a

Michael Gottesman authored Apr 03, 2013

Clean up arc annotations by moving the top/bottom BB annotations into conditional macros that no-op in Release mode instead of #ifdef sections of the code.

This is to follow the example of the DEBUG macro.

llvm-svn: 178705

43e7e00a
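
The pattern described in 43e7e00a, a macro that expands to nothing in Release mode so call sites need no `#ifdef`, might look roughly like this. `ARC_ANNOTATE`, `AnnotationCount`, and `processBlock` are hypothetical names for this sketch, and `NDEBUG` is assumed to be the switch that distinguishes Release builds.

```cpp
#include <cassert>

static int AnnotationCount = 0;  // demonstration state only

#ifndef NDEBUG
// Debug builds: run the annotation statement.
#define ARC_ANNOTATE(X) do { X; } while (0)
constexpr bool AnnotationsEnabled = true;
#else
// Release builds: the statement disappears entirely.
#define ARC_ANNOTATE(X) do { } while (0)
constexpr bool AnnotationsEnabled = false;
#endif

inline int processBlock(int BlockId) {
  assert(BlockId >= 0 && "block ids are non-negative in this sketch");
  ARC_ANNOTATE(++AnnotationCount);  // no #ifdef needed at the call site
  return BlockId * 2;
}
```

The `do { } while (0)` wrapper keeps the macro usable as a single statement, for example as the body of an unbraced `if`.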

Apr 03, 2013
- X86 cost model: Vector shifts are expensive in most cases · e9b50164
  Arnold Schwaighofer authored Apr 03, 2013
```
The default logic does not correctly identify costs of casts because they are
marked as custom on x86.

In some cases, where the shift amount is a scalar, we would be able to generate
better code. Unfortunately, when this is the case the value (the splat) will get
hoisted out of the loop, thereby making it invisible to ISel.

radar://13130673
radar://13537826

llvm-svn: 178703
```
  e9b50164
- R600: Fix last ALU of a clause being emitted in a separate clause · c3d3f9b6
  Vincent Lejeune authored Apr 03, 2013
```
llvm-svn: 178675
```
  c3d3f9b6
- Ensuring that both bits are set, and not just a combination of one or the other. · 5e6d2052
  Aaron Ballman authored Apr 03, 2013
```
llvm-svn: 178674
```
  5e6d2052
- Cleanup PPC reciprocal-estimate functionality · b0c810ff
  Hal Finkel authored Apr 03, 2013
```
Incorporating review feedback from Bill Schmidt on r178617. No functionality
change intended.

llvm-svn: 178672
```
  b0c810ff
- R600: Factorize maximum alu per clause in a single location · 80031d9f
  Vincent Lejeune authored Apr 03, 2013
```
llvm-svn: 178667
```
  80031d9f
- Testing for Visual Studio 2010 SP1 or greater before calling the _xgetbv... · edc03c66
  Aaron Ballman authored Apr 03, 2013
```
Testing for Visual Studio 2010 SP1 or greater before calling the _xgetbv intrinsic.  This also fixes a minor code formatting issue.

llvm-svn: 178666
```
  edc03c66
- R600: Simplify data structure and add DEBUG to R600ControlFlowFinalizer · b6d6c0d4
  Vincent Lejeune authored Apr 03, 2013
```
llvm-svn: 178665
```
  b6d6c0d4
- R600: Consider KILLGT as an ALU instruction · 9931298b
  Vincent Lejeune authored Apr 03, 2013
```
Mesa does not override llvm behavior wrt KILLGT anymore so llvm
has to handle KILLGT on its own.

llvm-svn: 178664
```
  9931298b
- Measure time that IR parsing took as part of the -time-passes measurement. · b35a211f
  Eli Bendersky authored Apr 03, 2013
```
llvm-svn: 178662
```
  b35a211f
- PPC: Enable FRES and FRSQRTE on the default PPC64 description · 7ac4592e
  Hal Finkel authored Apr 03, 2013
```
I discussed this with Bill Schmidt on IRC, and it was decided that this is a
safe and reasonable default.

llvm-svn: 178659
```
  7ac4592e
- PPC: Add a FIXME regarding the non-working fma+fneg Altivec pattern · 0c6d2193
  Hal Finkel authored Apr 03, 2013
```
llvm-svn: 178658
```
  0c6d2193
- Remove some obsolete PowerPC/README entries · 2ed21a8c
  Hal Finkel authored Apr 03, 2013
```
llvm-svn: 178657
```
  2ed21a8c
- More direct types in PowerPC AltiVec intrinsics. · 084ff8e8
  Ulrich Weigand authored Apr 03, 2013
```
More direct types in PowerPC AltiVec intrinsics.

This patch follows up on work done by Bill Schmidt in r178277,
and replaces most of the remaining uses of VRRC in ISEL DAG patterns.

The resulting .inc files are identical except for comments, so
no change in code generation is expected.

llvm-svn: 178656
```
  084ff8e8
- Fix PR15632: No support for ppcf128 floating-point remainder on PowerPC. · 92e26646
  Bill Schmidt authored Apr 03, 2013
```
For this we need to use a libcall.  Previously LLVM didn't implement
libcall support for frem, so I've added it in the usual
straightforward manner.  A test case from the bug report is included.

llvm-svn: 178639
```
  92e26646
- AArch64: implement ETMv4 trace system registers. · 5816ca11
  Tim Northover authored Apr 03, 2013
```
llvm-svn: 178637
```
  5816ca11
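
The distinction named in the title of 5e6d2052, both bits set versus either bit set, comes down to how a mask test is written. A generic sketch with hypothetical helper names, not the code from that commit:

```cpp
#include <cassert>
#include <cstdint>

// True if at least one bit of Mask is set in Value.
inline bool anyBitSet(std::uint32_t Value, std::uint32_t Mask) {
  return (Value & Mask) != 0;
}

// True only if every bit of Mask is set in Value.
inline bool allBitsSet(std::uint32_t Value, std::uint32_t Mask) {
  assert(Mask != 0 && "an empty mask would make the test trivially true");
  return (Value & Mask) == Mask;
}
```

With a two-bit mask such as `0x6`, a value of `0x2` passes the first test but fails the second, which is the kind of partial match the stricter comparison rules out.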