Commits · f92a574246b391e905847e74b5e0283589c8741f · Roger Ferrer / llvm-epi-0.8

Dec 12, 2013

Resubmit r196544: Apply transformation on OS X 10.9+ and iOS 7.0+: pow(10, x) -> __exp10(x) · f92a5742
Yi Jiang authored Dec 12, 2013
```
llvm-svn: 197109
```
f92a5742
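
As a rough illustration of the rewrite named in f92a5742, the sketch below contrasts the call before and after the transformation. The helper names `pow10_before` and `pow10_after` are hypothetical. The sketch assumes `__exp10` is declared by the system math header on Apple platforms; elsewhere it falls back to `std::pow` so that the example still compiles.

```cpp
#include <cmath>

// Hypothetical helper: the call as it appears in user code.
double pow10_before(double x) { return std::pow(10.0, x); }

// Hypothetical helper: the call after the rewrite.
double pow10_after(double x) {
#if defined(__APPLE__)
  // Assumption: <math.h> on OS X 10.9+ / iOS 7.0+ declares __exp10.
  return __exp10(x);
#else
  // Fallback so the sketch builds on other platforms.
  return std::pow(10.0, x);
#endif
}
```

Both helpers should agree to within floating-point rounding; the point of the rewrite is that the specialized routine can be cheaper than a general `pow` call.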
Remove unused multiclass from PPCInstrInfo.td · fa50630e
Hal Finkel authored Dec 12, 2013
```
llvm-svn: 197100
```
fa50630e

Improve instruction scheduling for the PPC POWER7 · ceb1f12d

Hal Finkel authored Dec 12, 2013

Aside from a few minor latency corrections, the major change here is a new
hazard recognizer which focuses on better dispatch-group formation on the
POWER7. As with the PPC970's hazard recognizer, the most important thing it
does is avoid load-after-store hazards within the same dispatch group. It uses
the POWER7's special dispatch-group-terminating nop instruction (instead of
inserting multiple regular nop instructions). This new hazard recognizer makes
use of the scheduling dependency graph itself, built using AA information, to
robustly detect the possibility of load-after-store hazards.

Significant test-suite performance changes (the error bars are 99.5% confidence
intervals based on 5 test-suite runs both with and without the change --
speedups are negative):

speedups:

MultiSource/Benchmarks/FreeBench/pcompress2/pcompress2
	-0.55171% +/- 0.333168%

MultiSource/Benchmarks/TSVC/CrossingThresholds-dbl/CrossingThresholds-dbl
	-17.5576% +/- 14.598%

MultiSource/Benchmarks/TSVC/Reductions-dbl/Reductions-dbl
	-29.5708% +/- 7.09058%

MultiSource/Benchmarks/TSVC/Reductions-flt/Reductions-flt
	-34.9471% +/- 11.4391%

SingleSource/Benchmarks/BenchmarkGame/puzzle
	-25.1347% +/- 11.0104%

SingleSource/Benchmarks/Misc/flops-8
	-17.7297% +/- 9.79061%

SingleSource/Benchmarks/Shootout-C++/ary3
	-35.5018% +/- 23.9458%

SingleSource/Regression/C/uint64_to_float
	-56.3165% +/- 25.4234%

SingleSource/UnitTests/Vectorizer/gcc-loops
	-18.5309% +/- 6.8496%

regressions:

MultiSource/Benchmarks/ASCI_Purple/SMG2000/smg2000
	18.351% +/- 12.156%

SingleSource/Benchmarks/Shootout-C++/methcall
	27.3086% +/- 14.4733%

llvm-svn: 197099

ceb1f12d

[AArch64] Refactor NEON floating-point Max/Min/Maxnm/Minnm across vector AArch64 · 446d8ea0
Chad Rosier authored Dec 11, 2013
```
intrinsics to use f32 types, rather than their vector equivalents.

llvm-svn: 197090
```
446d8ea0

Fix the PPC subsumes-predicate check · 94a6f380

Hal Finkel authored Dec 11, 2013

For one predicate to subsume another, they must both check the same condition
register. Failure to check this prerequisite was causing miscompiles.

Fixes PR18003.

llvm-svn: 197089

94a6f380
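
The prerequisite described in 94a6f380 can be sketched as follows. This is a simplified, hypothetical model and not the actual PPC backend code: the `Cond` enum, the `Predicate` struct, and the `subsumes` function are names invented for illustration. The implication rules shown (LE covers LT and EQ, GE covers GT and EQ) are an assumption about what "subsumes" means here.

```cpp
// Hypothetical, simplified model of a branch predicate.
enum class Cond { LT, LE, EQ, GE, GT, NE };

struct Predicate {
  Cond cond;        // condition being tested
  unsigned condReg; // condition register it reads (hypothetical numbering)
};

// Returns true if `a` holds whenever `b` holds, i.e. `a` subsumes `b`.
bool subsumes(const Predicate &a, const Predicate &b) {
  // Prerequisite from the commit message: both predicates must check
  // the same condition register, otherwise nothing can be concluded.
  if (a.condReg != b.condReg)
    return false;
  if (a.cond == b.cond)
    return true;
  if (a.cond == Cond::LE)
    return b.cond == Cond::LT || b.cond == Cond::EQ;
  if (a.cond == Cond::GE)
    return b.cond == Cond::GT || b.cond == Cond::EQ;
  return false;
}
```

Without the register comparison, `LE` on one condition register would be treated as covering `LT` on a different one, which is the kind of mistake that can lead to a miscompile.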

Dec 11, 2013

[AArch64] Add NEON scalar floating-point compare LLVM AArch64 intrinsics that · 088f93d4
Chad Rosier authored Dec 11, 2013
```
use f32/f64 types, rather than their vector equivalents.

llvm-svn: 197068
```
088f93d4

[AArch64] Refactor the NEON scalar floating-point reciprocal step and · 473a01e1

Chad Rosier authored Dec 11, 2013

floating-point reciprocal square root step LLVM AArch64 intrinsics to
use f32/f64 types, rather than their vector equivalents.

llvm-svn: 197067

473a01e1

[AArch64] Refactor the NEON scalar floating-point reciprocal estimate, floating- · 7098fcc0

Chad Rosier authored Dec 11, 2013

point reciprocal exponent, and floating-point reciprocal square root estimate
LLVM AArch64 intrinsics to use f32/f64 types, rather than their vector
equivalents.

llvm-svn: 197066

7098fcc0

Don't set unused variable. · 009e7586
Rafael Espindola authored Dec 11, 2013
```
llvm-svn: 197064
```
009e7586

R600: Re-format Processors.td · d7e146ed

Tom Stellard authored Dec 11, 2013

This makes it a little easier to read.

Reviewed-by: Vincent Lejeune <vljn at ovi.com>
llvm-svn: 197058

d7e146ed

R600: Register AMDGPUCFGStructurizer pass · f2ba972a

Tom Stellard authored Dec 11, 2013

This enables -print-before-all to dump MachineInstrs after it is run.

Reviewed-by: Vincent Lejeune <vljn at ovi.com>
llvm-svn: 197057

f2ba972a

R600: Register R600EmitClauseMarkers pass · 1de5582d

Tom Stellard authored Dec 11, 2013

This enables -print-before-all to dump MachineInstrs after it is run.

Reviewed-by: Vincent Lejeune <vljn at ovi.com>
llvm-svn: 197056

1de5582d

[arm] Implement ARM .arch directive. · 439e8f9e
Logan Chien authored Dec 11, 2013
```
llvm-svn: 197052
```
439e8f9e

ARM: constrain register-class in fast-isel · 76fc8a4c

Tim Northover authored Dec 11, 2013

The tests were no longer using fast-isel at all (MachO needs an "ios" rather
than "darwin" triple at the moment and Linux needs ARM mode). Once that was
corrected, the verifier complained about a t2ADDri created for the alloca.

llvm-svn: 197046

76fc8a4c

AVX-512: Removed "z" suffix from AVX-512 instructions, since it is incompatible with GCC. · cf088098

Elena Demikhovsky authored Dec 11, 2013

I moved a test from avx512-vbroadcast-crash.ll to avx512-vbroadcast.ll.
I defined the HasAVX512 predicate as an AssemblerPredicate. This means that you must invoke llvm-mc with "-mcpu=knl" to get the encoding for AVX-512 instructions. I need this so that AsmMatcher can set different encodings for AVX and AVX-512 instructions that have the same mnemonic and operands (all scalar instructions).

llvm-svn: 197041

cf088098

[SystemZ] Optimize fcmp X, 0 in cases where X is also negated · 73170f84

Richard Sandiford authored Dec 11, 2013

In such cases it's often better to test the result of the negation instead,
since the negation also sets CC.

llvm-svn: 197032

73170f84
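
At the source level, the situation in 73170f84 looks roughly like the sketch below: a value is compared with zero and is also negated, so the same test can be phrased on the negated value with the comparison direction swapped. The function names are hypothetical, and the sketch only shows the equivalence of the two tests; it does not model the SystemZ condition code.

```cpp
// Hypothetical example: test X directly against zero.
bool isNegativeViaX(double x) { return x < 0.0; }

// The same test phrased on the negated value: x < 0 exactly when -x > 0.
bool isNegativeViaNeg(double x) {
  double neg = -x; // the negation that is computed anyway
  return neg > 0.0;
}
```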

Distinguish and choose 16 or 32 bit forms of save/restore for Mips16. · 5bde5c35
Reed Kotler authored Dec 11, 2013
```
llvm-svn: 196999
```
5bde5c35
[AArch64 NEON] Get instruction BSL matched to VSELECT. · 310b6c08
Kevin Qin authored Dec 11, 2013
```
llvm-svn: 196998
```
310b6c08
Move mips' datalayout computation out of line and add comments. · b2fb78d4
Rafael Espindola authored Dec 11, 2013
```
llvm-svn: 196996
```
b2fb78d4
Move Sparc's getDataLayout out of line and add comments. · 60f48e5a
Rafael Espindola authored Dec 11, 2013
```
llvm-svn: 196990
```
60f48e5a
Prune redundant dependencies in LLVMBuild.txt. · 8bc9bfaa
NAKAMURA Takumi authored Dec 11, 2013
```
llvm-svn: 196988
```
8bc9bfaa
Move PPC's getDataLayoutString out of line and document it better. · 5b358587
Rafael Espindola authored Dec 11, 2013
```
llvm-svn: 196987
```
5b358587

Revert the backend fatal error from r196939 · ad92aca4

Reid Kleckner authored Dec 10, 2013

The combination of inline asm, stack realignment, and dynamic allocas
turns out to be too common to reject out of hand.

ASan inserts empty inline asm fragments and uses aligned allocas.
Compiling any trivial function containing a dynamic alloca with ASan is
enough to trigger the check.

XFAIL the test cases that would be miscompiled and add one that uses the
relevant functionality.

llvm-svn: 196986

ad92aca4

Dec 10, 2013
- Refactor the computation of the x86 datalayout. · 002f8aa5
  Rafael Espindola authored Dec 10, 2013
```
llvm-svn: 196976
```
  002f8aa5
- Use llvm_unreachable instead of assert(0) · eaa3a7ef
  Matt Arsenault authored Dec 10, 2013
```
llvm-svn: 196971
```
  eaa3a7ef
- On darwin<10, fall back to .weak_definition (PPC,X86) · 1b01849f
  David Fang authored Dec 10, 2013
```
.weak_def_can_be_hidden was not yet supported by the system assembler

llvm-svn: 196970
```
  1b01849f
- [AArch64] Refactor the NEON floating-point absolute difference LLVM AArch64 · f70af216
  Chad Rosier authored Dec 10, 2013
```
intrinsic to use f32/f64 types, rather than their vector equivalents.

llvm-svn: 196965
```
  f70af216
- [AArch64] Refactor the NEON signed/unsigned floating-point convert to fixed-point · 07cc3f91
  Chad Rosier authored Dec 10, 2013
```
LLVM AArch64 intrinsics to use f32/f64, rather than their vector equivalents.

llvm-svn: 196964
```
  07cc3f91
- [AArch64] Overload NEON signed/unsigned floating-point convert to fixed-point · 98b8baa3
  Chad Rosier authored Dec 10, 2013
```
and fixed-point convert to floating-point LLVM AArch64 intrinsics.

llvm-svn: 196963
```
  98b8baa3
- [AArch64] Overload NEON signed/unsigned integer convert to floating-point · cc34d187
  Chad Rosier authored Dec 10, 2013
```
LLVM AArch64 intrinsics.

llvm-svn: 196962
```
  cc34d187
- Reland "Fix miscompile of MS inline assembly with stack realignment" · ee08897f
  Reid Kleckner authored Dec 10, 2013
```
This re-lands commit r196876, which was reverted in r196879.

The tests have been fixed to pass on platforms with a stack alignment
larger than 4.

Updates to the clang-side tests will land shortly.

llvm-svn: 196939
```
  ee08897f
- Make Triple's isOSBinFormatXXX functions partition triple-space. · 9653eb57
  Tim Northover authored Dec 10, 2013
```
Most users would be surprised if "isCOFF" and "isMachO" were simultaneously
true, unless they'd put the compiler in a box with a gun attached to a photon
detector.

This makes sure precisely one of the three formats is true for any triple and
simplifies some target logic based on that.

llvm-svn: 196934
```
  9653eb57
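
The partitioning idea in 9653eb57 can be sketched with a single stored format, so that exactly one predicate is true for any value. This is a hypothetical stand-in, not the `llvm::Triple` API: `ObjectFormat`, `ToyTriple`, and `formatsClaimed` are names made up for the example, and ELF is assumed to be the third format alongside the COFF and MachO named in the message.

```cpp
// Hypothetical stand-in for a target triple's object-file format.
enum class ObjectFormat { ELF, COFF, MachO };

struct ToyTriple {
  ObjectFormat format; // one stored value makes the cases mutually exclusive

  bool isELF() const { return format == ObjectFormat::ELF; }
  bool isCOFF() const { return format == ObjectFormat::COFF; }
  bool isMachO() const { return format == ObjectFormat::MachO; }
};

// Counts how many format predicates hold; for a partition this is always 1.
int formatsClaimed(const ToyTriple &t) {
  return int(t.isELF()) + int(t.isCOFF()) + int(t.isMachO());
}
```

Deriving all three predicates from one field is a simple way to guarantee the property; target logic can then branch on one predicate without re-checking the others.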
- [AArch64] Refactor the Neon vector/scalar floating-point convert intrinsics so · 7a9bba44
  Chad Rosier authored Dec 10, 2013
```
that they use float/double rather than the vector equivalents when appropriate.

llvm-svn: 196930
```
  7a9bba44
- [AArch64] Refactor the Neon vector/scalar floating-point convert implementation. · fcc4c366
  Chad Rosier authored Dec 10, 2013
```
Specifically, reuse the ARM intrinsics when possible.

llvm-svn: 196926
```
  fcc4c366
- Ensure that the backend no longer emits unnecessary vector insert instructions · f7c33c81
  Andrea Di Biagio authored Dec 10, 2013
```
immediately after SSE scalar fp instructions like addss or mulss.

Added patterns to select SSE scalar fp arithmetic instructions from a scalar
fp operation followed by a blend.

For example, given the following code:
  __m128 foo(__m128 A, __m128 B) {
    A[0] += B[0];
    return A;
  }

previously we generated:
  addss %xmm0, %xmm1
  movss %xmm1, %xmm0

now we generate:
  addss %xmm1, %xmm0

llvm-svn: 196925
```
  f7c33c81
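
The "scalar fp operation followed by a blend" shape from f7c33c81 can be modeled portably as below. This is only a sketch of the semantics, not of the instruction selection patterns: `Vec4` is a hypothetical stand-in for `__m128`, and `blendLane0` and `foo` are illustrative names.

```cpp
#include <array>

using Vec4 = std::array<float, 4>; // hypothetical stand-in for __m128

// Blend: take lane 0 from the scalar result, lanes 1..3 from `rest`.
Vec4 blendLane0(float lane0, const Vec4 &rest) {
  return {lane0, rest[1], rest[2], rest[3]};
}

// Mirrors the foo() in the commit message: only lane 0 of A changes.
Vec4 foo(const Vec4 &a, const Vec4 &b) {
  float sum = a[0] + b[0];   // the scalar fp operation
  return blendLane0(sum, a); // the blend back into A
}
```

A single `addss` has the same effect on its destination register (lane 0 updated, upper lanes preserved), which is why the extra `movss` in the earlier output was unnecessary.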
- R600: Fix an infinite loop when trying to reorganize export/tex vector input · cc0ea74c
  Vincent Lejeune authored Dec 10, 2013
```
llvm-svn: 196923
```
  cc0ea74c
- R600: Fix input modifiers lost for Cayman · f92d64d1
  Vincent Lejeune authored Dec 10, 2013
```
llvm-svn: 196922
```
  f92d64d1
- Next step in Mips16 prologue/epilogue cleanup. · 0ff40017
  Reed Kotler authored Dec 10, 2013
```
Save S2 (reg 18) only when we are calling floating-point stubs that
have a return value of float or complex. More work is needed to make this
better, but this is the first step.

llvm-svn: 196921
```
  0ff40017
- AVX-512: changed intrinsics for mask operations · e382c3fd
  Elena Demikhovsky authored Dec 10, 2013
```
llvm-svn: 196918
```
  e382c3fd
- AVX-512: Changed intrinsics of VPCONFLICT to match GCC builtin form · 6270b388
  Elena Demikhovsky authored Dec 10, 2013
```
llvm-svn: 196914
```
  6270b388