Commits · 2798f1ef586ed45777e9dfbd0de68d61617615c1 · Roger Ferrer / llvm-epi-0.8

Dec 01, 2013

Use 'unsigned char' to get this past gcc error message: · 2798f1ef
Bill Wendling authored Dec 01, 2013
```
  error: invalid conversion from 'unsigned char' to '{anonymous}::Sequence'

llvm-svn: 196004
```
2798f1ef
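
The error quoted above comes from a general C++ rule: an unscoped enum converts implicitly to an integer type, but the reverse direction needs an explicit cast. The sketch below only illustrates that rule. The enumerator names and both helper functions are hypothetical and are not taken from the patch.

```cpp
// Hypothetical stand-in for the enum named in the error message.
enum Sequence { S_None, S_Retain, S_Release };

// enum -> unsigned char: allowed implicitly.
constexpr unsigned char toStorage(Sequence S) { return S; }

// unsigned char -> enum: 'return Raw;' here would be rejected with an
// "invalid conversion" error, so the cast is spelled out.
constexpr Sequence fromStorage(unsigned char Raw) {
  return static_cast<Sequence>(Raw);
}
```

Storing the value as 'unsigned char' and casting at the point of use keeps the explicit conversion in one place.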

Fix typo: s/Occurence/Occurrence/ · 54ee53a0

Alp Toker authored Nov 30, 2013

This is a private class member so the fix shouldn't impact external projects.

llvm-svn: 195985

54ee53a0

Nov 30, 2013

Update the LeakSanitizer documentation with a proper link. · 34225433
Sergey Matveev authored Nov 30, 2013
```
llvm-svn: 195983
```
34225433

add an additional test case for generic attributes · 73196bae

Saleem Abdulrasool authored Nov 30, 2013

gcc treats [[gnu::const]], [[gnu::__const]], and [[gnu::__const__]] as all being
equivalent.  Add an additional test case to ensure that we do not miss the last
case.

llvm-svn: 195982

73196bae
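
As a rough illustration of the three spellings the message lists, the sketch below attaches each one to a trivial function. The function names and bodies are hypothetical. A compiler that does not recognize one of the spellings would typically warn and ignore the attribute rather than reject the code.

```cpp
// Three spellings of the same GNU attribute, per the commit message.
// The functions exist only to give the attributes something to attach to.
[[gnu::const]] constexpr int squareA(int x) { return x * x; }
[[gnu::__const]] constexpr int squareB(int x) { return x * x; }
[[gnu::__const__]] constexpr int squareC(int x) { return x * x; }
```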

Add a scheduling model (with itinerary) for the PPC POWER7 · 42daeae9

Hal Finkel authored Nov 30, 2013

This adds a scheduling model for the POWER7 (P7) core, and enables the
machine-instruction scheduler when targeting the P7. Scheduling for the P7,
like earlier ooo PPC cores, requires considering both dispatch group hazards,
and functional unit resources and latencies. These are both modeled in a
combined itinerary. Dispatch group formation is still handled by the post-RA
scheduler (which still needs to be updated for the P7, but nevertheless does a
pretty good job).

One interesting aspect of this change is that I've also enabled the use of AA
during CodeGen for the P7 (just as it is for the embedded cores). The benchmark
results seem to support this decision (see below), and while this is normally
useful for in-order cores, and not for ooo cores like the P7, I think that the
dispatch slot hazards are enough like in-order resources to make the AA useful.

Test suite significant performance differences (where negative is a speedup,
and positive is a regression) vs. the current situation:

MultiSource/Benchmarks/BitBench/drop3/drop3
  with AA: N/A
  without AA: -28.7614% +/- 19.8356%
(significantly against AA)

MultiSource/Benchmarks/FreeBench/neural/neural
  with AA: -17.7406% +/- 11.2712%
  without AA: N/A
(significantly in favor of AA)

MultiSource/Benchmarks/SciMark2-C/scimark2
  with AA: -11.2079% +/- 1.80543%
  without AA: -11.3263% +/- 2.79651%

MultiSource/Benchmarks/TSVC/Symbolics-flt/Symbolics-flt
  with AA: -41.8649% +/- 17.0053%
  without AA: -34.5256% +/- 23.7072%

MultiSource/Benchmarks/mafft/pairlocalalign
  with AA: 25.3016% +/- 17.8614%
  without AA: 38.6629% +/- 14.9391%
(significantly in favor of AA)

MultiSource/Benchmarks/sim/sim
  with AA: N/A
  without AA: 13.4844% +/- 7.18195%
(significantly in favor of AA)

SingleSource/Benchmarks/BenchmarkGame/Large/fasta
  with AA: 15.0664% +/- 6.70216%
  without AA: 12.7747% +/- 8.43043%

SingleSource/Benchmarks/BenchmarkGame/puzzle
  with AA: 82.2713% +/- 26.3567%
  without AA: 75.7525% +/- 41.1842%

SingleSource/Benchmarks/Misc/flops-2
  with AA: -37.1621% +/- 20.7964%
  without AA: -35.2342% +/- 20.2999%
(significantly in favor of AA)

These are 99.5% confidence intervals from 5 runs per configuration. Regarding
the choice to turn on AA during CodeGen, of these results, four seem
significantly in favor of using AA, and one seems significantly against. I'm
not making this decision based on these numbers alone, but these results
seem consistent with results I have from other tests, and so I think that, on
balance, using AA is a win.

llvm-svn: 195981

42daeae9

Split some PPC itinerary classes · 46402a42

Hal Finkel authored Nov 30, 2013

In preparation for adding scheduling definitions for the POWER7, split some PPC
itinerary classes so that the P7's latencies and hazards can be better
described. For the most part, this means differentiating indexed from non-indexed
pre-increment loads and stores. Also, differentiate single from
double-precision sqrt.

No functionality change intended (except for a more-specific latency for
single-precision sqrt on the A2).

llvm-svn: 195980

46402a42

Convert a PPC test from grep to FileCheck · ca93e472

Hal Finkel authored Nov 30, 2013

Convert this test to FileCheck, and improve it to check for the instructions it
is trying to exclude instead of checking for register use (especially because
grepping for r1 can be thrown off, for example, by a use of r12).

llvm-svn: 195979

ca93e472
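
The r1/r12 pitfall mentioned above can be sketched as the difference between a substring search and a whole-token match. Both helpers below are hypothetical illustrations and are not part of the test or of FileCheck. The commit itself goes further and checks for the instructions rather than for register names.

```cpp
// True if `needle` occurs anywhere in `line`, as a plain text search would report.
constexpr bool containsSubstring(const char *line, const char *needle) {
  for (const char *p = line; *p; ++p) {
    const char *a = p, *b = needle;
    while (*a && *b && *a == *b) { ++a; ++b; }
    if (!*b) return true;
  }
  return *needle == '\0';
}

// Characters that may appear inside a register name in this sketch.
constexpr bool isRegChar(char c) {
  return (c >= '0' && c <= '9') || (c >= 'a' && c <= 'z');
}

// True only if `reg` (assumed non-empty) appears as a whole token.
constexpr bool mentionsRegister(const char *line, const char *reg) {
  for (const char *p = line; *p; ++p) {
    const char *a = p, *b = reg;
    while (*a && *b && *a == *b) { ++a; ++b; }
    bool startsToken = (p == line) || !isRegChar(p[-1]);
    if (!*b && startsToken && !isRegChar(*a)) return true;
  }
  return false;
}
```

With the line "mr r12, r3", the substring search reports a hit for "r1" while the token match does not.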

Desensitize a couple of PPC regression tests · 2651f973

Hal Finkel authored Nov 30, 2013

Use CHECK-DAG to make these regression tests more resilient against changes in
instruction scheduling.

llvm-svn: 195978

2651f973

Update the cpu specified on some PPC regression tests · 2b655bb2

Hal Finkel authored Nov 30, 2013

Some of these tests did not specify a cpu but were also sensitive to
instruction scheduling and/or register assignment choices. A few other
similarly sensitive tests specified a cpu (often the POWER7), and while the P7
currently uses the default model for PPC64, this will soon change. For those
tests which should not really be cpu-dependent anyway, the cpu is set to the
generic 'ppc64'.

llvm-svn: 195977

2b655bb2

Test case for issue with microMIPS long branch. · 47248671
Zoran Jovanovic authored Nov 30, 2013
```
llvm-svn: 195976
```
47248671
Fixed issue with microMIPS long branch. · 9d86e26e
Zoran Jovanovic authored Nov 30, 2013
```
llvm-svn: 195975
```
9d86e26e
Fix indentation of fields in __cxa_exception to line up · 66471118
Mark Seaborn authored Nov 30, 2013
```
Align to 8 spaces instead of an inconsistent 9.

llvm-svn: 195974
```
66471118

[mips][msa] MSA loads and stores have a 10-bit offset. Account for this when lowering FrameIndex. · 7fd68d60

Daniel Sanders authored Nov 30, 2013

This prevents the compiler from emitting invalid ld.[bhwd]'s and st.[bhwd]'s
when the stack frame is between 512 and 32,768 bytes in size.

llvm-svn: 195973

7fd68d60
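
The size range quoted above is consistent with offsets that no longer fit a 10-bit immediate but still fit a 16-bit one. The sketch below shows that range check, assuming signed immediates and ignoring any scaling of the offset by element size. The helper name is hypothetical and is not taken from the patch.

```cpp
#include <cstdint>

// True if `value` fits in a signed immediate field that is `bits` wide.
// Assumes 1 <= bits <= 63.
constexpr bool fitsSignedImm(std::int64_t value, unsigned bits) {
  return value >= -(std::int64_t(1) << (bits - 1)) &&
         value <= (std::int64_t(1) << (bits - 1)) - 1;
}
```

Under these assumptions an offset of 511 fits in 10 bits, while 512 needs the wider field.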

[mips][msa] A small refactor to reduce patch noise in my next commit · 71534147
Daniel Sanders authored Nov 30, 2013
```
No functional change. An if-statement has been split into two nested if-statements.

llvm-svn: 195972
```
71534147
Force CPU type to unbreak unit tests on Haswell machines. · 5b6234dc
Juergen Ributzka authored Nov 30, 2013
```
llvm-svn: 195971
```
5b6234dc
NetBSD uses signed wchar_t on ARM platforms. · 84c7ca88
Joerg Sonnenberger authored Nov 30, 2013
```
llvm-svn: 195970
```
84c7ca88
Reverse the order of eviction checks for possible compile time savings. No functionality. · c2ab53a3
Andrew Trick authored Nov 29, 2013
```
llvm-svn: 195969
```
c2ab53a3

Nov 29, 2013

Part 1 of 3 patches that completes very long conditional branches · ad450f23

Reed Kotler authored Nov 29, 2013

in constant islands for Mips16. We introduce JalB16 as a synonym
for Jal16. It makes it easier to read and is also necessary because
Jal16 is a call instruction but JalB16 is being used as a branch.
Various parts of LLVM will not work properly even in this late stage of
the backend if we use what was declared as a call instruction to function
as a branch. For one, basic block labels may not get emitted in some
situations. 

llvm-svn: 195968

ad450f23

Revert revision 195965. · 1bc3cce0
Zoran Jovanovic authored Nov 29, 2013
```
llvm-svn: 195967
```
1bc3cce0

mips: XFAIL llvm-cov test · e3e940d8

Petar Jovanovic authored Nov 29, 2013

XFAIL llvm-cov.test for MIPS until big-endian issues are fixed for llvm-cov.
The test does pass on MIPS little-endian.

llvm-svn: 195966

e3e940d8

Fixed issue with microMIPS long branch. · ff2a40ce
Zoran Jovanovic authored Nov 29, 2013
```
llvm-svn: 195965
```
ff2a40ce
Refactored the tls_model attribute to use a custom subset subject. No functional change intended. · 5b0481a3
Aaron Ballman authored Nov 29, 2013
```
llvm-svn: 195964
```
5b0481a3

Using a custom subject to reenable the Subjects line for the ns_bridged... · f7cd09a0

Aaron Ballman authored Nov 29, 2013

Using a custom subject to reenable the Subjects line for the ns_bridged attribute. No functional change intended.

llvm-svn: 195963

f7cd09a0

Fixes a possible assert in the custom SubsetSubject logic for the attr emitter. · 4cfafb9a
Aaron Ballman authored Nov 29, 2013
```
llvm-svn: 195962
```
4cfafb9a

Added LanguageStandard::LS_JavaScript to gate all JS-specific parsing. · cabdd738

Alexander Kornienko authored Nov 29, 2013

Summary:
Use LS_JavaScript for files ending with ".js". Added support for ">>>="
operator.

Reviewers: djasper, klimek

Reviewed By: djasper

CC: cfe-commits, klimek

Differential Revision: http://llvm-reviews.chandlerc.com/D2242

llvm-svn: 195961

cabdd738
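
To illustrate why the ">>>=" operator needs language-specific handling, the sketch below picks the longest shift operator at the start of a string, with the JavaScript-only spellings gated behind a flag. This is a hypothetical longest-match helper, not how clang-format implements the feature. The function names and the boolean parameter are assumptions.

```cpp
// True if `text` begins with `prefix`.
constexpr bool startsWith(const char *text, const char *prefix) {
  while (*prefix) {
    if (*text++ != *prefix++) return false;
  }
  return true;
}

// Length of the shift operator at the start of `text`, or 0 if there is none.
// The ">>>" forms are only recognized when `javaScript` is set.
constexpr int shiftOperatorLength(const char *text, bool javaScript) {
  if (javaScript && startsWith(text, ">>>=")) return 4;
  if (javaScript && startsWith(text, ">>>")) return 3;
  if (startsWith(text, ">>=")) return 3;
  if (startsWith(text, ">>")) return 2;
  return 0;
}
```

Without the flag, ">>>=" would be split after the first two characters, which is the kind of mis-tokenization a language gate avoids.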

Enables support for custom subject lists for attributes. As a testbed, uses... · 80469038

Aaron Ballman authored Nov 29, 2013

Enables support for custom subject lists for attributes. As a testbed, uses the custom subject for the ibaction attribute.

llvm-svn: 195960

80469038

[asan] dump coverage even if asan has reported an error · dc580902
Kostya Serebryany authored Nov 29, 2013
```
llvm-svn: 195959
```
dc580902
[sanitizer] disable shmctl interceptor in 32-bit -- it is rotten (bug filed) · 5774faf5
Kostya Serebryany authored Nov 29, 2013
```
llvm-svn: 195958
```
5774faf5
Increase the LocatePcInTrace PC threshold now that GET_STACK_TRACE_WITH_PC_AND_BP has grown · 5ca41e38
Timur Iskhodzhanov authored Nov 29, 2013
```
llvm-svn: 195957
```
5ca41e38
Fix current stack unwinding when using DRASan · a10c46f2
Timur Iskhodzhanov authored Nov 29, 2013
```
llvm-svn: 195956
```
a10c46f2
[ASan] Also print <empty stack> when size==0 · bbf2ff81
Timur Iskhodzhanov authored Nov 29, 2013
```
llvm-svn: 195955
```
bbf2ff81

clang-format: Extend formatted ranges to subsequent line comments. · 38c82408

Daniel Jasper authored Nov 29, 2013

Before:
  int aaaa;     // This line is formatted.
                // The comment continues ..
                // .. here.

After:
  int aaaa; // This line is formatted.
            // The comment continues ..
            // .. here.

This fixes llvm.org/PR17914.

llvm-svn: 195954

38c82408

clang-format: Correctly handle Qt's Q_SLOTS. · 1556b593
Daniel Jasper authored Nov 29, 2013
```
This should fix llvm.org/PR17241. Maybe it sticks this time :-).

llvm-svn: 195953
```
1556b593

clang-format: Fix bad indentation of nested blocks. · e40caf9a

Daniel Jasper authored Nov 29, 2013

Before:
  DEBUG(  //
  { f(); });

After:
  DEBUG(  //
      { f(); });

Also add an additional test for selected formatting of individual statements
in nested blocks.

llvm-svn: 195952

e40caf9a

Adjust PPC A2 input operand latencies · 1df3205e

Hal Finkel authored Nov 29, 2013

On the PPC A2, instructions are only issued after their input operands are
ready. Model this by specifying that input operands are read at dispatch (0
cycles after issue). This changes all input operand latencies from 1 to 0.

Significant test-suite performance changes (these are 99.5% confidence
intervals on 6 runs for both before and after):

speedups:
MultiSource/Benchmarks/sim/sim
	-1.21915% +/- 0.175063%
MultiSource/Benchmarks/TSVC/LinearDependence-flt/LinearDependence-flt
	-1.23946% +/- 1.05133%
SingleSource/Benchmarks/Misc/flops-2
	-1.24237% +/- 0.681362%
MultiSource/Applications/JM/lencod/lencod
	-1.33992% +/- 0.757498%
MultiSource/Benchmarks/TSVC/InductionVariable-flt/InductionVariable-flt
	-1.51802% +/- 1.21468%
MultiSource/Benchmarks/TSVC/GlobalDataFlow-flt/GlobalDataFlow-flt
	-2.18818% +/- 1.28605%
MultiSource/Benchmarks/TSVC/Packing-flt/Packing-flt
	-2.21977% +/- 1.19499%
SingleSource/Benchmarks/BenchmarkGame/spectral-norm
	-2.29822% +/- 0.671871%
MultiSource/Benchmarks/TSVC/Packing-dbl/Packing-dbl
	-2.40975% +/- 0.355931%
SingleSource/Benchmarks/Misc/fp-convert
	-2.41899% +/- 1.04751%
MultiSource/Benchmarks/TSVC/Searching-dbl/Searching-dbl
	-2.50349% +/- 0.126765%
SingleSource/Benchmarks/Misc/flops-3
	-3.00214% +/- 0.700795%
MultiSource/Benchmarks/TSVC/LoopRestructuring-flt/LoopRestructuring-flt
	-3.56995% +/- 3.2929%
MultiSource/Applications/sgefa/sgefa
	-4.24908% +/- 2.00413%
MultiSource/Benchmarks/ASC_Sequoia/IRSmk/IRSmk
	-18.1294% +/- 3.96489%

regressions:
MultiSource/Benchmarks/TSVC/Reductions-dbl/Reductions-dbl
	1.03249% +/- 0.178547%
MultiSource/Applications/hexxagon/hexxagon
	1.16597% +/- 0.285235%
MultiSource/Benchmarks/TSVC/IndirectAddressing-flt/IndirectAddressing-flt
	1.39576% +/- 1.07855%
SingleSource/Benchmarks/Misc-C++/stepanov_v1p2
	1.71539% +/- 0.173182%
MultiSource/Benchmarks/Fhourstones-3.1/fhourstones3.1
	1.90013% +/- 0.866472%
MultiSource/Benchmarks/TSVC/Recurrences-dbl/Recurrences-dbl
	2.39854% +/- 1.05914%
MultiSource/Benchmarks/TSVC/ControlFlow-dbl/ControlFlow-dbl
	2.4402% +/- 0.817904%
MultiSource/Benchmarks/TSVC/LoopRestructuring-dbl/LoopRestructuring-dbl
	5.87997% +/- 3.3172%
MultiSource/Benchmarks/Trimaran/netbench-crc/netbench-crc
	9.02643% +/- 5.79591%
MultiSource/Benchmarks/VersaBench/bmm/bmm
	10.3517% +/- 1.227%

Obviously, there are data points on both sides of this; but I think, overall,
this supports making the change.

llvm-svn: 195951

1df3205e

Teach LocalStackSlotAllocation that stackmaps/patchpoints don't have range · 7468daad
Lang Hames authored Nov 29, 2013
```
constraints on their frame offsets.

llvm-svn: 195950
```
7468daad

Create a PPC440 SchedMachineModel · 5a7162f3

Hal Finkel authored Nov 29, 2013

Some of the older PPC processor definitions don't have associated
SchedMachineModels; correct this for the PPC440.

llvm-svn: 195949

5a7162f3

Fixup PPC440 load/store operand latencies · 4035e8d8

Hal Finkel authored Nov 29, 2013

The operand latencies for loads and stores in the PPC440 itinerary were wrong
(the store operands are all inputs, and the "with update" (pre-increment)
instructions need a latency for the additional output).

llvm-svn: 195948

4035e8d8

Adjust PPC440 operand latencies · a10bd1d2

Hal Finkel authored Nov 29, 2013

The operand latencies for the PPC440 should be specified relative to dispatch,
not relative to the initial fetch-and-decode stages. Because most instructions
(ignoring bypass) wait in dispatch until their operands are ready, this is
modeled as reading input operands "at dispatch" (0 cycles after issue), and so
every input and output operand has 4 cycles subtracted from it.

This could alter scheduling slightly, but I don't expect a large effect.

llvm-svn: 195947

a10bd1d2

Don't model the fetch and decode units for the PPC440 · dd063699

Hal Finkel authored Nov 29, 2013

Modeling the fetch and decode units in the PPC440 itinerary does not add
anything to the hazard detection capability (and so modeling them just wastes
compile time).

No functionality change intended.

llvm-svn: 195946

dd063699