Commits · 75a1729c4b474e588586594ff6b12434360306ab · Roger Ferrer / llvm-epi-0.8

Dec 12, 2013

Remove unused multiclass from PPCInstrInfo.td · fa50630e
Hal Finkel authored Dec 12, 2013
```
llvm-svn: 197100
```
fa50630e

Improve instruction scheduling for the PPC POWER7 · ceb1f12d

Hal Finkel authored Dec 12, 2013

Aside from a few minor latency corrections, the major change here is a new
hazard recognizer which focuses on better dispatch-group formation on the
POWER7. As with the PPC970's hazard recognizer, the most important thing it
does is avoid load-after-store hazards within the same dispatch group. It uses
the POWER7's special dispatch-group-terminating nop instruction (instead of
inserting multiple regular nop instructions). This new hazard recognizer makes
use of the scheduling dependency graph itself, built using AA information, to
robustly detect the possibility of load-after-store hazards.

Significant test-suite performance changes (the error bars are 99.5% confidence
intervals based on 5 test-suite runs both with and without the change --
speedups are negative):

speedups:

MultiSource/Benchmarks/FreeBench/pcompress2/pcompress2
	-0.55171% +/- 0.333168%

MultiSource/Benchmarks/TSVC/CrossingThresholds-dbl/CrossingThresholds-dbl
	-17.5576% +/- 14.598%

MultiSource/Benchmarks/TSVC/Reductions-dbl/Reductions-dbl
	-29.5708% +/- 7.09058%

MultiSource/Benchmarks/TSVC/Reductions-flt/Reductions-flt
	-34.9471% +/- 11.4391%

SingleSource/Benchmarks/BenchmarkGame/puzzle
	-25.1347% +/- 11.0104%

SingleSource/Benchmarks/Misc/flops-8
	-17.7297% +/- 9.79061%

SingleSource/Benchmarks/Shootout-C++/ary3
	-35.5018% +/- 23.9458%

SingleSource/Regression/C/uint64_to_float
	-56.3165% +/- 25.4234%

SingleSource/UnitTests/Vectorizer/gcc-loops
	-18.5309% +/- 6.8496%

regressions:

MultiSource/Benchmarks/ASCI_Purple/SMG2000/smg2000
	18.351% +/- 12.156%

SingleSource/Benchmarks/Shootout-C++/methcall
	27.3086% +/- 14.4733%

llvm-svn: 197099

ceb1f12d

Fix an over-constrained assertion in MachineFunction::addLiveIn. · 18b779e3

Quentin Colombet authored Dec 12, 2013

The assertion was checking that the virtual register VReg used to represent the
physical register PReg uses the same register class as the one passed to
MachineFunction::addLiveIn.
This is over-constraining because it is sufficient to check that the register
class of VReg (VRegRC) is a subclass of the register class of PReg (PRegRC) and
that VRegRC contains PReg.
Indeed, if VReg gets constrained because of some operation constraints
between two calls of MachineFunction::addLiveIn, the original assertion
cannot match.

This fixes <rdar://problem/15633429>. 

llvm-svn: 197097

18b779e3
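
As an illustration of the relaxed condition described in the addLiveIn commit above, here is a minimal sketch that models register classes as plain sets of register numbers. The `RegClass` struct, `strictCheck`, and `relaxedCheck` are hypothetical stand-ins and not the LLVM types or the actual assertion; only the logic (VRegRC contains PReg, and VRegRC is a subclass of PRegRC) is taken from the commit message.

```cpp
#include <algorithm>
#include <cassert>
#include <set>

using PhysReg = unsigned;

// Hypothetical model: a register class is just the set of registers it holds.
struct RegClass {
  std::set<PhysReg> Regs;

  bool contains(PhysReg R) const { return Regs.count(R) != 0; }

  // True if Sub is this class or a subset of it (std::set iterates in sorted
  // order, which is what std::includes requires).
  bool hasSubClassEq(const RegClass &Sub) const {
    return std::includes(Regs.begin(), Regs.end(),
                         Sub.Regs.begin(), Sub.Regs.end());
  }
};

// The over-constrained form: the two classes must be identical.
bool strictCheck(const RegClass &VRegRC, const RegClass &PRegRC) {
  return VRegRC.Regs == PRegRC.Regs;
}

// The relaxed form: VRegRC may have been narrowed between two calls, as long
// as it still contains PReg and is a subclass of PRegRC.
bool relaxedCheck(const RegClass &VRegRC, const RegClass &PRegRC,
                  PhysReg PReg) {
  return VRegRC.contains(PReg) && PRegRC.hasSubClassEq(VRegRC);
}
```

In this model, a virtual register whose class was constrained from {0,1,2,3} down to {0,1} fails the strict check but still passes the relaxed one for physical register 1.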

Expose FileCheck's AddFixedStringToRegEx as Regex::escape · 6f4f77b7

Hans Wennborg authored Dec 12, 2013

Both FileCheck and clang's -verify need to escape strings for regexes,
so let's expose this as a utility in the Regex class.

llvm-svn: 197096

6f4f77b7
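
To make the Regex::escape commit above concrete, here is a rough sketch of what such an escaping helper might look like. The function name `escapeRegex` and the exact metacharacter set are assumptions chosen for illustration; this is not the LLVM implementation.

```cpp
#include <cassert>
#include <string>

// Sketch: backslash-escape every regex metacharacter so the result matches
// the input string literally. The metacharacter list is an assumption.
std::string escapeRegex(const std::string &Str) {
  static const std::string Meta = "()^$|*+?.[]\\{}";
  std::string Out;
  Out.reserve(Str.size());
  for (char C : Str) {
    if (Meta.find(C) != std::string::npos)
      Out += '\\';
    Out += C;
  }
  return Out;
}
```

For example, `escapeRegex("a.b*")` yields `a\.b\*`, which a regex engine would then match only against the literal text `a.b*`.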

[AArch64] Refactor NEON floating-point Max/Min/Maxnm/Minnm across vector AArch64 · 446d8ea0
Chad Rosier authored Dec 11, 2013
```
intrinsics to use f32 types, rather than their vector equivalents.

llvm-svn: 197090
```
446d8ea0

Fix the PPC subsumes-predicate check · 94a6f380

Hal Finkel authored Dec 11, 2013

For one predicate to subsume another, they must both check the same condition
register. Failure to check this prerequisite was causing miscompiles.

Fixes PR18003.

llvm-svn: 197089

94a6f380

Dec 11, 2013

Add two additional hazard recognizer functions · 4fd3b1de

Hal Finkel authored Dec 11, 2013

This adds two additional functions to the hazard recognizer interface. These
are optional (in the sense that the default implementations preserve the
current behavior), and used by the post-RA scheduler. Upcoming commits will use
this functionality in order to improve dispatch-group formation on the POWER7
and related cores. Dispatch groups are an odd construct: sometimes we need to
insert nops to force a new one to start (for performance reasons), and some
instructions need to appear in certain positions within a group, but the groups
are not fundamentally cycle based (they can contain instructions with data
dependencies with non-trivial latencies).

Motivation:

unsigned PreEmitNoops(SUnit *) - Used to force the post-RA scheduler to insert
nops to force a new dispatch group to begin. We already have a NoopHazard, and
this is also still needed. However, NoopHazard only causes a nop to be inserted
if there are no other available instructions, and so is not always sufficient.
The number of nops to insert depends on state that only the hazard recognizer
has, so a general callback is necessary.

bool ShouldPreferAnother(SUnit *) - Used to avoid scheduling instructions that
would start a new dispatch group when others are available that could be part
of the current dispatch group. In this case, we don't want to issue nops,
because the non-preferred instruction will implicitly start a new dispatch
group regardless.

Although the motivation for these functions is driven by the PowerPC backend,
they are completely general.

llvm-svn: 197084

4fd3b1de
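
The two callbacks described in the hazard recognizer commit above can be sketched with a small mock. Only the two signatures, `unsigned PreEmitNoops(SUnit *)` and `bool ShouldPreferAnother(SUnit *)`, come from the commit message. The `SUnit` stand-in, its two flags, the class names, and the one-nop-ends-a-group rule are hypothetical and exist only to show how a target might override the defaults.

```cpp
#include <cassert>

// Hypothetical stand-in for the scheduler's SUnit; both flags are invented.
struct SUnit {
  bool NeedsFreshGroup;       // must not share a group with what came before
  bool ImplicitlyStartsGroup; // the hardware begins a new group for it anyway
};

// Mock interface. The defaults preserve existing behavior: no extra nops and
// no preference between available instructions.
class HazardRecognizerSketch {
public:
  virtual ~HazardRecognizerSketch() = default;
  virtual unsigned PreEmitNoops(SUnit *) { return 0; }
  virtual bool ShouldPreferAnother(SUnit *) { return false; }
};

// Toy target recognizer that tracks how full the current dispatch group is.
class ToyGroupRecognizer : public HazardRecognizerSketch {
  unsigned InGroup = 0; // instructions issued into the current group

public:
  void EmitInstruction(SUnit *SU) {
    if (SU->ImplicitlyStartsGroup)
      InGroup = 0; // the instruction opens a group of its own
    ++InGroup;
  }

  // Toy assumption: a single nop terminates the current group.
  void EmitNoop() { InGroup = 0; }

  // Only the recognizer knows the group state, so it reports the nop count.
  unsigned PreEmitNoops(SUnit *SU) override {
    return (InGroup != 0 && SU->NeedsFreshGroup) ? 1 : 0;
  }

  // Prefer something that still fits in the current group, if available.
  bool ShouldPreferAnother(SUnit *SU) override {
    return InGroup != 0 && SU->ImplicitlyStartsGroup;
  }
};
```

A scheduler using this sketch would query `ShouldPreferAnother` while picking a candidate and `PreEmitNoops` just before emitting the chosen one, which keeps the group-tracking state inside the recognizer.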

On ELF and COFF treat linker_private like private. · 2b5a0c9e

Rafael Espindola authored Dec 11, 2013

The linkers on these systems don't have anything special to do with these
symbols. Since the intent is for them to be absent from the final object,
just treat them as private.

llvm-svn: 197080

2b5a0c9e

Revert "DebugInfo: Move type units into the debug_types section with... · 727747eb

David Blaikie authored Dec 11, 2013

Revert "DebugInfo: Move type units into the debug_types section with appropriate comdat grouping and type unit headers"

This reverts commit r197073.

The test seems to be failing on some buildbots for unknown reasons.
Reverting until I can figure that out. If anyone's got a reproduction
(.s and .o together would be great) - I'd really appreciate it.

llvm-svn: 197079

727747eb

DebugInfo: Move type units into the debug_types section with appropriate... · 4fe3c00e

David Blaikie authored Dec 11, 2013

DebugInfo: Move type units into the debug_types section with appropriate comdat grouping and type unit headers

This commit does not complete the type units feature - there are issues
around fission support (skeletal type units, pubtypes/pubnames) and
hashing of some types including those containing references to types in
other type units.

llvm-svn: 197073

4fe3c00e

DwarfUnit: LLVM_OVERRIDE and constify some functions · 3332d4c7
David Blaikie authored Dec 11, 2013
```
llvm-svn: 197072
```
3332d4c7
[AArch64] Add NEON scalar floating-point compare LLVM AArch64 intrinsics that · 088f93d4
Chad Rosier authored Dec 11, 2013
```
use f32/f64 types, rather than their vector equivalents.

llvm-svn: 197068
```
088f93d4

[AArch64] Refactor the NEON scalar floating-point reciprocal step and · 473a01e1

Chad Rosier authored Dec 11, 2013

floating-point reciprocal square root step LLVM AArch64 intrinsics to
use f32/f64 types, rather than their vector equivalents.

llvm-svn: 197067

473a01e1

[AArch64] Refactor the NEON scalar floating-point reciprocal estimate, floating- · 7098fcc0

Chad Rosier authored Dec 11, 2013

point reciprocal exponent, and floating-point reciprocal square root estimate
LLVM AArch64 intrinsics to use f32/f64 types, rather than their vector
equivalents.

llvm-svn: 197066

7098fcc0

Don't set unused variable. · 009e7586
Rafael Espindola authored Dec 11, 2013
```
llvm-svn: 197064
```
009e7586

R600: Re-format Processors.td · d7e146ed

Tom Stellard authored Dec 11, 2013

This makes it a little easier to read.

Reviewed-by: Vincent Lejeune <vljn at ovi.com>
llvm-svn: 197058

d7e146ed

R600: Register AMDGPUCFGStructurizer pass · f2ba972a

Tom Stellard authored Dec 11, 2013

This enables -print-before-all to dump MachineInstrs after it is run.

Reviewed-by: Vincent Lejeune <vljn at ovi.com>
llvm-svn: 197057

f2ba972a

R600: Register R600EmitClauseMarkers pass · 1de5582d

Tom Stellard authored Dec 11, 2013

This enables -print-before-all to dump MachineInstrs after it is run.

Reviewed-by: Vincent Lejeune <vljn at ovi.com>
llvm-svn: 197056

1de5582d

[arm] Implement ARM .arch directive. · 439e8f9e
Logan Chien authored Dec 11, 2013
```
llvm-svn: 197052
```
439e8f9e
SelectionDAG: Fix a typo. · 671a5962
Benjamin Kramer authored Dec 11, 2013
```
Found by "cppcheck". PR18208.

llvm-svn: 197047
```
671a5962

ARM: constrain register-class in fast-isel · 76fc8a4c

Tim Northover authored Dec 11, 2013

The tests were no longer using fast-isel at all (MachO needs an "ios" rather
than "darwin" triple at the moment and Linux needs ARM mode). Once that was
corrected, the verifier complained about a t2ADDri created for the alloca.

llvm-svn: 197046

76fc8a4c

Build fix for Android NDK which has neither futimes nor futimens · b30f01ee
Alp Toker authored Dec 11, 2013
```
Based on a patch by Neil Henning!

llvm-svn: 197045
```
b30f01ee

AVX-512: Removed "z" suffix from AVX-512 instructions, since it is incompatible with GCC. · cf088098

Elena Demikhovsky authored Dec 11, 2013

I moved a test from avx512-vbroadcast-crash.ll to avx512-vbroadcast.ll.
I defined the HasAVX512 predicate as an AssemblerPredicate. This means that you should invoke llvm-mc with "-mcpu=knl" to get the encoding for AVX-512 instructions. I need this so that AsmMatcher can set different encodings for AVX and AVX-512 instructions that have the same mnemonic and operands (all scalar instructions).

llvm-svn: 197041

cf088098

[SystemZ] Optimize fcmp X, 0 in cases where X is also negated · 73170f84

Richard Sandiford authored Dec 11, 2013

In such cases it's often better to test the result of the negation instead,
since the negation also sets CC.

llvm-svn: 197032

73170f84

Extend (truncate (load)) folding · d1093636

Richard Sandiford authored Dec 11, 2013

DAGCombiner could fold (truncate (load)) -> smaller load if the original
load was the width of the truncation result or wider.  This patch extends
it to handle cases where the original load was narrower (and so the
extension type stays the same).

llvm-svn: 197030

d1093636
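
The newly handled case in the (truncate (load)) commit above can be checked with plain integer arithmetic. The sketch below assumes a sign-extending load of an 8-bit value: extending to 64 bits and then truncating to 32 bits gives the same result as extending straight to 32 bits, so the extension type is unchanged and only the result width shrinks. The function names are hypothetical and this is ordinary C++, not DAGCombiner code.

```cpp
#include <cassert>
#include <cstdint>

// Models (truncate:i32 (sextload:i64 from i8)): load 8 bits, sign-extend to
// 64 bits, then truncate to 32 bits.
uint32_t truncOfWideSextLoad(const int8_t *P) {
  int64_t Wide = *P;                  // sign-extending load to 64 bits
  return static_cast<uint32_t>(Wide); // truncate to 32 bits
}

// Models the folded form (sextload:i32 from i8): load 8 bits and sign-extend
// directly to 32 bits.
uint32_t narrowSextLoad(const int8_t *P) {
  int32_t Narrow = *P;                  // sign-extending load to 32 bits
  return static_cast<uint32_t>(Narrow);
}
```

Both functions agree for every 8-bit input because the truncation only discards copies of the sign bit that the wider extension added.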

Add TargetRegisterInfo::reverseLocalAssignment hook. · 2d8826a1

Andrew Trick authored Dec 11, 2013

This hook reverses the order of assignment for local live ranges. This
will generally allocate shorter local live ranges first. For targets with
many registers, this could reduce regalloc compile time by a large
factor. It should still achieve optimal coloring; however, it can change
register eviction decisions. It is disabled by default for two reasons:
(1) Top-down allocation is simpler and easier to debug for targets that
don't benefit from reversing the order.
(2) Bottom-up allocation could result in poor eviction decisions on some
targets affecting the performance of compiled code.

llvm-svn: 197001

2d8826a1

Distinguish and choose 16 or 32 bit forms of save/restore for Mips16. · 5bde5c35
Reed Kotler authored Dec 11, 2013
```
llvm-svn: 196999
```
5bde5c35
[AArch64 NEON] Get instruction BSL matched to VSELECT. · 310b6c08
Kevin Qin authored Dec 11, 2013
```
llvm-svn: 196998
```
310b6c08
Move mips' datalayout computation out of line and add comments. · b2fb78d4
Rafael Espindola authored Dec 11, 2013
```
llvm-svn: 196996
```
b2fb78d4
Move Sparc's getDataLayout out of line and add comments. · 60f48e5a
Rafael Espindola authored Dec 11, 2013
```
llvm-svn: 196990
```
60f48e5a
Prune redundant dependencies in LLVMBuild.txt. · 8bc9bfaa
NAKAMURA Takumi authored Dec 11, 2013
```
llvm-svn: 196988
```
8bc9bfaa
Move PPC's getDataLayoutString out of line and document it better. · 5b358587
Rafael Espindola authored Dec 11, 2013
```
llvm-svn: 196987
```
5b358587

Revert the backend fatal error from r196939 · ad92aca4

Reid Kleckner authored Dec 10, 2013

The combination of inline asm, stack realignment, and dynamic allocas
turns out to be too common to reject out of hand.

ASan inserts empty inline asm fragments and uses aligned allocas.
Compiling any trivial function containing a dynamic alloca with ASan is
enough to trigger the check.

XFAIL the test cases that would be miscompiled and add one that uses the
relevant functionality.

llvm-svn: 196986

ad92aca4

Dec 10, 2013
- Refactor the computation of the x86 datalayout. · 002f8aa5
  Rafael Espindola authored Dec 10, 2013
```
llvm-svn: 196976
```
  002f8aa5
- [asan] Fix the coverage.cc test broken by r196939 · 30b2a9a5
  Reid Kleckner authored Dec 10, 2013
```
It was failing because ASan was adding all of the following to one
function:
- dynamic alloca
- stack realignment
- inline asm

This patch avoids making the static alloca dynamic when coverage is
used.

ASan should probably not be inserting empty inline asm blobs to inhibit
duplicate tail elimination.

llvm-svn: 196973
```
  30b2a9a5
- Use llvm_unreachable instead of assert(0) · eaa3a7ef
  Matt Arsenault authored Dec 10, 2013
```
llvm-svn: 196971
```
  eaa3a7ef
- on darwin<10, fall back to .weak_definition (PPC,X86) · 1b01849f
  David Fang authored Dec 10, 2013
```
.weak_def_can_be_hidden was not yet supported by the system assembler

llvm-svn: 196970
```
  1b01849f
- [AArch64] Refactor the NEON floating-point absolute difference LLVM AArch64 · f70af216
  Chad Rosier authored Dec 10, 2013
```
intrinsic to use f32/f64 types, rather than their vector equivalents.

llvm-svn: 196965
```
  f70af216
- [AArch64] Refactor the NEON signed/unsigned floating-point convert to fixed-point · 07cc3f91
  Chad Rosier authored Dec 10, 2013
```
LLVM AArch64 intrinsics to use f32/f64, rather than their vector equivalents.

llvm-svn: 196964
```
  07cc3f91
- [AArch64] Overload NEON signed/unsigned floating-point convert to fixed-point · 98b8baa3
  Chad Rosier authored Dec 10, 2013
```
and fixed-point convert to floating-point LLVM AArch64 intrinsics.

llvm-svn: 196963
```
  98b8baa3