- Dec 12, 2013
-
-
Hal Finkel authored
llvm-svn: 197100
-
Hal Finkel authored
Aside from a few minor latency corrections, the major change here is a new hazard recognizer which focuses on better dispatch-group formation on the POWER7. As with the PPC970's hazard recognizer, the most important thing it does is avoid load-after-store hazards within the same dispatch group. It uses the POWER7's special dispatch-group-terminating nop instruction (instead of inserting multiple regular nop instructions). This new hazard recognizer makes use of the scheduling dependency graph itself, built using AA information, to robustly detect the possibility of load-after-store hazards. significant test-suite performance changes (the error bars are 99.5% confidence intervals based on 5 test-suite runs both with and without the change -- speedups are negative): speedups: MultiSource/Benchmarks/FreeBench/pcompress2/pcompress2 -0.55171% +/- 0.333168% MultiSource/Benchmarks/TSVC/CrossingThresholds-dbl/CrossingThresholds-dbl -17.5576% +/- 14.598% MultiSource/Benchmarks/TSVC/Reductions-dbl/Reductions-dbl -29.5708% +/- 7.09058% MultiSource/Benchmarks/TSVC/Reductions-flt/Reductions-flt -34.9471% +/- 11.4391% SingleSource/Benchmarks/BenchmarkGame/puzzle -25.1347% +/- 11.0104% SingleSource/Benchmarks/Misc/flops-8 -17.7297% +/- 9.79061% SingleSource/Benchmarks/Shootout-C++/ary3 -35.5018% +/- 23.9458% SingleSource/Regression/C/uint64_to_float -56.3165% +/- 25.4234% SingleSource/UnitTests/Vectorizer/gcc-loops -18.5309% +/- 6.8496% regressions: MultiSource/Benchmarks/ASCI_Purple/SMG2000/smg2000 18.351% +/- 12.156% SingleSource/Benchmarks/Shootout-C++/methcall 27.3086% +/- 14.4733% llvm-svn: 197099
-
Quentin Colombet authored
The assertion was checking that the virtual register VReg used to represent the physical register PReg uses the same register class as the one passed to MachineFunction::addLiveIn. This is over-constraining because it is sufficient to check that the register class of VReg (VRegRC) is a subclass of the register class of PReg (PRegRC) and that VRegRC contains PReg. Indeed, if VReg gets constrained because of some operation constraints between two calls of MachineFunction::addLiveIn, the original assertion cannot match. This fixes <rdar://problem/15633429>. llvm-svn: 197097
-
Hans Wennborg authored
Both FileCheck and clang's -verify need to escape strings for regexes, so let's expose this as a utility in the Regex class. llvm-svn: 197096
-
Chad Rosier authored
intrinsics to use f32 types, rather than their vector equivalents. llvm-svn: 197090
-
Hal Finkel authored
For one predicate to subsume another, they must both check the same condition register. Failure to check this prerequisite was causing miscompiles. Fixes PR18003. llvm-svn: 197089
-
- Dec 11, 2013
-
-
Hal Finkel authored
This adds two additional functions to the hazard recognizer interface. These are optional (in the sense that the default implementations preserve the current behavior), and used by the post-RA scheduler. Upcoming commits will use this functionality in order to improve dispatch-group formation on the POWER7 and related cores. Dispatch groups are an odd construct: sometimes we need to insert nops to force a new one to start (for performance reasons), and some instructions need to appear in certain positions within a group, but the groups are not fundamentally cycle based (they can contain instructions with data dependencies with non-trivial latencies). Motivation: unsigned PreEmitNoops(SUnit *) - Used to force the post-RA scheduler to insert nops to force a new dispatch group to begin. We already have a NoopHazard, and this is also still needed. However, NoopHazard only causes a nop to be inserted if there are no other available instructions, and so is not always sufficient. The number of nops to insert depends on state that only the hazard recognizer has, so a general callback is necessary. bool ShouldPreferAnother(SUnit *) - Used to avoid scheduling instructions that would start a new dispatch group when others are available that could be part of the current dispatch group. In this case, we don't want to issue nops, because the non-preferred instruction will implicitly start a new dispatch group regardless. Although the motivation for these functions is driven by the PowerPC backend, they are completely general. llvm-svn: 197084
-
Rafael Espindola authored
The linkers on these systems don't have anything special to do with these symbols. Since the intent is for them to be absent from the final object, just treat them as private. llvm-svn: 197080
-
David Blaikie authored
Revert "DebugInfo: Move type units into the debug_types section with appropriate comdat grouping and type unit headers" This reverts commit r197073. The test seems to be failing on some buildbots for unknown reasons. Reverting until I can figure that out. If anyone's got a reproduction (.s and .o together would be great) - I'd really appreciate it. llvm-svn: 197079
-
David Blaikie authored
DebugInfo: Move type units into the debug_types section with appropriate comdat grouping and type unit headers This commit does not complete the type units feature - there are issues around fission support (skeletal type units, pubtypes/pubnames) and hashing of some types including those containing references to types in other type units. llvm-svn: 197073
-
David Blaikie authored
llvm-svn: 197072
-
Chad Rosier authored
use f32/f64 types, rather than their vector equivalents. llvm-svn: 197068
-
Chad Rosier authored
floating-point reciprocal square root step LLVM AArch64 intrinsics to use f32/f64 types, rather than their vector equivalents. llvm-svn: 197067
-
Chad Rosier authored
point reciprocal exponent, and floating-point reciprocal square root estimate LLVM AArch64 intrinsics to use f32/f64 types, rather than their vector equivalents. llvm-svn: 197066
-
Rafael Espindola authored
llvm-svn: 197064
-
Tom Stellard authored
This makes it a little easier to read. Reviewed-by: Vincent Lejeune <vljn at ovi.com> llvm-svn: 197058
-
Tom Stellard authored
This enables -print-before-all to dump MachineInstrs after it is run. Reviewed-by: Vincent Lejeune <vljn at ovi.com> llvm-svn: 197057
-
Tom Stellard authored
This enables -print-before-all to dump MachineInstrs after it is run. Reviewed-by: Vincent Lejeune <vljn at ovi.com> llvm-svn: 197056
-
Logan Chien authored
llvm-svn: 197052
-
Benjamin Kramer authored
Found by "cppcheck". PR18208. llvm-svn: 197047
-
Tim Northover authored
The tests were no longer using fast-isel at all (MachO needs an "ios" rather than "darwin" triple at the moment and Linux needs ARM mode). Once that was corrected, the verifier complained about a t2ADDri created for the alloca. llvm-svn: 197046
-
Alp Toker authored
Based on a patch by Neil Henning! llvm-svn: 197045
-
Elena Demikhovsky authored
I moved a test from avx512-vbroadcast-crash.ll to avx512-vbroadcast.ll I defined HasAVX512 predicate as AssemblerPredicate. It means that you should invoke llvm-mc with "-mcpu=knl" to get encoding for AVX-512 instructions. I need this to let AsmMatcher to set different encoding for AVX and AVX-512 instructions that have the same mnemonic and operands (all scalar instructions). llvm-svn: 197041
-
Richard Sandiford authored
In such cases it's often better to test the result of the negation instead, since the negation also sets CC. llvm-svn: 197032
-
Richard Sandiford authored
DAGCombiner could fold (truncate (load)) -> smaller load if the original load was the width of the truncation result or wider. This patch extends it to handle cases where the original load was narrower (and so the extension type stays the same). llvm-svn: 197030
-
Andrew Trick authored
This hook reverses the order of assignment for local live ranges. This will generally allocate shorter local live ranges first. For targets with many registers, this could reduce regalloc compile time by a large factor. It should still achieve optimal coloring; however, it can change register eviction decisions. It is disabled by default for two reasons: (1) Top-down allocation is simpler and easier to debug for targets that don't benefit from reversing the order. (2) Bottom-up allocation could result in poor evicition decisions on some targets affecting the performance of compiled code. llvm-svn: 197001
-
Reed Kotler authored
llvm-svn: 196999
-
Kevin Qin authored
llvm-svn: 196998
-
Rafael Espindola authored
llvm-svn: 196996
-
Rafael Espindola authored
llvm-svn: 196990
-
NAKAMURA Takumi authored
llvm-svn: 196988
-
Rafael Espindola authored
llvm-svn: 196987
-
Reid Kleckner authored
The combination of inline asm, stack realignment, and dynamic allocas turns out to be too common to reject out of hand. ASan inserts empy inline asm fragments and uses aligned allocas. Compiling any trivial function containing a dynamic alloca with ASan is enough to trigger the check. XFAIL the test cases that would be miscompiled and add one that uses the relevant functionality. llvm-svn: 196986
-
- Dec 10, 2013
-
-
Rafael Espindola authored
llvm-svn: 196976
-
Reid Kleckner authored
It was failing because ASan was adding all of the following to one function: - dynamic alloca - stack realignment - inline asm This patch avoids making the static alloca dynamic when coverage is used. ASan should probably not be inserting empty inline asm blobs to inhibit duplicate tail elimination. llvm-svn: 196973
-
Matt Arsenault authored
llvm-svn: 196971
-
David Fang authored
.weak_def_can_be_hidden was not yet supported by the system assembler llvm-svn: 196970
-
Chad Rosier authored
intrinsic to use f32/f64 types, rather than their vector equivalents. llvm-svn: 196965
-
Chad Rosier authored
LLVM AArch64 intrinsics to use f32/f64, rather than their vector equivalents. llvm-svn: 196964
-
Chad Rosier authored
and fixed-point convert to floating-point LLVM AArch64 intrinsics. llvm-svn: 196963
-