- Dec 02, 2013
-
-
NAKAMURA Takumi authored
llvm-svn: 196093
-
Robert Lytton authored
eliminateFrameIndex() has been reworked to handle both small & large frames with either a FP or SP. An additional Slot is required for Scavenging spills when not using FP for large frames. Reworked the handling of Register Scavenging. Whether we are using an FP or not, whether it is a large frame or not, and whether we are using a large code model or not are now independent. llvm-svn: 196091
-
Tim Northover authored
These are used by MachO only at the moment, and (much like the existing MOVW/MOVT set) work around the fact that the labels used in the actual instructions often contain PC-dependent components, which means that repeatedly materialising the same global can't be CSEed. With small modifications, it could be adapted to how ELF finds the address of _GLOBAL_OFFSET_TABLE_, which would give similar benefits in PIC mode there. llvm-svn: 196090
-
Benjamin Kramer authored
llvm-svn: 196089
-
Robert Lytton authored
llvm-svn: 196088
-
Robert Lytton authored
When using the large code model, global objects larger than 'CodeModelLargeSize' bytes are placed in sections named with a trailing ".large". The folded global address of such objects is lowered into the const pool. During inspection it was noted that LowerConstantPool() was using a default offset of zero. A fix was made, but because only offsets of zero are generated, testing only verifies that the change is not detrimental. Correct the flags emitted for explicitly specified sections. We assume the size of the object queried by getSectionForConstant() is never greater than CodeModelLargeSize. To handle sizes greater than CodeModelLargeSize, changes to AsmPrinter would be required. llvm-svn: 196087
-
Robert Lytton authored
llvm-svn: 196086
-
Robert Lytton authored
Large frame offsets are loaded from the ConstantPool. Where possible, offsets are encoded using the smaller MKMSK instruction. Large frame offsets can only be used when there is a frame-pointer. llvm-svn: 196085
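The "where possible" condition for MKMSK can be illustrated with a small sketch. It assumes MKMSK materialises a value whose low n bits are set, so an offset qualifies only if it has the form (1 << n) - 1. The function name is hypothetical, and the real instruction may accept only a subset of widths, which this sketch does not model.

```python
from typing import Optional


def mask_width(value: int) -> Optional[int]:
    """Return n if value == (1 << n) - 1 for some n in 1..32, else None.

    Hypothetical helper: it only tests the bit pattern and does not
    model which widths the instruction encoding actually allows.
    """
    if value <= 0 or value > 0xFFFFFFFF:
        return None
    # A contiguous run of low set bits plus one is a power of two,
    # so it shares no bits with the original value.
    if value & (value + 1) != 0:
        return None
    return value.bit_length()


if __name__ == "__main__":
    for offset in (0xFF, 0xFFFF, 0x100):
        print(hex(offset), mask_width(offset))
```

Offsets that fail the test would fall back to the ConstantPool load described in the commit message.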
-
Robert Lytton authored
llvm-svn: 196084
-
Kostya Serebryany authored
[tsan] fix instrumentation of vector vptr updates (https://code.google.com/p/thread-sanitizer/issues/detail?id=43) llvm-svn: 196079
-
Alp Toker authored
* Update build instructions to reflect the current source tree layout. * Don't inflict CVS on readers; there's a perfectly good git mirror. * configure with --disable-werror making it possible to build using clang. * ar and nm-new now support the -plugin option. llvm-svn: 196069
-
Rafael Espindola authored
llvm-svn: 196068
-
Rafael Espindola authored
llvm-svn: 196067
-
Rafael Espindola authored
llvm-svn: 196066
-
Rafael Espindola authored
llvm-svn: 196065
-
Alp Toker authored
llvm-svn: 196064
-
Rafael Espindola authored
llvm-svn: 196063
-
Rafael Espindola authored
The PPC GetSymbolFromOperand already prefixed stubs of MO_ExternalSymbol, so this should be a nop. llvm-svn: 196059
-
- Dec 01, 2013
-
-
Rafael Espindola authored
llvm-svn: 196052
-
Andrew Trick authored
llvm-svn: 196051
-
Tim Northover authored
Previously, we clobbered callee-saved registers when folding an "add sp, #N" into a "pop {rD, ...}" instruction. This change checks whether a register we're going to add to the "pop" could actually be live outside the function before doing so and should fix the issue. This should fix PR18081. llvm-svn: 196046
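The check described above can be sketched as follows. This is a simplified illustration, not the code from the commit: the function and parameter names are hypothetical, registers are plain strings, and ARM register-list ordering and encoding constraints are ignored.

```python
def pick_extra_pop_regs(bytes_to_fold, pop_regs, scratch_candidates,
                        maybe_live_out, word_size=4):
    """Choose registers to add to a pop so it absorbs `add sp, #bytes_to_fold`.

    Returns the list of extra registers, or None when folding is unsafe.
    All names here are hypothetical and exist only for illustration.
    """
    if bytes_to_fold % word_size != 0:
        return None
    needed = bytes_to_fold // word_size
    chosen = []
    for reg in scratch_candidates:
        if len(chosen) == needed:
            break
        # Popping into a register overwrites it, so skip anything already
        # in the pop list or possibly live after the function returns.
        if reg in pop_regs or reg in maybe_live_out:
            continue
        chosen.append(reg)
    return chosen if len(chosen) == needed else None


if __name__ == "__main__":
    # r0 may hold the return value, so it must not be clobbered.
    print(pick_extra_pop_regs(8, ["r4", "pc"], ["r0", "r1", "r2", "r3"], {"r0"}))
```

When not enough safe registers are available, the sketch returns None and the separate stack adjustment would have to stay.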
-
Benjamin Kramer authored
- Actually abort when an error occurred. - Check that the frontend lookup worked when parsing length/size/type operators. Tested by a clang test. PR18096. llvm-svn: 196044
-
Michael Kuperstein authored
llvm-svn: 196042
-
Bill Wendling authored
llvm-svn: 196006
-
Bill Wendling authored
error: invalid conversion from 'unsigned char' to '{anonymous}::Sequence' llvm-svn: 196004
-
- Nov 30, 2013
-
-
Hal Finkel authored
This adds a scheduling model for the POWER7 (P7) core, and enables the machine-instruction scheduler when targeting the P7. Scheduling for the P7, like earlier ooo PPC cores, requires considering both dispatch group hazards, and functional unit resources and latencies. These are both modeled in a combined itinerary. Dispatch group formation is still handled by the post-RA scheduler (which still needs to be updated for the P7, but nevertheless does a pretty good job).

One interesting aspect of this change is that I've also enabled the use of AA during CodeGen for the P7 (just as it is for the embedded cores). The benchmark results seem to support this decision (see below), and while this is normally useful for in-order cores, and not for ooo cores like the P7, I think that the dispatch slot hazards are enough like in-order resources to make the AA useful.

Test suite significant performance differences (where negative is a speedup, and positive is a regression) vs. the current situation:

MultiSource/Benchmarks/BitBench/drop3/drop3
  with AA: N/A
  without AA: -28.7614% +/- 19.8356%
  (significantly against AA)

MultiSource/Benchmarks/FreeBench/neural/neural
  with AA: -17.7406% +/- 11.2712%
  without AA: N/A
  (significantly in favor of AA)

MultiSource/Benchmarks/SciMark2-C/scimark2
  with AA: -11.2079% +/- 1.80543%
  without AA: -11.3263% +/- 2.79651%

MultiSource/Benchmarks/TSVC/Symbolics-flt/Symbolics-flt
  with AA: -41.8649% +/- 17.0053%
  without AA: -34.5256% +/- 23.7072%

MultiSource/Benchmarks/mafft/pairlocalalign
  with AA: 25.3016% +/- 17.8614%
  without AA: 38.6629% +/- 14.9391%
  (significantly in favor of AA)

MultiSource/Benchmarks/sim/sim
  with AA: N/A
  without AA: 13.4844% +/- 7.18195%
  (significantly in favor of AA)

SingleSource/Benchmarks/BenchmarkGame/Large/fasta
  with AA: 15.0664% +/- 6.70216%
  without AA: 12.7747% +/- 8.43043%

SingleSource/Benchmarks/BenchmarkGame/puzzle
  with AA: 82.2713% +/- 26.3567%
  without AA: 75.7525% +/- 41.1842%

SingleSource/Benchmarks/Misc/flops-2
  with AA: -37.1621% +/- 20.7964%
  without AA: -35.2342% +/- 20.2999%
  (significantly in favor of AA)

These are 99.5% confidence intervals from 5 runs per configuration. Regarding the choice to turn on AA during CodeGen, of these results, four seem significantly in favor of using AA, and one seems significantly against. I'm not making this decision based on these numbers alone, but these results seem consistent with results I have from other tests, and so I think that, on balance, using AA is a win. llvm-svn: 195981
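One way to read the "mean +/- half-width" figures above is sketched below. The commit message does not say how significance was decided, so this assumes a change counts as significant when its confidence interval excludes zero. The function name and labels are hypothetical.

```python
def classify(mean_pct, half_width_pct):
    """Classify a relative runtime change given as mean +/- half-width.

    Assumption (not stated in the commit message): a result is treated
    as significant only when the interval [mean - hw, mean + hw]
    excludes zero. Negative means a speedup, positive a regression.
    """
    if abs(mean_pct) <= half_width_pct:
        return "not significant"
    return "speedup" if mean_pct < 0 else "regression"


if __name__ == "__main__":
    # Two figures quoted in the commit message.
    print(classify(-28.7614, 19.8356))
    print(classify(25.3016, 17.8614))
```

Under this reading, an "N/A" entry would correspond to an interval that includes zero, although that interpretation is also an assumption.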
-
Hal Finkel authored
In preparation for adding scheduling definitions for the POWER7, split some PPC itinerary classes so that the P7's latencies and hazards can be better described. For the most part, this means differentiating indexed from non-indexed pre-increment loads and stores. Also, differentiate single from double-precision sqrt. No functionality change intended (except for a more-specific latency for single-precision sqrt on the A2). llvm-svn: 195980
-
Hal Finkel authored
Convert this test to FileCheck, and improve it to check for the instructions it is trying to exclude instead of checking for register use (especially because grepping for r1 can be thrown off, for example, by a use of r12). llvm-svn: 195979
-
Hal Finkel authored
Use CHECK-DAG to make these regression tests more resilient against changes in instruction scheduling. llvm-svn: 195978
-
Hal Finkel authored
Some of these tests did not specify a cpu but were also sensitive to instruction scheduling and/or register assignment choices. A few other similarly sensitive tests specified a cpu (often the POWER7), and while the P7 currently uses the default model for PPC64, this will soon change. For those tests which should not really be cpu-dependent anyway, the cpu is set to the generic 'ppc64'. llvm-svn: 195977
-
Zoran Jovanovic authored
llvm-svn: 195976
-
Zoran Jovanovic authored
llvm-svn: 195975
-
Daniel Sanders authored
This prevents the compiler from emitting invalid ld.[bhwd]'s and st.[bhwd]'s when the stack frame is between 512 and 32,768 bytes in size. llvm-svn: 195973
-
Daniel Sanders authored
No functional change. An if-statement has been split into two nested if-statements. llvm-svn: 195972
-
Juergen Ributzka authored
llvm-svn: 195971
-
Andrew Trick authored
llvm-svn: 195969
-
- Nov 29, 2013
-
-
Reed Kotler authored
in constant islands for Mips16. We introduce JalB16 as a synonym for Jal16. It makes it easier to read and is also necessary because Jal16 is a call instruction but JalB16 is being used as a branch. Various parts of LLVM will not work properly even in this late stage of the backend if we use what was declared as a call instruction to function as a branch. For one, basic block labels may not get emitted in some situations. llvm-svn: 195968
-
Zoran Jovanovic authored
llvm-svn: 195967
-
Petar Jovanovic authored
XFAIL llvm-cov.test for MIPS until big-endian issues are fixed for llvm-cov. The test does pass on MIPS little-endian. llvm-svn: 195966
-
Zoran Jovanovic authored
llvm-svn: 195965
-