- Jan 19, 2014
-
-
Saleem Abdulrasool authored
Update names for the names as per the current ABI errata. Mark deprecated tags as such. llvm-svn: 199576
-
- Jan 16, 2014
-
-
Amara Emerson authored
The encoding of build attributes is already tested in CodeGen/ARM/build-attributes-encoding.s llvm-svn: 199393
-
Jiangning Liu authored
llvm-svn: 199369
-
- Jan 15, 2014
-
-
Weiming Zhao authored
When expanding neon pseudo stores, it may miss the implicit uses of sub regs, which may cause post RA scheduler reorder instructions that breakes anti dependency. For example: VST1d64QPseudo %R0<kill>, 16, %Q9_Q10, pred:14, pred:%noreg will be expanded to VST1d64Q %R0<kill>, 16, %D18, pred:14, pred:%noreg; An instruction that defines %D20 may be scheduled before the store by mistake. This patches adds implicit uses for such case. For the example above, it emits: VST1d64Q %R0<kill>, 8, %D18, pred:14, pred:%noreg, %Q9_Q10<imp-use> llvm-svn: 199282
-
- Jan 14, 2014
-
-
Tim Northover authored
The changes caused by folding an sp-adjustment into a "pop" previously disrupted the forward search for the final real instruction in a terminating block. This switches to a backward search (skipping debug instrs). This fixes PR18399. Patch by Zhaoshi. llvm-svn: 199266
-
Tim Northover authored
llvm-svn: 199212
-
- Jan 13, 2014
-
-
Tim Northover authored
rdar://problem/15800156 llvm-svn: 199109
-
- Jan 11, 2014
-
-
Benjamin Kramer authored
llvm-svn: 199016
-
- Jan 10, 2014
-
-
Artyom Skrobov authored
llvm-svn: 198945
-
- Jan 07, 2014
-
-
Saleem Abdulrasool authored
Parse tag names as well as expressions. The former is part of the specification, the latter is for improved compatibility with the GNU assembler. Fix attribute value handling to be comformant to the specification. llvm-svn: 198662
-
- Jan 06, 2014
-
-
Tim Northover authored
The ARM backend has been using most of the MachO related subtarget checks almost interchangeably, and since the only target it's had to run on has been IOS (which is all three of MachO, Darwin and IOS) it's worked out OK so far. But we'd like to support embedded targets under the "*-*-none-macho" triple, which means everything starts falling apart and inconsistent behaviours emerge. This patch should pick a reasonably sensible set of behaviours for the new triple (and any others that come along, with luck). Some choices were debatable (notably FP == r7 or r11), but we can revisit those later when deficiencies become apparent. llvm-svn: 198617
-
Tim Northover authored
Longer term, we want to move users to "*-*-*-macho" for embedded work, but for now people are relying on the last thing we told them, which is unfortunately "*-*-darwin-eabi". rdar://problem/15703934 llvm-svn: 198602
-
- Jan 02, 2014
-
-
Rafael Espindola authored
This patch makes it possible to select the ABI with -mattr. It will be used to forward clang's -target-abi option to llvm's CodeGen. llvm-svn: 198304
-
- Dec 30, 2013
-
-
Bill Wendling authored
llvm-svn: 198184
-
- Dec 28, 2013
-
-
Andrew Trick authored
Schedule more conservatively to account for stalls on floating point resources and latency. Use the AGU resource to model latency stalls since it's shared between FP and LD/ST instructions. This might not be completely accurate but should work well in practice. llvm-svn: 198125
-
- Dec 19, 2013
-
-
Josh Magee authored
llvm-svn: 197709
-
Rafael Espindola authored
I am surprised I am the first one to notice this. llvm-svn: 197689
-
Josh Magee authored
[stackprotector] Use analysis from the StackProtector pass for stack layout in PEI a nd LocalStackSlot passes. This changes the MachineFrameInfo API to use the new SSPLayoutKind information produced by the StackProtector pass (instead of a boolean flag) and updates a few pass dependencies (to preserve the SSP analysis). The stack layout follows the same approach used prior to this change - i.e., only LargeArray stack objects will be placed near the canary and everything else will be laid out normally. After this change, structures containing large arrays will also be placed near the canary - a case previously missed by the old implementation. Out of tree targets will need to update their usage of MachineFrameInfo::CreateStackObject to remove the MayNeedSP argument. The next patch will implement the rules for sspstrong and sspreq. The end goal is to support ssp-strong stack layout rules. WIP. Differential Revision: http://llvm-reviews.chandlerc.com/D2158 llvm-svn: 197653
-
- Dec 18, 2013
-
-
Weiming Zhao authored
Given vsel_cc, op1, op2, since vsel has no LE/LT, to generate vsel for such selection, it needs to inverse cc and swap op1 and op2. To inverse cc, both L/G and E bits should be flipped. llvm-svn: 197615
-
Tim Northover authored
This should fix the ARM bots. llvm-svn: 197555
-
Tim Northover authored
Clang sets the float-abi target option manually, but no longer annotates each function with its ABI. This can lead to confusing mistmatch between "clang -emit-llvm | llc" and normal clang invocations. Besides which, gnueabihf actually *is* hard-float. Defaulting to soft was just perverse. llvm-svn: 197554
-
- Dec 17, 2013
-
-
Quentin Colombet authored
This reapplies r197438 and fixes the link-time circular dependency between IR and Support. The fix consists in moving the diagnostic support into IR. The patch adds a new LLVMContext::diagnose that can be used to communicate to the front-end, if any, that something of interest happened. The diagnostics are supported by a new abstraction, the DiagnosticInfo class. The base class contains the following information: - The kind of the report: What this is about. - The severity of the report: How bad this is. This patch also adds 2 classes: - DiagnosticInfoInlineAsm: For inline asm reporting. Basically, this diagnostic will be used to switch to the new diagnostic API for LLVMContext::emitError. - DiagnosticStackSize: For stack size reporting. Comes as a replacement of the hard coded warning in PEI. This patch also features dynamic diagnostic identifiers. In other words plugins can use this infrastructure for their own diagnostics (for more details, see getNextAvailablePluginDiagnosticKind). This patch introduces a new DiagnosticHandlerTy and a new DiagnosticContext in the LLVMContext that should be set by the front-end to be able to map these diagnostics in its own system. http://llvm-reviews.chandlerc.com/D2376 <rdar://problem/15515174> llvm-svn: 197508
-
Quentin Colombet authored
llvm-svn: 197451
-
Quentin Colombet authored
The patch adds a new LLVMContext::diagnose that can be used to communicate to the front-end, if any, that something of interest happened. The diagnostics are supported by a new abstraction, the DiagnosticInfo class. The base class contains the following information: - The kind of the report: What this is about. - The severity of the report: How bad this is. This patch also adds 2 classes: - DiagnosticInfoInlineAsm: For inline asm reporting. Basically, this diagnostic will be used to switch to the new diagnostic API for LLVMContext::emitError. - DiagnosticStackSize: For stack size reporting. Comes as a replacement of the hard coded warning in PEI. This patch also features dynamic diagnostic identifiers. In other words plugins can use this infrastructure for their own diagnostics (for more details, see getNextAvailablePluginDiagnosticKind). This patch introduces a new DiagnosticHandlerTy and a new DiagnosticContext in the LLVMContext that should be set by the front-end to be able to map these diagnostics in its own system. http://llvm-reviews.chandlerc.com/D2376 <rdar://problem/15515174> llvm-svn: 197438
-
- Dec 16, 2013
-
-
Joerg Sonnenberger authored
llvm-svn: 197405
-
- Dec 13, 2013
-
-
Joerg Sonnenberger authored
with a temporary assertion and adjust the various test cases. llvm-svn: 197224
-
- Dec 11, 2013
-
-
Tim Northover authored
The tests were no longer using fast-isel at all (MachO needs an "ios" rather than "darwin" triple at the moment and Linux needs ARM mode). Once that was corrected, the verifier complained about a t2ADDri created for the alloca. llvm-svn: 197046
-
- Dec 08, 2013
-
-
Tim Northover authored
When trying to eliminate an "sub sp, sp, #N" instruction by folding it into an existing push/pop using dummy registers, we need to account for the fact that this might affect precisely how "fp" gets set in the prologue. We were attempting this, but assuming that *whenever* we performed a fold it would make a difference. This is false, for example, in: push {r4, r7, lr} add fp, sp, #4 vpush {d8} sub sp, sp, #8 we can fold the "sub" into the "vpush", forming "vpush {d7, d8}". However, in that case the "add fp" instruction mustn't change, which we were getting wrong before. Should fix PR18160. llvm-svn: 196725
-
- Dec 06, 2013
-
-
Weiming Zhao authored
The current peephole optimizing for compare inst assumes an instr that uses CPSR has an MO for ARM Cond code.However, for VSEL instructions (vseqeq, vselgt, vselgt, vselvs), there is no such operand nor do they support the modification of Cond Code. llvm-svn: 196588
-
- Dec 05, 2013
-
-
Andrew Trick authored
The per-operand machine model allows the target to define "unbuffered" processor resources. This change is a quick, cheap way to model stalls caused by the latency of operations that use such resources. This only applies when the processor's micro-op buffer size is non-zero (Out-of-Order). We can't precisely model in-order stalls during out-of-order execution, but this is an easy and effective heuristic. It benefits cortex-a9 scheduling when using the new machine model, which is not yet on by default. MI-Sched for armv7 was evaluated on Swift (and only not enabled because of a performance bug related to predication). However, we never evaluated Cortex-A9 performance on MI-Sched in its current form. This change adds MI-Sched functionality to reach performance goals on A9. The only remaining change is to allow MI-Sched to run as a PostRA pass. I evaluated performance using a set of options to estimate the performance impact once MI sched is default on armv7: -mcpu=cortex-a9 -disable-post-ra -misched-bench -scheditins=false For a simple saxpy loop I see a 1.7x speedup. Here are the llvm-testsuite results: (min run time over 2 runs, filtering tiny changes) Speedups: | Benchmarks/BenchmarkGame/recursive | 52.39% | | Benchmarks/VersaBench/beamformer | 20.80% | | Benchmarks/Misc/pi | 19.97% | | Benchmarks/Misc/mandel-2 | 19.95% | | SPEC/CFP2000/188.ammp | 18.72% | | Benchmarks/McCat/08-main/main | 18.58% | | Benchmarks/Misc-C++/Large/sphereflake | 18.46% | | Benchmarks/Olden/power | 17.11% | | Benchmarks/Misc-C++/mandel-text | 16.47% | | Benchmarks/Misc/oourafft | 15.94% | | Benchmarks/Misc/flops-7 | 14.99% | | Benchmarks/FreeBench/distray | 14.26% | | SPEC/CFP2006/470.lbm | 14.00% | | mediabench/mpeg2/mpeg2dec/mpeg2decode | 12.28% | | Benchmarks/SmallPT/smallpt | 10.36% | | Benchmarks/Misc-C++/Large/ray | 8.97% | | Benchmarks/Misc/fp-convert | 8.75% | | Benchmarks/Olden/perimeter | 7.10% | | Benchmarks/Bullet/bullet | 7.03% | | Benchmarks/Misc/mandel | 6.75% | | Benchmarks/Olden/voronoi | 6.26% | | Benchmarks/Misc/flops-8 | 5.77% | | Benchmarks/Misc/matmul_f64_4x4 | 5.19% | | Benchmarks/MiBench/security-rijndael | 5.15% | | Benchmarks/Misc/flops-6 | 5.10% | | Benchmarks/Olden/tsp | 4.46% | | Benchmarks/MiBench/consumer-lame | 4.28% | | Benchmarks/Misc/flops-5 | 4.27% | | Benchmarks/mafft/pairlocalalign | 4.19% | | Benchmarks/Misc/himenobmtxpa | 4.07% | | Benchmarks/Misc/lowercase | 4.06% | | SPEC/CFP2006/433.milc | 3.99% | | Benchmarks/tramp3d-v4 | 3.79% | | Benchmarks/FreeBench/pifft | 3.66% | | Benchmarks/Ptrdist/ks | 3.21% | | Benchmarks/Adobe-C++/loop_unroll | 3.12% | | SPEC/CINT2000/175.vpr | 3.12% | | Benchmarks/nbench | 2.98% | | SPEC/CFP2000/183.equake | 2.91% | | Benchmarks/Misc/perlin | 2.85% | | Benchmarks/Misc/flops-1 | 2.82% | | Benchmarks/Misc-C++-EH/spirit | 2.80% | | Benchmarks/Misc/flops-2 | 2.77% | | Benchmarks/NPB-serial/is | 2.42% | | Benchmarks/ASC_Sequoia/CrystalMk | 2.33% | | Benchmarks/BenchmarkGame/n-body | 2.28% | | Benchmarks/SciMark2-C/scimark2 | 2.27% | | Benchmarks/Olden/bh | 2.03% | | skidmarks10/skidmarks | 1.81% | | Benchmarks/Misc/flops | 1.72% | Slowdowns: | Benchmarks/llubenchmark/llu | -14.14% | | Benchmarks/Polybench/stencils/seidel-2d | -5.67% | | Benchmarks/Adobe-C++/functionobjects | -5.25% | | Benchmarks/Misc-C++/oopack_v1p8 | -5.00% | | Benchmarks/Shootout/hash | -2.35% | | Benchmarks/Prolangs-C++/ocean | -2.01% | | Benchmarks/Polybench/medley/floyd-warshall | -1.98% | | Polybench/linear-algebra/kernels/3mm | -1.95% | | Benchmarks/McCat/09-vor/vor | -1.68% | llvm-svn: 196516
-
Tim Northover authored
We were trying to fold the stack adjustment into the wrong instruction in the situation where the entire basic-block was epilogue code. Really, it can only ever be valid to do the folding precisely where the "add sp, ..." would be placed so there's no need for a separate iterator to track that. Should fix PR18136. llvm-svn: 196493
-
- Dec 04, 2013
-
-
David Peixotto authored
ARM symbol variants are written with parens instead of @ like this: .word __GLOBAL_I_a(target1) This commit adds support for parsing these symbol variants in expressions. We introduce a new flag to MCAsmInfo that indicates the parser should use parens to parse the symbol variant. The expression parser is modified to look for symbol variants using parens instead of @ when the corresponding MCAsmInfo flag is true. The MCAsmInfo parens flag is enabled only for ARM on ELF. By adding this flag to MCAsmInfo, we are able to get rid of redundant ARM-specific symbol variants and use the generic variants instead (e.g. VK_GOT instead of VK_ARM_GOT). We use the new UseParensForSymbolVariant attribute in MCAsmInfo to correctly print the symbol variants for arm. To achive this we need to keep a handle to the MCAsmInfo in the MCSymbolRefExpr class that we can check when printing the symbol variant. Updated Tests: Changed case of symbol variant to match the generic kind. test/CodeGen/ARM/tls-models.ll test/CodeGen/ARM/tls1.ll test/CodeGen/ARM/tls2.ll test/CodeGen/Thumb2/tls1.ll test/CodeGen/Thumb2/tls2.ll PR18080 llvm-svn: 196424
-
- Dec 03, 2013
-
-
James Molloy authored
Testcase added. llvm-svn: 196269
-
- Dec 02, 2013
-
-
Tim Northover authored
llvm-svn: 196102
-
Tim Northover authored
These are used by MachO only at the moment, and (much like the existing MOVW/MOVT set) work around the fact that the labels used in the actual instructions often contain PC-dependent components, which means that repeatedly materialising the same global can't be CSEed. With small modifications, it could be adapted to how ELF finds the address of _GLOBAL_OFFSET_TABLE_, which would give similar benefits in PIC mode there. llvm-svn: 196090
-
- Dec 01, 2013
-
-
Tim Northover authored
Previously, we clobbered callee-saved registers when folding an "add sp, #N" into a "pop {rD, ...}" instruction. This change checks whether a register we're going to add to the "pop" could actually be live outside the function before doing so and should fix the issue. This should fix PR18081. llvm-svn: 196046
-
- Nov 26, 2013
-
-
Tim Northover authored
llvm-svn: 195759
-
- Nov 25, 2013
-
-
Amara Emerson authored
Patch by Oliver Stannard. llvm-svn: 195640
-
- Nov 23, 2013
-
-
Manman Ren authored
We are going to drop debug info without a version number or with a different version number, to make sure we don't crash when we see bitcode files with different debug info metadata format. Make tests more robust by removing hard-coded metadata numbers in CHECK lines. llvm-svn: 195535
-
- Nov 22, 2013
-
-
Manman Ren authored
We are going to drop debug info without a version number or with a different version number, to make sure we don't crash when we see bitcode files with different debug info metadata format. llvm-svn: 195504
-