- Aug 05, 2015
-
-
Tanya Lattner authored
llvm-svn: 243999
-
- Aug 04, 2015
-
-
Sanjay Patel authored
Create wrapper methods in the Function class for the OptimizeForSize and MinSize attributes. We want to hide the logic of "or'ing" them together when optimizing just for size (-Os). Currently, we are not consistent about this and rely on a front-end to always set OptimizeForSize (-Os) if MinSize (-Oz) is on. Thus, there are 18 FIXME changes here that should be added as follow-on patches with regression tests. This patch is NFC-intended: it just replaces existing direct accesses of the attributes by the equivalent wrapper call. Differential Revision: http://reviews.llvm.org/D11734 llvm-svn: 243994
-
Saleem Abdulrasool authored
This adds the software division routines for the Windows RTABI. These are not expected to be used often though as most modern Windows ARM capable targets support hardware division. In the case that the target CPU doesnt support hardware division, this will be the fallback. llvm-svn: 243952
-
Saleem Abdulrasool authored
Make the libcall updating table driven similar to the approach that the Linux and Windows codepath does below. NFC. llvm-svn: 243951
-
Tim Northover authored
llvm-svn: 243907
-
- Aug 03, 2015
-
-
Tim Northover authored
This is necessary for WatchOS support, where the compact unwind format assumes this kind of layout. For now we only want this on Swift-like CPUs though, where it's been the Xcode behaviour for ages. Also, since it can expand the prologue we don't want it at -Oz. llvm-svn: 243884
-
John Brawn authored
Enabling merging of extern globals appears to be generally either beneficial or harmless. On some benchmarks suites (on Cortex-M4F, Cortex-A9, and Cortex-A57) it gives improvements in the 1-5% range, but in the rest the overall effect is zero. Differential Revision: http://reviews.llvm.org/D10966 llvm-svn: 243874
-
James Molloy authored
In http://reviews.llvm.org/rL215382, IT forming was made more conservative under the belief that a flag-setting instruction was unpredictable inside an IT block on ARMv6M. But actually, ARMv6M doesn't even support IT blocks so that's impossible. In the ARMARM for v7M, v7AR and v8AR it states that the semantics of such an instruction changes inside an IT block - it doesn't set the flags. So actually it is fine to use one inside an IT block as long as the flags register is dead afterwards. This gives significant performance improvements in a variety of MPEG based workloads. Differential revision: http://reviews.llvm.org/D11680 llvm-svn: 243869
-
- Aug 02, 2015
-
-
Craig Topper authored
This was already done in most places a while ago. This just fixes the ones that crept in over time. llvm-svn: 243842
-
- Aug 01, 2015
-
-
David Blaikie authored
Various targets use std::swap on specific MCAsmOperands (ARM and possibly Hexagon as well). It might be helpful to mark those subclasses as final, to ensure that the availability of move/copy operations can't lead to slicing. (same sort of requirements as the non-vitual dtor - protected or a final class) llvm-svn: 243820
-
- Jul 31, 2015
-
-
Sumanth Gundapaneni authored
For a modulo (reminder) operation, clang -target armv7-none-linux-gnueabi generates "__modsi3" clang -target armv7-none-eabi generates "__aeabi_idivmod" clang -target armv7-linux-androideabi generates "__modsi3" Android bionic libc doesn't provide a __modsi3, instead it provides a "__aeabi_idivmod". This patch fixes the LLVM ARMISelLowering to generate the correct call when ever there is a modulo operation. Differential Revision: http://reviews.llvm.org/D11661 llvm-svn: 243717
-
- Jul 30, 2015
-
-
Sanjay Patel authored
Fixing MinSize attribute handling was discussed in D11363. This is a prerequisite patch to doing that. The handling of OptSize when lowering mem* functions was broken on Darwin because it wants to ignore -Os for these cases, but the existing logic also made it ignore -Oz (MinSize). The Linux change demonstrates a widespread problem. The backend doesn't usually recognize the MinSize attribute by itself; it assumes that if the MinSize attribute exists, then the OptSize attribute must also exist. Fixing this more generally will be a follow-on patch or two. Differential Revision: http://reviews.llvm.org/D11568 llvm-svn: 243693
-
Nick Lewycky authored
Fix typo "fuction" noticed in comments in AssumptionCache.h, and also all the other files that have the same typo. All comments, no functionality change! (Merely a "fuctionality" change.) Bonus change to remove emacs major mode marker from SystemZMachineFunctionInfo.cpp because emacs already knows it's C++ from the extension. Also fix typo "appeary" in AMDGPUMCAsmInfo.h. llvm-svn: 243585
-
- Jul 29, 2015
-
-
Akira Hatanaka authored
This commit defines subtarget feature strict-align and uses it instead of cl::opt -arm-strict-align to decide whether strict alignment should be forced. Also, remove the logic that was checking the OS and architecture as clang is now responsible for setting strict-align based on the command line options specified and the target architecute and OS. rdar://problem/21529937 http://reviews.llvm.org/D11470 llvm-svn: 243493
-
- Jul 28, 2015
-
-
Chih-Hung Hsieh authored
The 'common' section TLS is not implemented. Current C/C++ TLS variables are not placed in common section. DWARF debug info to get the address of TLS variables is not generated yet. clang and driver changes in http://reviews.llvm.org/D10524 Added -femulated-tls flag to select the emulated TLS model, which will be used for old targets like Android that do not support ELF TLS models. Added TargetLowering::LowerToTLSEmulatedModel as a target-independent function to convert a SDNode of TLS variable address to a function call to __emutls_get_address. Added into lib/Target/*/*ISelLowering.cpp to call LowerToTLSEmulatedModel for TLSModel::Emulated. Although all targets supporting ELF TLS models are enhanced, emulated TLS model has been tested only for Android ELF targets. Modified AsmPrinter.cpp to print the emutls_v.* and emutls_t.* variables for emulated TLS variables. Modified DwarfCompileUnit.cpp to skip some DIE for emulated TLS variabls. TODO: Add proper DIE for emulated TLS variables. Added new unit tests with emulated TLS. Differential Revision: http://reviews.llvm.org/D10522 llvm-svn: 243438
-
Alexandros Lamprineas authored
- Architecture extensions are represented as a bitmap. Phabricator: http://reviews.llvm.org/D11457 llvm-svn: 243335
-
- Jul 27, 2015
-
-
Colin LeMahieu authored
llvm-svn: 243334
-
Silviu Baranga authored
Summary: Fix the cost of interleaved accesses for ARM/AArch64. We were calling getTypeAllocSize and using it to check the number of bits, when we should have called getTypeAllocSizeInBits instead. This would pottentially cause the vectorizer to generate loads/stores and shuffles which cannot be matched with an interleaved access instruction. No performance changes are expected for now since matching/generating interleaved accesses is still disabled by default. Reviewers: rengolin Subscribers: aemerson, llvm-commits, rengolin Differential Revision: http://reviews.llvm.org/D11524 llvm-svn: 243270
-
- Jul 24, 2015
-
-
Luke Cheeseman authored
Some shufflevectors are currently being incorrectly lowered in the AArch32 backend as the existing checks for detecting the NEON operations from the shufflevector instruction expects the shuffle mask and the vector operands to be of the same length. This is not always the case as the mask may be twice as long as the operand; here only the lower half of the shufflemask gets checked, so provided the lower half of the shufflemask looks like a vector transpose (or even is just all -1 for undef) then the intrinsics may get incorrectly lowered into a vector transpose (VTRN) instruction. This patch fixes this by accommodating for both cases and adds regression tests. Differential Revision: http://reviews.llvm.org/D11407 llvm-svn: 243103
-
Luke Cheeseman authored
is an immediate, in this check the value is negated and stored in and int64_t. The value can be -2^63 yet the result cannot be stored in an int64_t and this gives some undefined behaviour causing failures. The negation is only necessary when the values is within a certain range and so it should not need to negate -2^63, this patch introduces this and also a regression test. Differential Revision: http://reviews.llvm.org/D11408 llvm-svn: 243100
-
David Gross authored
Summary: Among other things, this allows -print-after-all/-print-before-all to dump IR around this pass. Subscribers: aemerson, llvm-commits, rengolin Differential Revision: http://reviews.llvm.org/D11373 llvm-svn: 243052
-
- Jul 23, 2015
-
-
David Gross authored
llvm-svn: 243046
-
- Jul 22, 2015
-
-
Quentin Colombet authored
Shrink-wrapping can now be tested on ARM with -enable-shrink-wrap. Related to <rdar://problem/20821730> llvm-svn: 242908
-
- Jul 21, 2015
-
-
Akira Hatanaka authored
whether register r9 should be reserved. This recommits r242737, which broke bots because the number of subtarget features went over the limit of 64. This change is needed because we cannot use a backend option to set cl::opt "arm-reserve-r9" when doing LTO. Out-of-tree projects currently using cl::opt option "-arm-reserve-r9" to reserve r9 should make changes to add subtarget feature "reserve-r9" to the IR. rdar://problem/21529937 Differential Revision: http://reviews.llvm.org/D11320 llvm-svn: 242756
-
Matthias Braun authored
Re-apply of r241928 which had to be reverted because of the r241926 revert. This commit factors out common code from MergeBaseUpdateLoadStore() and MergeBaseUpdateLSMultiple() and introduces a new function MergeBaseUpdateLSDouble() which merges adds/subs preceding/following a strd/ldrd instruction into an strd/ldrd instruction with writeback where possible. Differential Revision: http://reviews.llvm.org/D10676 llvm-svn: 242743
-
Matthias Braun authored
Re-apply r241926 with an additional check that r13 and r15 are not used for LDRD/STRD. See http://llvm.org/PR24190. This also already includes the fix from r241951. Differential Revision: http://reviews.llvm.org/D10623 llvm-svn: 242742
-
Akira Hatanaka authored
This caused builds to fail with the following error message: error:Too many subtarget features! Bump MAX_SUBTARGET_FEATURES. llvm-svn: 242740
-
Akira Hatanaka authored
whether register r9 should be reserved. This change is needed because we cannot use a backend option to set cl::opt "arm-reserve-r9" when doing LTO. Out-of-tree projects currently using cl::opt option "-arm-reserve-r9" to reserve r9 should make changes to add subtarget feature "reserve-r9" to the IR. rdar://problem/21529937 Differential Revision: http://reviews.llvm.org/D11320 llvm-svn: 242737
-
Matthias Braun authored
This reverts commit r241926. This caused http://llvm.org/PR24190 llvm-svn: 242735
-
Matthias Braun authored
This reverts commit r241928. This caused http://llvm.org/PR24190 llvm-svn: 242734
-
Matthias Braun authored
This reverts commit r241951. It caused http://llvm.org/PR24190 llvm-svn: 242733
-
JF Bastien authored
This patch does the following: * Fix FIXME on `needsStackRealignment`: it is now shared between multiple targets, implemented in `TargetRegisterInfo`, and isn't `virtual` anymore. This will break out-of-tree targets, silently if they used `virtual` and with a build error if they used `override`. * Factor out `canRealignStack` as a `virtual` function on `TargetRegisterInfo`, by default only looks for the `no-realign-stack` function attribute. Multiple targets duplicated the same `needsStackRealignment` code: - Aarch64. - ARM. - Mips almost: had extra `DEBUG` diagnostic, which the default implementation now has. - PowerPC. - WebAssembly. - x86 almost: has an extra `-force-align-stack` option, which the default implementation now has. The default implementation of `needsStackRealignment` used to just return `false`. My current patch changes the behavior by simply using the above shared behavior. This affects: - AMDGPU - BPF - CppBackend - MSP430 - NVPTX - Sparc - SystemZ - XCore - Out-of-tree targets This is a breaking change! `make check` passes. The only implementation of the `virtual` function (besides the slight different in x86) was Hexagon (which did `MF.getFrameInfo()->getMaxAlignment() > 8`), and potentially some out-of-tree targets. Hexagon now uses the default implementation. `needsStackRealignment` was being overwritten in `<Target>GenRegisterInfo.inc`, to return `false` as the default also did. That was odd and is now gone. Reviewers: sunfish Subscribers: aemerson, llvm-commits, jfb Differential Revision: http://reviews.llvm.org/D11160 llvm-svn: 242727
-
- Jul 20, 2015
-
-
Quentin Colombet authored
This is the first step toward supporting shrink-wrapping for this target. The changes could be summarized by these items: - Expand the tail-call return as part of the expand pseudo pass. - Get rid of the assumptions that the epilogue is the exit block: * Do not assume which registers are free in the epilogue. (This indirectly improve the lowering of the code for the segmented stacks, see the test cases.) * Take into account that the basic block can be empty. Related to <rdar://problem/20821730> llvm-svn: 242714
-
- Jul 18, 2015
-
-
Matthias Braun authored
Reapply r242500 now that the swift schedmodel includes LDRLIT. This is mostly done to disable the PostRAScheduler which optimizes for instruction latencies which isn't a good fit for out-of-order architectures. This also allows to leave out the itinerary table in swift in favor of the SchedModel ones. This change leads to performance improvements/regressions by as much as 10% in some benchmarks, in fact we loose 0.4% performance over the llvm-testsuite for reasons that appear to be unknown or out of the compilers control. rdar://20803802 documents the investigation of these effects. While it is probably a good idea to perform the same switch for the other ARM out-of-order CPUs, I limited this change to swift as I cannot perform the benchmark verification on the other CPUs. Differential Revision: http://reviews.llvm.org/D10513 llvm-svn: 242588
-
Matthias Braun authored
These pseudo instructions are only lowered after register allocation and are therefore still present when the machine scheduler runs. Add a run: line to a testcase that uses the uncommon flags necessary to actually produce a LDRLIT instruction on swift. llvm-svn: 242587
-
- Jul 17, 2015
-
-
Adam Nemet authored
This reverts commit r242500. It broke some internal tests and Matthias asked me to revert it while he is investigating. llvm-svn: 242553
-
James Molloy authored
No functional change, but it preps codegen for the future when SABSDIFF will start getting generated in anger. llvm-svn: 242546
-
Matthias Braun authored
This is mostly done to disable the PostRAScheduler which optimizes for instruction latencies which isn't a good fit for out-of-order architectures. This also allows to leave out the itinerary table in swift in favor of the SchedModel ones. This change leads to performance improvements/regressions by as much as 10% in some benchmarks, in fact we loose 0.4% performance over the llvm-testsuite for reasons that appear to be unknown or out of the compilers control. rdar://20803802 documents the investigation of these effects. While it is probably a good idea to perform the same switch for the other ARM out-of-order CPUs, I limited this change to swift as I cannot perform the benchmark verification on the other CPUs. Differential Revision: http://reviews.llvm.org/D10513 llvm-svn: 242500
-
Matthias Braun authored
Constructing a name based on the function name didn't give us a unique symbol if we had more than one setjmp in a function. Using MCContext::createTempSymbol() always gives us a unique name. Differential Revision: http://reviews.llvm.org/D9314 llvm-svn: 242482
-
Matthias Braun authored
llvm.eh.sjlj.setjmp was used as part of the SjLj exception handling style but is also used in clang to implement __builtin_setjmp. The ARM backend needs to output additional dispatch tables for the SjLj exception handling style, these tables however can't be emitted if llvm.eh.sjlj.setjmp is simply used for __builtin_setjmp and no actual landing pad blocks exist. To solve this issue a new llvm.eh.sjlj.setup_dispatch intrinsic is introduced which is used instead of llvm.eh.sjlj.setjmp in the SjLj exception handling lowering, so we can differentiate between the case where we actually need to setup a dispatch table and the case where we just need the __builtin_setjmp semantic. Differential Revision: http://reviews.llvm.org/D9313 llvm-svn: 242481
-