- Jan 16, 2015
-
-
Sumanth Gundapaneni authored
llvm-svn: 226302
-
Adam Nemet authored
Similar to the unaligned cases. Test was generated with update_llc_test_checks.py. Part of <rdar://problem/17688758> llvm-svn: 226296
-
Adam Nemet authored
llvm-svn: 226295
-
Duncan P. N. Exon Smith authored
Raise the limit for column information from 8 bits to 16 bits. llvm-svn: 226291
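A minimal sketch of what a 16-bit column limit implies for values that do not fit. The helper name `clampColumn` is hypothetical, and resetting an oversized column to 0 ("unknown") is an assumption made for illustration, not something this commit message states:

```cpp
#include <cstdint>

// Hypothetical helper, not LLVM's actual code: keep a column number only if
// it fits in 16 bits. Falling back to 0 for larger values is an assumption.
inline std::uint16_t clampColumn(unsigned Column) {
  if (Column >= (1u << 16))
    return 0;
  return static_cast<std::uint16_t>(Column);
}
```

Under the previous 8-bit limit, any column above 255 would have been out of range; with 16 bits the cutoff moves to 65535.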
-
Duncan P. N. Exon Smith authored
Line/column fixups already exist in `MDLocation`. Delete the duplicated logic in `DebugLoc`. llvm-svn: 226290
-
Colin LeMahieu authored
llvm-svn: 226288
-
Andrea Di Biagio authored
This patch disables target specific combine on X86ISD::INSERTPS dag nodes if optlevel is CodeGenOpt::None.

The backend currently implements a target specific combine rule that converts a vector load used by an INSERTPS dag node into a scalar load plus a scalar_to_vector. This allows ISel to select a single INSERTPSrm instead of two instructions (i.e. a vector load plus INSERTPSrr).

However, the existing target combine rule on INSERTPS nodes only works under the assumption that ISel will always be able to match an INSERTPSrm. This is not true in general at -O0, since the backend only allows folding a load into the memory operand of an instruction if the optimization level is not CodeGenOpt::None.

In the example below:

  //
  __m128 test(__m128 a, __m128 *b) {
    __m128 c = _mm_insert_ps(a, *b, 1 << 6);
    return c;
  }
  //

Before this patch, at -O0, the backend would have canonicalized the load to 'b' into a scalar load plus scalar_to_vector. Later on, ISel would have selected an INSERTPSrr leaving the insertps mask in an inconsistent state:

  movss 4(%rdi), %xmm1
  insertps $64, %xmm1, %xmm0 # xmm0 = xmm1[1],xmm0[1,2,3].

With this patch, the backend avoids folding the vector load into the operand of the INSERTPS. The new codegen at -O0 is:

  movaps (%rdi), %xmm1
  insertps $64, %xmm1, %xmm0 # %xmm1[1],xmm0[1,2,3].

llvm-svn: 226277
-
Toma Tabacu authored
llvm-svn: 226269
-
Simon Pilgrim authored
The current 'big vectors' stack folded reload testing pattern is very bulky and makes it difficult to test all instructions, as big vectors will tend to use only the ymm instruction implementations. This patch changes the tests to use a nop call that lists explicit xmm registers as side effects; with this we can force a partial register spill of the relevant registers and then check that the reload is correctly folded. The asm generated only adds the forced spill, a nop instruction and a couple of extra labels (a fraction of the current approach). More exhaustive tests will follow shortly; I've added some extra tests (the xmm versions of some of the existing folding tests) as a starting point. Differential Revision: http://reviews.llvm.org/D6932 llvm-svn: 226264
-
Timur Iskhodzhanov authored
This breaks AddressSanitizer (ninja check-asan) on Windows llvm-svn: 226251
-
Filipe Cabecinhas authored
llvm-svn: 226248
-
Hal Finkel authored
Bill Schmidt pointed out that some adjustments would be needed to properly support powerpc64le (using the ELF V2 ABI). For one thing, R11 is not available as a scratch register, so we need to use R12. R12 is also available under ELF V1, so to maintain consistency, I flipped the order to make R12 the first scratch register in the array under both ABIs. llvm-svn: 226247
-
Mehdi Amini authored
http://reviews.llvm.org/D6993 llvm-svn: 226245
-
Rafael Espindola authored
This reverts commit r226173, adding r226038 back. No change in this commit, but clang was changed to also produce trivial comdats for constructors, destructors and vtables when needed. Original message: Don't create new comdats in CodeGen. This patch stops the implicit creation of comdats during codegen. Clang now sets the comdat explicitly when it is required. With this patch clang and gcc now produce the same result in pr19848. llvm-svn: 226242
-
Kevin Enderby authored
removing the macho-archive-headers.test added with r226228 that it is failing on for now while I try to figure out what is going on. llvm-svn: 226241
-
Kevin Enderby authored
the macho-archive-headers.test added with r226228. llvm-svn: 226239
-
Sanjoy Das authored
IRCE eliminates range checks of the form

  0 <= A * I + B < Length

by splitting a loop's iteration space into three segments in a way that the check is completely redundant in the middle segment. As an example, IRCE will convert

  len = < known positive >
  for (i = 0; i < n; i++) {
    if (0 <= i && i < len) {
      do_something();
    } else {
      throw_out_of_bounds();
    }
  }

to

  len = < known positive >
  limit = smin(n, len)
  // no first segment
  for (i = 0; i < limit; i++) {
    if (0 <= i && i < len) { // this check is fully redundant
      do_something();
    } else {
      throw_out_of_bounds();
    }
  }
  for (i = limit; i < n; i++) {
    if (0 <= i && i < len) {
      do_something();
    } else {
      throw_out_of_bounds();
    }
  }

IRCE can deal with multiple range checks in the same loop (it takes the intersection of the ranges that will make each of them redundant individually).

Currently IRCE does not do any profitability analysis. That is a TODO.

Please note that the status of this pass is *experimental*, and it is not part of any default pass pipeline. Having said that, I would love to get feedback and general input from people interested in trying this out.

This pass was originally r226201. It was reverted because it used C++ features not supported by MSVC 2012.

Differential Revision: http://reviews.llvm.org/D6693
llvm-svn: 226238
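The loop split described in this message can be written out by hand as a rough C++ sketch. This is not the pass's implementation: `do_something`, the `g_calls` counter, and the use of `std::out_of_range` in place of `throw_out_of_bounds()` are stand-ins introduced for illustration, and `std::min` plays the role of `smin`.

```cpp
#include <algorithm>
#include <cassert>
#include <stdexcept>

// Hypothetical stand-in for the loop body; counts how often it runs.
static int g_calls = 0;
static void do_something() { ++g_calls; }

// Original form: the range check executes on every iteration.
void original_loop(int n, int len) {
  for (int i = 0; i < n; i++) {
    if (0 <= i && i < len)
      do_something();
    else
      throw std::out_of_range("index out of bounds");
  }
}

// Hand-split form. The first loop covers [0, min(n, len)), where the check
// is always true, so it is omitted there. The second loop keeps the check.
void split_loop(int n, int len) {
  assert(len > 0 && "the example assumes len is known positive");
  int limit = std::min(n, len);
  for (int i = 0; i < limit; i++)
    do_something();
  for (int i = limit; i < n; i++) {
    if (0 <= i && i < len)
      do_something();
    else
      throw std::out_of_range("index out of bounds");
  }
}
```

Both functions call `do_something()` the same number of times and throw at the same iteration; only the number of checks executed differs.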
-
Kevin Enderby authored
the macho-archive-headers.test added with r226228. llvm-svn: 226232
-
Matt Arsenault authored
llvm-svn: 226230
-
Filipe Cabecinhas authored
llvm-svn: 226229
-
Kevin Enderby authored
Add the option, -archive-headers, used with -macho to print the Mach-O archive headers to llvm-objdump. llvm-svn: 226228
-
Matt Arsenault authored
Instructions with 1 operand can still use source modifiers, so make sure we don't print an extra comma afterwards. llvm-svn: 226226
-
Colin LeMahieu authored
llvm-svn: 226224
-
- Jan 15, 2015
-
-
Filipe Cabecinhas authored
Report fatal errors instead of segfaulting/asserting on a few invalid accesses while reading MachO files.

Summary: Shift an older “invalid file” test to get a consistent naming for these tests.

Bugs found by afl-fuzz

Reviewers: rafael
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D6945
llvm-svn: 226219
-
Lang Hames authored
be exported from a dylib if their containing object file were linked into one. No test case: No command line tools query this flag, and there are no Object unit tests. llvm-svn: 226217
-
Sanjoy Das authored
The change used C++11 features not supported by MSVC 2012. I will fix the change to use things supported by MSVC 2012 and recommit shortly. llvm-svn: 226216
-
David Majnemer authored
This silences a GCC warning. llvm-svn: 226215
-
Andrew Kaylor authored
llvm-svn: 226214
-
Colin LeMahieu authored
[Hexagon] Fix 226206 by uncommenting required pattern and changing patterns for simple load-extends. llvm-svn: 226210
-
Hal Finkel authored
Function pointers under PPC64 ELFv1 (which is used on PPC64/Linux on the POWER7, A2 and earlier cores) are really pointers to a function descriptor, a structure with three pointers: the actual pointer to the code to which to jump, the pointer to the TOC needed by the callee, and an environment pointer. We used to chain these loads, and make them opaque to the rest of the optimizer, so that they'd always occur directly before the call. This is not necessary, and in fact, highly suboptimal on embedded cores. Once the function pointer is known, the loads can be performed ahead of time; in fact, they can be hoisted out of loops.

Now these function descriptors are almost always generated by the linker, and thus the contents of the descriptors are invariant. As a result, by default, we'll mark the associated loads as invariant (allowing them to be hoisted out of loops). I've added a target feature to turn this off, however, just in case someone needs that option (constructing an on-stack descriptor, casting it to a function pointer, and then calling it cannot be well-defined C/C++ code, but I can imagine some JIT-compilation system doing so).

Consider this simple test:

  $ cat call.c
  typedef void (*fp)();
  void bar(fp x) {
    for (int i = 0; i < 1600000000; ++i)
      x();
  }

  $ cat main.c
  typedef void (*fp)();
  void bar(fp x);
  void foo() {}
  int main() {
    bar(foo);
  }

On the PPC A2 (the BG/Q supercomputer), marking the function-descriptor loads as invariant brings the execution time down to ~8 seconds from ~32 seconds with the loads in the loop.

The difference on the POWER7 is smaller. Compiling with:

  gcc -std=c99 -O3 -mcpu=native call.c main.c : ~6 seconds [this is 4.8.2]
  clang -O3 -mcpu=native call.c main.c : ~5.3 seconds
  clang -O3 -mcpu=native call.c main.c -mno-invariant-function-descriptors : ~4 seconds

(looks like we'd benefit from additional loop unrolling here, as a first guess, because this is faster with the extra loads)

The -mno-invariant-function-descriptors will be added to Clang shortly.

llvm-svn: 226207
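To make the descriptor layout and the effect of hoisting concrete, here is a small portable C++ model. It is only a sketch: the struct and field names are hypothetical, the real descriptor is laid out by the ABI and the linker rather than declared in source, and `volatile` is used here solely to force the per-iteration reloads that the old chained, opaque loads amounted to.

```cpp
// Hypothetical model of an ELFv1-style function descriptor: three pointers,
// as described in the message above. Field names are illustrative only.
struct FunctionDescriptor {
  void (*entry)();  // address of the code to jump to
  void *toc;        // TOC pointer needed by the callee
  void *env;        // environment pointer
};

static int g_descriptor_calls = 0;
static void callee() { ++g_descriptor_calls; }

// Models the old behaviour: all three fields are reloaded before each call.
void call_with_loads_in_loop(const volatile FunctionDescriptor *fd, int n) {
  for (int i = 0; i < n; ++i) {
    void (*entry)() = fd->entry;
    void *toc = fd->toc;
    void *env = fd->env;
    (void)toc;  // a real call sequence would install these in registers
    (void)env;
    entry();
  }
}

// Models the new behaviour: the descriptor contents are treated as
// invariant, so the loads happen once, before the loop.
void call_with_hoisted_loads(const volatile FunctionDescriptor *fd, int n) {
  void (*entry)() = fd->entry;
  void *toc = fd->toc;
  void *env = fd->env;
  (void)toc;
  (void)env;
  for (int i = 0; i < n; ++i)
    entry();
}
```

The two functions make the same calls; the second simply performs three loads in total instead of three per iteration, which is the saving the invariant marking makes available to the optimizer.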
-
Colin LeMahieu authored
llvm-svn: 226206
-
Sanjoy Das authored
IRCE eliminates range checks of the form

  0 <= A * I + B < Length

by splitting a loop's iteration space into three segments in a way that the check is completely redundant in the middle segment. As an example, IRCE will convert

  len = < known positive >
  for (i = 0; i < n; i++) {
    if (0 <= i && i < len) {
      do_something();
    } else {
      throw_out_of_bounds();
    }
  }

to

  len = < known positive >
  limit = smin(n, len)
  // no first segment
  for (i = 0; i < limit; i++) {
    if (0 <= i && i < len) { // this check is fully redundant
      do_something();
    } else {
      throw_out_of_bounds();
    }
  }
  for (i = limit; i < n; i++) {
    if (0 <= i && i < len) {
      do_something();
    } else {
      throw_out_of_bounds();
    }
  }

IRCE can deal with multiple range checks in the same loop (it takes the intersection of the ranges that will make each of them redundant individually).

Currently IRCE does not do any profitability analysis. That is a TODO.

Please note that the status of this pass is *experimental*, and it is not part of any default pass pipeline. Having said that, I would love to get feedback and general input from people interested in trying this out.

Differential Revision: http://reviews.llvm.org/D6693
llvm-svn: 226201
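The remark about multiple range checks can be illustrated with a little interval arithmetic. The sketch below is not taken from the pass: it assumes the simplified case A = 1, so each check has the form 0 <= I + B < Length, and the types `Range` and `Check` and both function names are hypothetical.

```cpp
#include <algorithm>
#include <limits>
#include <vector>

// Half-open interval [Begin, End) of induction-variable values.
struct Range {
  long Begin;
  long End;
  bool empty() const { return Begin >= End; }
};

// One range check of the form 0 <= I + Offset < Length (A = 1 assumed).
struct Check {
  long Offset;
  long Length;
};

// Values of I for which a single check is guaranteed to pass.
Range safeRange(const Check &C) { return {-C.Offset, C.Length - C.Offset}; }

// Values of I for which every check passes: the intersection of the
// individual safe ranges.
Range intersectAll(const std::vector<Check> &Checks) {
  Range R{std::numeric_limits<long>::min(), std::numeric_limits<long>::max()};
  for (const Check &C : Checks) {
    Range S = safeRange(C);
    R.Begin = std::max(R.Begin, S.Begin);
    R.End = std::min(R.End, S.End);
  }
  return R;
}
```

For example, the checks `0 <= i < 10` and `0 <= i + 2 < 8` are individually safe on [0, 10) and [-2, 6), so both are redundant only on [0, 6). An empty intersection means there is no middle segment in which all the checks can be dropped together.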
-
Hal Finkel authored
Reapply r226071 with fixes.

Two fixes:

1. We need to manually remove the old and create the new 'dead defs' associated with physical register definitions when we move the definition of the physical register from the copy point to the point of the original vreg def. This problem was picked up by the MachineInstr verifier, and could trigger a verification failure on test/CodeGen/X86/2009-02-12-DebugInfoVLA.ll, so I've turned on the verifier in the tests.

2. When moving the def point of the phys reg up, we need to make sure that it is neither defined nor read in between the two instructions. We don't, however, extend the live ranges of phys reg defs to cover uses, so just checking for live-range overlap between the pair interval and the phys reg aliases won't pick up reads. As a result, we manually iterate over the range and check for reads.

A test soon to be committed to the PowerPC backend will test this change.

Original commit message:

[RegisterCoalescer] Remove copies to reserved registers

This allows the RegisterCoalescer to join "non-flipped" range pairs with a physical destination register -- which allows the RegisterCoalescer to remove copies like this:

  <vreg> = something (maybe a load, for example)
  ... (things that don't use PHYSREG)
  PHYSREG = COPY <vreg>

(with all of the restrictions normally applied by the RegisterCoalescer: having compatible register classes, etc.)

Previously, the RegisterCoalescer handled only the opposite case (copying *from* a physical register). I don't handle the problem fully here, but try to get the common case where there is only one use of <vreg> (the COPY).

An upcoming commit to the PowerPC backend will make this pattern much more common on PPC64/ELF systems.

llvm-svn: 226200
-
Philip Reames authored
Use static functions for helpers rather than static member functions. a) this changes the linking (minor at best), and b) this makes it obvious no object state is involved. llvm-svn: 226198
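A short sketch of the distinction this commit draws, using hypothetical names (`Counter`, `increment`) that do not come from the changed code. A file-level `static` function has internal linkage and has no access to the class's private members, so a reader can see at a glance that no object state is involved:

```cpp
// Hypothetical class for illustration only.
class Counter {
public:
  explicit Counter(int Start) : Value(Start) {}
  int next();

private:
  int Value;
};

// File-local helper: internal linkage, and no access to Counter's private
// members. Declared as a static member function instead, it would be part
// of the class interface and could reach private state of any Counter
// passed to it.
static int increment(int X) { return X + 1; }

int Counter::next() {
  Value = increment(Value);
  return Value;
}
```

The trade-off is that a free helper cannot see private members, so any state it needs has to be passed in explicitly.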
-
Matt Arsenault authored
llvm-svn: 226197
-
Philip Reames authored
llvm-svn: 226196
-
Philip Reames authored
This is preparation for an update to http://reviews.llvm.org/D6811. GCStrategy.cpp will hopefully be moving into IR/, whereas the lowering logic needs to stay in CodeGen/ llvm-svn: 226195
-
Colin LeMahieu authored
[Hexagon] Removing old versions of vsplice, valign, cl0, ct0 and updating references to new versions. llvm-svn: 226194
-
Marek Olsak authored
This removes some duplicated classes and definitions. These instructions are defined:

  _e32 // pseudo
  _e32_si
  _e64 // pseudo
  _e64_si
  _e64_vi

llvm-svn: 226191
-
Marek Olsak authored
llvm-svn: 226190
-