- Jan 15, 2015
-
Andrew Kaylor authored
llvm-svn: 226214
-
Rafael Espindola authored
When combined with llvm not producing implicit comdats, not doing this would cause code bloat on ELF and link errors on COFF. llvm-svn: 226211
-
Colin LeMahieu authored
[Hexagon] Fix 226206 by uncommenting required pattern and changing patterns for simple load-extends. llvm-svn: 226210
-
Hal Finkel authored
The PPC backend will now assume that PPC64 ELFv1 function descriptors are invariant. This must be true for well-defined C/C++ code, but I'm providing an option to disable this assumption in case someone's JIT-engine needs it. llvm-svn: 226209
-
Hans Wennborg authored
Clang would previously become confused and crash here. It does not make a lot of sense to export these, so warning seems appropriate. MSVC will export some member functions for this kind of specialization, whereas MinGW ignores the dllexport-edness. The latter behaviour seems better. Differential Revision: http://reviews.llvm.org/D6984 llvm-svn: 226208
-
Hal Finkel authored
Function pointers under PPC64 ELFv1 (which is used on PPC64/Linux on the POWER7, A2 and earlier cores) are really pointers to a function descriptor, a structure with three pointers: the actual pointer to the code to which to jump, the pointer to the TOC needed by the callee, and an environment pointer. We used to chain these loads, and make them opaque to the rest of the optimizer, so that they'd always occur directly before the call. This is not necessary, and in fact, highly suboptimal on embedded cores. Once the function pointer is known, the loads can be performed ahead of time; in fact, they can be hoisted out of loops.

Now these function descriptors are almost always generated by the linker, and thus the contents of the descriptors are invariant. As a result, by default, we'll mark the associated loads as invariant (allowing them to be hoisted out of loops). I've added a target feature to turn this off, however, just in case someone needs that option (constructing an on-stack descriptor, casting it to a function pointer, and then calling it cannot be well-defined C/C++ code, but I can imagine some JIT-compilation system doing so).

Consider this simple test:

    $ cat call.c
    typedef void (*fp)();
    void bar(fp x) {
      for (int i = 0; i < 1600000000; ++i)
        x();
    }

    $ cat main.c
    typedef void (*fp)();
    void bar(fp x);
    void foo() {}
    int main() { bar(foo); }

On the PPC A2 (the BG/Q supercomputer), marking the function-descriptor loads as invariant brings the execution time down to ~8 seconds from ~32 seconds with the loads in the loop. The difference on the POWER7 is smaller. Compiling with:

    gcc -std=c99 -O3 -mcpu=native call.c main.c                                       : ~6 seconds [this is 4.8.2]
    clang -O3 -mcpu=native call.c main.c                                              : ~5.3 seconds
    clang -O3 -mcpu=native call.c main.c -mno-invariant-function-descriptors          : ~4 seconds

(looks like we'd benefit from additional loop unrolling here, as a first guess, because this is faster with the extra loads)

The -mno-invariant-function-descriptors will be added to Clang shortly. llvm-svn: 226207
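For readers unfamiliar with ELFv1 function descriptors, here is a rough sketch of the layout described above. The field names are invented for illustration; the real layout is defined by the ELFv1 ABI, and the descriptors are normally emitted by the linker, not declared in source.

    // Hypothetical sketch of a PPC64 ELFv1 function descriptor as described
    // in the commit message: three pointers -- code entry, TOC base, and an
    // environment pointer. Field names are illustrative only.
    struct FunctionDescriptor {
      void *entry; // address of the first instruction of the callee
      void *toc;   // TOC (table of contents) base pointer for the callee
      void *env;   // environment pointer (unused by C/C++)
    };
    // A call through a function pointer conceptually loads entry and toc
    // before branching; this change lets those loads be treated as
    // invariant by default, so they can be hoisted out of loops.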
-
Colin LeMahieu authored
llvm-svn: 226206
-
Rui Ueyama authored
llvm-svn: 226205
-
Vince Harron authored
Switched from ::strtoul to StringConvert::ToUInt32. Changed the port output parameter to be -1 if the port is unspecified. llvm-svn: 226204
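A minimal sketch of the described behaviour (not LLDB's actual code; the helper below is a hypothetical stand-in rather than the StringConvert API):

    #include <cstdint>
    #include <cstdlib>
    #include <string>

    // Hypothetical sketch: parse a port string, leaving port = -1 when the
    // port is unspecified, mirroring the change described above.
    static void ParsePort(const std::string &spec, int32_t &port) {
      port = -1;                                   // -1 means "unspecified"
      if (spec.empty())
        return;
      char *end = nullptr;
      unsigned long value = std::strtoul(spec.c_str(), &end, 10);
      if (end != spec.c_str() && *end == '\0' && value <= 65535)
        port = static_cast<int32_t>(value);
    }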
-
Hal Finkel authored
This test casts 0x4 to a function pointer and calls it. Unfortunately, the faulting address may not exactly be 0x4 on PPC64 ELFv1 systems. The LLVM PPC backend used to always generate the loads "in order", so we'd fault at 0x4 anyway. However, an upcoming change will loosen that ordering, and we'll pick a different order on some targets. As a result, as explained in the comment, we need to allow for certain nearby addresses as well. llvm-svn: 226202
-
Sanjoy Das authored
IRCE eliminates range checks of the form

    0 <= A * I + B < Length

by splitting a loop's iteration space into three segments in a way that the check is completely redundant in the middle segment. As an example, IRCE will convert

    len = < known positive >
    for (i = 0; i < n; i++) {
      if (0 <= i && i < len) {
        do_something();
      } else {
        throw_out_of_bounds();
      }
    }

to

    len = < known positive >
    limit = smin(n, len)
    // no first segment
    for (i = 0; i < limit; i++) {
      if (0 <= i && i < len) { // this check is fully redundant
        do_something();
      } else {
        throw_out_of_bounds();
      }
    }
    for (i = limit; i < n; i++) {
      if (0 <= i && i < len) {
        do_something();
      } else {
        throw_out_of_bounds();
      }
    }

IRCE can deal with multiple range checks in the same loop (it takes the intersection of the ranges that will make each of them redundant individually). Currently IRCE does not do any profitability analysis; that is a TODO. Please note that the status of this pass is *experimental*, and it is not part of any default pass pipeline. Having said that, I would love to get feedback and general input from people interested in trying this out. Differential Revision: http://reviews.llvm.org/D6693 llvm-svn: 226201
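As a small, invented illustration of the "multiple range checks in the same loop" case mentioned above (not code from the patch): the middle, check-free segment would be bounded by the intersection of the ranges that make each check redundant, conceptually limit = smin(n, len_a, len_b).

    // Hypothetical example with two range checks in the same loop. IRCE's
    // middle segment would cover only iterations where both checks are
    // provably redundant.
    void add_checked(int *a, int len_a, const int *b, int len_b, int n) {
      for (int i = 0; i < n; ++i) {
        if (!(0 <= i && i < len_a))   // range check against len_a
          return;                     // stand-in for throw_out_of_bounds()
        if (!(0 <= i && i < len_b))   // range check against len_b
          return;
        a[i] += b[i];
      }
    }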
-
Hal Finkel authored
Reapply r226071 with fixes. Two fixes:

1. We need to manually remove the old and create the new 'dead defs' associated with physical register definitions when we move the definition of the physical register from the copy point to the point of the original vreg def. This problem was picked up by the machine instruction verifier, and could trigger a verification failure on test/CodeGen/X86/2009-02-12-DebugInfoVLA.ll, so I've turned on the verifier in the tests.

2. When moving the def point of the phys reg up, we need to make sure that it is neither defined nor read in between the two instructions. We don't, however, extend the live ranges of phys reg defs to cover uses, so just checking for live-range overlap between the pair interval and the phys reg aliases won't pick up reads. As a result, we manually iterate over the range and check for reads.

A test soon to be committed to the PowerPC backend will test this change.

Original commit message:

[RegisterCoalescer] Remove copies to reserved registers

This allows the RegisterCoalescer to join "non-flipped" range pairs with a physical destination register -- which allows the RegisterCoalescer to remove copies like this:

    <vreg> = something (maybe a load, for example)
    ... (things that don't use PHYSREG)
    PHYSREG = COPY <vreg>

(with all of the restrictions normally applied by the RegisterCoalescer: having compatible register classes, etc.)

Previously, the RegisterCoalescer handled only the opposite case (copying *from* a physical register). I don't handle the problem fully here, but try to get the common case where there is only one use of <vreg> (the COPY).

An upcoming commit to the PowerPC backend will make this pattern much more common on PPC64/ELF systems. llvm-svn: 226200
-
Vince Harron authored
The refactor was motivated by some comments that Greg made in http://reviews.llvm.org/D6918, and also by the need to break a dependency cascade that caused functions that link in the string->int conversion functions to pull in most of lldb. llvm-svn: 226199
-
Philip Reames authored
Use static functions for helpers rather than static member functions: a) this changes the linkage (minor at best), and b) it makes it obvious that no object state is involved. llvm-svn: 226198
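A tiny, invented illustration of the pattern (names are made up; this is not the code touched by the commit):

    #include <string>

    // Before (hypothetical): the helper was declared as a static member, e.g.
    //   struct Lowering { static bool isDigits(const std::string &s); };
    // After (hypothetical): a file-local static function -- internal linkage,
    // and it is obvious at a glance that no object state is involved.
    static bool isDigits(const std::string &s) {
      for (char c : s)
        if (c < '0' || c > '9')
          return false;
      return !s.empty();
    }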
-
Matt Arsenault authored
llvm-svn: 226197
-
Philip Reames authored
llvm-svn: 226196
-
Philip Reames authored
This is preparation for an update to http://reviews.llvm.org/D6811. GCStrategy.cpp will hopefully be moving into IR/, whereas the lowering logic needs to stay in CodeGen/. llvm-svn: 226195
-
Colin LeMahieu authored
[Hexagon] Removing old versions of vsplice, valign, cl0, ct0 and updating references to new versions. llvm-svn: 226194
-
Dan Albert authored
Fixes issue in r226185. llvm-svn: 226192
-
Marek Olsak authored
This removes some duplicated classes and definitions. These instructions are defined:

    _e32 // pseudo
    _e32_si
    _e64 // pseudo
    _e64_si
    _e64_vi

llvm-svn: 226191
-
Marek Olsak authored
llvm-svn: 226190
-
Marek Olsak authored
These are VOP3-only on VI. The new multiclass doesn't define VOP3 versions of VOP2 instructions. llvm-svn: 226189
-
Marek Olsak authored
v2: modify hasVALU32BitEncoding instead
v3: - add pseudoToMCOpcode helper to AMDGPUInstInfo, which is used by both hasVALU32BitEncoding and AMDGPUMCInstLower::lower
    - report an error if a pseudo can't be lowered

llvm-svn: 226188
-
Marek Olsak authored
llvm-svn: 226187
-
Marek Olsak authored
llvm-svn: 226186
-
Dan Albert authored
llvm-svn: 226185
-
Colin LeMahieu authored
llvm-svn: 226184
-
Ramkumar Ramachandra authored
Mechanical conversion of statepoint tests to use the example-statepoint gc. llvm-svn: 226183
-
Joerg Sonnenberger authored
llvm-svn: 226182
-
Nico Weber authored
llvm-svn: 226181
-
Nico Weber authored
llvm-svn: 226180
-
Colin LeMahieu authored
llvm-svn: 226179
-
Nathan Sidwell authored
Reject CV-qualified void return types on C function definitions, per 6.9.1/3. llvm-svn: 226178
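A minimal, invented example of the kind of definition this now rejects when the file is compiled as C:

    // Hypothetical C translation unit. After this change, Clang rejects the
    // cv-qualified void return type on the definition below (per 6.9.1/3);
    // dropping the 'const' makes it valid again.
    const void do_nothing(void) {}

    int main(void) { return 0; }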
-
Evgeniy Stepanov authored
llvm-svn: 226177
-
Colin LeMahieu authored
llvm-svn: 226176
-
Evgeniy Stepanov authored
Allows loading sanitizer options from file. llvm-svn: 226175
-
Jon Roelofs authored
llvm-svn: 226174
-
Timur Iskhodzhanov authored
It breaks AddressSanitizer on Windows. llvm-svn: 226173
-
Alexander Kornienko authored
We are porting some of the checkers we developed at a company to the Clang Tidy infrastructure. We would like to open source the checkers that may be useful for the community as well. This patch is the first checker being ported to Clang Tidy. We also added fix-it hints, and applied them to LLVM: http://reviews.llvm.org/D6924 The code compiled and the unit tests passed after the fix-its were applied.

The documentation of the checker:

    /// The emptiness of a container should be checked using the empty method
    /// instead of the size method. It is not guaranteed that size is a
    /// constant-time function, and it is generally more efficient and also
    /// shows clearer intent to use empty. Furthermore some containers may
    /// implement the empty method but not implement the size method. Using
    /// empty whenever possible makes it easier to switch to another container
    /// in the future.

It also uses some custom ASTMatchers. In case you find them useful, I can submit them as separate patches to clang. I will apply your suggestions to this patch. http://reviews.llvm.org/D6925 Patch by Gábor Horváth! llvm-svn: 226172
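A small, invented example of the kind of code such a check flags and what the fix-it would produce (illustrative only; the exact diagnostic text is not quoted from the patch):

    #include <string>
    #include <vector>

    // Flagged pattern: using size() to test for emptiness.
    bool hasWork(const std::vector<std::string> &queue) {
      return queue.size() > 0;      // the check would suggest !queue.empty()
    }

    // After applying the fix-it: clearer intent, and empty() is guaranteed
    // constant time for standard containers.
    bool hasWorkFixed(const std::vector<std::string> &queue) {
      return !queue.empty();
    }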
-
Daniel Sanders authored
Summary: The patterns intended for the SETLE node were actually matching the SETLT node.

Reviewers: atanasyan, sstankovic, vmedic

Reviewed By: vmedic

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D6997

llvm-svn: 226171
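For context, a tiny, invented illustration of why the distinction matters: SETLT and SETLE differ exactly on the equal case, so a <= comparison lowered through a pattern written for < would produce the wrong result when the operands are equal.

    // SETLT corresponds to a strict "less than" comparison, SETLE to
    // "less than or equal"; they disagree only when the operands are equal.
    bool lessThan(int a, int b)  { return a < b; }   // SETLT-style comparison
    bool lessEqual(int a, int b) { return a <= b; }  // SETLE-style comparison

    // lessThan(3, 3)  -> false
    // lessEqual(3, 3) -> true; a SETLE pattern that actually matched SETLT
    // would get this case wrong.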
-