- Apr 23, 2016
-
-
Craig Topper authored
[CodeGen] When promoting CTTZ operations to larger type, don't insert a select to detect if the input is zero to return the original size instead of the extended size. Instead just set the first bit in the zero extended part. llvm-svn: 267280
-
Duncan P. N. Exon Smith authored
Since forward references for uniqued node operands are expensive (and those for distinct node operands are cheap due to DistinctMDOperandPlaceholder), minimize forward references in uniqued node operands. Moreover, guarantee that when a cycle is broken by a distinct node, none of the uniqued nodes have any forward references. In ValueEnumerator::EnumerateMetadata, enumerate uniqued node subgraphs first, delaying distinct nodes until all uniqued nodes have been handled. This guarantees that uniqued nodes only have forward references when there is a uniquing cycle (since r267276 changed ValueEnumerator::organizeMetadata to partition distinct nodes in front of uniqued nodes as a post-pass). Note that a single uniqued subgraph can hit multiple distinct nodes at its leaves. Ideally these would themselves be emitted in post-order, but this commit doesn't attempt that; I think it requires an extra pass through the edges, which I'm not convinced is worth it (since DistinctMDOperandPlaceholder makes forward references quite cheap between distinct nodes). I've added two testcases: - test/Bitcode/mdnodes-distinct-in-post-order.ll is just like test/Bitcode/mdnodes-in-post-order.ll, except with distinct nodes instead of uniqued ones. This confirms that, in the absence of uniqued nodes, distinct nodes are still emitted in post-order. - test/Bitcode/mdnodes-distinct-nodes-break-cycles.ll is the minimal example where a naive post-order traversal would cause one uniqued node to forward-reference another. IOW, it's the motivating test. llvm-svn: 267278
-
Duncan P. N. Exon Smith authored
http://lab.llvm.org:8011/builders/clang-x64-ninja-win7/builds/11700 llvm-svn: 267277
-
Duncan P. N. Exon Smith authored
When an operand of a distinct node hasn't been read yet, the reader can use a DistinctMDOperandPlaceholder. This is much cheaper than forward referencing from a uniqued node. Change ValueEnumerator::organizeMetadata to partition distinct nodes and uniqued nodes to reduce the overhead of cycles broken by distinct nodes. Mehdi measured this for me; this removes most of the RAUW from the importing step of -flto=thin, even after a WIP patch that removes string-based DITypeRefs (introducing many more cycles to the metadata graph). llvm-svn: 267276
-
Teresa Johnson authored
llvm-svn: 267274
-
Teresa Johnson authored
Summary: As discussed in on the mailing list yesterday, I have refactored BitcodeWriter.cpp to use classes to manage the bitcode writing process, instead of passing around long lists of parameters between static functions. See: http://lists.llvm.org/pipermail/llvm-dev/2016-April/098610.html I created a parent BitcodeWriter class to own the BitstreamWriter, write the header, and contain the main entry point into the writing process. There are two derived classes, one for writing a module and one for writing a combined index file (for ThinLTO), which manage the writing process specific to those bitcode file types. I also changed the functions to conform to LLVM coding standards (lowercase function name first letter). The only two routines that still start with an uppercase letter are the two external interfaces, which can be fixed as a follow-on (I wanted to keep this round just within BitcodeWriter.cpp). Reviewers: dexonsmith, joker.eph Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D19447 llvm-svn: 267273
-
Duncan P. N. Exon Smith authored
http://bb.pgr.jp/builders/cmake-llvm-x86_64-linux/builds/36074 llvm-svn: 267272
-
Duncan P. N. Exon Smith authored
Mehdi's pattern recognition pulled this one out. This is cleaner with std::find_if than with the strange helper function that took an iterator by reference and updated it. llvm-svn: 267271
-
Duncan P. N. Exon Smith authored
Each reference to an unresolved MDNode is expensive, since the RAUW support in MDNode uses a separate allocation and side map. Since a distinct MDNode doesn't require its operands on creation (unlike uniuqed nodes, there's no need to check for structural equivalence), use nullptr for any of its unresolved operands. Besides reducing the burden on MDNode maps, this can avoid allocating temporary MDNodes in the first place. We need some way to track operands. Invent DistinctMDOperandPlaceholder for this purpose, which is a Metadata subclass that holds an ID and points at its single user. DistinctMDOperandPlaceholder::replaceUseWith is just like RAUW, but its name highlights that there is only ever exactly one use. There is no support for moving (or, obviously, copying) these. Move support would be possible but expensive; leaving it unimplemented prevents user error. In the BitcodeReader I originally considered allocating on a BumpPtrAllocator and keeping a vector of pointers to them, and then I realized that std::deque implements exactly this. A couple of obvious follow-ups: - Change ValueEnumerator to emit distinct nodes first to take more advantage of this optimization. (How convenient... I think I might have a couple of patches for this.) - Change DIBuilder and its consumers (like CGDebugInfo in clang) to use something like this when constructing debug info in the first place. llvm-svn: 267270
-
Duncan P. N. Exon Smith authored
Consistently use the IsDistinct variable and start relying on it in GET_OR_DISTINCT. This change has NFC, but prepares for using IsDistinct to optimize the behaviour of the getMD() and getMDOrNull() helpers. llvm-svn: 267268
-
Duncan P. N. Exon Smith authored
The only functionality change was removing an error check from the BitcodeReader (and an assertion from DILocation::getImpl) that is already caught by Verifier::visitDILocation. The Verifier is a better place for this anyway, and being inconsistent with other subclasses of MDNode isn't serving anyone. llvm-svn: 267267
-
Craig Topper authored
[Hexagon] Set ctlz_zero_undef/cttz_zero_undef to Expand so LegalizeDAG will convert them to ctlz/cttz. Remove the now unneccessary isel patterns. NFC llvm-svn: 267266
-
Craig Topper authored
[NVPTX] Set ctlz_zero_undef to Expand so LegalizeDAG will convert it to ctlz. Remove the now unneccessary isel patterns. NFC llvm-svn: 267265
-
Craig Topper authored
[WebAssembly] Set ctlz_zero_undef/cttz_zero_undef to Expand so LegalizeDAG will convert them to ctlz/cttz. Remove the now unneccessary isel patterns. NFC llvm-svn: 267264
-
Amaury Sechet authored
llvm-svn: 267257
-
Tim Northover authored
Only one consumer (llvm-objdump) actually cared about the fact that there were two triples. Others were actively working around the fact that the Triple returned by getArch might have been invalid. As for llvm-objdump, it needs to be acutely aware of both Triples anyway, so being generic in the exposed API is no benefit. Also rename the version of getArch returning a Triple. Users were having to pass an unwanted nullptr to disambiguate the two, which was nasty. The only functional change here is that armv7m and armv7em object files no longer crash llvm-objdump. llvm-svn: 267249
-
Matt Arsenault authored
llvm-svn: 267244
-
David Blaikie authored
The dwo_name was added to dwo files to improve diagnostics in dwp, but it confuses tools that attempt to load any dwo named by a dwo_name, even ones inside dwos. Avoid this by keeping track of whether a unit is already a dwo unit, and if so, not loading further dwos. llvm-svn: 267241
-
Matt Arsenault authored
This fixes test regressions when i64 loads/stores are made promote. llvm-svn: 267240
-
Andrew Kaylor authored
llvm-svn: 267236
-
Nico Weber authored
llvm-svn: 267232
-
Andrew Kaylor authored
The original commit was reverted because of a buildbot problem with LazyCallGraph::SCC handling (not related to the OptBisect handling). Differential Revision: http://reviews.llvm.org/D19172 llvm-svn: 267231
-
- Apr 22, 2016
-
-
-
David Blaikie authored
Rather than relying on the gmlt-like data emitted into the .o/executable which only contains the simple name of any inlined functions, use the .dwo file if present. Test symbolication with/without a .dwo, and the old test that was testing behavior when no gmlt-like data was present. (I haven't included a test of non-gmlt-like data + no .dwo (that would be akin to symbolication with no debug info) but we could add one for completeness) The test was simplified a bit to be a little clearer (unoptimized, force inline, using a function call as the inlined entity) and regenerated with ToT clang. For the no-gmlt-like-data case, I modified Clang back to its old behavior temporarily & the .dwo file is identical so it is shared between the two executables. llvm-svn: 267227
-
Dehao Chen authored
Summary: The clang assembler assumes that the discriminator remains the same when there is source line change. The correct behavior is that when there is line change, discriminator will automatically reset to 0. Reviewers: dnovillo, davidxl, echristo Subscribers: echristo, llvm-commits Differential Revision: http://reviews.llvm.org/D19436 llvm-svn: 267226
-
Peter Collingbourne authored
This intrinsic takes two arguments, ``%ptr`` and ``%offset``. It loads a 32-bit value from the address ``%ptr + %offset``, adds ``%ptr`` to that value and returns it. The constant folder specifically recognizes the form of this intrinsic and the constant initializers it may load from; if a loaded constant initializer is known to have the form ``i32 trunc(x - %ptr)``, the intrinsic call is folded to ``x``. LLVM provides that the calculation of such a constant initializer will not overflow at link time under the medium code model if ``x`` is an ``unnamed_addr`` function. However, it does not provide this guarantee for a constant initializer folded into a function body. This intrinsic can be used to avoid the possibility of overflows when loading from such a constant. Differential Revision: http://reviews.llvm.org/D18367 llvm-svn: 267223
-
Matt Arsenault authored
Instead of iterating over all vectors and skipping integers. llvm-svn: 267220
-
Matt Arsenault authored
If the target allows the alignment, this should be OK. llvm-svn: 267217
-
Rong Xu authored
This patch changes the interface for createPGOFuncNameMetadata() where we add another PGOFuncName argument. Differential Revision: http://reviews.llvm.org/D19433 llvm-svn: 267216
-
Philip Reames authored
The existing code turned out to be completely correct when auditted. Thus, only minor code changes and adding a couple of tests. llvm-svn: 267215
-
Sanjoy Das authored
Summary: We can fold compares to false when two distinct allocations within a function are compared for equality. Patch by Anna Thomas! Reviewers: majnemer, reames, sanjoy Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D19390 llvm-svn: 267214
-
Peter Collingbourne authored
The relative vtable ABI (PR26723) needs PLT relocations to refer to virtual functions defined in other DSOs. The unnamed_addr attribute means that the function's address is not significant, so we're allowed to substitute it with the address of a PLT entry. Also includes a bonus feature: addends for COFF image-relative references. Differential Revision: http://reviews.llvm.org/D17938 llvm-svn: 267211
-
Philip Reames authored
Extend the type canonicalization logic to work for unordered atomic loads and stores. Note that while this change itself is fairly simple and low risk, there's a reasonable chance this will expose problems in the backends by suddenly generating IR they wouldn't have seen before. Anything of this nature will be an existing bug in the backend (you could write an atomic float load), but this will definitely change the frequency with which such cases are encountered. If you see problems, feel free to revert this change, but please make sure you collect a test case. llvm-svn: 267210
-
Matt Arsenault authored
If the target allows the alignment, this should still be OK. llvm-svn: 267209
-
Quentin Colombet authored
The opcode for the optimized branch does not depend on the size of the activate bits in the AND masks, but the AND opcode itself. Indeed, we need to use a X or W variant based on the AND variant not based on whether the mask fits into the related variant. Otherwise, we may end up using the W variant of the optimized branch for 64-bit register inputs! This fixes the last make check verifier issues for AArch64: PR27479. llvm-svn: 267206
-
Justin Bogner authored
llvm-svn: 267199
-
Justin Bogner authored
This will make the port to the new PM easier to follow. llvm-svn: 267198
-
Jun Bum Lim authored
Summary: This change will shorten memset if the beginning of memset is overwritten by later stores. Reviewers: hfinkel, eeckstein, dberlin, mcrosier Subscribers: mgrang, mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D18906 llvm-svn: 267197
-
Justin Bogner authored
Also add a very basic test, since apparently there aren't any tests for DCE whatsoever to add the new pass version to. llvm-svn: 267196
-
Matthias Braun authored
llvm-svn: 267191
-