Commits · d2d1504805f9bf2a9925dd4f09bfe5329f819030 · Lorenzo Albano / LLVM bpEVL

Apr 23, 2016

[CodeGen] When promoting CTTZ operations to larger type, don't insert a select... · 7e5fad66

Craig Topper authored Apr 23, 2016

[CodeGen] When promoting CTTZ operations to larger type, don't insert a select to detect if the input is zero to return the original size instead of the extended size. Instead just set the first bit in the zero extended part.
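The trick described above can be illustrated with a small model. This is only a sketch, not the SelectionDAG code: `cttz32` and `cttz8ViaPromotion` are hypothetical names, and an 8-bit value promoted to 32 bits is assumed as the example width.

```cpp
#include <cassert>
#include <cstdint>

// Reference count-trailing-zeros on a 32-bit value; returns 32 for zero.
unsigned cttz32(std::uint32_t V) {
  unsigned N = 0;
  while (N < 32 && ((V >> N) & 1u) == 0)
    ++N;
  return N;
}

// Hypothetical helper: an 8-bit cttz computed with the 32-bit operation.
// Setting bit 8 (the first bit of the zero-extended part) caps the result
// at 8, so no select on "X == 0" is needed.
unsigned cttz8ViaPromotion(std::uint8_t X) {
  std::uint32_t Wide = static_cast<std::uint32_t>(X) | (1u << 8);
  unsigned Result = cttz32(Wide);
  assert(Result <= 8 && "result never exceeds the original bit width");
  return Result;
}
```

For a zero input the only set bit is bit 8, so the promoted operation returns 8, which is what an 8-bit cttz is expected to return.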

llvm-svn: 267280

7e5fad66

BitcodeWriter: Emit uniqued subgraphs after all distinct nodes · 30805b24

Duncan P. N. Exon Smith authored Apr 23, 2016

Since forward references for uniqued node operands are expensive (and
those for distinct node operands are cheap due to
DistinctMDOperandPlaceholder), minimize forward references in uniqued
node operands.

Moreover, guarantee that when a cycle is broken by a distinct node, none
of the uniqued nodes have any forward references.  In
ValueEnumerator::EnumerateMetadata, enumerate uniqued node subgraphs
first, delaying distinct nodes until all uniqued nodes have been
handled.  This guarantees that uniqued nodes only have forward
references when there is a uniquing cycle (since r267276 changed
ValueEnumerator::organizeMetadata to partition distinct nodes in front
of uniqued nodes as a post-pass).

Note that a single uniqued subgraph can hit multiple distinct nodes at
its leaves.  Ideally these would themselves be emitted in post-order,
but this commit doesn't attempt that; I think it requires an extra pass
through the edges, which I'm not convinced is worth it (since
DistinctMDOperandPlaceholder makes forward references quite cheap
between distinct nodes).

I've added two testcases:

  - test/Bitcode/mdnodes-distinct-in-post-order.ll is just like
    test/Bitcode/mdnodes-in-post-order.ll, except with distinct nodes
    instead of uniqued ones.  This confirms that, in the absence of
    uniqued nodes, distinct nodes are still emitted in post-order.

  - test/Bitcode/mdnodes-distinct-nodes-break-cycles.ll is the minimal
    example where a naive post-order traversal would cause one uniqued
    node to forward-reference another.  IOW, it's the motivating test.

llvm-svn: 267278

30805b24

Avoid MSVC failure with default arguments in lambdas from r267270 · 498b4977
Duncan P. N. Exon Smith authored Apr 23, 2016
```
http://lab.llvm.org:8011/builders/clang-x64-ninja-win7/builds/11700

llvm-svn: 267277
```
498b4977

BitcodeWriter: Emit distinct nodes before uniqued nodes · 1483fff2

Duncan P. N. Exon Smith authored Apr 23, 2016

When an operand of a distinct node hasn't been read yet, the reader can
use a DistinctMDOperandPlaceholder.  This is much cheaper than forward
referencing from a uniqued node.  Change
ValueEnumerator::organizeMetadata to partition distinct nodes and
uniqued nodes to reduce the overhead of cycles broken by distinct nodes.
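The partitioning described above can be pictured with a small sketch. It is not the code in ValueEnumerator::organizeMetadata: `Node`, its `IsDistinct` field, and `distinctFirst` are hypothetical names, and `std::stable_partition` is just one way to put distinct nodes in front while keeping the relative order within each group.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical stand-in for a metadata node; only the fields the sketch needs.
struct Node {
  unsigned ID;
  bool IsDistinct;
};

// Move distinct nodes in front of uniqued ones, preserving the relative
// order inside each group.
std::vector<Node> distinctFirst(std::vector<Node> Nodes) {
  auto IsDistinct = [](const Node &N) { return N.IsDistinct; };
  std::stable_partition(Nodes.begin(), Nodes.end(), IsDistinct);
  assert(std::is_partitioned(Nodes.begin(), Nodes.end(), IsDistinct));
  return Nodes;
}
```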

Mehdi measured this for me; this removes most of the RAUW from the
importing step of -flto=thin, even after a WIP patch that removes
string-based DITypeRefs (introducing many more cycles to the metadata
graph).

llvm-svn: 267276

1483fff2

Address comments. · c814e0c1
Teresa Johnson authored Apr 23, 2016
```
llvm-svn: 267274
```
c814e0c1

Refactor bitcode writer into classes (NFC) · 37687f39

Teresa Johnson authored Apr 23, 2016

Summary:
As discussed on the mailing list yesterday, I have refactored
BitcodeWriter.cpp to use classes to manage the bitcode writing process,
instead of passing around long lists of parameters between static
functions. See:
  http://lists.llvm.org/pipermail/llvm-dev/2016-April/098610.html

I created a parent BitcodeWriter class to own the BitstreamWriter,
write the header, and contain the main entry point into the writing
process. There are two derived classes, one for writing a module and one
for writing a combined index file (for ThinLTO), which manage the
writing process specific to those bitcode file types.
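The class layout described above might look roughly like the following. This is a sketch under assumptions, not the contents of BitcodeWriter.cpp: the names `WriterBase`, `ModuleWriter`, and `CombinedIndexWriter` are hypothetical, and a `std::string` stands in for the owned BitstreamWriter.

```cpp
#include <cassert>
#include <string>

// Hypothetical parent: owns the output stream, writes the header, and
// exposes the single entry point into the writing process.
class WriterBase {
  std::string Stream; // stand-in for the owned BitstreamWriter
  bool Written = false;

protected:
  void emit(const std::string &Record) { Stream += Record; }

public:
  virtual ~WriterBase() = default;

  const std::string &write() {
    assert(!Written && "write() is meant to run once per writer");
    Written = true;
    emit("header;");
    writeContents(); // file-type-specific part
    return Stream;
  }

private:
  virtual void writeContents() = 0;
};

// One derived class per bitcode file type.
class ModuleWriter final : public WriterBase {
  void writeContents() override { emit("module;"); }
};

class CombinedIndexWriter final : public WriterBase {
  void writeContents() override { emit("combined-index;"); }
};
```

Keeping the shared state in the parent is what removes the long parameter lists: each derived class reaches the stream through `emit` instead of receiving it as an argument.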

I also changed the functions to conform to LLVM coding standards
(lowercase function name first letter). The only two routines that still
start with an uppercase letter are the two external interfaces, which
can be fixed as a follow-on (I wanted to keep this round just within
BitcodeWriter.cpp).

Reviewers: dexonsmith, joker.eph

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D19447

llvm-svn: 267273

37687f39

Avoid ternary statement to please g++ after r267270, NFC · e9f85c48
Duncan P. N. Exon Smith authored Apr 23, 2016
```
http://bb.pgr.jp/builders/cmake-llvm-x86_64-linux/builds/36074

llvm-svn: 267272
```
e9f85c48

ValueEnumerator: Use std::find_if, NFC · d9bbdce7

Duncan P. N. Exon Smith authored Apr 23, 2016

Mehdi's pattern recognition pulled this one out.  This is cleaner with
std::find_if than with the strange helper function that took an iterator
by reference and updated it.
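As a generic illustration of the difference (not the ValueEnumerator code itself), the sketch below contrasts a helper that advances an iterator passed by reference with the equivalent `std::find_if` call. `skipToMatch` and `firstNegativeIndex` are hypothetical names, and searching for a negative integer is an arbitrary stand-in predicate.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// The style being replaced (hypothetical reconstruction): a helper that
// takes an iterator by reference and advances it to the next match.
template <typename It, typename Pred>
void skipToMatch(It &I, It End, Pred P) {
  while (I != End && !P(*I))
    ++I;
}

// The same search with std::find_if: no out-parameter, the result is returned.
std::size_t firstNegativeIndex(const std::vector<int> &V) {
  auto IsNegative = [](int X) { return X < 0; };
  auto Found = std::find_if(V.begin(), V.end(), IsNegative);

  auto Old = V.begin();
  skipToMatch(Old, V.end(), IsNegative);
  assert(Found == Old && "both forms locate the same element");

  return static_cast<std::size_t>(Found - V.begin());
}
```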

llvm-svn: 267271

d9bbdce7

BitcodeReader: Avoid referencing unresolved nodes from distinct ones · 4b1bc647

Duncan P. N. Exon Smith authored Apr 23, 2016

Each reference to an unresolved MDNode is expensive, since the RAUW
support in MDNode uses a separate allocation and side map.  Since
a distinct MDNode doesn't require its operands on creation (unlike
uniqued nodes, there's no need to check for structural equivalence),
use nullptr for any of its unresolved operands.  Besides reducing the
burden on MDNode maps, this can avoid allocating temporary MDNodes in
the first place.

We need some way to track operands.  Invent DistinctMDOperandPlaceholder
for this purpose, which is a Metadata subclass that holds an ID and
points at its single user.  DistinctMDOperandPlaceholder::replaceUseWith
is just like RAUW, but its name highlights that there is only ever
exactly one use.

There is no support for moving (or, obviously, copying) these.  Move
support would be possible but expensive; leaving it unimplemented
prevents user error.  In the BitcodeReader I originally considered
allocating on a BumpPtrAllocator and keeping a vector of pointers to
them, and then I realized that std::deque implements exactly this.
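The std::deque observation can be sketched as follows. This is an illustration under assumptions, not the BitcodeReader code: `Placeholder` and `PlaceholderPool` are hypothetical stand-ins, and the placeholder here only stores an ID.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>

// Hypothetical stand-in for a placeholder: holds an ID and can be neither
// copied nor moved, so its address must stay valid once handed out.
struct Placeholder {
  unsigned ID;
  explicit Placeholder(unsigned ID) : ID(ID) {}
  Placeholder(const Placeholder &) = delete;
  Placeholder &operator=(const Placeholder &) = delete;
};

class PlaceholderPool {
  std::deque<Placeholder> Storage;

public:
  // emplace_back constructs in place; references to existing elements of a
  // std::deque remain valid when elements are added at either end.
  Placeholder &create(unsigned ID) {
    Storage.emplace_back(ID);
    return Storage.back();
  }

  Placeholder &at(std::size_t Index) {
    assert(Index < Storage.size() && "index out of range");
    return Storage[Index];
  }

  std::size_t size() const { return Storage.size(); }
};
```

A `std::vector` would not work here, since growth relocates elements and the type is not movable.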

A couple of obvious follow-ups:

  - Change ValueEnumerator to emit distinct nodes first to take more
    advantage of this optimization.  (How convenient... I think I might
    have a couple of patches for this.)

  - Change DIBuilder and its consumers (like CGDebugInfo in clang) to
    use something like this when constructing debug info in the first
    place.

llvm-svn: 267270

4b1bc647

BitcodeReader: Consistently use IsDistinct, NFC · 30ab4b47

Duncan P. N. Exon Smith authored Apr 23, 2016

Consistently use the IsDistinct variable and start relying on it in
GET_OR_DISTINCT.  This change is NFC, but prepares for using IsDistinct
to optimize the behaviour of the getMD() and getMDOrNull() helpers.

llvm-svn: 267268

30ab4b47

BitcodeReader: Use getMD/getMDOrNull helpers consistently, almost NFC · 004509dc

Duncan P. N. Exon Smith authored Apr 23, 2016

The only functionality change was removing an error check from the
BitcodeReader (and an assertion from DILocation::getImpl) that is
already caught by Verifier::visitDILocation.  The Verifier is a better
place for this anyway, and being inconsistent with other subclasses of
MDNode isn't serving anyone.

llvm-svn: 267267

004509dc

[Hexagon] Set ctlz_zero_undef/cttz_zero_undef to Expand so LegalizeDAG will... · 6e6a1f0a

Craig Topper authored Apr 23, 2016

[Hexagon] Set ctlz_zero_undef/cttz_zero_undef to Expand so LegalizeDAG will convert them to ctlz/cttz. Remove the now unnecessary isel patterns. NFC

llvm-svn: 267266

6e6a1f0a

[NVPTX] Set ctlz_zero_undef to Expand so LegalizeDAG will convert it to ctlz.... · 6f8b8e4c

Craig Topper authored Apr 23, 2016

[NVPTX] Set ctlz_zero_undef to Expand so LegalizeDAG will convert it to ctlz. Remove the now unnecessary isel patterns. NFC

llvm-svn: 267265

6f8b8e4c

[WebAssembly] Set ctlz_zero_undef/cttz_zero_undef to Expand so LegalizeDAG... · b297b6b0

Craig Topper authored Apr 23, 2016

[WebAssembly] Set ctlz_zero_undef/cttz_zero_undef to Expand so LegalizeDAG will convert them to ctlz/cttz. Remove the now unnecessary isel patterns. NFC

llvm-svn: 267264

b297b6b0

Style fix in Core.h / Core.cpp. NFC · b130f43b
Amaury Sechet authored Apr 23, 2016
```
llvm-svn: 267257
```
b130f43b

MachO: remove weird ARM/Thumb interface from MachOObjectFile · 9e8eb418

Tim Northover authored Apr 22, 2016

Only one consumer (llvm-objdump) actually cared about the fact that there were
two triples. Others were actively working around the fact that the Triple
returned by getArch might have been invalid. As for llvm-objdump, it needs to
be acutely aware of both Triples anyway, so being generic in the exposed API is
no benefit.

Also rename the version of getArch returning a Triple. Users were having to
pass an unwanted nullptr to disambiguate the two, which was nasty.

The only functional change here is that armv7m and armv7em object files no
longer crash llvm-objdump.

llvm-svn: 267249

9e8eb418

AMDGPU: sext_inreg (srl x, K), vt -> bfe x, K, vt.Size · 7e8de01f
Matt Arsenault authored Apr 22, 2016
```
llvm-svn: 267244
```
7e8de01f
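The identity in the title of 7e8de01f can be checked on plain integers. The sketch below is a bit-level model only, not the AMDGPU combine: `sextInReg` and `bfeSigned` are hypothetical names, 32-bit values are assumed, and any target-specific handling of out-of-range offsets or widths is ignored.

```cpp
#include <cassert>
#include <cstdint>

// sext_inreg: sign-extend the low Width bits of V to 32 bits.
std::uint32_t sextInReg(std::uint32_t V, unsigned Width) {
  assert(Width >= 1 && Width <= 32);
  std::uint32_t Mask = Width == 32 ? ~0u : ((1u << Width) - 1u);
  std::uint32_t Sign = 1u << (Width - 1);
  return ((V & Mask) ^ Sign) - Sign;
}

// Signed bitfield extract, written bit by bit so it does not reuse the
// shift-based form: take Width bits of X starting at Offset, then replicate
// the top extracted bit into the remaining high bits.
std::uint32_t bfeSigned(std::uint32_t X, unsigned Offset, unsigned Width) {
  assert(Width >= 1 && Offset + Width <= 32);
  std::uint32_t R = 0;
  for (unsigned I = 0; I < 32; ++I) {
    unsigned Src = Offset + (I < Width ? I : Width - 1);
    R |= ((X >> Src) & 1u) << I;
  }
  return R;
}
```

With these definitions, `sextInReg(X >> K, W)` and `bfeSigned(X, K, W)` agree whenever `K + W <= 32`.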

llvm-symbolizer: Avoid infinite recursion walking dwos where the dwo contains a dwo_name attribute · e438cff4

David Blaikie authored Apr 22, 2016

The dwo_name was added to dwo files to improve diagnostics in dwp, but
it confuses tools that attempt to load any dwo named by a dwo_name, even
ones inside dwos. Avoid this by keeping track of whether a unit is
already a dwo unit, and if so, not loading further dwos.

llvm-svn: 267241

e438cff4

AMDGPU: Re-visit nodes in performAndCombine · efa3fe14
Matt Arsenault authored Apr 22, 2016
```
This fixes test regressions when i64 loads/stores are set to Promote.

llvm-svn: 267240
```
efa3fe14

Removing unused function. · 79a933a6
Andrew Kaylor authored Apr 22, 2016
```
llvm-svn: 267236
```
79a933a6

Revert r267210, it makes clang assert (PR27490). · 0aa9845d
Nico Weber authored Apr 22, 2016
```
llvm-svn: 267232
```
0aa9845d

Re-commit optimization bisect support (r267022) without new pass manager support. · aa641a51

Andrew Kaylor authored Apr 22, 2016

The original commit was reverted because of a buildbot problem with LazyCallGraph::SCC handling (not related to the OptBisect handling).

Differential Revision: http://reviews.llvm.org/D19172

llvm-svn: 267231

aa641a51

Apr 22, 2016

Differential Revision: http://reviews.llvm.org/D19040 · 3cb77343
Sriraman Tallam authored Apr 22, 2016
```
llvm-svn: 267229
```
3cb77343

llvm-symbolizer: prefer .dwo contents over fission-gmlt-like-data when .dwo file is present · 9a4f3cb2

David Blaikie authored Apr 22, 2016

Rather than relying on the gmlt-like data emitted into the .o/executable
which only contains the simple name of any inlined functions, use the
.dwo file if present.

Test symbolication with/without a .dwo, and the old test that was
testing behavior when no gmlt-like data was present. (I haven't included
a test of non-gmlt-like data + no .dwo (that would be akin to
symbolication with no debug info) but we could add one for completeness)

The test was simplified a bit to be a little clearer (unoptimized, force
inline, using a function call as the inlined entity) and regenerated
with ToT clang. For the no-gmlt-like-data case, I modified Clang back to
its old behavior temporarily & the .dwo file is identical so it is
shared between the two executables.

llvm-svn: 267227

9a4f3cb2

Update discriminator assignment algorithm in clang assembler. · 18ce9d82

Dehao Chen authored Apr 22, 2016

Summary: The clang assembler assumes that the discriminator remains the same when the source line changes. The correct behavior is that when the line changes, the discriminator automatically resets to 0.

Reviewers: dnovillo, davidxl, echristo

Subscribers: echristo, llvm-commits

Differential Revision: http://reviews.llvm.org/D19436

llvm-svn: 267226

18ce9d82

Introduce llvm.load.relative intrinsic. · 7dd8dbf4

Peter Collingbourne authored Apr 22, 2016

This intrinsic takes two arguments, ``%ptr`` and ``%offset``. It loads
a 32-bit value from the address ``%ptr + %offset``, adds ``%ptr`` to that
value and returns it. The constant folder specifically recognizes the form of
this intrinsic and the constant initializers it may load from; if a loaded
constant initializer is known to have the form ``i32 trunc(x - %ptr)``,
the intrinsic call is folded to ``x``.
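The load-and-add behaviour described above can be modelled in a few lines. This is a sketch of the semantics as stated in this message, not the intrinsic's implementation: `loadRelative` is a hypothetical name, and the model assumes both the loaded slot and the resulting address lie within one buffer.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Model of the described semantics: load a 32-bit value from Ptr + Offset,
// add Ptr to that value, and return the resulting address.
const void *loadRelative(const void *Ptr, std::int32_t Offset) {
  assert(Ptr && "base pointer must be non-null");
  const char *Base = static_cast<const char *>(Ptr);
  std::int32_t Rel;
  std::memcpy(&Rel, Base + Offset, sizeof(Rel)); // the 32-bit load
  return Base + Rel;                             // add the base back
}
```

A table built this way stores 32-bit offsets relative to its own base rather than full pointers, which is the shape the constant folder is said to recognize (`i32 trunc(x - %ptr)`).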

LLVM provides that the calculation of such a constant initializer will
not overflow at link time under the medium code model if ``x`` is an
``unnamed_addr`` function. However, it does not provide this guarantee for
a constant initializer folded into a function body. This intrinsic can be
used to avoid the possibility of overflows when loading from such a constant.

Differential Revision: http://reviews.llvm.org/D18367

llvm-svn: 267223

7dd8dbf4

TLI: Only iterate over integer vector types · 940d19a0
Matt Arsenault authored Apr 22, 2016
```
Instead of iterating over all vectors and skipping integers.

llvm-svn: 267220
```
940d19a0

DAGCombiner: Relax alignment restriction when changing store type · 3b748d76
Matt Arsenault authored Apr 22, 2016
```
If the target allows the alignment, this should be OK.

llvm-svn: 267217
```
3b748d76

[PGO] change the interface for createPGOFuncNameMetadata() · f8f051cb

Rong Xu authored Apr 22, 2016

This patch changes the interface for createPGOFuncNameMetadata() where we add
another PGOFuncName argument.

Differential Revision: http://reviews.llvm.org/D19433

llvm-svn: 267216

f8f051cb

[unordered] sink unordered stores at end of blocks · 5f0e3694

Philip Reames authored Apr 22, 2016

The existing code turned out to be completely correct when audited.  Thus, this makes only minor code changes and adds a couple of tests.

llvm-svn: 267215

5f0e3694

Fold compares for distinct allocations · f97229d6

Sanjoy Das authored Apr 22, 2016

Summary:
We can fold compares to false when two distinct allocations within a
function are compared for equality.

Patch by Anna Thomas!

Reviewers: majnemer, reames, sanjoy

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D19390

llvm-svn: 267214

f97229d6

CodeGen: Use PLT relocations for relative references to unnamed_addr functions. · 265ebd7d

Peter Collingbourne authored Apr 22, 2016

The relative vtable ABI (PR26723) needs PLT relocations to refer to virtual
functions defined in other DSOs. The unnamed_addr attribute means that the
function's address is not significant, so we're allowed to substitute it
with the address of a PLT entry.

Also includes a bonus feature: addends for COFF image-relative references.

Differential Revision: http://reviews.llvm.org/D17938

llvm-svn: 267211

265ebd7d

[unordered] Extend load/store type canonicalization to handle unordered operations · eedef73b

Philip Reames authored Apr 22, 2016

Extend the type canonicalization logic to work for unordered atomic loads and stores.  Note that while this change itself is fairly simple and low risk, there's a reasonable chance this will expose problems in the backends by suddenly generating IR they wouldn't have seen before.  Anything of this nature will be an existing bug in the backend (you could write an atomic float load), but this will definitely change the frequency with which such cases are encountered.  If you see problems, feel free to revert this change, but please make sure you collect a test case.  

llvm-svn: 267210

eedef73b

DAGCombiner: Relax alignment restriction when changing load type · 629d12de
Matt Arsenault authored Apr 22, 2016
```
If the target allows the alignment, this should still be OK.

llvm-svn: 267209
```
629d12de

[AArch64] Fix optimizeCondBranch logic. · 10768ab0

Quentin Colombet authored Apr 22, 2016

The opcode for the optimized branch does not depend on the size
of the active bits in the AND masks, but on the AND opcode itself.
Indeed, we need to use a X or W variant based on the AND variant
not based on whether the mask fits into the related variant.
Otherwise, we may end up using the W variant of the optimized branch
for 64-bit register inputs!

This fixes the last make check verifier issues for AArch64: PR27479.

llvm-svn: 267206

10768ab0

PM: Port SinkingPass to the new pass manager · b9394908
Justin Bogner authored Apr 22, 2016
```
llvm-svn: 267199
```
b9394908

PM: Reorder the functions used for SinkingPass. NFC · 82077c4a
Justin Bogner authored Apr 22, 2016
```
This will make the port to the new PM easier to follow.

llvm-svn: 267198
```
82077c4a

[DeadStoreElimination] Shorten beginning of memset overwritten by later stores · d29a24e4

Jun Bum Lim authored Apr 22, 2016

Summary: This change will shorten memset if the beginning of memset is overwritten by later stores.
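The effect of the transformation can be shown at the source level. The two functions below are hypothetical before and after forms, assuming a 16-byte memset whose first 8 bytes are overwritten by a later 8-byte store; the sizes are chosen only for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Before: clear 16 bytes, then overwrite the first 8 with a later store.
void memsetThenStore(unsigned char *P, std::uint64_t V) {
  assert(P && "buffer must be non-null");
  std::memset(P, 0, 16);
  std::memcpy(P, &V, sizeof(V)); // later store covering bytes [0, 8)
}

// After: the memset starts 8 bytes later and is 8 bytes shorter, because
// its beginning would have been overwritten anyway.
void shortenedMemsetThenStore(unsigned char *P, std::uint64_t V) {
  assert(P && "buffer must be non-null");
  std::memset(P + 8, 0, 8);
  std::memcpy(P, &V, sizeof(V));
}
```

Both functions leave the same 16 bytes in memory; the second one simply writes 8 fewer bytes.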

Reviewers: hfinkel, eeckstein, dberlin, mcrosier

Subscribers: mgrang, mcrosier, llvm-commits

Differential Revision: http://reviews.llvm.org/D18906

llvm-svn: 267197

d29a24e4

PM: Port DCE to the new pass manager · 395c2127

Justin Bogner authored Apr 22, 2016

Also add a very basic test, since apparently there aren't any tests
for DCE whatsoever to add the new pass version to.

llvm-svn: 267196

395c2127

MachineScheduler: Move code to initialize a Candidate out of tryCandidate(); NFC · 4f57377c
Matthias Braun authored Apr 22, 2016
```
llvm-svn: 267191
```
4f57377c