  1. Sep 08, 2014
  2. Sep 07, 2014
    • Make use of @llvm.assume for loop guards in ScalarEvolution · cebf0cc2
      Hal Finkel authored
      This adds a basic (but important) use of @llvm.assume calls in ScalarEvolution.
      When SE is attempting to validate a condition guarding a loop (such as whether
      or not the loop count can be zero), this check should also include dominating
      assumptions.
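
      For illustration, a hypothetical source-level example (names assumed, not
      from the commit or its tests) of a guard this lets SE validate:

      void init(float *x, int n) {
        __builtin_assume(n > 0);
        /* The assume dominates the loop, so SE can now conclude that the
           guarding condition holds and the loop's trip count is not zero. */
        for (int i = 0; i < n; ++i)
          x[i] = 0.0f;
      }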
      
      llvm-svn: 217348
    • Check for all known bits on ret in InstCombine · 93873cc1
      Hal Finkel authored
      From a combination of @llvm.assume calls (and perhaps through other means, such
      as range metadata), it is possible that all bits of a return value might be
      known. Previously, InstCombine did not check for this (which is
      understandable, given that constant propagation is normally expected to
      catch such cases), but this meant that we'd miss simple cases where
      assumptions are involved.
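
      A minimal hypothetical illustration (not from the commit's tests): after
      the assumption below, every bit of the return value is known, so the
      return can be folded to a constant:

      int f(int a) {
        __builtin_assume(a == 42);  /* all bits of a are now known */
        return a;                   /* InstCombine can fold this to: return 42 */
      }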
      
      llvm-svn: 217346
    • Make use of @llvm.assume from LazyValueInfo · 7e184494
      Hal Finkel authored
      This change teaches LazyValueInfo to use the @llvm.assume intrinsic. Like with
      the known-bits change (r217342), this requires feeding a "context" instruction
      pointer through many functions. Aside from a little refactoring to reuse the
      logic that turns predicates into constant ranges in LVI, the only new code is
      that which can 'merge' the range from an assumption into that otherwise
      computed. There is also a small addition to JumpThreading so that it can have
      LVI use assumptions in the same block as the comparison feeding a conditional
      branch.
      
      With this patch, we can now simplify this as expected:
      void bar(void);
      int foo(int a) {
        __builtin_assume(a > 5);
        if (a > 3) {
          bar();
          return 1;
        }
        return 0;
      }
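
      Since a > 5 implies a > 3, the branch condition folds to true, and the
      function reduces to the equivalent of the following (a sketch of the
      expected result, not the commit's literal output):

      int foo(int a) {
        bar();
        return 1;
      }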
      
      llvm-svn: 217345
    • Add an AlignmentFromAssumptions Pass · d67e4639
      Hal Finkel authored
      This adds a ScalarEvolution-powered transformation that updates load, store and
      memory intrinsic pointer alignments based on invariant((a+q) & b == 0)
      expressions. Many of the simple cases we can get with ValueTracking, but we
      still need something like this for the more complicated cases (such as those
      with an offset) that require some algebra. Note that the optional third
      argument of gcc's __builtin_assume_aligned provides exactly this kind of
      'misalignment' offset, for which this kind of logic is necessary.
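
      As a hypothetical example of that form (names assumed): here the builtin
      asserts that (q - 4 bytes) is 16-byte aligned, so with a little algebra
      the pass can conclude that the load of p[3] (12 bytes past p) is 16-byte
      aligned:

      float consume(const float *q) {
        /* Asserts ((uintptr_t)q & 15) == 4, i.e. q is misaligned by 4. */
        const float *p = (const float *)__builtin_assume_aligned(q, 16, 4);
        return p[3]; /* &p[3] is 16-byte aligned; the load can be annotated */
      }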
      
      The primary motivation is to fixup alignments for vector loads/stores after
      vectorization (and unrolling). This pass is added to the optimization pipeline
      just after the SLP vectorizer runs (which, admittedly, does not preserve SE,
      although I imagine it could).  Regardless, I actually don't think that the
      preservation matters too much in this case: SE computes lazily, and this pass
      won't issue any SE queries unless there are any assume intrinsics, so there
      should be no real additional cost in the common case (SLP does preserve DT and
      LoopInfo).
      
      llvm-svn: 217344
    • Add additional patterns for @llvm.assume in ValueTracking · 15aeaaf2
      Hal Finkel authored
      This builds on r217342, which added the infrastructure to compute known bits
      using assumptions (@llvm.assume calls). That original commit added only a few
      patterns (to catch common cases related to determining pointer alignment); this
      change adds several other patterns for simple cases.
      
      r217342 established that, for assume(v & b = a), for those bits in the
      mask b that are known to be one, we can propagate known bits from a to v.
      It also added a known-bits transfer for assume(a = b). This patch adds:
      
      assume(~(v & b) = a) : For those bits in the mask that are known to be one,
                             we can propagate inverted known bits from a to v.
      
      assume(v | b = a) :    For those bits in b that are known to be zero, we can
                             propagate known bits from a to v.
      
      assume(~(v | b) = a) : For those bits in b that are known to be zero, we can
                             propagate inverted known bits from a to v.
      
      assume(v ^ b = a) :    For those bits in b that are known to be zero, we can
                             propagate known bits from a to v. For those bits in
                             b that are known to be one, we can propagate inverted
                             known bits from a to v.
      
      assume(~(v ^ b) = a) : For those bits in b that are known to be zero, we can
                             propagate inverted known bits from a to v. For those
                             bits in b that are known to be one, we can propagate
                             known bits from a to v.
      
      assume(v << c = a) :   For those bits in a that are known, we can propagate
                             them to known bits in v shifted to the right by c.
      
      assume(~(v << c) = a) : For those bits in a that are known, we can propagate
                              them inverted to known bits in v shifted to the
                              right by c.
      
      assume(v >> c = a) :   For those bits in a that are known, we can propagate
                             them to known bits in v shifted to the left by c.
      
      assume(~(v >> c) = a) : For those bits in a that are known, we can propagate
                              them inverted to known bits in v shifted to the
                              left by c.
      
      assume(v >=_s c) where c is non-negative: The sign bit of v is zero.
      
      assume(v >_s c) where c is at least -1: The sign bit of v is zero.
      
      assume(v <=_s c) where c is negative: The sign bit of v is one.
      
      assume(v <_s c) where c is non-positive: The sign bit of v is one.
      
      assume(v <=_u c): Transfer the known high zero bits.
      
      assume(v <_u c): Transfer the known high zero bits (if c is known to be a
                       power of 2, transfer one more).
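
      As a concrete sketch of the basic rule recapped above (a standalone C
      model of the bookkeeping, not LLVM's implementation), with masks 'zero'
      and 'one' marking bits known to be 0 or 1:

      #include <stdint.h>
      #include <stdio.h>

      typedef struct { uint32_t zero, one; } KnownBits;

      /* assume((v & b) == a): where b is known one, v's bit equals a's bit. */
      static void propagate_and(KnownBits *v, KnownBits a, KnownBits b) {
        v->zero |= a.zero & b.one;
        v->one  |= a.one  & b.one;
      }

      int main(void) {
        KnownBits v = {0, 0};      /* nothing known about v */
        KnownBits b = {~15u, 15u}; /* the constant mask 15: all bits known */
        KnownBits a = {~0u, 0u};   /* the constant 0: all bits known zero */
        propagate_and(&v, a, b);   /* models assume((v & 15) == 0) */
        printf("0x%x\n", (unsigned)(v.zero & 15u)); /* 0xf: low 4 bits known 0 */
        return 0;
      }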
      
      A small addition to InstCombine was necessary for some of the test cases. The
      problem is that when InstCombine was simplifying and, or, etc., it would
      fail to check the 'do I know all of the bits' condition before checking
      less specific conditions, and so would not fully constant-fold the
      result. I'm not sure how to
      trigger this aside from using assumptions, so I've just included the change
      here.
      
      llvm-svn: 217343
    • Make use of @llvm.assume in ValueTracking (computeKnownBits, etc.) · 60db0589
      Hal Finkel authored
      This change, which allows @llvm.assume to be used from within computeKnownBits
      (and other associated functions in ValueTracking), adds some (optional)
      parameters to computeKnownBits and friends. These functions now (optionally)
      take a "context" instruction pointer, an AssumptionTracker pointer, and also a
      DomTree pointer, and most of the changes are just to pass this new information
      when it is easily available from InstSimplify, InstCombine, etc.
      
      As explained below, the significant conceptual change is that known
      properties of a value might depend on the control-flow location of the
      use (the @llvm.assume must dominate the use, because assumptions have
      control-flow dependencies). This means that, when we ask whether bits of
      a value are known, we might get different answers for different uses.
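
      A hypothetical source-level illustration of this use-sensitivity (names
      assumed):

      void bar(int);
      void demo(int x, int c) {
        bar(x & 7);  /* nothing is known about x at this use */
        if (c) {
          __builtin_assume((x & 7) == 0);
          bar(x & 7);  /* dominated by the assume: the low three bits are
                          known zero, so this can fold to bar(0) */
        }
      }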
      
      The significant changes are all in ValueTracking. Two main changes: First, as
      with the rest of the code, new parameters need to be passed around. To make
      this easier, I grouped them into a structure, and I made internal static
      versions of the relevant functions that take this structure as a parameter. The
      new code does what you might expect: it looks for @llvm.assume calls
      that make use of the value we're trying to learn something about (often
      indirectly), attempts to pattern-match that expression, and uses the
      result if successful.
      By making use of the AssumptionTracker, the process of finding @llvm.assume
      calls is not expensive.
      
      Part of the structure being passed around inside ValueTracking is a set of
      already-considered @llvm.assume calls. This prevents a query using, for
      example, assume(a == b) from recursing on itself. The context and DT
      parameters are used to find applicable assumptions. An assumption needs
      to dominate the context instruction, or come after it deterministically.
      In the latter case we only handle the specific situation where both the
      assumption and the context instruction are in the same block, and we need
      to exclude assumptions from being used to simplify their own ephemeral
      values (those which contribute only to the assumption), because otherwise
      the assumption would prove its feeding comparison trivial and would be
      removed.
      
      This commit adds the plumbing and the logic for a simple masked-bit propagation
      (just enough to write a regression test). Future commits add more patterns
      (and, correspondingly, more regression tests).
      
      llvm-svn: 217342
    • DebugInfo: Do not use DW_FORM_GNU_addr_index in skeleton CUs, GDB 7.8 errors on this. · c42f9ac0
      David Blaikie authored
      It's probably not a huge deal to not do this - if we could, maybe the
      address could be reused by a subprogram low_pc and avoid an extra
      relocation, but it's just one per CU at best.
      
      llvm-svn: 217338
    • Add functions for finding ephemeral values · 57f03dda
      Hal Finkel authored
      This adds a set of utility functions for collecting 'ephemeral' values. These
      are LLVM IR values that are used only by @llvm.assume intrinsics (directly or
      indirectly), and thus will be removed prior to code generation, implying that
      they should be considered free for certain purposes (like inlining). The
      inliner's cost analysis, and a few other passes, have been updated to account
      for ephemeral values using the provided functionality.
      
      This functionality is important for the usability of @llvm.assume, because it
      limits the "non-local" side-effects of adding llvm.assume on inlining, loop
      unrolling, etc. (these are hints, and do not generate code, so they should not
      directly contribute to estimates of execution cost).
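
      A hypothetical example: in the function below, the comparisons exist only
      to feed the assume. They are ephemeral, will be removed before code
      generation, and so should not count against the function's inline cost.

      static inline int at(const int *a, int i, int n) {
        __builtin_assume(i >= 0 && i < n); /* the comparisons are ephemeral */
        return a[i];
      }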
      
      llvm-svn: 217335
    • Add an Assumption-Tracking Pass · 74c2f355
      Hal Finkel authored
      This adds an immutable pass, AssumptionTracker, which keeps a cache of
      @llvm.assume call instructions within a module. It uses callback value handles
      to keep stale functions and intrinsics out of the map, and it relies on any
      code that creates new @llvm.assume calls to notify it of the new instructions.
      The benefit is that code needing to find @llvm.assume intrinsics can do so
      directly, without scanning the function, thus allowing the cost of @llvm.assume
      handling to be negligible when none are present.
      
      The current design is intended to be lightweight. We don't keep track of
      anything until we need a list of assumptions in some function. The first time
      this happens, we scan the function. After that, we add/remove @llvm.assume
      calls from the cache in response to registration calls and ValueHandle
      callbacks.
      
      There are no new direct test cases for this pass, but because it calls
      its validation function upon module finalization, we'll pick up detectable
      inconsistencies from the other tests that touch @llvm.assume calls.
      
      This pass will be used by follow-up commits that make use of @llvm.assume.
      
      llvm-svn: 217334
    • [x86] Revert my over-eager commit in r217332. · 0a8151e6
      Chandler Carruth authored
      I hadn't actually run all the tests yet and these combines have somewhat
      surprisingly far reaching effects.
      
      llvm-svn: 217333
    • [x86] Tweak the rules surrounding 0,0 and 1,1 v2f64 shuffles and add · 8405e8ff
      Chandler Carruth authored
      support for MOVDDUP, which is really important for matrix-multiply-style
      operations that do lots of non-vector-aligned loads and splats.
      
      The original motivation was to add support for MOVDDUP as the lack of it
      regresses matmul_f64_4x4 by 5% or so. However, all of the rules here
      were somewhat suspicious.
      
      First, we should always be using the floating-point domain shuffles,
      regardless of how many copies we have to make, as a movapd is *crazy*
      faster than the domain-switching cost on some chips. (Mostly because
      movapd is crazy cheap.) Because SHUFPD can't do the copy-for-free trick
      of the PSHUF instructions, there is no need to avoid canonicalizing on
      UNPCK variants, so do that canonicalizing. This also ensures we have the
      chance to form MOVDDUP. =]
      
      Second, we assume SSE2 support when doing any vector lowering, and given
      that, we should just use UNPCKLPD and UNPCKHPD, as they can operate on
      registers or memory. If vectors get spilled or come from memory at all
      this is going to allow the load to be folded into the operation. If we
      want to optimize for encoding size (the only difference, and only
      a 2 byte difference) it should be done *much* later, likely after RA.
      
      llvm-svn: 217332
    • Try to unflake AllocatorTest.TestAlignmentPastSlab · e5a96a5c
      Hans Wennborg authored
      llvm-svn: 217331
    • BumpPtrAllocator: do the size check without moving any pointers · 44e27464
      Hans Wennborg authored
      Instead of aligning and moving the CurPtr forward, and then comparing
      with End, simply calculate how much space is needed, and compare that
      to how much is available.
      
      Hopefully this avoids any doubts about comparing addresses possibly
      derived from past the end of the slab array, overflowing, etc.
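
      A minimal sketch of that approach (assumed names, not LLVM's actual
      BumpPtrAllocator code); 'align' must be a power of two:

      #include <stddef.h>
      #include <stdint.h>

      static void *bump_alloc(char **cur, char *end, size_t size, size_t align) {
        size_t avail  = (size_t)(end - *cur);
        /* Padding needed to bring *cur up to the requested alignment. */
        size_t adjust = (size_t)(-(uintptr_t)*cur) & (align - 1);
        if (adjust > avail || size > avail - adjust)
          return NULL; /* insufficient space; no past-the-end pointer formed */
        char *result = *cur + adjust;
        *cur = result + size;
        return result;
      }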
      
      Also add a test where aligning CurPtr would move it past End.
      
      llvm-svn: 217330
    • [MCJIT] Revert partial RuntimeDyldELF cleanup that was prematurely committed in · 9a891052
      Lang Hames authored
      r217328.
      
      llvm-svn: 217329
    • [MCJIT] Rewrite RuntimeDyldMachO and its derived classes to use the 'Offset' · ca279c22
      Lang Hames authored
      field of RelocationValueRef, rather than the 'Addend' field.
      
      This is consistent with RuntimeDyldELF's use of RelocationValueRef, and more
      consistent with the semantics of the data being stored (the offset from the
      start of a section or symbol).
      
      llvm-svn: 217328
    • [MCJIT] Fix a bug in RuntimeDyldImpl's read/writeBytesUnaligned methods. · 69abd72e
      Lang Hames authored
      The previous implementation was writing to the high bytes of integers on
      BE targets (when run on LE hosts).
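
      A sketch of the endianness-correct behavior (assumed shape, not the
      actual RuntimeDyldImpl code): each byte is selected by a shift computed
      from the target's byte order, independent of the host's.

      #include <stdint.h>

      static void write_bytes_unaligned(uint64_t value, uint8_t *dst,
                                        unsigned size, int target_is_le) {
        for (unsigned i = 0; i != size; ++i) {
          unsigned shift = target_is_le ? 8 * i : 8 * (size - i - 1);
          dst[i] = (uint8_t)(value >> shift); /* low-order byte first iff LE */
        }
      }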
      
      http://llvm.org/PR20640
      
      llvm-svn: 217325
    • R600/SI: Fix register class for some 64-bit atomics · 76803bd3
      Matt Arsenault authored
      llvm-svn: 217323
  3. Sep 06, 2014
    • R600/SI: Relax a few tests to help enable scheduler · 7b46a59b
      Matt Arsenault authored
      llvm-svn: 217320
    • R600/SI: Fix broken check lines. · a9fcf62a
      Matt Arsenault authored
      Fix a missing check and hardcoded register numbers.
      
      llvm-svn: 217318
    • MC: correct DWARF line info for PE/COFF · fcefa21b
      Saleem Abdulrasool authored
      DWARF address ranges contain a reference to the debug_info section.  This offset
      is an absolute relocation except on non-PE/COFF targets where it is section
      relative.  We would emit this incorrectly, and trying to map the debug info from
      the address would fail.
      
      llvm-svn: 217317
    • [x86] Fix a pretty horrible bug and inconsistency in the x86 asm · 373b2b17
      Chandler Carruth authored
      parsing (and a latent bug in the instruction definitions).
      
      This is effectively a revert of r136287 which tried to address
      a specific and narrow case of immediate operands failing to be accepted
      by x86 instructions with a pretty heavy hammer: it introduced a new kind
      of operand that behaved differently. All of that is removed with this
      commit, but the test cases are both preserved and enhanced.
      
      The core problem that r136287 and this commit are trying to handle is
      that gas accepts both of the following instructions:
      
        insertps $192, %xmm0, %xmm1
        insertps $-64, %xmm0, %xmm1
      
      These will encode to the same byte sequence, with the immediate
      occupying an 8-bit entry. The first form was fixed by r136287 but that
      broke the prior handling of the second form! =[ Ironically, we would
      still emit the second form in some cases and then be unable to
      re-assemble the output.
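
      The equivalence is just 8-bit wrap-around; a trivial illustration:

      #include <stdio.h>

      int main(void) {
        /* Both spellings denote the same 8-bit immediate byte. */
        printf("%u %u\n", (unsigned)(unsigned char)192,
               (unsigned)(unsigned char)-64); /* prints: 192 192 */
        return 0;
      }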
      
      The reason why the first instruction failed to be handled is that prior
      to r136287 the operands were marked 'i32i8imm', which forces them to be
      sign-extendable. Clearly, that won't work for 192 in a single byte.
      However, making them zero-extended or "unsigned" doesn't really address
      the core issue either, because it breaks negative immediates. The correct
      fix is to make these operands 'i8imm', reflecting that they can be either
      signed or unsigned but must be 8-bit immediates. This patch backs out
      r136287 and then changes those places as well as some others to use
      'i8imm' rather than one of the extended variants.
      
      Naturally, this broke something else. The custom DAG nodes had to be
      updated to have a much more accurate type constraint of an i8 node, and
      a bunch of Pat immediates needed to be specified as i8 values.
      
      The fallout didn't end there though. We also then ceased to be able to
      match the instruction-specific intrinsics to the instructions so
      modified. Digging in, this is because they too used i32 rather than i8 in
      their signature. So I've also switched those intrinsics to i8 arguments
      in line with the instructions.
      
      In order to make the intrinsic adjustments of course, I also had to add
      auto upgrading for the intrinsics.
      
      I suspect that the intrinsic argument types may have led everything down
      this rabbit hole. Pretty happy with the result.
      
      llvm-svn: 217310
    • Check whether the iterator p == the end iterator before trying to dereference... · 095b92e5
      Nick Lewycky authored
      Check whether the iterator p == the end iterator before trying to
      dereference it. This is a speculative fix for a failure found on the
      valgrind buildbot triggered by a clang test.
      
      llvm-svn: 217295
    • Fix right shift by 64 bits detected on CXX/lex/lex.literal/lex.ext/p4.cpp · ba1ecbc7
      Alexey Samsonov authored
      test case on UBSan bootstrap bot.
      
      This fixes the last failure of "check-clang" in UBSan bootstrap bot.
      
      llvm-svn: 217294
    • [docs] Document what "NFC" means in a commit message. · 5e44ffdb
      Sean Silva authored
      llvm-svn: 217292
    • [MCJIT] Fix an iterator invalidation bug in MCJIT::finalizeObject. · 018452e6
      Lang Hames authored
      The finalizeObject method calls generateCodeForModule on each of the currently
      'added' objects, but generateCodeForModule moves objects out of the 'added'
      set as it's called. To avoid iterator invalidation issues, the added set is
      copied out before any calls to generateCodeForModule.
      
      This should fix http://llvm.org/PR20851.
      
      llvm-svn: 217291
    • [x86] Fix an embarrassing bug in the INSERTPS formation code. The mask · 21d27ee9
      Chandler Carruth authored
      computation was totally wrong, but somehow it didn't really show up with
      llc.
      
      I've added an assert that triggers on multiple existing test cases and
      updated one of them to show the correct value.
      
      There appear to still be more bugs lurking around insertps's mask. =/
      However, note that this only really impacts the new vector shuffle
      lowering.
      
      llvm-svn: 217289
    • [inline asm] Add a check in InlineAsm::ConstraintInfo::Parse to make sure '{' · 489decec
      Akira Hatanaka authored
      follows '~' in a clobber constraint string.
      
      Previously, llc would hit an llvm_unreachable when compiling an inline-asm
      instruction with the malformed constraint string "~x{21}". This commit
      enables LLParser to catch the error earlier and print a more helpful
      diagnostic.
      
      rdar://problem/14206559
      
      llvm-svn: 217288
    • Allow vector fsub ops with constants to get the same optimizations as scalars. · 75cc90ed
      Sanjay Patel authored
      This problem is bigger than just fsub, but this is the minimum fix to solve
      fneg for PR20556 ( http://llvm.org/bugs/show_bug.cgi?id=20556 ), and we solve
      zero subtraction with the same change.
      
      llvm-svn: 217286