  1. Sep 08, 2014
  2. Sep 07, 2014
    • Make use of @llvm.assume for loop guards in ScalarEvolution · cebf0cc2
      Hal Finkel authored
      This adds a basic (but important) use of @llvm.assume calls in ScalarEvolution.
      When SE is attempting to validate a condition guarding a loop (such as whether
      or not the loop count can be zero), this check should also include dominating
      assumptions.
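
      For illustration, a hypothetical source-level example (names assumed, not
      from the commit or its tests) of a guard this lets SE validate:

      void init(float *x, int n) {
        __builtin_assume(n > 0);
        /* The assume dominates the loop, so SE can now conclude that the
           guarding condition holds and the loop's trip count is not zero. */
        for (int i = 0; i < n; ++i)
          x[i] = 0.0f;
      }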
      
      llvm-svn: 217348
    • Check for all known bits on ret in InstCombine · 93873cc1
      Hal Finkel authored
      From a combination of @llvm.assume calls (and perhaps through other means, such
      as range metadata), it is possible that all bits of a return value might be
      known. Previously, InstCombine did not check for this (which is
      understandable, given that constant propagation is normally expected to
      catch such cases), but this meant that we'd miss simple cases where
      assumptions are involved.
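
      A minimal hypothetical illustration (not from the commit's tests): after
      the assumption below, every bit of the return value is known, so the
      return can be folded to a constant:

      int f(int a) {
        __builtin_assume(a == 42);  /* all bits of a are now known */
        return a;                   /* InstCombine can fold this to: return 42 */
      }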
      
      llvm-svn: 217346
    • Make use of @llvm.assume from LazyValueInfo · 7e184494
      Hal Finkel authored
      This change teaches LazyValueInfo to use the @llvm.assume intrinsic. Like with
      the known-bits change (r217342), this requires feeding a "context" instruction
      pointer through many functions. Aside from a little refactoring to reuse the
      logic that turns predicates into constant ranges in LVI, the only new code is
      that which can 'merge' the range from an assumption into that otherwise
      computed. There is also a small addition to JumpThreading so that it can have
      LVI use assumptions in the same block as the comparison feeding a conditional
      branch.
      
      With this patch, we can now simplify this as expected:
      void bar(void);
      int foo(int a) {
        __builtin_assume(a > 5);
        if (a > 3) {
          bar();
          return 1;
        }
        return 0;
      }
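
      Since a > 5 implies a > 3, the branch condition folds to true, and the
      function reduces to the equivalent of the following (a sketch of the
      expected result, not the commit's literal output):

      int foo(int a) {
        bar();
        return 1;
      }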
      
      llvm-svn: 217345
    • Add an AlignmentFromAssumptions Pass · d67e4639
      Hal Finkel authored
      This adds a ScalarEvolution-powered transformation that updates load, store and
      memory intrinsic pointer alignments based on invariant((a+q) & b == 0)
      expressions. Many of the simple cases we can get with ValueTracking, but we
      still need something like this for the more complicated cases (such as those
      with an offset) that require some algebra. Note that the optional third
      argument of gcc's __builtin_assume_aligned provides exactly this kind of
      'misalignment' offset, for which this kind of logic is necessary.
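
      As a hypothetical example of that form (names assumed): here the builtin
      asserts that (q - 4 bytes) is 16-byte aligned, so with a little algebra
      the pass can conclude that the load of p[3] (12 bytes past p) is 16-byte
      aligned:

      float consume(const float *q) {
        /* Asserts ((uintptr_t)q & 15) == 4, i.e. q is misaligned by 4. */
        const float *p = (const float *)__builtin_assume_aligned(q, 16, 4);
        return p[3]; /* &p[3] is 16-byte aligned; the load can be annotated */
      }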
      
      The primary motivation is to fixup alignments for vector loads/stores after
      vectorization (and unrolling). This pass is added to the optimization pipeline
      just after the SLP vectorizer runs (which, admittedly, does not preserve SE,
      although I imagine it could).  Regardless, I actually don't think that the
      preservation matters too much in this case: SE computes lazily, and this pass
      won't issue any SE queries unless there are any assume intrinsics, so there
      should be no real additional cost in the common case (SLP does preserve DT and
      LoopInfo).
      
      llvm-svn: 217344
    • Add additional patterns for @llvm.assume in ValueTracking · 15aeaaf2
      Hal Finkel authored
      This builds on r217342, which added the infrastructure to compute known bits
      using assumptions (@llvm.assume calls). That original commit added only a few
      patterns (to catch common cases related to determining pointer alignment); this
      change adds several other patterns for simple cases.
      
      r217342 established that, for assume(v & b = a), for those bits in the
      mask b that are known to be one, we can propagate known bits from a to v.
      It also added a known-bits transfer for assume(a = b). This patch adds:
      
      assume(~(v & b) = a) : For those bits in the mask that are known to be one,
                             we can propagate inverted known bits from a to v.
      
      assume(v | b = a) :    For those bits in b that are known to be zero, we can
                             propagate known bits from a to v.
      
      assume(~(v | b) = a) : For those bits in b that are known to be zero, we can
                             propagate inverted known bits from a to v.
      
      assume(v ^ b = a) :    For those bits in b that are known to be zero, we can
                             propagate known bits from a to v. For those bits in
                             b that are known to be one, we can propagate inverted
                             known bits from a to v.
      
      assume(~(v ^ b) = a) : For those bits in b that are known to be zero, we can
                             propagate inverted known bits from a to v. For those
                             bits in b that are known to be one, we can propagate
                             known bits from a to v.
      
      assume(v << c = a) :   For those bits in a that are known, we can propagate
                             them to known bits in v shifted to the right by c.
      
      assume(~(v << c) = a) : For those bits in a that are known, we can propagate
                              them inverted to known bits in v shifted to the
                              right by c.
      
      assume(v >> c = a) :   For those bits in a that are known, we can propagate
                             them to known bits in v shifted to the left by c.
      
      assume(~(v >> c) = a) : For those bits in a that are known, we can propagate
                              them inverted to known bits in v shifted to the
                              left by c.
      
      assume(v >=_s c) where c is non-negative: The sign bit of v is zero.
      
      assume(v >_s c) where c is at least -1: The sign bit of v is zero.
      
      assume(v <=_s c) where c is negative: The sign bit of v is one.
      
      assume(v <_s c) where c is non-positive: The sign bit of v is one.
      
      assume(v <=_u c): Transfer the known high zero bits.
      
      assume(v <_u c): Transfer the known high zero bits (if c is known to be a
                       power of 2, transfer one more).
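
      As a concrete sketch of the basic rule recapped above (a standalone C
      model of the bookkeeping, not LLVM's implementation), with masks 'zero'
      and 'one' marking bits known to be 0 or 1:

      #include <stdint.h>
      #include <stdio.h>

      typedef struct { uint32_t zero, one; } KnownBits;

      /* assume((v & b) == a): where b is known one, v's bit equals a's bit. */
      static void propagate_and(KnownBits *v, KnownBits a, KnownBits b) {
        v->zero |= a.zero & b.one;
        v->one  |= a.one  & b.one;
      }

      int main(void) {
        KnownBits v = {0, 0};      /* nothing known about v */
        KnownBits b = {~15u, 15u}; /* the constant mask 15: all bits known */
        KnownBits a = {~0u, 0u};   /* the constant 0: all bits known zero */
        propagate_and(&v, a, b);   /* models assume((v & 15) == 0) */
        printf("0x%x\n", (unsigned)(v.zero & 15u)); /* 0xf: low 4 bits known 0 */
        return 0;
      }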
      
      A small addition to InstCombine was necessary for some of the test cases. The
      problem is that when InstCombine was simplifying and, or, etc., it would
      fail to check the 'do I know all of the bits' condition before checking
      less specific conditions, and so would not fully constant-fold the
      result. I'm not sure how to
      trigger this aside from using assumptions, so I've just included the change
      here.
      
      llvm-svn: 217343
    • Make use of @llvm.assume in ValueTracking (computeKnownBits, etc.) · 60db0589
      Hal Finkel authored
      This change, which allows @llvm.assume to be used from within computeKnownBits
      (and other associated functions in ValueTracking), adds some (optional)
      parameters to computeKnownBits and friends. These functions now (optionally)
      take a "context" instruction pointer, an AssumptionTracker pointer, and also a
      DomTree pointer, and most of the changes are just to pass this new information
      when it is easily available from InstSimplify, InstCombine, etc.
      
      As explained below, the significant conceptual change is that known
      properties of a value might depend on the control-flow location of the
      use (the @llvm.assume must dominate the use, because assumptions have
      control-flow dependencies). This means that, when we ask whether bits of
      a value are known, we might get different answers for different uses.
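
      A hypothetical source-level illustration of this use-sensitivity (names
      assumed):

      void bar(int);
      void demo(int x, int c) {
        bar(x & 7);  /* nothing is known about x at this use */
        if (c) {
          __builtin_assume((x & 7) == 0);
          bar(x & 7);  /* dominated by the assume: the low three bits are
                          known zero, so this can fold to bar(0) */
        }
      }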
      
      The significant changes are all in ValueTracking. Two main changes: First, as
      with the rest of the code, new parameters need to be passed around. To make
      this easier, I grouped them into a structure, and I made internal static
      versions of the relevant functions that take this structure as a parameter. The
      new code does what you might expect: it looks for @llvm.assume calls
      that make use of the value we're trying to learn something about (often
      indirectly), attempts to pattern-match that expression, and uses the
      result if successful.
      By making use of the AssumptionTracker, the process of finding @llvm.assume
      calls is not expensive.
      
      Part of the structure being passed around inside ValueTracking is a set of
      already-considered @llvm.assume calls. This prevents a query using, for
      example, assume(a == b) from recursing on itself. The context and DT
      parameters are used to find applicable assumptions. An assumption needs
      to dominate the context instruction, or come after it deterministically.
      In the latter case we only handle the specific situation where both the
      assumption and the context instruction are in the same block, and we need
      to exclude assumptions from being used to simplify their own ephemeral
      values (those which contribute only to the assumption), because otherwise
      the assumption would prove its feeding comparison trivial and would be
      removed.
      
      This commit adds the plumbing and the logic for a simple masked-bit propagation
      (just enough to write a regression test). Future commits add more patterns
      (and, correspondingly, more regression tests).
      
      llvm-svn: 217342
    • DebugInfo: Do not use DW_FORM_GNU_addr_index in skeleton CUs, GDB 7.8 errors on this. · c42f9ac0
      David Blaikie authored
      It's probably not a huge deal to not do this - if we could, maybe the
      address could be reused by a subprogram low_pc and avoid an extra
      relocation, but it's just one per CU at best.
      
      llvm-svn: 217338
    • Add functions for finding ephemeral values · 57f03dda
      Hal Finkel authored
      This adds a set of utility functions for collecting 'ephemeral' values. These
      are LLVM IR values that are used only by @llvm.assume intrinsics (directly or
      indirectly), and thus will be removed prior to code generation, implying that
      they should be considered free for certain purposes (like inlining). The
      inliner's cost analysis, and a few other passes, have been updated to account
      for ephemeral values using the provided functionality.
      
      This functionality is important for the usability of @llvm.assume, because it
      limits the "non-local" side-effects of adding llvm.assume on inlining, loop
      unrolling, etc. (these are hints, and do not generate code, so they should not
      directly contribute to estimates of execution cost).
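
      A hypothetical example: in the function below, the comparisons exist only
      to feed the assume. They are ephemeral, will be removed before code
      generation, and so should not count against the function's inline cost.

      static inline int at(const int *a, int i, int n) {
        __builtin_assume(i >= 0 && i < n); /* the comparisons are ephemeral */
        return a[i];
      }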
      
      llvm-svn: 217335
    • Add an Assumption-Tracking Pass · 74c2f355
      Hal Finkel authored
      This adds an immutable pass, AssumptionTracker, which keeps a cache of
      @llvm.assume call instructions within a module. It uses callback value handles
      to keep stale functions and intrinsics out of the map, and it relies on any
      code that creates new @llvm.assume calls to notify it of the new instructions.
      The benefit is that code needing to find @llvm.assume intrinsics can do so
      directly, without scanning the function, thus allowing the cost of @llvm.assume
      handling to be negligible when none are present.
      
      The current design is intended to be lightweight. We don't keep track of
      anything until we need a list of assumptions in some function. The first time
      this happens, we scan the function. After that, we add/remove @llvm.assume
      calls from the cache in response to registration calls and ValueHandle
      callbacks.
      
      There are no new direct test cases for this pass, but because it calls
      its validation function upon module finalization, we'll pick up detectable
      inconsistencies from the other tests that touch @llvm.assume calls.
      
      This pass will be used by follow-up commits that make use of @llvm.assume.
      
      llvm-svn: 217334
    • [x86] Revert my over-eager commit in r217332. · 0a8151e6
      Chandler Carruth authored
      I hadn't actually run all the tests yet and these combines have somewhat
      surprisingly far reaching effects.
      
      llvm-svn: 217333
    • [x86] Tweak the rules surrounding 0,0 and 1,1 v2f64 shuffles and add · 8405e8ff
      Chandler Carruth authored
      support for MOVDDUP, which is really important for matrix-multiply-style
      operations that do lots of non-vector-aligned loads and splats.
      
      The original motivation was to add support for MOVDDUP as the lack of it
      regresses matmul_f64_4x4 by 5% or so. However, all of the rules here
      were somewhat suspicious.
      
      First, we should always be using the floating-point domain shuffles,
      regardless of how many copies we have to make, as a movapd is *crazy*
      faster than the domain-switching cost on some chips. (Mostly because
      movapd is crazy cheap.) Because SHUFPD can't do the copy-for-free trick
      of the PSHUF instructions, there is no need to avoid canonicalizing on
      UNPCK variants, so do that canonicalizing. This also ensures we have the
      chance to form MOVDDUP. =]
      
      Second, we assume SSE2 support when doing any vector lowering, and given
      that, we should just use UNPCKLPD and UNPCKHPD, as they can operate on
      registers or memory. If vectors get spilled or come from memory at all
      this is going to allow the load to be folded into the operation. If we
      want to optimize for encoding size (the only difference, and only
      a 2 byte difference) it should be done *much* later, likely after RA.
      
      llvm-svn: 217332
    • Try to unflake AllocatorTest.TestAlignmentPastSlab · e5a96a5c
      Hans Wennborg authored
      llvm-svn: 217331
    • BumpPtrAllocator: do the size check without moving any pointers · 44e27464
      Hans Wennborg authored
      Instead of aligning and moving the CurPtr forward, and then comparing
      with End, simply calculate how much space is needed, and compare that
      to how much is available.
      
      Hopefully this avoids any doubts about comparing addresses possibly
      derived from past the end of the slab array, overflowing, etc.
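
      A minimal sketch of that approach (assumed names, not LLVM's actual
      BumpPtrAllocator code); 'align' must be a power of two:

      #include <stddef.h>
      #include <stdint.h>

      static void *bump_alloc(char **cur, char *end, size_t size, size_t align) {
        size_t avail  = (size_t)(end - *cur);
        /* Padding needed to bring *cur up to the requested alignment. */
        size_t adjust = (size_t)(-(uintptr_t)*cur) & (align - 1);
        if (adjust > avail || size > avail - adjust)
          return NULL; /* insufficient space; no past-the-end pointer formed */
        char *result = *cur + adjust;
        *cur = result + size;
        return result;
      }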
      
      Also add a test where aligning CurPtr would move it past End.
      
      llvm-svn: 217330
    • [MCJIT] Revert partial RuntimeDyldELF cleanup that was prematurely committed in · 9a891052
      Lang Hames authored
      r217328.
      
      llvm-svn: 217329
    • [MCJIT] Rewrite RuntimeDyldMachO and its derived classes to use the 'Offset' · ca279c22
      Lang Hames authored
      field of RelocationValueRef, rather than the 'Addend' field.
      
      This is consistent with RuntimeDyldELF's use of RelocationValueRef, and more
      consistent with the semantics of the data being stored (the offset from the
      start of a section or symbol).
      
      llvm-svn: 217328
    • [MCJIT] Fix a bug in RuntimeDyldImpl's read/writeBytesUnaligned methods. · 69abd72e
      Lang Hames authored
      The previous implementation was writing to the high bytes of integers on
      BE targets (when run on LE hosts).
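
      A sketch of the endianness-correct behavior (assumed shape, not the
      actual RuntimeDyldImpl code): each byte is selected by a shift computed
      from the target's byte order, independent of the host's.

      #include <stdint.h>

      static void write_bytes_unaligned(uint64_t value, uint8_t *dst,
                                        unsigned size, int target_is_le) {
        for (unsigned i = 0; i != size; ++i) {
          unsigned shift = target_is_le ? 8 * i : 8 * (size - i - 1);
          dst[i] = (uint8_t)(value >> shift); /* low-order byte first iff LE */
        }
      }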
      
      http://llvm.org/PR20640
      
      llvm-svn: 217325
    • R600/SI: Fix register class for some 64-bit atomics · 76803bd3
      Matt Arsenault authored
      llvm-svn: 217323
  3. Sep 06, 2014
    • R600/SI: Relax a few tests to help enable scheduler · 7b46a59b
      Matt Arsenault authored
      llvm-svn: 217320
    • R600/SI: Fix broken check lines. · a9fcf62a
      Matt Arsenault authored
      Fix a missing check and hardcoded register numbers.
      
      llvm-svn: 217318
    • MC: correct DWARF line info for PE/COFF · fcefa21b
      Saleem Abdulrasool authored
      DWARF address ranges contain a reference to the debug_info section.  This offset
      is an absolute relocation except on non-PE/COFF targets where it is section
      relative.  We would emit this incorrectly, and trying to map the debug info from
      the address would fail.
      
      llvm-svn: 217317
    • [x86] Fix a pretty horrible bug and inconsistency in the x86 asm · 373b2b17
      Chandler Carruth authored
      parsing (and a latent bug in the instruction definitions).
      
      This is effectively a revert of r136287 which tried to address
      a specific and narrow case of immediate operands failing to be accepted
      by x86 instructions with a pretty heavy hammer: it introduced a new kind
      of operand that behaved differently. All of that is removed with this
      commit, but the test cases are both preserved and enhanced.
      
      The core problem that r136287 and this commit are trying to handle is
      that gas accepts both of the following instructions:
      
        insertps $192, %xmm0, %xmm1
        insertps $-64, %xmm0, %xmm1
      
      These will encode to the same byte sequence, with the immediate
      occupying an 8-bit entry. The first form was fixed by r136287 but that
      broke the prior handling of the second form! =[ Ironically, we would
      still emit the second form in some cases and then be unable to
      re-assemble the output.
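
      The equivalence is just 8-bit wrap-around; a trivial illustration:

      #include <stdio.h>

      int main(void) {
        /* Both spellings denote the same 8-bit immediate byte. */
        printf("%u %u\n", (unsigned)(unsigned char)192,
               (unsigned)(unsigned char)-64); /* prints: 192 192 */
        return 0;
      }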
      
      The reason why the first instruction failed to be handled is that prior
      to r136287 the operands were marked 'i32i8imm', which forces them to be
      sign-extendable. Clearly, that won't work for 192 in a single byte.
      However, making them zero-extended or "unsigned" doesn't really address
      the core issue either, because it breaks negative immediates. The correct
      fix is to make these operands 'i8imm', reflecting that they can be either
      signed or unsigned but must be 8-bit immediates. This patch backs out
      r136287 and then changes those places as well as some others to use
      'i8imm' rather than one of the extended variants.
      
      Naturally, this broke something else. The custom DAG nodes had to be
      updated to have a much more accurate type constraint of an i8 node, and
      a bunch of Pat immediates needed to be specified as i8 values.
      
      The fallout didn't end there though. We also then ceased to be able to
      match the instruction-specific intrinsics to the instructions so
      modified. Digging in, this is because they too used i32 rather than i8 in
      their signature. So I've also switched those intrinsics to i8 arguments
      in line with the instructions.
      
      In order to make the intrinsic adjustments of course, I also had to add
      auto upgrading for the intrinsics.
      
      I suspect that the intrinsic argument types may have led everything down
      this rabbit hole. Pretty happy with the result.
      
      llvm-svn: 217310
    • Check whether the iterator p == the end iterator before trying to dereference... · 095b92e5
      Nick Lewycky authored
      Check whether the iterator p == the end iterator before trying to
      dereference it. This is a speculative fix for a failure found on the
      valgrind buildbot triggered by a clang test.
      
      llvm-svn: 217295
    • Fix right shift by 64 bits detected on CXX/lex/lex.literal/lex.ext/p4.cpp · ba1ecbc7
      Alexey Samsonov authored
      test case on UBSan bootstrap bot.
      
      This fixes the last failure of "check-clang" in UBSan bootstrap bot.
      
      llvm-svn: 217294
    • [docs] Document what "NFC" means in a commit message. · 5e44ffdb
      Sean Silva authored
      llvm-svn: 217292
    • [MCJIT] Fix an iterator invalidation bug in MCJIT::finalizeObject. · 018452e6
      Lang Hames authored
      The finalizeObject method calls generateCodeForModule on each of the currently
      'added' objects, but generateCodeForModule moves objects out of the 'added'
      set as it's called. To avoid iterator invalidation issues, the added set is
      copied out before any calls to generateCodeForModule.
      
      This should fix http://llvm.org/PR20851.
      
      llvm-svn: 217291
    • [x86] Fix an embarrassing bug in the INSERTPS formation code. The mask · 21d27ee9
      Chandler Carruth authored
      computation was totally wrong, but somehow it didn't really show up with
      llc.
      
      I've added an assert that triggers on multiple existing test cases and
      updated one of them to show the correct value.
      
      There appear to still be more bugs lurking around insertps's mask. =/
      However, note that this only really impacts the new vector shuffle
      lowering.
      
      llvm-svn: 217289
    • [inline asm] Add a check in InlineAsm::ConstraintInfo::Parse to make sure '{' · 489decec
      Akira Hatanaka authored
      follows '~' in a clobber constraint string.
      
      Previously, llc would hit an llvm_unreachable when compiling an inline-asm
      instruction with the malformed constraint string "~x{21}". This commit
      enables LLParser to catch the error earlier and print a more helpful
      diagnostic.
      
      rdar://problem/14206559
      
      llvm-svn: 217288
    • Allow vector fsub ops with constants to get the same optimizations as scalars. · 75cc90ed
      Sanjay Patel authored
      This problem is bigger than just fsub, but this is the minimum fix to solve
      fneg for PR20556 ( http://llvm.org/bugs/show_bug.cgi?id=20556 ), and we solve
      zero subtraction with the same change.
      
      llvm-svn: 217286