  1. Apr 15, 2014
  2. Apr 14, 2014
  3. Apr 13, 2014
    • Recognize test for overflow in integer multiplication. · 4bb54d51
      Serge Pavlov authored
      If multiplication involves zero-extended arguments and the result is
      compared as in the patterns:
      
          %mul32 = trunc i64 %mul64 to i32
          %zext = zext i32 %mul32 to i64
          %overflow = icmp ne i64 %mul64, %zext
      or
    %overflow = icmp ugt i64 %mul64, 0xffffffff
      
      then the multiplication may be replaced by call to umul.with.overflow.
      This change fixes PR4917 and PR4918.
      
      Differential Revision: http://llvm-reviews.chandlerc.com/D2814
      
      llvm-svn: 206137
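At the source level, the recognized IR typically comes from a widening-multiply idiom like the following sketch (the function name is illustrative, not from the patch):

```cpp
#include <cstdint>

// Illustrative C++ idiom whose IR matches the patterns above: both operands
// are zero-extended to 64 bits, multiplied, and the 64-bit product is
// compared against UINT32_MAX.
bool mul_overflows_u32(uint32_t a, uint32_t b) {
    uint64_t wide = static_cast<uint64_t>(a) * b; // zext + zext + mul i64
    return wide > 0xffffffffULL;                  // icmp ugt i64 %mul64, 0xffffffff
}
```

With this change, InstCombine can rewrite such a check into a single call to the llvm.umul.with.overflow intrinsic.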
  4. Apr 11, 2014
    • Fix shift by constants for vector. · 173a1e57
      Matt Arsenault authored
      ashr <N x iM>, <N x iM> M -> undef
      
      llvm-svn: 206045
    • Implement depth_first and inverse_depth_first range factory functions. · ceec2bda
      David Blaikie authored
      Also updated as many loops as I could find using df_begin/idf_begin -
      strangely I found no uses of idf_begin. Is that just used out of tree?
      
      Also a few places couldn't use df_begin because either they used the
      member functions of the depth first iterators or had specific ordering
      constraints (I added a comment in the latter case).
      
      Based on a patch by Jim Grosbach. (Jim - you just had iterator_range<T>
      where you needed iterator_range<idf_iterator<T>>)
      
      llvm-svn: 206016
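As a rough stand-in for what the range factory buys you (LLVM's real depth_first returns a lazy df_iterator range over anything with GraphTraits; this eager sketch over a plain adjacency list is only meant to show the range-for convenience):

```cpp
#include <stack>
#include <vector>

// Simplified, eager analogue of depth_first(): returns the preorder DFS
// visit order over an adjacency-list graph. Names and signature are
// illustrative, not LLVM's.
std::vector<int> depth_first_order(const std::vector<std::vector<int>>& adj,
                                   int root) {
    std::vector<bool> seen(adj.size(), false);
    std::vector<int> order;
    std::stack<int> work;
    work.push(root);
    while (!work.empty()) {
        int n = work.top();
        work.pop();
        if (seen[n]) continue;
        seen[n] = true;
        order.push_back(n);
        // Push successors in reverse so the first successor is visited first.
        for (auto it = adj[n].rbegin(); it != adj[n].rend(); ++it)
            work.push(*it);
    }
    return order;
}
```

A caller then writes `for (int n : depth_first_order(adj, 0)) ...` instead of an explicit df_begin/df_end iterator loop, which is the cleanup this commit performed across the tree.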
  5. Apr 10, 2014
  6. Apr 09, 2014
  7. Apr 08, 2014
    • Add support for optimization reports. · a9298b22
      Diego Novillo authored
      Summary:
      This patch adds backend support for -Rpass=, which indicates the name
      of the optimization pass that should emit remarks stating when it
      made a transformation to the code.
      
      Pass names are taken from their DEBUG_NAME definitions.
      
      When emitting an optimization report diagnostic, the lack of debug
      information causes the diagnostic to use "<unknown>:0:0" as the
      location string.
      
      This is the back end counterpart for
      
      http://llvm-reviews.chandlerc.com/D3226
      
      Reviewers: qcolombet
      
      CC: llvm-commits
      
      Differential Revision: http://llvm-reviews.chandlerc.com/D3227
      
      llvm-svn: 205774
  8. Apr 07, 2014
  9. Apr 05, 2014
  10. Apr 03, 2014
  11. Apr 02, 2014
    • SLPVectorizer: compare entire intrinsic for SLP compatibility. · 670df3d9
      Tim Northover authored
      Some Intrinsics are overloaded to the extent that return type equality (all
      that's been checked up to now) does not guarantee that the arguments are the
      same. In these cases the SLP vectorizer should not recurse into the operands, which
      can be achieved by comparing them as "Function *" rather than simply the ID.
      
      llvm-svn: 205424
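A toy model of the distinction (all names invented, nothing here is LLVM's API): overloaded intrinsics can share an ID while differing in operand types, so identity must compare the whole declaration, which comparing as "Function *" does for free.

```cpp
#include <string>

// Two "intrinsic declarations" may carry the same ID but different
// signatures; pointer identity distinguishes them, the ID does not.
struct Func {
    unsigned id;
    std::string signature;
};

bool sameById(const Func* a, const Func* b) { return a->id == b->id; }
bool sameDecl(const Func* a, const Func* b) { return a == b; } // "Function *" comparison
```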
    • [LoopVectorizer] Count dependencies of consecutive pointers as uniforms · b0ebdc0f
      Hal Finkel authored
      For the purpose of calculating the cost of the loop at various vectorization
      factors, we need to count dependencies of consecutive pointers as uniforms
      (which means that the VF = 1 cost is used for all overall VF values).
      
      For example, the TSVC benchmark function s173 has:
        ...
        %3 = add nsw i64 %indvars.iv, 16000
        %arrayidx8 = getelementptr inbounds %struct.GlobalData* @global_data, i64 0, i32 0, i64 %3
        ...
      and we must realize that the add will be a scalar in order to correctly deduce
      it to be profitable to vectorize this on PowerPC with VSX enabled. In fact, all
      dependencies of a consecutive pointer must be a scalar (uniform), and so we
      simply need to add all consecutive pointers to the worklist that currently
      collects uniforms.
      
      Fixes PR19296.
      
      llvm-svn: 205387
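A toy sketch of the costing rule described above (invented names, deliberately far simpler than LLVM's actual cost model): instructions classified as uniform keep their VF = 1 scalar cost at every vectorization factor, rather than being scaled with VF.

```cpp
#include <vector>

// Per-instruction cost, plus whether the uniform analysis (the worklist the
// commit extends) decided the instruction stays scalar.
struct InstCost {
    unsigned scalarCost;
    bool isUniform;
};

// Uniform instructions contribute their scalar cost once; the rest are
// (crudely) charged scalar cost times VF.
unsigned loopCost(const std::vector<InstCost>& insts, unsigned VF) {
    unsigned total = 0;
    for (const InstCost& I : insts)
        total += I.isUniform ? I.scalarCost : I.scalarCost * VF;
    return total;
}
```

Misclassifying the add feeding the consecutive pointer as non-uniform inflates the vector cost, which is what made vectorizing s173 look unprofitable on PowerPC.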
  12. Apr 01, 2014
    • Add some additional fields to TTI::UnrollingPreferences · 6386cb8d
      Hal Finkel authored
      In preparation for an upcoming commit implementing unrolling preferences for
      x86, this adds additional fields to the UnrollingPreferences structure:
      
       - PartialThreshold and PartialOptSizeThreshold - Like Threshold and
         OptSizeThreshold, but used when not fully unrolling. These are necessary
         because we need different thresholds for full unrolling from those used when
         partially unrolling (the full unrolling thresholds are generally going to be
         larger).
      
       - MaxCount - A cap on the unrolling factor when partially unrolling. This can
         be used by a target to prevent the unrolled loop from exceeding some
         resource limit independent of the loop size (such as number of branches).
      
      There should be no functionality change for any in-tree targets.
      
      llvm-svn: 205347
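The fields described above can be sketched as follows; the field names follow the commit text, but the surrounding struct layout is an assumption, not a copy of LLVM's header:

```cpp
// Illustrative sketch of TTI::UnrollingPreferences after this change.
struct UnrollingPreferences {
    unsigned Threshold;                // code-size budget for full unrolling
    unsigned OptSizeThreshold;         // full-unroll budget under -Os
    unsigned PartialThreshold;         // budget when only partially unrolling
    unsigned PartialOptSizeThreshold;  // partial-unroll budget under -Os
    unsigned MaxCount;                 // cap on the partial unroll factor
};
```

A target would typically set PartialThreshold below Threshold, since partial unrolling tolerates less code growth than full unrolling, and use MaxCount to bound resources such as branch count.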
    • Move partial/runtime unrolling late in the pipeline · 86b3064f
      Hal Finkel authored
      The generic (concatenation) loop unroller is currently placed early in the
      standard optimization pipeline. This is a good place to perform full unrolling,
      but not the right place to perform partial/runtime unrolling. However, most
      targets don't enable partial/runtime unrolling, so this never mattered.
      
      However, even some x86 cores benefit from partial/runtime unrolling of very
      small loops, and follow-up commits will enable this. First, we need to move
      partial/runtime unrolling late in the optimization pipeline (importantly, this
      is after SLP and loop vectorization, as vectorization can drastically change
      the size of a loop), while keeping the full unrolling where it is now. This
      change does just that.
      
      llvm-svn: 205264
    • Revert "SLPVectorizer: Ignore users that are insertelements we can reschedule them" · 15262e67
      Arnold Schwaighofer authored
      This reverts commit r205018.
      
      Conflicts:
      	lib/Transforms/Vectorize/SLPVectorizer.cpp
      	test/Transforms/SLPVectorizer/X86/insert-element-build-vector.ll
      
      This is breaking libclc build.
      
      llvm-svn: 205260
  13. Mar 30, 2014
    • Add a missing break. · 5e66a7e6
      Rafael Espindola authored
      Patch by Tobias Güntner.
      
      I tried to write a test, but the only difference is the Changed value that
      gets returned. It can be tested with "opt -debug-pass=Executions -functionattrs",
      but that doesn't seem worth it.
      
      llvm-svn: 205121
  14. Mar 29, 2014
    • ARM64: initial backend import · 00ed9964
      Tim Northover authored
      This adds a second implementation of the AArch64 architecture to LLVM,
      accessible in parallel via the "arm64" triple. The plan over the
      coming weeks & months is to merge the two into a single backend,
      during which time thorough code review should naturally occur.
      
      Everything will be easier with the target in-tree though, hence this
      commit.
      
      llvm-svn: 205090
  15. Mar 28, 2014
  16. Mar 27, 2014
    • InstCombine: Don't combine constants on unsigned icmps · 3bdf9bc4
      Reid Kleckner authored
      Fixes a miscompile introduced in r204912.  It would miscompile code like
      (unsigned)(a + -49) <= 5U.  The transform would turn this into
      (unsigned)a < 55U, which would return true for values in [0, 49], when
      it should not.
      
      llvm-svn: 204948
    • Prevent alias from pointing to weak aliases. · 24a669d2
      Rafael Espindola authored
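The two forms the commit contrasts can be written out in C++, where unsigned arithmetic wraps and everything is well defined:

```cpp
#include <cstdint>

// (unsigned)(a + -49) <= 5U : true exactly when a is in [49, 54], because
// the addition wraps modulo 2^32.
bool original(uint32_t a) { return a + static_cast<uint32_t>(-49) <= 5u; }

// The bad transform from r204912: (unsigned)a < 55U, true on [0, 54].
bool miscompiled(uint32_t a) { return a < 55u; }
```

For small values such as a = 0, the original wraps to 0xFFFFFFCF (not <= 5, so false) while the transformed form returns true, which is the miscompile this commit fixes.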
      This adds back r204781.
      
      Original message:
      
      Aliases are just another name for a position in a file. As such, the
      regular symbol resolutions are not applied. For example, given
      
      define void @my_func() {
        ret void
      }
      @my_alias = alias weak void ()* @my_func
      @my_alias2 = alias void ()* @my_alias
      
      We produce without this patch:
      
              .weak   my_alias
      my_alias = my_func
              .globl  my_alias2
      my_alias2 = my_alias
      
      That is, in the resulting ELF file my_alias, my_func and my_alias are
      just 3 names pointing to offset 0 of .text. That is *not* the
      semantics of IR linking. For example, linking in a
      
      @my_alias = alias void ()* @other_func
      
      would require the strong my_alias to override the weak one and
      my_alias2 would end up pointing to other_func.
      
      There is no way to represent that with aliases being just another
      name, so the best solution seems to be to just disallow it, converting
      a miscompile into an error.
      
      llvm-svn: 204934
    • InstCombine: merge constants in both operands of icmp. · 59a12198
      Erik Verbruggen authored
      Transform:
          icmp X+Cst2, Cst
      into:
          icmp X, Cst-Cst2
      when Cst-Cst2 does not overflow, and the add has nsw.
      
      llvm-svn: 204912
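A source-level picture of the transform, with Cst = 100 and Cst2 = 7 chosen purely for illustration; the "nsw" side condition corresponds to x + 7 not overflowing for the values under test, and 100 - 7 not overflowing either:

```cpp
#include <cstdint>

// Before the fold: icmp (X + Cst2), Cst
bool beforeFold(int32_t x) { return x + 7 < 100; }

// After the fold: icmp X, Cst - Cst2
bool afterFold(int32_t x) { return x < 93; }
```

When neither addition nor the folded constant overflows, the two comparisons agree on every input, so the fold is sound under the stated conditions.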
  17. Mar 26, 2014