  1. Apr 15, 2013
  2. Apr 14, 2013
  3. Apr 13, 2013
    • GlobalDCE: Fix an oversight in my last commit that could lead to crashes. · adc1727c
      Benjamin Kramer authored
      There is a Constant with non-constant operands: blockaddress.
      
      llvm-svn: 179460
    • Fix a scalability issue with complex ConstantExprs. · 89ca4bc6
      Benjamin Kramer authored
      This is basically the same fix in three different places. We use a set to avoid
      walking the whole tree of a big ConstantExpr multiple times.
      
      For example: (select cmp, (add big_expr 1), (add big_expr 2))
      We don't want to visit big_expr twice here, because it may consist of
      thousands of nodes.
      
      The testcase exercises this by creating an insanely large ConstantExpr out of
      a loop. It's questionable whether the optimizer should ever create those, but
      this can be triggered with real C code. Fixes PR15714.
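
The idea of using a set to avoid re-walking shared subexpressions can be sketched as follows. This is a minimal illustration, not the code from the commit: `Expr`, `countDistinctNodes`, and `buildSharedChain` are hypothetical names, and `Expr` is only a stand-in for a constant expression node.

```cpp
#include <cstddef>
#include <memory>
#include <unordered_set>
#include <vector>

// Hypothetical stand-in for a constant expression node (not llvm::ConstantExpr).
struct Expr {
  std::vector<const Expr *> operands;
};

// Counts the distinct nodes reachable from Root. The Visited set ensures that
// a shared subexpression is expanded once, however many users reference it.
std::size_t countDistinctNodes(const Expr *Root) {
  std::unordered_set<const Expr *> Visited;
  std::vector<const Expr *> Worklist{Root};
  while (!Worklist.empty()) {
    const Expr *E = Worklist.back();
    Worklist.pop_back();
    if (!Visited.insert(E).second)
      continue; // Already reached through another user.
    for (const Expr *Op : E->operands)
      Worklist.push_back(Op);
  }
  return Visited.size();
}

// Builds a chain in which every level uses the level below twice, mirroring
// (select cmp, (add big_expr 1), (add big_expr 2)). A walk without the set
// would expand on the order of 2^Depth nodes.
std::vector<std::unique_ptr<Expr>> buildSharedChain(unsigned Depth) {
  std::vector<std::unique_ptr<Expr>> Nodes;
  Nodes.push_back(std::make_unique<Expr>()); // Leaf node.
  for (unsigned I = 0; I < Depth; ++I) {
    auto N = std::make_unique<Expr>();
    N->operands = {Nodes.back().get(), Nodes.back().get()};
    Nodes.push_back(std::move(N));
  }
  return Nodes;
}
```

Calling `countDistinctNodes` on the last node of `buildSharedChain(40)` touches each of the 41 nodes once, whereas a recursive walk without the set would take time exponential in the depth.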
      
      llvm-svn: 179458
  4. Apr 12, 2013
  5. Apr 11, 2013
    • Optimize icmp involving addition better · b81cd63c
      David Majnemer authored
      Allows LLVM to optimize sequences like the following:
      
      %add = add nsw i32 %x, 1
      %cmp = icmp sgt i32 %add, %y
      
      into:
      
      %cmp = icmp sge i32 %x, %y
      
      as well as:
      
      %add1 = add nsw i32 %x, 20
      %add2 = add nsw i32 %y, 57
      %cmp = icmp sge i32 %add1, %add2
      
      into:
      
      %add = add nsw i32 %y, 37
      %cmp = icmp sle i32 %add, %x
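
The two rewrites above can be checked by brute force over a range where none of the additions can overflow. The helper below is a hypothetical sanity check written for this purpose, not part of the commit or of LLVM.

```cpp
#include <cstdint>

// Checks, for all X and Y in [Lo, Hi], that
//   (X + 1) >  Y          agrees with  X >= Y
//   (X + 20) >= (Y + 57)  agrees with  (Y + 37) <= X
// The caller must pick a range in which no addition overflows, which plays
// the role of the nsw flag in the IR above.
bool foldsAgree(int32_t Lo, int32_t Hi) {
  for (int32_t X = Lo; X <= Hi; ++X) {
    for (int32_t Y = Lo; Y <= Hi; ++Y) {
      if (((X + 1) > Y) != (X >= Y))
        return false;
      if (((X + 20) >= (Y + 57)) != ((Y + 37) <= X))
        return false;
    }
  }
  return true;
}
```

The nsw requirement matters: if the add were allowed to wrap, x = INT32_MAX would make x + 1 the smallest representable value, so the original comparison would be false for every y while `x >= y` would be true.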
      
      llvm-svn: 179316
    • Fix for wrong instcombine on vector insert/extract · a95f8749
      Benjamin Kramer authored
      When trying to collapse sequences of insertelement/extractelement
      instructions into single shuffle instructions, there is one specific
      case where the Instruction Combiner wrongly updates the resulting
      Mask of shuffle indexes.
      
      The problem is in function CollectShuffleElements.
      
      If we have a sequence of insert/extract element instructions
      like the one below:
      
        %tmp1 = extractelement <4 x float> %LHS, i32 0
        %tmp2 = insertelement <4 x float> %RHS, float %tmp1, i32 1
        %tmp3 = extractelement <4 x float> %RHS, i32 2
        %tmp4 = insertelement <4 x float> %tmp2, float %tmp3, i32 3
      
      Where:
        . %RHS will have a mask of [4,5,6,7]
        . %LHS will have a mask of [0,1,2,3]
      
      The Mask of shuffle indexes is wrongly computed to [4,1,6,7]
      instead of [4,0,6,7].
      When analyzing %tmp2 in order to compute the Mask for the
      resulting shuffle instruction, the algorithm forgets to update
      the mask index at position 1 with the index associated with the
      element extracted from %LHS by instruction %tmp1.
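
The expected mask for %tmp2 can be reproduced with a small model of the mask bookkeeping. This is only a sketch using the lane numbering from the message (%LHS lanes are 0..3, %RHS lanes are 4..7). The names `Mask`, `insertExtracted`, and `maskAfterTmp2` are hypothetical, and the code is not the InstCombine implementation.

```cpp
#include <array>

using Mask = std::array<int, 4>;

// Lane numbering taken from the message: %LHS is [0,1,2,3], %RHS is [4,5,6,7].
constexpr Mask LHSMask = {0, 1, 2, 3};
constexpr Mask RHSMask = {4, 5, 6, 7};

// Models "insertelement Base, (extractelement Src, SrcLane), DstLane" on
// masks: the destination lane takes the index of the extracted source lane.
Mask insertExtracted(Mask Base, const Mask &Src, int SrcLane, int DstLane) {
  Base[DstLane] = Src[SrcLane];
  return Base;
}

// %tmp1 = extractelement %LHS, 0 ; %tmp2 = insertelement %RHS, %tmp1, 1
Mask maskAfterTmp2() { return insertExtracted(RHSMask, LHSMask, 0, 1); }
```

Under this model `maskAfterTmp2()` yields [4,0,6,7]: position 1 holds index 0, the lane of %LHS that %tmp1 extracted.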
      
      Patch by Andrea DiBiagio!
      
      llvm-svn: 179291
    • Alexey Samsonov's avatar
      a28f36c2
    • Rename the C function to create a SLPVectorizerPass to something sane and... · c86fdf12
      Benjamin Kramer authored
      Rename the C function to create a SLPVectorizerPass to something sane and expose it in the header file.
      
      llvm-svn: 179272
  6. Apr 10, 2013
  7. Apr 09, 2013
    • Add support for bottom-up SLP vectorization infrastructure. · 2d9dec32
      Nadav Rotem authored
      This commit adds the infrastructure for performing bottom-up SLP vectorization (and other optimizations) on parallel computations.
      The infrastructure has three potential users:
      
        1. The loop vectorizer needs to be able to vectorize AOS data structures such as (sum += A[i] + A[i+1]).
      
        2. The BB-vectorizer needs this infrastructure for bottom-up SLP vectorization, because bottom-up vectorization is faster to compute.
      
        3. A loop-roller needs to be able to analyze consecutive chains and roll them into a loop, in order to reduce code size. A loop roller does not need to create vector instructions, and this infrastructure separates the chain analysis from the vectorization.
      
      This patch also includes a simple (100 LOC) bottom-up SLP vectorizer that uses the infrastructure and can vectorize this code:
      
      void SAXPY(int *x, int *y, int a, int i) {
        x[i]   = a * x[i]   + y[i];
        x[i+1] = a * x[i+1] + y[i+1];
        x[i+2] = a * x[i+2] + y[i+2];
        x[i+3] = a * x[i+3] + y[i+3];
      }
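
To illustrate what vectorizing these four statements means, the sketch below compares the scalar form with a hand-written "packed" form in which the four consecutive lanes are loaded, combined, and stored as a group. It is an assumption-laden illustration: `saxpyScalar` and `saxpyPacked` are hypothetical names, `std::array<int, 4>` merely stands in for a 4-wide vector value, and the output of the actual pass may look different.

```cpp
#include <array>

// Scalar form, following the function in the message above.
void saxpyScalar(int *x, const int *y, int a, int i) {
  x[i]     = a * x[i]     + y[i];
  x[i + 1] = a * x[i + 1] + y[i + 1];
  x[i + 2] = a * x[i + 2] + y[i + 2];
  x[i + 3] = a * x[i + 3] + y[i + 3];
}

// Packed form: each loop below corresponds to what would be a single vector
// instruction (or pair of instructions) over four lanes.
void saxpyPacked(int *x, const int *y, int a, int i) {
  std::array<int, 4> vx, vy;
  for (int lane = 0; lane < 4; ++lane) { // Two vector loads.
    vx[lane] = x[i + lane];
    vy[lane] = y[i + lane];
  }
  for (int lane = 0; lane < 4; ++lane)   // One vector multiply and one add.
    vx[lane] = a * vx[lane] + vy[lane];
  for (int lane = 0; lane < 4; ++lane)   // One vector store.
    x[i + lane] = vx[lane];
}
```

The grouping is legal here because the four statements access consecutive elements and none of them reads a value written by another.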
      
      llvm-svn: 179117
    • Redo the fix Benjamin Kramer committed in r178793 about iterator invalidation in Reassociate. · 331f01dc
      Shuxin Yang authored
      I brazenly think this change is slightly simpler than r178793 because: 
        - no "state" in functor
        - "OpndPtrs[i]" looks simpler than "&Opnds[OpndIndices[i]]" 
      
        While I can reproduce the problem in Valgrind, it is rather difficult to come up
      with a standalone test case. The reason is that when an iterator is invalidated,
      the stale elements it points to are not yet clobbered by nonsense data, so the
      optimizer can still proceed successfully.
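
The general pattern of holding pointers into a growable container without invalidating them can be sketched as follows. This is not the Reassociate code: `Opnd`, its `Rank` field, and `sortedByRank` are hypothetical, and `std::vector` stands in for whatever container the pass uses. The only point is that the pointers are taken after the container has stopped growing and that the comparator carries no state.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical operand record; only the sort key matters for this sketch.
struct Opnd {
  int Rank;
};

// Returns pointers to the elements of Opnds, ordered by Rank. The pointers
// are collected only after Opnds is fully built. Collecting them while still
// calling push_back on Opnds could leave them dangling after a reallocation,
// and the stale memory may still look valid for a while.
std::vector<Opnd *> sortedByRank(std::vector<Opnd> &Opnds) {
  std::vector<Opnd *> OpndPtrs;
  OpndPtrs.reserve(Opnds.size());
  for (Opnd &O : Opnds)
    OpndPtrs.push_back(&O);
  // Stateless comparator: it only dereferences the two pointers it is given.
  std::stable_sort(OpndPtrs.begin(), OpndPtrs.end(),
                   [](const Opnd *A, const Opnd *B) { return A->Rank < B->Rank; });
  return OpndPtrs;
}
```

Sorting pointers rather than indices means each access is `OpndPtrs[i]` instead of an extra lookup through an index array, at the cost of requiring that the underlying container no longer changes size.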
      
        Thanks to Benjamin for fixing this bug and generously providing the test case.
      
      llvm-svn: 179062
  8. Apr 07, 2013
    • Fix PR15674 (and PR15603): a SROA think-o. · 0e8a52d1
      Chandler Carruth authored
      The fix for PR14972 in r177055 introduced a real think-o in the *store*
      side, likely because I was much more focused on the load side. While we
      can arbitrarily widen (or narrow) a loaded value, we can't arbitrarily
      widen a value to be stored, as that changes the width of memory access!
      Lock down the code path in the store rewriting which would do this to
      only handle the intended circumstance.
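
The asymmetry between widening a load and widening a store can be shown on a plain byte buffer. The sketch below is a hypothetical illustration rather than SROA code: `widenedLoad` and `widenedStore` are made-up helpers, and both assume the buffer has at least four bytes available at the given offset.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Loads 2 bytes at Offset by reading 4 bytes and discarding the extra two.
// Reading more than needed leaves memory untouched.
uint16_t widenedLoad(const unsigned char *Buf, std::size_t Offset) {
  uint32_t Wide;
  std::memcpy(&Wide, Buf + Offset, sizeof(Wide));
  uint16_t Narrow;
  std::memcpy(&Narrow, &Wide, sizeof(Narrow)); // Keep the first two bytes.
  return Narrow;
}

// Stores a 2-byte value by writing 4 bytes. The two bytes after the intended
// location are overwritten with zeros, which changes observable memory.
void widenedStore(unsigned char *Buf, std::size_t Offset, uint16_t Value) {
  uint32_t Wide = 0;
  std::memcpy(&Wide, &Value, sizeof(Value));
  std::memcpy(Buf + Offset, &Wide, sizeof(Wide));
}
```

After `widenedStore(Buf, 2, V)`, bytes 4 and 5 of the buffer no longer hold their previous contents, which is the kind of change in access width the message warns about.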
      
      All of the existing tests continue to pass, and I've added a test from
      the PR.
      
      llvm-svn: 178974
  9. Apr 06, 2013
    • Removed trailing whitespace. · 7924997c
      Michael Gottesman authored
      llvm-svn: 178932
    • An objc_retain can serve as a use for a different pointer. · 31ba23aa
      Michael Gottesman authored
      This is the counterpart to commit r160637, except it performs the action
      in the bottom-up portion of the data flow analysis.
      
      llvm-svn: 178922
    • Properly model precise lifetime when given an incomplete dataflow sequence. · 1d8d2577
      Michael Gottesman authored
      The normal dataflow sequence in the ARC optimizer consists of the following
      states:
      
          Retain -> CanRelease -> Use -> Release
      
      Before this patch, the optimizer stored the uses that determine the lifetime of
      the retainable object pointer when it hit a retain bottom-up or a release
      top-down. This is correct for an imprecise lifetime scenario, since what
      we are trying to do is remove retains/releases while making sure that no
      "CanRelease" (which is usually a call) deallocates the given pointer before we
      get to the "Use" (since that would cause a segfault).
      
      If we are considering the precise lifetime scenario though, this is not
      correct. In such a situation, we *DO* care about the previous sequence, but
      additionally, we wish to track the uses resulting from the following incomplete
      sequences:
      
        Retain -> CanRelease -> Release   (TopDown)
        Retain <- Use <- Release          (BottomUp)
      
      *NOTE* This patch looks large, but most of it consists of updating
      test cases. Additionally, this fix exposed another bug. I removed
      the test case that expressed said bug and will recommit it with the fix
      in a little bit.
      
      llvm-svn: 178921
  10. Apr 05, 2013
  11. Apr 04, 2013