  1. Oct 26, 2012
  2. Oct 25, 2012
    • Begin incorporating target information into BBVectorize. · cbf9365f
      Hal Finkel authored
      This is the first of several steps to incorporate information from the new
      TargetTransformInfo infrastructure into BBVectorize. Two things are done here:
      
       1. Target information is used to determine if it is profitable to fuse two
          instructions. This means that the cost of the vector operation must not
          exceed the combined cost of the two original operations. Pairs that
          are not profitable are no longer considered. Because current cost
          information is incomplete (for intrinsics, for example), equal-cost
          pairs are still considered.
      
       2. The 'cost savings' computed for the profitability check are also used to
          rank the DAGs that represent the potential vectorization plans. Specifically,
          for nodes of non-trivial depth, the cost savings is used as the node
          weight.
      
      The next step will be to incorporate the shuffle costs into the DAG weighting;
      this will give the edges of the DAG weights as well. Once that is done, when
      target information is available, we should be able to dispense with the
      depth heuristic.
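
The profitability check and the cost savings described above can be sketched roughly as follows. This is only an illustration under stated assumptions: `PairCosts`, `isProfitableToFuse`, and `costSavings` are hypothetical names, and the integer costs stand in for whatever the target cost interface reports. It is not the code in BBVectorize.

```cpp
// Hypothetical container for the three costs involved in fusing a pair.
// In the real pass these numbers would come from target information.
struct PairCosts {
  int firstScalarCost;   // cost of the first original instruction
  int secondScalarCost;  // cost of the second original instruction
  int vectorCost;        // cost of the single fused vector instruction
};

// A pair is rejected only when the vector form is strictly more expensive
// than the two scalar instructions combined. Equal-cost pairs are kept.
bool isProfitableToFuse(const PairCosts &costs) {
  int scalarCost = costs.firstScalarCost + costs.secondScalarCost;
  return costs.vectorCost <= scalarCost;
}

// The savings are the scalar cost minus the vector cost. Per the message
// above, this value doubles as the weight of a DAG node of non-trivial depth.
int costSavings(const PairCosts &costs) {
  return costs.firstScalarCost + costs.secondScalarCost - costs.vectorCost;
}
```

With scalar costs of 1 and 1, a vector cost of 2 is still accepted (savings of 0), while a vector cost of 3 is rejected.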
      
      llvm-svn: 166716
      cbf9365f
    • Also optimize large switch statements. · 977f41a1
      Jakob Stoklund Olesen authored
      The isValueEqualityComparison() guard at the top of SimplifySwitch()
      only applies to some of the possible transformations.
      
      The newer transformations work just fine on large switches, and the
      check on predecessor count is nonsensical.
      
      llvm-svn: 166710
      977f41a1
    • Teach SROA how to split whole-alloca integer loads and stores into smaller integer loads and stores. · 58d05567
      Chandler Carruth authored
      
      The high-level motivation is that the frontend sometimes generates
      a single whole-alloca integer load or store during ABI lowering of
      splittable allocas. We need to be able to break this apart in order to
      see the underlying elements and properly promote them to SSA values. The
      hope is that this fixes some performance regressions on x86-32 with the
      new SROA pass.
      
      Unfortunately, this causes quite a bit of churn in the test cases, and
      bloats some of the IR that comes out. When we see an alloca that consists
      solely of bits and bytes being extracted and re-inserted, we now do some
      splitting first, before building widened integer "bucket of bits"
      representations. However, these are always well folded by instcombine, so
      this shouldn't actually result in missed opportunities.
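
As a value-level illustration of the idea (not SROA's implementation), splitting one wide integer access into two narrower ones and rebuilding the "bucket of bits" might look like the sketch below. The 64-bit width, the two 32-bit halves, and the names `Halves`, `splitWide`, and `joinHalves` are assumptions made for the example.

```cpp
#include <cstdint>

// Hypothetical aggregate of two 32-bit elements that a frontend might
// load or store as a single 64-bit integer during ABI lowering.
struct Halves {
  std::uint32_t low;
  std::uint32_t high;
};

// Split one wide integer value into its two narrower components, so each
// element can be tracked as its own value.
Halves splitWide(std::uint64_t wide) {
  return Halves{static_cast<std::uint32_t>(wide),
                static_cast<std::uint32_t>(wide >> 32)};
}

// Rebuild the wide "bucket of bits" from the two narrower values.
std::uint64_t joinHalves(Halves parts) {
  return (static_cast<std::uint64_t>(parts.high) << 32) | parts.low;
}
```

Splitting and then joining yields the original value, which is why a later folding pass can simplify the extra extract and insert operations away.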
      
      If this splitting of all-integer allocas does cause problems (perhaps
      due to smaller SSA values going into the RA), we could potentially go to
      some extreme measures to only do this integer splitting trick when there
      are non-integer component accesses of an alloca, but discovering this is
      quite expensive: it adds yet another complete walk of the recursive use
      tree of the alloca.
      
      Either way, I will be watching build bots and LNT bots to see what
      fallout there is here. If anyone gets x86-32 numbers before & after this
      change, I would be very interested.
      
      llvm-svn: 166662
      58d05567
    • Add support for additional reduction variables: AND, OR, XOR. · 5ffb049a
      Nadav Rotem authored
      Patch by Paul Redmond <paul.redmond@intel.com>.
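
For readers unfamiliar with the term, the three reduction kinds correspond to loops of the following shape. This is a sketch of the source-level pattern only, assuming the reduction variables are loop accumulators. The function names are hypothetical and nothing here is taken from the patch itself.

```cpp
#include <cstddef>
#include <cstdint>

// AND reduction: the accumulator starts at all ones, the identity for AND.
std::uint32_t reduceAnd(const std::uint32_t *data, std::size_t count) {
  std::uint32_t acc = ~0u;
  for (std::size_t i = 0; i < count; ++i)
    acc &= data[i];
  return acc;
}

// OR reduction: the accumulator starts at zero, the identity for OR.
std::uint32_t reduceOr(const std::uint32_t *data, std::size_t count) {
  std::uint32_t acc = 0;
  for (std::size_t i = 0; i < count; ++i)
    acc |= data[i];
  return acc;
}

// XOR reduction: the accumulator starts at zero, the identity for XOR.
std::uint32_t reduceXor(const std::uint32_t *data, std::size_t count) {
  std::uint32_t acc = 0;
  for (std::size_t i = 0; i < count; ++i)
    acc ^= data[i];
  return acc;
}
```

All three operations are associative and commutative, so the partial results can be combined in any order.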
      
      llvm-svn: 166649
      5ffb049a
    • revert accidental change · 086ea5c1
      Nadav Rotem authored
      llvm-svn: 166643
      086ea5c1
    • Implement a basic cost model for vector and scalar instructions. · 4a87683a
      Nadav Rotem authored
      llvm-svn: 166642
      4a87683a
    • Fix a compiler warning with an unused variable. · f07b9628
      Micah Villmow authored
      llvm-svn: 166634
      f07b9628
  3. Oct 24, 2012
  4. Oct 23, 2012
    • Use the AliasAnalysis isIdentifiedObj because it also understands mallocs and C++ new expressions. · 5bed7b4f
      Nadav Rotem authored
      
      PR14158.
      
      llvm-svn: 166491
      5bed7b4f
    • Fix typo that somehow escaped both testing and code inspection. · 5ed3900d
      Duncan Sands authored
      llvm-svn: 166475
      5ed3900d
    • Transform code like this · 533c8ae7
      Duncan Sands authored
       %V = mul i64 %N, 4
       %t = getelementptr i8* bitcast (i32* %arr to i8*), i64 %V
      into
       %t1 = getelementptr i32* %arr, i64 %N
       %t = bitcast i32* %t1 to i8*
      incorporating the multiplication into the getelementptr.
      This happens all the time in dragonegg, for example for
        int foo(int *A, int N) {
          return A[N];
        }
      because gcc turns this into byte pointer arithmetic before it hits the plugin:
        D.1590_2 = (long unsigned int) N_1(D);
        D.1591_3 = D.1590_2 * 4;
        D.1592_5 = A_4(D) + D.1591_3;
        D.1589_6 = *D.1592_5;
        return D.1589_6;
      The D.1592_5 line is a POINTER_PLUS_EXPR, which is turned into a getelementptr
      on a bitcast of A_4 to i8*, so this becomes exactly the kind of IR that the
      transform fires on.
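
The equivalence the transform relies on can be seen at the source level: scaling the index by the element size and adding it to a byte pointer reaches the same address as indexing the typed pointer directly. The two helper names below are hypothetical and the sketch is only meant to illustrate the two forms, not the instcombine code.

```cpp
#include <cstddef>

// Byte-pointer form, mirroring what gcc hands to the plugin: scale the
// index by the element size and add it to a char pointer.
int loadViaBytes(const int *arr, std::size_t n) {
  const char *bytes = reinterpret_cast<const char *>(arr);
  const char *addr = bytes + n * sizeof(int);
  return *reinterpret_cast<const int *>(addr);
}

// Typed form, mirroring the transformed IR: index the int pointer directly
// and let the element size be implied by the pointer type.
int loadViaIndex(const int *arr, std::size_t n) {
  return arr[n];
}
```

Both functions return the same element for any in-bounds index, so the multiplication can be folded into the typed index.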
      
      An analogous transform (with no testcases!) already existed for bitcasts of
      arrays, so I rewrote it to share code with this one.
      
      llvm-svn: 166474
      533c8ae7
    • Per the C++ standard, we need to include the definition of llvm::Calculate in every TU where it's implicitly instantiated, even if there's an implicit instantiation for the same types available in another TU. · 6289a4e8
      Richard Smith authored
      
      llvm-svn: 166470
      6289a4e8
    • Fix typo. · a302b6d9
      Julien Lerouge authored
      llvm-svn: 166456
      a302b6d9
    • Explain why DenseMap is still used here instead of MapVector. · d7fa5e42
      Julien Lerouge authored
      llvm-svn: 166454
      d7fa5e42
  5. Oct 22, 2012
  6. Oct 21, 2012