- Apr 21, 2013
-
-
Nadav Rotem authored
We did not terminate the switch case and we executed the search routine twice. llvm-svn: 179974
-
Nadav Rotem authored
llvm-svn: 179960
-
- Apr 20, 2013
-
-
Benjamin Kramer authored
llvm-svn: 179936
-
Benjamin Kramer authored
Avoids a couple of copies and allows more flexibility in the clients. llvm-svn: 179935
-
Nadav Rotem authored
SLPVectorizer: Reduce the compile time by eliminating the search for some of the more expensive patterns. After this change will only check basic arithmetic trees that start at cmpinstr. llvm-svn: 179933
-
Nadav Rotem authored
llvm-svn: 179932
-
Nadav Rotem authored
llvm-svn: 179931
-
Nadav Rotem authored
llvm-svn: 179930
-
Nadav Rotem authored
llvm-svn: 179929
-
Nadav Rotem authored
llvm-svn: 179928
-
Nadav Rotem authored
llvm-svn: 179927
-
- Apr 19, 2013
-
-
Arnold Schwaighofer authored
Also make some static function class functions to avoid having to mention the class namespace for enums all the time. No functionality change intended. llvm-svn: 179886
-
- Apr 18, 2013
-
-
Dmitri Gribenko authored
llvm-svn: 179789
-
Arnold Schwaighofer authored
A min/max operation is represented by a select(cmp(lt/le/gt/ge, X, Y), X, Y) sequence in LLVM. If we see such a sequence we can treat it just as any other commutative binary instruction and reduce it. This appears to help bzip2 by about 1.5% on an imac12,2. radar://12960601 llvm-svn: 179773
-
Benjamin Kramer authored
Fixes PR15748. llvm-svn: 179757
-
- Apr 16, 2013
-
-
Nadav Rotem authored
SLPVectorizer: Make it a function pass and add code for hoisting the vector-gather sequence out of loops. llvm-svn: 179562
-
- Apr 15, 2013
-
-
Nadav Rotem authored
llvm-svn: 179504
-
- Apr 14, 2013
-
-
Benjamin Kramer authored
llvm-svn: 179483
-
Nadav Rotem authored
llvm-svn: 179479
-
Nadav Rotem authored
SLPVectorizer: Add support for trees that don't start at binary operators, and add the cost of extracting values from the roots of the tree. llvm-svn: 179475
-
Nadav Rotem authored
llvm-svn: 179470
-
- Apr 12, 2013
-
-
Nadav Rotem authored
SLPVectorizer: add support for vectorization of diamond shaped trees. We now perform a preliminary traversal of the graph to collect values with multiple users and check where the users came from. llvm-svn: 179414
-
Nadav Rotem authored
llvm-svn: 179412
-
Arnold Schwaighofer authored
Don't classify idiv/udiv as a reduction operation. Integer division is lossy. For example : (1 / 2) * 4 != 4/2. Example: int a[] = { 2, 5, 2, 2} int x = 80; for() x /= a[i]; Scalar: x /= 2 // = 40 x /= 5 // = 8 x /= 2 // = 4 x /= 2 // = 2 Vectorized: <80, 1> / <2,5> //= <40,0> <40, 0> / <2,2> //= <20,0> 20*0 = 0 radar://13640654 llvm-svn: 179381
-
- Apr 11, 2013
-
-
Benjamin Kramer authored
Rename the C function to create a SLPVectorizerPass to something sane and expose it in the header file. llvm-svn: 179272
-
- Apr 10, 2013
-
-
Nadav Rotem authored
Make the SLP store-merger less paranoid about function calls. We check for function calls when we check if it is safe to sink instructions. llvm-svn: 179207
-
Nadav Rotem authored
llvm-svn: 179206
-
- Apr 09, 2013
-
-
Nadav Rotem authored
This commit adds the infrastructure for performing bottom-up SLP vectorization (and other optimizations) on parallel computations. The infrastructure has three potential users: 1. The loop vectorizer needs to be able to vectorize AOS data structures such as (sum += A[i] + A[i+1]). 2. The BB-vectorizer needs this infrastructure for bottom-up SLP vectorization, because bottom-up vectorization is faster to compute. 3. A loop-roller needs to be able to analyze consecutive chains and roll them into a loop, in order to reduce code size. A loop roller does not need to create vector instructions, and this infrastructure separates the chain analysis from the vectorization. This patch also includes a simple (100 LOC) bottom up SLP vectorizer that uses the infrastructure, and can vectorize this code: void SAXPY(int *x, int *y, int a, int i) { x[i] = a * x[i] + y[i]; x[i+1] = a * x[i+1] + y[i+1]; x[i+2] = a * x[i+2] + y[i+2]; x[i+3] = a * x[i+3] + y[i+3]; } llvm-svn: 179117
-
- Apr 05, 2013
-
-
Arnold Schwaighofer authored
Pass down the fact that an operand is going to be a vector of constants. This should bring the performance of MultiSource/Benchmarks/PAQ8p/paq8p on x86 back. It had degraded to scalar performance due to my pervious shift cost change that made all shifts expensive on x86. radar://13576547 llvm-svn: 178809
-
- Mar 14, 2013
-
-
Arnold Schwaighofer authored
We generate a select with a vectorized condition argument when the condition is NOT loop invariant. Not the other way around. llvm-svn: 177098
-
- Mar 10, 2013
-
-
Hal Finkel authored
After the recent data-structure improvements, a couple of debugging statements were broken (printing pointer values). llvm-svn: 176791
-
- Mar 09, 2013
-
-
Benjamin Kramer authored
This made us emit runtime checks in a random order. Hopefully bootstrap miscompares will go away now. llvm-svn: 176775
-
Arnold Schwaighofer authored
Ignore all DbgIntriniscInfo instructions instead of just DbgValueInst. llvm-svn: 176769
-
Arnold Schwaighofer authored
We want vectorization to happen at -g. Ignore calls to the dbg.value intrinsic and don't transfer them to the vectorized code. radar://13378964 llvm-svn: 176768
-
- Mar 08, 2013
-
-
Benjamin Kramer authored
Fixes PR15344. llvm-svn: 176701
-
- Mar 02, 2013
-
-
Nadav Rotem authored
The LoopVectorizer often runs multiple times on the same function due to inlining. When this happens the loop vectorizer often vectorizes the same loops multiple times, increasing code size and adding unneeded branches. With this patch, the vectorizer during vectorization puts metadata on scalar loops and marks them as 'already vectorized' so that it knows to ignore them when it sees them a second time. PR14448. llvm-svn: 176399
-
- Mar 01, 2013
-
-
Benjamin Kramer authored
Fixes PR15384. llvm-svn: 176366
-
- Feb 27, 2013
-
-
Benjamin Kramer authored
This properly asks TargetLibraryInfo if a call is available and if it is, it can be translated into the corresponding LLVM builtin. We don't vectorize sqrt() yet because I'm not sure about the semantics for negative numbers. The other intrinsic should be exact equivalents to the libm functions. Differential Revision: http://llvm-reviews.chandlerc.com/D465 llvm-svn: 176188
-
- Feb 21, 2013
-
-
Renato Golin authored
Storing the load/store instructions with the values and inspect them using Alias Analysis to make sure they don't alias, since the GEP pointer operand doesn't take the offset into account. Trying hard to not add any extra cost to loads and stores that don't overlap on global values, AA is *only* calculated if all of the previous attempts failed. Using biggest vector register size as the stride for the vectorization access, as we're being conservative and the cost model (which calculates the real vectorization factor) is only run after the legalization phase. We might re-think this relationship in the future, but for now, I'd rather be safe than sorry. llvm-svn: 175818
-
- Feb 17, 2013
-
-
Hal Finkel authored
This fixes PR15289. This bug was introduced (recently) in r175215; collecting all std::vector references for candidate pairs to delete at once is invalid because subsequent lookups in the owning DenseMap could invalidate the references. bugpoint was able to reduce a useful test case. Unfortunately, because whether or not this asserts depends on memory layout, this test case will sometimes appear to produce valid output. Nevertheless, running under valgrind will reveal the error. llvm-svn: 175397
-