- Jun 18, 2013
-
-
Nadav Rotem authored
llvm-svn: 184200
-
- May 22, 2013
-
-
Nadav Rotem authored
We are not working on a DAG and I ran into a number of problems when I enabled the vectorizations of 'diamond-trees' (trees that share leafs). * Imroved the numbering API. * Changed the placement of new instructions to the last root. * Fixed a bug with external tree users with non-zero lane. * Fixed a bug in the placement of in-tree users. llvm-svn: 182508
-
- May 11, 2013
-
-
Nadav Rotem authored
For example: bar() { int a = A[i]; int b = A[i+1]; B[i] = a; B[i+1] = b; foo(a); <--- a is used outside the vectorized expression. } llvm-svn: 181648
-
- Apr 20, 2013
-
-
Benjamin Kramer authored
Avoids a couple of copies and allows more flexibility in the clients. llvm-svn: 179935
-
Nadav Rotem authored
llvm-svn: 179930
-
Nadav Rotem authored
llvm-svn: 179928
-
- Apr 16, 2013
-
-
Nadav Rotem authored
SLPVectorizer: Make it a function pass and add code for hoisting the vector-gather sequence out of loops. llvm-svn: 179562
-
- Apr 14, 2013
-
-
Benjamin Kramer authored
llvm-svn: 179483
-
Nadav Rotem authored
SLPVectorizer: Add support for trees that don't start at binary operators, and add the cost of extracting values from the roots of the tree. llvm-svn: 179475
-
Nadav Rotem authored
llvm-svn: 179470
-
- Apr 12, 2013
-
-
Nadav Rotem authored
SLPVectorizer: add support for vectorization of diamond shaped trees. We now perform a preliminary traversal of the graph to collect values with multiple users and check where the users came from. llvm-svn: 179414
-
- Apr 09, 2013
-
-
Nadav Rotem authored
This commit adds the infrastructure for performing bottom-up SLP vectorization (and other optimizations) on parallel computations. The infrastructure has three potential users: 1. The loop vectorizer needs to be able to vectorize AOS data structures such as (sum += A[i] + A[i+1]). 2. The BB-vectorizer needs this infrastructure for bottom-up SLP vectorization, because bottom-up vectorization is faster to compute. 3. A loop-roller needs to be able to analyze consecutive chains and roll them into a loop, in order to reduce code size. A loop roller does not need to create vector instructions, and this infrastructure separates the chain analysis from the vectorization. This patch also includes a simple (100 LOC) bottom up SLP vectorizer that uses the infrastructure, and can vectorize this code: void SAXPY(int *x, int *y, int a, int i) { x[i] = a * x[i] + y[i]; x[i+1] = a * x[i+1] + y[i+1]; x[i+2] = a * x[i+2] + y[i+2]; x[i+3] = a * x[i+3] + y[i+3]; } llvm-svn: 179117
-