- Mar 05, 2013
-
-
Meador Inge authored
This patch adds many more functions to the target library information. All of the functions being added were discovered while migrating the simplify-libcalls attribute annotation functionality to the functionattrs pass. As part of that work, the attribute annotation logic will query TLI to determine whether a function should be annotated. Signed-off-by:
Meador Inge <meadori@codesourcery.com> llvm-svn: 176514
-
Jyotsna Verma authored
llvm-svn: 176513
-
Jyotsna Verma authored
llvm-svn: 176508
-
Vincent Lejeune authored
llvm-svn: 176507
-
Jyotsna Verma authored
llvm-svn: 176505
-
Benjamin Kramer authored
llvm-svn: 176501
-
Jyotsna Verma authored
llvm-svn: 176500
-
Jyotsna Verma authored
Set isMoveImm, isAsCheapAsAMove flags for TFRI instructions. llvm-svn: 176499
-
Vincent Lejeune authored
This is a skeleton for a pre-RA MachineInstr scheduler strategy. Currently it only tries to expose more parallelism for ALU instructions (this also makes the distribution of GPR channels more uniform and increases the chances of ALU instructions being packed together in a single VLIW group). It also tries to reduce clause switching by grouping instructions of the same kind (ALU/FETCH/CF) together.
Vincent Lejeune:
  - Support for VLIW4 slot assignment
  - Recomputation of ScheduleDAG to get more parallelism opportunities
Tom Stellard:
  - Fix assertion failure when trying to determine an instruction's slot based on its destination register's class
  - Fix some compiler warnings
Vincent Lejeune: [v2]
  - Remove recomputation of ScheduleDAG (will be provided in a later patch)
  - Improve estimation of an ALU clause size so that the heuristic does not emit CF instructions at the wrong position
  - Make the schedule heuristic smarter using SUnit Depth
  - Take constant read limitations into account
Vincent Lejeune: [v3]
  - Fix some uninitialized values in ConstPair
  - Add asserts to ensure an ALU slot is always populated
llvm-svn: 176498
-
Vincent Lejeune authored
Maintaining CONST_COPY instructions until Pre Emit may prevent some ifcvt cases, and taking them into account for scheduling is difficult for no real benefit. llvm-svn: 176488
-
Vincent Lejeune authored
Reviewed-by: Tom Stellard <thomas.stellard at amd.com> llvm-svn: 176487
-
Vincent Lejeune authored
Reviewed-by: Tom Stellard <thomas.stellard at amd.com> mayLoad complicates scheduling and does not provide any useful information, as the location is not writeable at all. llvm-svn: 176486
-
Vincent Lejeune authored
Reviewed-by: Tom Stellard <thomas.stellard at amd.com> llvm-svn: 176485
-
Vincent Lejeune authored
NOTE: This is a candidate for the Mesa stable branch. llvm-svn: 176484
-
David Sehr authored
one-byte NOPs. If the processor actually executes those NOPs, as it sometimes does with aligned bundling, this can have a performance impact. From my micro-benchmarks run on my one machine, a 15-byte NOP followed by twelve one-byte NOPs is about 20% worse than a 15 followed by a 12. This patch changes NOP emission to emit as many 15-byte (the maximum) as possible followed by at most one shorter NOP. llvm-svn: 176464
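The padding policy described above (as many maximum-length NOPs as possible, then at most one shorter NOP) can be sketched as follows. This is a minimal illustration, not the code from the patch: `planNopLengths` and its `maxNopLen` parameter are hypothetical names, and the sketch only computes the NOP lengths rather than emitting any encodings.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper (not the LLVM implementation): split a padding request
// of `count` bytes into a sequence of NOP lengths.
std::vector<uint64_t> planNopLengths(uint64_t count, uint64_t maxNopLen = 15) {
  assert(maxNopLen > 0 && "maximum NOP length must be positive");
  std::vector<uint64_t> lengths;
  // Use as many maximum-length NOPs as fit in the padding.
  for (; count >= maxNopLen; count -= maxNopLen)
    lengths.push_back(maxNopLen);
  // At most one shorter NOP covers whatever remains.
  if (count != 0)
    lengths.push_back(count);
  return lengths;
}
```

For 27 bytes of padding this yields a 15-byte NOP followed by a 12-byte NOP, which is the faster of the two layouts compared in the message above.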
-
- Mar 04, 2013
-
-
Akira Hatanaka authored
"move $4, $5" is printed instead of "or $4, $5, $zero". llvm-svn: 176455
-
Jack Carter authored
'R' An address that can be used in a non-macro load or store. This patch includes a positive test case. llvm-svn: 176452
-
Preston Gurd authored
* Only apply divide bypass optimization when not optimizing for size.
* Fixed bug caused by constant for 0 value of type Int32, used dividend type to generate the constant instead.
* For atom x86-64 apply the divide bypass to use 16-bit divides instead of 64-bit divides when operand values are small enough.
* Added lit tests for 64-bit divide bypass.
Patch by Tyler Nowicki! llvm-svn: 176442
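The idea behind the divide bypass can be sketched at the source level as a runtime check followed by a narrow or wide divide. This is only an illustration under assumptions: the actual optimization rewrites IR inside the compiler, `udivWithBypass` is a hypothetical name, and the casts below merely mark where a 16-bit divide would be used (C++ integer promotion means the compiler chooses the real instruction).

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical illustration of a divide bypass for unsigned 64-bit division.
uint64_t udivWithBypass(uint64_t dividend, uint64_t divisor) {
  assert(divisor != 0 && "division by zero");
  // If no bit above bit 15 is set in either operand, both fit in 16 bits.
  if (((dividend | divisor) >> 16) == 0) {
    // Fast path: a narrow divide gives the same quotient.
    return static_cast<uint16_t>(dividend) / static_cast<uint16_t>(divisor);
  }
  // Slow path: fall back to the full 64-bit divide.
  return dividend / divisor;
}
```

The single OR-and-shift test checks both operands at once, so the extra cost on the slow path is one cheap branch.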
-
Tom Stellard authored
llvm-svn: 176439
-
Jia Liu authored
llvm-svn: 176426
-
- Mar 02, 2013
-
-
Jim Grosbach authored
The VDUP instruction source register doesn't allow a non-constant lane index, so make sure we don't construct an ARM::VDUPLANE node asking it to do so. rdar://13328063 http://llvm.org/bugs/show_bug.cgi?id=13963 llvm-svn: 176413
-
Jim Grosbach authored
llvm-svn: 176412
-
Jim Grosbach authored
llvm-svn: 176411
-
Arnold Schwaighofer authored
Mark them as expand; they are not legal because our backend does not match them. llvm-svn: 176410
-
Arnold Schwaighofer authored
This matters, for example, in the following matrix multiply:
    int **mmult(int rows, int cols, int **m1, int **m2, int **m3) {
      int i, j, k, val;
      for (i=0; i<rows; i++) {
        for (j=0; j<cols; j++) {
          val = 0;
          for (k=0; k<cols; k++) {
            val += m1[i][k] * m2[k][j];
          }
          m3[i][j] = val;
        }
      }
      return(m3);
    }
Taken from the test-suite benchmark Shootout. We estimate the cost of the multiply to be 2 while we generate 9 instructions for it and end up being quite a bit slower than the scalar version (48% on my machine).
Also, properly differentiate between avx1 and avx2. On avx-1 we still split the vector into two 128-bit vectors and handle the subvector muls like above with 9 instructions. Only on avx-2 will we have a cost of 9 for v4i64.
I changed the test case in test/Transforms/LoopVectorize/X86/avx1.ll to use an add instead of a mul because with a mul we no longer vectorize. I did verify that the mul would indeed be more expensive when vectorized, using 3 kernels:
    for (i ...) r += a[i] * 3;
    for (i ...) m1[i] = m1[i] * 3; // This matches the test case in avx1.ll
and a matrix multiply. In each case the vectorized version was considerably slower.
radar://13304919 llvm-svn: 176403
-
Andrew Trick authored
llvm-svn: 176400
-
- Mar 01, 2013
-
-
Akira Hatanaka authored
This patch eliminates the need to emit a constant move instruction when this pattern is matched: (select (setgt a, Constant), T, F) The pattern above effectively turns into this: (conditional-move (setlt a, Constant + 1), F, T) llvm-svn: 176384
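The equivalence behind this rewrite can be checked with a small sketch. The function names are hypothetical, and the sketch assumes signed 32-bit integers and that `Constant + 1` does not overflow; any further conditions the backend places on the constant are not modelled here.

```cpp
#include <cassert>
#include <cstdint>

// Original form: (select (setgt a, Constant), T, F)
int32_t selectGt(int32_t a, int32_t constant, int32_t t, int32_t f) {
  return (a > constant) ? t : f;
}

// Rewritten form: (conditional-move (setlt a, Constant + 1), F, T)
// Assumption: constant + 1 must not overflow, so INT32_MAX is excluded.
int32_t selectLtPlusOne(int32_t a, int32_t constant, int32_t t, int32_t f) {
  assert(constant != INT32_MAX && "Constant + 1 would overflow");
  // a > C is the negation of a < C + 1, so the T and F operands swap.
  return (a < constant + 1) ? f : t;
}
```

For integers, `a > C` holds exactly when `a < C + 1` does not, which is why the true and false operands trade places in the rewritten pattern.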
-
Akira Hatanaka authored
llvm-svn: 176380
-
Michael Liao authored
  - ISD::SHL/SRL/SRA must have either both scalar or both vector operands, but TLI.getShiftAmountTy() so far only returns a scalar type. As a result, backend logic that assumes this invariant breaks.
  - Rename the original TLI.getShiftAmountTy() to TLI.getScalarShiftAmountTy() and re-define TLI.getShiftAmountTy() to return the target-specified scalar type or the same vector type as the 1st operand.
  - Fix most TICG logic that assumes TLI.getShiftAmountTy() is a simple scalar type.
llvm-svn: 176364
-
Chad Rosier authored
dispatch code. As far as I can tell the thumb2 code is behaving as expected. I was able to compile and run the associated test case for both arm and thumb1. rdar://13066352 llvm-svn: 176363
-
Jyotsna Verma authored
llvm-svn: 176358
-
Christian Konig authored
v2: based on Michel's patch, but now allows copying of all register sizes. Signed-off-by:
Michel Dänzer <michel.daenzer@amd.com> Signed-off-by:
Christian König <christian.koenig@amd.com> llvm-svn: 176346
-
Christian Konig authored
They won't match anyway. Signed-off-by:
Christian König <christian.koenig@amd.com> llvm-svn: 176345
-
Christian Konig authored
It's much easier to specify the encoding with tablegen directly. Signed-off-by:
Christian König <christian.koenig@amd.com> llvm-svn: 176344
-
Christian Konig authored
Signed-off-by:
Christian König <christian.koenig@amd.com> llvm-svn: 176343
-
Christian Konig authored
Signed-off-by:
Christian König <christian.koenig@amd.com> llvm-svn: 176342
-
Duncan Sands authored
llvm-svn: 176341
-
Akira Hatanaka authored
llvm-svn: 176330
-
Akira Hatanaka authored
successor basic blocks. Currently this is off by default. llvm-svn: 176329
-
Akira Hatanaka authored
terminator. No functionality change. llvm-svn: 176326
-