- Mar 05, 2013
-
-
Bill Wendling authored
llvm-svn: 176467
-
David Sehr authored
one-byte NOPs. If the processor actually executes those NOPs, as it sometimes does with aligned bundling, this can have a performance impact. From my micro-benchmarks run on my one machine, a 15-byte NOP followed by twelve one-byte NOPs is about 20% worse than a 15 followed by a 12. This patch changes NOP emission to emit as many 15-byte (the maximum) as possible followed by at most one shorter NOP. llvm-svn: 176464
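The splitting rule described above can be sketched as follows. This is a minimal illustration, not the code from the patch: `splitNopPadding` and its `maxNopLen` parameter are hypothetical names, and the 15-byte maximum is taken from the commit message.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper (not LLVM's actual API): split `count` bytes of padding
// into NOP lengths, using as many maximum-length NOPs as possible followed by
// at most one shorter NOP.
std::vector<uint64_t> splitNopPadding(uint64_t count, uint64_t maxNopLen = 15) {
  assert(maxNopLen > 0 && "maximum NOP length must be positive");
  // count / maxNopLen full-length NOPs...
  std::vector<uint64_t> lengths(count / maxNopLen, maxNopLen);
  // ...plus one shorter NOP for whatever is left over, if anything.
  if (uint64_t rest = count % maxNopLen)
    lengths.push_back(rest);
  return lengths;
}
```

For 27 bytes of padding this yields one 15-byte NOP and one 12-byte NOP, rather than a 15-byte NOP followed by twelve one-byte NOPs.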
-
- Mar 04, 2013
-
-
Lang Hames authored
GlobalValue linkage up to ExternalLinkage in the ExtractGV pass. This prevents linkonce and linkonce_odr symbols from being DCE'd. llvm-svn: 176459
-
Akira Hatanaka authored
"move $4, $5" is printed instead of "or $4, $5, $zero". llvm-svn: 176455
-
Jack Carter authored
'R' An address that can be used in a non-macro load or store. This patch includes a positive test case. llvm-svn: 176452
-
Eli Bendersky authored
running llvm-objdump on Darwin. llvm-svn: 176443
-
Preston Gurd authored
* Only apply divide bypass optimization when not optimizing for size.
* Fixed bug caused by constant for 0 value of type Int32, used dividend type to generate the constant instead.
* For atom x86-64 apply the divide bypass to use 16-bit divides instead of 64-bit divides when operand values are small enough.
* Added lit tests for 64-bit divide bypass.
Patch by Tyler Nowicki!
llvm-svn: 176442
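The idea behind a divide bypass can be sketched at the source level as below. This is only a model under stated assumptions: the real optimization rewrites IR rather than C++, `udivWithBypass` is a hypothetical name, and unsigned operands are assumed because the commit message does not say how signed values are handled.

```cpp
#include <cassert>
#include <cstdint>

// Illustrative model of a divide bypass (hypothetical function, not LLVM code).
// If both operands fit in 16 bits, a narrow divide produces the same quotient
// as the 64-bit divide, so the cheaper instruction can be used.
uint64_t udivWithBypass(uint64_t dividend, uint64_t divisor) {
  assert(divisor != 0 && "division by zero");
  // OR the operands together: the upper 48 bits are zero only if both are small.
  if (((dividend | divisor) >> 16) == 0) {
    uint16_t narrowDividend = static_cast<uint16_t>(dividend);
    uint16_t narrowDivisor = static_cast<uint16_t>(divisor);
    return static_cast<uint64_t>(narrowDividend / narrowDivisor);
  }
  // Otherwise fall back to the full-width divide.
  return dividend / divisor;
}
```

The run-time check adds a compare and a branch to every rewritten division, which is a plausible reason for skipping the transformation when optimizing for size.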
-
Tom Stellard authored
llvm-svn: 176439
-
Jia Liu authored
llvm-svn: 176426
-
- Mar 02, 2013
-
-
Jim Grosbach authored
The VDUP instruction source register doesn't allow a non-constant lane index, so make sure we don't construct an ARM::VDUPLANE node asking it to do so. rdar://13328063 http://llvm.org/bugs/show_bug.cgi?id=13963 llvm-svn: 176413
-
Jim Grosbach authored
llvm-svn: 176412
-
Jim Grosbach authored
llvm-svn: 176411
-
Arnold Schwaighofer authored
Mark them as expand; they are not legal because our backend does not match them. llvm-svn: 176410
-
Nuno Lopes authored
This adds minimalistic support for PHI nodes to llvm.objectsize() evaluation. Fingers crossed that it doesn't break the clang bootstrap again. llvm-svn: 176408
-
Nuno Lopes authored
This is similar to getObjectSize(), but doesn't subtract the offset. Tweak the BasicAA code accordingly (per PR14988). llvm-svn: 176407
-
Arnold Schwaighofer authored
This matters, for example, in the following matrix multiply:

    int **mmult(int rows, int cols, int **m1, int **m2, int **m3) {
      int i, j, k, val;
      for (i=0; i<rows; i++) {
        for (j=0; j<cols; j++) {
          val = 0;
          for (k=0; k<cols; k++) {
            val += m1[i][k] * m2[k][j];
          }
          m3[i][j] = val;
        }
      }
      return(m3);
    }

Taken from the test-suite benchmark Shootout.

We estimate the cost of the multiply to be 2 while we generate 9 instructions for it and end up being quite a bit slower than the scalar version (48% on my machine).

Also, properly differentiate between avx1 and avx2. On avx-1 we still split the vector into two 128-bit halves and handle the subvector muls like above with 9 instructions. Only on avx-2 will we have a cost of 9 for v4i64.

I changed the test case in test/Transforms/LoopVectorize/X86/avx1.ll to use an add instead of a mul because with a mul we now no longer vectorize. I did verify that the mul would indeed be more expensive when vectorized with 3 kernels:

    for (i ...)
      r += a[i] * 3;
    for (i ...)
      m1[i] = m1[i] * 3; // This matches the test case in avx1.ll

and a matrix multiply. In each case the vectorized version was considerably slower.

radar://13304919
llvm-svn: 176403
-
Andrew Trick authored
llvm-svn: 176400
-
Nadav Rotem authored
The LoopVectorizer often runs multiple times on the same function due to inlining. When this happens the loop vectorizer often vectorizes the same loops multiple times, increasing code size and adding unneeded branches. With this patch, the vectorizer during vectorization puts metadata on scalar loops and marks them as 'already vectorized' so that it knows to ignore them when it sees them a second time. PR14448. llvm-svn: 176399
-
Peter Collingbourne authored
llvm-svn: 176397
-
Jordan Rose authored
Previously we relied on it being included by config-ix.cmake. llvm-svn: 176396
-
Michael Gottesman authored
This reverts commit aac7922b8fe7ae733d3fe6697e6789fd730315dc. I am reverting the commit since it broke the phase 1 public buildbot for a few hours. http://lab.llvm.org:8013/builders/clang-x86_64-darwin11-nobootstrap-RA/builds/2137 llvm-svn: 176394
-
Eli Bendersky authored
llvm-svn: 176391
-
Andrew Trick authored
Fix the way resources are counted. I'm taking some time to clean up the way MachineScheduler handles in-order machine resources. Eventually we'll need more PPC/Atom test cases in tree. llvm-svn: 176390
-
- Mar 01, 2013
-
-
Argyrios Kyrtzidis authored
The sys::fs::is_directory() check is unnecessary because, if the filename is a directory, the function will fail anyway with the same error code returned. Remove the check to avoid an unnecessary stat call. Someone needs to review on Windows and see if the check is necessary there or not. llvm-svn: 176386
-
Stefanus Du Toit authored
Checking to see if svn notifications also use correct address now. llvm-svn: 176385
-
Akira Hatanaka authored
This patch eliminates the need to emit a constant move instruction when this pattern is matched: (select (setgt a, Constant), T, F) The pattern above effectively turns into this: (conditional-move (setlt a, Constant + 1), F, T) llvm-svn: 176384
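The equivalence behind this rewrite can be checked with a small sketch. The two functions below are hypothetical stand-ins for the DAG patterns, written for 32-bit signed integers; the sketch assumes `Constant + 1` does not overflow, a condition the commit message does not spell out.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Models (select (setgt a, Constant), T, F).
int32_t selectGt(int32_t a, int32_t c, int32_t t, int32_t f) {
  return a > c ? t : f;
}

// Models (conditional-move (setlt a, Constant + 1), F, T): the comparison is
// inverted, so the two result operands swap places.
// Assumption: c + 1 must not overflow, otherwise the rewrite is invalid.
int32_t cmovLt(int32_t a, int32_t c, int32_t t, int32_t f) {
  assert(c < std::numeric_limits<int32_t>::max() && "C + 1 would overflow");
  return a < c + 1 ? f : t;
}
```

For integers, `a > C` is false exactly when `a < C + 1` is true, so both forms select the same value for every `a`.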
-
Jean-Luc Duprat authored
llvm-svn: 176382
-
Eli Bendersky authored
Also removed the comments of "should produce..." because they completely don't match the actually produced output. llvm-svn: 176381
-
Akira Hatanaka authored
llvm-svn: 176380
-
Akira Hatanaka authored
llvm-svn: 176378
-
Eli Bendersky authored
detail. The way this test was written, it was relying on an implementation detail (fixups) and hence was very brittle (relying, among other things, on the exact ordering of statistics printed by MC). The test was rewritten to check a more observable output difference. While it doesn't cover 100% of the things the original test covered, it's a good practice to write regression tests this way. If we want to check that internal details and invariants hold, such tests should be expressed as unit tests. llvm-svn: 176377
-
Edwin Vane authored
The make (all) target takes care of creating lit configs and auto-generating tests. The problem with the original 'lit.site.cfg' target is it's not recursive and doesn't fully create everything necessary for testing clang-tools-extra. llvm-svn: 176374
-
Michael Liao authored
- These tests won't crash on trunk but would be better to add them so that they don't break again in the future. llvm-svn: 176369
-
Chad Rosier authored
handle indirect register inputs. rdar://13322011 llvm-svn: 176367
-
Benjamin Kramer authored
Fixes PR15384. llvm-svn: 176366
-
Michael Ilseman authored
This reduces the time actually spent doing string to ID conversion and shows a 10% improvement in compile time for a particularly bad case that involves ARM Neon intrinsics (these have many overloads). Patch by Jean-Luc Duprat! llvm-svn: 176365
-
Michael Liao authored
- ISD::SHL/SRL/SRA must have either both scalar or both vector operands, but TLI.getShiftAmountTy() so far only returns a scalar type. As a result, backend logic assuming that breaks.
- Rename the original TLI.getShiftAmountTy() to TLI.getScalarShiftAmountTy() and re-define TLI.getShiftAmountTy() to return the target-specified scalar type or the same vector type as the 1st operand.
- Fix most TICG logic assuming TLI.getShiftAmountTy() is a simple scalar type.
llvm-svn: 176364
-
Chad Rosier authored
dispatch code. As far as I can tell the thumb2 code is behaving as expected. I was able to compile and run the associated test case for both arm and thumb1. rdar://13066352 llvm-svn: 176363
-
Christian Konig authored
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 176359
-
Jyotsna Verma authored
llvm-svn: 176358
-