- Mar 02, 2016
-
-
George Burgess IV authored
llvm-svn: 262452
-
Sanjay Patel authored
that is broken by this change llvm-svn: 262440
-
Sanjay Patel authored
As noted in the code comment, I don't think we can do the same transform for *vector* integer comparisons that we do for *scalar* integer comparisons because it might pessimize the general case. Exhibit A for an incomplete integer comparison ISA remains x86 SSE/AVX: it only has EQ and GT for integer vectors. But we should now recognize all the variants of this construct and produce the optimal code for the cases shown in: https://llvm.org/bugs/show_bug.cgi?id=26701 llvm-svn: 262424
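The ISA gap mentioned in the message is easy to see with the SSE2 intrinsics; a minimal sketch (not from the commit) of how the missing predicates must be synthesized from EQ and GT:

```cpp
#include <emmintrin.h>

// SSE2 integer vectors natively provide only equality (_mm_cmpeq_*) and
// signed greater-than (_mm_cmpgt_*); every other predicate is synthesized.
__m128i lt_epi32(__m128i a, __m128i b) {
  return _mm_cmpgt_epi32(b, a);               // a < b is b > a with operands swapped
}

__m128i ne_epi32(__m128i a, __m128i b) {
  return _mm_xor_si128(_mm_cmpeq_epi32(a, b), // invert the result of a == b
                       _mm_set1_epi32(-1));
}
```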
-
- Mar 01, 2016
-
-
Dehao Chen authored
Summary: SampleProfile pass needs to be performed after InstructionCombiningPass, which helps eliminate un-inlinable function calls. Reviewers: davidxl, dnovillo Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D17742 llvm-svn: 262419
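A minimal sketch of the ordering constraint, assuming the usual legacy pass manager setup; the factory functions are real, but the header locations are as I recall them, not taken from the patch:

```cpp
#include "llvm/IR/LegacyPassManager.h"
#include "llvm/Transforms/IPO.h"                      // createSampleProfileLoaderPass (assumed header)
#include "llvm/Transforms/InstCombine/InstCombine.h"  // createInstructionCombiningPass (assumed header)
using namespace llvm;

void addSampleProfilePasses(legacy::PassManager &PM) {
  PM.add(createInstructionCombiningPass()); // first eliminate un-inlinable calls
  PM.add(createSampleProfileLoaderPass());  // then run the sample profile loader
}
```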
-
Owen Anderson authored
Most portions of InstCombine properly propagate fast math flags, but apparently the vector scalarization section was overlooked. llvm-svn: 262376
-
Daniel Berlin authored
Summary: This adds the beginning of an update API to preserve MemorySSA. In particular, this patch adds a way to remove MemorySSA accesses when instructions are deleted. It also adds relevant unit testing infrastructure for MemorySSA's API. (There is an actual user of this API; I will make that diff dependent on this one. In practice, a ton of opt passes remove memory instructions, so it's hopefully an obviously useful API :P) Reviewers: hfinkel, reames, george.burgess.iv Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D17157 llvm-svn: 262362
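A minimal sketch of how a transform might use the update API when deleting a dead memory instruction; the entry point and header location are assumed from the description above, not copied from the patch:

```cpp
#include "llvm/Transforms/Utils/MemorySSA.h" // MemorySSA lived here at the time (assumed)
using namespace llvm;

// Keep MemorySSA consistent while erasing a memory instruction:
// remove its access first, then delete the IR.
static void eraseWithMemorySSA(MemorySSA &MSSA, Instruction &I) {
  if (MemoryAccess *MA = MSSA.getMemoryAccess(&I))
    MSSA.removeMemoryAccess(MA); // the removal API this patch adds (name assumed)
  I.eraseFromParent();
}
```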
-
Petar Jovanovic authored
Revert r262337 as "check-llvm ubsan" step failed on sanitizer-x86_64-linux-fast buildbot. llvm-svn: 262349
-
Petar Jovanovic authored
This patch fixes calculating the correct value for the builtin_object_size function when the pointer is used only in the builtin_object_size call and never after that. Patch by Strahinja Petrovic. Differential Revision: http://reviews.llvm.org/D17337 llvm-svn: 262337
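A made-up example of the pattern the patch targets, where the pointer's only use is the __builtin_object_size call itself:

```cpp
#include <cstddef>

// p is never used after the __builtin_object_size call; with the fix the
// call should still fold to the allocation size (16 here).
size_t known_size() {
  char buf[16];
  char *p = buf;
  return __builtin_object_size(p, 0);
}
```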
-
Sanjay Patel authored
Continuation of: http://reviews.llvm.org/rL262269 llvm-svn: 262273
-
Adam Nemet authored
llvm-svn: 262270
-
Sanjay Patel authored
The intended effect of this patch in conjunction with:
http://reviews.llvm.org/rL259392
http://reviews.llvm.org/rL260145
is that customers using the AVX intrinsics in C will benefit from combines when the load mask is constant:

__m128 mload_zeros(float *f) {
  return _mm_maskload_ps(f, _mm_set1_epi32(0));
}

__m128 mload_fakeones(float *f) {
  return _mm_maskload_ps(f, _mm_set1_epi32(1));
}

__m128 mload_ones(float *f) {
  return _mm_maskload_ps(f, _mm_set1_epi32(0x80000000));
}

__m128 mload_oneset(float *f) {
  return _mm_maskload_ps(f, _mm_set_epi32(0x80000000, 0, 0, 0));
}

...so none of the above will actually generate a masked load for optimized code. This is the masked load counterpart to:
http://reviews.llvm.org/rL262064

llvm-svn: 262269
-
- Feb 29, 2016
-
-
Adam Nemet authored
We can actually have dependences between accesses with different underlying types. Bail in this case. A test will follow shortly. llvm-svn: 262267
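An illustrative (made-up) example of such a dependence: the same bytes are accessed through pointers with different underlying types, so assuming independence from the type mismatch would be wrong:

```cpp
// Not from the commit; type-punned purely for illustration.
void f(int *p, int n) {
  float *q = reinterpret_cast<float *>(p); // same memory, different type
  for (int i = 0; i < n; ++i) {
    p[i] = 0;      // access with underlying type i32
    q[i] += 1.0f;  // access with underlying type float: a real dependence
  }
}
```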
-
Adam Nemet authored
Summary: I re-benchmarked this and results are similar to the original results in D13259:

On ARM64:
SingleSource/Benchmarks/Polybench/linear-algebra/solvers/dynprog -59.27%
SingleSource/Benchmarks/Polybench/stencils/adi -19.78%

On x86:
SingleSource/Benchmarks/Polybench/linear-algebra/solvers/dynprog -27.14%

And of course the original ~20% gain on SPECint_2006/456.hmmer with Loop Distribution.

In terms of compile time, there is a ~5% increase on both SingleSource/Benchmarks/Misc/oourafft and SingleSource/Benchmarks/Linpack/linpack-pc. These are both very tiny loop-intensive programs where SCEV computation dominates compile time.

The reason that time spent in SCEV increases has to do with the design of the old pass manager: if a transform pass does not preserve an analysis, we *invalidate* the analysis even if there was *no* modification made by the transform pass. This means that currently we don't take advantage of LLE and LV sharing the same analysis (LAA), and unfortunately we recompute LAA *and* SCEV for LLE.

(There should be a way to work around this limitation in the case of SCEV and LAA, since both compute things on demand and internally cache their results. Thus we could pretend that transform passes preserve these analyses and manually invalidate them upon actual modification. On the other hand, the new pass manager is supposed to solve this, so I am not sure if the workaround is worthwhile.)

Reviewers: hfinkel, dberlin Subscribers: dberlin, reames, mssimpso, aemerson, joker.eph, llvm-commits Differential Revision: http://reviews.llvm.org/D16300 llvm-svn: 262250
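A sketch of the legacy-pass-manager behavior described above, with a made-up pass name; unless an analysis is explicitly marked preserved, it is invalidated after the pass runs even if nothing was modified:

```cpp
#include "llvm/Analysis/ScalarEvolution.h"
#include "llvm/Pass.h"
using namespace llvm;

struct MyTransform : FunctionPass { // hypothetical pass, for illustration
  static char ID;
  MyTransform() : FunctionPass(ID) {}
  void getAnalysisUsage(AnalysisUsage &AU) const override {
    AU.addRequired<ScalarEvolutionWrapperPass>();
    // No AU.addPreserved<ScalarEvolutionWrapperPass>() here, so the old
    // pass manager recomputes SCEV for the next user -- modified or not.
  }
  bool runOnFunction(Function &F) override { return false; }
};
char MyTransform::ID = 0;
```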
-
Rong Xu authored
llvm-svn: 262242
-
Dehao Chen authored
Summary: Now discriminator is assigned per-function instead of per-module. Reviewers: davidxl, dnovillo Subscribers: dblaikie, llvm-commits Differential Revision: http://reviews.llvm.org/D17664 llvm-svn: 262240
-
- Feb 28, 2016
-
-
Xinliang David Li authored
Differential Revision: http://reviews.llvm.org/D17654 llvm-svn: 262157
-
- Feb 27, 2016
-
-
Renato Golin authored
This reverts commit r262103, as it broke all ARM and AArch64 bots. llvm-svn: 262139
-
Sean Silva authored
Summary: The PS4 linker seems to handle this fine. Hi David, it seems that indeed most ELF linkers support __{start,stop}_SECNAME, as our proprietary linker does as well. This follows the pattern of r250679 w.r.t. the testing. Maggie, Phillip, Paul: I've tested this with the PS4 SDK 3.5 toolchain prerelease and it seems to work fine. Reviewers: davidxl Subscribers: probinson, phillip.power, MaggieYi Differential Revision: http://reviews.llvm.org/D17672 llvm-svn: 262112
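For readers unfamiliar with the convention: for any section whose name is a valid C identifier, ELF linkers define `__start_SECNAME`/`__stop_SECNAME` symbols bracketing the section's contents. A small sketch with a made-up section name:

```cpp
// The linker defines these for a section named "payload" (example name):
extern char __start_payload[], __stop_payload[];

__attribute__((section("payload"))) static const char Tag[] = "hello";

static long payload_bytes() {
  return __stop_payload - __start_payload; // size of everything placed in the section
}
```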
-
Mike Aizatsky authored
llvm-svn: 262111
-
Kostya Serebryany authored
[libFuzzer] don't emit callbacks to sanitizer run-time in -fsanitize-coverage=trace-pc mode; update libFuzzer doc for previous commit llvm-svn: 262110
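In trace-pc mode the instrumentation emits a call to a single hook at every instrumented point, and libFuzzer (or the user) supplies the definition instead of linking the sanitizer run-time callbacks. A minimal sketch of such a hook; the body is illustrative, not libFuzzer's actual implementation:

```cpp
// Called by -fsanitize-coverage=trace-pc instrumentation at every
// instrumented point in the program.
extern "C" void __sanitizer_cov_trace_pc() {
  void *PC = __builtin_return_address(0); // the covered program counter
  (void)PC; // record PC in a coverage set/bitmap ...
}
```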
-
Chandler Carruth authored
merged into a loop that was subsequently unrolled (or otherwise nuked). In this case it can't merge in the ASTs for any remaining nested loops; it needs to re-add their instructions directly. The fix is very isolated, but I've pulled the code for merging blocks into the AST into a single place in the process. The only behavior change is in the case which would have crashed before. This fixes a crash reported by Mikael Holmen on the list after r261316 restored much of the loop pass pipelining and allowed us to actually do this kind of nested transformation sequence. I've taken that test case and further reduced it into the somewhat twisty maze of loops in the included test case. This does in fact trigger the bug even in this reduced form. llvm-svn: 262108
-
Mike Aizatsky authored
Summary: Without tree pruning clang has 2,667,552 points. With only dominators pruning: 1,515,586. With both dominators & predominators pruning: 1,340,534. Differential Revision: http://reviews.llvm.org/D17671 llvm-svn: 262103
-
Reid Kleckner authored
We ended up removing a save/restore pair around an inalloca call, leading to a miscompile in Chromium. llvm-svn: 262095
-
- Feb 26, 2016
-
-
Sanjay Patel authored
Replicate everything for integers...because x86. Continuation of: http://reviews.llvm.org/rL262064 llvm-svn: 262077
-
Sanjay Patel authored
The intended effect of this patch in conjunction with:
http://reviews.llvm.org/rL259392
http://reviews.llvm.org/rL260145
is that customers using the AVX intrinsics in C will benefit from combines when the store mask is constant:

void mstore_zero_mask(float *f, __m128 v) {
  _mm_maskstore_ps(f, _mm_set1_epi32(0), v);
}

void mstore_fake_ones_mask(float *f, __m128 v) {
  _mm_maskstore_ps(f, _mm_set1_epi32(1), v);
}

void mstore_ones_mask(float *f, __m128 v) {
  _mm_maskstore_ps(f, _mm_set1_epi32(0x80000000), v);
}

void mstore_one_set_elt_mask(float *f, __m128 v) {
  _mm_maskstore_ps(f, _mm_set_epi32(0x80000000, 0, 0, 0), v);
}

...so none of the above will actually generate a masked store for optimized code.

Differential Revision: http://reviews.llvm.org/D17485

llvm-svn: 262064
-
Haicheng Wu authored
This change tries to find more opportunities to thread over basic blocks. llvm-svn: 261981
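A classic shape that jump threading targets, as a made-up illustration; the second branch on `c` can be threaded from each predecessor where `c`'s value is already known:

```cpp
int f(bool c) {
  int x;
  if (c) x = 1; else x = 2;
  // Threading duplicates the block below into each predecessor, turning
  // the re-test of c into a direct jump.
  if (c) return x + 10; // only reachable with x == 1 after threading
  return x;             // only reachable with x == 2 after threading
}
```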
-
Michael Zolotukhin authored
Summary: Check that we're using SCEV for the same loop we're simulating. Otherwise, we might try to use the iteration number of the current loop in SCEV expressions for inner/outer loops IVs, which is clearly incorrect. Reviewers: chandlerc, hfinkel Subscribers: sanjoy, llvm-commits, mzolotukhin Differential Revision: http://reviews.llvm.org/D17632 llvm-svn: 261958
-
Mike Aizatsky authored
Summary: This is the first simple attempt to reduce the number of coverage-instrumented blocks. If a basic block dominates all its successors, then its coverage information is useless to us. Ignore such blocks if the sanitizer-coverage-prune-tree option is set. Differential Revision: http://reviews.llvm.org/D17626 llvm-svn: 261949
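An illustration of the pruning rule on a trivial function; the entry block dominates everything below it, so its own coverage point is redundant:

```cpp
// Made-up example: covering either block below already proves the entry
// block ran, so instrumenting the entry adds no information.
void f(bool c) {
  // entry block: dominates all its successors -> prunable
  if (c) {
    // conditionally executed block: still instrumented
  }
  // join block: still instrumented
}
```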
-
- Feb 24, 2016
-
-
Anna Zaks authored
llvm-svn: 261794
-
David Majnemer authored
The cleanupret instruction has an invariant that its 'from' operand be a cleanuppad. This invariant was violated when we removed a dead block which removed a cleanuppad, leaving behind a cleanupret with an undef 'from' operand. This was solved in r261731 by staving off the removal of the dead block to a later pass. However, it occurred to me that we do not need to do this. Instead, we can simply avoid processing the cleanupret if it has an undef 'from' operand, because we know that it will be removed soon. llvm-svn: 261754
-
Sanjay Patel authored
This is part of the payoff for the refactoring in: http://reviews.llvm.org/rL261649 http://reviews.llvm.org/rL261707 In addition to removing a pile of duplicated code, the xor case was missing the optimization for vector types because it checked "SrcTy->isIntegerTy()" rather than "SrcTy->isIntOrIntVectorTy()" like 'and' and 'or' were already doing. This solves part of: https://llvm.org/bugs/show_bug.cgi?id=26702 llvm-svn: 261750
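The two `llvm::Type` predicates the commit contrasts, in a sketch (not the patch itself):

```cpp
#include "llvm/IR/Type.h"
using namespace llvm;

bool scalarOnly(Type *SrcTy) {
  return SrcTy->isIntegerTy();        // true for i32, false for <4 x i32>
}
bool scalarOrVector(Type *SrcTy) {
  return SrcTy->isIntOrIntVectorTy(); // true for both i32 and <4 x i32>
}
```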
-
Artur Pilipenko authored
This is a part of the refactoring to unify the isSafeToLoadUnconditionally and isDereferenceablePointer functions. In a subsequent change I'm going to eliminate isDereferenceableAndAlignedPointer from the Loads API, leaving isSafeToLoadSpeculatively as the only function to check whether a load instruction can be speculated. Reviewed By: hfinkel Differential Revision: http://reviews.llvm.org/D16180 llvm-svn: 261736
-
David Majnemer authored
DeleteDeadBlock was called indiscriminately, leading to cleanuprets with undef cleanuppad references. Instead, try to drain the BB of most of its instructions if it is unreachable. We can then remove the BB if it solely consists of a terminator (and maybe some phis). llvm-svn: 261731
-
Sanjay Patel authored
Note: The 'and' case in foldCastedBitwiseLogic() is inheriting one extra check from the nearly identical 'or' case: if (!isa<ICmpInst>(Cast0Src) || !isa<ICmpInst>(Cast1Src)) But I'm not sure how to expose that difference in a regression test. Without that check, the 'or' path will infinite loop on: test/Transforms/InstCombine/zext-or-icmp.ll because the zext-or-icmp fold is attempting a reverse transform. The refactoring should extend to the 'xor' case next to solve part of PR26702. llvm-svn: 261707
-
- Feb 23, 2016
-
-
Sanjay Patel authored
Less indenting, named local variables, more descriptive names. llvm-svn: 261659
-
David Majnemer authored
It is problematic if the inlinee has a cleanupret which unwinds to caller and we inline it into a call site which doesn't unwind. If the funclet unwinds anywhere other than to the caller, then we will give the funclet two unwind destinations. This will result in a verifier failure. Seeing as how the caller wasn't an invoke (which would locally unwind) and that the funclet cannot unwind to caller, we must conclude that an 'unwind to caller' cleanupret is dynamically unreachable. This fixes PR26698. Differential Revision: http://reviews.llvm.org/D17536 llvm-svn: 261656
-
Sanjay Patel authored
llvm-svn: 261652
-
Sanjay Patel authored
This is a straight cut and paste of the existing code and is intended to be the first step in solving part of PR26702: https://llvm.org/bugs/show_bug.cgi?id=26702 We should be able to reuse most of this and delete the nearly identical existing code in visitOr(). Then, we can enhance visitXor() to use the same code too. llvm-svn: 261649
-
Michael Zolotukhin authored
llvm-svn: 261600
-
Michael Zolotukhin authored
llvm-svn: 261597
-