- Jan 22, 2015
-
-
Sanjoy Das authored
ScalarEvolution currently lowers a subtraction recurrence to an add recurrence with the same no-wrap flags as the subtraction. This is incorrect because `sub nsw X, Y` is not the same as `add nsw X, -Y` and `sub nuw X, Y` is not the same as `add nuw X, -Y`. This patch fixes the issue, and adds two test cases demonstrating the bug. Differential Revision: http://reviews.llvm.org/D7081 llvm-svn: 226755
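To make the flag mismatch concrete, here is a minimal sketch that models the no-wrap checks on 32-bit values in plain C++. The helper names are hypothetical and are not ScalarEvolution APIs; the point is only that a subtraction can be wrap-free while the equivalent "add of the negation" wraps.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// True if the exact result of a - b fits in int32_t (models `sub nsw`).
bool subNoSignedWrap(int32_t a, int32_t b) {
  int64_t exact = static_cast<int64_t>(a) - static_cast<int64_t>(b);
  return exact >= std::numeric_limits<int32_t>::min() &&
         exact <= std::numeric_limits<int32_t>::max();
}

// True if the exact result of a + b fits in int32_t (models `add nsw`).
bool addNoSignedWrap(int32_t a, int32_t b) {
  int64_t exact = static_cast<int64_t>(a) + static_cast<int64_t>(b);
  return exact >= std::numeric_limits<int32_t>::min() &&
         exact <= std::numeric_limits<int32_t>::max();
}

// Two's complement negation computed modulo 2^32, so INT32_MIN maps to
// itself. The final conversion assumes the usual two's complement behaviour.
int32_t wrappingNeg(int32_t v) {
  return static_cast<int32_t>(0u - static_cast<uint32_t>(v));
}

// Models `sub nuw`: no borrow means a >= b.
bool subNoUnsignedWrap(uint32_t a, uint32_t b) { return a >= b; }

// Models `add nuw`: a wrapped sum is smaller than either operand.
bool addNoUnsignedWrap(uint32_t a, uint32_t b) {
  return static_cast<uint32_t>(a + b) >= a;
}

// Counterexamples: the subtraction does not wrap, the rewritten add does.
void checkCounterexamples() {
  const int32_t minVal = std::numeric_limits<int32_t>::min();
  assert(subNoSignedWrap(-1, minVal));                // -1 - INT32_MIN == INT32_MAX
  assert(!addNoSignedWrap(-1, wrappingNeg(minVal)));  // -1 + INT32_MIN wraps
  assert(subNoUnsignedWrap(5u, 3u));                  // 5 - 3 == 2
  assert(!addNoUnsignedWrap(5u, 0u - 3u));            // 5 + 0xFFFFFFFD wraps
}
```

Any single counterexample is enough to show that copying the flags across the rewrite is unsound.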
-
Paul Robinson authored
llvm-svn: 226750
-
Adrian Prantl authored
llvm-svn: 226748
-
Adrian Prantl authored
Addresses review comments for r226627. llvm-svn: 226747
-
Chandler Carruth authored
function. This is a bit tidier anyways and will make a subsequent patch simpler as I want to add another case to this combine. llvm-svn: 226746
-
Simon Pilgrim authored
Added most of the missing integer vector folding patterns for SSE (to SSE42) and AVX1. The most useful of these are probably the i32/i64 extraction, i8/i16/i32/i64 insertions, zero/sign extension, unsigned saturation subtractions, i64 subtractions and the variable mask blends (pblendvb) - others include CLMUL, SSE42 string comparisons and bit tests. Differential Revision: http://reviews.llvm.org/D7094 llvm-svn: 226745
-
Tim Northover authored
It can help with argument juggling on some targets, and is generally a good idea. llvm-svn: 226740
-
- Jan 21, 2015
-
-
David Blaikie authored
When two calls from the same MDLocation are inlined they currently get treated as one inlined function call (creating difficulty debugging, duplicate variables, etc). Clang worked around this by including column information on inline calls, which doesn't address LTO inlining or calls to the same function from the same line and column (such as through a macro). It also didn't address ctor and member function calls. By making the inlinedAt locations distinct, every call site has an explicitly distinct location that cannot be coalesced with any other call. This can produce linearly (2x in the worst case where every call is inlined and the call instruction has a non-call instruction at the same location) more debug locations. Any increase beyond that is in cases where the Clang workaround was insufficient and the new scheme is creating necessary distinct nodes that were being erroneously coalesced previously. After this change to LLVM, the incomplete workarounds in Clang can be removed. That should reduce the number of debug locations (in a build without column info, the default on Darwin, not the default on Linux) by not creating pseudo-distinct locations for every call to an inline function. (oh, and I made the inlined-at chain rebuilding iterative instead of recursive because I was having trouble wrapping my head around it the way it was - open to discussion on the right design for that function (including going back to a recursive solution)) llvm-svn: 226736
-
Matt Arsenault authored
Make sure this uses the faster expansion using magic constants to avoid the full division path. llvm-svn: 226734
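The message does not say which operation is being expanded, so the following is only a generic illustration of the magic-constant technique: an unsigned 32-bit division by the constant 3 replaced by a multiply and a shift. The function names and the choice of divisor are assumptions made for the example.

```cpp
#include <cassert>
#include <cstdint>

// Unsigned divide-by-3 without a divide instruction.
// 0xAAAAAAAB is ceil(2^33 / 3), so (x * magic) >> 33 equals x / 3
// for every 32-bit x; the product always fits in 64 bits.
uint32_t divideBy3(uint32_t x) {
  const uint64_t magic = 0xAAAAAAABull;
  return static_cast<uint32_t>((static_cast<uint64_t>(x) * magic) >> 33);
}

// Spot-check the expansion against the ordinary division operator.
void checkDivideBy3Samples() {
  const uint32_t samples[] = {0u, 1u, 2u, 3u, 10u, 0x7FFFFFFFu, 0xFFFFFFFFu};
  for (uint32_t x : samples)
    assert(divideBy3(x) == x / 3u);
}
```

The trade is one widening multiply and a shift in place of a full division sequence, which is why a backend prefers it whenever the divisor is a known constant.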
-
Matthias Braun authored
LiveIntervalAnalysis: Mark subregister defs as undef when we determined they are only reading a dead superregister value. This was not necessary before as this case can only be detected when the liveness analysis is at subregister level. llvm-svn: 226733
-
Chris Bieneman authored
Summary: cl::getRegisteredOptions really exposes some of the innards of how command line parsing is implemented. Exposing new APIs that allow us to disentangle client code from implementation details will allow us to make more extensive changes to command line parsing. Reviewers: chandlerc, dexonsmith, beanz Reviewed By: dexonsmith Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D7100 llvm-svn: 226729
-
Simon Pilgrim authored
This patch adds shuffle matching for the SSE3 MOVDDUP, MOVSLDUP and MOVSHDUP instructions. The big use of these is that they avoid many single source shuffles from needing to use (pre-AVX) dual source instructions such as SHUFPD/SHUFPS, which cause extra moves and prevent load folds. Adding these instructions uncovered an issue in XFormVExtractWithShuffleIntoLoad which crashed on single operand shuffle instructions (now fixed). It also involved fixing getTargetShuffleMask to correctly identify these instructions as unary shuffles. Also adds a missing tablegen pattern for MOVDDUP. Differential Revision: http://reviews.llvm.org/D7042 llvm-svn: 226716
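As a reminder of what these three instructions compute, the sketch below models each one as a single-source (unary) shuffle with a fixed mask on 128-bit vectors. The `shuffle` helper and the function names are hypothetical; this is a model of the instruction semantics, not the matching code from the patch.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Generic unary shuffle: out[i] = src[mask[i]].
template <typename T, std::size_t N>
std::array<T, N> shuffle(const std::array<T, N>& src,
                         const std::array<std::size_t, N>& mask) {
  std::array<T, N> out{};
  for (std::size_t i = 0; i < N; ++i) {
    assert(mask[i] < N && "mask index must select a source lane");
    out[i] = src[mask[i]];
  }
  return out;
}

// MOVDDUP: duplicate the low double, mask <0, 0>.
std::array<double, 2> movddup(const std::array<double, 2>& v) {
  return shuffle(v, std::array<std::size_t, 2>{0, 0});
}

// MOVSLDUP: duplicate the even floats, mask <0, 0, 2, 2>.
std::array<float, 4> movsldup(const std::array<float, 4>& v) {
  return shuffle(v, std::array<std::size_t, 4>{0, 0, 2, 2});
}

// MOVSHDUP: duplicate the odd floats, mask <1, 1, 3, 3>.
std::array<float, 4> movshdup(const std::array<float, 4>& v) {
  return shuffle(v, std::array<std::size_t, 4>{1, 1, 3, 3});
}
```

Because every mask index refers to the one input vector, none of these needs a second source register, which is the property the shuffle matching exploits.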
-
Matt Arsenault authored
llvm-svn: 226713
-
Jon Roelofs authored
Thumbv4t does not have lo->lo copies other than MOVS, and that can't be predicated. So emit MOVS when needed and bail if there's a predicate. http://reviews.llvm.org/D6592 llvm-svn: 226711
-
George Burgess IV authored
llvm-svn: 226710
-
David Majnemer authored
The return type of a thunk is meaningless, we just want the arguments and return value to be forwarded. llvm-svn: 226708
-
Simon Pilgrim authored
Patch to provide shuffle decodes and asm comments for the SSE3/AVX1 movddup double duplication instructions. llvm-svn: 226705
-
Adrian Prantl authored
llvm-svn: 226701
-
Adrian Prantl authored
llvm-svn: 226694
-
Matthias Braun authored
This cleans up code and is more in line with the general philosophy of modifying LiveIntervals through LiveIntervalAnalysis instead of changing them directly. This also fixes a case where SplitEditor::removeBackCopies() would miss the subregister ranges. llvm-svn: 226690
-
Matthias Braun authored
llvm-svn: 226689
-
Matthias Braun authored
This cleans up code and is more in line with the general philosophy of modifying LiveIntervals through LiveIntervalAnalysis instead of changing them directly. llvm-svn: 226687
-
Matthias Braun authored
llvm-svn: 226686
-
Adrian Prantl authored
verification. Tested via a unit test. Follow-up to r226616. llvm-svn: 226684
-
Matt Arsenault authored
This fixes it for SI. It also removes the pattern used previously for Evergreen for f32. I'm not sure if the new R600 output is better or not, but it uses one fewer instruction if BFI is available. llvm-svn: 226682
-
Colin LeMahieu authored
llvm-svn: 226681
-
Ahmed Bougacha authored
Now that we can fully specify extload legality, we can declare them legal for the PMOVSX/PMOVZX instructions. This for instance enables a DAGCombine to fire on code such as (and (<zextload-equivalent> ...), <redundant mask>) to turn it into: (zextload ...) as seen in the testcase changes. There is one regression, in widen_load-2.ll: we're no longer able to do store-to-load forwarding with illegal extload memory types. This will be addressed separately. Differential Revision: http://reviews.llvm.org/D6533 llvm-svn: 226676
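The combine mentioned above rests on a simple fact: once a load has zero-extended a narrow value, masking off the high bits again changes nothing. A scalar sketch of that equivalence follows (the patch itself concerns vector PMOVZX loads; the function names here are hypothetical).

```cpp
#include <cassert>
#include <cstdint>

// Models a zero-extending load: read 8 bits, widen to 32 with zero high bits.
uint32_t zextLoad8(const uint8_t* p) {
  assert(p != nullptr);
  return static_cast<uint32_t>(*p);
}

// Before the combine: (and (zextload ...), 0xFF). The mask is redundant
// because the high 24 bits are already known to be zero.
uint32_t zextLoadThenMask(const uint8_t* p) {
  return zextLoad8(p) & 0xFFu;
}

// Exhaustively confirm that dropping the mask preserves the result.
bool maskIsRedundant() {
  for (unsigned v = 0; v <= 0xFFu; ++v) {
    const uint8_t byte = static_cast<uint8_t>(v);
    if (zextLoadThenMask(&byte) != zextLoad8(&byte))
      return false;
  }
  return true;
}
```

The combine can only recognize this once the extending load is a legal node in its own right, which is what declaring the extloads legal provides.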
-
Eric Fiselier authored
llvm-svn: 226672
-
George Burgess IV authored
llvm-svn: 226671
-
Yaron Keren authored
llvm-svn: 226669
-
Tim Northover authored
It hadn't gone through review yet, but was still on my local copy. This reverts commit r226663 llvm-svn: 226665
-
Tim Northover authored
AAPCS64 says that it's up to the platform to specify whether x18 is reserved, and a first step on that way is to add a flag controlling it. From: Andrew Turner <andrew@fubar.geek.nz> llvm-svn: 226664
-
Tim Northover authored
llvm-svn: 226663
-
Michael Kuperstein authored
llvm-svn: 226661
-
Alexander Potapenko authored
When opt is compiled with AddressSanitizer it takes more than 30 seconds to unroll the loop in unroll_1M(). llvm-svn: 226660
-
Evgeniy Stepanov authored
Previously we always stored 4 bytes of origin at the destination address even for 8-byte (and longer) stores. This should fix rare missing, or incorrect, origin stacks in MSan reports. llvm-svn: 226658
-
Jozef Kolek authored
Implement microMIPS 16-bit unconditional branch instruction B. The implemented 16-bit microMIPS unconditional branch instruction is named B16, and B is an alias which expands to either B16 or BEQ according to the rules:
  b 256 --> b16 256 # R_MICROMIPS_PC10_S1
  b 12256 --> beq $zero, $zero, 12256 # R_MICROMIPS_PC16_S1
  b label --> beq $zero, $zero, label # R_MICROMIPS_PC16_S1
Differential Revision: http://reviews.llvm.org/D3514 llvm-svn: 226657
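The expansion rules can be sketched as a small selection function. This is not the assembler's code: the function names are hypothetical, and the range check assumes that the 10-bit, shift-by-one field behind R_MICROMIPS_PC10_S1 covers a signed 11-bit byte offset, roughly -1024 to 1022.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Assumption: B16 encodes a signed 10-bit field scaled by 2, so the byte
// offset must fit in a signed 11-bit range.
bool fitsB16(int64_t offset) {
  return offset >= -1024 && offset <= 1023;
}

// Expand `b <immediate>`: use the 16-bit form when the offset fits,
// otherwise fall back to the 32-bit BEQ with both operands $zero.
std::string expandBranchImmediate(int64_t offset) {
  assert(offset % 2 == 0 && "microMIPS branch targets are 2-byte aligned");
  if (fitsB16(offset))
    return "b16 " + std::to_string(offset);
  return "beq $zero, $zero, " + std::to_string(offset);
}

// Expand `b <label>`: the distance is unknown at this point, so the rules
// above always pick the wider BEQ form.
std::string expandBranchLabel(const std::string& label) {
  return "beq $zero, $zero, " + label;
}
```

Choosing BEQ for labels is the conservative option, since the final distance is not known until layout and relocation.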
-
Jozef Kolek authored
Differential Revision: http://reviews.llvm.org/D6582 llvm-svn: 226656
-
Chandler Carruth authored
Because in its primary function pass the combiner is run repeatedly over the same function until doing so produces no changes, it is essential not to re-allocate the worklist. However, as a utility, the more common pattern would be to put a limited set of instructions in the worklist rather than the entire function body. That is also the more likely pattern when used by the new pass manager. The result is a very lightweight combiner that does the visiting with a separable worklist. This can then be wrapped up in a helper function for users that want a combiner utility, or as I have here it can be wrapped up in a pass which manages the iterations used when combining an entire function's instructions. Hopefully this removes some of the worst of the interface warts that became apparent with the last patch here. However, there is clearly more work. I've again left some FIXMEs for the most egregious. The ones that stick out to me are the exposure of the worklist and IR builder as public members, and the use of pointers rather than references. However, fixing these is likely to be much more mechanical and less interesting so I didn't want to touch them in this patch. llvm-svn: 226655
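The separation described here can be sketched with a toy model: a core routine that only drains a caller-supplied worklist, a utility wrapper that seeds it with a few items, and a pass-style wrapper that reuses one worklist across iterations. All names are hypothetical, the "instructions" are plain integers, and the "combine" (halving even values) is a stand-in for a real transformation.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <vector>

// Core combiner: visit only what is on the worklist; re-queue changed items.
bool combineWorklist(std::vector<int>& values, std::deque<std::size_t>& worklist) {
  bool changed = false;
  while (!worklist.empty()) {
    const std::size_t i = worklist.front();
    worklist.pop_front();
    assert(i < values.size());
    if (values[i] != 0 && values[i] % 2 == 0) {  // toy "combine"
      values[i] /= 2;
      worklist.push_back(i);  // the changed item may combine further
      changed = true;
    }
  }
  return changed;
}

// Utility use: combine a limited set of items, not the whole body.
bool combineSubset(std::vector<int>& values, const std::vector<std::size_t>& subset) {
  std::deque<std::size_t> worklist(subset.begin(), subset.end());
  return combineWorklist(values, worklist);
}

// Pass-style use: seed everything and iterate until nothing changes,
// keeping a single worklist object alive across iterations.
unsigned combineWholeFunction(std::vector<int>& values) {
  std::deque<std::size_t> worklist;  // allocated once, reseeded each round
  unsigned iterations = 0;
  bool changed = false;
  do {
    for (std::size_t i = 0; i < values.size(); ++i)
      worklist.push_back(i);
    changed = combineWorklist(values, worklist);
    ++iterations;
  } while (changed);
  return iterations;
}
```

Keeping the iteration policy in the wrapper leaves the core routine with no knowledge of whether it is serving a whole-function pass or a one-off utility call.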
-
Chandler Carruth authored
SimplifyLibCalls utility by sinking it into the specific call part of the combiner. This will avoid us needing to do any contortions to build this object in a subsequent refactoring I'm doing and seems generally better factored. We don't need this utility everywhere and it carries no interesting state so we might as well build it on demand. llvm-svn: 226654
-