Commits · 5156c3857066396e744137cdadf94fb18fa29da2 · Lorenzo Albano / LLVM bpEVL

Jan 22, 2015

Make ScalarEvolution less aggressive with respect to no-wrap flags. · cb473663

Sanjoy Das authored Jan 22, 2015

ScalarEvolution currently lowers a subtraction recurrence to an add
recurrence with the same no-wrap flags as the subtraction.  This is
incorrect because `sub nsw X, Y` is not the same as `add nsw X, -Y`
and `sub nuw X, Y` is not the same as `add nuw X, -Y`.  This patch
fixes the issue, and adds two test cases demonstrating the bug.

Differential Revision: http://reviews.llvm.org/D7081

llvm-svn: 226755

cb473663
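
The mismatch described in this commit can be reproduced with 8-bit arithmetic. The following is a minimal sketch, not code from the patch: the helper names are hypothetical, and `int8_t`/`uint8_t` stand in for an LLVM `i8`. It checks whether the exact result of each operation fits in the type, which is what the `nsw` and `nuw` flags promise.

```cpp
#include <cassert>
#include <cstdint>

// Two's-complement negation modulo 2^8, i.e. what `-Y` means for an i8.
// The uint8_t -> int8_t conversion is modular (guaranteed since C++20).
int8_t negateI8(int8_t v) {
  return static_cast<int8_t>(
      static_cast<uint8_t>(0u - static_cast<uint8_t>(v)));
}

// `sub nsw a, b` is only well defined if the exact difference fits in i8.
bool subHasNoSignedWrap(int8_t a, int8_t b) {
  int exact = int(a) - int(b);
  return exact >= INT8_MIN && exact <= INT8_MAX;
}

// `add nsw a, b` is only well defined if the exact sum fits in i8.
bool addHasNoSignedWrap(int8_t a, int8_t b) {
  int exact = int(a) + int(b);
  return exact >= INT8_MIN && exact <= INT8_MAX;
}

// `sub nuw a, b` requires that the subtraction does not borrow.
bool subHasNoUnsignedWrap(uint8_t a, uint8_t b) { return a >= b; }

// `add nuw a, b` requires that the sum does not carry out of 8 bits.
bool addHasNoUnsignedWrap(uint8_t a, uint8_t b) {
  return unsigned(a) + unsigned(b) <= UINT8_MAX;
}
```

With X = -1 and Y = INT8_MIN, `X - Y` is 127 and does not wrap, but `-Y` wraps back to INT8_MIN, so `X + (-Y)` would be -129 and does wrap. Likewise, 5 - 3 has no unsigned wrap, while 5 + 253 (the 8-bit value of -3) does.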

Explicitly describe '///' versus '//' comment delimiters. · 343e4964
Paul Robinson authored Jan 22, 2015
```
llvm-svn: 226750
```
343e4964
Make DwarfExpression use the new DIExpressionIterator. NFC. · 531641a0
Adrian Prantl authored Jan 22, 2015
```
llvm-svn: 226748
```
531641a0
Rewrite DIExpression::Verify() using an iterator. NFC. · 9260ccae
Adrian Prantl authored Jan 22, 2015
```
Addresses review comments for r226627.

llvm-svn: 226747
```
9260ccae

[canonicalization] Refactor how we create new stores into a helper · 2135b97d

Chandler Carruth authored Jan 21, 2015

function. This is a bit tidier anyways and will make a subsequent patch
simpler as I want to add another case to this combine.

llvm-svn: 226746

2135b97d

[X86][SSE] Missing SSE/AVX1 memory folding integer instructions · 5fa0fb23

Simon Pilgrim authored Jan 21, 2015

Added most of the missing integer vector folding patterns for SSE (to SSE42) and AVX1.

The most useful of these are probably the i32/i64 extraction, i8/i16/i32/i64 insertions, zero/sign extension, unsigned saturation subtractions, i64 subtractions and the variable mask blends (pblendvb) - others include CLMUL, SSE42 string comparisons and bit tests.

Differential Revision: http://reviews.llvm.org/D7094

llvm-svn: 226745

5fa0fb23

DAGCombine: fold (or (and X, M), (and X, N)) -> (and X, (or M, N)) · 3007ba0a
Tim Northover authored Jan 21, 2015
```
It can help with argument juggling on some targets, and is generally a good
idea.

llvm-svn: 226740
```
3007ba0a
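
The fold in this commit relies on AND distributing over OR. As a sanity check only (this is not code from the patch, and the function name is made up for illustration), the identity can be verified exhaustively for 8-bit operands:

```cpp
#include <cassert>

// Checks (x & m) | (x & n) == x & (m | n) for every 8-bit x, m and n.
bool orOfAndsFoldHolds() {
  for (unsigned x = 0; x < 256; ++x)
    for (unsigned m = 0; m < 256; ++m)
      for (unsigned n = 0; n < 256; ++n)
        if (((x & m) | (x & n)) != (x & (m | n)))
          return false;
  return true;
}
```

The identity is bitwise, so the 8-bit result carries over to wider integer types.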

Jan 21, 2015

DebugInfo: Use distinct inlinedAt MDLocations to avoid separate inlined calls being coalesced · df706288

David Blaikie authored Jan 21, 2015

When two calls from the same MDLocation are inlined they currently get
treated as one inlined function call (creating difficulty debugging,
duplicate variables, etc).

Clang worked around this by including column information on inline calls
which doesn't address LTO inlining or calls to the same function from
the same line and column (such as through a macro). It also didn't
address ctor and member function calls.

By making the inlinedAt locations distinct, every call site has an
explicitly distinct location that cannot be coalesced with any other
call.

This can produce linearly (2x in the worst case where every call is
inlined and the call instruction has a non-call instruction at the same
location) more debug locations. Any increase beyond that is in cases
where the Clang workaround was insufficient and the new scheme is
creating necessary distinct nodes that were being erroneously coalesced
previously.

After this change to LLVM, the incomplete workarounds in Clang can be removed. That
should reduce the number of debug locations (in a build without column
info, the default on Darwin, not the default on Linux) by not creating
pseudo-distinct locations for every call to an inline function.

(oh, and I made the inlined-at chain rebuilding iterative instead of
recursive because I was having trouble wrapping my head around it the
way it was - open to discussion on the right design for that function
(including going back to a recursive solution))

llvm-svn: 226736

df706288
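
The difference between uniqued and distinct locations can be pictured with a toy model. This sketch does not use LLVM's metadata classes; `ToyLocation` and `ToyContext` are hypothetical names. It only shows why two calls on the same line collapse into one node when nodes are uniqued by content, and stay separate when each request allocates a fresh node.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <utility>
#include <vector>

struct ToyLocation {
  unsigned line;
  unsigned column;
};

class ToyContext {
  // Uniqued nodes: one node per (line, column) pair.
  std::map<std::pair<unsigned, unsigned>, std::unique_ptr<ToyLocation>> Uniqued;
  // Distinct nodes: every request gets its own node.
  std::vector<std::unique_ptr<ToyLocation>> Distinct;

public:
  const ToyLocation *getUniqued(unsigned line, unsigned column) {
    auto &slot = Uniqued[{line, column}];
    if (!slot)
      slot = std::make_unique<ToyLocation>(ToyLocation{line, column});
    return slot.get();
  }

  const ToyLocation *getDistinct(unsigned line, unsigned column) {
    Distinct.push_back(
        std::make_unique<ToyLocation>(ToyLocation{line, column}));
    return Distinct.back().get();
  }
};
```

In this model, two inlined calls that ask for a uniqued location at line 42 receive the same pointer and cannot be told apart, whereas `getDistinct` hands each call site its own node.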

R600: Add checks for urem/srem by a constant · b45c78bc

Matt Arsenault authored Jan 21, 2015

Make sure this uses the faster expansion using magic constants
to avoid the full division path.

llvm-svn: 226734

b45c78bc
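
For readers unfamiliar with the "magic constants" expansion, here is a generic sketch of the technique for an unsigned remainder by 10. It is not the code the R600 backend emits; the function names are hypothetical, and the constant 0xCCCCCCCD is the usual ceil(2^35 / 10) multiplier for 32-bit unsigned division by 10.

```cpp
#include <cassert>
#include <cstdint>

// Quotient x / 10 computed with a multiply and a shift instead of a divide.
uint32_t udiv10(uint32_t x) {
  return static_cast<uint32_t>(
      (static_cast<uint64_t>(x) * 0xCCCCCCCDull) >> 35);
}

// Remainder x % 10 derived from the quotient: x - (x / 10) * 10.
uint32_t urem10(uint32_t x) { return x - udiv10(x) * 10u; }
```

The remainder costs one extra multiply and subtract on top of the quotient, which is still far cheaper than a full division sequence on targets without a hardware divider.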

LiveIntervalAnalysis: Mark subregister defs as undef when we determined they... · c1988f38

Matthias Braun authored Jan 21, 2015

LiveIntervalAnalysis: Mark subregister defs as undef when we determined they are only reading a dead superregister value

This was not necessary before as this case can only be detected when the
liveness analysis is at subregister level.

llvm-svn: 226733

c1988f38

Adding a new cl::HideUnrelatedOptions API to allow clang to migrate off cl::getRegisteredOptions. · 9e13af7a

Chris Bieneman authored Jan 21, 2015

Summary: cl::getRegisteredOptions really exposes some of the innards of how command line parsing is implemented. Exposing new APIs that allow us to disentangle client code from implementation details will allow us to make more extensive changes to command line parsing.

Reviewers: chandlerc, dexonsmith, beanz

Reviewed By: dexonsmith

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D7100

llvm-svn: 226729

9e13af7a

[X86][SSE] Added support for SSE3 lane duplication shuffle instructions · b16b09b1

Simon Pilgrim authored Jan 21, 2015

This patch adds shuffle matching for the SSE3 MOVDDUP, MOVSLDUP and MOVSHDUP instructions. Their main benefit is that many single source shuffles no longer need (pre-AVX) dual source instructions such as SHUFPD/SHUFPS, which cause extra moves and prevent load folds.

Adding these instructions uncovered an issue in XFormVExtractWithShuffleIntoLoad which crashed on single operand shuffle instructions (now fixed). It also involved fixing getTargetShuffleMask to correctly identify these instructions as unary shuffles.

Also adds a missing tablegen pattern for MOVDDUP.

Differential Revision: http://reviews.llvm.org/D7042

llvm-svn: 226716

b16b09b1
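
As a rough illustration of what matching these instructions involves, the sketch below maps a single-source shuffle mask to the instruction that implements it. It is not the backend's matcher: the function name is hypothetical, and it only handles the 128-bit masks `<0,0>` (v2f64), `<0,0,2,2>` and `<1,1,3,3>` (v4f32).

```cpp
#include <cassert>
#include <string>
#include <vector>

// Returns the SSE3 lane-duplication instruction for a unary shuffle mask,
// or an empty string if the mask is not one of the three patterns.
std::string matchLaneDup(const std::vector<int> &mask) {
  if (mask == std::vector<int>{0, 0})
    return "MOVDDUP";  // duplicate the low double
  if (mask == std::vector<int>{0, 0, 2, 2})
    return "MOVSLDUP"; // duplicate the even floats
  if (mask == std::vector<int>{1, 1, 3, 3})
    return "MOVSHDUP"; // duplicate the odd floats
  return "";
}
```

Because each pattern reads only one source, a match avoids tying up a second register the way SHUFPD/SHUFPS would.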

R600: Add missing tests for i64 srem · d9987c7b
Matt Arsenault authored Jan 21, 2015
```
llvm-svn: 226713
```
d9987c7b

Fix load-store optimizer on thumbv4t · 229eb4ca

Jon Roelofs authored Jan 21, 2015

Thumbv4t does not have lo->lo copies other than MOVS,
and that can't be predicated. So emit MOVS when needed
and bail if there's a predicate.

http://reviews.llvm.org/D6592

llvm-svn: 226711

229eb4ca

Added test to cover the CFLAA bitset indexing bug. · a1255d3a
George Burgess IV authored Jan 21, 2015
```
llvm-svn: 226710
```
a1255d3a

InstCombine: Don't strip bitcasts off of callsites marked 'thunk' · 4c0a6e91

David Majnemer authored Jan 21, 2015

The return type of a thunk is meaningless; we just want the arguments
and return value to be forwarded.

llvm-svn: 226708

4c0a6e91

[X86][SSE] movddup shuffle mask decodes · 47af023a

Simon Pilgrim authored Jan 21, 2015

Patch to provide shuffle decodes and asm comments for the SSE3/AVX1 movddup double duplication instructions.

llvm-svn: 226705

47af023a

simplify expression · abf5553e
Adrian Prantl authored Jan 21, 2015
```
llvm-svn: 226701
```
abf5553e
Fix a compile issue on MSVC and call finalize(). · 53d382fc
Adrian Prantl authored Jan 21, 2015
```
llvm-svn: 226694
```
53d382fc

LiveIntervalAnalysis: Factor out code to update liveness on vreg def removal · 311730ac

Matthias Braun authored Jan 21, 2015

This cleans up code and is more in line with the general philosophy of
modifying LiveIntervals through LiveIntervalAnalysis instead of changing
them directly.

This also fixes a case where SplitEditor::removeBackCopies() would miss
the subregister ranges.

llvm-svn: 226690

311730ac

LiveIntervalAnalysis: document removePhysRegDefAt() function. · e1b7da71
Matthias Braun authored Jan 21, 2015
```
llvm-svn: 226689
```
e1b7da71

LiveIntervalAnalysis: Factor out code to update liveness on physreg def removal · cfb8ad29

Matthias Braun authored Jan 21, 2015

This cleans up code and is more in line with the general philosophy of
modifying LiveIntervals through LiveIntervalAnalysis instead of changing
them directly.

llvm-svn: 226687

cfb8ad29

LiveIntervalAnalysis: Remove unused pruneValue() variant. · 1002baf7
Matthias Braun authored Jan 21, 2015
```
llvm-svn: 226686
```
1002baf7
Let subprograms with instructions without parent scopes fail the · 1292e24d
Adrian Prantl authored Jan 21, 2015
```
verification. Tested via a unit test.

Follow-up to r226616.

llvm-svn: 226684
```
1292e24d

R600/SI: Custom lower fround · b0055488

Matt Arsenault authored Jan 21, 2015

This fixes it for SI. It also removes the pattern
used previously for Evergreen for f32. I'm not sure
if the new R600 output is better or not, but it uses
1 fewer instructions if BFI is available.

llvm-svn: 226682

b0055488

[Hexagon] Converting multiply and accumulate with immediate intrinsics to patterns. · 94269db8
Colin LeMahieu authored Jan 21, 2015
```
llvm-svn: 226681
```
94269db8

[X86] Declare SSE4.1/AVX2 vector extloads covered by PMOV[SZ]X legal. · 8f09e9f7

Ahmed Bougacha authored Jan 21, 2015

Now that we can fully specify extload legality, we can declare them
legal for the PMOVSX/PMOVZX instructions.  This for instance enables
a DAGCombine to fire on code such as
  (and (<zextload-equivalent> ...), <redundant mask>)
to turn it into:
  (zextload ...)
as seen in the testcase changes.

There is one regression, in widen_load-2.ll: we're no longer able
to do store-to-load forwarding with illegal extload memory types.
This will be addressed separately.

Differential Revision: http://reviews.llvm.org/D6533

llvm-svn: 226676

8f09e9f7
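
The reason the mask in the pattern above is redundant can be shown on scalars. This is a minimal sketch, assuming an 8-bit to 32-bit zero extension and a 0xFF mask; the function names are hypothetical and nothing here models the vector PMOVZX instructions themselves.

```cpp
#include <cassert>
#include <cstdint>

// Scalar stand-in for a zero-extending load of one byte.
uint32_t zextLoad8(const uint8_t *p) { return static_cast<uint32_t>(*p); }

// A zero-extended byte never has bits set above bit 7, so masking the
// result with 0xFF cannot change it.
bool maskIsRedundant() {
  for (unsigned v = 0; v < 256; ++v) {
    uint8_t byte = static_cast<uint8_t>(v);
    if ((zextLoad8(&byte) & 0xFFu) != zextLoad8(&byte))
      return false;
  }
  return true;
}
```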

[lit] Format JSONMetricValue strings better. · dea770ae
Eric Fiselier authored Jan 21, 2015
```
llvm-svn: 226672
```
dea770ae
Fixed a bug with how we determine bitset indices. · 3c898c21
George Burgess IV authored Jan 21, 2015
```
llvm-svn: 226671
```
3c898c21
Add missing include guards to WindowsSupport.h. · 3f02c14c
Yaron Keren authored Jan 21, 2015
```
llvm-svn: 226669
```
3f02c14c
Revert "DAGCombine: fold (or (and X, M), (and X, N)) -> (and X, (or M, N))" · cf3d80fe
Tim Northover authored Jan 21, 2015
```
It hadn't gone through review yet, but was still on my local copy.

This reverts commit r226663

llvm-svn: 226665
```
cf3d80fe

AArch64: add backend option to reserve x18 (platform register) · b9184f2b

Tim Northover authored Jan 21, 2015

AAPCS64 says that it's up to the platform to specify whether x18 is
reserved, and a first step on that way is to add a flag controlling
it.

From: Andrew Turner <andrew@fubar.geek.nz>
llvm-svn: 226664

b9184f2b

DAGCombine: fold (or (and X, M), (and X, N)) -> (and X, (or M, N)) · 85cd2791
Tim Northover authored Jan 21, 2015
```
llvm-svn: 226663
```
85cd2791
[x32] Fast ISel should use LEA64_32r instead of LEA32r to adjust addresses in x32 mode. · ada9fa1c
Michael Kuperstein authored Jan 21, 2015
```
llvm-svn: 226661
```
ada9fa1c
Use a smaller pragma unroll threshold to reduce test execution time. · 4ac461c3
Alexander Potapenko authored Jan 21, 2015
```
When opt is compiled with AddressSanitizer it takes more than 30 seconds
to unroll the loop in unroll_1M().

llvm-svn: 226660
```
4ac461c3

[msan] Update origin for the entire destination range on memory store. · 79ca0fd1

Evgeniy Stepanov authored Jan 21, 2015

Previously we always stored 4 bytes of origin at the destination address
even for 8-byte (and longer) stores.

This should fix rare missing, or incorrect, origin stacks in MSan reports.

llvm-svn: 226658

79ca0fd1
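
The behavioural change can be sketched with a toy origin table. This is not MSan's instrumentation or runtime layout: the table is a plain vector indexed by `address / 4`, the function names are hypothetical, and the only fact taken from the commit is that origins are 4 bytes wide and that longer stores previously updated just one of them.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

constexpr std::size_t kOriginGranule = 4; // one origin id per 4 bytes

// Old behaviour: only the origin slot at the destination address is written.
void setOriginFirstSlotOnly(std::vector<uint32_t> &origins, std::size_t addr,
                            std::size_t size, uint32_t id) {
  (void)size; // the store size is ignored
  origins[addr / kOriginGranule] = id;
}

// New behaviour: every origin slot covered by [addr, addr + size) is written.
void setOriginWholeRange(std::vector<uint32_t> &origins, std::size_t addr,
                         std::size_t size, uint32_t id) {
  if (size == 0)
    return;
  std::size_t first = addr / kOriginGranule;
  std::size_t last = (addr + size - 1) / kOriginGranule;
  for (std::size_t i = first; i <= last; ++i)
    origins[i] = id;
}
```

For an 8-byte store at address 8, the first variant leaves the slot for bytes 12..15 with a stale origin, which matches the missing or incorrect origin stacks the commit mentions.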

[mips][microMIPS] MicroMIPS 16-bit unconditional branch instruction B · 5cfebdde

Jozef Kolek authored Jan 21, 2015

Implement microMIPS 16-bit unconditional branch instruction B.

The implemented 16-bit microMIPS unconditional branch instruction is named B16, and
B is an alias which expands to either B16 or BEQ according to the rules:
b 256 --> b16 256 # R_MICROMIPS_PC10_S1
b 12256 --> beq $zero, $zero, 12256 # R_MICROMIPS_PC16_S1
b label --> beq $zero, $zero, label # R_MICROMIPS_PC16_S1

Differential Revision: http://reviews.llvm.org/D3514

llvm-svn: 226657

5cfebdde
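
The expansion rule for immediate offsets can be sketched as a range check. This is an illustration, not the assembler's code: the function name is hypothetical, and the range is an assumption read off the relocation name R_MICROMIPS_PC10_S1 (a signed 10-bit field shifted left by one, so even byte offsets from -1024 to 1022). Symbolic labels, which the rules above always expand to BEQ, are not modelled.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Picks the expansion of the `b` alias for an immediate byte offset.
// Assumed B16 range: even offsets in [-1024, 1022].
std::string expandBranchAlias(int64_t byteOffset) {
  bool fitsB16 =
      byteOffset % 2 == 0 && byteOffset >= -1024 && byteOffset <= 1022;
  return fitsB16 ? "b16" : "beq $zero, $zero";
}
```

Under that assumption, `b 256` selects the 16-bit encoding and `b 12256` falls back to BEQ, in line with the first two rules.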

[mips][microMIPS] Implement ADDIUPC instruction · 2c6d7320
Jozef Kolek authored Jan 21, 2015
```
Differential Revision: http://reviews.llvm.org/D6582

llvm-svn: 226656
```
2c6d7320

[PM] Refactor the InstCombiner interface to use an external worklist. · df5747a9

Chandler Carruth authored Jan 21, 2015

Because in its primary function pass the combiner is run repeatedly over
the same function until doing so produces no changes, it is essential
to not re-allocate the worklist. However, as a utility, the more common
pattern would be to put a limited set of instructions in the worklist
rather than the entire function body. That is also the more likely
pattern when used by the new pass manager.

The result is a very lightweight combiner that does the visiting with
a separable worklist. This can then be wrapped up in a helper function
for users that want a combiner utility, or as I have here it can be
wrapped up in a pass which manages the iterations used when combining an
entire function's instructions.

Hopefully this removes some of the worst of the interface warts that
became apparent with the last patch here. However, there is clearly more
work. I've again left some FIXMEs for the most egregious. The ones that
stick out to me are the exposure of the worklist and IR builder as
public members, and the use of pointers rather than references. However,
fixing these is likely to be much more mechanical and less interesting
so I didn't want to touch them in this patch.

llvm-svn: 226655

df5747a9
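
The shape described in this commit, a combiner that only drains a worklist it does not own, wrapped by a driver that manages the iterations, can be sketched with a toy example. None of this is InstCombine's interface: `ToyCombiner` and `combineUntilFixpoint` are hypothetical names, and the "combine" is simply halving even integers.

```cpp
#include <cassert>
#include <vector>

// Visits items from a caller-owned worklist. The worklist is external, so a
// caller can seed it with a few items or with a whole function body.
class ToyCombiner {
  std::vector<int *> &Worklist;

public:
  explicit ToyCombiner(std::vector<int *> &WL) : Worklist(WL) {}

  // Drains the worklist; returns true if any value was changed.
  bool run() {
    bool changed = false;
    while (!Worklist.empty()) {
      int *v = Worklist.back();
      Worklist.pop_back();
      if (*v != 0 && *v % 2 == 0) {
        *v /= 2;
        Worklist.push_back(v); // revisit: the new value may combine again
        changed = true;
      }
    }
    return changed;
  }
};

// Pass-like wrapper: reuses one worklist allocation across iterations and
// repeats until an iteration makes no change. Returns the iteration count.
unsigned combineUntilFixpoint(std::vector<int> &values) {
  std::vector<int *> worklist;
  unsigned iterations = 0;
  bool changed = true;
  while (changed) {
    for (int &v : values)
      worklist.push_back(&v);
    changed = ToyCombiner(worklist).run();
    ++iterations;
  }
  return iterations;
}
```

A utility-style caller would instead push only the values it cares about and call `run()` once, which is the more common pattern the commit message anticipates.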

[PM] Simplify (ha! ha!) the way that instcombine calls the · ba4c5179

Chandler Carruth authored Jan 21, 2015

SimplifyLibCalls utility by sinking it into the specific call part of
the combiner.

This will avoid us needing to do any contortions to build this object in
a subsequent refactoring I'm doing and seems generally better factored.
We don't need this utility everywhere and it carries no interesting
state so we might as well build it on demand.

llvm-svn: 226654

ba4c5179