- Feb 22, 2019
-
Matt Arsenault authored
llvm-svn: 354673
-
Roman Tereshin authored
This patch adds LazyValueInfo to LowerSwitch to compute the range of the value being switched over and reduce the size of the tree LowerSwitch builds to lower a switch.
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D58096
llvm-svn: 354670
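For concreteness, a minimal sketch of the idea (not the patch itself), assuming the LazyValueInfo interface of this era, where `getConstantRange` takes the containing block as context:

```
#include "llvm/Analysis/LazyValueInfo.h"
#include "llvm/IR/Instructions.h"
using namespace llvm;

// Drop switch cases that LVI proves unreachable, so the lowering tree
// LowerSwitch builds over the remaining cases is smaller.
static void pruneUnreachableCases(SwitchInst *SI, LazyValueInfo &LVI) {
  ConstantRange CR =
      LVI.getConstantRange(SI->getCondition(), SI->getParent());
  for (auto It = SI->case_begin(); It != SI->case_end();) {
    if (!CR.contains(It->getCaseValue()->getValue()))
      It = SI->removeCase(It); // the condition can never take this value
    else
      ++It;
  }
}
```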
-
Chijun Sima authored
Summary:
This patch separates two semantics of `applyUpdates`:
1. The user provides an accurate CFG diff, and the dominator tree is updated according to the difference between `the number of edge insertions` and `the number of edge deletions` to infer the status of an edge before and after the update.
2. The user provides a sequence of hints. Updates mentioned in this sequence might never have happened and may even be duplicated.

Logic changes:
Previously, removing invalid updates was considered a side effect of deduplication and was not guaranteed to be reliable. To handle the second semantic, `applyUpdates` does validity checking before deduplication, which can cause updates that have already been applied to be submitted again. Then, different calls to `applyUpdates` might cause unintended consequences, for example:
```
DTU(Lazy) and edge A->B exists.
1. DTU.applyUpdates({{Delete, A, B}, {Insert, A, B}})
   // User expects these 2 updates to be a no-op, but {Insert, A, B} is queued
2. Remove A->B
3. DTU.applyUpdates({{Delete, A, B}})
   // DTU cancels this update against the queued {Insert, A, B} (unintended)
```
But by restricting the precondition that updates of an edge must be strictly ordered as the CFG changes were made, we can infer the initial status of this edge and resolve this issue.

Interface changes:
The second semantic of `applyUpdates` is separated out into `applyUpdatesPermissive`. These changes enable DTU(Lazy) to use the first semantic when needed, which is quite useful in `transforms/utils`.

Reviewers: kuhar, brzycki, dmgreen, grosser
Reviewed By: brzycki
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D58170
llvm-svn: 354669
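A minimal sketch of how the two entry points are meant to be used after this split; blocks `A` and `B` are hypothetical, and the comments restate the two semantics rather than the implementation:

```
#include "llvm/Analysis/DomTreeUpdater.h"
using namespace llvm;

void example(DominatorTree &DT, BasicBlock *A, BasicBlock *B) {
  DomTreeUpdater DTU(DT, DomTreeUpdater::UpdateStrategy::Lazy);
  // Semantic 1: an accurate, strictly ordered CFG diff. The delete/insert
  // pair below cancels out and is treated as a genuine no-op.
  DTU.applyUpdates({{DominatorTree::Delete, A, B},
                    {DominatorTree::Insert, A, B}});
  // Semantic 2: hints. Updates that never happened or are duplicated
  // are filtered out instead of being trusted.
  DTU.applyUpdatesPermissive({{DominatorTree::Delete, A, B}});
}
```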
-
Alina Sbirlea authored
The correct edge being deleted is not the one to the unswitched exit block, but the one to the original block before it was split. That's the key in the map, not the value. The insert is correct: the new edge is to the .split block. The splitting turns OriginalBB into: OriginalBB -> OriginalBB.split. Assuming the original CFG edge ParentBB->OriginalBB, we must now delete ParentBB->OriginalBB, not ParentBB->OriginalBB.split.
llvm-svn: 354656
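In code, the corrected update looks like the sketch below; the names are hypothetical and it is phrased against MemorySSAUpdater's edge API, which may differ in detail from the patch:

```
#include "llvm/Analysis/MemorySSAUpdater.h"
using namespace llvm;

// SplitBlock turned OriginalBB into OriginalBB -> OriginalBB.split;
// ParentBB's lost edge is the one into the pre-split block, so that is
// what MemorySSA must forget.
void updateAfterSplit(MemorySSAUpdater &MSSAU, BasicBlock *ParentBB,
                      BasicBlock *OriginalBB) {
  MSSAU.removeEdge(ParentBB, OriginalBB); // the map key, pre-split target
  // Deleting ParentBB -> OriginalBB.split here would be wrong: that
  // edge never existed in the original CFG.
}
```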
-
Chijun Sima authored
Summary: This patch converts all existing `insertEdge*/deleteEdge*` to `applyUpdates` and marks `insertEdge*/deleteEdge*` as deprecated.
Reviewers: kuhar, brzycki
Reviewed By: kuhar, brzycki
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D58443
llvm-svn: 354652
-
- Feb 21, 2019
-
Alina Sbirlea authored
MemorySSA is now updated when forming dedicated exit blocks. Resolves PR40037. llvm-svn: 354623
-
Alina Sbirlea authored
Summary: MemorySSA was not being properly updated in LoopSimplifyCFG after recent changes. Use the SplitBlock utility to resolve that, and clear all queued updates once handleDeadExits is finished. All updates that follow are edge removals, which are safe to handle via the removeEdge() API. Also, deleting dead blocks is done correctly as is, i.e. delete from MemorySSA before updating the CFG and DT.
Reviewers: mkazantsev, rtereshin
Subscribers: sanjoy, jlebar, Prazek, george.burgess.iv, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D58524
llvm-svn: 354613
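A rough sketch of the update discipline described above, with hypothetical names and assuming the SplitBlock overload that threads through a MemorySSAUpdater; the MemorySSA-aware entry points are the point:

```
#include "llvm/Analysis/MemorySSAUpdater.h"
#include "llvm/Transforms/Utils/BasicBlockUtils.h"
using namespace llvm;

void rewireExit(BasicBlock *Exit, BasicBlock *DeadPred, DominatorTree &DT,
                LoopInfo &LI, MemorySSAUpdater *MSSAU) {
  // SplitBlock keeps MemorySSA consistent itself when handed the updater.
  SplitBlock(Exit, &Exit->front(), &DT, &LI, MSSAU);
  // Everything after this point is a pure edge removal, which is safe
  // to express through removeEdge().
  MSSAU->removeEdge(DeadPred, Exit);
  // For dead blocks: delete from MemorySSA first, then update CFG/DT.
}
```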
-
Alina Sbirlea authored
Summary: Clean up no-op assignments.
Reviewers: george.burgess.iv, davide
Subscribers: sanjoy, jlebar, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D58308
llvm-svn: 354612
-
Joey Gouly authored
llvm-svn: 354580
-
Joey Gouly authored
Check that the operands of a select are pointers, to determine whether it is an address expression or not. https://reviews.llvm.org/D58226 llvm-svn: 354576
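A sketch of the check, with surrounding context abbreviated; the point is that both arms must be pointers before the select can participate in address-space inference:

```
#include "llvm/IR/Instructions.h"
using namespace llvm;

static bool selectIsAddressExpression(const SelectInst &Sel) {
  // A select forwards addresses only if it actually produces pointers;
  // its arms (and therefore its result) must be of pointer type.
  return Sel.getTrueValue()->getType()->isPointerTy() &&
         Sel.getFalseValue()->getType()->isPointerTy();
}
```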
-
Max Kazantsev authored
When we create a fictive switch in the preheader, we should take care of MemorySSA and delete the edge between the old preheader and the header. llvm-svn: 354547
-
Wei Mi authored
is false. Right now, for the inliner and partial inliner, we always pass the address of a valid ORE object to getInlineCost even if RemarkEnabled is false because no -Rpass is specified. Since ComputeFullInlineCost is set to true if ORE is non-null in getInlineCost, this introduces the problem that in getInlineCost we cannot return early even if we already know the cost is definitely higher than the threshold. It is a general compile-time problem. This patch fixes that by passing nullptr as the ORE argument if RemarkEnabled is false.
Differential Revision: https://reviews.llvm.org/D58399
llvm-svn: 354542
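The fix reduces to choosing the ORE argument at the call site. A sketch with the call shape abbreviated (the real getInlineCost takes several more parameters):

```
// Only hand getInlineCost a remark emitter when remarks are actually
// enabled, so it keeps its early exit once the cost clearly exceeds
// the threshold.
OptimizationRemarkEmitter ORE(&Caller);
InlineCost IC =
    getInlineCost(CS, Params, CalleeTTI, GetAssumptionCache, PSI,
                  RemarksEnabled ? &ORE : nullptr);
```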
-
- Feb 20, 2019
-
Philip Reames authored
Noticed these while doing a final sweep of the code to make sure I hadn't missed anything in my last couple of patches. The (minor) missed optimization was noticed because of the stylistic fix to avoid an overly specific cast. llvm-svn: 354412
-
Philip Reames authored
Same case as for memset and memcpy, but this time for clobbering stores and loads. We still can't allow coercion to or from non-integrals, regardless of the transform. Now that I've finished the whole little sequence, it seems apparent that we'd entirely missed reasoning about clobbers in the original GVN support for non-integral pointers. My apologies; I thought we'd upstreamed all of this, but it turns out we were still carrying a downstream hack which hid all of these issues. My thanks to Cherry Zhang for helping debug. llvm-svn: 354407
-
Philip Reames authored
The problem is very similar to the one fixed for memsets in r354399: we try to coerce a value to a non-integral type, and then crash while trying to do so. Since we shouldn't be doing such coercions to start with, the fix is easy. From inspection, I see two other cases which look to be similar and will follow up with more test cases and fixes if confirmed. llvm-svn: 354403
-
Philip Reames authored
GVN generally doesn't forward structs or array types, but it *will* forward vector types to non-vectors and vice versa. As demonstrated in tests, we need to inhibit the same set of transforms for vectors of non-integral pointers as for non-integral pointers themselves. llvm-svn: 354401
-
Philip Reames authored
If we encountered a location where we tried to forward the value of a memset to a load of a non-integral pointer, we crashed. Such a forward is not legal in general, but we can forward null pointers. Tests for both cases are included. llvm-svn: 354399
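Taken together, the four GVN fixes above enforce one rule. A consolidated sketch, with a hypothetical helper rather than GVN's actual code: coercion to or from a non-integral pointer type, scalar or vector element, is rejected unless the incoming value is known to be null:

```
#include "llvm/IR/Constants.h"
#include "llvm/IR/DataLayout.h"
using namespace llvm;

static bool canCoerceNonIntegral(Type *LoadTy, Value *StoredVal,
                                 const DataLayout &DL) {
  Type *StoredTy = StoredVal->getType();
  // getScalarType() also covers the vector-of-pointers case.
  if (DL.isNonIntegralPointerType(LoadTy->getScalarType()) ||
      DL.isNonIntegralPointerType(StoredTy->getScalarType())) {
    // The one safe forward: a zero pattern (e.g. from a memset of 0)
    // may be forwarded as a null pointer.
    auto *C = dyn_cast<Constant>(StoredVal);
    return C && C->isNullValue();
  }
  return true;
}
```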
-
- Feb 19, 2019
-
Sanjay Patel authored
We want to use the sum in the icmp to allow matching with m_UAddWithOverflow and eliminate the 'not'. This is discussed in D51929 and is another step towards solving PR14613: https://bugs.llvm.org/show_bug.cgi?id=14613

```
Name: uaddsat, -1 fval
%notx = xor i32 %x, -1
%a = add i32 %x, %y
%c = icmp ugt i32 %notx, %y
%r = select i1 %c, i32 %a, i32 -1
=>
%a = add i32 %x, %y
%c2 = icmp ugt i32 %y, %a
%r = select i1 %c2, i32 -1, i32 %a

Name: uaddsat, -1 fval + ult
%notx = xor i32 %x, -1
%a = add i32 %x, %y
%c = icmp ult i32 %y, %notx
%r = select i1 %c, i32 %a, i32 -1
=>
%a = add i32 %x, %y
%c2 = icmp ugt i32 %y, %a
%r = select i1 %c2, i32 -1, i32 %a
```

https://rise4fun.com/Alive/nTp
llvm-svn: 354393
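The reason for steering the icmp toward the sum: PatternMatch already has a matcher for the unsigned-add overflow idiom, and it only fires once the sum itself appears in the compare. A sketch, with the surrounding InstCombine plumbing omitted:

```
#include "llvm/IR/PatternMatch.h"
using namespace llvm;
using namespace llvm::PatternMatch;

// After the transform, %c2 = icmp ugt i32 %y, %a with %a = add %x, %y is
// exactly the unsigned-add overflow test, so the saturating-add idiom
// select(%c2, -1, %a) becomes recognizable.
static bool isUAddSatOverflowCheck(Value *Cmp, Value *&X, Value *&Y,
                                   Value *&Sum) {
  return match(Cmp, m_UAddWithOverflow(m_Value(X), m_Value(Y), m_Value(Sum)));
}
```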
-
Sanjay Patel authored
This is no-functional-change-intended, but that was also true when it was part of rL354276, and I managed to lose 2 predicates for the fold with constant...causing much bot distress. So this time I'm adding a couple of negative tests to avoid that. llvm-svn: 354384
-
Max Kazantsev authored
We are planning to be able to delete the current loop in LoopSimplifyCFG in the future. Add an API to notify the loop pass manager when that happens. llvm-svn: 354314
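For illustration, a sketch against the legacy loop pass manager interface; the transform helper is hypothetical, and the new pass manager has an analogous hook on its updater:

```
// A loop pass that erases its loop must tell the manager, so later
// passes in the queue are not run over the dead loop.
bool runOnLoop(Loop *L, LPPassManager &LPM) override {
  if (!deleteLoopIfDead(L)) // hypothetical transform
    return false;
  LPM.markLoopAsDeleted(*L);
  return true;
}
```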
-
Max Kazantsev authored
llvm-svn: 354313
-
- Feb 18, 2019
-
Sanjay Patel authored
This reverts commit 079b610c. Bots are failing after this change on a stage 2 compile of clang. llvm-svn: 354277
-
Sanjay Patel authored
We want to use the sum in the icmp to allow matching with m_UAddWithOverflow and eliminate the 'not'. This is discussed in D51929 and is another step towards solving PR14613: https://bugs.llvm.org/show_bug.cgi?id=14613

```
Name: uaddsat, -1 fval
%notx = xor i32 %x, -1
%a = add i32 %x, %y
%c = icmp ugt i32 %notx, %y
%r = select i1 %c, i32 %a, i32 -1
=>
%a = add i32 %x, %y
%c2 = icmp ugt i32 %y, %a
%r = select i1 %c2, i32 -1, i32 %a

Name: uaddsat, -1 fval + ult
%notx = xor i32 %x, -1
%a = add i32 %x, %y
%c = icmp ult i32 %y, %notx
%r = select i1 %c, i32 %a, i32 -1
=>
%a = add i32 %x, %y
%c2 = icmp ugt i32 %y, %a
%r = select i1 %c2, i32 -1, i32 %a
```

https://rise4fun.com/Alive/nTp
llvm-svn: 354276
-
- Feb 17, 2019
-
Max Kazantsev authored
This should be NFC in the current use case of this method, but it will help to use it for solving more complex tasks in follow-up patches. llvm-svn: 354227
-
Sanjay Patel authored
We want to use the sum in the icmp to allow matching with m_UAddWithOverflow and eliminate the 'not'. This is discussed in D51929 and is another step towards solving PR14613: https://bugs.llvm.org/show_bug.cgi?id=14613

```
Name: not op
%notx = xor i32 %x, -1
%a = add i32 %x, %y
%c = icmp ult i32 %notx, %y
%r = select i1 %c, i32 -1, i32 %a
=>
%a = add i32 %x, %y
%c2 = icmp ult i32 %a, %y
%r = select i1 %c2, i32 -1, i32 %a

Name: not op ugt
%notx = xor i32 %x, -1
%a = add i32 %x, %y
%c = icmp ugt i32 %y, %notx
%r = select i1 %c, i32 -1, i32 %a
=>
%a = add i32 %x, %y
%c2 = icmp ult i32 %a, %y
%r = select i1 %c2, i32 -1, i32 %a
```

https://rise4fun.com/Alive/niom

(The matching here is still incomplete.)
llvm-svn: 354224
-
Sanjay Patel authored
We want to use the sum in the icmp to allow matching with m_UAddWithOverflow and eliminate the 'not'. This is discussed in D51929 and is another step towards solving PR14613: https://bugs.llvm.org/show_bug.cgi?id=14613 (The matching here is incomplete. Trying to take minimal steps to make sure we don't induce infinite looping from existing canonicalizations of the 'select'.) llvm-svn: 354221
-
Max Kazantsev authored
llvm-svn: 354220
-
Max Kazantsev authored
llvm-svn: 354218
-
- Feb 15, 2019
-
Alina Sbirlea authored
Summary: An unlimited number of calls to getClobberingAccess can lead to high compile times in pathological cases. Limit getClobberingAccess to a fairly high number; this can be adjusted based on users' needs. Note: this is the only user of MemorySSA currently enabled by default. The same handling exists in LICM (disabled atm). As MemorySSA gains more users, this capping logic will need to move inside MemorySSA.
Reviewers: george.burgess.iv
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D58248
llvm-svn: 354182
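The capping scheme amounts to a countdown around the walker. A sketch with hypothetical names (the actual limit and plumbing live in the patch):

```
#include "llvm/Analysis/MemorySSA.h"
using namespace llvm;

static MemoryAccess *getCappedClobber(MemorySSA &MSSA, MemoryUseOrDef *MA,
                                      unsigned &WalksLeft) {
  // Once the budget is spent, fall back to the defining access, which is
  // always conservatively correct, just less precise.
  if (WalksLeft == 0)
    return MA->getDefiningAccess();
  --WalksLeft;
  return MSSA.getWalker()->getClobberingMemoryAccess(MA);
}
```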
-
Philip Reames authored
Better address comments from https://reviews.llvm.org/D58290. llvm-svn: 354171
-
Philip Reames authored
Implement two more transforms of atomicrmw:
1. We can convert an atomicrmw which produces a known value in memory into an xchg instead.
2. We can convert an atomicrmw xchg without users into a store, for some orderings.
Differential Revision: https://reviews.llvm.org/D58290
llvm-svn: 354170
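A sketch of both folds; `storesKnownValue` is a hypothetical predicate standing in for the patch's analysis:

```
#include "llvm/IR/Instructions.h"
#include "llvm/Support/AtomicOrdering.h"
using namespace llvm;

static bool storesKnownValue(AtomicRMWInst &); // hypothetical, see D58290

static bool simplifyAtomicRMW(AtomicRMWInst &RMWI) {
  // (1) e.g. `and %p, 0` always stores 0, so it can become an xchg of 0.
  if (storesKnownValue(RMWI))
    RMWI.setOperation(AtomicRMWInst::Xchg);
  // (2) An xchg whose old value is unused can become a plain atomic
  // store, provided there is no acquire component to preserve.
  if (RMWI.getOperation() == AtomicRMWInst::Xchg && RMWI.use_empty() &&
      !isAcquireOrStronger(RMWI.getOrdering())) {
    auto *SI = new StoreInst(RMWI.getValOperand(),
                             RMWI.getPointerOperand(), &RMWI);
    SI->setAtomic(RMWI.getOrdering(), RMWI.getSyncScopeID());
    RMWI.eraseFromParent();
    return true;
  }
  return false;
}
```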
-
Vedant Kumar authored
If a lifetime.end marker occurs along one path through the extraction region, but not another, then it's still incorrect to lift the marker, because there is some path through the extracted function which would ordinarily not reach the marker. If the call to the extracted function is in a loop, unrolling can cause inputs to the function to become optimized out as undef after the first iteration.

To prevent incorrect stack slot merging in the calling function, it should be sufficient to lift lifetime.start markers for region inputs. I've tested this theory out by doing a stage2 check-all with randomized splitting enabled.

This is a follow-up to r353973, and there's additional context for this change in https://reviews.llvm.org/D57834.

rdar://47896986
Differential Revision: https://reviews.llvm.org/D58253
llvm-svn: 354159
-
Vedant Kumar authored
With or without PGO data applied, splitting early in the pipeline (either before the inliner or shortly after it) regresses performance across SPEC variants. The cause appears to be that splitting hides context for subsequent optimizations. Schedule splitting late again, in effect reversing r352080, which scheduled the splitting pass early for code size benefits (documented in https://reviews.llvm.org/D57082). Differential Revision: https://reviews.llvm.org/D58258 llvm-svn: 354158
-
Sanjay Patel authored
https://bugs.llvm.org/show_bug.cgi?id=40734 llvm-svn: 354144
-
Clement Courbet authored
Summary: The idea is that we now manipulate bases through an `unsigned BaseID` based on order of appearance in the comparison chain, rather than through the `Value*`. Fixes PR40714.
Reviewers: gchatelet
Subscribers: mgrang, jfb, jdoerfert, llvm-commits, hans
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D58274
llvm-svn: 354131
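The core of the fix can be sketched in a few lines: IDs are handed out in order of first appearance, so any later sort on them is deterministic across runs, unlike sorting on pointer values (a sketch, not the patch's exact class):

```
#include "llvm/ADT/DenseMap.h"
#include "llvm/IR/Value.h"
using namespace llvm;

class BaseIdentifier {
  DenseMap<const Value *, unsigned> BaseToId;

public:
  // Order of appearance in the comparison chain defines the ID; the
  // argument is evaluated before insertion, so the first lookup of a
  // base gets the next unused ID.
  unsigned getBaseId(const Value *Base) {
    return BaseToId.try_emplace(Base, BaseToId.size()).first->second;
  }
};
```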
-
Clement Courbet authored
llvm-svn: 354128
-
Max Kazantsev authored
llvm-svn: 354124
-
Simon Pilgrim authored
llvm-svn: 354123
-
Max Kazantsev authored
llvm-svn: 354118
-
Max Kazantsev authored
llvm-svn: 354107
-