Commits · 3c7a35de7fbc0a73505545cd9f68a3bbacb68e57 · Lorenzo Albano / LLVM bpEVL

Dec 13, 2017

[EarlyCSE] recognize commuted and swapped variants of min/max as equivalent (PR35642) · 3c7a35de

Sanjay Patel authored Dec 13, 2017

As shown in:
https://bugs.llvm.org/show_bug.cgi?id=35642
...we can have different forms of min/max, so we should recognize those here in EarlyCSE 
similar to how we already handle binops and compares that can commute.
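
One way to picture the equivalence is a canonical key that every spelling of the same min/max maps to. The sketch below is a simplified standalone model, not the EarlyCSE code: `SelectOfCmp`, `MinMaxKey`, and `canonicalize` are hypothetical names, and only signed less-than/greater-than compares are modelled.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical model of "select (icmp P L, R), T, F" over named values.
enum class Pred { SLT, SGT };
enum class Flavor { None, SMin, SMax };

struct SelectOfCmp {
  Pred P;
  std::string L, R; // compare operands
  std::string T, F; // select true/false values
};

struct MinMaxKey {
  Flavor Fl;
  std::string A, B; // operands in sorted order
  bool operator==(const MinMaxKey &O) const {
    return Fl == O.Fl && A == O.A && B == O.B;
  }
};

// Map every commuted or swapped spelling of one min/max to the same key.
MinMaxKey canonicalize(const SelectOfCmp &S) {
  Flavor Fl = Flavor::None;
  if (S.T == S.L && S.F == S.R)      // (L P R) ? L : R
    Fl = (S.P == Pred::SLT) ? Flavor::SMin : Flavor::SMax;
  else if (S.T == S.R && S.F == S.L) // (L P R) ? R : L
    Fl = (S.P == Pred::SLT) ? Flavor::SMax : Flavor::SMin;
  std::string A = S.L, B = S.R;
  if (B < A)
    std::swap(A, B);                 // operand order must not matter
  return {Fl, A, B};
}
```

With this model, `a < b ? a : b`, `b > a ? a : b`, `a > b ? b : a`, and `b < a ? b : a` all produce the key (SMin, a, b), so a hash table keyed on it would treat them as one expression.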

Differential Revision: https://reviews.llvm.org/D41136

llvm-svn: 320640

3c7a35de

[JumpThreading] Preservation of DT and LVI across the pass · d989af98

Brian M. Rzycki authored Dec 13, 2017

Summary:
See D37528 for a previous (non-deferred) version of this
patch and its description.

Preserves dominance in a deferred manner using a new class
DeferredDominance. This reduces the performance impact of
updating the DominatorTree at every edge insertion and
deletion. A user may call DDT->flush() within JumpThreading
for an up-to-date DT. This patch currently has one flush()
at the end of runImpl() to ensure DT is preserved across
the pass.
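
The deferral idea can be sketched as a queue of edge updates that is applied in one batch. This is an illustrative model only: `ToyAnalysis` and `DeferredUpdater` are hypothetical stand-ins and do not reproduce the DeferredDominance interface.

```cpp
#include <cassert>
#include <set>
#include <utility>
#include <vector>

using Edge = std::pair<int, int>; // (from block id, to block id)

// Hypothetical stand-in for an analysis that is expensive to update.
struct ToyAnalysis {
  std::set<Edge> Edges;
  int Recomputations = 0;
  void recompute() { ++Recomputations; } // imagine a dominator-tree update
};

// Queue edge insertions/deletions and apply them together on flush().
class DeferredUpdater {
  enum class Kind { Insert, Delete };
  ToyAnalysis &A;
  std::vector<std::pair<Kind, Edge>> Pending;

public:
  explicit DeferredUpdater(ToyAnalysis &Analysis) : A(Analysis) {}

  void insertEdge(int From, int To) {
    Pending.emplace_back(Kind::Insert, Edge(From, To));
  }
  void deleteEdge(int From, int To) {
    Pending.emplace_back(Kind::Delete, Edge(From, To));
  }

  // Apply all queued updates, then pay the update cost once.
  ToyAnalysis &flush() {
    if (Pending.empty())
      return A;
    for (const auto &U : Pending) {
      if (U.first == Kind::Insert)
        A.Edges.insert(U.second);
      else
        A.Edges.erase(U.second);
    }
    Pending.clear();
    A.recompute();
    return A;
  }
};
```

In this model three queued updates followed by one `flush()` trigger a single recomputation, and a second `flush()` with nothing pending is free.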

LVI is also preserved to help subsequent passes such as
CorrelatedValuePropagation. LVI is simpler to maintain and
is done immediately (not deferred). The code to perform the
preservation was minimally altered and was simply marked
as preserved for the PassManager to be informed.

This extends the analysis available to JumpThreading for
future enhancements. One example is loop boundary threading.

Reviewers: dberlin, kuhar, sebpop

Reviewed By: kuhar, sebpop

Subscribers: hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D40146

llvm-svn: 320612

d989af98

[GVNHoist] Fix: PR35222 gvn-hoist incorrectly erases load · 49c03b11

Aditya Kumar authored Dec 13, 2017

w.r.t. the paper
"A Practical Improvement to the Partial Redundancy Elimination in SSA Form"
(https://sites.google.com/site/jongsoopark/home/ssapre.pdf)

Proper dominance check was missing here, so having a loopinfo should not be required.
Committing this diff as this fixes the bug; if there are
further concerns, I'll be happy to work on them.

Differential Revision: https://reviews.llvm.org/D39781

llvm-svn: 320607

49c03b11

Reintroduce r320049, r320014 and r319894. · e0edb664
Igor Laevsky authored Dec 13, 2017
```
OpenGL issues should be fixed by now.

llvm-svn: 320568
```
e0edb664

[SLP] Vectorize jumbled memory loads. · dbd30edb

Mohammad Shahid authored Dec 13, 2017

Summary:
This patch tries to vectorize loads of consecutive memory accesses, accessed
in a non-consecutive or jumbled way. An earlier attempt was made with patch D26905,
which was reverted due to a basic issue with representing the 'use mask' of
jumbled accesses.

This patch fixes the mask representation by recording the 'use mask' in the usertree entry.
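
As a rough illustration of what a mask for jumbled loads expresses: if the scalar loads touch a consecutive range in permuted order, one wide load plus a shuffle can reproduce them. The helpers below (`jumbledLoadMask`, `applyMask`) are hypothetical and model only the arithmetic, not the SLP vectorizer's data structures.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Offsets[i] is the element offset read by scalar load i.
// Returns Mask with Lane[i] == Wide[Mask[i]], or an empty vector when the
// offsets are not a permutation of a consecutive range.
std::vector<int> jumbledLoadMask(const std::vector<int> &Offsets) {
  if (Offsets.empty())
    return {};
  std::vector<int> Sorted(Offsets);
  std::sort(Sorted.begin(), Sorted.end());
  for (std::size_t I = 1; I < Sorted.size(); ++I)
    if (Sorted[I] != Sorted[I - 1] + 1)
      return {};
  std::vector<int> Mask;
  for (int Off : Offsets)
    Mask.push_back(Off - Sorted.front()); // position inside the wide load
  return Mask;
}

// Emulate the shuffle applied to the result of the single wide load.
std::vector<int> applyMask(const std::vector<int> &Wide,
                           const std::vector<int> &Mask) {
  std::vector<int> Out;
  for (int M : Mask)
    Out.push_back(Wide[static_cast<std::size_t>(M)]);
  return Out;
}
```

For loads of `a[1], a[0], a[3], a[2]` the mask is `{1, 0, 3, 2}`; applying it to the wide load of `a[0..3]` yields the values in the order the scalar code used them.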

Change-Id: I9fe7f5045f065d84c126fa307ef6ebe0787296df

Reviewers: mkuper, loladiro, Ayal, zvi, danielcdh

Reviewed By: Ayal

Subscribers: mgrang, dcaballe, hans, mzolotukhin

Differential Revision: https://reviews.llvm.org/D36130

llvm-svn: 320548

dbd30edb

[CallSiteSplitting] Refactor creating callsites. · beda7d51

Florian Hahn authored Dec 13, 2017

Summary:
This change makes the call site creation more general if any of the
arguments is predicated on a condition in the call site's predecessors.

If we find a call site that can potentially be split, we collect the set
of conditions for the call site's predecessors (currently only 2
predecessors are allowed). To do that, we traverse each predecessor's
predecessors as long as it only has single predecessors and record the
condition, if it is relevant to the call site. For each condition, we
also check if the condition is taken or not. In case it is not taken,
we record the inverse predicate.

We use the recorded conditions to create the new call sites and split
the basic block.
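
The walk over single predecessors described above can be sketched on a toy CFG. Everything here is a simplified assumption: `Block`, `Condition`, and `recordConditions` are hypothetical, and each block tests at most one `Arg ==/!= Const` condition.

```cpp
#include <cassert>
#include <string>
#include <vector>

enum class Pred { EQ, NE };

// Hypothetical basic block: optional conditional branch on "Arg P Const".
struct Block {
  const Block *SinglePred = nullptr; // null if zero or several predecessors
  bool HasCond = false;
  std::string Arg;
  Pred P = Pred::EQ;
  int Const = 0;
  const Block *TrueSucc = nullptr;   // successor taken when the test holds
};

struct Condition {
  std::string Arg;
  Pred P;
  int Const;
};

Pred inverse(Pred P) { return P == Pred::EQ ? Pred::NE : Pred::EQ; }

// Record the condition that holds on the edge From -> To, if From tests Arg.
void recordCondition(const Block &From, const Block &To,
                     const std::string &Arg, std::vector<Condition> &Out) {
  if (!From.HasCond || From.Arg != Arg)
    return;
  // If To is reached on the false edge, the inverse predicate holds.
  Pred P = (From.TrueSucc == &To) ? From.P : inverse(From.P);
  Out.push_back({Arg, P, From.Const});
}

// Collect conditions for one predecessor of the call block, then keep
// walking upwards while each block has a single predecessor.
std::vector<Condition> recordConditions(const Block &CallBB,
                                        const Block &PredBB,
                                        const std::string &Arg) {
  std::vector<Condition> Out;
  recordCondition(PredBB, CallBB, Arg, Out);
  const Block *To = &PredBB;
  while (const Block *From = To->SinglePred) {
    recordCondition(*From, *To, Arg, Out);
    To = From;
  }
  return Out;
}
```

Conditions on unrelated values are skipped, and a block reached through the false edge of `p == 0` contributes `p != 0`.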

This has 2 benefits: (1) it is slightly easier to see what is going on
(IMO) and (2) we can easily extend it to handle more complex control
flow.

Reviewers: davidxl, junbuml

Reviewed By: junbuml

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D40728

llvm-svn: 320547

beda7d51

Dec 12, 2017

[EarlyCSE] add tests for commuted min/max; NFC · 3cf695aa
Sanjay Patel authored Dec 12, 2017
```
See PR35642: https://bugs.llvm.org/show_bug.cgi?id=35642

llvm-svn: 320530
```
3cf695aa

[InstCombine] Fix PR35618: Instcombine hangs on single minmax load bitcast. · 83c15b13

Alexey Bataev authored Dec 12, 2017

Summary:
If we have pattern `store (load(bitcast(select (cmp(V1, V2), &V1,
&V2)))), bitcast)`, but the load is used in other instructions, it leads
to looping in InstCombiner. Patch adds additional check that all users
of the load instructions are stores and then replaces all uses of load
instruction by the new one with new type.

Reviewers: RKSimon, spatel, majnemer

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D41072

llvm-svn: 320525

83c15b13

Reassociate: add global reassociation algorithm · b8a330c4

Fiona Glaser authored Dec 12, 2017

This algorithm (explained more in the source code) takes into account
global redundancies by building a "pair map" to find common subexprs.

The primary motivation of this is to handle situations like

foo = (a * b) * c
bar = (a * d) * c

where we currently don't identify that "a * c" is redundant.

Accordingly, it prioritizes the emission of a * c so that CSE
can remove the redundant calculation later.

Does not change the actual reassociation algorithm -- only the
order in which the reassociated operand chain is reconstructed.

Gives ~1.5% floating point math instruction count reduction on
a large offline suite of graphics shaders.
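
A minimal sketch of the "pair map" idea, assuming each expression is already flattened into a list of operand names. `buildPairMap` and `mostCommonPair` are hypothetical helpers, not the functions in the Reassociate pass.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

using Pair = std::pair<std::string, std::string>;

// Count how often each unordered operand pair occurs across all chains.
std::map<Pair, int>
buildPairMap(const std::vector<std::vector<std::string>> &Chains) {
  std::map<Pair, int> Counts;
  for (const auto &Ops : Chains)
    for (std::size_t I = 0; I < Ops.size(); ++I)
      for (std::size_t J = I + 1; J < Ops.size(); ++J) {
        Pair P = Ops[I] < Ops[J] ? Pair(Ops[I], Ops[J])
                                 : Pair(Ops[J], Ops[I]);
        ++Counts[P];
      }
  return Counts;
}

// Pick the pair seen most often; ties go to the first pair in map order.
Pair mostCommonPair(const std::map<Pair, int> &Counts) {
  Pair Best;
  int BestCount = 0;
  for (const auto &KV : Counts)
    if (KV.second > BestCount) {
      Best = KV.first;
      BestCount = KV.second;
    }
  return Best;
}
```

For the chains `{a, b, c}` and `{a, d, c}` from the example above, the pair `(a, c)` is counted twice and every other pair once, so emitting `a * c` first exposes the shared subexpression to CSE.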

llvm-svn: 320515

b8a330c4

Revert "[InstCombine] Fix PR35618: Instcombine hangs on single minmax load bitcast." · fa0a76db
Alexey Bataev authored Dec 12, 2017
```
This reverts commit r320510, again because of the sanitizer buildbots.

llvm-svn: 320513
```
fa0a76db

Split IndirectBr critical edges before PGO gen/use passes. · f3bda1da

Hiroshi Yamauchi authored Dec 12, 2017

Summary:
The PGO gen/use passes currently fail with an assert failure if there's a
critical edge whose source is an IndirectBr instruction and that edge
needs to be instrumented.

To avoid this in certain cases, split IndirectBr critical edges in the PGO
gen/use passes. This works for blocks with single indirectbr predecessors,
but not for those with multiple indirectbr predecessors (splitting an
IndirectBr critical edge isn't always possible.)
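
For orientation, an edge is critical when its source has more than one successor and its destination has more than one predecessor. The sketch below models that definition and the single-indirectbr-predecessor restriction stated above on a toy edge list; `isCriticalEdge` and `splittableIndirectBrEdge` are hypothetical helpers that assume no duplicate edges.

```cpp
#include <cassert>
#include <set>
#include <utility>
#include <vector>

using Edge = std::pair<int, int>; // (source block id, destination block id)

// Critical: the source has several successors and the destination has
// several predecessors.
bool isCriticalEdge(const std::vector<Edge> &CFG, Edge E) {
  int Succs = 0, Preds = 0;
  for (const Edge &X : CFG) {
    if (X.first == E.first)
      ++Succs;
    if (X.second == E.second)
      ++Preds;
  }
  return Succs > 1 && Preds > 1;
}

// Model of the stated restriction: the edge comes from an indirectbr block,
// is critical, and its destination has exactly one indirectbr predecessor.
bool splittableIndirectBrEdge(const std::vector<Edge> &CFG,
                              const std::set<int> &IndirectBrBlocks, Edge E) {
  if (!IndirectBrBlocks.count(E.first) || !isCriticalEdge(CFG, E))
    return false;
  int IndirectPreds = 0;
  for (const Edge &X : CFG)
    if (X.second == E.second && IndirectBrBlocks.count(X.first))
      ++IndirectPreds;
  return IndirectPreds == 1;
}
```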

Reviewers: davidxl, xur

Reviewed By: davidxl

Subscribers: efriedma, llvm-commits, mehdi_amini

Differential Revision: https://reviews.llvm.org/D40699

llvm-svn: 320511

f3bda1da

[InstCombine] Fix PR35618: Instcombine hangs on single minmax load bitcast. · 195c97e2

Alexey Bataev authored Dec 12, 2017

Summary:
If we have pattern `store (load(bitcast(select (cmp(V1, V2), &V1,
&V2)))), bitcast)`, but the load is used in other instructions, it leads
to looping in InstCombiner. Patch adds additional check that all users
of the load instructions are stores and then replaces all uses of load
instruction by the new one with new type.

Reviewers: RKSimon, spatel, majnemer

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D41072

llvm-svn: 320510

195c97e2

Revert "[InstCombine] Fix PR35618: Instcombine hangs on single minmax load bitcast." · 6132a50d
Alexey Bataev authored Dec 12, 2017
```
This reverts commit r320499 again to resolve the problem with the
sanitizers bbots.

llvm-svn: 320501
```
6132a50d

[InstCombine] Fix PR35618: Instcombine hangs on single minmax load bitcast. · ca4c9a52

Alexey Bataev authored Dec 12, 2017

Summary:
If we have pattern `store (load(bitcast(select (cmp(V1, V2), &V1,
&V2)))), bitcast)`, but the load is used in other instructions, it leads
to looping in InstCombiner. Patch adds additional check that all users
of the load instructions are stores and then replaces all uses of load
instruction by the new one with new type.

Reviewers: RKSimon, spatel, majnemer

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D41072

llvm-svn: 320499

ca4c9a52

Revert "[InstCombine] Fix PR35618: Instcombine hangs on single minmax load bitcast." · d19dbe67
Alexey Bataev authored Dec 12, 2017
```
This reverts commit r320496 to solve the problems with sanitizer
buildbots.

llvm-svn: 320498
```
d19dbe67

[InstCombine] Fix PR35618: Instcombine hangs on single minmax load bitcast. · d0c3aeb2

Alexey Bataev authored Dec 12, 2017

Summary:
If we have pattern `store (load(bitcast(select (cmp(V1, V2), &V1,
&V2)))), bitcast)`, but the load is used in other instructions, it leads
to looping in InstCombiner. Patch adds additional check that all users
of the load instructions are stores and then replaces all uses of load
instruction by the new one with new type.

Reviewers: RKSimon, spatel, majnemer

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D41072

llvm-svn: 320496

d0c3aeb2

Revert "[InstCombine] Fix PR35618: Instcombine hangs on single minmax load bitcast." · c9f1d2e4
Alexey Bataev authored Dec 12, 2017
```
This reverts commit r320488 because of the failed asan buildbots.

llvm-svn: 320490
```
c9f1d2e4

[InstCombine] Fix PR35618: Instcombine hangs on single minmax load bitcast. · fb68c48a

Alexey Bataev authored Dec 12, 2017

Summary:
If we have pattern `store (load(bitcast(select (cmp(V1, V2), &V1,
&V2)))), bitcast)`, but the load is used in other instructions, it leads
to looping in InstCombiner. Patch adds additional check that all users
of the load instructions are stores and then replaces all uses of load
instruction by the new one with new type.

Reviewers: RKSimon, spatel, majnemer

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D41072

llvm-svn: 320488

fb68c48a

Revert "[InstCombine] Fix PR35618: Instcombine hangs on single minmax load bitcast." · ca2a8cea
Alexey Bataev authored Dec 12, 2017
```
This reverts commit r320483 because of the failed Windows buildbots.

llvm-svn: 320485
```
ca2a8cea

[InstCombine] Fix PR35618: Instcombine hangs on single minmax load bitcast. · 1daef8a6

Alexey Bataev authored Dec 12, 2017

If we have pattern `store (load(bitcast(select (cmp(V1, V2), &V1,
&V2)))), bitcast)`, but the load is used in other instructions, it leads
to looping in InstCombiner. Patch adds additional check that all users
of the load instructions are stores and then replaces all uses of load
instruction by the new one with new type.

Reviewers: RKSimon, spatel, majnemer

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D41072

llvm-svn: 320483

1daef8a6

[InstCombineLoadStoreAlloca] Optimize stores to GEP off null base · 2dd9835f

Anna Thomas authored Dec 12, 2017

Summary:
Currently, in InstCombineLoadStoreAlloca, we have simplification
rules for the following cases:
  1. load off a null
  2. load off a GEP with null base
  3. store to a null

This patch adds support for the fourth case which is store into a
GEP with null base. Since this is UB as well (and directly analogous to
the load off a GEP with null base), we can substitute the stored val
with undef in instcombine, so that SimplifyCFG can optimize this code
into unreachable code.

Note: Right now, SimplifyCFG hasn't been taught about optimizing
this to unreachable and adding an llvm.trap (this is already done for
the above 3 cases).

Reviewers: majnemer, hfinkel, sanjoy, davide

Reviewed by: sanjoy, davide

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D41026

llvm-svn: 320480

2dd9835f

Revert r320049, r320014 and r319894 · d63560b8

Igor Laevsky authored Dec 12, 2017

They were causing failures of the piglit OpenGL tests with AMD GPUs using the
Mesa radeonsi driver.

llvm-svn: 320466

d63560b8

[LV] Ignore the cost of values that will not appear in the vectorized loop · 927b3160

Dorit Nuzman authored Dec 12, 2017

VecValuesToIgnore holds values that will not appear in the vectorized loop.
We should therefore ignore their cost when VF > 1.
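
The rule can be sketched with a toy cost table. `CostTable` and `expectedLoopCost` are hypothetical, and the per-instruction costs are made-up inputs rather than anything the loop vectorizer computes.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical per-instruction cost table: (value name, cost).
using CostTable = std::vector<std::pair<std::string, int>>;

int expectedLoopCost(const CostTable &Costs,
                     const std::set<std::string> &VecValuesToIgnore,
                     unsigned VF) {
  int Total = 0;
  for (const auto &Entry : Costs) {
    // Values that will not appear in the vectorized loop cost nothing
    // when VF > 1; the scalar loop (VF == 1) still pays for them.
    if (VF > 1 && VecValuesToIgnore.count(Entry.first))
      continue;
    Total += Entry.second;
  }
  return Total;
}
```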

Differential Revision: https://reviews.llvm.org/D40883

llvm-svn: 320463

927b3160

[CallSiteSplitting] Don't let debug intrinsics affect optimizations · 66cf3837

Mikael Holmen authored Dec 12, 2017

Summary:
This solves PR35616.

We don't want the compiler to generate different code when we compile
with/without -g, so we now ignore debug intrinsics when determining if
the optimization can trigger or not.

Reviewers: junbuml

Subscribers: davide, JDevlieghere, llvm-commits

Differential Revision: https://reviews.llvm.org/D41068

llvm-svn: 320460

66cf3837

Dec 11, 2017

LSR: Check more intrinsic pointer operands · 3e268cc0
Matt Arsenault authored Dec 11, 2017
```
llvm-svn: 320424
```
3e268cc0

Revert r320407 "[InstCombine] Fix PR35618: Instcombine hangs on single minmax load bitcast." · 27d1c00c

Hans Wennborg authored Dec 11, 2017

The tests fail (opt asserts) on Windows.

> Summary:
> If we have pattern `store (load(bitcast(select (cmp(V1, V2), &V1,
> &V2)))), bitcast)`, but the load is used in other instructions, it leads
> to looping in InstCombiner. Patch adds additional check that all users
> of the load instructions are stores and then replaces all uses of load
> instruction by the new one with new type.
>
> Reviewers: RKSimon, spatel, majnemer
>
> Subscribers: llvm-commits
>
> Differential Revision: https://reviews.llvm.org/D41072

llvm-svn: 320421

27d1c00c

[InstCombine] Fix PR35618: Instcombine hangs on single minmax load bitcast. · ec128ace

Alexey Bataev authored Dec 11, 2017

Summary:
If we have pattern `store (load(bitcast(select (cmp(V1, V2), &V1,
&V2)))), bitcast)`, but the load is used in other instructions, it leads
to looping in InstCombiner. Patch adds additional check that all users
of the load instructions are stores and then replaces all uses of load
instruction by the new one with new type.

Reviewers: RKSimon, spatel, majnemer

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D41072

llvm-svn: 320407

ec128ace

Dec 10, 2017

[SimplifyLibCalls] propagate FMF when folding pow(x, -1.0) call · b23e1481
Sanjay Patel authored Dec 10, 2017
```
Follow-up for a bug that's similar to:
https://bugs.llvm.org/show_bug.cgi?id=35601

llvm-svn: 320312
```
b23e1481

[InstCombine] add test for pow(x, -1.0) with FMF; NFC · ac9cbd6c
Sanjay Patel authored Dec 10, 2017
```
llvm-svn: 320311
```
ac9cbd6c

[SimplifyLibCalls] propagate FMF when folding pow(x, 2.0) call (PR35601) · 09ec3434
Sanjay Patel authored Dec 10, 2017
```
This should fix the larger problem with sqrt shown in:
https://bugs.llvm.org/show_bug.cgi?id=35601

llvm-svn: 320310
```
09ec3434

[InstCombine] add test for pow(x, 2.0) with FMF; NFC · 719bc64b
Sanjay Patel authored Dec 10, 2017
```
llvm-svn: 320309
```
719bc64b

[SCEV] Fix wrong Equal predicate created in getAddRecForPhiWithCasts · 5809e705

Dorit Nuzman authored Dec 10, 2017

CreateAddRecFromPHIWithCastsImpl() adds an IncrementNUSW overflow predicate
which allows the PSCEV rewriter to rewrite this scev expression:
 (zext i8 {0, + , (trunc i32 step to i8)} to i32)
into
 {0, +, (sext i8 (trunc i32 step to i8) to i32)}

But then it adds the wrong Equal predicate:
 %step == (zext i8 (trunc i32 %step to i8) to i32).
instead of:
 %step == (sext i8 (trunc i32 %step to i8) to i32)

This is fixed here.
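
A small arithmetic check shows why the two predicates are not interchangeable. The helpers below emulate `trunc i32 to i8`, `zext`, and `sext` on plain integers; their names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Emulate "trunc i32 V to i8": keep the low 8 bits.
std::uint8_t truncToI8(std::int32_t V) { return static_cast<std::uint8_t>(V); }

// Emulate "zext i8 V to i32": the upper bits become zero.
std::int32_t zextI8(std::uint8_t V) { return V; }

// Emulate "sext i8 V to i32": the upper bits copy the sign bit.
std::int32_t sextI8(std::uint8_t V) {
  return V >= 128 ? static_cast<std::int32_t>(V) - 256
                  : static_cast<std::int32_t>(V);
}

// %step == (zext i8 (trunc i32 %step to i8) to i32)
bool zextPredicate(std::int32_t Step) {
  return Step == zextI8(truncToI8(Step));
}

// %step == (sext i8 (trunc i32 %step to i8) to i32)
bool sextPredicate(std::int32_t Step) {
  return Step == sextI8(truncToI8(Step));
}
```

With `Step = -1`, the sext predicate holds and the zext one does not; with `Step = 200` it is the other way round. Since the rewritten expression uses the sign-extended step, the guarding predicate has to be the sext form.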

Differential Revision: https://reviews.llvm.org/D40641

llvm-svn: 320298

5809e705

[InstCombine] Fix SimplifyDemandedUseBits SHL handling (PR35515) · a42a5425
Simon Pilgrim authored Dec 09, 2017
```
Don't assume that the pattern matched SRL can be cast to an Instruction (might be ConstExpr etc.)

llvm-svn: 320270
```
a42a5425

Dec 09, 2017

[InlineFunction] Set debug loc for call to forward varargs. · c5bebffe

Florian Hahn authored Dec 09, 2017

Reviewers: aprantl, dblaikie, rnk

Reviewed By: rnk

Subscribers: eraman, llvm-commits, JDevlieghere

Differential Revision: https://reviews.llvm.org/D40432

llvm-svn: 320252

c5bebffe

Hardware-assisted AddressSanitizer (llvm part). · c667c1f4

Evgeniy Stepanov authored Dec 09, 2017

Summary:
This is LLVM instrumentation for the new HWASan tool. It is basically
a stripped down copy of ASan at this point, w/o stack or global
support. Instrumentation adds a global constructor + runtime callbacks
for every load and store.

HWASan comes with its own IR attribute.

A brief design document can be found in
clang/docs/HardwareAssistedAddressSanitizerDesign.rst (submitted earlier).

Reviewers: kcc, pcc, alekseyshl

Subscribers: srhines, mehdi_amini, mgorny, javed.absar, eraman, llvm-commits, hiraditya

Differential Revision: https://reviews.llvm.org/D40932

llvm-svn: 320217

c667c1f4

Dec 08, 2017

[Debugify] Add a pass to test debug info preservation · 195dfd10

Vedant Kumar authored Dec 08, 2017

The Debugify pass synthesizes debug info for IR. It's paired with a
CheckDebugify pass which determines how much of the original debug info
is preserved. These passes make it easier to create targeted tests for
debug info preservation.

Here is the Debugify algorithm:

  NextLine = 1
  for (Instruction &I : M)
    attach DebugLoc(NextLine++) to I

  NextVar = 1
  for (Instruction &I : M)
    if (canAttachDebugValue(I))
      attach dbg.value(NextVar++) to I

The CheckDebugify pass expects contiguous ranges of DILocations and
DILocalVariables. If it fails to find all of the expected debug info, it
prints a specific error to stderr which can be FileChecked.
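
The numbering scheme and the contiguity check can be modelled on a plain vector of instructions. This is a toy sketch, not the pass: `Inst`, `debugify`, and `missingLines` are hypothetical, and a dropped debug location is represented by line 0.

```cpp
#include <cassert>
#include <set>
#include <vector>

// Hypothetical instruction record; Line/Var of 0 means "no debug info".
struct Inst {
  bool CanAttachDebugValue = false;
  int Line = 0;
  int Var = 0;
};

// Give every instruction a fresh line, and a fresh variable where allowed.
void debugify(std::vector<Inst> &M) {
  int NextLine = 1;
  for (Inst &I : M)
    I.Line = NextLine++;
  int NextVar = 1;
  for (Inst &I : M)
    if (I.CanAttachDebugValue)
      I.Var = NextVar++;
}

// Report which of the expected lines 1..NumLines are no longer present.
std::vector<int> missingLines(const std::vector<Inst> &M, int NumLines) {
  std::set<int> Seen;
  for (const Inst &I : M)
    Seen.insert(I.Line);
  std::vector<int> Missing;
  for (int L = 1; L <= NumLines; ++L)
    if (!Seen.count(L))
      Missing.push_back(L);
  return Missing;
}
```

A transformation that drops the location of the second instruction would then show up as a single missing line, 2.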

This was discussed on llvm-dev in the thread:
"Passes to add/validate synthetic debug info"

Differential Revision: https://reviews.llvm.org/D40512

llvm-svn: 320202

195dfd10

[CodeExtractor] Add debug locations for new call and branch instrs. · e5089e2e

Florian Hahn authored Dec 08, 2017

Summary:
If a partially inlined function has debug info, we have to add debug
locations to the call instruction calling the outlined function.
We use the debug location of the first instruction in the outlined
function, as the introduced call transfers control to this statement and
there is no other equivalent line in the source code.

We also use the same debug location for the branch instruction added
to jump from artificial entry block for the outlined function, which just
jumps to the first actual basic block of the outlined function.

Reviewers: davide, aprantl, rriddle, dblaikie, danielcdh, wmi

Reviewed By: aprantl, rriddle, danielcdh

Subscribers: eraman, JDevlieghere, llvm-commits

Differential Revision: https://reviews.llvm.org/D40413

llvm-svn: 320199

e5089e2e

Revert r320104: infinite loop profiling bug fix · d91057bf

Xinliang David Li authored Dec 08, 2017

Causes unexpected memory issue with New PM this time.
The new PM invalidates BPI but not BFI, leaving the
reference to BPI from BFI invalid.

Abandon this patch.  There is a more general solution
which also handles runtime infinite loop (but not statically).

llvm-svn: 320180

d91057bf

[InstCombine] PR35354: Convert store(bitcast, load bitcast (select (Cond, &V1,... · ec95c6cc

Alexey Bataev authored Dec 08, 2017

[InstCombine] PR35354: Convert store(bitcast, load bitcast (select (Cond, &V1, &V2))  --> store (, load (select(Cond, load &V1, load &V2)))

Summary:
If we have the code like this:
```
float a, b;
a = std::max(a, b);
```
it is converted into something like this:
```
%call = call dereferenceable(4) float* @_ZSt3maxIfERKT_S2_S2_(float* nonnull dereferenceable(4) %a.addr, float* nonnull dereferenceable(4) %b.addr)
%1 = bitcast float* %call to i32*
%2 = load i32, i32* %1, align 4
%3 = bitcast float* %a.addr to i32*
store i32 %2, i32* %3, align 4
```
After inlining, this code is converted to the following:
```
%1 = load float, float* %a.addr
%2 = load float, float* %b.addr
%cmp.i = fcmp fast olt float %1, %2
%__b.__a.i = select i1 %cmp.i, float* %a.addr, float* %b.addr
%3 = bitcast float* %__b.__a.i to i32*
%4 = load i32, i32* %3, align 4
%5 = bitcast float* %arrayidx to i32*
store i32 %4, i32* %5, align 4

```
This pattern is not recognized as a minmax pattern.
The patch solves this problem by converting the sequence
```
store (bitcast, (load bitcast (select ((cmp V1, V2), &V1, &V2))))
```
to a sequence
```
store (,load (select((cmp V1, V2), &V1, &V2)))
```
After this, the code is recognized as a minmax pattern.

Reviewers: RKSimon, spatel

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D40304

llvm-svn: 320157

ec95c6cc

Dec 07, 2017

[PGO] detect infinite loop and form MST properly · 4b0027f6
Xinliang David Li authored Dec 07, 2017
```
Differential Revision: http://reviews.llvm.org/D40873

llvm-svn: 320104
```
4b0027f6