- May 14, 2018
-
Nicola Zaghen authored
Remove trailing whitespace. llvm-svn: 332220
-
Craig Topper authored
llvm-svn: 332206
-
- May 13, 2018
-
Craig Topper authored
llvm-svn: 332202
-
Craig Topper authored
llvm-svn: 332198
-
Craig Topper authored
llvm-svn: 332187
-
Craig Topper authored
[X86] Remove and autoupgrade cvtsi2ss/cvtsi2sd intrinsics to match what clang has used for a very long time. llvm-svn: 332186
-
- May 12, 2018
-
Michael Zolotukhin authored
Stage3/stage4 bootstrap miscompares should be fixed by a non-determinism fix in IDF (r332167). This reverts commit r330446. llvm-svn: 332168
-
Sergey Dmitriev authored
This is a CodeExtractor improvement which adds support for extracting blocks that have exception handling constructs, if that is legal to do. CodeExtractor performs validation checks to ensure that extraction is legal when it finds invoke instructions or EH pads (landingpad, catchswitch, or cleanuppad) in the blocks to be extracted.

I have also added an option to allow extraction of blocks with alloca instructions, but no validation is done for allocas. The CodeExtractor caller has to validate this itself before allowing alloca instructions to be extracted. By default, allocas are still not allowed in extraction blocks.

Differential Revision: https://reviews.llvm.org/D45904

llvm-svn: 332151
-
- May 11, 2018
-
Craig Topper authored
llvm-svn: 332146
-
Artem Belevich authored
Let the separate-const-offset-from-gep pass handle trunc() when it calculates the constant offset relative to the base. The pass itself may insert trunc() instructions when it canonicalises array indices to pointer-size integers, and it needs to handle trunc() in order to evaluate the offset.

Differential Revision: https://reviews.llvm.org/D46732

llvm-svn: 332142
-
Daniel Neilson authored
Summary:
This change adds handling of the atomic memset intrinsic to the code path that simplifies the regular memset. In practice this means that we will now also expand a small constant-length atomic memset into a single unordered atomic store.

Reviewers: apilipenko, skatkov, mkazantsev, anna, reames

Reviewed By: reames

Subscribers: reames, llvm-commits

Differential Revision: https://reviews.llvm.org/D46660

llvm-svn: 332132
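
A conceptual C++ illustration of the expansion described above (not LLVM code; the function and constants are purely illustrative, and std::memory_order_relaxed stands in for LLVM's weaker "unordered" ordering, which has no direct C++ counterpart): a 4-byte, element-size-1 atomic memset of a constant byte behaves like one atomic 32-bit store.

  #include <atomic>
  #include <cstdint>

  void smallAtomicMemset(std::atomic<uint32_t> *dst, uint8_t byte) {
    uint32_t splat = 0x01010101u * byte;          // replicate the byte across the word
    dst->store(splat, std::memory_order_relaxed); // one atomic store instead of a call
  }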
-
David Bolvansky authored
Reviewers: spatel, efriedma, majnemer, rja, bkramer

Reviewed By: rja, bkramer

Subscribers: mstorsjo, rja, llvm-commits

Differential Revision: https://reviews.llvm.org/D46285

llvm-svn: 332110
-
Davide Italiano authored
Phi nodes can reside in live blocks, but one of their incoming arguments can come from a dead block. Dead blocks and reassociate don't play nice together. In fact, reassociate performs an RPO as a first step to avoid processing dead blocks. The reason why Reassociate might not reach a fixpoint when examining dead blocks is that the following:

  %xor0 = xor i16 %xor1, undef
  %xor1 = xor i16 %xor0, undef

is perfectly valid LLVM IR (if it appears in a dead block), so the worklist algorithm keeps pushing the two instructions back for reexamination. Note that this is not Reassociate's fault, at least not entirely; it's LLVM that has a weird definition of dominance.

Fixes PR37390.

llvm-svn: 332100
-
Daniel Neilson authored
Summary:
This change reworks the handling of atomic memcpy within the instcombine pass. Previously, a constant-length atomic memcpy would be lowered into loads & stores as long as no more than 16 load/store pairs were created. This is quite different from the lowering done for a non-atomic memcpy, which only ever lowers into a single load/store pair of no more than 8 bytes. Larger constant-sized memcpy calls are expanded to load/stores in later passes, such as SelectionDAG lowering.

In this change the behaviour for atomic memcpy is unified with non-atomic memcpy; atomic memcpy is now treated in the same way as non-atomic memcpy has always been. We leave it to later passes to lower longer-length atomic memcpy calls.

Due to the structure of the pass's handling of memtransfer intrinsics, this change also gives us handling of atomic memmove that we did not previously have.

Reviewers: apilipenko, skatkov, mkazantsev, anna, reames

Reviewed By: reames

Subscribers: reames, llvm-commits

Differential Revision: https://reviews.llvm.org/D46658

llvm-svn: 332093
-
Brian Gesiak authored
Summary:
https://bugs.llvm.org/show_bug.cgi?id=34897 demonstrates an incorrect coroutine frame allocation elision in the coro-elide pass. The elision is performed on the basis that the SSA variables from all llvm.coro.begin are directly referenced in subsequent llvm.coro.destroy instructions. However, this ignores the fact that the function may exit through paths that do not run these destroy instructions.

In the sample program from PR34897, for example, the llvm.coro.destroy instruction is only executed in exception handling code. When the coroutine function exits normally, llvm.coro.destroy is not called. Eliding the allocation in this case causes a subsequent reference to the coroutine handle from outside of the function to access freed memory.

To fix the issue, when finding an llvm.coro.destroy for each llvm.coro.begin, only consider llvm.coro.destroy that are executed along non-exceptional paths.

Test Plan:
1. Download the sample program from https://bugs.llvm.org/show_bug.cgi?id=34897, compile it with `clang++ -fcoroutines-ts -stdlib=libc++ -std=c++1z -O2`, and run it. It should print `"run1\ncheck1\nrun2\ncheck2"` and then exit successfully.
2. Compile https://godbolt.org/g/mCKfnr and confirm it is still optimized to a single instruction, 'return 1190'.
3. `check-llvm`

Reviewers: rsmith, GorNishanov, eric_niebler

Reviewed By: GorNishanov

Subscribers: andrewrk, lewissbaker, EricWF, llvm-commits

Differential Revision: https://reviews.llvm.org/D43242

llvm-svn: 332077
-
Kostya Serebryany authored
llvm-svn: 332072
-
Kamil Rytarowski authored
Summary:
Ship kNetBSD_ShadowOffset32 set to 1ULL << 30. This is in preparation for the amd64 kernel runtime.

Sponsored by <The NetBSD Foundation>

Reviewers: vitalybuka, joerg, kcc

Reviewed By: vitalybuka

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D46724

llvm-svn: 332069
-
Wei Mi authored
We found that current sampleFDO had a performance issue while triaging a regression. For a callsite with an inline instance in the profile, even if the hot callsite inliner cannot inline it, it may still execute enough times and should not be treated as cold in the regular inliner later. However, currently if such a callsite is not inlined by the hot callsite inliner, and the BB where the callsite is located doesn't get samples from other instructions inside of it, the callsite will have no profile metadata annotated. In the regular inliner's cost analysis, if the callsite has no profile annotated and its caller has profile information, it will be treated as cold.

The fix changes the isCallsiteHot check and chooses to compare CallsiteTotalSamples with the hot cutoff value computed by ProfileSummaryInfo.

Differential Revision: https://reviews.llvm.org/D45377

llvm-svn: 332058
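
A minimal sketch of the check described above; the helper function and its plumbing are illustrative, not the exact SampleProfile code, and only the comparison against the ProfileSummaryInfo hot cutoff reflects the change.

  #include "llvm/Analysis/ProfileSummaryInfo.h"
  #include <cstdint>
  using namespace llvm;

  static bool callsiteIsHot(uint64_t CallsiteTotalSamples, ProfileSummaryInfo &PSI) {
    // Compare the callsite's total sample count against the profile-summary hot
    // cutoff instead of requiring the callsite to carry its own annotation.
    return PSI.isHotCount(CallsiteTotalSamples);
  }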
-
Vedant Kumar authored
This commit adds a wrapper for std::distance() which works with ranges. As it would be a common case to write `distance(predecessors(BB))`, this also introduces `pred_size()` and `succ_size()` helpers to make that easier to write.

Differential Revision: https://reviews.llvm.org/D46668

llvm-svn: 332057
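
A short usage sketch, assuming the helpers sit alongside predecessors()/successors() in llvm/IR/CFG.h; the wrapper function itself is illustrative, not part of the patch.

  #include "llvm/IR/BasicBlock.h"
  #include "llvm/IR/CFG.h"
  using namespace llvm;

  static bool isSimpleJoin(const BasicBlock *BB) {
    // Previously written as std::distance(pred_begin(BB), pred_end(BB)) > 1, etc.
    return pred_size(BB) > 1 && succ_size(BB) == 1;
  }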
-
Craig Topper authored
The bitwidth of the operation should always be wider than the result width of the truncate since we don't recurse through any width changing operations. llvm-svn: 332055
-
- May 10, 2018
-
Martin Storsjö authored
This reverts commit SVN r331889, which could trigger failed assertions for cases where the snprintf function is declared with a vaguely differing signature (e.g. being defined as static inline), see PR37408. llvm-svn: 332043
-
Sanjay Patel authored
This is similar to what we do for integer min/max with 'not' ops (rL321882). This should fix:
https://bugs.llvm.org/show_bug.cgi?id=37404
https://bugs.llvm.org/show_bug.cgi?id=37405

llvm-svn: 332031
-
Omer Paparo Bivas authored
Differential Revision: https://reviews.llvm.org/D46704

Change-Id: Ifabcbe431a2169743b3cc310f2a34fd706f13f02

llvm-svn: 332026
-
Chandler Carruth authored
This code can just test whether blocks are *in* the loop, which we already have a dedicated set tracking in the loop itself. llvm-svn: 332004
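
A sketch of the membership test referred to above: the Loop already tracks its own block set, so no manual walk is needed. The helper function here is illustrative, not code from the commit.

  #include "llvm/Analysis/LoopInfo.h"
  using namespace llvm;

  static bool blockIsInLoop(const Loop *L, const BasicBlock *BB) {
    return L->contains(BB); // set lookup maintained by LoopInfo
  }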
-
Daniel Neilson authored
Summary:
This change teaches DSE that the atomic memory intrinsics can be overwritten partially in the same way as the non-atomic forms. Specifically, the atomic memcpy & memset can be shortened at the end, and the atomic memset can be shortened at the beginning, if they are partially overwritten by later stores.

Reviewers: mkazantsev, skatkov, apilipenko, efriedma, rsmith, spatel, filcab, sanjoy

Reviewed By: efriedma

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D45584

llvm-svn: 331991
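
A plain-C++ illustration of the shortening idea for the non-atomic case that DSE already handled; the commit extends the same trimming to the atomic memcpy/memset intrinsics. Function name and constants are purely illustrative.

  #include <cstdint>
  #include <cstring>

  void initAndOverwrite(uint8_t *buf /* at least 32 bytes */) {
    std::memset(buf, 0, 32);              // DSE can trim this range to 24 bytes ...
    std::memcpy(buf + 24, "ABCDEFGH", 8); // ... because this overwrites its last 8 bytes
  }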
-
whitequark authored
Fixes bug https://bugs.llvm.org/show_bug.cgi?id=37339. InlineAsm is only uniqued if the FunctionTypes are exactly the same, while cmpTypes() for example considers all pointer types in the default address space to be the same. For this reason the end of cmpInlineAsm() can be reached. This patch replaces the unreachable assertion with a check that the function types are not identical.

Differential Revision: https://reviews.llvm.org/D46495

Reviewers: jfb

llvm-svn: 331990
-
Benjamin Kramer authored
Put in a conservatively correct estimate for now. Avoids miscompiling clang in FDO mode. This is really tricky to trigger in reality, as basically all interesting cases will be folded away by computeKnownBits earlier; I was unable to find a reasonably small test case.

llvm-svn: 331975
-
Craig Topper authored
llvm-svn: 331948
-
Craig Topper authored
Fix a similar line in the same function. llvm-svn: 331947
-
Philip Reames authored
llvm-svn: 331944
-
Sanjay Patel authored
This is a follow-up to D45986. As suggested there, we should match the "all-bits-set" pattern in addition to "any-bits-set". This was a little more complicated than I initially thought it would be because the "and 1" instruction can be anywhere in the chain. Hopefully, the code comments make that logic understandable, but if you see a way to simplify or improve that, it's most appreciated.

This transforms patterns that emerge from bitfield tests as seen in PR37098:
https://bugs.llvm.org/show_bug.cgi?id=37098

I think it would also help reduce the large test from D46336 / D46595, but we need something to reassociate that case to the forms we're expecting here first.

Differential Revision: https://reviews.llvm.org/D46649

llvm-svn: 331937
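
An illustrative source-level shape of the "all-bits-set" idea; the constants are hypothetical and the exact and/icmp chains handled are spelled out in the patch, not here.

  static bool bothFlagsSet(unsigned x) {
    // Two single-bit tests that must both hold ...
    return (x & 0x4) != 0 && (x & 0x10) != 0;
    // ... which is equivalent to the single test (x & 0x14) == 0x14.
  }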
-
Philip Reames authored
The previous handling for guard widening in InstCombine was extremely restrictive. In particular, it didn't handle the common case where we had two guards separated by a single icmp. Handle this by scanning through a small fixed window of instructions to find the next guard if needed.

Differential Revision: https://reviews.llvm.org/D46203

llvm-svn: 331935
-
Benjamin Kramer authored
[InstCombine] Teach SimplifyDemandedBits that udiv doesn't demand low dividend bits that are zero in the divisor

This is safe as long as the udiv is not exact. The pattern is not common in C++ code, but comes up all the time in code generated by XLA's GPU backend.

Differential Revision: https://reviews.llvm.org/D46647

llvm-svn: 331933
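
A hypothetical illustration of the demanded-bits fact, not code from the patch: when the divisor is a multiple of 8, the low three bits of the dividend can never change the quotient, so a mask of those bits is dead.

  static unsigned quotient(unsigned x) {
    // Foldable to x / 24u, provided the udiv is not marked 'exact'.
    return (x & ~7u) / 24u;
  }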
-
- May 09, 2018
-
David Bolvansky authored
Reviewers: spatel, efriedma, majnemer, rja, bkramer

Reviewed By: rja, bkramer

Subscribers: rja, llvm-commits

Differential Revision: https://reviews.llvm.org/D46285

llvm-svn: 331889
-
Krzysztof Parzyszek authored
It's possible to have a vector of 256 bytes in HVX code on Hexagon (vector pair in 128-byte mode). llvm-svn: 331885
-
Benjamin Kramer authored
This reverts commit r331849. It miscompiles

  snprintf(buf, sizeof(buf), "%s", "any constant string");

into

  memcpy(buf, "%s", sizeof("any constant string"));

llvm-svn: 331866
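
A sketch of why the reverted fold is a miscompile; the buffer size and wrapper function are chosen for illustration only.

  #include <cstdio>

  void copyTruncated(char (&buf)[8]) {
    // Intended behaviour: write at most 7 characters plus a terminating NUL.
    std::snprintf(buf, sizeof(buf), "%s", "any constant string");
    // The bad fold instead emitted memcpy(buf, "%s", sizeof("any constant string")),
    // copying from the format string and writing 20 bytes into an 8-byte buffer.
  }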
-
Bjorn Pettersson authored
Summary:
MergedLoadStoreMotion::mergeStores is using some heuristics to limit the number of stores that it tries to sink (see MagicCompileTimeControl in MergedLoadStoreMotion.cpp). The heuristic involves counting the number of instructions in one of the basic blocks that is part of the transformation.

We now ignore dbg intrinsics when counting instructions for the MagicCompileTimeControl heuristic. This is to make sure that the number of stores that are sunk doesn't depend on the amount of debug information (whether -g is used or not).

Reviewers: Gerolf, davide, majnemer

Reviewed By: davide

Subscribers: dberlin, bjope, aprantl, JDevlieghere, llvm-commits

Differential Revision: https://reviews.llvm.org/D46600

llvm-svn: 331852
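
A minimal sketch of the counting change; the loop below is illustrative, not the exact MergedLoadStoreMotion code, and shows only the idea that debug intrinsics no longer count toward the limit.

  #include "llvm/IR/BasicBlock.h"
  #include "llvm/IR/IntrinsicInst.h"
  using namespace llvm;

  static unsigned countRealInstructions(const BasicBlock &BB) {
    unsigned N = 0;
    for (const Instruction &I : BB)
      if (!isa<DbgInfoIntrinsic>(I)) // skip llvm.dbg.* so -g doesn't change the count
        ++N;
    return N;
  }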
-
David Bolvansky authored
Reviewers: spatel, efriedma, majnemer, rja

Reviewed By: rja

Subscribers: rja, llvm-commits

Differential Revision: https://reviews.llvm.org/D46285

llvm-svn: 331849
-
Shiva Chen authored
In order to set breakpoints on labels and list source code around labels, we need to collect debug information for labels, i.e., the label name, the function the label belongs to, the line number in the file, and the address where the label is located. In order to keep this information in LLVM IR and to allow the backend to generate debug information correctly, we create a new kind of metadata for labels, DILabel. The format of DILabel is

  !DILabel(scope: !1, name: "foo", file: !2, line: 3)

We hope to keep as much debug information as possible even when the code is optimized. So, we create a new kind of intrinsic for label metadata to avoid the metadata being eliminated along with its basic block. The intrinsic will keep existing as long as we keep it from being optimized out. The format of the intrinsic is

  llvm.dbg.label(metadata !1)

It has only one argument, which is the DILabel metadata. The intrinsic will follow the label immediately. The backend can get the label metadata through the intrinsic's parameter.

We also create DIBuilder API for labels to be used by the frontend. The frontend can use createLabel() to allocate DILabel objects, and use insertLabel() to insert the llvm.dbg.label intrinsic in LLVM IR.

Differential Revision: https://reviews.llvm.org/D45024

Patch by Hsiangkai Wang.

llvm-svn: 331841
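
A rough frontend-side sketch based on the description above. The createLabel() and insertLabel() names come from the commit message, but their parameter lists here are assumptions inferred from the DILabel fields (scope, name, file, line); consult the patch for the exact signatures.

  #include "llvm/IR/DIBuilder.h"
  #include "llvm/IR/Instruction.h"
  using namespace llvm;

  void attachLabelDebugInfo(DIBuilder &DIB, DIScope *Scope, DIFile *File,
                            const DILocation *Loc, Instruction *InsertBefore) {
    // Parameter order assumed from the !DILabel format shown above.
    DILabel *L = DIB.createLabel(Scope, "foo", File, /*LineNo=*/3);
    DIB.insertLabel(L, Loc, InsertBefore); // emits the llvm.dbg.label intrinsic
  }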
-
Heejin Ahn authored
Summary:
The current LowerInvoke pass cannot handle invoke instructions with a funclet bundle operand. The order of operands for an invoke instruction is {call arguments, callee, funclet operand (if any), normal dest, unwind dest}. The current code assumes there is no funclet operand and incorrectly includes a funclet operand into the call arguments.

Reviewers: rnk

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D46242

llvm-svn: 331832
-