Commits · 9adbb6c468c1c4010727a1bfb8e3959eea11f5c7 · Roger Ferrer / llvm-epi-0.8

Feb 03, 2020

[mlir] Fix link to 'Getting started with MLIR' · 9adbb6c4

Marius Brehler authored Feb 03, 2020

The link in the toy example pointed to the 'tensorflow/mlir' repo and is
replaced with https://mlir.llvm.org.

Differential Revision: https://reviews.llvm.org/D73770

9adbb6c4

[ARM,MVE] Fix vreinterpretq in big-endian mode. · 961530fd

Simon Tatham authored Feb 03, 2020

Summary:
In big-endian MVE, the simple vector load/store instructions (i.e.
both contiguous and non-widening) don't all store the bytes of a
register to memory in the same order: it matters whether you did a
VSTRB.8, VSTRH.16 or VSTRW.32. Put another way, the in-register
formats of different vector types relate to each other in a different
way from the in-memory formats.

So, if you want to 'bitcast' or 'reinterpret' one vector type as
another, you have to carefully specify which you mean: did you want to
reinterpret the //register// format of one type as that of the other,
or the //memory// format?

The ACLE `vreinterpretq` intrinsics are specified to reinterpret the
register format. But I had implemented them as LLVM IR bitcast, which
is specified for all types as a reinterpretation of the memory format.
So a `vreinterpretq` intrinsic, applied to values already in registers,
would code-generate incorrectly if compiled big-endian: instead of
emitting no code, it would emit a `vrev`.
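
To make the distinction concrete, here is a small scalar model of a 128-bit register in big-endian mode. It is a sketch, not LLVM code. It assumes that the register format places lane i of an N-bit vector type at bits [N*i, N*i+N), and that a big-endian 32-bit vector load reads each lane from memory most-significant byte first. All names (`Reg`, `storeBytes`, `loadWordsBigEndian`, and so on) are invented for illustration.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Model of a 128-bit register: Reg[0] holds bits 7..0, Reg[15] bits 127..120.
using Reg = std::array<uint8_t, 16>;

// Byte-vector store (VSTRB.8 style): byte lane i goes to address i.
std::array<uint8_t, 16> storeBytes(const Reg &R) { return R; }

// Big-endian word-vector load (VLDRW.32 style): each lane is read MSB first.
std::array<uint32_t, 4> loadWordsBigEndian(const std::array<uint8_t, 16> &Mem) {
  std::array<uint32_t, 4> Lanes{};
  for (std::size_t L = 0; L < 4; ++L)
    for (std::size_t B = 0; B < 4; ++B)
      Lanes[L] = (Lanes[L] << 8) | Mem[4 * L + B];
  return Lanes;
}

// Register-format view: 32-bit lane L is bits [32L, 32L+32) of the register.
std::array<uint32_t, 4> registerLanes32(const Reg &R) {
  std::array<uint32_t, 4> Lanes{};
  for (std::size_t L = 0; L < 4; ++L)
    for (std::size_t B = 0; B < 4; ++B)
      Lanes[L] |= static_cast<uint32_t>(R[4 * L + B]) << (8 * B);
  return Lanes;
}

// Memory-format view (the bitcast meaning): store as bytes, reload as words.
std::array<uint32_t, 4> memoryReinterpret8To32(const Reg &R) {
  return loadWordsBigEndian(storeBytes(R));
}

uint32_t byteReverse32(uint32_t V) {
  return (V >> 24) | ((V >> 8) & 0x0000FF00u) | ((V << 8) & 0x00FF0000u) |
         (V << 24);
}

// In this model the two views differ by a byte reversal within each lane,
// which is the extra work a register-format reinterpretation avoids.
void checkViewsDifferByByteReversal(const Reg &R) {
  const auto RegView = registerLanes32(R);
  const auto MemView = memoryReinterpret8To32(R);
  for (std::size_t L = 0; L < 4; ++L)
    assert(MemView[L] == byteReverse32(RegView[L]));
}
```

With byte lanes holding 0x00 through 0x0F, the register-format view of lane 0 is 0x03020100, while the memory-format view is 0x00010203.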

To fix this, I've introduced a new IR intrinsic to perform a
register-format reinterpretation: `@llvm.arm.mve.vreinterpretq`. It's
implemented by a trivial isel pattern that expects the input in an
MQPR register, and just returns it unchanged.

In the clang codegen, I only emit this new intrinsic where it's
actually needed: I prefer a bitcast wherever it will have the right
effect, because LLVM understands bitcasts better. So we still generate
bitcasts in little-endian mode, and even in big-endian when you're
casting between two vector types with the same lane size.

For testing, I've moved all the codegen tests of vreinterpretq out
into their own file, so that they can have a different set of RUN
lines to check both big- and little-endian.

Reviewers: dmgreen, MarkMurrayARM, miyuki, ostannard

Reviewed By: dmgreen

Subscribers: kristof.beyls, hiraditya, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D73786

961530fd

[ARM,MVE] Add intrinsics for v[id]dupq and v[id]wdupq. · f8d4afc4

Simon Tatham authored Jan 31, 2020

Summary:
These instructions generate a vector of consecutive elements starting
from a given base value and incrementing by 1, 2, 4 or 8. The `wdup`
versions also wrap the values back to zero when they reach a given
limit value. The instruction updates the scalar base register so that
another use of the same instruction will continue the sequence from
where the previous one left off.
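
The behaviour described above can be sketched as a scalar model. This covers only the incrementing forms, for four 32-bit lanes, and assumes the base and limit are multiples of the step. The function names `modelVidup` and `modelViwdup` are hypothetical and are not the ACLE or IR intrinsic names.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <utility>

using Vec4 = std::array<uint32_t, 4>;

// Incrementing form: lanes are Base, Base+Step, Base+2*Step, ...
// Returns the generated vector together with the written-back base.
std::pair<Vec4, uint32_t> modelVidup(uint32_t Base, uint32_t Step) {
  assert(Step == 1 || Step == 2 || Step == 4 || Step == 8);
  Vec4 V{};
  for (auto &Lane : V) {
    Lane = Base;
    Base += Step;
  }
  return {V, Base};
}

// Wrapping form: as above, but the running value goes back to zero
// when it reaches Limit.
std::pair<Vec4, uint32_t> modelViwdup(uint32_t Base, uint32_t Limit,
                                      uint32_t Step) {
  assert(Step == 1 || Step == 2 || Step == 4 || Step == 8);
  Vec4 V{};
  for (auto &Lane : V) {
    Lane = Base;
    Base += Step;
    if (Base == Limit)
      Base = 0;
  }
  return {V, Base};
}
```

Feeding the returned base into the next call continues the sequence, which mirrors the write-back of the scalar base register.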

At the IR level, I've represented these instructions as a family of
target-specific intrinsics with two return values (the constructed
vector and the updated base). The user-facing ACLE API provides a set
of intrinsics that throw away the written-back base and another set
that receive it as a pointer so they can update it, plus the usual
predicated versions.

Because the intrinsics return two values (as do the underlying
instructions), the isel has to be done in C++.

This is the first family of MVE intrinsics that use the `imm_1248`
immediate type in the clang Tablegen framework, so naturally, I found
I'd given it the wrong C integer type. Also added some tests of the
check that the immediate has a legal value, because this is the first
time those particular checks have been exercised.

Finally, I also had to fix a bug in MveEmitter which failed an
assertion when I nested two `seq` nodes (the inner one used to extract
the two values from the pair returned by the IR intrinsic, and the
outer one put on by the predication multiclass).

Reviewers: dmgreen, MarkMurrayARM, miyuki, ostannard

Reviewed By: dmgreen

Subscribers: kristof.beyls, hiraditya, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D73357

f8d4afc4

[ARM,MVE] Add intrinsics for vdupq. · cf7e98e6

Simon Tatham authored Jan 31, 2020

Summary:
The unpredicated case of this is trivial: the clang codegen just makes
a vector splat of the input, and LLVM isel is already prepared to
handle that. For the predicated version, I've generated a `select`
between the same vector splat and the `inactive` input parameter, and
added new Tablegen isel rules to match that pattern into a predicated
`MVE_VDUP` instruction.
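
As a rough scalar picture of the predicated case, the following sketch performs a per-lane select between a splat of the scalar and the `inactive` vector. It models four 32-bit lanes with one predicate bit per lane. `modelVdupPredicated` and the type aliases are hypothetical names, not part of the patch.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

using Vec4 = std::array<uint32_t, 4>;
using Pred4 = std::array<bool, 4>;

// Active lanes take the splatted scalar; inactive lanes keep Inactive's value.
Vec4 modelVdupPredicated(const Vec4 &Inactive, uint32_t Scalar,
                         const Pred4 &Pred) {
  Vec4 Splat;
  Splat.fill(Scalar);
  Vec4 Result{};
  for (std::size_t I = 0; I < Result.size(); ++I)
    Result[I] = Pred[I] ? Splat[I] : Inactive[I];
  return Result;
}

// Example: lanes 0 and 2 are active, so only they receive the scalar.
void exampleVdupPredicated() {
  const Vec4 Inactive = {1, 2, 3, 4};
  const Pred4 Pred = {true, false, true, false};
  const Vec4 Expected = {9, 2, 9, 4};
  assert(modelVdupPredicated(Inactive, 9, Pred) == Expected);
}
```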

Reviewers: dmgreen, MarkMurrayARM, miyuki, ostannard

Reviewed By: dmgreen

Subscribers: kristof.beyls, hiraditya, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D73356

cf7e98e6

[clang][AST] Add an AST matcher for deducedTemplateSpecializationType. · bdbdf748


Haojian Wu authored Feb 03, 2020

Summary:
The misc-unused-using clang-tidy check needs this matcher to fix a false
positive on C++17 deduced class template types.

Reviewers: gribozavr2

Reviewed By: gribozavr2

Subscribers: cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D73869

bdbdf748

Fix MSVC "result of 32-bit shift implicitly converted to 64 bits" warning. NFCI. · ae5d3e8c
Simon Pilgrim authored Feb 03, 2020
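
The commit message does not show the affected code, so the following is only a generic illustration of this class of warning (MSVC C4334). The warning appears when a shift is evaluated in 32-bit `int` and the result is then used as a 64-bit value. The helper name `bitMask64` is hypothetical.

```cpp
#include <cstdint>

// Pattern that draws the warning when the result is used as a 64-bit value:
//   uint64_t Mask = 1 << Bit;   // shift done in 32-bit int, then widened
// Widening the left operand first makes the whole shift 64 bits wide.
constexpr uint64_t bitMask64(unsigned Bit) {
  return static_cast<uint64_t>(1) << Bit;
}

static_assert(bitMask64(40) == 0x10000000000ULL,
              "bit 40 survives a 64-bit shift");
```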

ae5d3e8c

Reland "[llvm] Add a way to speed up the speed in which BumpPtrAllocator increases slab sizes" · 46e5603c

Raphael Isemann authored Feb 03, 2020

Disable the red zone in the unit test allocator to fix the test errors in sanitizer builds.
The red zone changed the amount of allocated bytes which made the test fail as it
checked the number of allocated bytes of the allocator.

46e5603c

[LLDB] Add missing declarations for linking to psapi · eb5ee927

Martin Storsjö authored Feb 01, 2020

This fixes building for mingw with BUILD_SHARED_LIBS. In static builds,
the psapi dependency gets linked in transitively from Support, but
when linking Support dynamically, it's revealed that these components
also need linking against psapi.

Differential Revision: https://reviews.llvm.org/D73839

eb5ee927

[llvm-exegesis] Restrict the range of allowable rounding controls. · 082dccac

Clement Courbet authored Jan 24, 2020

Summary:
It turns out that CUR_DIRECTION is just an internal placeholder, not an actual
valid encoded value.

Reviewers: gchatelet

Subscribers: tschuett, mstojanovic, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D73343

082dccac

[MLIR][Linalg] Lower linalg.generic to ploops. · 3dcc1fc6

Alexander Belyaev authored Jan 31, 2020

Differential Revision: https://reviews.llvm.org/D73684

3dcc1fc6

Fixed a -Wunused-variable warning in no-assertion builds · 7b6e49a2

Dmitri Gribenko authored Feb 03, 2020

7b6e49a2

Make quick-append.test resilient to running in paths with '1.o' in the name · f00ab188

Hans Wennborg authored Feb 03, 2020

f00ab188

[clangd] TUScheduler::run() (i.e. workspace/symbol) counts towards concurrent threads · 6b15a3d7

Sam McCall authored Feb 03, 2020

This seems to just be an oversight.

6b15a3d7

[clangd] Refactor TUScheduler options into a struct. NFC · b79cb547

Sam McCall authored Feb 03, 2020

b79cb547

Revert "[llvm] Add a way to speed up the speed in which BumpPtrAllocator increases slab sizes" · da1fb2be

Raphael Isemann authored Feb 03, 2020

This reverts commit b848b510 as the unit tests
fail on the sanitizer bots:
/b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/unittests/Support/AllocatorTest.cpp:145: Failure
      Expected: SlabSize
      Which is: 4096
To be equal to: Alloc.getTotalMemory()
      Which is: 4097

da1fb2be

Revert "[lldb] Increase the rate at which ConstString's memory allocator... · 0afdc7be

Raphael Isemann authored Feb 03, 2020

Revert "[lldb] Increase the rate at which ConstString's memory allocator scales the memory chunks it allocates"

This reverts commit 500c324f because its parent commit
b848b510 is failing on the sanitizer bots.

0afdc7be

Revert "[libcxx] Force-cache LIBCXX_CXX_ABI_LIBRARY_PATH" · 1a7e688b

Sergej Jaskiewicz authored Jan 31, 2020

This reverts commit 41f4dfd6.

It broke standalone libc++ builds, which now try to use libc++abi from the wrong directory instead of the system instance.

(cherry picked from commit 3573526c0286c9461f0459be1a4592b2214594e7)

1a7e688b

Fix broken invariant · 75d9994a

Guillaume Chatelet authored Jan 31, 2020

Summary:
A Copy with a source that is zeros is the same as a Set of zeros.
This fixes the invariant that SrcAlign should always be non-null.

Reviewers: courbet

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D73791

75d9994a

[lldb] Increase the rate at which ConstString's memory allocator scales the... · 500c324f

Raphael Isemann authored Feb 03, 2020

[lldb] Increase the rate at which ConstString's memory allocator scales the memory chunks it allocates

Summary:
We currently make far more malloc calls than necessary in the ConstString BumpPtrAllocator. Our ConstString implementation uses 256 BumpPtrAllocators
internally, each of which receives only a small share of the total allocated memory
and therefore keeps allocating memory in small chunks for far too long. This patch fixes this by increasing the rate at which the
memory chunk size grows, so that the collection of BumpPtrAllocators behaves, in aggregate, like a single BumpPtrAllocator.

Reviewers: llunak

Reviewed By: llunak

Subscribers: abidh, JDevlieghere, lldb-commits

Tags: #lldb

Differential Revision: https://reviews.llvm.org/D71699

500c324f

[llvm-exegesis] Add pfm counters for Zen2 (znver2). · 5b2c5e26

Clement Courbet authored Dec 31, 2019

Summary: There are no counters for individual ports, but this is already
enough to find a lot of issues in the current model (upcoming patch).

Reviewers: dblaikie, gchatelet

Subscribers: hiraditya, tschuett, RKSimon, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D72032

5b2c5e26

[AMDGPU] Don't remove short branches over kills · 97d9a76a

Jay Foad authored Jan 31, 2020

Summary:
D68092 introduced a new SIRemoveShortExecBranches optimization pass and
broke some graphics shaders. The problem is that it was removing
branches over KILL pseudo instructions, and the fix is to explicitly
check for that in mustRetainExeczBranch.

Reviewers: critson, arsenm, nhaehnle, cdevadas, hakzsam

Subscribers: kzhuravl, jvesely, wdng, yaxunl, dstuttard, tpr, t-tye, hiraditya, kerbowa, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D73771

97d9a76a

[MLIR] Make gpu.launch implicitly capture uses of values defined above. · 283b5e73

Stephan Herhut authored Jan 31, 2020

Summary:
In the original design, gpu.launch required explicit capture of uses
and passing them as operands to the gpu.launch operation. This was
motivated by infrastructure restrictions rather than design. This
change lifts the requirement and removes the concept of kernel
arguments from gpu.launch. Instead, the kernel outlining
transformation now does the explicit capturing.

This is a breaking change for users of gpu.launch.

Differential Revision: https://reviews.llvm.org/D73769

283b5e73

[JumpThreading] Halve the duplicate threshold at Oz · 2663a25f

Sam Parker authored Feb 03, 2020

Duplicating instructions can lead to code size increases but using
a threshold of 3 is good for reducing code size.

Differential Revision: https://reviews.llvm.org/D72916

2663a25f

[mlir] NFC: Fix trivial typo in comment · 54958869

Kazuaki Ishizaki authored Jan 27, 2020

Summary: Also, an exercise to merge this into the master myself after a reviewer gives LGTM.

Reviewers: nicolasvasilache, mehdi_amini

Reviewed By: mehdi_amini

Subscribers: Joonsoo, merge_guards_bot, mehdi_amini, rriddle, jpienaar, burmako, shauheen, antiagainst, arpith-jacob, mgester, lucyrfox, liufengdb, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D73432

54958869

[llvm] Add a way to speed up the speed in which BumpPtrAllocator increases slab sizes · b848b510

Raphael Isemann authored Feb 03, 2020

Summary:
In D68549 we noticed that the BumpPtrAllocator we use for LLDB's ConstString implementation grows its slabs at
a rate that is too slow for our use case. As a result, we spend a lot of time calling `malloc` for all the tiny slabs that our
ConstString BumpPtrAllocators create. We also can't just increase the slab size in the ConstString implementation
(which is what D68549 originally did), as this significantly increased the amount of (mostly unused) allocated memory
in any process using ConstString.

This patch adds a template argument for the BumpPtrAllocatorImpl that allows specifying a faster rate at which the
BumpPtrAllocator increases the slab size. This allows LLDB to specify a faster rate at which the slabs grow which
should keep both memory consumption and time spent calling malloc low.
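
The idea can be sketched as a standalone model in which the slab size doubles every `GrowthDelay` slabs, so a smaller `GrowthDelay` means faster growth. The doubling rule, the parameter name `GrowthDelay`, and the default values used here are assumptions for illustration. They are not taken from this commit message, and `modelSlabSize` is a hypothetical helper rather than the allocator's API.

```cpp
#include <algorithm>
#include <cstddef>

// Size of the slab with index SlabIdx under the assumed doubling rule.
template <std::size_t SlabSize = 4096, std::size_t GrowthDelay = 128>
constexpr std::size_t modelSlabSize(std::size_t SlabIdx) {
  static_assert(GrowthDelay > 0, "GrowthDelay must be non-zero");
  // Cap the exponent so the shift stays well defined.
  return SlabSize * (std::size_t(1)
                     << std::min<std::size_t>(30, SlabIdx / GrowthDelay));
}

// With the slow default, the first 128 slabs all have the base size.
static_assert(modelSlabSize<>(127) == 4096, "still the base size");
static_assert(modelSlabSize<>(128) == 8192, "first doubling");
// With GrowthDelay = 1, the size doubles on every new slab.
static_assert(modelSlabSize<4096, 1>(3) == 32768, "three doublings");
```

Under this model, a client that spreads its allocations over many allocator instances reaches large slabs after a handful of allocations per instance instead of after hundreds.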

Reviewers: george.karpenkov, chandlerc, NoQ

Subscribers: NoQ, llvm-commits, llunak

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D71654

b848b510

[PM][CGSCC] Add parentheses to avoid a GCC warning. NFC. · f867c8e8

Martin Storsjö authored Feb 03, 2020

This avoids a warning about "suggest parentheses around && within ||".

f867c8e8

[libcxxabi] Fix layout of __cxa_exception for win64 · 09dc884e

Martin Storsjö authored Feb 01, 2020

Win64 isn't LP64, it's LLP64, but there's no __LLP64__ predefined -
just check _WIN64 in addition to __LP64__.
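
A minimal sketch of this kind of check follows. It shows only the preprocessor condition, not the actual `__cxa_exception` layout, and the constant name `kHas64BitPointers` is hypothetical.

```cpp
// 64-bit Windows (LLP64) does not define __LP64__, so it has to be detected
// through _WIN64 alongside the LP64 check.
#if defined(__LP64__) || defined(_WIN64)
constexpr bool kHas64BitPointers = true;
#else
constexpr bool kHas64BitPointers = false;
#endif

// Layout decisions that depend on pointer width can then key off one flag.
static_assert(!kHas64BitPointers || sizeof(void *) == 8,
              "a 64-bit data model implies 8-byte pointers");
```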

This fixes compilation after static asserts about the struct layout
were added in f2a43605.

Differential Revision: https://reviews.llvm.org/D73838

09dc884e

[OpenMP] Fix GCC warnings. NFC. · 2dc45bf3

Martin Storsjö authored Feb 03, 2020

Remove an extra semicolon, and add llvm_unreachable to avoid warnings
about control reaching the end of a non-void function.
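
The second kind of fix typically looks like the sketch below. To keep it buildable without LLVM headers, `unreachableStandIn` is a stand-in for `llvm_unreachable`. The enum `Schedule` and the function `scheduleRank` are hypothetical and do not come from the patched code.

```cpp
#include <cassert>
#include <cstdlib>

enum class Schedule { Static, Dynamic, Guided };

// Stand-in for llvm_unreachable: trap in assertion builds, abort otherwise.
[[noreturn]] inline void unreachableStandIn(const char *Msg) {
  (void)Msg; // Only read by the assert below.
  assert(false && Msg);
  std::abort();
}

int scheduleRank(Schedule S) {
  switch (S) {
  case Schedule::Static:
    return 0;
  case Schedule::Dynamic:
    return 1;
  case Schedule::Guided:
    return 2;
  }
  // Every enumerator returns above, but GCC still warns that control can
  // reach the end of a non-void function unless this point is marked.
  unreachableStandIn("unhandled Schedule enumerator");
}
```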

2dc45bf3

[LLDB] Fix GCC warnings about extra semicolon. NFC. · 534aeb0b
Martin Storsjö authored Feb 03, 2020

534aeb0b

clang-format: [JS] document InsertTrailingCommas. · dc04c54f

Martin Probst authored Jan 31, 2020

Summary: In release notes and the regular docs.

Reviewers: MyDeveloperDay

Subscribers: cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D73768

dc04c54f

[Attributor][FIX] Try to resolve non-determinism problem for now · 5cc5fce4

Johannes Doerfert authored Feb 03, 2020

There seems to be another instance of non-determinism which causes the
number of iterations to be either 1 or 3 for one benchmark, depending
on the system. This needs to be investigated and resolved. In the
meantime we do not verify the number of iterations for this benchmark.

5cc5fce4

[Attributor] AANoRecurse check all call sites for `norecurse` · 26d02b0f

Johannes Doerfert authored Dec 30, 2019

If all call sites are in `norecurse` functions we can derive `norecurse`
as the ReversePostOrderFunctionAttrsPass does. This should make
ReversePostOrderFunctionAttrsLegacyPass obsolete once the Attributor is
enabled.

Reviewed By: uenoku

Differential Revision: https://reviews.llvm.org/D72017

26d02b0f

[Attributor] Propagate known information from `checkForAllCallSites` · 368f7ee7

Johannes Doerfert authored Dec 30, 2019

If we know that all call sites have been processed we can derive an
early fixpoint. The use in this patch is unlikely to trigger right now,
but a follow-up patch will make use of it.

Reviewed By: uenoku, baziotis

Differential Revision: https://reviews.llvm.org/D72016

368f7ee7

[Driver][test] Change %itanium_abi_triple to generic ELF · 3ecba396

Fangrui Song authored Feb 02, 2020

x86_64-windows and darwin default to PIC. They don't use PIE.

3ecba396

[DebugInfo] Remove an unused method DWARFUnit::getDWARF5HeaderSize(). NFC. · afb41e3e

Igor Kudrin authored Jan 31, 2020

The method was initially added for DWARFVerifier::verifyUnitHeader() but
its results were never actually used.

Differential Revision: https://reviews.llvm.org/D73773

afb41e3e

[X86] Remove a couple unnecessary calls to ConvertCmpIfNecessary. · cf20fde1

Craig Topper authored Feb 02, 2020

We only need to call this on floating point comparisons. In this
case these are known to be integer compares. One of them even
has a SUB opcode instead of CMP.

cf20fde1

[PM][CGSCC] Add a helper to update the call graph from SCC passes · 01377453

Johannes Doerfert authored Dec 30, 2019

With this patch new trivial edges can be added to an SCC in a CGSCC
pass via the updateCGAndAnalysisManagerForCGSCCPass method. It shares
almost all the code with the existing
updateCGAndAnalysisManagerForFunctionPass method but it implements the
first step towards the TODOs.

This was initially part of D70927.

Reviewed By: JonChesterfield

Differential Revision: https://reviews.llvm.org/D72025

01377453

[llvm-extract] Add -keep-const-init commandline option · 578d2e2c

Juneyoung Lee authored Feb 02, 2020

Summary:
This adds -keep-const-init option to llvm-extract which preserves initializers of
used global constants.

For example:

```
$ cat a.ll
@g = constant i32 0
define i32 @f() {
  %v = load i32, i32* @g
  ret i32 %v
}

$ llvm-extract --func=f a.ll -S -o -
@g = external constant i32
define i32 @f() { .. }

$ llvm-extract --func=f a.ll -keep-const-init -S -o -
@g = constant i32 0
define i32 @f() { .. }
```

This option is useful in checking whether a function that uses a constant global is optimized correctly.

Reviewers: jsji, MaskRay, david2050

Reviewed By: MaskRay

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D73833

578d2e2c

[gn build] Port c953409f · 47f309d9
LLVM GN Syncbot authored Feb 03, 2020

47f309d9

[Inliner][NoAlias] Use call site attributes too · 342357c5

Johannes Doerfert authored Jan 28, 2020

If we had `noalias` on an argument the inliner created alias scope
metadata already. However, the call site `noalias` annotation was not
considered. Since the Attributor can derive such call site `noalias`
annotation we should treat them the same as argument annotations.

Reviewed By: hfinkel

Differential Revision: https://reviews.llvm.org/D73528

342357c5