Commits · aa99da5ace4587440973c97a4cd5f486e7bb3c33 · Lorenzo Albano / LLVM bpEVL

May 12, 2020

Avoid binding pointers to "auto&" (by dereferencing the pointer that's non-null anyway) · aa99da5a
David Blaikie authored May 12, 2020
```
Based on @djtodoro's 2552dc53
```
aa99da5a
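
A minimal sketch of the pattern named in the commit title above. The `Unit` type and both function names are hypothetical and are not taken from the patch; the sketch only contrasts binding `auto &` to a pointer with dereferencing a pointer that is known to be non-null.

```cpp
#include <vector>

// Hypothetical type used only for illustration.
struct Unit { int id; };

// Before: `auto &` binds to the pointer itself (U is `Unit *const &`),
// so the reference adds nothing and every use still needs `->`.
int sumIdsViaPointerRef(const std::vector<Unit *> &units) {
  int sum = 0;
  for (auto &U : units)
    sum += U->id;
  return sum;
}

// After: dereference once and bind a reference to the object.
// This assumes the pointer is never null, as the commit title states.
int sumIdsViaObjectRef(const std::vector<Unit *> &units) {
  int sum = 0;
  for (Unit *P : units) {
    Unit &U = *P;
    sum += U.id;
  }
  return sum;
}
```

Both functions compute the same result; the second makes the non-null assumption visible at the point of the dereference.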

[libcxx] Re-commit: shared_ptr changes from library fundamentals (P0414R2). · ce195fb2

zoecarver authored May 11, 2020

Implements P0414R2:
  * Adds support for array types in std::shared_ptr.
  * Adds reinterpret_pointer_cast for shared_ptr.

Re-committing now that the leaking tests are fixed.

Differential Revision: https://reviews.llvm.org/D62259

ce195fb2
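
The two features listed in the commit above can be sketched as follows, assuming a C++17 standard library. The function names are illustrative only and do not come from the libc++ tests.

```cpp
#include <cstdint>
#include <memory>

// Array support: shared_ptr<T[]> releases with delete[] and offers operator[].
int sumOfArray() {
  std::shared_ptr<int[]> arr(new int[3]{1, 2, 3});
  return arr[0] + arr[1] + arr[2];
}

// reinterpret_pointer_cast yields a shared_ptr of another type that shares
// ownership (the same control block) with the source pointer.
long ownersAfterCast() {
  auto value = std::make_shared<std::uint32_t>(42u);
  std::shared_ptr<unsigned char> bytes =
      std::reinterpret_pointer_cast<unsigned char>(value);
  return value.use_count(); // counts both `value` and `bytes`
}
```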

[PowerPC] Fold redundant load immediates of zero and delete if possible · cd83333f

Kamau Bridgeman authored May 12, 2020

This patch folds redundant load immediates into a zero for instructions
which recognise this as the value zero and not the register. If the load
immediate is no longer in use it is then deleted.

This is already done in earlier passes but the ppc-mi-peephole allows for
a more general implementation.

Differential Revision: https://reviews.llvm.org/D69168

cd83333f

[Reproducers] Serialize process arguments in ProcessInfo · bad61548

Jonas Devlieghere authored May 12, 2020

While debugging why TestProcessList.py failed during passive replay, I
remembered that we don't serialize the arguments for ProcessInfo. This
is necessary to make the test pass and to make platform process list -v
behave the same during capture and replay.

Differential revision: https://reviews.llvm.org/D79646

bad61548

[FileCollector][NFC] Add comments · 9202df35
Jan Korous authored May 08, 2020
```
Differential Revision: https://reviews.llvm.org/D78961
```
9202df35

[ValueTracking] Let propagatesPoison support binops/unaryops/cast/etc. · e5f602d8

Juneyoung Lee authored Apr 21, 2020

Summary:
This patch makes propagatesPoison be more accurate by returning true on
more bin ops/unary ops/casts/etc.

The changed test in ScalarEvolution/nsw.ll was introduced by
https://github.com/llvm/llvm-project/commit/a19edc4d15b0dae0210b90615775edd76f021008 .
IIUC, the goal of the tests is to show that iv.inc's SCEV expression still has
no-overflow flags even if the loop isn't in the wanted form.
It becomes more accurate with this patch, so I think this is okay.

Reviewers: spatel, lebedev.ri, jdoerfert, reames, nikic, sanjoy

Reviewed By: spatel, nikic

Subscribers: regehr, nlopes, efriedma, fhahn, javed.absar, llvm-commits, hiraditya

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D78615

e5f602d8

[X86] Remove the v16i8->v16i16 path for MULHS with AVX2. · 01636c1e

Craig Topper authored May 12, 2020

We have a couple main strategies for legalizing MULH.

- If the vXi16 type is legal, extend to do the full i16 multiply
  and then shift and truncate the results.
- Use unpcks to split each 128 bit lane into high and low halves.

For signed we have an extra case to split a v32i8 to v16i8 and then
use the extending to v16i16 strategy.

This patch proposes to use the unpck strategy instead, which is
what we already do for unsigned.

This seems to be 1 instruction shorter when the RHS is constant
like the idiv case. It's 1 instruction longer for the smulo case.
But we're trading cross lane shuffles for inlane shuffles and a
shift.

Differential Revision: https://reviews.llvm.org/D79652

01636c1e
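
As a scalar illustration of the first strategy in the commit above (extend to i16, multiply, shift, truncate), here is a sketch for a single signed 8-bit lane. The function name is hypothetical, and this models only the arithmetic, not the X86 lowering code.

```cpp
#include <cstdint>

// Signed multiply-high for one 8-bit lane: widen both operands to 16 bits,
// take the full product, then keep its upper 8 bits.
std::int8_t mulhsI8(std::int8_t a, std::int8_t b) {
  // The product of two i8 values always fits in 16 bits.
  std::int16_t wide =
      static_cast<std::int16_t>(a) * static_cast<std::int16_t>(b);
  // Assumes an arithmetic right shift for negative values, which mainstream
  // compilers provide and C++20 guarantees.
  return static_cast<std::int8_t>(wide >> 8);
}
```

A vector MULHS applies the same computation to every lane; the strategies in the commit differ only in how the lanes are widened and narrowed again.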

[arm] Add big-endian version of pcrel fixups for adr instructions · fc373522

Dimitry Andric authored May 12, 2020

Summary:
In 2e24219d, a number of ARM pcrel fixups were resolved at assembly
time, to solve PR44929. This only covered little-endian ARM however, so
add similar fixups for big-endian ARM. Also extend the test case to
cover big-endian ARM.

Reviewers: hans, psmith, MaskRay

Reviewed By: psmith, MaskRay

Subscribers: kristof.beyls, hiraditya, danielkiss, emaste, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D79774

fc373522

[AMDGPU] Add AGPRs to getRegClassForSizeOnBank · 9f0b7361
Austin Kerbow authored May 11, 2020
```
Differential Revision: https://reviews.llvm.org/D79761
```
9f0b7361
[CodeGen] Use Align in MachineConstantPool. · 8c72b027
Craig Topper authored May 12, 2020

8c72b027
[VectorCombine] add test to check for iterative improvements; NFC · 93bd6963
Sanjay Patel authored May 12, 2020

93bd6963

[WebAssembly] Implement pseudo-min/max SIMD instructions · 3d49d1cf

Thomas Lively authored May 12, 2020

Summary:
As proposed in https://github.com/WebAssembly/simd/pull/122. Since
these instructions are not yet merged to the SIMD spec proposal, this
patch makes them entirely opt-in by surfacing them only through LLVM
intrinsics and clang builtins. If these instructions are made
official, these intrinsics and builtins should be replaced with simple
instruction patterns.

Reviewers: aheejin

Subscribers: dschuff, sbc100, jgravelle-google, hiraditya, sunfish, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D79742

3d49d1cf

[gcov][test] Fix clang test · 25a95f49
Fangrui Song authored May 12, 2020

25a95f49

[gcov] Default coverage version to '408*' and delete CC1 option -coverage-exit-block-before-body · b56b1e67

Fangrui Song authored May 11, 2020

gcov 4.8 (r189778) moved the exit block from the last to the second.
The .gcda format is compatible with 4.7 but

* decoding libgcov 4.7 produced .gcda with gcov [4.7,8) can mistake the
  exit block, emit bogus `%s:'%s' has arcs from exit block\n` warnings,
  and print wrong `" returned %s` for branch statistics (-b).
* decoding libgcov 4.8 produced .gcda with gcov 4.7 has similar issues.

Also, rename "return block" to "exit block" because the latter is the
appropriate term.

b56b1e67

[PassBuilder] Moved ProfileSummaryAnalysis in buildInlinerPipeline. · 5c10c6e0

Whitney Tsang authored May 12, 2020

Summary:
As commented in the code, ProfileSummaryAnalysis is required for the
inliner pass to query, so this patch moves
RequireAnalysisPass<ProfileSummaryAnalysis> into the recently created
buildInlinerPipeline.

Reviewer: mtrofin, davidxl, tejohnson, dblaikie, jdoerfert, sstefan1

Reviewed By: mtrofin, davidxl, jdoerfert

Subscribers: hiraditya, steven_wu, dexonsmith, wuzish, llvm-commits, jsji

Tag: LLVM

Differential Revision: https://reviews.llvm.org/D79696

5c10c6e0

[GlobalISel][IRTranslator] Fix <1 x Ty> handling in ConstantExprs · 989be65b

Jay Foad authored Apr 17, 2020

Summary:
ConstantExprs involving operations on <1 x Ty> could translate into MIR
that failed to verify with:
*** Bad machine code: Reading virtual register without a def ***

The problem was that translate(const Constant &C, Register Reg) had
recursive calls that passed the same Reg in for the translation of a
subexpression, but without updating VMap for the subexpression first as
translate(const Constant &C, Register Reg) expects.

Fix this by using the same translateCopy helper function that we use for
translating Instructions. In some cases this causes extra G_COPY
MIR instructions to be generated.

Fixes https://bugs.llvm.org/show_bug.cgi?id=45576

Reviewers: arsenm, volkan, t.p.northover, aditya_nandakumar

Subscribers: jvesely, wdng, nhaehnle, rovka, hiraditya, kerbowa, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D78378

989be65b

[GlobalISel][IRTranslator] New helper function translateCopy. NFC. · bd80a8bb

Jay Foad authored Apr 17, 2020

Reviewers: arsenm, volkan, t.p.northover, aditya_nandakumar

Subscribers: wdng, rovka, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D78377

bd80a8bb

[Matrix] Check non-dependent elt type before creating DepSizedMatrix. · ffcaed32

Florian Hahn authored May 12, 2020

We should check non-dependent element types before creating a
DependentSizedMatrixType. Otherwise we do not generate an error message
for dependent-sized matrix types with invalid non-dependent element
types, if the template is never instantiated. See the make5 struct in
the tests.

It also moves the SEMA template tests to
clang/test/SemaTemplate/matrix-type.cpp and introduces a few more test
cases.

ffcaed32

[docs] Corrected inaccuracies in Common Problems section. · 5c707fd9

Michael Kruse authored May 12, 2020

Changed the language in LLVM_USE_LINKER to more strongly recommend LLD
and to specify that the GNU gold linker is only useful if LLD is
unavailable in binary form and it is the first build of LLVM. Added that
LLD will help when used on ELF-based platforms.

Corrected information in CMAKE_BUILD_TYPE regarding the Release build
type and enabling assertions.

Added option LLVM_ENABLE_ASSERTIONS and mentioned enabling this option
with a Release build as an alternative to using a Debug build.

Specified that the LLVM_OPTIMIZED_TABLEGEN
option is only for Debug builds, that the LLVM_USE_SPLIT_DWARF option
is only available on ELF host platforms, and that setting
CLANG_ENABLE_STATIC_ANALYZER to OFF only slightly improves build time.

These changes address comments made in D75425.

Reviewed By: Meinersbur

Differential Revision: https://reviews.llvm.org/D77346

5c707fd9

[lld-macho] Add support for creating and reading reexported dylibs · 87b6fd3e

Jez Ng authored Apr 23, 2020

This unblocks the linking of real programs, since many core system
functions are only available as sub-libraries of libSystem.

Differential Revision: https://reviews.llvm.org/D79228

87b6fd3e

[lld-macho] Re-add dylink-lazy test · c8c39185

Jez Ng authored May 12, 2020

This reverts commit eb81de2d; the
test commands just needed to be run under llvm-lit.

c8c39185

Add comment for SelectionDAGBuilder::SL field. · e9536795
James Y Knight authored May 12, 2020

e9536795

[clangd] Add metrics for selection tree and recovery expressions. · 774acdfb

Haojian Wu authored May 11, 2020

Reviewers: sammccall

Subscribers: ilya-biryukov, MaskRay, jkorous, arphaman, kadircet, usaxena95, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D79701

774acdfb

[AMDGPU] Order pos exports before param exports · 58f1417e

Carl Ritson authored May 12, 2020

Summary:
Modify export clustering DAG mutation to move position exports
before other export types.

Reviewers: foad, arsenm, rampitec, nhaehnle

Reviewed By: foad

Subscribers: kzhuravl, jvesely, wdng, yaxunl, dstuttard, tpr, t-tye, hiraditya, kerbowa, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D79670

58f1417e

HIP: Merge builtin library handling · 14e18457

Matt Arsenault authored Mar 27, 2020

Merge with the new --rocm-path handling used for OpenCL. This looks
for a usable set of device libraries upfront, rather than giving a
generic "no such file or directory" error. If any of the required
bitcode libraries are missing, this will now produce a "cannot find
ROCm installation." error. This differs from the existing HIP-specific
flags by pointing to a ROCm root install instead of a single directory
with bitcode files.

This tries to maintain compatibility with the existing
--hip-device-lib and --hip-device-lib-path flags, as well as the
HIP_DEVICE_LIB_PATH environment variable, or at least the range of
uses with testcases. The existing range of uses and behavior doesn't
entirely make sense to me, so some of the untested edge cases change
behavior. Currently the two path forms seem to have the double purpose
of a search path for an arbitrary --hip-device-lib, and for finding
the stock set of libraries. Since the stock set of libraries This also
changes the behavior when multiple paths are specified, and only takes
the last one (and the environment variable only handles a single
path).

If --hip-device-lib is used, it now only treats --hip-device-lib-path
as the search path for it, and does not attempt to find the rocm
installation. If not, --hip-device-lib-path and the environment
variable are used as the directory to search instead of the rocm root
based path.

This should also automatically fix handling of the options to use
wave64.

14e18457

AMDGPU: Search for new ROCm bitcode library structure · 123bee60

Matt Arsenault authored Apr 10, 2020

The current install situation is a mess, but I'm working on fixing
it. Search for the target layout instead of one of the N options that
exist today.

123bee60

[LLD] Rename iDTable -> idTable, NFC · 6da56729
Reid Kleckner authored May 11, 2020
```
The variable renaming change did not handle this variable well.
```
6da56729
Fold single-use variables into assert · f242950f
Benjamin Kramer authored May 12, 2020
```
This avoids unused variable warnings in Release builds.
```
f242950f
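
A small sketch of the idea in the commit above, using hypothetical function names. When `NDEBUG` is defined (as in Release builds), `assert` expands to nothing, so a variable that is read only by the assert becomes unused.

```cpp
#include <cassert>
#include <vector>

// Before: `nonEmpty` is read only inside the assert, so a build with
// NDEBUG defined may warn about an unused variable.
int lastBefore(const std::vector<int> &v) {
  bool nonEmpty = !v.empty();
  assert(nonEmpty && "expected at least one element");
  return v.back();
}

// After: the single-use expression is folded into the assert itself.
int lastAfter(const std::vector<int> &v) {
  assert(!v.empty() && "expected at least one element");
  return v.back();
}
```

Folding is only safe when the expression has no side effects the program depends on, because the whole expression disappears when assertions are disabled.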
Add Linux SVE Ptrace macros. · 5d7f5ca0
Kristof Beyls authored May 07, 2020
```
Differential Revision: https://reviews.llvm.org/D79623
```
5d7f5ca0

Revert "[mlir] Revisit std.subview handling of static information." · 691e8269

Sam McCall authored May 12, 2020

This reverts commit 80d133b2.

Per Stephan Herhut: The canonicalizer pattern that was added creates
forms of the subview op that cannot be lowered.

This is shown by failing Tensorflow XLA tests such as:
  tensorflow/compiler/xla/service/mlir_gpu/tests:abs.hlo.test
More details will be provided offline, since they rely on logs from private CI.

691e8269

[PATCH] #pragma float_control should be permitted in namespace scope. · 7f2db993

Melanie Blower authored May 08, 2020

Summary: Erroneous error diagnostic observed in VS2017 <numeric> header
Also correction to propagate usesFPIntrin from template func to instantiation.

Reviewers: rjmccall, erichkeane (no feedback received)

Differential Revision: https://reviews.llvm.org/D79631

7f2db993

[X86] combineX86ShuffleChain - use narrowShuffleMaskElts scale == 1 builtin handling. NFC. · 0387df7f
Simon Pilgrim authored May 12, 2020
```
narrowShuffleMaskElts already has the fast-path for scale == 1, no need to reimplement it here.
```
0387df7f
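
To illustrate what the scale == 1 fast path in the commit above means, here is a standalone sketch of narrowing a shuffle mask. It uses `std::vector` and a hypothetical function name rather than LLVM's own types, and it is a simplified model, not LLVM's implementation.

```cpp
#include <vector>

// Narrow each mask element into `scale` consecutive narrower elements.
// Negative entries are treated as sentinels and are simply repeated.
std::vector<int> narrowMask(int scale, const std::vector<int> &mask) {
  // Fast path: with no scaling the result is just a copy of the input.
  if (scale == 1)
    return mask;

  std::vector<int> out;
  out.reserve(mask.size() * static_cast<std::size_t>(scale));
  for (int elt : mask)
    for (int i = 0; i < scale; ++i)
      out.push_back(elt < 0 ? elt : elt * scale + i);
  return out;
}
```

Because the helper already returns the mask unchanged for a scale of 1, a caller has no need to special-case that value.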

[CUDA][HIP] Workaround for resolving host device function against wrong-sided function · e03394c6

Yaxun (Sam) Liu authored Apr 24, 2020

recommit c77a4078 with fix

https://reviews.llvm.org/D77954 caused regressions due to diagnostics in implicit
host device functions.

For now, the most feasible workaround seems to be to treat implicit and explicit host device functions
differently. In device compilation, implicit host device functions keep the old behavior, i.e. host device
candidates and wrong-sided candidates get equal preference. For explicit host device functions, host device
candidates are favored over wrong-sided candidates.

The rationale is that explicit host device functions are blessed by the user as valid host device functions,
that is, they should not cause diagnostics in either host or device compilation. If diagnostics occur, the user
is able to fix them. However, there is no guarantee that an implicit host device function can be compiled in
device compilation, so we need to preserve its overload resolution in device compilation.

Differential Revision: https://reviews.llvm.org/D79526

e03394c6

[NFC][AArch64] More casts tests... · f1f8cffc
Sam Parker authored May 12, 2020
```
Don't use truncs as users because sometimes they're free too.
```
f1f8cffc
[X86][AVX] Use X86ISD::VPERM2X128 for blend-with-zero if optimizing for size · 45aa1b88
Simon Pilgrim authored May 12, 2020
```
Last part of PR22984 - avoid the zero-register dependency if optimizing for size
```
45aa1b88
FuzzerCLI.h - reduce StringRef.h include to forward declaration. NFC. · 24ac6a2d
Simon Pilgrim authored May 10, 2020

24ac6a2d

DebugCounter.h - remove unused includes. NFC. · e143253f

Simon Pilgrim authored May 10, 2020

Added explicit StringRef.h include as we need the full definition for several inline functions in DebugCounter.h.

e143253f

[Target][ARM] Replace outdated getARMVPTBlockMask function · 24bf8063

Pierre-vh authored Apr 08, 2020

getARMVPTBlockMask was an outdated function that only handled basic
block masks: T, TT, TTT and TTTT. This worked fine before the MVE
VPT Block Insertion Pass improvements, as those were the only kinds of
masks the pass could generate, but now it can generate more complex
masks that use E predicates, so it's dangerous to use that function
to calculate VPT/VPST block masks.

I replaced it with 2 different functions:
  - expandPredBlockMask, in ARMBaseInfo. This adds an "E" or "T" at
    the end of an existing PredBlockMask.
  - recomputeVPTBlockMask, in Thumb2InstrInfo. This takes an iterator
    to a VPT/VPST instruction and recomputes its block mask by looking
    at the predicated instructions that follow it. This should be
    used to recompute a block mask after removing/adding a predicated
    instruction to the block.

The expandPredBlockMask function is pretty much imported from the MVE
VPT Blocks pass.

I had to change the ARMLowOverheadLoops and MVEVPTBlocks passes as well
so they could use these new functions.

Differential Revision: https://reviews.llvm.org/D78201

24bf8063
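
The two helpers described in the commit above can be modelled with strings such as "T" or "TTE". This is a hypothetical sketch only: the function names, the string representation and the 'N' marker for an unpredicated instruction are assumptions made for illustration, and the real code works on the ARM block mask encoding and on machine instructions.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Model of expanding a mask: append one more Then ('T') or Else ('E')
// predicate to an existing, not yet full, block mask.
std::string expandMask(const std::string &mask, char kind) {
  assert((kind == 'T' || kind == 'E') && "predicate must be Then or Else");
  assert(!mask.empty() && mask.size() < 4 && "mask must have room to grow");
  return mask + kind;
}

// Model of recomputing a mask: walk the predicates of the instructions that
// follow the VPT/VPST and rebuild the mask until the block ends.
std::string recomputeMask(const std::vector<char> &preds) {
  std::string mask;
  for (char p : preds) {
    if (p != 'T' && p != 'E')
      break; // first unpredicated instruction ends the block
    if (mask.size() == 4)
      break; // a block mask covers at most four instructions
    mask = mask.empty() ? std::string(1, p) : expandMask(mask, p);
  }
  return mask;
}
```

In this model, recomputing after inserting or removing a predicated instruction is just a fresh walk over the block, which is why a helper that only understood T, TT, TTT and TTTT stops being sufficient once E predicates appear.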

[Target][ARM] Replace re-uses of old VPR values with VPNOTs · bf218337
Pierre-vh authored Apr 02, 2020
```
Differential Revision: https://reviews.llvm.org/D76847
```
bf218337

[libcxx testing] Remove ALLOW_RETRIES from sleep_for.pass.cpp · 9e32bf55

David Zarzycki authored May 12, 2020

Operating systems are best effort by default, so we cannot assume that
sleep-like APIs return as soon as we'd like.

Even if a sleep-like API returns when we want it to, the potential for
preemption means that attempts to measure time are subject to delays.

9e32bf55
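
The reasoning in the commit above suggests that a timing test can only rely on a lower bound. A minimal sketch, with a hypothetical helper name that is not taken from sleep_for.pass.cpp:

```cpp
#include <chrono>
#include <thread>

// Returns true if sleep_for blocked for at least the requested duration.
// No upper bound is checked: scheduling and preemption can add any delay.
bool sleptAtLeast(std::chrono::milliseconds requested) {
  const auto start = std::chrono::steady_clock::now();
  std::this_thread::sleep_for(requested);
  const auto elapsed = std::chrono::steady_clock::now() - start;
  return elapsed >= requested;
}
```

A check of the form `elapsed <= requested + tolerance` would depend on how busy the machine is, which is the kind of assumption the commit message warns against.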