Commits · 4f8bc7caf4e5fcc1620b3fd4980ec8d671e9345b · Raul Torres / llvm-target-spread

Jun 07, 2021

[AMDGPU][Libomptarget] Remove atlc global · 4f8bc7ca

Pushpinder Singh authored Jun 07, 2021

This global struct used to hold various flags for monitoring the
initialization of hsa.

Reviewed By: JonChesterfield

Differential Revision: https://reviews.llvm.org/D103795

4f8bc7ca

[OpenCL] Add const attribute to ctz() builtins · 9b14670f
Stuart Brady authored Mar 01, 2021
```
Reviewed By: svenvh

Differential Revision: https://reviews.llvm.org/D97725
```
9b14670f

[llvm] Add interface to order inlining · 4a0de622

Liqiang Tao authored Jun 07, 2021

This patch abstracts Calls in Inliner::run() into InlineOrder.
With this patch, it's possible to customize the inlining order,
e.g. to use a queue or a priority queue.
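
As a rough illustration of the idea (not the interface added in D103315), the ordering can be hidden behind a small abstract class so the driver only pushes and pops. All names here (`CallSite`, `CallOrder`, `FifoOrder`, `CheapestFirstOrder`, `visitOrder`) are hypothetical.

```cpp
#include <deque>
#include <queue>
#include <vector>

// Hypothetical stand-in for a call site; the real pass works on LLVM IR.
struct CallSite {
  int Id;
  int Cost;
};

// Abstract ordering policy: the driver only pushes and pops.
class CallOrder {
public:
  virtual ~CallOrder() = default;
  virtual void push(const CallSite &CS) = 0;
  virtual CallSite pop() = 0;
  virtual bool empty() const = 0;
};

// Queue policy: call sites are visited in the order they were added.
class FifoOrder final : public CallOrder {
  std::deque<CallSite> Calls;

public:
  void push(const CallSite &CS) override { Calls.push_back(CS); }
  CallSite pop() override {
    CallSite CS = Calls.front();
    Calls.pop_front();
    return CS;
  }
  bool empty() const override { return Calls.empty(); }
};

// Priority-queue policy: the call site with the lowest Cost comes first.
class CheapestFirstOrder final : public CallOrder {
  struct Costlier {
    bool operator()(const CallSite &A, const CallSite &B) const {
      return A.Cost > B.Cost;
    }
  };
  std::priority_queue<CallSite, std::vector<CallSite>, Costlier> Calls;

public:
  void push(const CallSite &CS) override { Calls.push(CS); }
  CallSite pop() override {
    CallSite CS = Calls.top();
    Calls.pop();
    return CS;
  }
  bool empty() const override { return Calls.empty(); }
};

// Policy-agnostic driver; returns the Ids in the order they were visited.
std::vector<int> visitOrder(CallOrder &Order, const std::vector<CallSite> &In) {
  for (const CallSite &CS : In)
    Order.push(CS);
  std::vector<int> Ids;
  while (!Order.empty())
    Ids.push_back(Order.pop().Id);
  return Ids;
}
```

Swapping `FifoOrder` for `CheapestFirstOrder` changes the visit order without touching the driver, which is the kind of customization the message describes.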

Reviewed By: kazu

Differential Revision: https://reviews.llvm.org/D103315

4a0de622

[lld/mac] Implement support for searching dylibs with @rpath/ in install name · c5ffe979

Nico Weber authored Jun 06, 2021

Also adjust a few comments, and move the DylibFile comment talking about
umbrella next to the parameter again.

Differential Revision: https://reviews.llvm.org/D103783

c5ffe979

[clang] NFC: test for undefined behaviour in RawComment::getFormattedText() · aa0d7179

Dmitry Polukhin authored Jun 04, 2021

This diff adds a test case for the issue fixed in https://reviews.llvm.org/D77468,
for which no regression test was added at the time. On Clang 9 it caused
a crash in clangd during code completion.

Test Plan: check-clang-unit

Differential Revision: https://reviews.llvm.org/D103722

aa0d7179

[NFC] Fix semantic discrepancy for MVT::LAST_VALUETYPE · 1da2c7d2
Guillaume Chatelet authored Jun 07, 2021
```
Differential Revision: https://reviews.llvm.org/D103251
```
1da2c7d2
[PhaseOrdering] Update tests after 23c2f2e6. · 131343d3
Florian Hahn authored Jun 07, 2021

131343d3
ASTConcept.h - remove unused <string> include. NFCI. · 30a89a75
Simon Pilgrim authored Jun 07, 2021

30a89a75

[SimpleLoopBoundSplit] Split Bound of Loop which has conditional branch with IV · a2a0ac42

Jingu Kang authored May 06, 2021

This pass transforms loops that contain a conditional branch on the induction
variable. For example, it transforms the code on the left into the code on the right:

                             newbound = min(n, c)
 while (iv < n) {            while(iv < newbound) {
   A                           A
   if (iv < c)                 B
     B                         C
   C                         }
 }                           if (iv != n) {
                               while (iv < n) {
                                 A
                                 C
                               }
                             }
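
The two shapes above can be compared with a small stand-alone sketch. It is not the pass itself: `A`, `B` and `C` are recorded as characters, and the `++iv` increment is an assumption, since the diagram leaves the induction step implicit. The function names are hypothetical.

```cpp
#include <algorithm>
#include <vector>

// Left-hand form: the branch on iv is evaluated in every iteration.
std::vector<char> originalLoop(int iv, int n, int c) {
  std::vector<char> trace;
  while (iv < n) {
    trace.push_back('A');
    if (iv < c)
      trace.push_back('B');
    trace.push_back('C');
    ++iv; // assumed induction step
  }
  return trace;
}

// Right-hand form: the first loop runs while "iv < c" is known to hold,
// the second loop covers the remaining iterations without the branch.
std::vector<char> splitLoop(int iv, int n, int c) {
  std::vector<char> trace;
  int newbound = std::min(n, c);
  while (iv < newbound) {
    trace.push_back('A');
    trace.push_back('B');
    trace.push_back('C');
    ++iv;
  }
  if (iv != n) {
    while (iv < n) {
      trace.push_back('A');
      trace.push_back('C');
      ++iv;
    }
  }
  return trace;
}
```

For any starting `iv`, bound `n` and constant `c`, both functions are expected to produce the same trace; only the per-iteration branch disappears.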

Differential Revision: https://reviews.llvm.org/D102234

a2a0ac42

[Clang] Support a user-defined __dso_handle · b31f41e7

Andrew Savonichev authored May 24, 2021

This fixes PR49198: Wrong usage of __dso_handle in user code leads to
a compiler crash.

When Init is an address of the global itself, we need to track it
across RAUW. Otherwise the initializer can be destroyed if the global
is replaced.

Differential Revision: https://reviews.llvm.org/D101156

b31f41e7

[LV] Mark increment of main vector loop induction variable as NUW. · 23c2f2e6

Florian Hahn authored Jun 07, 2021

This patch marks the induction increment of the main induction variable
of the vector loop as NUW when not folding the tail.

If the tail is not folded, we know that End - Start >= Step (either
statically or through the minimum iteration checks). We also know that both
Start % Step == 0 and End % Step == 0. We exit the vector loop if %IV +
%Step == %End. Hence we must exit the loop before %IV + %Step unsigned
overflows and we can mark the induction increment as NUW.
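
The argument above can be sanity-checked exhaustively at a small bit width. The sketch below is an illustration rather than anything taken from the patch: it uses 8-bit values, assumes Start <= End, and the helper names are hypothetical.

```cpp
#include <cstdint>

// Simulates "IV += Step; exit when IV == End" in 8 bits and reports whether
// every increment stays within the unsigned range (i.e. would be valid NUW).
// Precondition: Step > 0.
bool incrementNeverWraps(uint8_t Start, uint8_t End, uint8_t Step) {
  uint8_t IV = Start;
  while (true) {
    unsigned Wide = unsigned(IV) + unsigned(Step); // exact, non-wrapping sum
    if (Wide > UINT8_MAX)
      return false; // the 8-bit add would have wrapped
    IV = static_cast<uint8_t>(Wide);
    if (IV == End)
      return true; // loop exit reached before any wrap
  }
}

// Checks every 8-bit (Start, End, Step) with Start % Step == 0,
// End % Step == 0 and End - Start >= Step.
bool checkAll8Bit() {
  for (unsigned Step = 1; Step <= UINT8_MAX; ++Step)
    for (unsigned Start = 0; Start <= UINT8_MAX; Start += Step)
      for (unsigned End = Start + Step; End <= UINT8_MAX; End += Step)
        if (!incrementNeverWraps(static_cast<uint8_t>(Start),
                                 static_cast<uint8_t>(End),
                                 static_cast<uint8_t>(Step)))
          return false;
  return true;
}
```

When End is not a multiple of Step (for example Start = 0, End = 10, Step = 4), the exit test is never hit and the increment eventually wraps, which is why the divisibility facts matter to the argument.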

This should make SCEV return more precise bounds for the created vector
loops, used by later optimizations, like late unrolling.

At the moment quite a few tests still need to be updated, but before
doing so I'd like to get initial feedback to make sure I am not missing
anything.

Note that this could probably be further improved by using information
from the original IV.

An attempt at modeling the assumption in Alive2:
https://alive2.llvm.org/ce/z/H_DL_g

Part of a set of fixes required for PR50412.

Reviewed By: mkazantsev

Differential Revision: https://reviews.llvm.org/D103255

23c2f2e6

[AMDGPU] Fix MC tests for v_fmaak_f16 and v_fmamk_f16 · 9e9edede

Jay Foad authored Jun 04, 2021

This looks like a mistake when the tests were committed in r363946.
There were two sets of tests for the f32 variant of these instructions,
instead of one set for f16 and one set for f32.

Differential Revision: https://reviews.llvm.org/D103699

9e9edede

[mlir][linalg] Cleanup LinalgOp usage in comprehensive bufferization. · caf26612

Tobias Gysi authored Jun 07, 2021

Replace the uses of deprecated Structured Op Interface methods in ComprehensiveBufferize.cpp. This patch is based on https://reviews.llvm.org/D103394.

Differential Revision: https://reviews.llvm.org/D103520

caf26612

[OpenCL] Fix missing addrspace on implicit move assignment operator · 438cf557

Ole Strohm authored Jun 07, 2021

This fixes the missing address space on `this` in the implicit move
assignment operator.
The function called here is an abstraction around the lines that have
been removed which also sets the address space correctly.
This is copied from CopyConstructor, CopyAssignment and MoveConstructor,
all of which use this function, and now MoveAssignment does too.

Fixes: PR50259

Reviewed By: svenvh

Differential Revision: https://reviews.llvm.org/D103252

438cf557

[AMDGPU][Libomptarget] Rework logic for locating kernarg pools · f5f329a3

Pushpinder Singh authored Jun 03, 2021

Previous logic was to always use the first kernarg pool found to allocate
kernel args. This patch changes this to use only the kernarg pool which
has non-zero size. This logic is also reworked to not use any globals.
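
A minimal sketch of the selection rule described above, assuming a hypothetical `MemoryPool` record instead of the real HSA pool handles; `findUsableKernargPool` is likewise an illustrative name, not a function from the plugin.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical description of a memory pool; the real code queries HSA.
struct MemoryPool {
  bool IsKernarg;   // pool may be used for kernel arguments
  std::size_t Size; // pool size in bytes
};

// Returns the index of the first kernarg pool with non-zero size, or -1 if
// there is none. The pool list is passed in, so no global state is needed.
int findUsableKernargPool(const std::vector<MemoryPool> &Pools) {
  for (std::size_t I = 0; I < Pools.size(); ++I)
    if (Pools[I].IsKernarg && Pools[I].Size > 0)
      return static_cast<int>(I);
  return -1;
}
```

Under the previous rule, a zero-sized kernarg pool listed first would have been chosen; here it is skipped.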

Reviewed By: JonChesterfield

Differential Revision: https://reviews.llvm.org/D103600

f5f329a3

Fixed the build failure of yaml2obj in XCOFFEmitter.cpp: · bcb20aa7

Esme-Yi authored Jun 07, 2021

  error: ambiguous overload for 'operator=='
  (operand types are 'llvm::yaml::Hex16' and 'llvm::XCOFF::MagicNumber')
     Is64Bit = Obj.Header.Magic == XCOFF::XCOFF64;

bcb20aa7

[yaml2obj] Initial the support of yaml2obj for 32-bit XCOFF. · 50bb1b93

Esme-Yi authored Jun 07, 2021

Summary: The patch implements the mapping of YAML
information to an XCOFF object file, enabling the yaml2obj
tool for XCOFF. Currently only 32-bit is supported.

Reviewed By: jhenderson, shchenz

Differential Revision: https://reviews.llvm.org/D95505

50bb1b93

[lld/mac] Implement support for searching dylibs with @loader_path/ in install name · 52489021
Nico Weber authored Jun 06, 2021
```
Differential Revision: https://reviews.llvm.org/D103779
```
52489021
[lld/mac] Implement support for searching dylibs with @executable_path/ in install name · a48bd587
Nico Weber authored Jun 06, 2021
```
Differential Revision: https://reviews.llvm.org/D103775
```
a48bd587

[lld/mac] Rename DylibFile::dylibName to DylibFile::installName · 7def7006

Nico Weber authored Jun 06, 2021

The flag to set it is called `-install_name`, and it's called `installName` in tbd files.

No behavior change.

Differential Revision: https://reviews.llvm.org/D103776

7def7006

[lld/mac] Use fewer magic numbers in magic $ld$ handling code · e9104374

Nico Weber authored Jun 06, 2021

Also simplify a conditional and de-alias a variable.
Minor cleanups, no behavior change.

Differential Revision: https://reviews.llvm.org/D103774

e9104374

[dfsan] Use the sanitizer allocator to reduce memory cost · 2c82588d

Jianzhou Zhao authored Apr 23, 2021

dfsan does not use the sanitizer allocator as the other sanitizers do.
In practice, we let it use glibc's allocator, since tcmalloc needs more
work before it works well with dfsan. With glibc, we observe large
memory leakage. This could relate to two things:

1) The glibc allocator has limitations: for example, tcmalloc can easily reduce the memory footprint by 2x.

2) glibc may call munmap directly as an internal system call, using the system call number, so DFSan has no way to release shadow spaces for those munmap calls.

Using the sanitizer allocator addresses the above issues:
1) Its memory management is close to tcmalloc's.

2) We can register a callback for when the sanitizer allocator calls munmap, so dfsan can release shadow spaces correctly.

Our experiment with an internal server-based application showed that, with the change, over a run of a few days the memory leakage is close to what tcmalloc shows w/o dfsan.

This change mainly follows MSan's code.

1) define allocator callbacks at dfsan_allocator.h|cpp

2) mark allocator APIs as discard

3) intercept allocator APIs

4) make dfsan_set_label consistent with MSan's SetShadow when setting 0 labels, define dfsan_release_meta_memory when munmap is called

5) add flags controlling whether memory is zeroed after malloc/free. dfsan works at byte level, so bit-level operations can cause reads of undefined shadow. See D96842. Zeroing memory after malloc helps with this. As for zeroing after free: reading after free is definitely UB, but if user code does so, it is hard to debug the resulting overtainting w/o running MSan. So we add the flag to help debugging.

This change will be split into smaller changes for review. Before that, a question is
"this code shares a lot with MSan, for example, dfsan_allocator.* and dfsan_new_delete.*.
Does it make sense to unify the code at sanitizer_common? Will that introduce some
maintenance issue?"

Reviewed By: morehouse

Differential Revision: https://reviews.llvm.org/D101204

2c82588d

Jun 06, 2021

[CostModel][X86] Add 512-bit bswap costs · 432eff22
Simon Pilgrim authored Jun 06, 2021

432eff22
[CostModel][X86] Add 512-bit bswap cost tests · ed3b3cfe
Simon Pilgrim authored Jun 06, 2021

ed3b3cfe
[ARM] MVE tests for vmull from a splat. NFC · c85766f7
David Green authored Jun 06, 2021

c85766f7
[AArch64] Extra tests for vector shift. NFC · 8f8273c5
David Green authored Jun 06, 2021

8f8273c5

[CostModel][X86] Improve AVX512 FDIV costs · ae973380

Simon Pilgrim authored Jun 06, 2021

Add missing v16f32/v8f64 costs and adjust other costs as well based off the SkylakeServer model

ae973380

[RISCV] Replace && with ||. Spotted by coverity. · 8bde5f06

Craig Topper authored Jun 06, 2021

We should be exiting when the shift amount is greater than
the bit width regardless of whether it is a power of 2.

Reported by Simon Pilgrim here https://reviews.llvm.org/D96661

This requires getting a shift amount that is out of bounds that
wasn't already optimized by SelectionDAG. This would be pretty
tricky to construct a test for.

Or it would require a non-power of 2 shift amount and a mask
that has runs of ones and zeros of the next lowest power of 2 from
that shift amount. I tried a little to produce a test for this,
but didn't get it to work.
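
To make the difference concrete, here is a sketch of the two bail-out conditions. The exact expression in the RISC-V backend is not quoted in this message, so the shape below (an out-of-bounds test combined with a power-of-2 test) is an assumption, and the function names are hypothetical.

```cpp
#include <cstdint>

// True when V has exactly one bit set.
bool isPowerOf2(uint64_t V) { return V != 0 && (V & (V - 1)) == 0; }

// Assumed old form: bails out only when both conditions hold, so an
// out-of-bounds power-of-2 shift amount slips through.
bool bailsOutWithAnd(uint64_t ShAmt, unsigned Width) {
  return ShAmt >= Width && !isPowerOf2(ShAmt);
}

// Assumed fixed form: bails out when either condition holds.
bool bailsOutWithOr(uint64_t ShAmt, unsigned Width) {
  return ShAmt >= Width || !isPowerOf2(ShAmt);
}
```

With a 32-bit width, a shift amount of 64 is accepted by the `&&` form but rejected by the `||` form. A shift amount of 3 is likewise rejected only by the `||` form, which matches the second scenario described above.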

8bde5f06

[X86][SSE] LowerFP_TO_INT - remove dead code. NFCI. · 8ab8b3fa
Simon Pilgrim authored Jun 06, 2021
```
Non-Strict v2f32->v2i64 cases have already early-returned to be handled by legalization.
```
8ab8b3fa
[X86][SSE] combineVectorTruncation - simplify PSHUFB-is-better logic. NFCI. · 4879c8f3
Simon Pilgrim authored Jun 06, 2021
```
OutSVT is guaranteed to be i8/i16 and we accept any InSVT that isn't i64
```
4879c8f3
Revert "[LoopUnrollAndJam] Change LoopUnrollAndJamPass to LoopNest pass" · 0a9d0799
maekawatoshiki authored Jun 07, 2021
```
This reverts commit 21653600.

To fix the crash problem in legacy pass manager
```
0a9d0799

[Clang][OpenMP] Refactor checking for mutually exclusive clauses. NFC. · c41a8fbf

Michael Kruse authored Jun 06, 2021

Multiple clauses are mutually exclusive. This patch refactors the functions that check for pairs of mutually exclusive clauses into a generalized function which also accepts a list of clause types of which at most one can appear.
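
The generalized check might look roughly like the following. This is a sketch with a hypothetical `ClauseKind` enum and function name, not the Sema code from the patch.

```cpp
#include <algorithm>
#include <initializer_list>
#include <vector>

// Illustrative subset of clause kinds; the real enum lives in Clang.
enum class ClauseKind { Grainsize, NumTasks, Nogroup, Reduction };

// Returns true when more than one clause from the Exclusive list appears
// among Clauses, i.e. when the "at most one" rule is violated.
bool violatesAtMostOne(const std::vector<ClauseKind> &Clauses,
                       std::initializer_list<ClauseKind> Exclusive) {
  int Seen = 0;
  for (ClauseKind K : Clauses)
    if (std::find(Exclusive.begin(), Exclusive.end(), K) != Exclusive.end())
      ++Seen;
  return Seen > 1;
}
```

A pairwise check is the special case where the list has two entries; longer lists need no extra code.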

NFC patch extracted out of D99459 by request.

Reviewed By: ABataev

Differential Revision: https://reviews.llvm.org/D103666

c41a8fbf

X86MachObjectWriter.cpp - silence null dereference warnings. NFCI. · b69e16b5

Simon Pilgrim authored Jun 06, 2021

The MCSymbol data should always be present for non-absolute sections so assert that it is to silence static analysis warnings.

b69e16b5

[TargetLowering] Use IRBuilderBase instead of IRBuilder<> (NFC) · 1ffa6499

Nikita Popov authored Jun 05, 2021

Don't require a specific kind of IRBuilder for TargetLowering hooks.
This allows us to drop the IRBuilder.h include from TargetLowering.h.
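
The include-dropping effect can be illustrated in a single file. The sketch uses hypothetical stand-ins (`BuilderBase`, `Builder<>`, `Lowering`) rather than the LLVM classes: a hook that takes a reference to the base class only needs a forward declaration where it is declared.

```cpp
// --- What the header declaring the hook needs ---
class BuilderBase; // forward declaration suffices for a reference parameter

struct Lowering {
  virtual ~Lowering() = default;
  // Declared against the base class, so no builder header is required here.
  virtual int emitLeadingFence(BuilderBase &B) const;
};

// --- What only the implementation file needs ---
class BuilderBase {
  int NumInsts = 0;

public:
  int createFence() { return ++NumInsts; } // returns the instruction count
};

struct DefaultFolder {};

// Stand-in for a templated builder that derives from the common base.
template <typename FolderT = DefaultFolder>
class Builder : public BuilderBase {};

// Defined out of line, where the full BuilderBase definition is visible.
int Lowering::emitLeadingFence(BuilderBase &B) const { return B.createFence(); }
```

Any `Builder<...>` instantiation can be passed to the hook, since each one converts to `BuilderBase &`.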

Differential Revision: https://reviews.llvm.org/D103759

1ffa6499

[LexicalScopesTest] Add missing IRBuilder.h include (NFC) · 85dfb377
Nikita Popov authored Jun 06, 2021
```
This currently depends on a transitive include via TargetLowering.h.
```
85dfb377
X86Operand.h - fix uninitialized variable warnings in constructor. NFCI. · 0f938a6e
Simon Pilgrim authored Jun 06, 2021

0f938a6e

AssumeBundleQueries.cpp - don't dereference a dyn_cast<> result. NFCI. · 76a1be05

Simon Pilgrim authored Jun 06, 2021

Use cast<> instead, which asserts that the cast is correct rather than just returning null; the match() should already have failed if the cast isn't valid anyhow.
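
The distinction can be sketched without LLVM headers. `tryCast` and `checkedCast` below are hypothetical stand-ins for dyn_cast<> and cast<>, built on `dynamic_cast`; they are not the LLVM implementations.

```cpp
#include <cassert>

struct Value {
  virtual ~Value() = default;
};
struct Instruction : Value {
  int Opcode = 7;
};
struct Constant : Value {};

// Stand-in for dyn_cast<>: returns null on a type mismatch, so every caller
// has to check the result before dereferencing it.
template <typename To> To *tryCast(Value *V) { return dynamic_cast<To *>(V); }

// Stand-in for cast<>: a mismatch is a programming error and trips an
// assertion in checked builds instead of yielding a null pointer.
template <typename To> To *checkedCast(Value *V) {
  To *Result = dynamic_cast<To *>(V);
  assert(Result && "checkedCast on a value of incompatible type");
  return Result;
}

// The caller is assumed to have already established that V is an
// Instruction, so the asserting form states that expectation directly.
int opcodeOf(Value *V) { return checkedCast<Instruction>(V)->Opcode; }
```

Dereferencing the result of `tryCast` without a null check is the pattern a static analyzer flags; `checkedCast` replaces the silent null with an assertion.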

Fixes static analysis warning.

76a1be05

[Clang][OpenMP] Add static version of getSingleClause<ClauseT>. NFC. · d466ca08

Michael Kruse authored Jun 06, 2021

The current method getSingleClause requires an instance of OMPExecutableDirective to be called. Introduce a static version taking a list of clauses as argument instead that can be used during parsing/Sema before any OMPExecutableDirective has been created.

This is the same approach as taken for getClausesOfKind, for getting more than a single clause of a type, which also has a method and a static version. NFC patch extracted out of D99459 by request.
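
A simplified sketch of the method/static pairing, using hypothetical `Clause` and `Directive` types in place of the Clang classes:

```cpp
#include <cassert>
#include <utility>
#include <vector>

struct Clause {
  virtual ~Clause() = default;
};
struct IfClause : Clause {};
struct NumThreadsClause : Clause {};

class Directive {
  std::vector<const Clause *> Clauses;

public:
  explicit Directive(std::vector<const Clause *> Cs) : Clauses(std::move(Cs)) {}

  // Static version: works on a bare clause list, so it can be used before
  // any Directive object exists.
  template <typename ClauseT>
  static const ClauseT *getSingleClause(const std::vector<const Clause *> &Cs) {
    const ClauseT *Found = nullptr;
    for (const Clause *C : Cs)
      if (const auto *Typed = dynamic_cast<const ClauseT *>(C)) {
        assert(!Found && "expected at most one clause of this type");
        Found = Typed;
      }
    return Found;
  }

  // Member version: forwards to the static one with the stored clauses.
  template <typename ClauseT> const ClauseT *getSingleClause() const {
    return getSingleClause<ClauseT>(Clauses);
  }
};
```

The member overload stays available for existing callers, while code that only has a clause list calls `Directive::getSingleClause<ClauseT>(Clauses)` directly.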

Reviewed By: ABataev

Differential Revision: https://reviews.llvm.org/D103665

d466ca08

[TargetLowering] Move methods out of line (NFC) · 506875c8
Nikita Popov authored Jun 06, 2021
```
Move methods using IRBuilder out of line, so we can drop the
dependency on the header.
```
506875c8

[CodeGen] Add missing includes (NFC) · 99142003

Nikita Popov authored Jun 06, 2021

These currently rely on the IRBuilder.h include in TargetLowering.h.
Make them explicit.

99142003