Commits · f7655f3df394d90a340dc09465fbb752bef4bae6 · Lorenzo Albano / LLVM bpEVL

Jun 02, 2021

[OpenMP] Fix improper printf format specifier · f7655f3d
Peyton, Jonathan L authored Jun 02, 2021

f7655f3d

[libcxx][NFC] Tidy up calculation of __nbuf in num_put::do_put, and add comments · 06e04722

Daniel McIntosh authored Jun 02, 2021

In 07ef8e67 and 3ed9f6eb, `__nbuf` started to diverge from the amount
of space that was actually needed for the buffer. For 32-bit longs for example,
we allocate a buffer that is one larger than needed. Moreover, it is no longer
clear exactly where the extra +1 or +2 comes from - they're just numbers pulled
from thin air. This PR cleans up how `__nbuf` is calculated, and adds comments
to further clarify where each part comes from.

Specifically, it corrects the underestimation of the max size buffer needed
that the above two commits had to compensate for. The root cause looks to be
the use of signed type parameters to numeric_limits<>::digits. Since digits
only counts non-sign bits, the calculation was acting as though (for a signed
64-bit type) the longest value we would print was 2^63 in octal. However,
printing in octal treats values as unsigned, so it is actually 2^64. Thus,
using unsigned types and changing the final +2 to a +1 is probably a better
option.
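
The digit-count argument above can be checked with a small sketch. This is not the libc++ code: the helper name `octal_digits` is hypothetical, and the sketch only counts digits (no prefix, sign, or terminator).

```python
def octal_digits(value_bits: int) -> int:
    """Octal digits needed for the largest value that is value_bits wide.

    Each octal digit encodes 3 bits, so this is ceil(value_bits / 3).
    """
    return (value_bits + 2) // 3


def longest_octal(value_bits: int) -> str:
    """Octal text of the largest value_bits-wide unsigned value."""
    return format((1 << value_bits) - 1, "o")


for total_bits in (32, 64):
    signed_digits = total_bits - 1  # what numeric_limits<signed T>::digits reports
    unsigned_digits = total_bits    # what numeric_limits<unsigned T>::digits reports
    print(
        total_bits,
        octal_digits(signed_digits),    # estimate from the signed type
        octal_digits(unsigned_digits),  # estimate from the unsigned type
        len(longest_octal(total_bits)), # digits actually printed for the max value
    )
```

For 64-bit values the signed estimate falls one digit short, while for 32-bit values both estimates agree, which is consistent with the 32-bit buffer ending up one larger than needed once a compensating constant was added.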

Reviewed By: #libc, ldionne, Mordante

Differential Revision: https://reviews.llvm.org/D103339

06e04722

[lld/mac] try to fix tests after a5645513 · 5ecfdb51
Nico Weber authored Jun 02, 2021
```
My linux system doesn't like the `grep` for some reason,
but FileCheck seems to work.
```
5ecfdb51
[OpenMP] Use new task type/flag for taskwait depend events. · 7ba4e96e
Hansang Bae authored May 27, 2021
```
Differential Revision: https://reviews.llvm.org/D103464
```
7ba4e96e

[lld/mac] Implement -dead_strip · a5645513

Nico Weber authored May 07, 2021

Also adds support for live_support sections, no_dead_strip sections,
.no_dead_strip symbols.

Chromium Framework 345MB unstripped -> 250MB stripped
(vs 290MB unstripped -> 236M stripped with ld64).

Doing dead stripping is a bit faster than not, because so much less
data needs to be processed:

    % ministat lld_*
    x lld_nostrip.txt
    + lld_strip.txt
        N           Min           Max        Median           Avg        Stddev
    x  10      3.929414       4.07692     4.0269079     4.0089678   0.044214794
    +  10     3.8129408     3.9025559     3.8670411     3.8642573   0.024779651
    Difference at 95.0% confidence
            -0.144711 +/- 0.0336749
            -3.60967% +/- 0.839989%
            (Student's t, pooled s = 0.0358398)
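
As a sanity check, the summary lines above can be reproduced from the per-run averages and standard deviations. This is a sketch of the standard pooled two-sample Student's t interval, not ministat's source; the critical value 2.101 (two-sided 95%, 18 degrees of freedom) is an assumption taken from a standard t table.

```python
import math

n = 10  # runs per configuration, from the N column
avg_nostrip, sd_nostrip = 4.0089678, 0.044214794
avg_strip, sd_strip = 3.8642573, 0.024779651

# Difference of means (strip minus nostrip).
diff = avg_strip - avg_nostrip

# Pooled standard deviation for two samples of equal size.
pooled_s = math.sqrt(
    ((n - 1) * sd_nostrip**2 + (n - 1) * sd_strip**2) / (2 * n - 2)
)

# Assumed table value: t critical value for 95% two-sided, 18 degrees of freedom.
T_CRIT_18_DOF = 2.101
half_width = T_CRIT_18_DOF * pooled_s * math.sqrt(2 / n)

percent = 100 * diff / avg_nostrip
print(f"{diff:.6f} +/- {half_width:.6f}")
print(f"{percent:.5f}%")
print(f"pooled s = {pooled_s:.7f}")
```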

This interacts with many parts of the linker. I tried to add test coverage
for all added `isLive()` checks, so that some test will fail if any of them
is removed. I checked that the test expectations for the most part match
ld64's behavior (except for live-support-iterations.s, see the comment
in the test). Interacts with:
- debug info
- export tries
- import opcodes
- flags like -exported_symbol(s_list)
- -U / dynamic_lookup
- mod_init_funcs, mod_term_funcs
- weak symbol handling
- unwind info
- stubs
- map files
- -sectcreate
- undefined, dylib, common, defined (both absolute and normal) symbols

It's possible it interacts with more features I didn't think of,
of course.

I also did some manual testing:
- check-llvm check-clang check-lld work with lld with this patch
  as host linker and -dead_strip enabled
- Chromium still starts
- Chromium's base_unittests still pass, including unwind tests

Implementation-wise, this is InputSection-based, so it'll work for
object files with .subsections_via_symbols (which includes all
object files generated by clang). I first based this on the COFF
implementation, but later realized that things are more similar to ELF.
I think it'd be good to refactor MarkLive.cpp to look more like the ELF
part at some point, but I'd like to get a working state checked in first.
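
The general idea behind section-based dead stripping can be sketched as a worklist traversal: start from roots that must be kept, follow references, and drop whatever is never reached. This is a generic mark-and-sweep illustration, not the contents of MarkLive.cpp; the section names, the `edges` map, and the choice of roots are all hypothetical.

```python
from collections import deque


def mark_live(edges, roots):
    """Return every section reachable from roots via the reference edges."""
    live = set(roots)
    worklist = deque(roots)
    while worklist:
        section = worklist.popleft()
        for target in edges.get(section, ()):
            if target not in live:
                live.add(target)
                worklist.append(target)
    return live


# Hypothetical reference graph: section -> sections it refers to.
edges = {
    "_main": ["_helper"],
    "_helper": ["_data"],
    "_data": [],
    "_kept": [],
    "_unused": ["_data2"],
    "_data2": [],
}
# Assumed roots: the entry point plus a symbol marked .no_dead_strip.
roots = ["_main", "_kept"]

live = mark_live(edges, roots)
dead = sorted(set(edges) - live)
print(dead)
```

Here `_unused` and the `_data2` it references are the only sections left unmarked, so they are the ones a stripping pass would drop.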

Mechanical parts:
- Rename canOmitFromOutput to wasCoalesced (no behavior change)
  since it really is for weak coalesced symbols
- Add noDeadStrip to Defined, corresponding to N_NO_DEAD_STRIP
  (`.no_dead_strip` in asm)

Fixes PR49276.

Differential Revision: https://reviews.llvm.org/D103324

a5645513

[lld/mac] Implement -needed_framework, -needed_library, -needed-l · 66a1ecd2
Nico Weber authored Jun 02, 2021
```
These allow overriding dead_strip_dylibs.

Differential Revision: https://reviews.llvm.org/D103499
```
66a1ecd2
[lld/mac] Don't strip explicit dylib also mentioned in LC_LINKER_OPTION · e14fd7d8
Nico Weber authored Jun 02, 2021
```
Noticed by Jez in D103499.

Differential Revision: https://reviews.llvm.org/D103521
```
e14fd7d8

[LoopStrengthReduce] Ensure that debug intrinsics do not affect LSR's output · 4316b0e5

Stephen Tozer authored May 27, 2021

During Loop Strength Reduce, if the terminating condition for the loop
is not immediately adjacent to the terminating branch and it has more
than one use, a clone of the condition will be created just before the
terminating branch and will be used as the branch condition. Currently,
whether the instructions are "immediately adjacent" is determined by
checking whether the next instruction after the condition is the
terminating branch; this is incorrect however, as the presence of a
debug intrinsic between the two will result in a change to the output.
This is fixed by using getNextNonDebugInstruction() instead.
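
A toy model of the adjacency check shows why skipping debug instructions matters. This sketch is not LLVM code: `Inst`, `needs_clone`, and the list-based block are hypothetical stand-ins, and only the "next instruction" versus "next non-debug instruction" distinction is taken from the message above.

```python
from dataclasses import dataclass


@dataclass(eq=False)  # compare instructions by identity, not by field values
class Inst:
    name: str
    is_debug: bool = False


def next_inst(block, index):
    """The instruction immediately after block[index], or None."""
    return block[index + 1] if index + 1 < len(block) else None


def next_non_debug_inst(block, index):
    """The first instruction after block[index] that is not a debug intrinsic."""
    for inst in block[index + 1:]:
        if not inst.is_debug:
            return inst
    return None


def needs_clone(block, cond_index, branch, num_uses, skip_debug):
    """Clone the condition if it is not adjacent to the branch and has >1 use."""
    lookup = next_non_debug_inst if skip_debug else next_inst
    return lookup(block, cond_index) is not branch and num_uses > 1


cond, dbg, br = Inst("icmp"), Inst("dbg.value", is_debug=True), Inst("br")
without_debug = [cond, br]
with_debug = [cond, dbg, br]

for skip_debug in (False, True):
    print(
        skip_debug,
        needs_clone(without_debug, 0, br, num_uses=2, skip_debug=skip_debug),
        needs_clone(with_debug, 0, br, num_uses=2, skip_debug=skip_debug),
    )
```

With the plain lookup, the two blocks get different answers even though they differ only by a debug intrinsic; with the non-debug lookup, they agree.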

Differential Revision: https://reviews.llvm.org/D103033

4316b0e5

[lld/mac] Address review feedback and improve a comment · 476e4d65

Nico Weber authored Jun 02, 2021

I forgot to move the message() call around as requested in D103428
before committing that change. Move it now.

Also, improve the ordinal uniq'ing comment. I hadn't realized that the
distinct-but-identical files happen with --reproduce and not in general.

No behavior change.

Differential Revision: https://reviews.llvm.org/D103522

476e4d65

[coro async] Add the swiftasync attribute to the resume partial function · f1a0c5d6

Arnold Schwaighofer authored May 19, 2021

Transfer the swiftasync attribute to the resume partial function according to
the suspend.async specification. Its first argument denotes which argument is
the async context.

rdar://71499498

Differential Revision: https://reviews.llvm.org/D103285

f1a0c5d6

[clang] Implement the using_if_exists attribute · 369c6483

Erik Pilkington authored May 31, 2021

This attribute applies to a using declaration, and permits importing a
declaration without knowing if that declaration exists. This is useful
for libc++ C wrapper headers that re-export declarations in std::, in
cases where the base C library doesn't provide all declarations.

This attribute was proposed in http://lists.llvm.org/pipermail/cfe-dev/2020-June/066038.html.

rdar://69313357

Differential Revision: https://reviews.llvm.org/D90188

369c6483

[clangd] Add support for the `defaultLibrary` semantic token modifier · 2f951ca9

David Goldman authored Apr 29, 2021

This allows us to distinguish symbols from the system (e.g. system
includes or sysroot) from symbols defined in the user's project,
which editors can use to display them differently.

This is currently based on `FileCharacteristic`, but we can
consider alternatives such as `Sysroot` and file paths in the future.

Differential Revision: https://reviews.llvm.org/D101554

2f951ca9

Fix comments in test cuda-kernel-call.cu · 61c65d8e
Yaxun (Sam) Liu authored Jun 02, 2021

61c65d8e

Add getDemandedBits for uses. · cbde2487

Qunyan Mangus authored Jun 02, 2021

Add a getDemandedBits method for uses so we can query demanded bits for each
use. This can help get better use information. For example, for the code below

    define i32 @test_use(i32 %a) {
      %1 = and i32 %a, -256
      %2 = or i32 %1, 1
      %3 = trunc i32 %2 to i8   ; not optimized to 1, for illustration purposes
      ... some use of %3
      ret i32 %2
    }

if we look at the demanded bits of %2 (which are all 32 bits because of the
return), we would conclude that %a is used regardless of how the return value
is used. However, if we look at each use separately, we will see that the
demanded bits of %2 in the trunc cover only the lower 8 bits of %a, which are
redefined; therefore %a's usage depends on how the function return is used.
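
The per-use reasoning can be traced by hand with plain bit masks. This is a sketch of the arithmetic only, not the DemandedBits analysis itself; the helper names are hypothetical, and the two propagation rules used (a constant `and` mask clears demand for bits it zeroes, a constant `or` clears demand for bits it sets) are the usual ones for these operations.

```python
MASK32 = 0xFFFFFFFF


def demanded_through_and(demanded: int, const: int) -> int:
    """Bits of the operand that matter for 'and op, const'."""
    return demanded & const & MASK32


def demanded_through_or(demanded: int, const: int) -> int:
    """Bits of the operand that matter for 'or op, const'."""
    return demanded & ~const & MASK32


def demanded_bits_of_a(demanded_of_2: int) -> int:
    """Propagate a demand on %2 back to %a through the or and the and."""
    demanded_of_1 = demanded_through_or(demanded_of_2, 1)
    return demanded_through_and(demanded_of_1, -256)


via_trunc = demanded_bits_of_a(0xFF)   # the trunc to i8 demands the low 8 bits of %2
via_ret = demanded_bits_of_a(MASK32)   # the return demands all 32 bits of %2
print(hex(via_trunc), hex(via_ret))
```

Through the trunc use no bit of `%a` is demanded, while through the return the upper 24 bits are, which is the distinction the per-use query exposes.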

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D97074

cbde2487

[AArch64][NFC] Fix failing cost-model test · ff6fe93f
Irina Dobrescu authored Jun 02, 2021

ff6fe93f

[LV] Build and cost VPlans for scalable VFs. · d41cb6bb

Sander de Smalen authored Jun 02, 2021

This patch uses the calculated maximum scalable VFs to build VPlans,
cost them and select a suitable scalable VF.

Reviewed By: paulwalker-arm

Differential Revision: https://reviews.llvm.org/D98722

d41cb6bb

[compiler-rt][lsan] Increase libdl_deadlock test timeout · 8c363efe

David Spickett authored Jun 02, 2021

We have been seeing this test fail intermittently on our
2 stage AArch64 bot.

As far back as https://lab.llvm.org/buildbot/#/builders/53/builds/2694

Likely due to a lack of resources at certain times on the
shared machine. Up the time limit to give us some more room.

(this limit only applies to the watchdog thread, so if the
test passes then it won't take 20s)

8c363efe

[PowerPC][AIX] Fix AIX bootstrap build. · 81f7607f

Sean Fertile authored Jun 01, 2021

A recent patch:
https://reviews.llvm.org/rGe0921655b1ff8d4ba7c14be59252fe05b705920e
changed clang's AIX bitfield handling to use 4-byte bitfield containers,
matching XL's behavior. This change triggers static assert failures when
bootstrapping. Change the macro we check to enable bitfield packing on
AIX to `__clang__` which is defined by both xlclang and clang.

Differential Revision: https://reviews.llvm.org/D103474

81f7607f

[LV] NFC: Remove redundant isLegalMasked(Gather|Scatter) functions. · 034503e9

Sander de Smalen authored Jun 01, 2021

This NFC change follows from conversation in D102437, where it was discussed
to remove these functions as a separate patch.

034503e9

[LV] NFC: Replace custom getMemInstValueType by llvm::getLoadStoreType. · 3472d3fd

Sander de Smalen authored Jun 01, 2021

llvm::getLoadStoreType was added recently and has the same implementation
as 'getMemInstValueType' in LoopVectorize.cpp. Since there is no
value in having two implementations, this patch removes the custom LV
implementation in favor of the generic one defined in Instructions.h.

3472d3fd

[TTI] NFC: Change getIntImmCodeSizeCost to return InstructionCost. · 0195e594

Daniil Fukalov authored May 20, 2021

This patch migrates the TTI cost interfaces to return an InstructionCost.

See this patch for the introduction of the type: https://reviews.llvm.org/D91174
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2020-November/146408.html

Reviewed By: sdesmalen

Differential Revision: https://reviews.llvm.org/D102915

0195e594

[compiler-rt][asan] Enable unwind-tables for Arm Linux · ba993597

David Spickett authored Jun 01, 2021

Since https://reviews.llvm.org/D102046 some tests have
been falling back to fast unwinding on our Thumb bot.

This fails because fast unwinding does not work on Thumb.
By adding the extra information we ensure this does not happen
during testing, but the built library can still fast unwind
as a last resort.

Fast unwinding can still work in some situations, for example when
everything is built with clang. During testing, however, we have
gcc-built system libs and clang-built tests.

The same change was made for sanitizer-common in
https://reviews.llvm.org/D96337.

Reviewed By: zatrazz

Differential Revision: https://reviews.llvm.org/D103463

ba993597

[mlir][linalg] Update result position calculation in the Structured Op Interface (NFC). · e1a15084

Tobias Gysi authored Jun 02, 2021

Remove two unused methods and replace the implementation of getResultsPositionInLoopsToShapeMap. The patch is based on https://reviews.llvm.org/D103394.

Differential Revision: https://reviews.llvm.org/D103397

e1a15084

[mlir][linalg] Cleanup LinalgOp usage in fusion on tensors (NFC). · f84b908f

Tobias Gysi authored Jun 02, 2021

Replace the uses of deprecated Structured Op Interface methods in FusionOnTensors.cpp. This patch is based on https://reviews.llvm.org/D103394.

Differential Revision: https://reviews.llvm.org/D103471

f84b908f

[RISCV][NFC] Add '+mattr=+experimental-v' to RVV test · 1cea1189
Fraser Cormack authored Jun 02, 2021

1cea1189
[AArch64] Optimise bitreverse lowering in ISel · e971099a
Irina Dobrescu authored May 25, 2021
```
Differential Revision: https://reviews.llvm.org/D103105
```
e971099a

[AMDGPU][Libomptarget][NFC] Remove bunch of dead structs · b25546a4

Pushpinder Singh authored Jun 02, 2021

Dropped structs are atmi_machine_t, atmi_device_t and atmi_memory_t

Reviewed By: JonChesterfield

Differential Revision: https://reviews.llvm.org/D103509

b25546a4

[lld/mac] Implement -reexport_framework, -reexport_library, -reexport-l · 78ce89bb

Nico Weber authored Jun 01, 2021

These are slightly easier-to-use versions of -sub_library and -sub_umbrella.

Differential Revision: https://reviews.llvm.org/D103497

78ce89bb

[AMDGPU][Libomptarget][NFC] Remove atmi_place_t · 2368170a

Pushpinder Singh authored Jun 02, 2021

atmi_place_t has been replaced with int DeviceId.

Reviewed By: JonChesterfield

Differential Revision: https://reviews.llvm.org/D103508

2368170a

[mailmap] Use my chromium address as my canonical email address · e008d012
Nico Weber authored Jun 02, 2021

e008d012

Add a toplevel .mailmap file · 39b3c00e

Nico Weber authored May 29, 2021

See "Proposal: Adding a toplevel .mailmap file" on llvm-dev:
https://lists.llvm.org/pipermail/llvm-dev/2021-May/150741.html

Differential Revision: https://reviews.llvm.org/D103360

39b3c00e

[SimpleLoopUnswitch] Port partially invariant unswitch from LoopUnswitch to SimpleLoopUnswitch · f3a27511
Jingu Kang authored May 21, 2021
```
This re-enables commit 107d19eb with bug fixes.

Differential Revision: https://reviews.llvm.org/D99354
```
f3a27511

[InstCombine][msp430] Pre-commit test case for @llvm.powi and 16-bit ints · fe208a4e

Bjorn Pettersson authored May 21, 2021

This is a pre-commit of a test case for D99439, which is a patch that
updates @llvm.powi to handle different int sizes for the exponent.

Problem is that @llvm.powi is used as an IR construct that maps
to RT libcalls to __powi* functions, and those lib functions depend
on sizeof(int) to use correct type for the exponent.

The test cases show that we use i32 for the powi exponent, which
later would result in wrong type being used in libcalls (miscompile).

But there are also a couple of negative test cases that show
that we rewrite into using powi when having a uitofp conversion
from i16, which would be wrong when doing the libcall as an
"unsigned int" isn't guaranteed to fit inside the "int" argument
in the called libcall function.
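
The unsigned-exponent concern can be made concrete with a few lines of integer arithmetic. This sketch only models reinterpreting a bit pattern at a given width; `as_signed` is a hypothetical helper, and 40000 is an arbitrary value chosen because it is a valid unsigned 16-bit number that exceeds the signed 16-bit range.

```python
def as_signed(value: int, bits: int) -> int:
    """Reinterpret the low `bits` bits of value as a two's-complement integer."""
    value &= (1 << bits) - 1
    if value >= 1 << (bits - 1):
        return value - (1 << bits)
    return value


exponent = 40000  # fits in an unsigned i16, but not in a signed 16-bit int

seen_as_16_bit_int = as_signed(exponent, 16)  # what a 16-bit "int" argument would hold
seen_as_32_bit_int = as_signed(exponent, 32)  # what a 32-bit "int" argument would hold
print(seen_as_16_bit_int, seen_as_32_bit_int)
```

On a target where `int` is 16 bits, the callee would see a negative exponent, whereas with a 32-bit `int` the value survives unchanged.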

Differential Revision: https://reviews.llvm.org/D102919

fe208a4e

[CodeGen] Refactor libcall lookups for RTLIB::POWI_* · 536e02a2

Bjorn Pettersson authored May 24, 2021

Use RuntimeLibcalls to get a common way to pick correct RTLIB::POWI_*
libcall for a given value type.

This includes a small refactoring of ExpandFPLibCall and
ExpandArgFPLibCall in SelectionDAGLegalize to share a bit of code,
plus adding an ExpandFPLibCall version that can be called directly
when expanding FPOWI/STRICT_FPOWI to ensure that we actually use
the same RTLIB::Libcall when expanding the libcall as we used when
checking the legality of such a call by doing a getLibcallName check.

Differential Revision: https://reviews.llvm.org/D103050

536e02a2

[LegalizeTypes] Avoid promotion of exponent in FPOWI · d1273d39

Bjorn Pettersson authored May 22, 2021

The FPOWI DAG node is normally lowered to a libcall to one of the
RTLIB::POWI* runtime functions and the exponent should normally
have a type matching sizeof(int) when making the call. Thus,
type promotion of the exponent could lead to an FPOWI with a type
for the second operand that would be incorrect when doing the
libcall (a situation which would be hard to detect post-legalization
if we allow such FPOWI nodes).

This patch is changing DAGTypeLegalizer::PromoteIntOp_FPOWI to
do the rewrite into a libcall directly instead of promoting the
operand. This way we can check that the exponent is smaller than
sizeof(int) and we can let TargetLowering handle promotion as
part of making the libcall. It could be noticed here that makeLibCall
has some knowledge about targets such as 64-bit RISCV, for which the
libcall argument should be extended to a type larger than sizeof(int).

Differential Revision: https://reviews.llvm.org/D102950

d1273d39

[SimplifyLibCalls] Take size of int into consideration when emitting ldexp/ldexpf · 9c54ee43

Bjorn Pettersson authored Mar 26, 2021

When rewriting
  powf(2.0, itofp(x)) -> ldexpf(1.0, x)
  exp2(sitofp(x)) -> ldexp(1.0, sext(x))
  exp2(uitofp(x)) -> ldexp(1.0, zext(x))

the wrong type was used for the second argument in the ldexp/ldexpf
libc call, for target architectures with 16 bit "int" type.
The transform incorrectly used a bitcasted function pointer with
a 32-bit argument when emitting the ldexp/ldexpf call for such
targets.

The fault is solved by using the correct function prototype
in the call, by asking TargetLibraryInfo about the size of "int".
TargetLibraryInfo by default derives the size of the int type by
assuming that it is 16 bits for 16-bit architectures, and
32 bits otherwise. If this isn't true for a target it should be
possible to override that default in the TargetLibraryInfo
initializer.
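
The rewrite rules and the default `int` size described above can be illustrated numerically. This is a sketch under stated assumptions, not LLVM code: `sext`, `zext`, and `default_int_bits` are hypothetical helpers, and `default_int_bits` only restates the default given in this message (16 bits on 16-bit architectures, 32 bits otherwise).

```python
import math


def sext(value: int, from_bits: int) -> int:
    """Sign-extend the low from_bits bits of value."""
    value &= (1 << from_bits) - 1
    if value >= 1 << (from_bits - 1):
        return value - (1 << from_bits)
    return value


def zext(value: int, from_bits: int) -> int:
    """Zero-extend the low from_bits bits of value."""
    return value & ((1 << from_bits) - 1)


def default_int_bits(arch_bits: int) -> int:
    """Default size of 'int' as described in the message above."""
    return 16 if arch_bits == 16 else 32


pattern = 0xFD  # an i8 bit pattern: -3 if signed, 253 if unsigned

# exp2(sitofp(x)) -> ldexp(1.0, sext(x))
signed_result = math.ldexp(1.0, sext(pattern, 8))
# exp2(uitofp(x)) -> ldexp(1.0, zext(x))
unsigned_result = math.ldexp(1.0, zext(pattern, 8))

print(signed_result, unsigned_result == 2.0**253)
print(default_int_bits(16), default_int_bits(64))
```

The same bit pattern yields very different results depending on the extension used, which is why the extension has to match the original conversion and target the actual width of `int`.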

Differential Revision: https://reviews.llvm.org/D99438

9c54ee43

[mlir] Add DivOp lowering from Complex dialect to Standard/Math dialect. · 942be7cb
Adrian Kuegel authored Jun 02, 2021
```
Differential Revision: https://reviews.llvm.org/D103507
```
942be7cb
[Demangle][Rust] Parse binders · a67a234e
Tomasz Miąsko authored Jun 02, 2021
```
Reviewed By: dblaikie

Differential Revision: https://reviews.llvm.org/D102729
```
a67a234e

[RISCV] Expand unaligned fixed-length vector memory accesses · 3b0a33d0

Fraser Cormack authored May 14, 2021

RVV vectors must be aligned to their element types, so anything less is
unaligned.

For regular loads and stores, our custom-lowering of fixed-length
vectors meant that we opted out of LegalizeDAG's built-in unaligned
expansion. This patch adds that logic in to our custom lower function.

For masked intrinsics, we declare that anything unaligned is not legal,
leaving the ScalarizeMaskedMemIntrin pass to do the expansion for us.

Note that neither of these methods can handle the expansion of
scalable-vector memory ops, so those cases are left alone by this patch.
Scalable loads and stores already go through expansion by default but
hit an assertion, and scalable masked intrinsics will silently generate
incorrect code. It may be prudent to return an error in both of these
cases.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D102493

3b0a33d0

[flang] Add tests for REPEAT. NFC · 5f251453

Diana Picus authored May 31, 2021

These should already pass with the current implementation.

Differential Revision: https://reviews.llvm.org/D103402

5f251453