Commits · 7daed359111f6d151fef447f520f85ef1dabedf6 · Lorenzo Albano / LLVM bpEVL

Mar 07, 2022

[clang][modules] Report module maps affecting `no_undeclared_includes` modules · b45888e9

Jan Svoboda authored Mar 07, 2022

Since D106876, PCM files don't report module maps as input files unless they contributed to the compilation.

Reporting only module maps of (transitively) imported modules is not enough, though. For modules marked with `[no_undeclared_includes]`, other module maps affect the compilation by introducing anti-dependencies.

This patch makes sure such module maps are being reported as input files.

Depends on D120463.

Reviewed By: dexonsmith

Differential Revision: https://reviews.llvm.org/D120464

b45888e9

[clang][modules] NFC: Simplify and clarify test · 242b24c1

Jan Svoboda authored Mar 07, 2022

This patch simplifies a test that checks that only used module map files are reported as input files in PCM files.

Instead of using opaque `diff`, this patch uses `clang -module-file-info` and `FileCheck` to verify this.

Reviewed By: dexonsmith

Differential Revision: https://reviews.llvm.org/D120463

242b24c1

[PowerPC] Add generic fnmsub intrinsic · b2497e54

Qiu Chaofan authored Mar 07, 2022

Currently in Clang, we have two types of builtins for the fnmsub operation:
one for float/double vectors, which are transformed into IR operations;
one for float/double scalars, which generate the corresponding intrinsics.

But for the vector version of the builtin, the three-op chain may be recognized
as expensive by some passes (like EarlyCSE). We need some way to keep
the fnmsub form until code generation.

This patch introduces ppc.fnmsub.* intrinsic to unify four fnmsub
intrinsics.
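
As a rough illustration of what such a chain looks like, here is a minimal sketch of the fnmsub computation, -(a * b - c), written as three separate steps. It assumes the chain is negate, fused multiply-add, negate, which the message above does not spell out, and `fnmsub_chain` is a hypothetical helper name, not a Clang builtin.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper (not a Clang builtin): computes -(a * b - c) as an
// explicit three-step chain, assuming the chain is fneg, fma, fneg.
double fnmsub_chain(double a, double b, double c) {
  double neg_c = -c;                    // step 1: negate the addend
  double fused = std::fma(a, b, neg_c); // step 2: fused multiply-add, a * b - c
  return -fused;                        // step 3: negate the result
}
```

For example, `assert(fnmsub_chain(2.0, 3.0, 10.0) == 4.0);` holds, since -(2 * 3 - 10) is 4. A single intrinsic would keep these three steps together as one unit until code generation.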

Reviewed By: shchenz

Differential Revision: https://reviews.llvm.org/D116015

b2497e54

[OpenMPIRBuilder] Allocate temporary at the correct block in a nested parallel · 87ec6f41

William S. Moses authored Mar 05, 2022

The OpenMPIRBuilder has a bug. Specifically, suppose you have two nested OpenMP parallel regions (written in MLIR for ease):

```
omp.parallel {
  %a = ...
  omp.parallel {
    use(%a)
  }
}
```

As OpenMP only permits pointer-like inputs, the builder will wrap all of the inputs into a stack allocation, and then pass this
allocation to the inner parallel. For example, we would want to get something like the following:

```
omp.parallel {
  %a = ...
  %tmp = alloc
  store %tmp[] = %a
  kmpc_fork(outlined, %tmp)
}
```

However, in practice, this is not what currently occurs for nested parallel regions. In the OpenMPIRBuilder specifically,
the entirety of the function (at the LLVM level) is currently inlined, with blocks marking the corresponding start and end of each
region.

```
entry:
  ...

parallel1:
  %a = ...
  ...

parallel2:
  use(%a)
  ...

endparallel2:
  ...

endparallel1:
  ...
```

When the allocation is inserted, it is presently inserted into the parent of the entire function (e.g. entry) rather than the parent
allocation scope of the function being outlined. If we were outlining parallel2, the corresponding alloca location would be parallel1.

This causes a variety of bugs, including https://github.com/llvm/llvm-project/issues/54165 as one example.

This PR allows the stack allocation to be created at the correct allocation block, and thus remedies such issues.
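
The intended shape can be sketched in plain C++ as follows. This is only an analogy under stated assumptions: `Captures`, `outlined_inner`, `fake_fork` and `outer_region` are hypothetical names, and `fake_fork` merely stands in for the runtime fork call by running the body once.

```cpp
#include <cassert>

// Hypothetical stand-ins for illustration; these are not OpenMP runtime or
// OpenMPIRBuilder APIs.
struct Captures {
  int *a; // pointer-like input forwarded to the inner region
};

// Body of the inner parallel region after outlining: it only sees the struct.
void outlined_inner(void *raw) {
  Captures *captures = static_cast<Captures *>(raw);
  *captures->a += 1; // use(%a)
}

// Stand-in for the runtime fork call; it simply runs the body once.
void fake_fork(void (*body)(void *), void *arg) { body(arg); }

// Body of the outer parallel region.
int outer_region() {
  int a = 41;                      // %a = ...
  Captures tmp{&a};                // %tmp = alloc; store %tmp[] = %a
  fake_fork(outlined_inner, &tmp); // kmpc_fork(outlined, %tmp)
  return a;
}
```

The temporary `tmp` lives in the scope of the outer region, next to `a`, rather than in the entry block of some enclosing function. With this sketch, `assert(outer_region() == 42);` holds.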

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D121061

87ec6f41

Mar 05, 2022

[RISCV] Support k-ext clang intrinsics · fa9c8bab

Shao-Ce SUN authored Mar 03, 2022

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D112774

fa9c8bab

[WebAssembly] Update Preprocessor/init.c test after 3be9e0ba · c01ec308

Thomas Lively authored Mar 04, 2022

c01ec308

[WebAssembly] Check bulk-memory when adjusting lang opts · 3be9e0ba

Thomas Lively authored Mar 04, 2022

We previously had logic to disable pthreads, set the ThreadModel to Single, and
disable thread-safe statics when the atomics target feature is disabled, since
that means that the resulting program will not be used in a threaded context.
Similarly check for the presence of the bulk-memory feature, since that is also
necessary to produce multithreaded programs.
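
A minimal sketch of the combined check, assuming simplified option and feature structs. `WasmFeatures`, `LangOpts`, `ThreadModel` and `adjust` are hypothetical names for illustration, not Clang's actual types.

```cpp
#include <cassert>

// Hypothetical simplified types; these are not Clang's TargetInfo or LangOptions.
struct WasmFeatures {
  bool atomics;
  bool bulkMemory;
};

enum class ThreadModel { Posix, Single };

struct LangOpts {
  bool posixThreads;
  ThreadModel threadModel;
  bool threadsafeStatics;
};

// Threading support is kept only when both features are present.
LangOpts adjust(LangOpts opts, const WasmFeatures &features) {
  if (!features.atomics || !features.bulkMemory) {
    opts.posixThreads = false;
    opts.threadModel = ThreadModel::Single;
    opts.threadsafeStatics = false;
  }
  return opts;
}
```

With atomics enabled but bulk-memory disabled, `adjust` now falls back to the single-threaded settings, which is the case this change adds.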

Differential Revision: https://reviews.llvm.org/D121014

3be9e0ba

Mar 04, 2022

Fix test failure in openmp-offload.c · 22b6e817

Yaxun (Sam) Liu authored Mar 04, 2022

Update active offload kind of actions for OpenMP programs.

The change is expected as of e5eb3650.

22b6e817

[CUDA][HIP] Fix offloading kind for linking C++ programs · e5eb3650

Yaxun (Sam) Liu authored Mar 03, 2022

When both CUDA or HIP programs and C++ programs are passed
to the clang driver without -c, the C++ programs are treated as CUDA
or HIP programs, which is incorrect.

This is because the action builder sets the offloading kind of every
input job action of the linking action to the union of the offloading
kinds of those input job actions, i.e. if there is one HIP or CUDA
input to the linker, then all inputs to the linker are marked
as HIP or CUDA.

To fix this issue, the offload action builder tracks the originating
input argument of each host action, which allows it to determine
the active offload kind of each host action. Then the offload
kind of each input action to the linker can be determined
individually.
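
The difference between the two strategies can be sketched as follows. The bitmask, `LinkInput`, `unionKind` and `kindFor` are hypothetical names used only for illustration; they are not the driver's actual data structures.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical bitmask, loosely modelled on the idea of offload kinds.
enum OffloadKind : unsigned { OFK_None = 0, OFK_Cuda = 1, OFK_HIP = 2 };

struct LinkInput {
  const char *file;
  unsigned activeKinds; // kinds derived from the originating input argument
};

// Old behaviour as described above: every input is marked with the union.
unsigned unionKind(const std::vector<LinkInput> &inputs) {
  unsigned all = OFK_None;
  for (const LinkInput &in : inputs)
    all |= in.activeKinds;
  return all;
}

// New behaviour as described above: each input keeps its own kind.
unsigned kindFor(const std::vector<LinkInput> &inputs, std::size_t i) {
  assert(i < inputs.size());
  return inputs[i].activeKinds;
}
```

For the inputs `a.hip` (HIP) and `b.cpp` (no offloading), `unionKind` yields `OFK_HIP` for both, whereas `kindFor` yields `OFK_None` for `b.cpp`.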

Reviewed by: Artem Belevich

Differential Revision: https://reviews.llvm.org/D120911

e5eb3650

[HIP] Fix job action offloading kind for mixed HIP/C++ compilation · bde13a81

Yaxun (Sam) Liu authored Mar 02, 2022

When both HIP and C++ programs are input files to clang
with -c, clang treats C++ programs as HIP programs,
which is incorrect.

This is because the action builder does not set the correct
offloading kind for job actions for C++ programs.

Reviewed by: Artem Belevich

Differential Revision: https://reviews.llvm.org/D120910

bde13a81

[clang] [concepts] Check constrained-auto return types for void-returning functions · f0891cd6

Arthur O'Dwyer authored Feb 14, 2022

Fixes #49188.

Differential Revision: https://reviews.llvm.org/D119184
f0891cd6

[NFC] Divide tests into smaller files · 5a148869

4vtomat authored Mar 03, 2022

This commit divides the large test files (over 30k lines) under clang/test/CodeGen/RISCV, including:
rvv-intrinsics/vloxseg.c
rvv-intrinsics/vluxseg.c
rvv-intrinsics-overloaded/vloxseg.c
rvv-intrinsics-overloaded/vluxseg.c
into a "non-masked" version and a "masked" version, which reduces the number of test cases in a single file by 50%.

Differential Revision: https://reviews.llvm.org/D120967

5a148869

[Driver] Split up huge arm-cortex-cpus.c test. · fb42e557

Florian Hahn authored Mar 04, 2022

This test file has grown to the point where it takes a huge amount of
time to run. At the moment, this test seems to consistently time out
when running in the pre-commit checks in Phabricator with a 10 minute
timeout. For example see
https://reviews.llvm.org/harbormaster/unit/view/2832724/

While splitting up the test file is not ideal, it is even more
undesirable to have huge test files that time out in common settings.

This patch splits up the test file roughly in the middle.

Reviewed By: dmgreen

Differential Revision: https://reviews.llvm.org/D120876

fb42e557

[Driver] Split up huge aarch64-cpus.c test. · 8f5bdaf4

Florian Hahn authored Mar 04, 2022

This test file has grown to the point where it takes a huge amount of
time to run. At the moment, this test seems to consistently time out
when running in the pre-commit checks in Phabricator with a 10 minute
timeout. For example see
https://reviews.llvm.org/harbormaster/unit/view/2832723/

While splitting up the test file is not ideal, it is even more
undesirable to have huge test files that time out in common settings.

This patch splits up the test file roughly in the middle.

Reviewed By: dmgreen

Differential Revision: https://reviews.llvm.org/D120875

8f5bdaf4

[tests][Driver] Pass an empty sysroot for `DEFAULT_SYSROOT` builds · 4f637c30

Tim Northover authored Mar 04, 2022

The baremetal-sysroot test fails when the toolchain is configured with
DEFAULT_SYSROOT. So, to emulate not having passed one at all, let's
pass an empty sysroot instead.

https://reviews.llvm.org/D119144

Patch by Carlo Cabrera <carlo.antonio.cabrera@gmail.com>

4f637c30

Mar 03, 2022

[analyzer] Done some changes to detect Uninitialized read by the char array manipulation functions · bd1917c8

Shivam authored Mar 03, 2022

A few weeks back I was experimenting with reading uninitialized values from `src`, which is actually a bug, but the CSA seems to give up at that point. I was curious about that and pinged @steakhal on Discord; according to him, this seems to be a genuine issue that needs to be fixed. So I went ahead with fixing this bug, and thanks to @steakhal, who helped me create this patch. This feature seems to break some tests, but the problem is genuine and the broken tests also need to be fixed. I added a test, but we need more tests; I'll try to add more. Thanks.

Reviewed By: steakhal, NoQ

Differential Revision: https://reviews.llvm.org/D120489

bd1917c8

[Clang] Diagnose invalid member variable with template parameters · 942c0391

Corentin Jabot authored Mar 03, 2022

Fixes https://github.com/llvm/llvm-project/issues/54151

Reviewed By: erichkeane, aaron.ballman

Differential Revision: https://reviews.llvm.org/D120881

942c0391

[Concepts] Check constraints for explicit template instantiations · 5aeaabf3

Roy Jacobson authored Mar 03, 2022

The standard requires[0] member function constraints to be checked when
explicitly instantiating classes. This patch adds this constraint
check.

This issue is tracked as #46029 [1].

Note that there's a related open CWG issue (2421[2]) about what to do when
multiple candidates have satisfied constraints. This is particularly an
issue because mangling doesn't contain function constraints, and so the
following code still ICEs with "definition with same mangled name
'_ZN1BIiE1fEv' as another definition":

template<class T>
struct B {
  int f() requires std::same_as<T, int> {
    return 0;
  }
  int f() requires (std::same_as<T, int> &&
                    !std::same_as<T, char>) {
    return 1;
  }
};

template struct B<int>;

Also note that constraint checking while instantiating *functions*
is still not implemented. I started looking at it, but it's a bit more
complicated. I believe in such a case we have to consider the partial
constraints order and potentially choose the best candidate out of the
set of multiple valid ones.

[0]: https://eel.is/c++draft/temp.explicit#10
[1]: https://github.com/llvm/llvm-project/issues/46029
[2]: https://cplusplus.github.io/CWG/issues/2421.html

Differential Revision: https://reviews.llvm.org/D120255

5aeaabf3

[analyzer] Improve NoOwnershipChangeVisitor's understanding of deallocators · d8320789

Kristóf Umann authored Feb 04, 2022

The problem with leak bug reports is that the most interesting event in the code
is likely the one that did not happen -- lack of ownership change and lack of
deallocation, which is often present within the same function that the analyzer
inlined anyway, but not on the path of execution on which the bug occurred. We
struggle to understand that a function was responsible for freeing the memory,
but failed.

D105819 added a new visitor to improve memory leak bug reports. In addition to
inspecting the ExplodedNodes of the bug path, the visitor tries to guess whether
the function was supposed to free memory, but failed to. Initially (in D108753),
this was done by checking whether a CXXDeleteExpr is present in the function. If
so, we assume that the function was at least partly responsible, and prevent the
analyzer from pruning bug report notes in it. This patch improves this heuristic
by recognizing all deallocator functions that MallocChecker itself recognizes,
by reusing MallocChecker::isFreeingCall.
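
To make the heuristic concrete, here is a minimal sketch of the kind of function it is aimed at: one that calls a deallocator on some path but not on another. `releaseIf` and `demo` are hypothetical names, and whether the analyzer emits any particular note for this exact code has not been verified here.

```cpp
#include <cassert>
#include <cstdlib>

// Hypothetical example. The function calls std::free on one path only, so it
// looks like it was meant to take ownership of `p`. It contains no `delete`
// expression, only a call to a deallocator function.
bool releaseIf(int *p, bool shouldFree) {
  if (shouldFree) {
    std::free(p); // deallocator call
    return true;
  }
  return false; // ownership did not change on this path
}

// Returns true when the buffer was released inside releaseIf.
bool demo(bool shouldFree) {
  int *p = static_cast<int *>(std::malloc(sizeof(int)));
  assert(p != nullptr);
  bool released = releaseIf(p, shouldFree);
  // Freed here so the sketch itself does not leak. Deleting this statement
  // gives the kind of leak discussed above.
  if (!released)
    std::free(p);
  return released;
}
```

Under the earlier heuristic, a function like `releaseIf` would presumably not count as partly responsible, because it frees with `std::free` rather than a `delete` expression.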

Differential Revision: https://reviews.llvm.org/D118880

d8320789

[AST] Use RecoveryExpr to model a DeclRefExpr which refers to an invalid Decl. · ba6c71b1

Haojian Wu authored Feb 28, 2022

Previously, we didn't build a DeclRefExpr that refers to an invalid declaration.

In this patch, we handle this case by building an empty RecoveryExpr,
which will preserve more broken code (AST parent nodes that contain the
RecoveryExpr are preserved in the AST).

Differential Revision: https://reviews.llvm.org/D120812

ba6c71b1

[AMDGPU] Add gfx1036 target · 84069581

Aakanksha authored Mar 02, 2022

Differential Revision: https://reviews.llvm.org/D120846

84069581

Mar 02, 2022

[AMDGPU] Add gfx940 target · 2e2e64df

Stanislav Mekhanoshin authored Feb 28, 2022

This is target definition only.

Differential Revision: https://reviews.llvm.org/D120688

2e2e64df

[clang][CGStmt] fix crash on invalid asm statement · f76d3b80

Tong Zhang authored Mar 02, 2022

Clang is crashing on the following statement

  char var[9];
  __asm__ ("" : "=r" (var) : "0" (var));

This is similar to existing test: crbug_999160_regtest

The issue happens when EmitAsmStmt is trying to convert input to match
output type length. However, that is not guaranteed to be successful all the
time and if the statement itself is invalid like having an array type in
the example, we should give a regular error message here instead of
using assert().

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D120596

f76d3b80

[clang-offload-bundler] Fix typo in a test case · 1e78d07d

Saiyedul Islam authored Mar 02, 2022

The intermediate file of one of the tests was getting overwritten due
to a name clash.

1e78d07d

[pseudo] Fix an out-of-bound error in LRTable::find. · 28efb1cc

Haojian Wu authored Mar 01, 2022

The linear scan should not escape the TargetedStates range.
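
A minimal sketch of a range-bounded linear scan, assuming a flattened table in which the entries of interest occupy a half-open index range. `Range`, `kNotFound` and `findInRange` are hypothetical names and do not mirror the actual LRTable layout.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical flattened table: the relevant entries occupy [begin, end).
struct Range {
  std::size_t begin;
  std::size_t end;
};

const std::size_t kNotFound = static_cast<std::size_t>(-1);

// Linear scan that stops at the end of the range instead of running on into
// entries that belong to a neighbouring range.
std::size_t findInRange(const std::vector<int> &states, Range range, int target) {
  assert(range.begin <= range.end && range.end <= states.size());
  for (std::size_t i = range.begin; i < range.end; ++i)
    if (states[i] == target)
      return i;
  return kNotFound;
}
```

With `states = {1, 2, 3, 4}`, searching for `3` in the range `[0, 2)` returns `kNotFound` even though `3` exists later in the vector.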

Differential Revision: https://reviews.llvm.org/D120723

28efb1cc

[clang-offload-bundler] HIP and OpenMP comaptibility for linking heterogeneous archive library · 7a02abf0

Saiyedul Islam authored Mar 01, 2022

The `hip-openmp-compatible` flag treats the hip and hipv4 offload kinds
as compatible with the openmp offload kind while extracting code objects
from a heterogeneous archive library. The reverse is also considered
compatible if the hip code was compiled with -fgpu-rdc.

This flag only relaxes the compatibility criteria on `OffloadKind`;
the remaining components, such as `Triple` and `GPUArch`, still need to
be compatible.
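
The relaxed rule can be sketched roughly as below. `TargetInfo`, `kindsCompatible` and `isCompatible` are hypothetical simplifications rather than the bundler's real code, and the reverse direction that depends on -fgpu-rdc is deliberately left out.

```cpp
#include <cassert>
#include <string>

// Hypothetical simplification; not the bundler's actual data structure.
struct TargetInfo {
  std::string offloadKind;
  std::string triple;
  std::string gpuArch;
};

// Only the offload kind comparison is relaxed by the flag.
bool kindsCompatible(const std::string &codeObjectKind,
                     const std::string &targetKind, bool hipOpenmpCompatible) {
  if (codeObjectKind == targetKind)
    return true;
  if (!hipOpenmpCompatible)
    return false;
  bool objectIsHip = codeObjectKind == "hip" || codeObjectKind == "hipv4";
  return objectIsHip && targetKind == "openmp";
}

// Triple and GPU architecture must still match exactly.
bool isCompatible(const TargetInfo &codeObject, const TargetInfo &target,
                  bool hipOpenmpCompatible) {
  return kindsCompatible(codeObject.offloadKind, target.offloadKind,
                         hipOpenmpCompatible) &&
         codeObject.triple == target.triple &&
         codeObject.gpuArch == target.gpuArch;
}
```

In this sketch, a `hip` code object for `gfx906` matches an `openmp` target for `gfx906` only when the flag is set, and never matches a `gfx908` target.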

Reviewed By: yaxunl

Differential Revision: https://reviews.llvm.org/D120697

7a02abf0

[AST] Print NTTP args as string-literals when possible · 44eee659

Zhihao Yuan authored Mar 01, 2022

C++20 non-type template parameter prints `MyType<{{116, 104, 105, 115}}>` when the code is as simple as `MyType<"this">`. This patch prints `MyType<{"this"}>`, with one layer of braces preserved for the intermediate structural type to trigger CTAD.

`StringLiteral` handles this case, but `StringLiteral` inside `APValue` code looks like a circular dependency. The proposed patch implements a cheap strategy to emit string literals in diagnostic messages only when they are readable and fall back to integer sequences.
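
One way to picture such a cheap strategy is the sketch below: print the values as a quoted string when every element is a plain printable character, and otherwise fall back to the integer sequence. `printCharArray` is a hypothetical helper, and the exact readability rule used by the patch is an assumption here.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper, not Clang's implementation. "Readable" is assumed to
// mean printable ASCII without quotes or backslashes.
std::string printCharArray(const std::vector<int> &values) {
  bool readable = !values.empty();
  for (int v : values)
    if (v < 0x20 || v > 0x7e || v == '"' || v == '\\')
      readable = false;

  std::string out;
  if (readable) {
    out += '"';
    for (int v : values)
      out += static_cast<char>(v);
    out += '"';
    return out;
  }

  // Fallback: a brace-enclosed integer sequence.
  out += '{';
  for (std::size_t i = 0; i < values.size(); ++i) {
    if (i != 0)
      out += ", ";
    out += std::to_string(values[i]);
  }
  out += '}';
  return out;
}
```

Here `printCharArray({116, 104, 105, 115})` produces `"this"` (including the quotes), while `printCharArray({1, 2})` produces `{1, 2}`.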

Reviewed By: aaron.ballman

Differential Revision: https://reviews.llvm.org/D115031

44eee659

Mar 01, 2022

[HWASAN] erase lifetime intrinsics if tag is outside. · 1d730d80

Florian Mayer authored Feb 23, 2022

Reviewed By: eugenis

Differential Revision: https://reviews.llvm.org/D120437

1d730d80

[NVPTX] Add ex2.approx.f16/f16x2 support · 510fd283

Nicolas Miller authored Mar 01, 2022

NOTE: this is a follow-up commit with the missing clang-side changes.

This patch adds builtins and intrinsics for the f16 and f16x2 variants of the ex2
instruction.

These two variants were added in PTX7.0, and are supported by sm_75 and above.

Note that this isn't wired with the exp2 llvm intrinsic because the ex2
instruction is only available in its approx variant.

Running ptxas on the assembly generated by the test f16-ex2.ll works as
expected.

Differential Revision: https://reviews.llvm.org/D119157

510fd283

[NVPTX] Add more FMA intriniscs/builtins · a8951823

Jakub Chlanda authored Mar 01, 2022

NOTE: follow-up commit with the missing clang-side changes.

This patch adds builtins/intrinsics for the following variants of FMA:

- f16, f16x2
  - rn
  - rn_ftz
  - rn_sat
  - rn_ftz_sat
  - rn_relu
  - rn_ftz_relu
- bf16, bf16x2
  - rn
  - rn_relu

ptxas (Cuda compilation tools, release 11.0, V11.0.194) is happy with the generated assembly.
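
For readers unfamiliar with the suffixes, the following host-side sketch approximates what the `relu` and `sat` variants compute, based on my reading of the PTX ISA description of `fma`. It uses `float` instead of f16/bf16, ignores `ftz`, and all three function names are hypothetical.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical host-side approximations in float; the real instructions
// operate on f16/bf16 values and are not modelled exactly here.

// rn: fused multiply-add with round-to-nearest (the default rounding mode).
float fma_rn(float a, float b, float c) { return std::fma(a, b, c); }

// rn_relu: negative results are clamped to zero.
float fma_rn_relu(float a, float b, float c) {
  float r = std::fma(a, b, c);
  return r < 0.0f ? 0.0f : r;
}

// rn_sat: results are clamped to [0.0, 1.0]; NaN is assumed to become 0.0.
float fma_rn_sat(float a, float b, float c) {
  float r = std::fma(a, b, c);
  if (std::isnan(r))
    return 0.0f;
  if (r < 0.0f)
    return 0.0f;
  return r > 1.0f ? 1.0f : r;
}
```

For example, `fma_rn(2, 3, 1)` is 7, `fma_rn_relu(2, -3, 1)` is 0, and `fma_rn_sat(2, 3, 1)` is 1.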

Differential Revision: https://reviews.llvm.org/D118977

a8951823

[NVPTX] Expose float tys min, max, abs, neg as builtins · 7a6d692b

Jakub Chlanda authored Mar 01, 2022

Adds support for the following builtins:

abs, neg:
- .bf16
- .bf16x2

min, max:
- {.ftz}{.NaN}{.xorsign.abs}.f16
- {.ftz}{.NaN}{.xorsign.abs}.f16x2
- {.NaN}{.xorsign.abs}.bf16
- {.NaN}{.xorsign.abs}.bf16x2
- {.ftz}{.NaN}{.xorsign.abs}.f32

Differential Revision: https://reviews.llvm.org/D117887

7a6d692b

[SanitizerBounds] Add support for NoSanitizeBounds function · 17ce89fa

Tong Zhang authored Mar 01, 2022

Currently, adding the attribute no_sanitize("bounds") does not disable
-fsanitize=local-bounds (which is also enabled by -fsanitize=bounds). The Clang
frontend handles -fsanitize=array-bounds, which can already be disabled by
no_sanitize("bounds"). However, instrumentation added by the
BoundsChecking pass in the middle-end cannot be disabled by the
attribute.

The fix is very similar to D102772 that added the ability to selectively
disable sanitizer pass on certain functions.

In this patch, if no_sanitize("bounds") is provided, an additional
function attribute (NoSanitizeBounds) is attached to IR to let the
BoundsChecking pass know we want to disable local-bounds checking. In
order to support this feature, the IR is extended (similar to D102772)
so that Clang can preserve the information and let the BoundsChecking
pass know that bounds checking is disabled for certain functions.
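
From the user's side, the attribute is applied per function. A minimal sketch, with `readUnchecked` and `demo` as hypothetical example functions:

```cpp
#include <cassert>

// With the change described above, this attribute is also meant to suppress
// the checks inserted by the BoundsChecking pass (-fsanitize=local-bounds),
// not only the frontend's array-bounds checks.
__attribute__((no_sanitize("bounds")))
int readUnchecked(const int *data, int index) {
  return data[index];
}

int demo() {
  int values[4] = {10, 20, 30, 40};
  // The access is in bounds; the attribute only affects instrumentation.
  return readUnchecked(values, 2);
}
```

Calling `demo()` returns 30 whether or not a bounds sanitizer is enabled; the attribute changes which checks are emitted, not the result of a valid access.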

Reviewed By: melver

Differential Revision: https://reviews.llvm.org/D119816

17ce89fa

[Clang] Add `-funstable` flag to enable unstable and experimental features · 3cdc1c15

Egor Zhdan authored Feb 18, 2022

This new flag enables `__has_feature(cxx_unstable)` that would replace libc++ macros for individual unstable/experimental features, e.g. `_LIBCPP_HAS_NO_INCOMPLETE_RANGES` or `_LIBCPP_HAS_NO_INCOMPLETE_FORMAT`.

This would make it easier and more convenient to opt in to all libc++ unstable features at once.
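
A minimal sketch of how code could test for the feature, assuming the feature name `cxx_unstable` given above. The fallback macro, `kUnstableEnabled` and `unstableStatus` are illustration only.

```cpp
#include <cassert>

// Fallback for compilers that do not provide __has_feature at all.
#ifndef __has_feature
#define __has_feature(x) 0
#endif

// True only when the compiler reports the feature, e.g. when the flag
// described above is passed to a Clang build that supports it.
#if __has_feature(cxx_unstable)
constexpr bool kUnstableEnabled = true;
#else
constexpr bool kUnstableEnabled = false;
#endif

const char *unstableStatus() {
  return kUnstableEnabled ? "unstable features enabled"
                          : "unstable features disabled";
}
```

A library could use one such check in place of a separate macro per experimental feature.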

Differential Revision: https://reviews.llvm.org/D120160

3cdc1c15

[NVPTX] Fix nvvm.match.sync*.i64 intrinsics return type (i64 -> i32) · 57aaab3b

Kristina Bessonova authored Mar 01, 2022

NVVM IR specification defines them with i32 return type:

  declare i32 @llvm.nvvm.match.any.sync.i64(i32 %membermask, i64 %value)
  declare {i32, i1} @llvm.nvvm.match.all.sync.i64(i32 %membermask, i64 %value)
  ...
  The i32 return value is a 32-bit mask where bit position in mask corresponds
  to thread’s laneid.

as well as PTX ISA:

  9.7.12.8. Parallel Synchronization and Communication Instructions: match.sync

  match.any.sync.type  d, a, membermask;
  match.all.sync.type  d[|p], a, membermask;
  ...
  Destination d is a 32-bit mask where bit position in mask corresponds
  to thread’s laneid.

Additionally, ptxas doesn't accept the instructions produced by the NVPTX backend.
After this patch, it compiles with no issues.
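
To illustrate why the result is a 32-bit mask even for 64-bit values, here is a host-side emulation of `match.any` for a full 32-lane warp. It assumes all lanes participate (the membermask is omitted), and `matchAny` and `alternatingValues` are hypothetical helpers, not the intrinsics themselves.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical host-side emulation; all 32 lanes are assumed to participate.
// Bit i of the result is set when lane i holds the same value as `lane`.
std::uint32_t matchAny(const std::array<std::uint64_t, 32> &values,
                       unsigned lane) {
  assert(lane < 32);
  std::uint32_t mask = 0;
  for (unsigned other = 0; other < 32; ++other)
    if (values[other] == values[lane])
      mask |= (std::uint32_t{1} << other);
  return mask;
}

// Test data: even lanes hold one 64-bit value, odd lanes another.
std::array<std::uint64_t, 32> alternatingValues() {
  std::array<std::uint64_t, 32> values{};
  for (unsigned lane = 0; lane < 32; ++lane)
    values[lane] = (lane % 2 == 0) ? 0x100000000ULL : 0x200000000ULL;
  return values;
}
```

The mask has one bit per lane, so its width depends on the warp size rather than on the width of the compared values: `matchAny(alternatingValues(), 0)` is `0x55555555` and `matchAny(alternatingValues(), 1)` is `0xAAAAAAAA`.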

Reviewed By: tra

Differential Revision: https://reviews.llvm.org/D120499

57aaab3b

[C++20][Modules][8/8] Amend module visibility rules for partitions. · a29f8dbb

Iain Sandoe authored Apr 04, 2021

Implementation partitions bring two extra cases where we have
visibility of module-private data.

1) When we import a module implementation partition.
2) When a partition implementation imports the primary module interface.

We maintain a record of direct imports into the current module since
partition decls from direct imports (but not transitive ones) are visible.

The rules on decl-reachability are much more relaxed (with the standard
giving permission for an implementation to load dependent modules and for
the decls there to be reachable, but not visible).

Differential Revision: https://reviews.llvm.org/D118599

a29f8dbb

[clang][analyzer] Add modeling of 'errno'. · d8a2afb2

Balázs Kéri authored Feb 25, 2022

Add a checker to maintain the system-defined value 'errno'.
The value is supposed to be set in the future by existing or
new checkers that evaluate errno-modifying function calls.

Reviewed By: NoQ, steakhal

Differential Revision: https://reviews.llvm.org/D120310

d8a2afb2

[Clang] Remove redundant init-parens in AST print · d1a59eef

Zhihao Yuan authored Feb 28, 2022

Given a dependent `T` (maybe an undeduced `auto`),

Before:

    new T(z)  -->  new T((z))  # changes meaning with more args
    new T{z}  -->  new T{z}
        T(z)  -->      T(z)
        T{z}  -->      T({z})  # forbidden if T is auto

After:

    new T(z)  -->  new T(z)
    new T{z}  -->  new T{z}
        T(z)  -->      T(z)
        T{z}  -->      T{z}

Depends on D113393

Reviewed By: aaron.ballman

Differential Revision: https://reviews.llvm.org/D120608

d1a59eef

[c++2b] Implement P0849R8 auto(x) · 136b2931

Zhihao Yuan authored Feb 28, 2022

https://wg21.link/p0849

Reviewed By: aaron.ballman, erichkeane

Differential Revision: https://reviews.llvm.org/D113393

136b2931

[OpenMPIRBuilder] Implement static-chunked workshare-loop schedules. · a66f7769

Michael Kruse authored Feb 25, 2022

Add applyStaticChunkedWorkshareLoop method implementing static schedule when chunk-size is specified. Unlike a static schedule without chunk-size (where chunk-size is chosen by the runtime such that each thread receives one chunk), we need two nested loops: one for looping over the iterations of a chunk, and a second for looping over all chunks assigned to the threads.
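
The two-loop structure can be sketched sequentially as follows. This is a simulation under stated assumptions, not the generated IR: `staticChunked` is a hypothetical helper, chunks are assumed to be handed out round-robin, and the outermost loop over `tid` stands in for threads that would really run in parallel.

```cpp
#include <cassert>
#include <vector>

// Returns, for every iteration in [0, tripCount), the thread that executes it
// under a static schedule with an explicit chunk size.
std::vector<int> staticChunked(int tripCount, int chunkSize, int numThreads) {
  assert(tripCount >= 0 && chunkSize > 0 && numThreads > 0);
  std::vector<int> owner(tripCount, -1);
  for (int tid = 0; tid < numThreads; ++tid) { // one pass per simulated thread
    // Outer loop: the chunks assigned to this thread (tid, tid + numThreads, ...).
    for (int chunk = tid; chunk * chunkSize < tripCount; chunk += numThreads) {
      int begin = chunk * chunkSize;
      int end = begin + chunkSize < tripCount ? begin + chunkSize : tripCount;
      // Inner loop: the iterations of one chunk.
      for (int i = begin; i < end; ++i)
        owner[i] = tid;
    }
  }
  return owner;
}
```

For 10 iterations, a chunk size of 2 and 3 threads, the owners are `{0, 0, 1, 1, 2, 2, 0, 0, 1, 1}`: thread 0 runs chunks 0 and 3, thread 1 runs chunks 1 and 4, and thread 2 runs chunk 2.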

This patch includes the following related changes:
 * Adapt applyWorkshareLoop to triage between the schedule types, now possible since all schedules have been implemented. The default schedule is assumed to be non-chunked static, as without OpenMPIRBuilder.
 * Remove the chunk parameter from applyStaticWorkshareLoop, since it is ignored by the runtime. Change the value passed to the init function to 0, as without OpenMPIRBuilder.
 * Refactor CanonicalLoopInfo::setTripCount and CanonicalLoopInfo::mapIndVar, as they are used by both applyStaticWorkshareLoop and applyStaticChunkedWorkshareLoop.
 * Enable Clang to use the OpenMPIRBuilder in the presence of the schedule clause.

Differential Revision: https://reviews.llvm.org/D114413

a66f7769

Feb 28, 2022

[HIP] File device library ABI version file name · 092f15ac

Yaxun (Sam) Liu authored Feb 25, 2022

It should be oclc_abi_version* instead of abi_version*.

Reviewed by: Artem Belevich

Differential Revision: https://reviews.llvm.org/D120557

092f15ac