- Jul 20, 2020
-
-
Jakub Lichman authored
This commit adds the functionality needed to implement convolutions with the linalg.generic op. Because linalg.generic currently expects indexing maps to be plain permutations, the offset indexing that convolutions need is not possible. This commit addresses the issue by adding support for symbols inside indexing maps, which enables more advanced indexing. An upcoming commit will solve the problem of computing loop bounds from such maps. Differential Revision: https://reviews.llvm.org/D83158
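As a rough illustration of why a permutation-only indexing map is not enough, the sketch below writes a 1-D convolution in plain Python. The input is accessed through the map `(i, k)[s0] -> (i * s0 + k)`. The function names and the choice of a stride as the symbol `s0` are assumptions made for this example; this is not the linalg API, and the output loop bound is computed by hand here.

```python
def input_index(i, k, stride):
    """Affine map (i, k)[s0] -> (i * s0 + k); `stride` plays the role of symbol s0."""
    return i * stride + k


def conv1d(inp, kernel, stride=1):
    """1-D convolution (no padding) whose input access is not a permutation of (i, k)."""
    # The loop bound is derived by hand; the commit leaves bound inference
    # from such maps to a follow-up change.
    out_len = (len(inp) - len(kernel)) // stride + 1
    out = []
    for i in range(out_len):
        acc = 0
        for k in range(len(kernel)):
            acc += inp[input_index(i, k, stride)] * kernel[k]
        out.append(acc)
    return out
```

With `stride=1` the access degenerates to the offset form `i + k`; other strides need the symbol.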
-
Fangrui Song authored
This matches LLD and fixes https://sourceware.org/bugzilla/show_bug.cgi?id=26262#c1

.o is a bad choice for save-temps output because it makes it easy to overwrite the bitcode file (*.o):

```
# Use bfd for the example, -fuse-ld=gold is similar.
clang -flto -c a.c  # generate bitcode file a.o
clang -fuse-ld=bfd -flto a.o -o a -Wl,-plugin-opt=save-temps  # overwrites a.o
# The user repeats the command but gets surprised, because a.o is now a combined module.
clang -fuse-ld=bfd -flto a.o -o a -Wl,-plugin-opt=save-temps
```

Reviewed By: tejohnson

Differential Revision: https://reviews.llvm.org/D84132
-
Alexey Bataev authored
Summary: According to OpenMP 5.0, the restrictions on mapping overlapped data apply only to explicitly mapped data; there is no restriction for implicitly mapped data, just as in OpenMP 4.5. Reviewers: jdoerfert Subscribers: yaxunl, guansong, sstefan1, cfe-commits, caomhin Tags: #clang Differential Revision: https://reviews.llvm.org/D83398
-
Vy Nguyen authored
Summary: These don't work with GCC Reviewers: jyknight, #libc! Subscribers: libcxx-commits Tags: #libc Differential Revision: https://reviews.llvm.org/D84183
-
Nick Desaulniers authored
Forked from pr/46523, we were having a hard time running llvm-extract on IR from a thinLTO build of the Linux kernel.

    $ llvm-extract --func jeq_imm jit-42f488b6.ll
    llvm-extract: jit-42f488b6.ll:47463:8: error: Expected 'gv', 'module', or 'typeid' at the start of summary entry
    ^209 = flags: 8
           ^

Reviewed By: tejohnson

Differential Revision: https://reviews.llvm.org/D82917
-
Fangrui Song authored
Supersedes D80225. Add --ld-path= to avoid strange target-specific prefixes and make -fuse-ld= focus on its intended job: "linker flavor". (-f* affects generated code or language features. --ld-path does not affect codegen, so it is not named -f*.)

The way --ld-path= works is similar to "Command Search and Execution" in POSIX.1-2017 2.9.1 Simple Commands. If --ld-path= specifies

* an absolute path, the value specifies the linker.
* a relative path without a path component separator (/), the value is searched for using -B, COMPILER_PATH, then PATH.
* a relative path with a path component separator, the linker is found relative to the current working directory.

-fuse-ld= and --ld-path= can be composed, e.g. `-fuse-ld=lld --ld-path=/usr/bin/ld.lld`

The driver can base its linker option decision on the flavor -fuse-ld=, but it should not do fragile flavor checking with --ld-path=.

Reviewed By: whitequark, keith

Differential Revision: https://reviews.llvm.org/D83015
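The three lookup rules for --ld-path= can be sketched in a few lines of Python. This is a minimal model of the rules as described in the message, not the driver's actual code: `resolve_ld_path`, its parameters, and the choice to return `None` when the search finds nothing are all assumptions made for illustration. `search_dirs` stands in for the -B, COMPILER_PATH, and PATH directories in priority order.

```python
import os
import posixpath


def resolve_ld_path(value, search_dirs, cwd, exists=os.path.exists):
    """Model the lookup for a hypothetical --ld-path=<value> (POSIX-style paths)."""
    # Rule 1: an absolute path names the linker directly.
    if posixpath.isabs(value):
        return value
    # Rule 2: a bare name (no '/') is searched for in the given directories, in order.
    if "/" not in value:
        for directory in search_dirs:
            candidate = posixpath.join(directory, value)
            if exists(candidate):
                return candidate
        return None  # assumption: the real driver's failure behavior is not modeled
    # Rule 3: a relative path containing '/' is taken relative to the working directory.
    return posixpath.normpath(posixpath.join(cwd, value))
```

The `exists` parameter is injectable so the search rule can be exercised without touching the file system.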
-
Vy Nguyen authored
Reviewed By: ldionne,EricWF Tags: #libcxx Differential Revision: https://reviews.llvm.org/D82490
-
Hans Wennborg authored
The test fails in 32-bit Windows builds for unclear reasons: ld.lld: error: failed to open C:\src\llvm_package_1100-rc1\build32_stage0\tools\lld\test\ELF\Output\arm-exidx-range.s.tmp: The parameter is incorrect.
-
David Goldman authored
Summary: We need to detect when certain TypoExprs are not being transformed due to invalid trees, otherwise we risk endlessly trying to fix it. Reviewers: rsmith Subscribers: cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D84067
-
Matt Arsenault authored
-
Matt Arsenault authored
This was failing to add the size of LDS globals that weren't directly used by an instruction. They could be used by constant expressions which are transitively used by the function. This requires a better search, but just abort on this for now for correctness.
-
Benjamin Kramer authored
Even with 300 elements, this still consumes less stack space than the SmallSet. NFCI.
-
Matt Arsenault authored
Return values and tail calls are not yet handled.
-
Matt Arsenault authored
-
Erich Keane authored
As reported in PR46774, an invalid arithmetic conversion used in a C ternary operator resulted in an assertion. This patch replaces that assertion with a diagnostic stating that the conversion failed. At the moment, I believe the only case of this happening is _ExtInt types.
-
Matt Arsenault authored
These imply stack-like semantics, which doesn't make any sense for entry points.
-
Benjamin Kramer authored
-
Benjamin Kramer authored
This is slightly more efficient. NFC.
-
Frederik Gossen authored
Differential Revision: https://reviews.llvm.org/D84156
-
Frederik Gossen authored
Differential Revision: https://reviews.llvm.org/D84155
-
Alok Kumar Sharma authored
Summary: This support is needed for Fortran array variables with the pointer/allocatable attribute. It lets the debugger identify whether such a variable is currently allocated/associated.

For a pointer array (before allocation/association):

without DW_AT_associated

    (gdb) pt ptr
    type = integer (140737345375288:140737354129776)
    (gdb) p ptr
    value requires 35017956 bytes, which is more than max-value-size

with DW_AT_associated

    (gdb) pt ptr
    type = integer (:)
    (gdb) p ptr
    $1 = <not associated>

For an allocatable array (before allocation):

without DW_AT_allocated

    (gdb) pt arr
    type = integer (140737345375288:140737354129776)
    (gdb) p arr
    value requires 35017956 bytes, which is more than max-value-size

with DW_AT_allocated

    (gdb) pt arr
    type = integer, allocatable (:)
    (gdb) p arr
    $1 = <not allocated>

Testing:

* unit test cases added
* check-llvm
* check-debuginfo

Reviewed By: aprantl

Differential Revision: https://reviews.llvm.org/D83544
-
Matt Arsenault authored
This allows tracking the in-memory type of a pointer argument to a function for ABI purposes. This is essentially a stripped-down version of byval that removes some of the stack-copy implications in its definition.

This includes the base IR changes, and some tests for places where it should be treated similarly to byval. Codegen support will be in a future patch.

My original attempt at solving some of these problems was to repurpose byval with a different address space from the stack. However, it is technically permitted for the callee to introduce a write to the argument, although nothing does this in reality. There is also talk of removing and replacing the byval attribute, so a new attribute would need to take its place anyway.

This is intended to avoid some optimization issues with the current handling of aggregate arguments, and it fixes inflexibility in how frontends can specify the kernel ABI. The most honest representation of the amdgpu_kernel convention is to expose all kernel arguments as loads from constant memory. Today, these are raw SSA Argument values, and codegen is responsible for turning them into loads.

Background:

There currently isn't a satisfactory way to represent how arguments for the amdgpu_kernel calling convention are passed. In reality, arguments are passed in a single, flat, constant memory buffer implicitly passed to the function. It is also illegal to call this function in the IR, and this is only ever invoked by a driver of some kind.

It does not make sense to have a stack-passed parameter in this context, as is implied by byval. It is never valid to write to the kernel arguments, as this would corrupt the inputs seen by other dispatches of the kernel. These arguments are also not in the same address space as the stack, so a copy is needed to an alloca. From a source C-like language, the kernel parameters are invisible. Semantically, a copy is always required from the constant argument memory to a mutable variable.
The current clang calling convention lowering emits raw values, including aggregates, into the function argument list, since using byval would not make sense. This has some unfortunate consequences for the optimizer. In the aggregate case, we end up with an aggregate store to alloca, which both SROA and instcombine turn into a store of each aggregate field. The optimizer never pieces this back together to see that this is really just a copy from constant memory, so we end up stuck with expensive stack usage.

This also means the backend dictates the alignment of arguments, and arbitrarily picks the LLVM IR ABI type alignment. By allowing an explicit alignment, frontends can make better decisions. For example, there's really no advantage to an alignment higher than 4, so a frontend could choose to compact the argument layout. Similarly, there is a high penalty to using an alignment lower than 4, so a frontend could opt into more padding for small arguments.

Another design consideration is when it is appropriate to expose the fact that these arguments are all really passed in adjacent memory. Currently we have a late IR optimization pass in codegen to rewrite the kernel argument values into explicit loads to enable vectorization. In most programs, unrelated argument loads can be merged together. However, exposing this property directly from the frontend has some disadvantages. We still need a way to track the original argument sizes and alignments to report to the driver. I find using some side-channel metadata mechanism to track this unappealing. If the kernel arguments were exposed as a single buffer to begin with, alias analysis would be unaware that the padding bits between arguments are meaningless. Another family of problems is that there are still some gaps in replacing all of the available parameter attributes with metadata equivalents once lowered to loads.

The immediate plan is to start using this new attribute to handle all aggregate arguments for kernels.
Long term, it makes sense to migrate all kernel arguments, including scalars, to be passed indirectly in the same manner. Additional context is in D79744.
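The alignment trade-off described above can be made concrete with a small layout calculator. This is only a sketch under stated assumptions: `layout`, `min_align`, and `max_align` are hypothetical names, and the argument sizes are made up; nothing here reflects how clang or the AMDGPU backend actually lays out kernel arguments.

```python
def layout(args, min_align=1, max_align=None):
    """Place arguments in one flat buffer.

    args: list of (size, natural_align) pairs in bytes.
    Returns (offsets, total_size).
    """
    offsets = []
    offset = 0
    for size, align in args:
        # A frontend may raise small alignments or cap large ones.
        align = max(align, min_align)
        if max_align is not None:
            align = min(align, max_align)
        # Round the running offset up to the chosen alignment.
        offset = (offset + align - 1) // align * align
        offsets.append(offset)
        offset += size
    return offsets, offset
```

For arguments of sizes 4, 8, and 16 bytes with natural alignment, the buffer needs 32 bytes; capping alignment at 4 compacts it to 28. Going the other way, `min_align=4` pads a 1-byte and a 2-byte argument onto separate 4-byte boundaries.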
-
Simon Pilgrim authored
Move the include down to files that actually depend on MCExpr definitions. This also exposes an implicit dependency on MCContext in AVRAsmBackend.h.
-
Simon Pilgrim authored
This is defined in CodeGenTarget.h which we have to explicitly include already.
-
Simon Pilgrim authored
-
Petar Avramovic authored
Legalize using narrowScalar as s16->s32 G_FPEXT followed by s32->s64 G_FPEXT. Differential Revision: https://reviews.llvm.org/D84030
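Splitting the extension into two steps is safe because each floating-point widening is exact. The sketch below checks this numerically with Python's `struct` module, comparing a direct half-to-double conversion against half-to-float followed by float-to-double. The helper names are made up for this example, and this models only the arithmetic, not the GlobalISel legalizer.

```python
import struct


def fpext_f16_to_f32(bits16):
    """Widen an IEEE half (given as a 16-bit pattern) to a float32 bit pattern."""
    (value,) = struct.unpack("<e", struct.pack("<H", bits16))
    return struct.unpack("<I", struct.pack("<f", value))[0]


def fpext_f32_to_f64(bits32):
    """Widen a float32 bit pattern to a float64 bit pattern."""
    (value,) = struct.unpack("<f", struct.pack("<I", bits32))
    return struct.unpack("<Q", struct.pack("<d", value))[0]


def fpext_f16_to_f64_direct(bits16):
    """Widen a half straight to float64 in one step."""
    (value,) = struct.unpack("<e", struct.pack("<H", bits16))
    return struct.unpack("<Q", struct.pack("<d", value))[0]


def fpext_f16_to_f64_two_step(bits16):
    """s16 -> s32 extension followed by s32 -> s64 extension."""
    return fpext_f32_to_f64(fpext_f16_to_f32(bits16))
```

Both routes should agree bit for bit on zeros, normal values, subnormals, and infinities; NaN payloads are deliberately left out of this check.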
-
Matt Arsenault authored
-
Matt Arsenault authored
This handling didn't make any sense for vectors.
-
Matt Arsenault authored
This was missing an operand from BFE and not erasing the original instruction.
-
Matt Arsenault authored
-
Matt Arsenault authored
-
Pavel Labath authored
The function was fairly complicated and didn't support the new, larger integer sizes. Use an LLVM function for loading an APInt from memory to write a unified implementation for all sizes.
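The benefit of a single arbitrary-width load, instead of one branch per integer size, can be shown with Python's arbitrary-precision integers. This is an analogy only: `load_uint` is a hypothetical helper written for this example, and it is neither the LLDB function being rewritten nor the LLVM routine the commit uses.

```python
def load_uint(memory, offset, size, byteorder="little"):
    """Read `size` bytes at `offset` as one unsigned integer, whatever the width."""
    chunk = memory[offset:offset + size]
    if len(chunk) != size:
        raise ValueError("read past end of buffer")
    # One code path covers 1, 2, 4, 8, 16, or any other byte count.
    return int.from_bytes(chunk, byteorder)
```

A 16-byte (128-bit) load goes through exactly the same path as a 1-byte load.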
-
Pavel Labath authored
The sed line in the rules was adding the .d file as a target to the dependency rules -- to ensure the file gets rebuilt when the sources change. The same thing can be achieved more elegantly with some -M flags.
-
Benjamin Kramer authored
-
Haojian Wu authored
Some examples are working already. Differential Revision: https://reviews.llvm.org/D84146
-
Haojian Wu authored
Differential Revision: https://reviews.llvm.org/D84145
-
Haojian Wu authored
-
Benjamin Kramer authored
-
Haojian Wu authored
Adjust an existing diagnostic test; the change improves a secondary diagnostic. Differential Revision: https://reviews.llvm.org/D81163
-
Pavel Labath authored
Now that the main test results are reported through lit, and we only have one formatter class, this code is unnecessarily baroque.
-