- Aug 23, 2021
-
Jon Chesterfield authored
Add include path to the cmakefiles and set the target_impl enums from the llvm constants instead of copying the values. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D108391
-
Stanislav Mekhanoshin authored
D106408 enables rematerialization of instructions with virtual register uses. That has uncovered a bug in the allUsesAvailableAt implementation: https://bugs.llvm.org/show_bug.cgi?id=51516. In the majority of cases canRematerializeAt() is called to check whether an instruction can be rematerialized before the given UseIdx. However, SplitEditor::enterIntvAtEnd() calls it to rematerialize an instruction at the end of a block, passing LIS.getMBBEndIdx() into the check. In the testcase from the bug it attempted to rematerialize ADDXri after STRXui in bb.17. The use operand %55 of the ADD is killed by the STRX, but the check does not detect this because it adjusts the passed UseIdx to the reg slot, before the kill. The value is, however, dead at the index passed to the check. This change uses the later of the passed UseIdx and its reg slot. This should be correct: if we are checking the availability of operands before an instruction, that instruction cannot be the one defining those operands, and if we are checking for late rematerialization, we are really interested in whether the operands live past the instruction. The bug is not exploitable without D106408, but the fix is needed to reland the reverted D106408. Differential Revision: https://reviews.llvm.org/D108475
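The effect of taking the later of the two indexes can be illustrated with a small model. This is only a sketch under stated assumptions: `Index`, `Slot`, `regSlotOf`, `oldCheckPoint`, `newCheckPoint` and `liveAt` are hypothetical stand-ins, not LLVM's `SlotIndex` API, and a live range is modelled as ending (inclusively) at the register slot of the killing instruction.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical, simplified slot numbering: every instruction owns four
// consecutive slots. Loosely inspired by LLVM's SlotIndex, not its real API.
enum Slot { Block = 0, EarlyClobber = 1, Register = 2, Dead = 3 };

struct Index {
  int value; // instruction number * 4 + slot
};

inline Index makeIndex(int instr, Slot slot) { return {instr * 4 + slot}; }

// Register slot of the instruction that idx belongs to.
inline Index regSlotOf(Index idx) { return {(idx.value / 4) * 4 + Register}; }

// Behaviour described as buggy: always check at the register slot.
inline Index oldCheckPoint(Index useIdx) { return regSlotOf(useIdx); }

// Behaviour described by the fix: the later of useIdx and its register slot.
inline Index newCheckPoint(Index useIdx) {
  return {std::max(useIdx.value, regSlotOf(useIdx).value)};
}

// Modelling assumption: a value is live from its def up to and including
// the register slot of the instruction that kills it.
inline bool liveAt(Index def, Index kill, Index idx) {
  assert(def.value <= kill.value && "malformed live range");
  return def.value <= idx.value && idx.value <= kill.value;
}
```

In this model, an operand killed at the register slot of instruction 5 still looks available when the check is moved back from the dead slot of instruction 5 to its register slot, but not when the later of the two indexes is used.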
-
Zarko Todorovski authored
[PowerPC][AIX] Set the HasAlloca flag in the AIX Traceback Table only if R31 is used as a frame pointer After c0639464 usage of R31 doesn't necessarily mean that alloca is used. The `TracebackTable::IsAllocaUsedMask` flag should be set only when R31 is used as a frame pointer. On AIX the `function calls alloca' bit seems to be set whenever R31 is set up as a frame pointer, even when there is no alloca call. Reviewed By: lkail Differential Revision: https://reviews.llvm.org/D108141
-
Sanjay Patel authored
This is NFC-intended when viewed from outside the pass. I was trying to make sure that we don't infinite loop in subtract combines and noticed that we handle the non-canonical forms of add/sub here, but it should not be necessary. Coding it this way seems slightly clearer than mixing all 4 patterns as before.
-
River Riddle authored
This revision fixes a bug where an operation would get replaced with a pre-existing constant that didn't dominate it. This can occur when a pattern inserts operations to be folded at the beginning of the constants insertion block. This revision fixes the bug by moving the existing constant before the replaced operation in such cases. This is fine because if a constant didn't already exist, a new one would have been inserted before this operation anyways. Differential Revision: https://reviews.llvm.org/D108498
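The fix described above can be pictured with a toy model of a single block. This is a sketch only: `Block`, `comesBefore` and `ensureConstantDominates` are hypothetical names, operations are plain strings, and dominance is reduced to "appears earlier in the same block".

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

using Block = std::vector<std::string>; // operations in program order

// Within one block, a dominates b only if it appears strictly earlier.
inline bool comesBefore(const Block &block, const std::string &a,
                        const std::string &b) {
  auto ia = std::find(block.begin(), block.end(), a);
  auto ib = std::find(block.begin(), block.end(), b);
  assert(ia != block.end() && ib != block.end() && "both ops must exist");
  return ia < ib;
}

// Before replacing op with an existing constant, move the constant directly
// in front of op if it currently appears later in the block.
inline void ensureConstantDominates(Block &block, const std::string &constant,
                                    const std::string &op) {
  if (comesBefore(block, constant, op))
    return;
  auto ic = std::find(block.begin(), block.end(), constant);
  auto io = std::find(block.begin(), block.end(), op);
  std::rotate(io, ic, ic + 1); // shifts [io, ic) right by one, constant first
}
```

Moving the constant is harmless in this model for the same reason given in the message: had no constant existed, a new one would have been created at that position anyway.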
-
Alex Langford authored
-
Krzysztof Drewniak authored
Reviewed By: rriddle Differential Revision: https://reviews.llvm.org/D108135
-
Alfonso Gregory authored
Reviewed By: sivachandra Differential Revision: https://reviews.llvm.org/D106902
-
Greg Clayton authored
When a function has no line table, but does have debug info (DW_TAG_subprogram), we fall back to creating a line table with a single line entry that has the start address of the function and the source file and line of the function declaration. The bug in this code was that we might have a DW_TAG_subprogram that uses a DW_AT_specification or DW_AT_abstract_origin that points to another DIE, and that DIE might be in another compile unit. We were grabbing the file index value from the DIE, and that index could come from the other DIE in another compile unit that has its own, completely different file table, so we might be using a file index from one compile unit with the file table from another. This was causing a crash in llvm-gsymutil when run against dSYM files. dsymutil, the Apple DWARF linker, will often unique types and can end up with more absolute references across different compile units. The fix is to use the DWARFDie::getDeclFile(...) accessor, as it fetches this information correctly. Differential Revision: https://reviews.llvm.org/D108497
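The mismatch between a file index and the file table it is resolved against can be sketched as follows. The types and functions here (`CompileUnit`, `Die`, `declFileBuggy`, `declFileFixed`) are hypothetical simplifications for illustration and are not the LLVM DWARF API.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical, heavily simplified model of compile units and DIEs.
struct CompileUnit {
  std::vector<std::string> fileTable; // indexed by a decl-file index
};

struct Die {
  const CompileUnit *owner; // unit whose file table declFile refers to
  std::size_t declFile;     // index into owner->fileTable
};

// Buggy pattern: the index from the referenced DIE is looked up in the
// file table of the unit that merely refers to it.
inline std::string declFileBuggy(const CompileUnit &referringUnit,
                                 const Die &referenced) {
  return referringUnit.fileTable.at(referenced.declFile);
}

// Fixed pattern: resolve the index in the unit that owns the DIE.
inline std::string declFileFixed(const Die &referenced) {
  assert(referenced.owner && "DIE must belong to a compile unit");
  return referenced.owner->fileTable.at(referenced.declFile);
}
```

With two units whose tables differ, the buggy lookup silently returns the wrong file name, or goes out of bounds when the referring unit's table is shorter.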
-
Jessica Paquette authored
Same as G_LROUND: destination should always be a GPR, source should always be a FPR. Differential Revision: https://reviews.llvm.org/D108566
-
Chris Bieneman authored
This patch adds `#pragma clang restrict_expansion` to enable flagging macros as unsafe for header use. This allows macros that may have ABI implications to be avoided in headers that have ABI stability promises. Using macros in headers (particularly public headers) can cause a variety of issues relating to ABI and modules. The new pragma emits warnings when annotated macros are used outside the main source file. This warning is added under a new diagnostics group, -Wpedantic-macros. Reviewed By: aaron.ballman Differential Revision: https://reviews.llvm.org/D107095
-
Jessica Paquette authored
Same as G_LROUND. Also add a TODO for full fp16 legalization. Differential Revision: https://reviews.llvm.org/D108564
-
Jessica Paquette authored
Translate it using `IRTranslator::translateSimpleIntrinsic`. Differential Revision: https://reviews.llvm.org/D108563
-
Jon Chesterfield authored
Remove redundant fields and replace pointer with virtual function Of fourteen fields, three are dead and four can be computed from the remainder. This leaves a couple of currently dead fields in place as they are expected to be used from the deviceRTL shortly. Two of the fields that can be computed are only used from codegen and require a log2() implementation so are inlined into codegen instead. This change leaves the new methods in the same location in the struct as the previous fields for convenience at review. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D108380
-
Florian Hahn authored
-
Florian Hahn authored
This reverts commit 3aa009cc. The reverted commit causes an infinite loop in instcombine. See PR51584.
-
Jinsong Ji authored
We found that AIX was not covered in most of the InstrProfiling tests. So we are trying to enable the tests gradually. This is to add AIX triple to platform tests to make sure the registrations are OK. Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D108490
-
Alexander Potapenko authored
Looks like non-x86 bots are unhappy with inclusion of <stdatomic.h>, e.g.:
clang-armv7-vfpv3-2stage - https://lab.llvm.org/buildbot/#/builders/182/builds/626
clang-ppc64le-linux - https://lab.llvm.org/buildbot/#/builders/76/builds/3619
llvm-clang-win-x-armv7l - https://lab.llvm.org/buildbot/#/builders/60/builds/4514
It seems to be unnecessary, so just remove it and replace atomic_load() calls with dereferences of _Atomic*. Differential Revision: https://reviews.llvm.org/D108555
-
Krasimir Georgiev authored
This fixes up a regression we found from https://reviews.llvm.org/D107267: in specific contexts, clang-format stopped breaking after the `)` in TypeScript decorations. There were no test cases covering this, so I added one. Reviewed By: MyDeveloperDay Differential Revision: https://reviews.llvm.org/D108538
-
Peyton, Jonathan L authored
The omp_get_wtime.c test fails intermittently if the recorded times are off by too much, which can happen when many tests are run in parallel. Instead of failing if one timing is a little off, take the average of 100 timings minus the 10 worst. Differential Revision: https://reviews.llvm.org/D108488
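The averaging scheme described above amounts to a trimmed mean. A minimal sketch, assuming "worst" means the largest timing errors; `averageWithoutWorst` is a hypothetical helper and not the code used in the test:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Average of the samples after discarding the dropWorst largest ones.
inline double averageWithoutWorst(std::vector<double> errors,
                                  std::size_t dropWorst) {
  assert(dropWorst < errors.size() && "must keep at least one sample");
  std::sort(errors.begin(), errors.end()); // ascending: worst values last
  const std::size_t kept = errors.size() - dropWorst;
  const auto keptEnd = errors.begin() + static_cast<std::ptrdiff_t>(kept);
  const double sum = std::accumulate(errors.begin(), keptEnd, 0.0);
  return sum / static_cast<double>(kept);
}
```

Calling it with 100 samples and `dropWorst = 10` keeps a handful of outliers caused by machine load from failing the whole test.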
-
Simon Pilgrim authored
Shows LoadedSlice::canMergeExpensiveCrossRegisterBankCopy failure to merge unaligned dereferenceable loads. Another candidate for PR45116.
-
Andy Wingo authored
This function was defaulting to the ABI alignment for the LLVM type. Here we change it to use the preferred alignment. This will allow unification with GetTempAlloca, which uses the preferred alignment if no alignment is specified. Differential Revision: https://reviews.llvm.org/D108450
-
Andy Wingo authored
The LangAS local is only used in the OpenCL case; move its decl inwards. Differential Revision: https://reviews.llvm.org/D108449
-
Andy Wingo authored
Pass a LangAS instead of a target address space to GetOrCreateLLVMGlobal, to remove a place where the frontend assumes that target address space 0 is special. Differential Revision: https://reviews.llvm.org/D108445
-
Matthias Springer authored
Do not apply loop peeling to loops that are contained in the partial iteration of an already peeled loop. This is to avoid code explosion when dealing with large loop nests. Can be controlled with a new pass option `skip-partial`. Differential Revision: https://reviews.llvm.org/D108542
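For readers unfamiliar with the transformation, peeling the last iteration can be sketched generically. This is an illustration under assumptions, not the pass itself: `PeeledTrace` and `peelLastIteration` are hypothetical, and the loop is modelled as `for (i = lb; i < ub; i += step)` where only the final iteration may cover fewer than `step` elements.

```cpp
#include <cassert>
#include <vector>

// Records which loop handled each induction value.
struct PeeledTrace {
  std::vector<int> fullSteps;    // iterations executed by the main loop
  std::vector<int> partialSteps; // iteration executed by the peeled remainder
};

// Splits the iteration space so that the main loop only sees full steps and
// the remainder loop runs at most once, for the partial iteration.
inline PeeledTrace peelLastIteration(int lb, int ub, int step) {
  assert(step > 0 && lb <= ub && "sketch handles forward loops only");
  PeeledTrace trace;
  const int splitBound = ub - (ub - lb) % step;
  for (int i = lb; i < splitBound; i += step)
    trace.fullSteps.push_back(i);
  for (int i = splitBound; i < ub; i += step)
    trace.partialSteps.push_back(i);
  return trace;
}
```

Every loop nested in the body is duplicated into both halves. If the copies inside the partial iteration were peeled again, the code size would grow with each nesting level, which is the explosion the message refers to.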
-
Chuanqi Xu authored
It would waste time to specialize a function that will eventually be inlined. This patch does two things: - Don't specialize functions which are always-inline. - Don't specialize functions whose lines of code are less than a threshold (100 by default). For spec2017int, this patch could reduce the number of specialized functions by 33%, and compile time did not increase for any benchmark. Reviewed By: SjoerdMeijer, xbolva00, snehasish Differential Revision: https://reviews.llvm.org/D107897
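The two filters can be written as a single predicate. This is a sketch only: `FunctionInfo`, `kDefaultMinSize` and `shouldSpecialize` are hypothetical names, and the size metric is whatever the pass counts as "lines of code".

```cpp
#include <cassert>

// Hypothetical summary of a specialization candidate.
struct FunctionInfo {
  bool alwaysInline;    // function is marked always-inline
  unsigned linesOfCode; // size metric compared against the threshold
};

// Default threshold as stated in the commit message.
constexpr unsigned kDefaultMinSize = 100;

// Returns true only for functions that pass both filters.
inline bool shouldSpecialize(const FunctionInfo &fn,
                             unsigned minSize = kDefaultMinSize) {
  if (fn.alwaysInline)
    return false; // it will be inlined, so a specialized copy is wasted work
  return fn.linesOfCode >= minSize; // small functions are likely inlined too
}

// Usage sketch.
inline void exampleUsage() {
  assert(!shouldSpecialize({true, 500}));
  assert(!shouldSpecialize({false, 99}));
  assert(shouldSpecialize({false, 100}));
}
```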
-
Alexander Potapenko authored
Unlike __attribute__((no_sanitize("thread"))), this one will cause TSan to skip the entire function during instrumentation. Depends on https://reviews.llvm.org/D108029 Differential Revision: https://reviews.llvm.org/D108202
-
Simon Pilgrim authored
Show failure to fold scaled-index into gather/scatter scale operands
-
Florian Hahn authored
This reverts the revert ab9296f1. The issue causing the revert should be fixed in 9baed023.
-
Jay Foad authored
Apparently GCC 11 was warning: AMDGPURegisterBankInfo.cpp:2543:33: warning: enumerated and non-enumerated type in conditional expression [-Wextra]
-
Cullen Rhodes authored
The following scalar FP instructions are legal in streaming mode:

    0101 1110 xx1x xxxx 11x1 11xx xxxx xxxx # FMULX/FRECPS/FRSQRTS (scalar)
    0101 1110 x10x xxxx 00x1 11xx xxxx xxxx # FMULX/FRECPS/FRSQRTS (scalar, FP16)
    01x1 1110 1x10 0001 11x1 10xx xxxx xxxx # FRECPE/FRSQRTE/FRECPX (scalar)
    01x1 1110 1111 1001 11x1 10xx xxxx xxxx # FRECPE/FRSQRTE/FRECPX (scalar, FP16)

Predicate them on `HasNEONorStreamingSVE`.

Full list of affected instructions: FMULX16, FMULX32, FMULX64, FRECPS16, FRECPS32, FRECPS64, FRSQRTS16, FRSQRTS32, FRSQRTS64, FRECPEv1f16, FRECPEv1i32, FRECPEv1i64, FRECPXv1f16, FRECPXv1i32, FRECPXv1i64, FRSQRTEv1f16, FRSQRTEv1i32, FRSQRTEv1i64

Depends on D107902.

The reference can be found here: https://developer.arm.com/documentation/ddi0602/2021-06/SIMD-FP-Instructions

Execution of NEON instructions that are illegal in streaming mode will cause a trap or exception. Using FMULX [1] as an example, this check is at the top of the pseudocode:

    if elements == 1 then CheckFPEnabled64(); else CheckFPAdvSIMDEnabled64();

For the legal scalar variants it calls `CheckFPEnabled64`, whereas for the illegal vector variants it calls `CheckFPAdvSIMDEnabled64`, which traps. This is useful for observing which instructions are/aren't legal in streaming mode.

[1] https://developer.arm.com/documentation/ddi0602/2021-06/SIMD-FP-Instructions/FMULX--Floating-point-Multiply-extended-

Reviewed By: david-arm Differential Revision: https://reviews.llvm.org/D108039
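The encoding rows above are bit patterns in which `x` marks a don't-care bit. One way to test a 32-bit instruction word against such a row is to turn the row into a mask/value pair. This is a sketch for illustration: `EncodingPattern`, `parsePattern` and `matches` are hypothetical helpers, not part of LLVM or of the Arm reference.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

struct EncodingPattern {
  std::uint32_t mask = 0;  // 1 where the pattern fixes a bit
  std::uint32_t value = 0; // required bit values under the mask
};

// Parses a row such as "0101 1110 xx1x ..." (most significant bit first).
// Spaces are ignored; 'x' marks a don't-care bit.
inline EncodingPattern parsePattern(const std::string &text) {
  EncodingPattern p;
  int bits = 0;
  for (char c : text) {
    if (c == ' ')
      continue;
    assert((c == '0' || c == '1' || c == 'x') && "unexpected character");
    p.mask = (p.mask << 1) | (c == 'x' ? 0u : 1u);
    p.value = (p.value << 1) | (c == '1' ? 1u : 0u);
    ++bits;
  }
  assert(bits == 32 && "A64 instructions are 32 bits wide");
  (void)bits; // keeps release builds free of unused-variable warnings
  return p;
}

// True when every fixed bit of the pattern agrees with the instruction.
inline bool matches(const EncodingPattern &p, std::uint32_t instruction) {
  return (instruction & p.mask) == p.value;
}
```

For the first row this yields mask `0xFF20DC00` and value `0x5E20DC00`. A word that differs only in don't-care bits still matches, while one with a different top nibble does not.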
-
Cullen Rhodes authored
Split out from D107903 to remove dependency for D108039 and D108279. Reviewed By: paulwalker-arm Differential Revision: https://reviews.llvm.org/D108293
-
Michael Kruse authored
Code outside the SCoP will be executed regardless of the code versioning runtime check introduced by CodeGeneration. Assumptions based on such code never being executed in Polly-optimized code therefore do not hold. This fixes the miscompilation of MultiSource/Applications/lambda-0.1.3.
-
Siva Chandra Reddy authored
A corresponding adjustment to mtx_lock has also been made.
-
Min-Yih Hsu authored
Cleanup the formats of the MC tests that were just migrated. NFC
-
Min-Yih Hsu authored
Migrate some MOVE instruction MC tests from test/CodeGen/M68k. Unfortunately, the tests touched in this commit failed due to the lack of the `abs.W` operand, which forces any memory address parsed from assembly to be represented in 32 bits. We're temporarily allowing this unwanted widening in the tests until support for `abs.W` is there.
-
Siva Chandra Reddy authored
These functions will be used in a future patch to implement trigonometric functions. Unit tests have been added, but to the libc-long-running-tests suite. The unit tests are long running because we compare against MPFR computations performed at 1280 bits of precision. Some cleanups or elimination of repeated patterns can be done as follow-up changes. Differential Revision: https://reviews.llvm.org/D104817
-
Shilei Tian authored
-
Kai Luo authored
This is the first step toward enabling PPC64 support for huge frame sizes (>2G). Also fix an assertion error for frame size, i.e., `int x; !isInt<32>(x);` always evaluates to false, so the guard code for frame size is impossible to hit. Reviewed By: jsji Differential Revision: https://reviews.llvm.org/D107435
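Why the guard can never fire is easy to see with a local stand-in for the range check. This is a sketch under assumptions: `fitsInSignedBits` is a hypothetical helper written to mirror the intent of `isInt<N>`, and `frameTooLarge32` / `frameTooLarge64` are illustrative names, not functions from the PowerPC backend.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Local stand-in with the same intent as isInt<N>: does x fit in an N-bit
// signed integer?
template <unsigned N> constexpr bool fitsInSignedBits(std::int64_t x) {
  static_assert(N > 0 && N < 64, "sketch only handles 1..63 bits");
  return x >= -(std::int64_t{1} << (N - 1)) &&
         x < (std::int64_t{1} << (N - 1));
}

static_assert(std::numeric_limits<int>::digits <= 31,
              "sketch assumes int is at most 32 bits wide");

// With an int argument the check cannot fail: every int fits in 32 bits.
inline bool frameTooLarge32(int frameSize) {
  return !fitsInSignedBits<32>(frameSize);
}

// With a 64-bit argument the same check becomes meaningful.
inline bool frameTooLarge64(std::int64_t frameSize) {
  return !fitsInSignedBits<32>(frameSize);
}

// Usage sketch.
inline void exampleUsage() {
  assert(!frameTooLarge32(std::numeric_limits<int>::max()));
  assert(frameTooLarge64(std::int64_t{1} << 31));
}
```

Widening the frame size to a 64-bit type is what makes sizes above 2G representable at all, and only then can a range guard like this one be reached.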
-
Michael Kruse authored
The new pass manager does not allow adding module passes at the -polly-position=before-vectorizer extension point. Introduce a DumpFunctionPass that dumps only the current function. In contrast to the legacy pass manager's -polly-dump-before, each function will be dumped into its own file. -polly-dump-before-file is still not supported. The DumpFunctionPass uses llvm::CloneModule to copy the current function into a new module and then writes it into a file.
-