- Jun 19, 2020
-
-
Kristof Beyls authored
A "BTI c" instruction only allows jumping/calling to it using a BLR* instruction. However, the SLSBLR mitigation changes a BLR to a BR to implement the function call. Therefore, a "BTI c" check that passed before could trigger after the BLR->BL change done by the SLSBLR mitigation. However, if the register used in BR is X16 or X17, this trigger will not fire (see ArmARM for further details). Therefore, this patch simply changes the function stubs for the SLSBLR mitigation from

  __llvm_slsblr_thunk_x<N>:
      br x<N>
      SpeculationBarrier

to

  __llvm_slsblr_thunk_x<N>:
      mov x16, x<N>
      br x16
      SpeculationBarrier

Differential Revision: https://reviews.llvm.org/D81405
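The X16/X17 exemption described above can be modeled as a tiny predicate (an illustrative sketch of the rule as described here, not LLVM code; the function name is hypothetical):

```python
def bti_c_traps(branch_kind, target_reg):
    # "BTI c" landing pad: indirect calls (BLR*) are accepted, and an
    # indirect branch (BR) is accepted only via x16 or x17 -- which is
    # why routing the thunk's BR through x16 keeps the check passing.
    if branch_kind == "blr":
        return False  # call-style entry: allowed
    if branch_kind == "br":
        return target_reg not in ("x16", "x17")
    return True  # any other indirect transfer would trap
```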
-
Ronak Chauhan authored
Summary: This allows targets to also consider the symbol's type and/or address if needed. Reviewers: scott.linder, jhenderson, MaskRay, aardappel Reviewed By: scott.linder, MaskRay Subscribers: dschuff, sbc100, jgravelle-google, hiraditya, aheejin, rupprecht, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D82090
-
Francesco Petrogalli authored
Reviewers: efriedma, sdesmalen Reviewed By: efriedma Subscribers: tschuett, hiraditya, rkruppe, psnobl, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D80741
-
Nemanja Ivanovic authored
We currently miss a number of opportunities to emit single-instruction VMRG[LH][BHW] instructions for shuffles on little endian subtargets. Although this in itself is not a huge performance opportunity since loading the permute vector for a VPERM can always be pulled out of loops, producing such merge instructions is useful to downstream optimizations. Since VPERM is essentially opaque to all subsequent optimizations, we want to avoid it as much as possible. Other permute instructions have semantics that can be reasoned about much more easily in later optimizations. This patch does the following:
- Canonicalize shuffles so that the first element comes from the first vector (since that's what most of the mask matching functions want)
- Switch the elements that come from splat vectors so that they match the corresponding elements from the other vector (to allow for merges)
- Adds debugging messages for when a shuffle is matched to a VPERM so that anyone interested in improving this further can get the info for their code
Differential revision: https://reviews.llvm.org/D77448
-
Carl Ritson authored
Add code to respect mad-mac-f32-insts target feature. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D81990
-
Vitaly Buka authored
Summary: Extend StackLifetime with an option to calculate liveness where an alloca is only considered alive on basic block entry if all non-dead predecessors had it alive at their terminators. Depends on D82043. Reviewers: eugenis Reviewed By: eugenis Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D82124
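The "all non-dead predecessors" rule above is a must-style forward dataflow; a minimal sketch (the CFG representation and names are hypothetical, not LLVM's actual StackLifetime code):

```python
def must_alive_at_entry(preds, starts, ends, entry):
    # An alloca is alive at a block's entry only if EVERY predecessor
    # leaves it alive at its terminator; iterate to a fixpoint.
    alive_in = {b: False for b in preds}
    changed = True
    while changed:
        changed = False
        for b in preds:
            if b == entry:
                continue
            ps = preds[b]
            new = bool(ps) and all(
                (alive_in[p] or p in starts) and p not in ends for p in ps
            )
            if new != alive_in[b]:
                alive_in[b] = new
                changed = True
    return alive_in
```

For a diamond CFG where only one arm starts the lifetime, the join block is not considered alive on entry; it becomes alive only when both arms start it.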
-
Nathan James authored
-
Vitaly Buka authored
Summary: lifetime.ll is a copy of SafeStack/X86/coloring2.ll Reviewers: eugenis Reviewed By: eugenis Subscribers: hiraditya, mgrang, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D82043
-
- Jun 18, 2020
-
-
Matt Arsenault authored
Don't do this in the MachineFunctionInfo constructor. Also, ensure the alignment rather than overwriting it outright. I vaguely remember there was another place to enforce the target minimum alignment, but I couldn't find it (it's there for instructions).
-
Matt Arsenault authored
We probably need to move where intrinsics are lowered to copies to make this useful.
-
Matt Arsenault authored
I don't know anything about debug info, but this seems like more work should be necessary. This constructs a new IRBuilder and reconstructs the original divides rather than moving the original. One problem this has is if a div/rem pair are handled, both end up with the same debugloc. I'm not sure how to fix this, since this uses a cache when it sees the same input operands again, which will have the first instance's location attached.
-
Amy Kwan authored
This patch implements builtins for the following prototypes:

  vector unsigned long long vec_pdep(vector unsigned long long, vector unsigned long long);
  vector unsigned long long vec_pext(vector unsigned long long, vector unsigned long long __b);
  unsigned long long __builtin_pdepd (unsigned long long, unsigned long long);
  unsigned long long __builtin_pextd (unsigned long long, unsigned long long);

Revision Depends on D80758
Differential Revision: https://reviews.llvm.org/D80935
-
Matt Arsenault authored
This was passing in all the parameters needed to construct a LegalizerHelper in the custom legalization, when it's simpler to just pass in the existing helper. This is slightly more annoying to use in the common case where you don't need the legalizer helper, but we could add the common parameters back in addition to the helper. I didn't propagate this to all the internal target changes that this logically implies, but did update a sample one for legalizeMinNumMaxNum. This is in preparation for moving AMDGPU load/store legalization entirely into custom lowering. The current set of legalization actions is really constraining and not really capable of expressing all the actions needed to legalize loads/stores. In particular there's no way to express when the memory access itself needs to change size vs. the result type. There's also a lot of redundancy since the same split/widen actions need to be applied in both vector and scalar cases. All of the sub-cases logically belong as steps in the legalizer helper, but it will be easier to consider everything at once in custom lowering.
-
Christopher Tetreault authored
Reviewers: efriedma, c-rhodes, david-arm, Tyker, asbirlea Reviewed By: david-arm Subscribers: tschuett, hiraditya, rkruppe, psnobl, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D82057
-
Alexandre Ganea authored
[CodeView] Revert 8374bf43 and 403f9537

This reverts:
  8374bf43 [CodeView] Fix generated command-line expansion in LF_BUILDINFO. Fix the 'pdb' entry which was previously a null reference, now an empty string.
  403f9537 [CodeView] Add full repro to LF_BUILDINFO record

This is causing the lld/test/COFF/pdb-relative-source-lines.test to fail: http://lab.llvm.org:8011/builders/lld-x86_64-win/builds/1096/steps/test-check-all/logs/FAIL%3A%20lld%3A%3Apdb-relative-source-lines.test
And clang/test/CodeGen/debug-info-codeview-buildinfo.c fails as well: http://lab.llvm.org:8011/builders/clang-s390x-linux/builds/33346/steps/ninja%20check%201/logs/FAIL%3A%20Clang%3A%3Adebug-info-codeview-buildinfo.c
-
Kirill Naumov authored
This functionality is very similar to Function compatibility with AnnotationWriter. This change allows us to use AnnotationWriter with BasicBlock through the BB.print() method. Reviewed-By: apilipenko Differential Revision: https://reviews.llvm.org/D81321
-
Sanjay Patel authored
The predicate can always be used to distinguish between icmp and fcmp, so we don't need to keep repeating this check in the callers.
-
Davide Italiano authored
Sometimes a dead block gets folded and the debug information is still retained. This manifests as jumpy stepping in lldb, see the bugzilla PR for an end-to-end C testcase. Fixes https://bugs.llvm.org/show_bug.cgi?id=46008 Differential Revision: https://reviews.llvm.org/D82062
-
Michael Liao authored
Reviewers: arsenm Subscribers: wdng, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D82025
-
serge-sans-paille authored
Move code that may update the IR after the precondition, so that if the precondition fails, the IR isn't modified. Differential Revision: https://reviews.llvm.org/D81225
-
Matt Arsenault authored
These don't really modify any memory, and should not expect memory operands.
-
Stanislav Mekhanoshin authored
Nothing breaks yet, but all encodings shall be in the map. Differential Revision: https://reviews.llvm.org/D81974
-
Arthur Eubanks authored
When possible (e.g. internal linkage), strip preallocated attribute off parameters/arguments. This requires removing the "preallocated" operand bundle from the call site, replacing @llvm.call.preallocated.arg() with an alloca and a bitcast to i8*, and removing the @llvm.call.preallocated.setup(). Since @llvm.call.preallocated.arg() can be called multiple times with the same arg index, we create an alloca per arg index. We add a @llvm.stacksave() where the @llvm.call.preallocated.setup() was and a @llvm.stackrestore() after the preallocated call to prevent the stack from blowing up. This is valid because the argument would normally not exist on the stack after the call before the transformation. This does not currently handle all possible preallocated calls. We will need to figure out where to put @llvm.stackrestore() in the cases where there is no obvious place to put it, for example conditional preallocated calls, invokes. This sort of transformation may need to be moved to somewhere more accessible to accommodate similar transformations (like inlining) in the future. Reviewers: efriedma, hans Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D80951
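The rewrite described above can be pictured on a single-argument call (a hand-written sketch based on the description; the attribute and operand-bundle syntax is abbreviated and the struct type is hypothetical):

```llvm
; before: token-based preallocated call
%t = call token @llvm.call.preallocated.setup(i32 1)
%p = call i8* @llvm.call.preallocated.arg(token %t, i32 0)
call void @foo(i8* %p) [ "preallocated"(token %t) ]

; after: an alloca per arg index, bracketed by stack save/restore
%ss = call i8* @llvm.stacksave()
%a  = alloca %struct.S
%p2 = bitcast %struct.S* %a to i8*
call void @foo(i8* %p2)
call void @llvm.stackrestore(i8* %ss)
```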
-
Alexandros Lamprineas authored
This patch adds basic support for BFloat in the Arm backend. For now the code generation relies on fullfp16 being present. Briefly:
* adds the bfloat scalar and vector types in the necessary register classes,
* adjusts the calling convention to cope with bfloat argument passing and return,
* adds codegen patterns for moves, loads and stores.
It's tested mostly by the intrinsic patches that depend on it (load/store, convert/copy). The following people contributed to this patch:
* Alexandros Lamprineas
* Ties Stuij
Differential Revision: https://reviews.llvm.org/D81373
-
Simon Pilgrim authored
[TargetLowering] SimplifyMultipleUseDemandedBits - drop already extended ISD::SIGN_EXTEND_INREG nodes. If the source of the SIGN_EXTEND_INREG node is already sign extended, use the source directly.
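The fold above relies on SIGN_EXTEND_INREG being idempotent; a quick model (illustrative Python, not the DAG combiner itself):

```python
def sext_inreg(x, bits, width=32):
    # Sign-extend the low `bits` of x to a signed `width`-bit value,
    # mirroring what ISD::SIGN_EXTEND_INREG computes.
    x &= (1 << width) - 1
    sign = 1 << (bits - 1)
    return ((x & ((1 << bits) - 1)) ^ sign) - sign

# Applying it to an already sign-extended value changes nothing,
# so the combiner can use the source operand directly.
```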
-
Matt Arsenault authored
-
Ayke van Laethem authored
Code like the following:

  define i32 @foo(i32 %a, i1 zeroext %b) addrspace(1) {
  entry:
    %conv = zext i1 %b to i32
    %add = add nsw i32 %conv, %a
    ret i32 %add
  }

Would compile to the following (incorrect) code:

  foo:
    mov r18, r20
    clr r19
    add r22, r18
    adc r23, r19
    sbci r24, 0
    sbci r25, 0
    ret

Those sbci instructions are clearly wrong, they should have been adc instructions. This commit improves codegen to use adc instead:

  foo:
    mov r18, r20
    clr r19
    ldi r20, 0
    ldi r21, 0
    add r22, r18
    adc r23, r19
    adc r24, r20
    adc r25, r21
    ret

This code is not optimal (it could be just 5 instructions instead of the current 9) but at least it doesn't miscompile. Differential Revision: https://reviews.llvm.org/D78439
-
Matt Arsenault authored
This was depending on the MachineFunction at MachineFunctionInfo construction, which will soon be disallowed.
-
Simon Pilgrim authored
Allow combineSetCCMOVMSK to handle 'allof' X == 0 patterns so they can be replaced with PTESTZ. This is a preliminary patch before properly handling PR35129.
-
Alexandre Ganea authored
[CodeView] Fix generated command-line expansion in LF_BUILDINFO. Fix the 'pdb' entry which was previously a null reference, now an empty string. Previously, the DIA SDK didn't like the empty reference in the 'pdb' entry.
-
Kamlesh Kumar authored
Since i32 is not legal on riscv64, it is always promoted to i64 before emitting a lib call, and for conversions like float/double to int and float/double to unsigned int the wrong lib call was emitted. This commit fixes it using custom lowering. Differential Revision: https://reviews.llvm.org/D80526
-
Igor Kudrin authored
The patch renames MakeStartMinusEndExpr() to makeEndMinusStartExpr() to better reflect the expression it creates and to fix a naming style issue. Differential Revision: https://reviews.llvm.org/D82079
-
Alexandre Ganea authored
This patch adds some missing information to the LF_BUILDINFO which allows for rebuilding an .OBJ without any external dependency but the .OBJ itself (other than the compiler executable). Some tools need this information to reproduce a build without any knowledge of the build system. The LF_BUILDINFO therefore stores a full path to the compiler, the PWD (which is the CWD at program startup), a relative or absolute path to the TU, and the full CC1 command line. The command line needs to be freestanding (not depend on any environment variable). In the same way, MSVC doesn't store the provided command-line, but an expanded version (somehow their equivalent of CC1) which is also freestanding. For more information see PR36198 and D43002. Differential Revision: https://reviews.llvm.org/D80833
-
Alexandre Ganea authored
The API is not called in this patch. This is to simplify/support https://reviews.llvm.org/D80833
-
Alexandre Ganea authored
This patch is to support/simplify https://reviews.llvm.org/D80833
-
Florian Hahn authored
This patch updates LowerMatrixIntrinsics to preserve the alignment specified at the original load/stores and the align attribute for the pointer argument of the column.major.load/store intrinsics. We can always use the specified alignment for the load of the first column. For subsequent columns, the alignment may need to be reduced. For ConstantInt strides, compute the offset for the start of the column in bytes and use commonAlignment to get the largest valid alignment. For non-ConstantInt strides, we need to take the common alignment of the initial alignment and the element size in bytes. Reviewers: anemet, Gerolf, hfinkel, andrew.w.kaylor, LuoYuanke, rjmccall Reviewed By: rjmccall Differential Revision: https://reviews.llvm.org/D81960
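For the ConstantInt-stride case, the "largest valid alignment" step can be sketched as follows (an illustrative model of llvm::commonAlignment, not the exact API):

```python
def common_alignment(align, offset_bytes):
    # Largest power-of-two alignment (at most `align`) that still
    # divides the column's byte offset; offset 0 keeps full alignment.
    while offset_bytes % align:
        align //= 2
    return align
```

So with a load aligned to 16 bytes and a column starting 8 bytes in, that column can only be assumed 8-byte aligned.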
-
Lucas Prates authored
Summary: As half-precision floating point arguments and returns were previously coerced to either float or int32 by clang's codegen, the CMSE handling of those was also performed in clang's side by zeroing the unused MSBs of the coerced values. This patch moves this handling to the backend's calling convention lowering, making sure the high bits of the registers used by half-precision arguments and returns are zeroed. Reviewers: chill, rjmccall, ostannard Reviewed By: ostannard Subscribers: kristof.beyls, hiraditya, danielkiss, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D81428
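The zeroing itself is trivial; what this patch moves is where it happens. As a sketch of the bit manipulation only (the helper name is hypothetical):

```python
def clear_unused_msbs(reg32):
    # A half-precision value occupies the low 16 bits of its 32-bit
    # register; CMSE requires the unused upper bits to be zeroed so
    # no secure-state data leaks through them.
    return reg32 & 0xFFFF
```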
-
Lucas Prates authored
Summary: Half-precision floating point arguments and returns are currently promoted to either float or int32 in clang's CodeGen and there's no existing support for the lowering of `half` arguments and returns from IR in AArch32's backend. Such frontend coercions, implemented as coercion through memory in clang, can cause a series of issues in argument lowering, such as causing arguments to be stored in the wrong bits on big-endian architectures and incurring missing overflow detections in the return of certain functions. This patch introduces the handling of half-precision arguments and returns in the backend using the actual "half" type on the IR. Using the "half" type the backend is able to properly enforce the AAPCS' directions for those arguments, making sure they are stored in the proper bits of the registers and performing the necessary floating point conversions. Reviewers: rjmccall, olista01, asl, efriedma, ostannard, SjoerdMeijer Reviewed By: ostannard Subscribers: stuij, hiraditya, dmgreen, llvm-commits, chill, dnsampaio, danielkiss, kristof.beyls, cfe-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D75169
-
Paul Walker authored
Adds aarch64-sve-vector-bits-{min,max} to allow the size of SVE data registers (in bits) to be specified. This allows the code generator to make assumptions it normally couldn't. As a starting point this information is used to mark fixed length vector types that can fit within the specified size as legal. Reviewers: rengolin, efriedma Subscribers: tschuett, kristof.beyls, hiraditya, rkruppe, psnobl, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D80384
-
Sameer Sahasrabuddhe authored
For a loop, a join block is a block that is reachable along multiple disjoint paths from the exiting block of a loop. If the exit condition of the loop is divergent, then such join blocks must also be marked divergent. This currently fails in some cases because not all join blocks are identified correctly. The workaround is to conservatively mark every join block of any branch (not necessarily the exiting block of a loop) as divergent. https://bugs.llvm.org/show_bug.cgi?id=46372 Reviewed By: simoll Differential Revision: https://reviews.llvm.org/D81806
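The conservative workaround can be sketched as: treat any block reachable from more than one successor of a branch as a join (illustrative Python with a hypothetical CFG encoding; the precise analysis additionally requires the paths to be disjoint):

```python
from collections import deque

def join_blocks(succs, branch):
    # Blocks reachable from two or more successors of `branch` are
    # conservatively marked as joins (and hence divergent when the
    # branch condition is divergent).
    def reachable(start):
        seen, work = set(), deque([start])
        while work:
            b = work.popleft()
            if b not in seen:
                seen.add(b)
                work.extend(succs.get(b, ()))
        return seen
    hits = {}
    for s in succs[branch]:
        for b in reachable(s):
            hits[b] = hits.get(b, 0) + 1
    return {b for b, n in hits.items() if n > 1}
```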
-