- Nov 13, 2019
-
Craig Topper authored
-
Craig Topper authored
[X86] Move all the FP_TO_XINT/XINT_TO_FP setOperationActions into the same !useSoftFloat block. Qualify all of the Promote actions for these with !useSoftFloat too. NFCI. The Promote action doesn't apply until LegalizeDAG. By the time we get there, we would have already softened all the FP operations if useSoftFloat was true. So there wouldn't be any operation left to Promote.
-
Adrian Prantl authored
This avoids confusing them with fission-related functionality. I also moved two accessor functions from DWARFDIE into static functions in DWARFASTParserClang, where their only use is located.
-
Hiroshi Yamauchi authored
Summary: This temporarily disables the large working set size behavior in profile guided size optimization due to internal benchmark regressions. Reviewers: davidxl Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70207
-
Marek Kurdej authored
-
Sanjay Patel authored
-
Adrian Prantl authored
Because that is what this function really does. The old name is misleading.
-
Richard Smith authored
This change causes test failures for builds configured with -DCLANG_DEFAULT_RTLIB=compiler-rt. This reverts commit 3289352e.
-
Sanjay Patel authored
The bug manifests as replacing a reduction operand with an undef value. The problem appears to be limited to cases where a min/max reduction has extra uses of the compare operand to the select. In the general case, we are tracking "ExternallyUsedValues" and an "IgnoreList" of the reduction operations, but those may not apply to the final compare+select in a min/max reduction. For that, we use replaceAllUsesWith (RAUW) to ensure that the new vectorized reduction values are transferred to all subsequent users. Differential Revision: https://reviews.llvm.org/D70148
-
mydeveloperday authored
Summary: Review comments in {D69854} recommended a simpler approach of creating the SMDiagnostics to remove much of the complexity. (thanks @thakis) @vlad.tsyrklevich I've rebuilt on both Windows and Linux (running Linux with Address and Undefined sanitizers) over the clang code base Reviewers: thakis, klimek, mitchell-stellar, vlad.tsyrklevich Reviewed By: thakis Subscribers: cfe-commits, thakis, vlad.tsyrklevich Tags: #clang-format, #clang Differential Revision: https://reviews.llvm.org/D69921
-
Martin Storsjö authored
This broke in 51dcb292, "[lld-link] diagnose undefined symbols before LTO when possible" (very soon after the 9.0 branch, so luckily the 9.0 release is unaffected). The code for loading objects we believe might be needed for autoimport (loadMinGWAutomaticImports()) does run before the new reportUnresolvable() function, but it had a condition to only operate on symbols from regular object files. This condition came from resolveRemainingUndefines(), but as loadMinGWAutomaticImports() now has to operate before the LTO, it has to operate on undefineds from LTO objects as well. Differential Revision: https://reviews.llvm.org/D70166
-
Dimitry Andric authored
Summary: The option allows disabling specific target library builtin functions, as opposed to -disable-simplify-libcalls, which disables all of them. This is a prerequisite for D70143, which fixes PR43081. Reviewers: xbolva00, spatel, jdoerfert, efriedma Reviewed By: efriedma Subscribers: llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70193
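The difference between disabling every library builtin and disabling selected ones can be sketched as a small filter. This is a toy model only: BuiltinFilter, DisableAll, Disabled, and isAvailable are hypothetical names for illustration and are not the LLVM API touched by this commit.

```cpp
#include <set>
#include <string>

// Hypothetical model of the two behaviors described in the commit message.
struct BuiltinFilter {
  // Models the existing all-or-nothing switch.
  bool DisableAll = false;
  // Models the new per-function switch: names listed here are turned off.
  std::set<std::string> Disabled;

  // A builtin is usable only if nothing has turned it off.
  bool isAvailable(const std::string &Name) const {
    return !DisableAll && Disabled.count(Name) == 0;
  }
};
```

With only "strlen" inserted into Disabled, isAvailable("strlen") is false while isAvailable("memcpy") stays true; setting DisableAll makes every query return false.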
-
Jonas Devlieghere authored
-
Francis Visoiu Mistrih authored
-
Craig Topper authored
[TargetLowering] Increase the storage size of NumRegistersForVT to allow the type break down for v256i1 and other types to be stored correctly. v256i1 on X86 without avx512 breaks down to 256 i8 values when passed between basic blocks. But the NumRegistersForVT was sized at a byte for each VT. This results in 256 being stored as 0. This patch enlarges the type to 16 bits and adds an assert to ensure that no information is lost when the entry is stored. Differential Revision: https://reviews.llvm.org/D70138
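The truncation described above is ordinary unsigned narrowing and can be reproduced in isolation. A minimal sketch, assuming hypothetical helper names storeNarrow and storeWide rather than the actual TargetLowering code:

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical stand-in for a table entry that is one byte wide.
inline std::uint8_t storeNarrow(unsigned NumRegs) {
  // Conversion to uint8_t keeps only the low 8 bits, so 256 wraps to 0.
  return static_cast<std::uint8_t>(NumRegs);
}

// Widened 16-bit entry, guarded so that a value too large for the
// storage type is caught instead of being silently truncated.
inline std::uint16_t storeWide(unsigned NumRegs) {
  assert(NumRegs <= std::numeric_limits<std::uint16_t>::max() &&
         "register count does not fit in 16 bits");
  return static_cast<std::uint16_t>(NumRegs);
}
```

Here storeNarrow(256) yields 0, which is the symptom in the commit message, while storeWide(256) preserves the value.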
-
Simon Atanasyan authored
-
Simon Atanasyan authored
-
Simon Atanasyan authored
-
Quentin Colombet authored
During register coalescing, we update the live-intervals on-the-fly. To do that we are in this strange mode where the live-intervals can be slightly out-of-sync (more precisely they are forward looking) compared to what the IR actually represents. This happens because the register coalescer only updates the IR when it is done with updating the live-intervals and it has to do it this way because updating the IR on-the-fly would actually clobber some information on what the live-ranges that are being updated look like. This is problematic for updates that rely on the IR to accurately represent the state of the live-ranges. Right now, we have only one of those: stripValuesNotDefiningMask. To reconcile this need of out-of-sync IR, this patch introduces a new argument to LiveInterval::refineSubRanges that allows the code doing the live range updates to reason about what the code should look like after the coalescer has rewritten the registers. Essentially this captures how a subregister index will be offset to match its position in a new register class. E.g., let's say we want to merge: V1.sub1:<2 x s32> = COPY V2.sub3:<4 x s32> We do that by choosing a class where sub1:<2 x s32> and sub3:<4 x s32> overlap, i.e., by choosing a class where we can find "offset + 1 == 3". Put differently, we align V2's sub3 with V1's sub1: V2: sub0 sub1 sub2 sub3 V1: <offset> sub0 sub1 This offset will look like a composed subregidx in the class: V1.(composed sub2 with sub1):<4 x s32> = COPY V2.sub3:<4 x s32> => V1.(composed sub2 with sub1):<4 x s32> = COPY V2.sub3:<4 x s32> Now if we didn't rewrite the uses and def of V1, all the checks for V1 need to account for this offset to match what the live intervals intend to capture. Prior to this patch, we would fail to recognize the uses and def of V1 and would end up with machine verifier errors: No live segment at def. This could lead to miscompiles as we would drop some live-ranges and thus miss some interferences.
For this problem to trigger, we need to reach stripValuesNotDefiningMask while having a mismatch between the IR and the live-ranges (i.e., we have to apply a subreg offset to the IR.) This requires the following three conditions: 1. An update of overlapping subreg lanes: e.g., dsub0 == <ssub0, ssub1> 2. An update with Tuple registers with a possibility to coalesce the subreg index: e.g., v1.dsub_1 == v2.dsub_3 3. Subreg liveness enabled. looking at the IR to decide what is alive and what is not, i.e., calling stripValuesNotDefiningMask. coalescer maintains for the live-ranges information. None of the targets that currently use subreg liveness (i.e., the targets that fulfill #3, Hexagon, AMDGPU, PowerPC, and SystemZ IIRC) expose #1 and #2, so this patch also artificially enables subreg liveness for ARM, so that a nice test case can be attached.
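The "offset + 1 == 3" alignment from the message above can be worked through with a toy lane model. This is only an illustrative sketch: alignLanes and rewrittenLane are hypothetical helpers, registers are modeled as runs of equally sized lanes, and none of this is the actual LiveInterval::refineSubRanges interface.

```cpp
// Toy model: a register is a run of equally sized lanes numbered from 0.
// Returns the offset that places lane DstLane of the narrow register on
// lane SrcLane of the wide register (Offset + DstLane == SrcLane), or -1
// if the narrow register cannot be placed inside the wide one that way.
inline int alignLanes(int DstLane, int DstNumLanes, int SrcLane,
                      int SrcNumLanes) {
  if (SrcLane < DstLane)
    return -1; // the offset would have to be negative
  int Offset = SrcLane - DstLane;
  if (Offset + DstNumLanes > SrcNumLanes)
    return -1; // the narrow register would stick out past the wide one
  return Offset;
}

// Lane index that a lane of the narrow register has after it is rewritten
// into the wide class; checks on the not-yet-rewritten IR must apply this.
inline int rewrittenLane(int Lane, int Offset) { return Lane + Offset; }
```

For the example in the message (V1.sub1 of a 2-lane register copied from V2.sub3 of a 4-lane register), alignLanes(1, 2, 3, 4) gives an offset of 2, so V1's sub0 and sub1 line up with V2's sub2 and sub3.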
-
Michael Liao authored
- Only split vector types when both src and dst types are splittable.
-
Francis Visoiu Mistrih authored
With all the previous refactorings this slipped through and now we always dump the contents of the bitcode files, even if -dump is not passed.
-
Ahmed Bougacha authored
RETA always implicitly uses LR, unlike RET which merely has an alias that defaults it to LR. Additionally, RETA implicitly uses SP as well, which it uses as a discriminator to authenticate LR. This isn't usually noticeable, because RET_ReallyLR is used in most of the backend. However, the post-RA scheduler, if enabled, will cause miscompiles if the imp-uses are missing. While there, fix a typo in the lone affected testcase.
-
Ahmed Bougacha authored
The instruction definition has been retroactively expanded to allow for an alias for '[xN, 0]!' as '[xN]!'. That wouldn't make sense on LDR, but does for LDRA.
-
Sanjay Patel authored
-
Yonghong Song authored
Depending on the CMake configuration, clang may generate different IR names for slot variables. Let us use a regex instead of hard-coding the name. I used the same approach for the other bpf-attr-preserve-access-index tests, but somehow did not do so for this one.
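The idea of matching any slot name instead of one fixed name can be sketched outside the test suite with std::regex. The pattern, the sample IR lines, and the helper name isAllocaOfAnySlot are assumptions for illustration; they are not the check lines used in the actual test.

```cpp
#include <regex>
#include <string>

// Returns true if Line defines a local IR value via alloca, whatever the
// value is called: a descriptive name such as %arg.addr or a number such
// as %0. Hypothetical helper, not taken from the test in this commit.
inline bool isAllocaOfAnySlot(const std::string &Line) {
  // '%' followed by the characters allowed in an LLVM IR local name.
  static const std::regex Pattern(R"(^\s*%[-a-zA-Z$._0-9]+ = alloca )");
  return std::regex_search(Line, Pattern);
}
```

Both "  %arg.addr = alloca i32*, align 8" and "  %0 = alloca i32*, align 8" match, so a check written this way does not depend on whether the compiler build keeps value names.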
-
Edward Jones authored
If a GCC installation is not detected, then this attempts to use compiler-rt and the compiler-rt crtbegin/crtend implementations as a fallback. Differential Revision: https://reviews.llvm.org/D68407
-
David Stenberg authored
-
David Tenty authored
Summary: when building plugins, as AIX has symbols in its standard library that must be garbage collected or we will see link errors. Export lists will handle this instead on AIX. Reviewers: stevewan, sfertile, jasonliu, xingxue, DiggerLin Reviewed By: DiggerLin Subscribers: mgorny, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70130
-
Yonghong Song authored
Add the newly supported BPF-specific __attribute__((preserve_access_index)) to pragma-attribute-supported-attributes-list.test.
-
Sanjay Patel authored
-
Yonghong Song authored
This is a resubmission for the previous reverted commit 94343604 with the same subject. This commit fixed the segfault issue and addressed additional review comments. This patch introduced a new bpf specific attribute which can be added to struct or union definition. For example, struct s { ... } __attribute__((preserve_access_index)); union u { ... } __attribute__((preserve_access_index)); The goal is to simplify user code for cases where preserve access index happens for certain struct/union, so the user does not need to use clang __builtin_preserve_access_index for every member. The attribute has no effect if -g is not specified. When the attribute is specified and -g is specified, any member access defined by that structure or union, including array subscript access and inner records, will be preserved through __builtin_preserve_{array,struct,union}_access_index() IR intrinsics, which will enable relocation generation in bpf backend. The following is an example to illustrate the usage: -bash-4.4$ cat t.c #define __reloc__ __attribute__((preserve_access_index)) struct s1 { int c; } __reloc__; struct s2 { union { struct s1 b[3]; }; } __reloc__; struct s3 { struct s2 a; } __reloc__; int test(struct s3 *arg) { return arg->a.b[2].c; } -bash-4.4$ clang -target bpf -g -S -O2 t.c A relocation with access string "0:0:0:0:2:0" will be generated representing access offset of arg->a.b[2].c. A forward declaration with the attribute is also handled properly such that the attribute is copied and populated in the real record definition. Differential Revision: https://reviews.llvm.org/D69759
-
Matthew Malcomson authored
-
Vedant Kumar authored
Split out the logic to get the size of a merged profile and to do a compatibility check. This can be shared with both the continuous+merging mode implementation, as well as the runtime-allocated counters implementation planned for Fuchsia. Lifted out of D69586. Differential Revision: https://reviews.llvm.org/D70135
-
Sanjay Patel authored
As noted by the FIXME comment, this is not correct based on our current FMF semantics. We should be propagating FMF from the final value in a sequence (in this case the 'select'). So the behavior even without this patch is wrong, but we did not allow FMF on 'select' until recently. But if we do the correct thing right now in this patch, we'll inevitably introduce regressions because we have not wired up FMF propagation for 'phi' and 'select' in other passes (like SimplifyCFG) or other places in InstCombine. I'm not seeing a better incremental way to make progress. That said, the potential extra damage over the existing wrong behavior from this patch is very limited. AFAIK, the only way to have different FMF on IR in the same function is if we have LTO inlined IR from 2 modules that were compiled using different fast-math settings. As seen in the tests, we may actually see some improvements with this patch because adding the FMF to the 'select' allows matching to min/max intrinsics that were previously missed (in the common case, the 'fcmp' and 'select' should have identical FMF to begin with). Next steps in the transition: Make similar changes in instcombine as needed. Enable phi-to-select FMF propagation in SimplifyCFG. Remove dependencies on fcmp with FMF. Deprecate FMF on fcmp. Differential Revision: https://reviews.llvm.org/D69720
-
Pavel Labath authored
Summary: This avoids the need to duplicate the location list searching logic in various users. The "inline location list dumping" code (which is the only user actually updated to handle DWARF v5 location lists) is switched to this method. After adding v4 location list support, I'll switch other users too. Reviewers: dblaikie, probinson, JDevlieghere, aprantl, SouraVX Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70084
-
Simon Pilgrim authored
-
Simon Pilgrim authored
-
Simon Pilgrim authored
-
Simon Pilgrim authored
-
Simon Pilgrim authored
-