- Jun 07, 2021
-
-
Guillaume Chatelet authored
Differential Revision: https://reviews.llvm.org/D103251
-
Jingu Kang authored
This pass transforms loops that contain a conditional branch with induction variable. For example, it transforms left code to right code:

                          newbound = min(n, c)
  while (iv < n) {        while(iv < newbound) {
    A                       A
    if (iv < c)             B
      B                     C
    C                     }
  }                       if (iv != n) {
                            while (iv < n) {
                              A
                              C
                            }
                          }

Differential Revision: https://reviews.llvm.org/D102234
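A hand-written C++ sketch of the same rewrite may make the equivalence easier to check. It is not the pass itself: the function names and the string trace standing in for the blocks A, B and C are assumptions made for illustration, and the increment of `iv` is assumed to be a plain `++iv`.

```cpp
#include <algorithm>
#include <string>

// Original form: the condition on the induction variable is tested on
// every iteration.
std::string runOriginal(int start, int n, int c) {
  std::string trace;
  for (int iv = start; iv < n; ++iv) {
    trace += 'A';
    if (iv < c)
      trace += 'B';
    trace += 'C';
  }
  return trace;
}

// Split form: the first loop stops at min(n, c), where `iv < c` is known to
// be true; the second loop covers the remaining iterations, where it is
// known to be false.
std::string runSplit(int start, int n, int c) {
  std::string trace;
  int iv = start;
  const int newbound = std::min(n, c);
  for (; iv < newbound; ++iv) {
    trace += 'A';
    trace += 'B';
    trace += 'C';
  }
  if (iv != n) {
    for (; iv < n; ++iv) {
      trace += 'A';
      trace += 'C';
    }
  }
  return trace;
}
```

Both functions produce the same trace for the same arguments; the split form simply has no branch left inside either loop body.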
-
Florian Hahn authored
This patch marks the induction increment of the main induction variable of the vector loop as NUW when not folding the tail. If the tail is not folded, we know that End - Start >= Step (either statically or through the minimum iteration checks). We also know that both Start % Step == 0 and End % Step == 0. We exit the vector loop if %IV + %Step == %End. Hence we must exit the loop before %IV + %Step unsigned overflows and we can mark the induction increment as NUW. This should make SCEV return more precise bounds for the created vector loops, used by later optimizations, like late unrolling. At the moment quite a few tests still need to be updated, but before doing so I'd like to get initial feedback to make sure I am not missing anything. Note that this could probably be further improved by using information from the original IV. An attempt at modeling the assumption in Alive2: https://alive2.llvm.org/ce/z/H_DL_g Part of a set of fixes required for PR50412. Reviewed By: mkazantsev Differential Revision: https://reviews.llvm.org/D103255
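The no-unsigned-wrap argument can be tried out at a small bit width. The following is only a brute-force illustration with 8-bit values, not a proof and not code from the patch; `preconditionsHold` and `incrementNeverWraps` are hypothetical helper names.

```cpp
#include <cstdint>

// The assumptions stated in the commit message: End - Start >= Step, and
// both Start and End are multiples of Step.
bool preconditionsHold(std::uint8_t start, std::uint8_t end, std::uint8_t step) {
  return step != 0 && end >= start && end - start >= step &&
         start % step == 0 && end % step == 0;
}

// Simulates the vector loop's exit test (IV + Step == End) and reports
// whether every increment stays within the 8-bit unsigned range.
bool incrementNeverWraps(std::uint8_t start, std::uint8_t end, std::uint8_t step) {
  if (step == 0)
    return false; // a zero step would never reach `end`
  constexpr unsigned kMax8 = 0xFFu;
  unsigned iv = start;
  do {
    // Computed in a wider type so that an 8-bit wrap is visible.
    const unsigned next = iv + step;
    if (next > kMax8)
      return false;
    iv = next;
  } while (iv != end);
  return true;
}
```

When the preconditions hold, the loop reaches `end` exactly and exits before any increment can exceed the 8-bit range. When they do not hold (for example an `end` that is not a multiple of `step`), the exit test can be skipped over and the increment eventually wraps.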
-
Esme-Yi authored
error: ambiguous overload for 'operator==' (operand types are 'llvm::yaml::Hex16' and 'llvm::XCOFF::MagicNumber') Is64Bit = Obj.Header.Magic == XCOFF::XCOFF64;
-
Esme-Yi authored
Summary: The patch implements the mapping of the Yaml information to XCOFF object file to enable the yaml2obj tool for XCOFF. Currently only 32-bit is supported. Reviewed By: jhenderson, shchenz Differential Revision: https://reviews.llvm.org/D95505
-
- Jun 06, 2021
-
-
Simon Pilgrim authored
-
Simon Pilgrim authored
Add missing v16f32/v8f64 costs and adjust other costs as well based off the SkylakeServer model
-
Craig Topper authored
We should be exiting when the shift amount is greater than the bit width regardless of whether it is a power of 2. Reported by Simon Pilgrim here https://reviews.llvm.org/D96661 This requires getting a shift amount that is out of bounds that wasn't already optimized by SelectionDAG. This would be pretty tricky to construct a test for. Or it would require a non-power of 2 shift amount and a mask that has runs of ones and zeros of the next lowest power of 2 from that shift amount. I tried a little to produce a test for this, but didn't get it to work.
-
Simon Pilgrim authored
Non-Strict v2f32->v2i64 cases have already early-returned to be handled by legalization.
-
Simon Pilgrim authored
OutSVT is guaranteed to be i8/i16 and we accept any InSVT that isn't i64
-
maekawatoshiki authored
This reverts commit 21653600. To fix the crash problem in legacy pass manager
-
Simon Pilgrim authored
The MCSymbol data should always be present for non-absolute sections so assert that it is to silence static analysis warnings.
-
Nikita Popov authored
Don't require a specific kind of IRBuilder for TargetLowering hooks. This allows us to drop the IRBuilder.h include from TargetLowering.h. Differential Revision: https://reviews.llvm.org/D103759
-
Simon Pilgrim authored
-
Simon Pilgrim authored
Use cast<> instead which will assert that the cast is correct and not just return null - the match() should have already failed if the cast isn't valid anyhow. Fixes static analysis warning.
-
Nikita Popov authored
Move methods using IRBuilder out of line, so we can drop the dependency on the header.
-
Nikita Popov authored
These currently rely on the IRBuilder.h include in TargetLowering.h. Make them explicit.
-
Simon Pilgrim authored
We've already checked that ScanIdx == 0 a few lines above.
-
Simon Pilgrim authored
-
Simon Pilgrim authored
Use cast<> instead which will assert that the cast is correct and not just return null. Fixes static analysis warnings.
-
Liqiang Tao authored
-
Liqiang Tao authored
This patch abstracts Calls in Inliner::run() to InlineOrder. With this patch, it's possible to customize the inlining order, i.e. use queue or priority queue. Reviewed By: kazu Differential Revision: https://reviews.llvm.org/D103315
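The idea of hiding the work list behind an ordering interface can be sketched as follows. This is not the interface added by the patch: `InlineOrderSketch`, `CallSiteSketch`, the `cost` field and both implementations are hypothetical names chosen for illustration.

```cpp
#include <deque>
#include <queue>
#include <vector>

// Hypothetical stand-in for a call site waiting to be considered for inlining.
struct CallSiteSketch {
  int id;
  int cost;
};

// Hypothetical ordering interface: the driver only pushes and pops, so the
// visiting order is decided entirely by the implementation.
class InlineOrderSketch {
public:
  virtual ~InlineOrderSketch() = default;
  virtual void push(const CallSiteSketch &cs) = 0;
  virtual CallSiteSketch pop() = 0;
  virtual bool empty() const = 0;
};

// Plain queue: call sites are visited in the order they were discovered.
class FifoOrder : public InlineOrderSketch {
  std::deque<CallSiteSketch> calls;

public:
  void push(const CallSiteSketch &cs) override { calls.push_back(cs); }
  CallSiteSketch pop() override {
    CallSiteSketch front = calls.front();
    calls.pop_front();
    return front;
  }
  bool empty() const override { return calls.empty(); }
};

// Priority queue: the call site with the lowest cost is visited first.
class CheapestFirstOrder : public InlineOrderSketch {
  struct CostGreater {
    bool operator()(const CallSiteSketch &a, const CallSiteSketch &b) const {
      return a.cost > b.cost;
    }
  };
  std::priority_queue<CallSiteSketch, std::vector<CallSiteSketch>, CostGreater> calls;

public:
  void push(const CallSiteSketch &cs) override { calls.push(cs); }
  CallSiteSketch pop() override {
    CallSiteSketch top = calls.top();
    calls.pop();
    return top;
  }
  bool empty() const override { return calls.empty(); }
};

// Driver loop that is independent of the chosen order; returns the ids in
// the order they were visited.
std::vector<int> visitOrder(InlineOrderSketch &order,
                            const std::vector<CallSiteSketch> &calls) {
  for (const CallSiteSketch &cs : calls)
    order.push(cs);
  std::vector<int> ids;
  while (!order.empty())
    ids.push_back(order.pop().id);
  return ids;
}
```

Swapping `FifoOrder` for `CheapestFirstOrder` changes the visiting order without touching the driver loop, which is the kind of customization the commit message describes.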
-
- Jun 05, 2021
-
-
Simon Pilgrim authored
Noticed while investigating if we can remove an unnecessary MathExtras.h include from SmallVector.h (necessary for gcc builds but not MSVC)
-
David Green authored
This NEG node is just a vector negation, easily represented as a SUB zero. Removing it from the one place it is generated is essentially an NFC, but can allow some extra folding. The updated tests are now loading different constant literals, which have already been negated. Differential Revision: https://reviews.llvm.org/D103703
-
Simon Pilgrim authored
-
Simon Pilgrim authored
Noticed while investigating if we can remove an unnecessary MathExtras.h include from SmallVector.h
-
Simon Pilgrim authored
Noticed while investigating if we can remove an unnecessary MathExtras.h include from SmallVector.h
-
Roman Lebedev authored
[NFC] Promote willNotOverflow() / getStrengthenedNoWrapFlagsFromBinOp() from IndVars into SCEV proper We might want to use it when creating SCEV proper in createSCEV(), now that we don't `forgetValue()` in `SimplifyIndvar::strengthenOverflowingOperation()`, which might have caused us to lose some optimization potential.
-
Nikita Popov authored
Loop peeling is currently performed as part of UnrollLoop(). Outside test scenarios, it is always performed with an unroll count of 1. This means that unrolling doesn't actually do anything apart from performing post-unroll simplification. When testing, it's currently possible to specify both an explicit peel count and an explicit unroll count. This doesn't perform any sensible operation and may result in miscompiles, see https://bugs.llvm.org/show_bug.cgi?id=45939. This patch moves peeling from UnrollLoop() into tryToUnrollLoop(), so that peeling does not also perform a subsequent unroll. We only run the post-unroll simplifications. Specifying both an explicit peel count and unroll count is forbidden. In the future, we may want to support both (non-PGO) peeling a loop and unrolling it, but this needs to be done by first performing the peel and then recalculating unrolling heuristics on a now possibly analyzable loop. Differential Revision: https://reviews.llvm.org/D103362
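For readers unfamiliar with the term, peeling with a count of 1 copies the first iteration in front of the loop, which often lets first-iteration checks disappear from the loop body. A hand-written sketch follows; the function names and the "first element counts ten times" rule are made up for illustration and have nothing to do with the patch's code.

```cpp
#include <cstddef>
#include <vector>

// Original loop: the first iteration is special-cased inside the body.
int sumWithFirstSpecial(const std::vector<int> &v) {
  int sum = 0;
  for (std::size_t i = 0; i < v.size(); ++i) {
    if (i == 0)
      sum += 10 * v[i];
    else
      sum += v[i];
  }
  return sum;
}

// Peeled by one iteration: the first iteration runs ahead of the loop, so
// the remaining loop body no longer needs the `i == 0` test.
int sumPeeled(const std::vector<int> &v) {
  int sum = 0;
  std::size_t i = 0;
  if (i < v.size()) { // peeled copy of the first iteration
    sum += 10 * v[i];
    ++i;
  }
  for (; i < v.size(); ++i)
    sum += v[i];
  return sum;
}
```

The peeled loop is still a single, non-unrolled loop, which matches the commit's point that peeling on its own does not involve any actual unrolling.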
-
Vitaly Buka authored
Revert "Update and improve compiler-rt tests for -mllvm -asan_use_after_return=(never|[runtime]|always)." Windows is still broken. This reverts commit 927688a4.
-
Kevin Athey authored
In addition: - optionally add global flag to capture compile intent for UAR: __asan_detect_use_after_return_always. The global is a SANITIZER_WEAK_ATTRIBUTE. for issue: https://github.com/google/sanitizers/issues/1394 Reviewed By: vitalybuka Differential Revision: https://reviews.llvm.org/D103304
-
Fangrui Song authored
-
Jim Lin authored
This is for D100288 to reduce the changes. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D103682
-
Vitaly Buka authored
Revert "Update and improve compiler-rt tests for -mllvm -asan_use_after_return=(never|[runtime]|always)." Reverts commits of D103304, it breaks Darwin. This reverts commit 60e5243e. This reverts commit 26b3ea22. This reverts commit 17600ec3.
-
Kevin Athey authored
In addition: - optionally add global flag to capture compile intent for UAR: __asan_detect_use_after_return_always. The global is a SANITIZER_WEAK_ATTRIBUTE. for issue: https://github.com/google/sanitizers/issues/1394 Reviewed By: vitalybuka Differential Revision: https://reviews.llvm.org/D103304
-
Roman Lebedev authored
While the IndVars issue (PR50384) has been resolved, and the compile performance improved, a new blocker emerged, the codegen machine instruction scheduling is also quadratic. So we still can't really specify the right value here. Filed PR50584.
-
- Jun 04, 2021
-
-
Fangrui Song authored
`__profd_*` variables are referenced by code only when value profiling is enabled. If disabled (e.g. default -fprofile-instr-generate), the symbols just waste space on ELF/Mach-O. We change the comdat symbol from `__profd_*` to `__profc_*` because an internal symbol does not provide deduplication features on COFF. The choice doesn't matter on ELF. (In -DLLVM_BUILD_INSTRUMENTED_COVERAGE=on build, there are now no `__profd_*` symbols.) On Windows this enables further optimization. We are no longer affected by the link.exe limitation: an external symbol in IMAGE_COMDAT_SELECT_ASSOCIATIVE can cause duplicate definition error. https://lists.llvm.org/pipermail/llvm-dev/2021-May/150758.html We can thus use llvm.compiler.used instead of llvm.used like ELF (D97585). This avoids many `/INCLUDE:` directives in `.drectve`. Here is rnk's measurement for Chrome:
```
This reduced object file size of base_unittests.exe, compiled with coverage, optimizations, and gmlt debug info by 10%:

# BEFORE
$ find . -iname '*.obj' | xargs du -b | awk '{ sum += $1 } END { print sum}'
1047758867
$ du -cksh base_unittests.exe
82M base_unittests.exe
82M total

# AFTER
$ find . -iname '*.obj' | xargs du -b | awk '{ sum += $1 } END { print sum}'
937886499
$ du -cksh base_unittests.exe
78M base_unittests.exe
78M total
```
The change is NFC for Mach-O. Reviewed By: davidxl, rnk Differential Revision: https://reviews.llvm.org/D103372
-
Nikita Popov authored
When SimplifyIndVars infers IR nowrap flags from SCEV, this may happen in two ways: Either nowrap flags were already present in SCEV and just get transferred to IR. Or zero/sign extension of addrecs infers additional nowrap flags, and those get transferred to IR. In the latter case, calling forgetValue() ensures that the newly inferred nowrap flags get propagated to any other SCEV expressions based on the addrec. However, the invalidation can also have a major compile-time effect in some cases. For https://bugs.llvm.org/show_bug.cgi?id=50384 with n=512 compile-time drops from 7.1s to 0.8s without this invalidation. At the same time, removing the invalidation doesn't affect any codegen in test-suite. Differential Revision: https://reviews.llvm.org/D103424
-
Rong Xu authored
This patch was split from https://reviews.llvm.org/D102246 [SampleFDO] New hierarchical discriminator for Flow Sensitive SampleFDO This is for llvm-profdata part of change. It sets the bit masks for the profile reader in llvm-profdata. Also add an internal option "-fs-discriminator-pass" for show and merge command to process the profile offline. This patch also moved setDiscriminatorMaskedBitFrom() to SampleProfileReader::create() to simplify the interface. Differential Revision: https://reviews.llvm.org/D103550
-
Adam Nemet authored
Don't add it to FusedInsts in this case. Differential Revision: https://reviews.llvm.org/D103627
-