- Oct 20, 2021
-
-
Jeremy Morse authored
Here's another performance patch for InstrRefBasedLDV: rather than processing all variable values in a scope at a time, process one variable at a time. The benefits are twofold:
  * It's easier to reason about one variable at a time,
  * It improves performance, apparently from increased locality.
The downside is that the value-propagation code gets indented one level further, plus there's some churn in the unit tests. Differential Revision: https://reviews.llvm.org/D111799
-
Nicolas Vasilache authored
This revision uses the newly refactored StructuredGenerator to create a simple vectorization for conv1d_nwc_wcf. Note that the pattern is not specific to the op and is technically not even specific to the ConvolutionOpInterface (modulo minor details related to dilations and strides). The overall design follows the same ideas as the lowering of vector::ContractionOp -> vector::OuterProduct: it seeks to be minimally complex, composable and extensible while avoiding inference analysis. Instead, we metaprogram the maps/indexings we expect and we match against them. This is just a first stab and still needs to be evaluated for performance. Other tradeoffs are possible that should be explored. Reviewed By: ftynse Differential Revision: https://reviews.llvm.org/D111894
-
PZ Read authored
Previously, when the fuzzing loop replaced an input in the corpus, it didn't update the execution time of the input. Therefore, some schedulers (e.g. Entropic) would adjust weights based on the incorrect execution time. This patch updates the execution time of the input when replacing it. Reviewed By: morehouse Differential Revision: https://reviews.llvm.org/D111479
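A minimal sketch of the idea, not libFuzzer's actual code: `CorpusEntry`, `exec_time_us` and `replace_input` are hypothetical names chosen for illustration. The point is only that the bytes and the measured execution time are updated together, so a time-aware scheduler never weights the new input by the old input's time.

```python
from dataclasses import dataclass


@dataclass
class CorpusEntry:
    """Hypothetical stand-in for a corpus input tracked by the fuzzer."""
    data: bytes
    exec_time_us: int  # execution time a time-aware scheduler would weight by


def replace_input(entry: CorpusEntry, new_data: bytes, new_exec_time_us: int) -> None:
    """Replace an entry's bytes and refresh its execution time together."""
    entry.data = new_data
    # Without this assignment the entry would keep the time measured for the old bytes.
    entry.exec_time_us = new_exec_time_us


entry = CorpusEntry(data=b"AAAAAAAA", exec_time_us=900)
replace_input(entry, b"AA", 120)
```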
-
Louis Dionne authored
This temporary FIXME really belongs to the testing config, not to the specific CMake cache that enables that configuration. Differential Revision: https://reviews.llvm.org/D112031
-
Bjorn Pettersson authored
With the legacy PM being deprecated, it should be enough to verify the scalarizer pass using the new-PM syntax when invoking opt.
-
Bjorn Pettersson authored
The legacy PM is deprecated, so use the new PM syntax in lit tests running the vector-combine pass.
-
Bjorn Pettersson authored
The legacy PM is deprecated, so use the new PM syntax in lit tests running the bounds-checking pass.
-
Bjorn Pettersson authored
The legacy PM is deprecated, so use the new PM syntax in lit tests running the speculative-execution pass.
-
Bjorn Pettersson authored
-
Michał Górny authored
gdbserver does not expose combined ymm* registers but rather XSAVE-style split xmm* and ymm*h portions. Extend value_regs to support combining multiple registers and use it to create user-friendly ymm* registers that are combined from split xmm* and ymm*h portions. Differential Revision: https://reviews.llvm.org/D108937
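As an illustration of what "combining" means here, the sketch below joins two 16-byte halves into one 32-byte value and splits it again. It is not LLDB code: `combine_ymm` and `split_ymm` are hypothetical helpers, and the sketch assumes both halves arrive as little-endian byte strings, so the xmm (low) half comes first.

```python
from typing import Tuple


def combine_ymm(xmm: bytes, ymmh: bytes) -> bytes:
    """Join the low 128 bits (xmm) and the high 128 bits (ymmh) into 256 bits.

    Assumption: little-endian byte strings, low half first.
    """
    if len(xmm) != 16 or len(ymmh) != 16:
        raise ValueError("expected two 16-byte halves")
    return xmm + ymmh


def split_ymm(ymm: bytes) -> Tuple[bytes, bytes]:
    """Inverse operation, as needed when a combined register is written back."""
    if len(ymm) != 32:
        raise ValueError("expected a 32-byte value")
    return ymm[:16], ymm[16:]


low = bytes(range(16))
high = bytes(range(16, 32))
ymm0 = combine_ymm(low, high)
```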
-
Michał Górny authored
Fix incorrect values for value_regs, and incomplete values for invalidate_regs in RegisterInfos_arm. The value_regs entry needs to list only one base (i.e. larger) register that needs to be read to get the value for this register, while invalidate_regs needs to list all other registers (including pseudo-registers) whose values would change when this register is written to. 7a8ba4ff fixed a similar problem for ARM64. Differential Revision: https://reviews.llvm.org/D112066
-
Michał Górny authored
Support arbitrarily-sized FPR writes on ARM in order to fix writing qN registers directly. Currently, writing them works only by accident due to value_regs splitting them into smaller writes via dN and sN registers. Differential Revision: https://reviews.llvm.org/D112131
-
Sander de Smalen authored
When inserting a scalable subvector into a scalable vector through the stack, the index to store to needs to be scaled by vscale. Before this patch, that didn't yet happen, so it would generate the wrong offset, thus storing a subvector to the incorrect address and overwriting the wrong lanes.

For some insert:

    nxv8f16 insert_subvector(nxv8f16 %vec, nxv2f16 %subvec, i64 2)

The offset was not scaled by vscale:

    orr     x8, x8, #0x4
    st1h    { z0.h }, p0, [sp]
    st1h    { z1.d }, p1, [x8]
    ld1h    { z0.h }, p0/z, [sp]

And is changed to:

    mov     x8, sp
    st1h    { z0.h }, p0, [sp]
    st1h    { z1.d }, p1, [x8, #1, mul vl]
    ld1h    { z0.h }, p0/z, [sp]

Differential Revision: https://reviews.llvm.org/D111633
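The arithmetic behind the fix can be sketched as follows. This is an illustration under stated assumptions, not the code generator's logic: `subvector_byte_offset` is a hypothetical helper, and f16 elements are taken to be 2 bytes wide.

```python
def subvector_byte_offset(index: int, elt_bytes: int, vscale: int) -> int:
    """Byte offset of a scalable subvector inside a scalable vector on the stack.

    The insert index counts elements of the minimum-sized vector, so the real
    element position (and therefore the byte offset) grows with vscale.
    """
    return index * elt_bytes * vscale


F16_BYTES = 2  # assumption: half-precision elements are 2 bytes

# Index 2 from the example above.
unscaled = 2 * F16_BYTES                               # what the old code used
at_vscale_1 = subvector_byte_offset(2, F16_BYTES, 1)
at_vscale_4 = subvector_byte_offset(2, F16_BYTES, 4)
```

Under these assumptions the unscaled offset of 4 bytes matches the `#0x4` in the old sequence and is only right when vscale is 1; at vscale 4 the store has to land 16 bytes in.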
-
Louis Dionne authored
This commit switches libunwind from using the complicated logic in libc++'s testing configuration to a from-scratch configuration. I tried to make sure that all cases that were handled in the old config were handled by this one too, so hopefully this shouldn't break anyone. However, if you encounter issues with this change, please let me know and feel free to revert if I don't reply quickly. This change was engineered to be easily revertable. Differential Revision: https://reviews.llvm.org/D112082
-
Simon Pilgrim authored
Add PR51436 test as well as some basic multiply tests, and include SSE2 division coverage
-
Simon Pilgrim authored
These are folded to left shifts in the backend. We should be able to extend this for multiply-by-negpow2 after D111968 has landed to resolve PR51436
-
Aaron Ballman authored
When we added support for if consteval, we accidentally formed a discarded statement evaluation context for the branch-not-taken. However, a discarded statement is a property of an if constexpr statement, not an if consteval statement (https://eel.is/c++draft/stmt.if#2.sentence-2). This turned out to cause issues when deducing the return type from a function with a consteval if statement -- we wouldn't consider the branch-not-taken when deducing the return type. This fixes PR52206. Note, there is additional work left to be done. We need to track discarded statement and immediate evaluation contexts separately rather than as being mutually exclusive.
-
Michał Górny authored
Add an FPU_QREG macro to define qN registers. This is a piecewise attempt at reconstructing D112066, with the goal of figuring out which part of the larger change breaks the buildbot. Differential Revision: https://reviews.llvm.org/D112066
-
Simon Pilgrim authored
Replace X86ProcFamilyEnum::IntelSLM enum with a TuningUseSLMArithCosts flag instead, matching what we already do for Goldmont. This just leaves X86ProcFamilyEnum::IntelAtom to replace with general Tuning/Feature flags and we can finally get rid of the old X86ProcFamilyEnum enum. Differential Revision: https://reviews.llvm.org/D112079
-
Raphael Isemann authored
`log` is just some IO object that gets printed as `<_io.TextIOWrapper = filename` but the intention here was to print the actual found log contents.
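The difference is easy to reproduce in isolation. The sketch below uses a throwaway temporary file rather than the real test log; formatting the file object gives its repr, while `read()` gives the text that was logged.

```python
import os
import tempfile

# Create a small stand-in log file (hypothetical contents).
with tempfile.NamedTemporaryFile("w", suffix=".log", delete=False) as tmp:
    tmp.write("line one\nline two\n")
    path = tmp.name

with open(path) as log:
    as_object = str(log)   # repr of the wrapper object, not the log text
    contents = log.read()  # the actual log contents

os.remove(path)
```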
-
Pavel Labath authored
None of the commands we run really rely on shell features. Running them with shell=False simplifies the code as there is no need for elaborate quoting. Differential Revision: https://reviews.llvm.org/D111990
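A small sketch of why quoting becomes unnecessary, using only the standard `subprocess` module. The child command and the awkward argument are made up for illustration; with `shell=False` each list element reaches the child as one argument, untouched by any shell.

```python
import subprocess
import sys

# An argument that would need careful quoting if passed through a shell.
awkward_arg = "two words; $HOME 'quoted'"

result = subprocess.run(
    [sys.executable, "-c", "import sys; print(sys.argv[1])", awkward_arg],
    shell=False,          # argv is passed as-is, no shell parsing
    capture_output=True,
    text=True,
    check=True,
)
echoed = result.stdout.rstrip("\n")
```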
-
Sven van Haastregt authored
-
Pavel Labath authored
specifically, ignore addresses that point before the first code section. This resurrects D87172 with several notable changes:
  - It fixes a bug where the early exits in InitializeObject left m_first_code_address "initialized" to LLDB_INVALID_ADDRESS (0xfff..f), which caused _everything_ to be ignored.
  - It extends the line table fix to function parsing as well, where it replaces a similar check which was checking the executable permissions of the section. This was insufficient because some position-independent ELF executables can have an executable segment mapped at file address zero. (What makes this fix different is that it checks for the executable-ness of the sections contained within that segment, and those will not be at address zero.)
  - It uses a different test case, with an ELF file with near-zero addresses, and checks for both line table and function parsing.
Differential Revision: https://reviews.llvm.org/D112058
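The filtering rule can be sketched roughly as below. This is not LLDB's implementation: `Section`, `first_code_address` and `keep_addresses` are hypothetical names, and the choice to filter nothing when no code section is known is an assumption made here to avoid the "everything ignored" failure mode described above.

```python
from typing import List, NamedTuple, Optional


class Section(NamedTuple):
    """Hypothetical, simplified view of an object-file section."""
    name: str
    file_addr: int
    is_code: bool


def first_code_address(sections: List[Section]) -> Optional[int]:
    """Lowest file address of any code section, or None if there is none."""
    code_addrs = [s.file_addr for s in sections if s.is_code]
    return min(code_addrs) if code_addrs else None


def keep_addresses(addresses: List[int], sections: List[Section]) -> List[int]:
    """Drop addresses that point before the first code section."""
    first = first_code_address(sections)
    if first is None:
        # Assumption: with no known code section, filter nothing rather than everything.
        return list(addresses)
    return [a for a in addresses if a >= first]


sections = [Section("headers", 0x0, False), Section(".text", 0x1000, True)]
kept = keep_addresses([0x0, 0x10, 0x1000, 0x1040], sections)
```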
-
Daniel Kiss authored
The autiasp and autibsp instructions are the counterparts of the paciasp/pacibsp instructions, so emit .cfi_negate_ra_state for these too. With the Armv8.3 instruction set, retaa/retab perform the return and the authentication in one step; here we can't emit .cfi_negate_ra_state because it would be placed after the ret* instruction. Reviewed By: nickdesaulniers, MaskRay Differential Revision: https://reviews.llvm.org/D111780
-
Joerg Sonnenberger authored
Reviewed By: LemonBoy Differential Revision: https://reviews.llvm.org/D96311
-
David Green authored
Copied from the X86 tests, these give better test coverage than the existing tests.
-
Paulo Matos authored
This change implements new DAG nodes TABLE_GET/TABLE_SET, and lowering methods for load and stores of reference types from IR arrays. These global LLVM IR arrays represent tables at the Wasm level. Differential Revision: https://reviews.llvm.org/D111154
-
mydeveloperday authored
[clang-format] [PR52015] clang-format should put __attribute__((foo)) on its own line before @interface / @implementation / @protocol https://bugs.llvm.org/show_bug.cgi?id=52015 A newline should be placed between the attribute and @ for Objective-C. Reviewed By: benhamilton, HazardyKnusperkeks Differential Revision: https://reviews.llvm.org/D111975
-
mydeveloperday authored
Following a change {D111273} to allow git-clang-format to see single lines being removed, we introduced a regression such that if you are removing a whole file, it will assert in clang-format as it's given -lines=0:0 (lines are 1-based). Reviewed By: HazardyKnusperkeks Differential Revision: https://reviews.llvm.org/D112056
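One way to avoid handing clang-format an invalid range is sketched below. This is not the code from git-clang-format: `lines_args` and `HUNK_RE` are hypothetical, and the handling of zero-length hunks is an assumption. It turns unified-diff hunk headers into `-lines=<start>:<end>` arguments and skips hunks whose new-file start is 0, since no 1-based line exists there.

```python
import re
from typing import List

# Captures the new-file start and optional count from "@@ -a,b +c,d @@".
HUNK_RE = re.compile(r"^@@ -\S+ \+(\d+)(?:,(\d+))?")


def lines_args(hunk_headers: List[str]) -> List[str]:
    """Build -lines arguments from hunk headers, skipping whole-file removals."""
    args = []
    for header in hunk_headers:
        match = HUNK_RE.match(header)
        if not match:
            continue
        start = int(match.group(1))
        count = int(match.group(2)) if match.group(2) is not None else 1
        if start == 0:
            # Whole file removed: lines are 1-based, so 0:0 is not a valid range.
            continue
        if count == 0:
            # Assumption: a pure removal still covers the line at the removal point.
            count = 1
        args.append(f"-lines={start}:{start + count - 1}")
    return args


example = lines_args(["@@ -1,3 +0,0 @@", "@@ -5,2 +5,0 @@", "@@ -10 +9,2 @@"])
```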
-
Josh Mottley authored
This patch replaces the uses of std::map with llvm::DenseMap in the flang-omp-report plugin. It also removes the 'constructClauseCount' map, which is no longer needed after the plugin was stripped down. This is one of several patches focusing on switching containers from STL to LLVM's ADT library. Reviewed By: kiranchandramohan, clementval Differential Revision: https://reviews.llvm.org/D111977
-
Josh Mottley authored
This patch makes the following changes to flang-omp-report: - Update 'normalize_clause_name' parameter to use llvm::StringRef instead of std::string. - Change usages of std::tolower to llvm::toLower from "ADT/StringExtras.h". This is one of several patches focusing on switching containers from STL to LLVM's ADT library. Reviewed By: Leporacanthicus, clementval Differential Revision: https://reviews.llvm.org/D111980
-
Zi Xuan Wu authored
Complete the basic integer instruction set and add related predictor in CSKY.td. This includes the instruction definitions and asm parser support. Differential Revision: https://reviews.llvm.org/D111701
-
Evgeniy Brevnov authored
To guarantee convergence of the algorithm, each optimization step should decrease the number of instructions when the IR is modified. This property does not hold in this test case. The problem is that SCEV Expander may do "unexpected" reassociation, which results in the creation of new min/max chains and the introduction of extra instructions. As a result, we optimize back and forth indefinitely. The solution is to prevent SCEV Expander from performing uncontrolled reassociations by means of "Unknown" expressions. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D112060
-
Wenlei He authored
We incorrectly use the duplication factor for total samples even though we already accumulate samples instead of taking MAX. This bloats the total samples in the profile for functions with unrolled or vectorized loops. This change fixes the issue for total samples, head samples and call target samples. Differential Revision: https://reviews.llvm.org/D112042
-
Arthur Eubanks authored
Some downstream users have plugins that -clear-ast-before-backend may affect. Add an option to opt out. Reviewed By: dblaikie Differential Revision: https://reviews.llvm.org/D112100
-
Shao-Ce SUN authored
-
Lang Hames authored
This is an ORC runtime counterpart to a01f772d, which introduced the same functionality into LLVM.
-
Lang Hames authored
Aligns this template with the corresponding one in LLVM.
-
Lang Hames authored
WrapperFunctionResult can already convey serialization errors as out-of-band error values, so there's no need to wrap it in an Expected here. Removing the wrapper simplifies the plumbing and call sites.
-
Zhi An Ng authored
Add i8x16 relaxed_swizzle instructions. These are only exposed as builtins, and require user opt-in. Differential Revision: https://reviews.llvm.org/D112022
-