- May 07, 2021
-
Alexander Belyaev authored
Differential Revision: https://reviews.llvm.org/D102088
-
Alexander Belyaev authored
Differential Revision: https://reviews.llvm.org/D102089
-
Emilio Cota authored
b614ada0 ("[mlir] add support for index type in vectors.") removed this limitation. Differential Revision: https://reviews.llvm.org/D102081
-
Arthur Eubanks authored
This reverts commit 0791f968. Causing crashes: https://crbug.com/1206764
-
Florian Hahn authored
I think currently isImpliedViaMerge can incorrectly return true for phis in a loop/cycle, if the found condition involves the previous value of the phi. Consider the case in exit_cond_depends_on_inner_loop. At some point, we call (modulo simplifications) isImpliedViaMerge(<=, %x.lcssa, -1, %call, -1). The existing code tries to prove IncV <= -1 for all incoming values IncV using the found condition (%call <= -1). At the moment this succeeds, but only because it does not compare the same runtime value: the found condition checks the value of the last iteration, but the incoming value is from the *previous* iteration. Hence we incorrectly determine that the *previous* value was <= -1, which may not be true. I think we need to be more careful when looking at the incoming values here. In particular, we need to rule out that a found condition refers to any value that may refer to one of the previous iterations. I'm not sure there's a reliable way to do so (that also works with irreducible control flow), so for now this patch adds an additional requirement that the incoming value must properly dominate the phi block. This should ensure the values do not change in a cycle. I am not entirely sure this will catch all cases, and I'd appreciate a thorough second look in that regard. Alternatively we could also unconditionally bail out in this case, instead of checking the incoming values. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D101829
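A minimal sketch of the added requirement, assuming LLVM's DominatorTree API; the helper name is made up and this is not the actual patch:

  #include "llvm/IR/Dominators.h"
  #include "llvm/IR/Instructions.h"
  using namespace llvm;

  // Every incoming value must properly dominate the phi's block, so it cannot
  // be a value from a previous iteration of a surrounding cycle.
  static bool incomingValuesDominatePhiBlock(const PHINode *Phi,
                                             const DominatorTree &DT) {
    for (const Use &U : Phi->incoming_values())
      if (const auto *IncI = dyn_cast<Instruction>(U.get()))
        if (!DT.properlyDominates(IncI->getParent(), Phi->getParent()))
          return false; // may change along the cycle; bail out
    return true;
  }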
-
Thomas Lively authored
To improve hygiene, consistency, and usability, it would be good to replace all the macro intrinsics in wasm_simd128.h with functions. The reason for using macros in the first place was to enforce the use of constants for some arguments using `_Static_assert` with `__builtin_constant_p`. This commit switches to using functions and uses the `__diagnose_if__` attribute rather than `_Static_assert` to enforce constantness. The remaining macro intrinsics cannot be made into functions until the builtin functions they are implemented with can be replaced with normal code patterns because the builtin functions themselves require that their arguments are constants. This commit also fixes a bug with the const_splat intrinsics in which the f32x4 and f64x2 variants were incorrectly producing integer vectors. Differential Revision: https://reviews.llvm.org/D102018
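For illustration, a hedged sketch of the pattern described above (a toy example, not the actual wasm_simd128.h code): __diagnose_if__ together with __builtin_constant_p turns "argument must be a constant" into a per-call-site compile error while keeping the intrinsic a real function.

  // Error at any call site where `x` is not a compile-time constant.
  #define __REQUIRE_CONSTANT(x)                                          \
    __attribute__((__diagnose_if__(!__builtin_constant_p(x),             \
                                   #x " must be constant", "error")))

  // Function-style "intrinsic": the declaration carries the constraint.
  static inline int demo_extract_lane(const int *vec, int lane)
      __REQUIRE_CONSTANT(lane);

  static inline int demo_extract_lane(const int *vec, int lane) {
    return vec[lane];
  }

  // demo_extract_lane(v, 1);      // OK
  // demo_extract_lane(v, lane_i); // error: lane must be constant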
-
Fangrui Song authored
-
Krzysztof Parzyszek authored
This will allow writing propagateMetadata(Inst, collectInterestingValues(...)) without concern about empty lists. In case of an empty list, Inst is returned without any changes.
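A minimal sketch of the new behaviour, using a made-up helper name with a propagateMetadata-style signature; not the actual implementation:

  #include "llvm/ADT/ArrayRef.h"
  #include "llvm/IR/Instruction.h"
  using namespace llvm;

  Instruction *propagateMetadataSketch(Instruction *Inst, ArrayRef<Value *> VL) {
    if (VL.empty())
      return Inst; // empty list: Inst is returned without any changes
    // ... otherwise intersect the interesting metadata of the values in VL
    // onto Inst, as before ...
    return Inst;
  }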
-
Fangrui Song authored
-
Louis Dionne authored
Jobs that test with a more recent standard version run more tests, so they take longer. We'll decrease the average latency by running them first instead of last.
-
Saleem Abdulrasool authored
Revert the 32-process cap on Windows. When testing with Swift, we found that there was a time reduction for testing with the higher load. This should hopefully not matter much in practice. In the case that the original problem with python remains with a high subprocess count, we can easily revert this change.
-
Roman Lebedev authored
-
Roman Lebedev authored
It measures as such, and the reference docs agree. I can't easily add an MCA test because there's no mnemonic for it; it can only be disassembled or created as an MCInst.
-
Fangrui Song authored
Similar to X86 D73230 & 46788a21. With this change, we can set dso_local in clang's -fpic -fno-semantic-interposition mode, for default visibility external linkage non-ifunc-non-COMDAT definitions. For such dso_local definitions, variable access/taking the address of a function/calling a function will go through a local alias to avoid GOT/PLT. Note: the 'S' inline assembly constraint refers to an absolute symbolic address or a label reference (D46745). Differential Revision: https://reviews.llvm.org/D101872
-
Matt Morehouse authored
Fix the function return type and remove the check for SUMMARY, since it doesn't seem to be output on Windows.
-
Whitney Tsang authored
induction variable to be perfect. This patch allows more conditional branches to be considered as loop guards, and so more loop nests can be considered perfect. Reviewed By: bmahjour, sidbav Differential Revision: https://reviews.llvm.org/D94717
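An illustrative, made-up example of a guarded nest: the conditional branch in front of the loops acts only as a loop guard, and the patch lets more such guarding branches be treated as loop guards so the nest can still be considered perfect.

  // The `if` is only a guard; all real work sits in the innermost loop.
  void zero_matrix(int **a, int n, int m) {
    if (n > 0 && m > 0)
      for (int i = 0; i < n; ++i)
        for (int j = 0; j < m; ++j)
          a[i][j] = 0;
  }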
-
Simon Pilgrim authored
Ensure we don't try to fold when one might be an opaque constant - the constant fold will fail and then the reverse fold will happen in DAGCombine.
-
Roman Lebedev authored
Sometimes the disassembler picks _REV variants of instructions over the plain ones, which in this case exposed an issue: the _REV variants aren't being modelled as optimizable moves.
-
Roman Lebedev authored
-
Sebastian Poeplau authored
Address sanitizer can detect stack exhaustion via its SEGV handler, which is executed on a separate stack using the sigaltstack mechanism. When libFuzzer is used with address sanitizer, it installs its own signal handlers which defer to those put in place by the sanitizer before performing additional actions. In the particular case of a stack overflow, the current setup fails because libFuzzer doesn't preserve the flag for executing the signal handler on a separate stack: when we run out of stack space, the operating system can't run the SEGV handler, so address sanitizer never reports the issue. See the included test for an example. This commit fixes the issue by making libFuzzer preserve the SA_ONSTACK flag when installing its signal handlers; the dedicated signal-handler stack set up by the sanitizer runtime appears to be large enough to support the additional frames from the fuzzer. Reviewed By: morehouse Differential Revision: https://reviews.llvm.org/D101824
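A hedged sketch of the idea (my own, not libFuzzer's actual code): when installing the wrapping SIGSEGV handler, start from the flags of the handler already in place so SA_ONSTACK is preserved and the wrapper still runs on the sanitizer's sigaltstack.

  #include <signal.h>

  static void SegvWrapper(int Sig, siginfo_t *Info, void *Ctx) {
    // ... defer to the previously installed (sanitizer) handler here ...
    (void)Sig; (void)Info; (void)Ctx;
  }

  void InstallWrappedSegvHandler() {
    struct sigaction Old = {};
    sigaction(SIGSEGV, nullptr, &Old); // read the handler the sanitizer set up
    struct sigaction New = Old;        // inherit its flags, including SA_ONSTACK
    New.sa_sigaction = SegvWrapper;
    New.sa_flags |= SA_SIGINFO;        // use the three-argument handler form
    sigaction(SIGSEGV, &New, nullptr);
  }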
-
thomasraoux authored
Differential Revision: https://reviews.llvm.org/D102034
-
thomasraoux authored
Differential Revision: https://reviews.llvm.org/D102041
-
Joseph Tremoulet authored
Pointers escape when converted to integers, so a pointer produced by converting an integer to a pointer must not be a local non-escaping object. Reviewed By: nikic, nlopes, aqjune Differential Revision: https://reviews.llvm.org/D101541
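An illustrative (made-up) C++ example of the rule above: once a local's address has been converted to an integer, a pointer later produced from an integer may alias it, so the local can no longer be treated as a non-escaping object.

  #include <cstdint>

  int read_through_int() {
    int local = 42;
    uintptr_t addr = reinterpret_cast<uintptr_t>(&local); // the address escapes here
    int *p = reinterpret_cast<int *>(addr);               // inttoptr-style round trip
    return *p; // p aliases `local`; analyses must not assume otherwise
  }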
-
Sanjay Patel authored
This is a reduction of the example in: https://llvm.org/PR50256
-
Joseph Huber authored
Summary: The allocator interface added in D97883 allows the RTL to allocate shared and host-pinned memory from the cuda plugin. This patch adds support for these to the runtime. Reviewed By: grokos Differential Revision: https://reviews.llvm.org/D102000
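For context, a hedged sketch (my own, not the plugin's code) of how such allocations are typically obtained through the CUDA driver API: cuMemAllocManaged for memory shared between host and device, and cuMemAllocHost for host-pinned memory.

  #include <cstddef>
  #include <cuda.h>

  void *allocSharedSketch(size_t Size) {
    CUdeviceptr Ptr = 0;
    if (cuMemAllocManaged(&Ptr, Size, CU_MEM_ATTACH_GLOBAL) != CUDA_SUCCESS)
      return nullptr;
    return reinterpret_cast<void *>(Ptr); // accessible from both host and device
  }

  void *allocHostPinnedSketch(size_t Size) {
    void *Ptr = nullptr;
    if (cuMemAllocHost(&Ptr, Size) != CUDA_SUCCESS)
      return nullptr;
    return Ptr; // page-locked host memory
  }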
-
Tobias Gysi authored
Remove the builder signature taking a signed dimension identifier. Reviewed By: ergawy Differential Revision: https://reviews.llvm.org/D102055
-
Tres Popp authored
This is to make the difference between this and an AliasAnalysis clearer. For example, given a sequence of subviews that create values A -> B -> C -> D: BufferViewFlowAnalysis::resolve(B) => {B, C, D}, while AliasAnalysis::resolve(B) => {A, B, C, D}. Differential Revision: https://reviews.llvm.org/D100838
-
Ahsan Saghir authored
Vector pair intrinsics and builtins were renamed in https://reviews.llvm.org/D91974 to replace the _mma_ prefix by _vsx_. However, some projects used the _mma_ version, so this patch adds these intrinsics to provide compatibility. Fixes Bugzilla: https://bugs.llvm.org/show_bug.cgi?id=50159 Reviewed By: nemanjai, amyk Differential Revision: https://reviews.llvm.org/D100482
-
Roman Lebedev authored
Drop it just enough so it still produces the right IPC.
-
Roman Lebedev authored
They are resolved at the register rename stage without using any execution units.
-
Roman Lebedev authored
I've verified this with llvm-exegesis. This is not limited to zero registers.
-
Roman Lebedev authored
I've verified this with llvm-exegesis. This is not limited to zero registers.
-
Roman Lebedev authored
I've verified this with llvm-exegesis. This is not limited to zero registers.
Refs:
AMD SOG 19h, 2.9.4 Zero Cycle Move:
  The processor is able to execute certain register to register mov operations with zero cycle delay.
Agner, 22.13 Instructions with no latency:
  Register-to-register move instructions are resolved at the register rename stage without using any execution units. These instructions have zero latency. It is possible to do six such register renamings per clock cycle, and it is even possible to rename the same register multiple times in one clock cycle.
-
Roman Lebedev authored
-
Roman Lebedev authored
-
Roman Lebedev authored
-
Roman Lebedev authored
They are resolved at the register rename stage without using any execution units.
-
Roman Lebedev authored
-
Roman Lebedev authored
So the IPC actually stabilizes at 6.
-
Arthur O'Dwyer authored
And remove the dedicated debug-iterator tests; we want to test this in all modes. We have a CI step for testing the whole test suite with `--debug_level=1` now. Part of https://reviews.llvm.org/D102003
-