- May 08, 2021
-
Xiang1 Zhang authored
-
Michael Liao authored
-
Arthur Eubanks authored
Printing pass manager invocations is fairly verbose and not super useful. This allows us to remove DebugLogging from pass managers and PassBuilder since all logging (aside from analysis managers) goes through instrumentation now. This has the downside of never being able to print the top level pass manager via instrumentation, but that seems like a minor downside. Reviewed By: ychen Differential Revision: https://reviews.llvm.org/D101797
-
RamNalamothu authored
UnwindTable::parseRows() may return successfully when the CFIProgram has either no CFI instructions or only DW_CFA_nop instructions, in which case the UnwindRow return argument will be empty. But currently, the callers are not checking for this case, which leads to incorrect dumps in the unwind tables in such cases, i.e. CFA=unspecified. Reviewed By: clayborg Differential Revision: https://reviews.llvm.org/D101892
-
River Riddle authored
The current design uses a unique entry for each argument/result attribute, with the name of the entry being something like "arg0". This provides for a somewhat sparse design, but ends up being much more expensive (from a runtime perspective) in practice. The design requires building a string every time we look up the dictionary for a specific arg/result, and also requires N attribute lookups when collecting all of the arg/result attribute dictionaries.

This revision restructures the design to instead have an ArrayAttr that contains all of the attribute dictionaries for arguments and another for results. This design reduces the number of attribute name lookups to 1, and allows for O(1) lookup for individual element dictionaries. The major downside is that we can end up with larger memory usage, as the ArrayAttr contains an entry for each element even if that element has no attributes. If the memory usage becomes too problematic, we can experiment with a more sparse structure that still provides a lot of the wins in this revision.

This dropped the compilation time of a somewhat large TensorFlow model from ~650 seconds to ~400 seconds.

Differential Revision: https://reviews.llvm.org/D102035
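A toy sketch of the tradeoff in plain Python (the dict/list layouts and attribute names here are illustrative only, not the MLIR data structures):

```python
# Old design: one named dictionary entry per argument, keyed by "arg<N>".
old_layout = {"arg0": {"noalias": True}, "arg2": {"byval": True}}

def old_lookup(attrs, index):
    # Every query builds a string key and probes the attribute dictionary;
    # collecting all N argument dictionaries costs N such lookups.
    return attrs.get("arg" + str(index), {})

# New design: a single array holding one dictionary per argument, including
# an empty dictionary for arguments that carry no attributes.
new_layout = [{"noalias": True}, {}, {"byval": True}]

def new_lookup(attrs, index):
    # One lookup to fetch the array, then O(1) indexing per element.
    return attrs[index]

assert old_lookup(old_layout, 2) == new_lookup(new_layout, 2) == {"byval": True}
```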
-
Arthur Eubanks authored
At 61 or over, I see messages like

  File "...\Python\Python39\lib\multiprocessing\connection.py", line 816, in _exhaustive_wait
    res = _winapi.WaitForMultipleObjects(L, False, timeout)
ValueError: need at most 63 handles, got a sequence of length 64

60 seems to work for me. If this causes issues for anybody else, feel free to revert.
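A minimal sketch of the kind of cap this implies, using a hypothetical helper (this is not the actual lit configuration code):

```python
import os

# Illustrative cap only. The ValueError above comes from Windows'
# WaitForMultipleObjects, which refuses to wait on more than 63 handles in
# this code path, so the worker count is held a little below that limit.
WINDOWS_WORKER_CAP = 60

def default_worker_count():
    count = os.cpu_count() or 1
    if os.name == "nt":
        count = min(count, WINDOWS_WORKER_CAP)
    return count
```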
-
River Riddle authored
This provides information when the user hovers over a part of the source .mlir file. This revision adds the following hover behavior:

* Operation:
  - Shows the generic form.
* Operation Result:
  - Shows the parent operation name, result number(s), and type(s).
* Block:
  - Shows the parent operation name, block number, predecessors, and successors.
* Block Argument:
  - Shows the parent operation name, parent block, argument number, and type.

Differential Revision: https://reviews.llvm.org/D101113
-
Arthur Eubanks authored
This reverts commit d319005a. Causing messages like:

  File "...\Python\Python39\lib\multiprocessing\connection.py", line 816, in _exhaustive_wait
    res = _winapi.WaitForMultipleObjects(L, False, timeout)
ValueError: need at most 63 handles, got a sequence of length 74
-
Arthur Eubanks authored
-
thomasraoux authored
The previous change caused another warning in some build configurations: "default label in switch which covers all enumeration values"
-
Amara Emerson authored
We never bothered to have a separate set of combines for -O0 in the prelegalizer before. This results in some minor performance hits for a mode where performance isn't a concern (although not regressing code size significantly is still preferable). This also removes the CSE option since we don't need it for -O0. Through experiments, I've arrived at a set of combines that gets the most code size improvement at -O0, while reducing the amount of time spent in the combiner by around 35% give or take. Differential Revision: https://reviews.llvm.org/D102038
-
Amara Emerson authored
For importing patterns, we only support matching G_LOAD, not G_ZEXTLOAD or G_SEXTLOAD. Differential Revision: https://reviews.llvm.org/D101932
-
Weston Carvalho authored
Differential Revision: https://reviews.llvm.org/D101572
-
Weston Carvalho authored
This will make it possible for more code to use it.
-
Arthur Eubanks authored
We're trying to move DebugLogging into instrumentation, rather than being part of PassManagers/AnalysisManagers. Reviewed By: ychen Differential Revision: https://reviews.llvm.org/D102093
-
- May 07, 2021
-
Jessica Paquette authored
Using `clampScalar` here because we ought to mark s128 as custom eventually. (Right now, it will just fall back.) With this legalization, we get the same code as SDAG: https://godbolt.org/z/TneoPKrKG Differential Revision: https://reviews.llvm.org/D100908
-
Adrian Prantl authored
-
Petr Hosek authored
This addresses an issue introduced in D91559. We would invoke the compiler with -Lpath/to/lib --sysroot=path/to/sysroot where both locations contain libraries with the same name, but we expect the linker to pick up the library in path/to/lib since that version is more specialized. This was the case before D91559, where the sysroot path would be ignored, but after that change the linker would pick up the library from the sysroot, which resulted in unexpected behavior. The sysroot path should always come after any user-provided library paths, followed by compiler runtime paths. We want libraries in user-provided library paths to always take precedence over sysroot libraries. This matches the behavior of other toolchains used with other targets. Differential Revision: https://reviews.llvm.org/D102049
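A small sketch of the intended ordering (hypothetical helper in Python; not the actual clang driver code):

```python
def linker_search_paths(user_lib_paths, sysroot_lib_paths, runtime_paths):
    """Order library search directories as described above: user -L paths
    first, then sysroot library paths, then compiler runtime paths, so a
    library present both in a -L directory and in the sysroot resolves to
    the -L copy."""
    return list(user_lib_paths) + list(sysroot_lib_paths) + list(runtime_paths)

paths = linker_search_paths(["path/to/lib"], ["path/to/sysroot/usr/lib"], [])
assert paths[0] == "path/to/lib"  # the more specialized copy wins
```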
-
Nico Weber authored
Before this, if an inline function was defined in several input files, lld would write each copy of the inline function to the output. With this patch, it only writes one copy.

Reduces the size of Chromium Framework from 378MB to 345MB (compared to 290MB linked with ld64, which also does dead-stripping, which we don't do yet), and makes linking it faster:

        N           Min           Max        Median           Avg        Stddev
    x  10     3.9957051     4.3496981     4.1411121      4.156837    0.10092097
    +  10      3.908154      4.169318     3.9712729     3.9846753   0.075773012
    Difference at 95.0% confidence
            -0.172162 +/- 0.083847
            -4.14165% +/- 2.01709%
            (Student's t, pooled s = 0.0892373)

Implementation-wise, when merging two weak symbols, this sets a "canOmitFromOutput" on the InputSection belonging to the weak symbol not put in the symbol table. We then don't write InputSections that have this set, as long as they are not referenced from other symbols. (This happens e.g. for object files that don't set .subsections_via_symbols or that use .alt_entry.)

Some restrictions:
- not yet done for bitcode inputs
- no "comdat" handling (`kindNoneGroupSubordinate*` in ld64) -- Frame Descriptor Entries (FDEs), Language Specific Data Areas (LSDAs) (that is, catch block unwind information) and Personality Routines associated with weak functions are still not stripped. This is wasteful, but harmless.
- However, this does strip weaks from __unwind_info (which is needed for correctness and not just for size)
- This nopes out on InputSections that are referenced from more than one symbol (e.g. from .alt_entry) for now

Things that work based on symbols Just Work:
- map files (change in MapFile.cpp is a no-op and not needed; I just found it a bit more explicit)
- exports

Things that work with inputSections need to explicitly check if an inputSection is written (e.g. unwind info).

This patch is useful in itself, but it's likely also a useful foundation for dead_strip. I used to have a "canonicalRepresentative" pointer on InputSection instead of just the bool, which would be handy for ICF too. But I ended up not needing it for this patch, so I removed that again for now.

Differential Revision: https://reviews.llvm.org/D102076
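A heavily simplified Python model of the coalescing idea (names are illustrative; the real change works on lld's symbol table and InputSection types, and which copy survives follows lld's weak-symbol resolution rather than simple input order):

```python
def coalesce_weak_definitions(symbols):
    """Keep one weak definition per name; mark the sections of the losing
    copies as omittable so they are not written to the output, unless
    something else still references them."""
    chosen = {}
    for sym in symbols:
        if sym["name"] not in chosen:
            chosen[sym["name"]] = sym
        else:
            # The copy not kept in the symbol table gets canOmitFromOutput
            # set on its InputSection (modelled here as a plain dict field).
            sym["section"]["can_omit_from_output"] = True
    return chosen

inline_a = {"name": "_inline_fn", "section": {"can_omit_from_output": False}}
inline_b = {"name": "_inline_fn", "section": {"can_omit_from_output": False}}
coalesce_weak_definitions([inline_a, inline_b])
assert inline_b["section"]["can_omit_from_output"]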
-
thomasraoux authored
-
thomasraoux authored
Differential Revision: https://reviews.llvm.org/D102091
-
Petr Hosek authored
This reverts commit 6b00b34b.
-
Florian Hahn authored
The comment incorrectly states that the PHI is recorded. That's not accurate; only the recipe for the incoming value is recorded. Suggested post-commit for 4ba8720f.
-
Andrea Di Biagio authored
The register file should always check if the destination register is from a register class that allows move elimination. Before this change, the check on the register class was only performed in a few very specific cases. However, it should have always been performed. This patch fixes the issue. Note that none of the upstream scheduling models is currently affected by this bug, so there is no test for it. The issue was found by Roman while working on the znver3 model. I was able to reproduce the issue locally by tweaking the btver2 model. I then verified that this patch fixes the issue.
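A toy sketch of the rule being enforced (illustrative names only, not the llvm-mca API):

```python
# Simplified register-file model; register class names are made up.
ALLOWS_MOVE_ELIMINATION = {"GPR"}

def try_eliminate_move(dst_reg_class, other_checks_pass):
    # The register-class check must happen unconditionally, not only in a few
    # special cases as before this fix.
    if dst_reg_class not in ALLOWS_MOVE_ELIMINATION:
        return False
    return other_checks_pass

assert try_eliminate_move("GPR", True)
assert not try_eliminate_move("FPR", True)  # class forbids elimination
```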
-
Olivier Goffart authored
Commit 5baea056 set the CurCodeDecl because it was needed to pass the assert in CodeGenFunction::EmitLValueForLambdaField, but this was not right to do, as CodeGenFunction::FinishFunction passes it to EmitEndEHSpec and causes corruption of the EHStack. Revert the part of the commit that changes the CurCodeDecl, and instead adjust the assert to check for a null CurCodeDecl. Differential Revision: https://reviews.llvm.org/D102027
-
Florian Hahn authored
Currently sinking a replicate region into another replicate region is not supported. Add an assert, to make the problem more obvious, should it occur. Discussed post-commit for ccebf7a1.
-
Florian Hahn authored
Adjust the name to make it clearer this is the region containing the target recipe, similar to SinkRegion below. Suggested post-commit for ccebf7a1.
-
peter klausler authored
Implement the reduction transformational intrinsic function NORM2 in the runtime, using infrastructure already in place for MAXVAL & al. Differential Revision: https://reviews.llvm.org/D102024
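For reference, NORM2 is the Euclidean (L2) norm reduction; a minimal sketch of the 1-D case in Python:

```python
import math

def norm2(xs):
    # Fortran's NORM2(X) is SQRT(SUM(X**2)); the real runtime may compute it
    # more carefully (e.g. with scaling) to avoid intermediate overflow.
    return math.sqrt(sum(x * x for x in xs))

assert norm2([3.0, 4.0]) == 5.0
```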
-
Petr Hosek authored
This addresses an issue introduced in D91559. We would invoke the compiler with -Lpath/to/lib --sysroot=path/to/sysroot where both locations contain libraries with the same name, but we expect the linker to pick up the library in path/to/lib since that version is more specialized. This was the case before D91559, where the sysroot path would be ignored, but after that change the linker would pick up the library from the sysroot, which resulted in unexpected behavior. The sysroot path should always come after any user-provided library paths, followed by compiler runtime paths. We want libraries in user-provided library paths to always take precedence over sysroot libraries. This matches the behavior of other toolchains used with other targets. Differential Revision: https://reviews.llvm.org/D102049
-
Hsiangkai Wang authored
We have vector operations that take a double vector and a float scalar. For example, vfwadd.wf is such an instruction:

  vfloat64m1_t vfwadd_wf(vfloat64m1_t op0, float op1, size_t op2);

We should specify both the F and D extensions for it. Differential Revision: https://reviews.llvm.org/D102051
-
Vyacheslav Zakharin authored
I want to start using LLVM component libraries in libomptarget to stop duplicating implementations already available in LLVM (e.g. LLVMObject, LLVMSupport, etc.). Without relying on LLVM in all libomptarget builds, one has to provide a fallback implementation for each LLVM feature used. This is an attempt to stop supporting out-of-llvm-tree builds of libomptarget. I understand that I may need to revert this if it affects downstream projects in a bad way. Differential Revision: https://reviews.llvm.org/D101509
-
Alexander Belyaev authored
Differential Revision: https://reviews.llvm.org/D102088
-
Alexander Belyaev authored
Differential Revision: https://reviews.llvm.org/D102089
-
Emilio Cota authored
b614ada0 ("[mlir] add support for index type in vectors.") removed this limitation. Differential Revision: https://reviews.llvm.org/D102081
-
Arthur Eubanks authored
This reverts commit 0791f968. Causing crashes: https://crbug.com/1206764
-
Florian Hahn authored
I think currently isImpliedViaMerge can incorrectly return true for phis in a loop/cycle, if the found condition involves the previous value of the phi. Consider the case in exit_cond_depends_on_inner_loop.

At some point, we call (modulo simplifications) isImpliedViaMerge(<=, %x.lcssa, -1, %call, -1). The existing code tries to prove IncV <= -1 for all incoming values IncV using the found condition (%call <= -1). At the moment this succeeds, but only because it does not compare the same runtime value. The found condition checks the value of the last iteration, but the incoming value is from the *previous* iteration. Hence we incorrectly determine that the *previous* value was <= -1, which may not be true.

I think we need to be more careful when looking at the incoming values here. In particular, we need to rule out that a found condition refers to any value that may refer to one of the previous iterations. I'm not sure there's a reliable way to do so (that also works with irreducible control flow). So for now this patch adds an additional requirement that the incoming value must properly dominate the phi block. This should ensure the values do not change in a cycle. I am not entirely sure if this will catch all cases and I appreciate a thorough second look in that regard.

Alternatively we could also unconditionally bail out in this case, instead of checking the incoming values.

Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D101829
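A tiny plain-Python illustration of the underlying pitfall (unrelated to the SCEV code itself):

```python
# A fact about the value in the last iteration implies nothing about the
# value one iteration earlier, which is what the phi's incoming value refers to.
values = [5, 3, -2]            # value of %call across successive iterations
assert values[-1] <= -1        # the found condition holds at the final check
assert not (values[-2] <= -1)  # but it did not hold in the previous iteration
```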
-
Thomas Lively authored
To improve hygiene, consistency, and usability, it would be good to replace all the macro intrinsics in wasm_simd128.h with functions. The reason for using macros in the first place was to enforce the use of constants for some arguments using `_Static_assert` with `__builtin_constant_p`. This commit switches to using functions and uses the `__diagnose_if__` attribute rather than `_Static_assert` to enforce constantness. The remaining macro intrinsics cannot be made into functions until the builtin functions they are implemented with can be replaced with normal code patterns because the builtin functions themselves require that their arguments are constants. This commit also fixes a bug with the const_splat intrinsics in which the f32x4 and f64x2 variants were incorrectly producing integer vectors. Differential Revision: https://reviews.llvm.org/D102018
-
Fangrui Song authored
-
Krzysztof Parzyszek authored
This will allow writing propagateMetadata(Inst, collectInterestingValues(...)) without concern about empty lists. In case of an empty list, Inst is returned without any changes.
-
Fangrui Song authored
-