- Jun 01, 2020
-
-
James Henderson authored
This will ensure that nothing can ever start parsing data from a future sequence and part-read data will be returned as 0 instead. Reviewed by: aprantl, labath Differential Revision: https://reviews.llvm.org/D80796
-
Sanjay Patel authored
This is effectively reverting rGbfdc2552664d to avoid test churn while we figure out a better way forward. We at least salvage the warning on name conflict from that patch though. If we change the default string again, we may want to mass update tests at the same time. Alternatively, we could live with the poor naming if we change -instnamer. This also adds a test to LLVM as suggested in the post-commit review. There's a clang test that is also affected. That seems like a layering violation, but I have not looked at fixing that yet. Differential Revision: https://reviews.llvm.org/D80584
-
James Henderson authored
The debug_line_invalid.test test case was previously using the interpreted line table dumping to identify which opcodes have been parsed. This change moves to looking for the expected opcodes explicitly. This is probably a little clearer and also allows for testing some cases that wouldn't be easily identifiable from the interpreted table. Reviewed by: MaskRay Differential Revision: https://reviews.llvm.org/D80795
-
Igor Kudrin authored
For most tables, we already use commas in headers. This set of patches unifies dumping the remaining ones. Differential Revision: https://reviews.llvm.org/D80806
-
Igor Kudrin authored
For most tables, we already use commas in headers. This set of patches unifies dumping the remaining ones. Differential Revision: https://reviews.llvm.org/D80806
-
Igor Kudrin authored
For most tables, we already use commas in headers. This set of patches unifies dumping the remaining ones. Differential Revision: https://reviews.llvm.org/D80806
-
Ehud Katz authored
This is a reimplementation of the `orderNodes` function, as the old implementation didn't take into account all cases. The new implementation uses SCCs instead of Loops to take irreducible loops into account. Fixes PR41509. Differential Revision: https://reviews.llvm.org/D79037
-
Georgii Rymar authored
This improves the following points for broken hash tables: 1) Use reportUniqueWarning to prevent duplication when --hash-table and --elf-hash-histogram are used together. 2) Dump the nbuckets and nchain fields. It is often possible to dump them even when the table itself goes past the EOF, etc. Differential revision: https://reviews.llvm.org/D80373
-
Tim Northover authored
When a stack offset was too big to materialize in a single instruction, we were trying to do it in stages:

    adds xD, sp, #imm
    adds xD, xD, #imm

Unfortunately, if xD is xzr then the second instruction doesn't exist and wouldn't do what was needed if it did. Instead we can use a temporary register for all but the last addition.
-
Li Rong Yi authored
Summary: Exploit vabsd* for absolute difference of vectors on P9, for example:

    void foo (char *restrict p, char *restrict q, char *restrict t) {
      for (int i = 0; i < 16; i++)
        t[i] = abs (p[i] - q[i]);
    }

This case should be matched to the HW instruction vabsdub. Reviewed By: steven.zhang Differential Revision: https://reviews.llvm.org/D80271
-
- May 31, 2020
-
-
Craig Topper authored
Previously we walked the users of any vector binop looking for more binops with the same opcode or phis that eventually ended up in a reduction. While this is simple, it also means visiting the same nodes many times, since we do a forward walk for each BinaryOperator in the chain. It was also far more general than what we have tests for or expect to see. This patch replaces the algorithm with a new method that starts at extract elements looking for a horizontal reduction. Once we find a reduction, we walk backwards through phis and adds to collect leaves that we can consider for rewriting. We only consider single-use adds and phis, except for a special case where the Add is used by a phi that forms a loop back to the Add; this lets us include other single-use Adds to support unrolled loops. Ultimately, I want to narrow the Adds, Phis, and final reduction based on the partial reduction we're doing. I still haven't figured out exactly what that looks like yet, but restricting the types of graphs we expect to handle seemed like a good first step, as does having all the leaves and the reduction at once. Differential Revision: https://reviews.llvm.org/D79971
-
Simon Pilgrim authored
-
Simon Pilgrim authored
This matches what we do for the full sized vector ops at the start of combineX86ShufflesRecursively, and helps getFauxShuffleMask extract more INSERT_SUBVECTOR patterns.
-
Matt Arsenault authored
I inverted the mask when I ported to the new form of G_PTRMASK in 8bc03d21. I don't think this really broke anything, since G_VASTART isn't handled for types with an alignment higher than the stack alignment.
-
Sanjay Patel authored
-
Simon Pilgrim authored
As suggested on D79987.
-
Sanjay Patel authored
Goes with proposal in D80885. This is adapted from the InstCombine tests that were added for D50992 But these should be adjusted further to provide more interesting scenarios for x86-specific codegen. Eg, vector types/sizes will have different costs depending on ISA attributes. We also need to add tests that include a load of the scalar variable and add tests that include extra uses of the insert to further exercise the cost model.
-
Simon Pilgrim authored
-
Sanjay Patel authored
Motivating test for vector-combine enhancement in D80885. Make sure that vectorization and canonicalization are working together as expected.
-
Kang Zhang authored
This case will fail on some machines that enable expensive checks. This reverts commit af3abbf7.
-
Kang Zhang authored
-
Jay Foad authored
Differential Revision: https://reviews.llvm.org/D80813
-
Jay Foad authored
-
Changpeng Fang authored
Reviewers: rampitec, arsenm Differential Revision: https://reviews.llvm.org/D80853
-
Craig Topper authored
-
Craig Topper authored
-
Craig Topper authored
The instruction is defined to only produce high result if both destinations are the same. We can exploit this to avoid unnecessarily clobbering a register. In order to hide this from register allocation we use a pseudo instruction and expand the result during MCInst creation. Differential Revision: https://reviews.llvm.org/D80500
-
- May 30, 2020
-
-
Whitney Tsang authored
rG7873376bb36b fixes a build failure for allyesconfig. The problem happened when the single exiting block doesn't dominate the loop latch: then the immediate dominator of the exit block should not be the exiting block after unrolling. As the exiting blocks of different unrolled iterations can branch to the exit block, and the ith exiting block doesn't dominate the (i+1)th exiting block, the immediate dominator of the exit block should now be the nearest common dominator of the exiting block and the loop latch of the same iteration. Differential Revision: https://reviews.llvm.org/D80477
-
Philip Reames authored
-
zoecarver authored
Adds a simple fast-path check for the pattern:

    v = load ptr
    store v to ptr

I took the tests from the bugzilla post; I can add more if needed (but I think these should be sufficient). Refs: https://bugs.llvm.org/show_bug.cgi?id=45795 Differential Revision: https://reviews.llvm.org/D79391
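A minimal C sketch of the no-op store pattern this fast path targets (the function name is hypothetical, for illustration only): the value stored back is exactly the value just loaded from the same pointer, so the store changes nothing and can be deleted.

```c
/* Sketch of the "v = load ptr; store v to ptr" pattern: the store
 * writes back the unmodified loaded value, so it is dead. */
int load_store_roundtrip(int *p) {
  int v = *p; /* v = load ptr */
  *p = v;     /* store v to ptr -- a no-op, removable by DSE */
  return v;
}
```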
-
Florian Hahn authored
Currently, BasicAA does not exploit information about value ranges of indexes. For example, consider the 2 pointers %a = %base and %b = %base + %stride below, assuming they are used to access 4 elements. If we know that %stride >= 4, we know the accesses do not alias. If %stride is a constant, BasicAA currently gets that. But if the >= 4 constraint is encoded using an assume, it misses the NoAlias. This patch extends DecomposedGEP to include an additional MinOtherOffset field, which tracks the constant offset similar to the existing OtherOffset, with the difference that it also includes non-negative lower bounds on the range of the index value. When checking if the distance between 2 accesses exceeds the access size, we can use this improved bound. For now this is limited to using non-negative lower bounds for indices, as this conveniently skips cases where we do not have a useful lower bound (because it is not constrained). We potentially miss out in cases where the lower bound is constrained but negative, but that can be exploited in the future. Reviewers: sanjoy, hfinkel, reames, asbirlea Reviewed By: asbirlea Differential Revision: https://reviews.llvm.org/D76194
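A hedged C sketch of the motivating scenario (function and variable names are hypothetical): two pointers a = base and b = base + stride, each accessing 4 elements. In the IR the stride >= 4 fact would be conveyed via llvm.assume rather than the early-return guard used here, which is the case BasicAA previously missed.

```c
/* Two accesses of 4 ints each, at base and base + stride. With
 * stride >= 4 the ranges [a, a+4) and [b, b+4) cannot overlap, so the
 * accesses are NoAlias. The guard below stands in for an assume-encoded
 * bound (e.g. llvm.assume on stride >= 4) in the motivating IR. */
void write_disjoint(int *base, long stride) {
  if (stride < 4)
    return; /* stand-in for the assume; bound is not a constant */
  int *a = base;
  int *b = base + stride;
  for (int i = 0; i < 4; i++) {
    a[i] = 1;
    b[i] = 2; /* never clobbers a[0..3] when stride >= 4 */
  }
}
```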
-
Craig Topper authored
-
Martin Storsjö authored
-
Martin Storsjö authored
[AArch64] Treat x18 as callee-saved in functions with windows calling convention on non-windows OSes Treat it as callee-saved, and always back it up. When windows code calls entry points in unix code, marked with the windows calling convention, that unix code can call other functions that aren't compiled with -ffixed-x18 and may clobber x18 freely. By backing it up and restoring it on return, we preserve the register across the function call, fulfilling this part of the windows calling convention on another OS. This isn't enough to make sure that x18 is preserved when non-windows code does a callback to windows code, but it is a clear improvement over the current status quo. Additionally, wine nowadays builds many modules as PE DLLs, which avoids the callback issue altogether for those DLLs. Differential Revision: https://reviews.llvm.org/D61892
-
Sourabh Singh Tomar authored
This patch adds support for emission of following DWARFv5 macro forms in .debug_macro.dwo section: - DW_MACRO_start_file - DW_MACRO_end_file - DW_MACRO_define_strx - DW_MACRO_undef_strx Reviewed By: dblaikie Differential Revision: https://reviews.llvm.org/D78866
-
Eric Christopher authored
Last time we looked at this we couldn't come up with a reason to change it, but with a pragma for full loop unrolling we bypass every other loop unroll and then fail to fully unroll a loop when the pragma is set. Move OnlyWhenForced out of the check and into the initialization of the full-unroll pass in the new pass manager. This doesn't show up with the old pass manager. Also add a new option to opt so that we can turn off loop unrolling manually, since this is a difference between clang and opt. Tested with check-clang and check-llvm.
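A minimal sketch of the kind of user code affected (function name hypothetical): the pragma requests full unrolling of a trip-count-16 loop, which is the request the gated full-unroll pass was failing to honor. The pragma spelling is clang's; other compilers ignore the unknown pragma.

```c
/* User-requested full unroll via "#pragma clang loop unroll(full)".
 * The commit's fix makes the new pass manager honor this request
 * instead of skipping the full-unroll pass. */
int sum16(const int *a) {
  int s = 0;
#pragma clang loop unroll(full) /* clang-specific hint; harmless elsewhere */
  for (int i = 0; i < 16; i++)
    s += a[i];
  return s;
}
```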
-
Carl Ritson authored
Summary: Replace an assertion that blocks S1024 SGPR to VGPR spill. The assertion pre-dates S1024 and is not wave size dependent. Reviewers: arsenm, sameerds, rampitec Reviewed By: arsenm Subscribers: qcolombet, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, kerbowa, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D80783
-
Matt Arsenault authored
This is a custom inserter because it was less work than teaching tablegen a way to indicate that it is sometimes OK to have a no side effect instruction in the output of a side effecting pattern. The asm is needed to look like a read of the mode register to prevent it from being deleted. However, there seems to be a bug where the mode register def instructions are moved across the asm sideeffect by the post-RA scheduler. Another oddity is the immediate is formatted differently between s_denorm_mode and s_round_mode.
-
Matt Arsenault authored
Most of these should be identical and use a common prefix, but update_llc_test_checks is failing to generate shared checks for some reason.
-
Matt Arsenault authored
-