- Feb 08, 2018
-
-
Craig Topper authored
[DAGCombiner] Fix a couple of mistakes from r324311 by really passing the original load to ExtendSetCCUses. We were passing the binary op that uses the load instead of the load itself. Noticed by inspection. Not sure how to test this because this just prevents the introduction of an extend that will later be truncated and will probably be combined out. llvm-svn: 324568
-
Craig Topper authored
[DAGCombiner] Don't create truncate nodes in (aext (zextload x)) -> (zextload x) and similar folds. NFCI. The truncate was being used to replace other users of the load, but we checked that the load only has one use, so there are no other uses to replace. llvm-svn: 324567
-
Peter Collingbourne authored
llvm-svn: 324565
-
Francis Visoiu Mistrih authored
Instead of:
  %bb.1: derived from LLVM BB %for.body
print:
  bb.1.for.body:
Also use MIR syntax for MBB attributes like "align", "landing-pad", etc. llvm-svn: 324563
-
Craig Topper authored
[DAGCombiner] Avoid creating truncate nodes in the (zext (and (load)))->(and (zextload)) fold until we know for sure we're going to need it. NFCI. The truncate is only needed if the load has additional users. It used to get passed to extendSetCCUses, so it was created early, but that's no longer the case. llvm-svn: 324562
-
Craig Topper authored
We were calling a load LN0, but it came from N0.getOperand(0), so it's really more like LN00 if we follow the naming used in other places. llvm-svn: 324561
-
Yonghong Song authored
LowerSELECT_CC is not generating the optimal Select_Ri pattern at the moment: it is not guaranteed to place the ConstantNode at the RHS, which misses matching Select_Ri. A new test case is added to the existing select_ri.ll; an existing case in cmp.ll would also be improved to use Select_Ri after this patch, and it is adjusted accordingly.
Reported-by: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Reviewed-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
llvm-svn: 324560
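A hypothetical C++ snippet (not from the patch) showing the kind of source pattern involved: a select guarded by a comparison against a constant. If instruction selection does not place the constant on the RHS of the comparison, the immediate form Select_Ri fails to match and a register-register compare is used instead.

```cpp
// Illustrative only; 'pick' and its parameters are made up.
unsigned pick(unsigned a, unsigned x, unsigned y) {
  return (a > 10u) ? x : y;  // the constant 10 should end up as the RHS immediate
}
```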
-
Peter Collingbourne authored
The LTO opt level should not affect the codegen opt level, and indeed it does not affect it in lld. Ideally the codegen opt level should be controlled by an IR-level attribute based on the compile-time opt level, but that hasn't been implemented yet. Differential Revision: https://reviews.llvm.org/D43040 llvm-svn: 324557
-
Matt Arsenault authored
Defs of operands outside of the instruction's explicit defs need to be checked. llvm-svn: 324554
-
Rafael Espindola authored
The issue is that clang was first creating an extern_weak hidden GV and then changing the linkage to external. Once we know it is not extern_weak, we know it must be dso_local. This patch refactors the code that sets the implicit dso_local into a private helper function that is used every time we change the linkage or visibility. I will commit a patch to clang in a minute. llvm-svn: 324551
-
Matt Arsenault authored
llvm-svn: 324550
-
Justin Lebar authored
llvm-svn: 324549
-
Stanislav Mekhanoshin authored
The code reusing existing wait counts is incorrect since it keeps adding new operands to an old instruction instead of replacing the immediate. It was also effectively switched off by the condition that the wait count is not an AMDGPU::S_WAITCNT. Also switched to BuildMI instead of creating instructions directly. Differential Revision: https://reviews.llvm.org/D42997 llvm-svn: 324547
-
Chandler Carruth authored
hit from IR but creates a minefield for MI passes. The x86 backend has fairly powerful logic to try and fold loads that feed register operands to instructions into a memory operand on the instruction. This is almost always a good thing, but there are specific relocated loads that are only allowed to appear in specific instructions. Notably, R_X86_64_GOTTPOFF is only allowed in `movq` and `addq`. This patch blocks folding of memory operands using this relocation unless the target is in fact `addq`. The particular relocation indicates why we simply don't hit this under normal circumstances. This relocation is only used for TLS, and it gets used in very specific ways in conjunction with %fs-relative addressing. The result is that loads using this relocation are essentially never eligible for folding into an instruction's memory operands. Unless, of course, you have an MI pass that inserts usage of such a load. I have exactly such an MI pass and was greeted by truly mysterious miscompiles where the linker replaced my instruction with a completely garbage byte sequence. Go team. This is the only such relocation I'm aware of in x86, but there may be others that need to be similarly restricted. Fixes PR36165. Differential Revision: https://reviews.llvm.org/D42732 llvm-svn: 324546
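A hedged illustration (not part of the patch) of where R_X86_64_GOTTPOFF typically appears: initial-exec TLS accesses, where the variable's offset is loaded from the GOT with a `movq` (or folded into an `addq`) and then dereferenced relative to %fs. The variable name below is made up.

```cpp
// For an initial-exec TLS access like this, compilers commonly emit roughly:
//   movq  tls_counter@GOTTPOFF(%rip), %rax   // load the TLS offset from the GOT
//   movl  %fs:(%rax), %eax                   // read the variable via %fs
// which is why such relocated loads are essentially never folding candidates.
extern thread_local int tls_counter;  // hypothetical variable

int readCounter() { return tls_counter; }
```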
-
Mircea Trofin authored
Summary: Loops with inequality comparisons, such as:
  // unsigned bound
  for (unsigned i = 1; i < bound; ++i) {...}
have getSmallConstantMaxTripCount report a large maximum static trip count - in this case, 0xfffffffe. However, profiling info may show that the trip count is much smaller, and thus counter-recommend vectorization. This change:
- flips loop-vectorize-with-block-frequency on by default.
- validates that profiled loop frequency data supports vectorization when static info appears to not counter-recommend it.
Absence of profile data means we rely on static data, just as we've done so far. Reviewers: twoh, mkuper, davidxl, tejohnson, Ayal Reviewed By: davidxl Subscribers: bkramer, llvm-commits Differential Revision: https://reviews.llvm.org/D42946 llvm-svn: 324543
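A minimal C++ sketch of the loop shape described above; the names are illustrative. The strict inequality against an unsigned bound gives a static maximum trip count near UINT_MAX even when the loop runs few iterations in practice.

```cpp
// getSmallConstantMaxTripCount can only bound this loop statically at ~0xfffffffe,
// but profile data may show it executes far fewer iterations.
void scale(float *a, unsigned bound) {
  for (unsigned i = 1; i < bound; ++i)
    a[i] *= 2.0f;
}
```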
-
- Feb 07, 2018
-
-
Craig Topper authored
[X86] Prune some unreachable 'return SDValue()' paths from LowerSIGN_EXTEND/LowerZERO_EXTEND/LowerANY_EXTEND. We were doing a lot of whitelisting of what we handle in these routines, but setOperationAction constrains what we can get here. So just add some asserts and prune the unreachable paths. llvm-svn: 324538
-
Craig Topper authored
[X86] Remove dead code from EmitTest that looked for an i1 type which should have already been type legalized away. NFC llvm-svn: 324536
-
Craig Topper authored
[X86] When doing callee save/restore for k-registers make sure we don't use KMOVQ on non-BWI targets. If we are saving/restoring k-registers, the default behavior of getMinimalRegisterClass will find the VK64 class with a spill size of 64 bits. This will cause the KMOVQ opcode to be used for save/restore. If we don't have BWI instructions we need to constrain the class returned to give us VK16 with a 16-bit spill size. We can do this by passing either v16i1 or v64i1 into getMinimalRegisterClass. Also add asserts to make sure BWI is enabled anytime we use KMOVD/KMOVQ. These are what caught this bug. Fixes PR36256 Differential Revision: https://reviews.llvm.org/D42989 llvm-svn: 324533
-
Craig Topper authored
llvm-svn: 324530
-
Momchil Velikov authored
Revert commit r324489, it broke LLDB tests. llvm-svn: 324511
-
Alexey Bataev authored
llvm-svn: 324510
-
Zachary Turner authored
This patch enables PDB generation for the Release build, which uses slightly different optimization options than RelWithDebInfo on Windows. This helps identify the slow parts of a Release build when profiling. Patch by Takuto Ikuta Differential Revision: https://reviews.llvm.org/D42632 llvm-svn: 324504
-
Craig Topper authored
llvm-svn: 324497
-
Rafael Espindola authored
This reverts commit r324487. It broke clang tests. llvm-svn: 324494
-
Jonas Devlieghere authored
Revert "[dsymutil][test] Check the updated dSYM instead of companion file." Revert "[dsymutil] Upstream update feature." llvm-svn: 324493
-
Nirav Dave authored
Traversing all chain paths to the first non-TokenFactor node can be exponential work. Add a simple redundancy check to avoid this. Fixes PR36264. llvm-svn: 324491
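A generic C++ sketch, not the DAGCombiner code, of the problem and the fix described here: walking every path through a DAG of chain nodes revisits shared nodes once per path, which is exponential; remembering already-visited nodes keeps the walk linear.

```cpp
#include <unordered_set>
#include <vector>

struct Node { std::vector<Node *> preds; };  // toy stand-in for chain nodes

// The visited set is the "redundancy check": without it, every path through a
// shared predecessor is re-walked, exploding combinatorially on wide DAGs.
static void walk(Node *n, std::unordered_set<Node *> &visited) {
  if (!visited.insert(n).second)
    return;  // already handled this node and everything reachable from it
  for (Node *p : n->preds)
    walk(p, visited);
}
```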
-
Momchil Velikov authored
This patch is the LLVM part of fixing the issues described in https://bugs.llvm.org/show_bug.cgi?id=36168
* The representation of enumerator values in the debug info metadata now contains a boolean flag isUnsigned, which determines how the bits of the value are interpreted.
* The DW_TAG_enumeration_type DIE now always (for DWARF version >= 3) includes a DW_AT_type attribute, which refers to the underlying integer type, as suggested in DWARFv4 (5.7 Enumeration Type Entries).
* The debug info metadata for an enumeration type contains (in flags) an indication of whether this is a C++11 "fixed enum".
* For a C++11 enumeration with a fixed underlying type, the DIE also includes the DW_AT_enum_class attribute (for DWARF version >= 4).
* Encoding of enumerator constants uses DW_FORM_sdata for signed values and DW_FORM_udata for unsigned values, as suggested by DWARFv4 (7.5.4 Attribute Encodings).
The changes should be backwards compatible:
* the isUnsigned attribute is optional and defaults to false.
* if the underlying type for the enumeration is not available, the enumerator values are considered signed.
* the FixedEnum flag defaults to clear.
* the bitcode format for DIEnumerator stores the unsigned flag in bit #1 of the first record element, so the format does not change and the zero previously stored there is consistent with the false default for IsUnsigned.
Differential Revision: https://reviews.llvm.org/D42734 llvm-svn: 324489
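A small, hypothetical C++ example of the distinction drawn above: how enumerator bits are interpreted now depends on the (possibly fixed) underlying type, and fixed enums are flagged in the debug info.

```cpp
// Illustrative only; the enum names are made up.
enum Plain { kTop = -1 };                         // no fixed underlying type:
                                                  // enumerator treated as signed
enum Fixed : unsigned int { kBig = 0x80000000 };  // C++11 fixed underlying type:
                                                  // DW_AT_type refers to
                                                  // 'unsigned int' and the value
                                                  // uses DW_FORM_udata
```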
-
Marek Olsak authored
Note: This is a candidate for LLVM 6.0, because it was planned to be in that release but was delayed due to a long review period. Merge conflict in release_60 - resolution: Add "-p6:32:32" into the second (non-amdgiz) string. Only scalar loads support 32-bit pointers. An address in a VGPR will fail to compile. That's OK because the results of loads will only be used in places where VGPRs are forbidden. Updated AMDGPUAliasAnalysis and used SReg_64_XEXEC. The tests cover all uses cases we need for Mesa. Reviewers: arsenm, nhaehnle Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D41651 llvm-svn: 324487
-
Marek Olsak authored
Summary: I checked the AMD closed source compiler and the workaround is only needed when x3 is emulated as x4, which we don't do in LLVM. SMEM x3 opcodes don't exist, and instead there is a possibility to use x4 with the last component being unused. If the last component is out of buffer bounds and falls on the next 4K page, the hw hangs. Reviewers: arsenm, nhaehnle Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, llvm-commits, t-tye Differential Revision: https://reviews.llvm.org/D42756 llvm-svn: 324486
-
Simon Pilgrim authored
SSE and shorter vector sizes will have to wait until we can add support for general SMIN/SMAX matching. llvm-svn: 324485
-
Jonas Devlieghere authored
This patch has llvm-dwarfdump check the whole dSYM, rather than the hard-coded path to the Mach-O companion file. This might be what's causing the Windows bot to fail. llvm-svn: 324483
-
Clement Courbet authored
llvm-svn: 324482
-
Jonas Devlieghere authored
Now that dsymutil can generate accelerator tables, we can upstream the update logic that, as the name implies, updates the accelerator tables in an existing dSYM bundle. In combination with `-minimize` this can be used to remove redundant .debug_(inlines|pubtypes|pubnames). Differential revision: https://reviews.llvm.org/D42880 llvm-svn: 324480
-
Simon Pilgrim authored
llvm-svn: 324479
-
Benjamin Kramer authored
llvm-svn: 324478
-
Simon Atanasyan authored
llvm-svn: 324477
-
Simon Atanasyan authored
Both operand codes now work the same way for register and memory operands. They print the high-order or low-order word of a double-word register or memory location. llvm-svn: 324476
-
Pavel Labath authored
The implementation of the function was deleted in r324426. This also removes the declaration. llvm-svn: 324474
-
Max Kazantsev authored
The failures happened because of an assert which was overconfident about SCEV's proving capabilities and is generally not valid. Differential Revision: https://reviews.llvm.org/D42835 llvm-svn: 324473
-
Clement Courbet authored
With fixes from rL324341. Original commit message: [MergeICmps] Enable the MergeICmps Pass by default. Summary: Now that PR33325 is fixed, this should always improve the generated code. Reviewers: spatel Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D42793 llvm-svn: 324465
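A hedged C++ example of the kind of code the MergeICmps pass targets: a chain of equality comparisons over adjacent fields, which can be merged into a single memcmp-style block comparison.

```cpp
// Illustrative struct; any contiguous, padding-free layout behaves the same way.
struct Point { int x, y, z; };

bool equal(const Point &a, const Point &b) {
  return a.x == b.x && a.y == b.y && a.z == b.z;  // mergeable compare chain
}
```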
-