- Mar 08, 2019
-
-
Shoaib Meenai authored
LLVM is always built; including it in LLVM_ENABLE_PROJECTS has no effect, but since it's in LLVM_ALL_PROJECTS, we produce a confusing message about it being disabled. Drop it from LLVM_ALL_PROJECTS to avoid this. Pointed out by David Greene on the mailing list [1]. [1] http://lists.llvm.org/pipermail/llvm-dev/2019-March/130854.html llvm-svn: 355735
-
Mitch Phillips authored
llvm-svn: 355734
-
Michael Kruse authored
Commit r355068 "Fix IR/Analysis layering issue with OptBisect" uses the template return Gate.isEnabled() && !Gate.shouldRunPass(this, getDescription(...)); for all pass kinds. For the RegionPass, it left out the not operator, causing region passes to be skipped as soon as a pass gate is used. llvm-svn: 355733
-
Matt Arsenault authored
When matching half of the build_vector to a load, there could still be a hidden dependency on the other half of the build_vector the pattern wouldn't detect. If there was an additional chain dependency on the other value, a cycle could be introduced. I don't think a tablegen pattern is capable of matching the necessary conditions, so move this into PreprocessISelDAG. Check isPredecessorOf for the other value to avoid a cycle. This has a warning that it's expensive, so this should probably be moved into an MI pass eventually that will have more freedom to reorder instructions to help match this. That is currently complicated by the lack of a computeKnownBits type mechanism for the selected function. llvm-svn: 355731
-
Matt Arsenault authored
This avoids breaking possible value dependencies when sorting loads by offset. AMDGPU has some load instructions that write into the high or low bits of the destination register, and have a tied input for the other input bits. These can easily have the same base pointer, but be a swizzle so the high address load needs to come first. This was inserting glue forcing the opposite ordering, producing a cycle the InstrEmitter would assert on. It may be potentially expensive to look for the dependency between the other loads, so just skip any where this could happen. Fixes bug 40936 by reverting r351379, which added a hacky attempt to fix this by adding chains in this case, which I think was just working around broken glue before the InstrEmitter. The core of the patch is re-implementing the fix for that problem. llvm-svn: 355728
-
Sanjay Patel authored
llvm-svn: 355727
-
Matthew Voss authored
This broke the windows bots. This reverts commit 28302c66. llvm-svn: 355725
-
Matt Arsenault authored
Also fix a few cases that weren't testing what they were supposed to. llvm-svn: 355724
-
Matt Arsenault authored
This is only called in contexts that are verifying the chain itself, and the query itself is only asking about the address. llvm-svn: 355723
-
Matt Arsenault authored
This was checking the wrong operands for the base register and the offsets. The indexes are shifted by the number of output registers from the machine instruction definition, and the chain is moved to the end. llvm-svn: 355722
-
Alexey Bataev authored
Summary: If the LLVM module shows that it has debug info, but the file is actually empty and the real debug info is not emitted, the ptxas tool emits error 'Debug information not found in presence of .target debug'. We need at leas one empty debug section to silence this message. Section `.debug_loc` is not emitted for PTX and we can emit empty `.debug_loc` section if `debug` option was emitted. Reviewers: tra Subscribers: jholewinski, aprantl, llvm-commits Differential Revision: https://reviews.llvm.org/D57250 llvm-svn: 355719
-
Amaury Sechet authored
Summary: This pattern is sometime created after legalization. Reviewers: efriedma, spatel, RKSimon, zvi, bkramer Subscribers: llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D58874 llvm-svn: 355716
-
George Burgess IV authored
Patch by Enna1! Differential Revision: https://reviews.llvm.org/D58756 llvm-svn: 355715
-
Wei Mi authored
many valnos. Recently we found compile time out problem in several cases when SpeculativeLoadHardening was enabled. The significant compile time was spent in register coalescing pass, where register coalescer tried to join many other live intervals with some very large live intervals with many valnos. Specifically, every time JoinVals::mapValues is called, computeAssignment will be called by getNumValNums() times of the target live interval. If the large live interval has N valnos and has N copies associated with it, trying to coalescing those copies will at least cost N^2 complexity. The patch adds some limit to the effort trying to join those very large live intervals with others. By default, for live interval with > 100 valnos, and when it has been coalesced with other live interval by more than 100 times, we will stop coalescing for the live interval anymore. That put a compile time cap for the N^2 algorithm and effectively solves the compile time problem we saw. Differential revision: https://reviews.llvm.org/D59143 llvm-svn: 355714
-
Sanjay Patel authored
llvm-svn: 355713
-
Simon Pilgrim authored
llvm-svn: 355712
-
Diogo N. Sampaio authored
The indexed variant of vfmal.f16 and vfmsl.f16 instructions use the uppser bits of the indexed operand to store the index (1 bit for the double variant, 2 bits for the quad). This limits the usable registers to d0 - d7 or s0 - s15. This patch enforces this limitation. Differential Revision: https://reviews.llvm.org/D59021 llvm-svn: 355707
-
Simon Pilgrim authored
llvm-svn: 355699
-
James Henderson authored
llvm-readelf prints relocation addends as: <symbol value>[+-]<absolute addend> where [+-] is determined from whether addend is less than zero or not. However, it does not print the +/- if there is no symbol, which meant that negative addends became their positive value with no indication that this had happened. This patch stops the absolute conversion when addends are negative and there is no associated symbol. Reviewed by: Higuoxing, mattd, MaskRay Differential Revision: https://reviews.llvm.org/D59095 llvm-svn: 355696
-
Nico Weber authored
llvm-svn: 355695
-
Nico Weber authored
From the Python subprocess docs: If shell is True, it is recommended to pass args as a string rather than as a sequence. [...] If args is a sequence, the first item specifies the command string, and any additional items will be treated as additional arguments to the shell itself. Prior to this change, the `--version` would be passed to the shell, not to a potential gn binary on $PATH, and running `gn` without any arguments makes it exit with an exit code != 0, so the script would think that there wasn't a working gn binary on $PATH. Fix this by following the documentation's recommendation of using a string now that we pass shell=True. I tested this on macOS and Windows, each with the three cases of - no gn on PATH (should run gn downloaded by get.py if present, else suggest running get.py) - broken gn wrapper on PATH (should behave like the previous item) - working gn on PATH (should use gn on PATH) llvm-svn: 355694
-
Nico Weber authored
`os.uname()` doesn't exist on Windows, so use `platform.machine()` which returns `os.uname()[4]` on non-Win and (on 64-bit systems) "AMD64" on Windows. Also use `sys.platform` instead of `platform` to check for Windows-ness for the file extension in gn.py (get.py got this right). Differential Revision: https://reviews.llvm.org/D59115 llvm-svn: 355693
-
Simon Pilgrim authored
llvm-svn: 355690
-
Simon Pilgrim authored
llvm-svn: 355689
-
Simon Pilgrim authored
llvm-svn: 355688
-
Michael Platings authored
Use this feature to fix a bug on ARM where 4 byte alignment is incorrectly assumed. Differential Revision: https://reviews.llvm.org/D57335 llvm-svn: 355685
-
Clement Courbet authored
Summary: Right now, when we encounter a string equality check, e.g. `if (memcmp(a, b, s) == 0)`, we try to expand to a comparison if `s` is a small compile-time constant, and fall back on calling `memcmp()` else. This is sub-optimal because memcmp has to compute much more than equality. This patch replaces `memcmp(a, b, s) == 0` by `bcmp(a, b, s) == 0` on platforms that support `bcmp`. `bcmp` can be made much more efficient than `memcmp` because equality compare is trivially parallel while lexicographic ordering has a chain dependency. Subscribers: fedor.sergeev, jyknight, ckennelly, gchatelet, llvm-commits Differential Revision: https://reviews.llvm.org/D56593 llvm-svn: 355672
-
Carl Ritson authored
Summary: Fix a bug in the scheduling model where V_CVT_F32_UBYTE{0,1,2,3} are incorrectly marked as quarter rate instructions. Reviewers: arsenm, rampitec Reviewed By: rampitec Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D59091 llvm-svn: 355671
-
Craig Topper authored
We were just checking pointer size and type primitive size. But this caused unintended things like vectors of half being accepted by masked load/store. For FP we now explicitly check for only double and float. For pointers we now let any pointer through. Trusting that only 32 and 64 would be used to generate assembly. We only check bitwidth after checking that the type is an integer. llvm-svn: 355667
-
Petr Hosek authored
This change is a consequence of the discussion in "RFC: Place libs in Clang-dedicated directories", specifically the suggestion that libunwind, libc++abi and libc++ shouldn't be using Clang resource directory. Tools like clangd make this assumption, but this is currently not true for the LLVM_ENABLE_PER_TARGET_RUNTIME_DIR build. This change addresses that by moving the output of these libraries to lib/<target> and include/ directories, leaving resource directory only for compiler-rt runtimes and Clang builtin headers. Differential Revision: https://reviews.llvm.org/D59013 llvm-svn: 355665
-
Steven Wu authored
Summary: In r349534, objc arc implementation is switched to use intrinsics and at the same time, clang.arc.use is renamed to llvm.objc.clang.arc.use to make the naming more consistent. The side-effect of that is llvm no longer recognize it as intrinsics and codegen external references to it instead. Rather than upgrade the old intrinsics name to the new one and wait for the arc-contract pass to remove it, simply remove it in the bitcode upgrader. rdar://problem/48607063 Reviewers: pete, ahatanak, erik.pilkington, dexonsmith Reviewed By: pete, dexonsmith Subscribers: jkorous, jdoerfert, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D59112 llvm-svn: 355663
-
Sanjay Patel authored
llvm-svn: 355655
-
Adrian Prantl authored
to get the modules bots running again. The LLVM_DEBUG macro only plays well with a modular build of LLVM when the header is marked as textual, but doing so causes redefinition errors. llvm-svn: 355653
-
Adrian Prantl authored
I think the problem is that it uses the LLVM_DEBUG macro in funciton bodies. llvm-svn: 355652
-
- Mar 07, 2019
-
-
Adrian Prantl authored
llvm-svn: 355646
-
Mitch Phillips authored
Use the system shell to see if we can find a 'gn' binary on $PATH. This solves the error wherein subprocess.call fails ungracefully if the binary doesn't exist. llvm-svn: 355645
-
Hubert Tong authored
Summary: The date-based approach to detecting unsupported versions of libstdc++ does not handle bug fix releases of older versions. As an example, the `__GLIBCXX__` value associated with version 5.1, `20150422`, is less than the values associated with versions 4.8.5 and 4.9.3. This patch adds secondary checks based on certain properties in sufficiently new versions of libstdc++. Reviewers: jfb, tstellar, rnk, sfertile, nemanjai Reviewed By: jfb Subscribers: mgorny, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D58682 llvm-svn: 355638
-
Craig Topper authored
Rotate with explicit immediate is a single uop from Haswell on. An immediate of 1 has a dependency on the previous writer of flags, but the other immediate values do not. The implicit rotate by 1 instruction is 2 uops. But the flags are merged after the rotate uop so the data result does not see the flag dependency. But I don't think we have any way of modeling that. RORX is 1 uop without the load. 2 uops with the load. We currently model these with WriteShift/WriteShiftLd. Differential Revision: https://reviews.llvm.org/D59077 llvm-svn: 355636
-
Craig Topper authored
Haswell and possibly Sandybridge have an optimization for ADC/SBB with immediate 0 to use a single uop flow. This only applies GR16/GR32/GR64 with an 8-bit immediate. It does not apply to GR8. It also does not apply to the implicit AX/EAX/RAX forms. Differential Revision: https://reviews.llvm.org/D59058 llvm-svn: 355635
-
Brian Gesiak authored
Summary: The logic in the -unreachableblockelim pass does the following: 1. It traverses the function it's given in depth-first order and creates a set of basic blocks that are unreachable from the function's entry node. 2. It iterates over each of those unreachable blocks and (1) removes any successors' references to the dead block, and (2) replaces any uses of instructions from the dead block with null. The logic in (2) above is identical to what the `llvm::DeleteDeadBlocks` function from `BasicBlockUtils.h` does. The only difference is that `llvm::DeleteDeadBlocks` replaces uses of instructions from dead blocks not with null, but with undef. Replace the duplicate logic in the -unreachableblockelim pass with a call to `llvm::DeleteDeadBlocks`. This results in less code but no functional change (NFC). Reviewers: mkazantsev, wmi, davidxl, silvas, davide Reviewed By: davide Subscribers: llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D59064 llvm-svn: 355634
-