- Jun 27, 2018
-
-
Fangrui Song authored
llvm-svn: 335771
-
Ben Hamilton authored
Summary: This PR adds a few acronyms related to hashing algorithms to the standard list in `objc-property-declaration`. Reviewers: Wizard Reviewed By: Wizard Subscribers: cfe-commits Differential Revision: https://reviews.llvm.org/D48652 llvm-svn: 335770
-
Fangrui Song authored
llvm-svn: 335769
-
Craig Topper authored
[X86] Teach the disassembler to use %eiz/%riz instead of NoRegister when the SIB byte is present, but doesn't encode an index register and there was another shorter encoding that would achieve the same result. The %eiz/%riz are dummy registers that force the encoder to emit a SIB byte when it normally wouldn't. By emitting them in the disassembly output we ensure that assembling the disassembler output would also produce a SIB byte. This should match the behavior of objdump from binutils. llvm-svn: 335768
-
Daniel Sanders authored
Now that we have the ability to legalize based on MMO's. Add support for legalizing based on AtomicOrdering and use it to correct the legalization of the atomic instructions. Also extend all() to be a variadic template as this ruleset now requires 3 and 4 argument versions. llvm-svn: 335767
-
Teresa Johnson authored
Fix test changes added in r335760. Even though we are invoking llvm-lto2 in single threaded mode, the order of processing the modules in the backend is apparently not deterministic. Handle the expected debug messages in any order. (The determinism would be good to fix, but not related to this change.) This also undoes the change I made in r335764 to help debug this. llvm-svn: 335766
-
Aaron Enye Shi authored
Summary: Use oclc_daz_opt_on.amdgcn.bc bitcode when option fcuda-flush-denormal-to-zero is enabled, otherwise use oclc_daz_opt_off.amdgcn.bc bitcode. Added lit tests to verify that the correct bitcode is linked when -fcuda-flush-denormal-to-zero option is enabled or disabled. Reviewers: yaxunl, scchan, b-sumner Reviewed By: yaxunl, scchan, b-sumner Subscribers: cfe-commits, yaxunl Differential Revision: https://reviews.llvm.org/D48493 llvm-svn: 335765
-
Teresa Johnson authored
I am getting bot failures from r335760 that are difficult to diagnose since the stderr is getting redirected to FileCheck. Save and dump the debug output to stderr to help debug the issue. llvm-svn: 335764
-
Artem Belevich authored
This matches the way NVCC does it. Doing module cleanup at global destructor phase used to work, but is, apparently, too late for the CUDA runtime in CUDA-9.2, which ends up crashing with double-free. Differential Revision: https://reviews.llvm.org/D48613 llvm-svn: 335763
-
Matt Morehouse authored
Summary: Setting UBSAN_OPTIONS=silence_unsigned_overflow=1 will silence all UIO reports. This feature, combined with -fsanitize-recover=unsigned-integer-overflow, is useful for providing fuzzing signal without the excessive log output. Helps with https://github.com/google/oss-fuzz/issues/910. Reviewers: kcc, vsk Reviewed By: vsk Subscribers: vsk, kubamracek, Dor1s, llvm-commits Differential Revision: https://reviews.llvm.org/D48660 llvm-svn: 335762
-
Sanjay Patel authored
As noted in the D44909 review, the transform from (fptosi+sitofp) to ftrunc can produce -0.0 where the original code does not: #include <stdio.h> int main(int argc) { float x; x = -0.8 * argc; printf("%f\n", (float)((int)x)); return 0; } $ clang -O0 -mavx fp.c ; ./a.out 0.000000 $ clang -O1 -mavx fp.c ; ./a.out -0.000000 Ideally, we'd use IR/node flags to predicate the transform, but the IR parser doesn't currently allow fast-math-flags on the cast instructions. So for now, just use the function attribute that corresponds to clang's "-fno-signed-zeros" option. Differential Revision: https://reviews.llvm.org/D48085 llvm-svn: 335761
-
Teresa Johnson authored
Summary: Rather than just print the GUID, when it is available in the index, print the global name as well in the function import thin link debug messages. Names will be available when the combined index is being built by the same process, e.g. a linker or "llvm-lto2 run". Reviewers: davidxl Subscribers: mehdi_amini, inglorion, eraman, steven_wu, llvm-commits Differential Revision: https://reviews.llvm.org/D48612 llvm-svn: 335760
-
Justin Bogner authored
If you've already loaded an IRObjectFile and need access to the Modules themselves you shouldn't have to reparse a byte stream to do it. Adds an accessor for the modules in IRObjectFile. llvm-svn: 335759
-
Jessica Paquette authored
It isn't safe to outline sequences of instructions where x16/x17/nzcv live across the sequence. This teaches the outliner to check whether or not a specific canidate has x16/x17/nzcv live across it and discard the candidate in the case that that is true. https://bugs.llvm.org/show_bug.cgi?id=37573 https://reviews.llvm.org/D47655 llvm-svn: 335758
-
Jonas Devlieghere authored
As brought up during the discussion of the DWARF5 accelerator tables, there is currently no way to associate Objective-C methods with the interface they belong to, other than the .apple_objc accelerator table. After due consideration we came to the conclusion that it makes more sense to follow Pavel's suggestion of just emitting this information in the .debug_info section. One concern was that categories were emitted in the .apple_names as well, but it turns out that LLDB doesn't rely on the accelerator tables for this information. This patch changes the codegen behavior to emit subprograms for structure types, like we do for C++. This will result in the DW_TAG_subprogram being nested as a child under its DW_TAG_structure_type. This behavior is only enabled for DWARF5 and later, so we can have a unique code path in LLDB with regards to obtaining the class methods. This was tested on the LLDB side and doesn't lead to a regression. There's already code in place to deal with member functions in C++, which deals with this transparently. For more background please refer to the discussion on the mailing list: http://lists.llvm.org/pipermail/llvm-dev/2018-June/123986.html Differential revision: https://reviews.llvm.org/D48241 llvm-svn: 335757
-
Sanjay Patel authored
llvm-svn: 335756
-
Petr Hosek authored
The zx_cprng_draw system call no longer takes the output argument. Differential Revision: https://reviews.llvm.org/D48657 llvm-svn: 335755
-
Craig Topper authored
If we are just modifying a single bit at a variable bit position we can use the BT* instructions to make the change instead of shifting a 1(or rotating a -1) and doing a binop. These instruction also ignore the upper bits of their index input so we can also remove an and if one is present on the index. Fixes PR37938. llvm-svn: 335754
-
Craig Topper authored
llvm-svn: 335753
-
Mikhail R. Gadelha authored
This broke a number of bots. This reverts commit 5e1a89912d37a21c3b49ccf30600d7f498dffa9c. llvm-svn: 335752
-
Jakub Kuderski authored
Summary: AliasSet::print uses `I->printAsOperand` to print UnknownInstructions. The problem is that not all UnknownInstructions have names (e.g. call instructions). When such instructions are printed, they appear as `<badref>` in AliasSets, which is very confusing, as the values are perfectly valid. This patch fixes that by printing UnknownInstructions without a name using `print` instead of `printAsOperand`. Reviewers: asbirlea, chandlerc, sanjoy, grosser Reviewed By: asbirlea Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D48609 llvm-svn: 335751
-
Francis Visoiu Mistrih authored
Fails on Green Dragon: http://green.lab.llvm.org/green/job/clang-stage1-cmake-RA-incremental/50174/consoleFull UNRESOLVED: Clang :: CodeGen/vld_dup.c (5546 of 38947) ******************** TEST 'Clang :: CodeGen/vld_dup.c' FAILED ******************** Test has no run line! llvm-svn: 335750
-
Jonas Devlieghere authored
This patch splits off some abstractions used by dsymutil's dwarf linker and moves them into separate header and implementation files. This almost halves the number of LOC in DwarfLinker.cpp and makes it a lot easier to understand what functionality lives where. Differential revision: https://reviews.llvm.org/D48647 llvm-svn: 335749
-
Matt Davis authored
Summary: This patch removes a few callbacks from Pipeline. It comes at the cost of registering Listeners with all Stages. Not all stages need listeners or issue callbacks, this registration is a bit redundant. However, as we build-out the API, this redundancy can disappear. The main purpose here is to move callback code from the Pipeline and into the stages that actually issue those callbacks. This removes the back-pointer to the Pipeline that was put into a few Stage subclasses. Reviewers: andreadb, courbet, RKSimon Reviewed By: andreadb, courbet Subscribers: tschuett, gbedwell, llvm-commits Differential Revision: https://reviews.llvm.org/D48576 llvm-svn: 335748
-
Vedant Kumar authored
On Darwin/x86_64, asan may report the crashing line of NullDeref as line 19 (i.e the closing brace of the function), whereas on other targets we see line 15 ("ptr[10]++"). The optimized debug info here isn't reliable enough to check. rdar://problem/41526369 llvm-svn: 335747
-
Simon Pilgrim authored
Increase coverage to make sure we're not doing anything stupid without AVX512BW llvm-svn: 335746
-
Craig Topper authored
llvm-svn: 335745
-
Craig Topper authored
[X86] Rename the autoupgraded of packed fp compare and fpclass intrinsics that don't take a mask as input to exclude '.mask.' from their name. I think the intrinsics named 'avx512.mask.' should refer to the previous behavior of taking a mask argument in the intrinsic instead of using a 'select' or 'and' instruction in IR to accomplish the masking. This is more consistent with the goal that eventually we will have no intrinsics that have masking builtin. When we reach that goal, we should have no intrinsics named "avx512.mask". llvm-svn: 335744
-
Fangrui Song authored
Post commit review at D48406 llvm-svn: 335743
-
Stanislav Mekhanoshin authored
If a source of rcp instruction is a result of any conversion from an integer convert it into rcp_iflag instruction. No FP exception can ever happen except division by zero if a single precision rcp argument is a representation of an integral number. Differential Revision: https://reviews.llvm.org/D48569 llvm-svn: 335742
-
Vedant Kumar authored
On some ARM platforms this test depends on debug locations being present on constant materialization code, which was eliminated in r335497. Relax the test to allow two outcomes: the backtrace either contains the right line numbers, or no line numbers. llvm-svn: 335741
-
Alexander Kornienko authored
Summary: Add an extension point to allow registration of statically-linked Clang Static Analyzer checkers that are not a part of the Clang tree. This extension point employs the mechanism used when checkers are registered from dynamically loaded plugins. Reviewers: george.karpenkov, NoQ, xazax.hun, dcoughlin Reviewed By: george.karpenkov Subscribers: mgorny, mikhail.ramalho, rnkovacs, xazax.hun, szepet, a.sidorin, cfe-commits Differential Revision: https://reviews.llvm.org/D45718 llvm-svn: 335740
-
Mikhail R. Gadelha authored
Signed-off-by:
Mikhail Ramalho <mikhail.ramalho@gmail.com> llvm-svn: 335739
-
George Rimar authored
Currently, ICF does not enable threading if we have less than 1024 sections in each equivalence class. And the following code is uncovered by our test cases: https://github.com/llvm-mirror/lld/blob/master/ELF/ICF.cpp#L404 This patch adds a test case that triggers the mentioned code to execute. llvm-svn: 335738
-
Luke Geeson authored
llvm-svn: 335737
-
Alexander Kornienko authored
[clang-tidy] Add ExprMutationAnalyzer, that analyzes whether an expression is mutated within a statement. Summary: (Originally started as a clang-tidy check but there's already D45444 so shifted to just adding ExprMutationAnalyzer) `ExprMutationAnalyzer` is a generally useful helper that can be used in different clang-tidy checks for checking whether a given expression is (potentially) mutated within a statement (typically the enclosing compound statement.) This is a more general and more powerful/accurate version of isOnlyUsedAsConst, which is used in ForRangeCopyCheck, UnnecessaryCopyInitialization. It should also be possible to construct checks like D45444 (suggest adding const to variable declaration) or https://bugs.llvm.org/show_bug.cgi?id=21981 (suggest adding const to member function) using this helper function. This function is tested by itself and is intended to stay generally useful instead of tied to any particular check. Reviewers: hokein, alexfh, aaron.ballman, ilya-biryukov, george.karpenkov Reviewed By: aaron.ballman Subscribers: lebedev.ri, shuaiwang, rnkovacs, hokein, alexfh, aaron.ballman, a.sidorin, Eugene.Zelenko, xazax.hun, JonasToth, klimek, mgorny, cfe-commits Tags: #clang-tools-extra Patch by Shuai Wang. Differential Revision: https://reviews.llvm.org/D45679 llvm-svn: 335736
-
Adhemerval Zanella authored
This patch adds a custom trunc store lowering for v4i8 vector types. Since there is not v.4b register, the v4i8 is promoted to v4i16 (v.4h) and default action for v4i8 is to extract each element and issue 4 byte stores. A better strategy would be to extended the promoted v4i16 to v8i16 (with undef elements) and extract and store the word lane which represents the v4i8 subvectores. The construction: define void @foo(<4 x i16> %x, i8* nocapture %p) { %0 = trunc <4 x i16> %x to <4 x i8> %1 = bitcast i8* %p to <4 x i8>* store <4 x i8> %0, <4 x i8>* %1, align 4, !tbaa !2 ret void } Can be optimized from: umov w8, v0.h[3] umov w9, v0.h[2] umov w10, v0.h[1] umov w11, v0.h[0] strb w8, [x0, #3] strb w9, [x0, #2] strb w10, [x0, #1] strb w11, [x0] ret To: xtn v0.8b, v0.8h str s0, [x0] ret The patch also adjust the memory cost for autovectorization, so the C code: void foo (const int *src, int width, unsigned char *dst) { for (int i = 0; i < width; i++) *dst++ = *src++; } can be vectorized to: .LBB0_4: // %vector.body // =>This Inner Loop Header: Depth=1 ldr q0, [x0], #16 subs x12, x12, #4 // =4 xtn v0.4h, v0.4s xtn v0.8b, v0.8h st1 { v0.s }[0], [x2], #4 b.ne .LBB0_4 Instead of byte operations. llvm-svn: 335735
-
Ivan A. Kosarev authored
This patch reworks the support for dup NEON intrinsics as described in D48439. Differential Revision: https://reviews.llvm.org/D48440 llvm-svn: 335734
-
Ivan A. Kosarev authored
This patch adds support for the q versions of the dup (load-to-all-lanes) NEON intrinsics, such as vld2q_dup_f16() for example. Currently, non-q versions of the dup intrinsics are implemented in clang by generating IR that first loads the elements of the structure into the first lane with the lane (to-single-lane) intrinsics, and then propagating it other lanes. There are at least two problems with this approach. First, there are no double-spaced to-single-lane byte-element instructions. For example, there is no such instruction as 'vld2.8 { d0[0], d2[0] }, [r0]'. That means we cannot rely on the to-single-lane intrinsics and instructions to implement the q versions of the dup intrinsics. Note that to-all-lanes instructions do support all sizes of data items, including bytes. The second problem with the current approach is that we need a separate vdup instruction to propagate the structure to each lane. So for vld4q_dup_f16() we would need four vdup instructions in addition to the initial vld instruction. This patch introduces dup LLVM intrinsics and reworks handling of the currently supported (non-q) NEON dup intrinsics to expand them into those LLVM intrinsics, thus eliminating the need for using to-single-lane intrinsics and instructions. Additionally, this patch adds support for u64 and s64 dup NEON intrinsics. These are marked as Arch64-only in the ARM NEON Reference, but it seems there are no reasons to not support them in AArch32 mode. Please correct, if that is wrong. That's what we generate with this patch applied: vld2q_dup_f16: vld2.16 {d0[], d2[]}, [r0] vld2.16 {d1[], d3[]}, [r0] vld3q_dup_f16: vld3.16 {d0[], d2[], d4[]}, [r0] vld3.16 {d1[], d3[], d5[]}, [r0] vld4q_dup_f16: vld4.16 {d0[], d2[], d4[], d6[]}, [r0] vld4.16 {d1[], d3[], d5[], d7[]}, [r0] Differential Revision: https://reviews.llvm.org/D48439 llvm-svn: 335733
-
Zaara Syeda authored
The local dynamic TLS access on PPC64 ELF v2 ABI uses R_PPC64_GOT_DTPREL16* relocations when a TLS variables falls outside 2 GB of the thread storage block. This patch adds support for these relocations by adding a new RelExpr called R_TLSLD_GOT_OFF which emits a got entry for the TLS variable relative to the dynamic thread pointer using the relocation R_PPC64_DTPREL64. It then evaluates the R_PPC64_GOT_DTPREL16* relocations as the got offset for the R_PPC64_DTPREL64 got entries. Differential Revision: https://reviews.llvm.org/D48484 llvm-svn: 335732
-