- Jun 08, 2018
-
-
Roman Lebedev authored
Summary: `%ret = add nuw i8 %x, C` From [[ https://llvm.org/docs/LangRef.html#add-instruction | langref ]]: nuw and nsw stand for “No Unsigned Wrap” and “No Signed Wrap”, respectively. If the nuw and/or nsw keywords are present, the result value of the add is a poison value if unsigned and/or signed overflow, respectively, occurs. So if `C` is `-1`, `%x` can only be `0`, and the result is always `-1`. I'm not sure we want to use `KnownBits`/`LVI` here, because there is exactly one possible value (all bits set, `-1`), so some other pass should take care of replacing the known-all-ones with constant `-1`. The `test/Transforms/InstCombine/set-lowbits-mask-canonicalize.ll` change *is* confusing. What is happening is, before this: (omitting `nuw` for simplicity) 1. First, InstCombine D47428/rL334127 folds `shl i32 1, %NBits` to `shl nuw i32 -1, %NBits` 2. Then, InstSimplify D47883/rL334222 folds `shl nuw i32 -1, %NBits` to `-1`, 3. `-1` is inverted to `0`. But now: 1. *This* InstSimplify fold `%ret = add nuw i32 %setbit, -1` -> `-1` happens first, before the InstCombine D47428/rL334127 fold could happen. Thus we now end up with the opposite constant, and it is all good: https://rise4fun.com/Alive/OA9 https://rise4fun.com/Alive/sldC Was mentioned in D47428 review. Follow-up for D47883. Reviewers: spatel, craig.topper Reviewed By: spatel Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D47908 llvm-svn: 334298
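The claim that `add nuw i8 %x, -1` can only produce `-1` can be checked by brute force. The following is a minimal sketch in plain C++, not LLVM code; the function name `countNonWrappingInputs` is made up for illustration. It models the `i8` addition in a wider type and skips the inputs that would wrap (those are poison, so the fold is free to pick any result for them).

```cpp
#include <cassert>
#include <cstdint>

// Illustrative only: walks every i8 value of x and models `add nuw i8 x, -1`.
// Returns how many inputs do NOT wrap; for each of those the sum must be 0xFF.
int countNonWrappingInputs() {
  const unsigned C = 0xFF; // the i8 constant -1, viewed as unsigned
  int count = 0;
  for (unsigned x = 0; x <= 0xFF; ++x) {
    unsigned wide = x + C;          // computed without 8-bit truncation
    if (wide > 0xFF)                // unsigned wrap: result would be poison
      continue;
    assert(static_cast<std::uint8_t>(wide) == 0xFF); // i.e. i8 -1
    ++count;
  }
  return count;
}
```

Only `x == 0` survives the wrap check, which matches the reasoning in the summary.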
-
Simon Pilgrim authored
These aren't true zero-idiom instructions (just dependency breaking). llvm-svn: 334297
-
Simon Pilgrim authored
[X86][BtVer2] Add tests for scalar SUB/XOR instructions that should match the dependency-breaking 'zero-idiom' As detailed on Agner's Microarchitecture doc (21.8 AMD Bobcat and Jaguar pipeline - Dependency-breaking instructions). llvm-svn: 334296
-
Alexander Kornienko authored
Summary: The function `llvm::sys::commandLineFitsWithinSystemLimits` appears to be overestimating the system limits. This issue was discovered while attempting to enable response files in the Swift compiler. When the compiler submits its frontend jobs, those jobs are subjected to the system limits on command line length. `commandLineFitsWithinSystemLimits` is used to determine if the job's arguments need to be wrapped in a response file. There are some cases where the argument size for the job passes `commandLineFitsWithinSystemLimits`, but actually exceeds the real system limit, and the job fails. `clang` also uses this function to decide whether or not to wrap its job arguments in response files. See: https://github.com/llvm-mirror/clang/blob/master/lib/Driver/Driver.cpp#L1341. Clang will also fail for response files whose size falls within a certain range. I wrote a script that should find a failure point for `clang++`. All that is needed to run it is Python 2.7, and a simple "hello world" program for `test.cc`. It should run on Linux and on macOS. The script is available here: https://gist.github.com/dabelknap/71bd083cd06b91c5b3cef6a7f4d3d427. When it hits a failure point, you should see a `clang: error: unable to execute command: posix_spawn failed: Argument list too long`. The proposed solution is to mirror the behavior of `xargs` in `commandLineFitsWithinSystemLimits`. `xargs` defaults to 128k for the command line length size (See: https://fossies.org/dox/findutils-4.6.0/buildcmd_8c_source.html#l00551). It adjusts this depending on the value of `ARG_MAX`. Reviewers: alexfh Reviewed By: alexfh Subscribers: llvm-commits Tags: #clang Patch by Austin Belknap! Differential Revision: https://reviews.llvm.org/D47795 llvm-svn: 334295
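A rough sketch of an `xargs`-style check follows. It is not the code from the patch: the helper name `fitsWithinLimit`, the explicit `argMax` parameter (which a caller would typically obtain from `sysconf(_SC_ARG_MAX)`), and the choice to keep half of the budget as headroom for the environment are all assumptions made for illustration. Only the 128k default and the dependence on `ARG_MAX` come from the summary.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper, not the LLVM implementation.
bool fitsWithinLimit(const std::string &program,
                     const std::vector<std::string> &args, long argMax) {
  assert(argMax > 0 && "expects a valid ARG_MAX value");
  const long kXargsDefault = 128 * 1024; // xargs' default, per the summary
  const long kPosixMinimum = 4096;       // _POSIX_ARG_MAX
  long effective = kXargsDefault;
  if (effective > argMax)
    effective = argMax;                  // never exceed the real limit
  if (effective < kPosixMinimum)
    effective = kPosixMinimum;           // POSIX guarantees at least this much
  // Assumption: reserve half of the budget for the environment block.
  const std::size_t budget = static_cast<std::size_t>(effective / 2);
  std::size_t used = program.size() + 1; // +1 for the terminating NUL
  for (const std::string &arg : args) {
    used += arg.size() + 1;
    if (used > budget)
      return false;                      // caller should use a response file
  }
  return true;
}
```

Passing `argMax` in, rather than querying it inside the function, keeps the sketch deterministic and easy to exercise with different limits.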
-
Zachary Turner authored
NFC here, this just raises some platform specific ifdef hackery out of a class and creates proper platform-independent typedefs for the relevant things. This allows these typedefs to be reused in other places without having to reinvent this preprocessor logic. llvm-svn: 334294
-
Zachary Turner authored
O_CLOEXEC is the right default, but occasionally you don't want this. This is especially true for tools like debuggers where you might need to spawn the child process with specific files already open, but it's occasionally useful in other scenarios as well, like when you want to do some IPC between parent and child. llvm-svn: 334293
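To make the trade-off concrete, here is a small POSIX sketch, assuming a Linux or macOS host. The helpers `openForRead` and `isCloseOnExec` are hypothetical names, not the flag or API added by this commit; they only show what opting out of `O_CLOEXEC` means at the file-descriptor level.

```cpp
#include <cassert>
#include <fcntl.h>
#include <unistd.h>

// Hypothetical helper: open a file read-only. By default the descriptor is
// close-on-exec; pass inheritable=true to let a spawned child keep it open.
int openForRead(const char *path, bool inheritable) {
  int flags = O_RDONLY;
  if (!inheritable)
    flags |= O_CLOEXEC;
  return ::open(path, flags);
}

// Reports whether FD_CLOEXEC is set on an open descriptor.
bool isCloseOnExec(int fd) {
  int fdFlags = ::fcntl(fd, F_GETFD);
  assert(fdFlags != -1 && "fd must be a valid open descriptor");
  return (fdFlags & FD_CLOEXEC) != 0;
}
```

A debugger-like tool would request the inheritable variant only for the specific files the child is meant to see, and keep the default everywhere else.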
-
Simon Pilgrim authored
Reduces output size and we're only wanting to check that the instructions are fast-path'd (just Dispatch+Retire) anyhow llvm-svn: 334292
-
Simon Pilgrim authored
llvm-svn: 334291
-
Simon Pilgrim authored
Now that we're custom lowering vector rotates for SSE in general we should be testing the combines with them as well. llvm-svn: 334290
-
Simon Pilgrim authored
Simplify combineVectorTruncationWithPACKUS to mask the upper bits followed by calling truncateVectorWithPACK instead of duplicating with similar code. This results in the codegen using (V)PACKUSDW on SSE41+ targets for vXi64/vXi32 inputs where before it always used PACKUSWB (along with a lot more bitcasting). I've raised PR37749 as until we avoid unnecessary concats back to 256-bit for bitwise ops, we can't avoid splitting the input value into 128-bit subvectors for masking. llvm-svn: 334289
-
Sanjay Patel authored
The description got deleted along with the FIXME note in rL334268. llvm-svn: 334288
-
Sam McCall authored
Summary: Macros are terribly spammy at the moment and this offers some relief. Reviewers: ioeric Subscribers: ilya-biryukov, MaskRay, jkorous, cfe-commits Differential Revision: https://reviews.llvm.org/D47936 llvm-svn: 334287
-
Artur Pilipenko authored
Currently the loop branch heuristic is applied before the invoke heuristic which makes us overestimate the probability of the unwind destination of invokes inside loops. This in turn makes us grossly underestimate the frequencies of loops with invokes. Reviewed By: skatkov, vsk Differential Revision: https://reviews.llvm.org/D47371 llvm-svn: 334285
-
Florian Hahn authored
This first step separates VPInstruction-based and VPRecipe-based VPlan creation, which should make it easier to migrate to VPInstruction based code-gen step by step. Reviewers: Ayal, rengolin, dcaballe, hsaito, mkuper, mzolotukhin Reviewed By: dcaballe Subscribers: bollu, tschuett, rkruppe, llvm-commits Differential Revision: https://reviews.llvm.org/D47477 llvm-svn: 334284
-
Henry Wong authored
Summary: Add `StringRef::rsplit(StringRef Separator)` to achieve the function of getting the tail substring according to the separator. A typical usage is to get `data` in `std::basic_string::data`. Reviewers: mehdi_amini, zturner, beanz, xbolva00, vsk Reviewed By: zturner, xbolva00, vsk Subscribers: vsk, xbolva00, llvm-commits, MTC Differential Revision: https://reviews.llvm.org/D47406 llvm-svn: 334283
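The intended behaviour can be sketched on `std::string` without depending on LLVM headers. The function `rsplitAt` below is a stand-in written for illustration, not the `StringRef` member itself; the not-found case (whole string on the left, empty string on the right) is an assumption modelled on how the existing `split` behaves.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Stand-in for the described behaviour: split at the LAST occurrence of sep.
std::pair<std::string, std::string> rsplitAt(const std::string &s,
                                             const std::string &sep) {
  assert(!sep.empty() && "separator must not be empty");
  std::string::size_type pos = s.rfind(sep);
  if (pos == std::string::npos)
    return std::make_pair(s, std::string()); // assumed not-found result
  return std::make_pair(s.substr(0, pos), s.substr(pos + sep.size()));
}
```

With this sketch, `rsplitAt("std::basic_string::data", "::")` yields `std::basic_string` and `data`, which is the usage mentioned in the summary.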
-
Tatyana Krasnukha authored
Summary: Default copy/move constructors and assignment operators leave wrong m_sets[i].registers pointers. Made the class movable and non-copyable (it's difficult to imagine when it needs to be copied). Reviewers: clayborg Reviewed By: clayborg Differential Revision: https://reviews.llvm.org/D47728 llvm-svn: 334282
-
Jonas Hahnfeld authored
CGM.GetAddrOfConstantCString() sets the address of the created GlobalValue to unnamed. When emitting the object file LLVM will mark the surrounding section as SHF_MERGE iff the string is nul-terminated and contains no other nuls (see IsNullTerminatedString). This results in problems when saving temporaries because LLVM doesn't set an EntrySize, so reading in the serialized assembly file fails. This never happened for the GPU binaries because they usually contain a nul-character somewhere. Instead this only affected the module ID when compiling relocatable device code. However, this points to a potentially larger problem: If we put a constant string into a named section, we really want the data to end up in that section in the object file. To avoid LLVM merging sections this patch unmarks the GlobalVariable's address as unnamed which also fixes the problem of invalid serialized assembly files when saving temporaries. Differential Revision: https://reviews.llvm.org/D47902 llvm-svn: 334281
-
Simon Dardis authored
Reviewers: smaksimovic, atanasyan, abeserminji Differential Revision: https://reviews.llvm.org/D47638 llvm-svn: 334280
-
Pavel Labath authored
r334215 changed the error message the tool prints for invalid thread arguments to the -exec-next command. This adjusts the test to match that. llvm-svn: 334279
-
Alex Bradbury authored
The instruction makes use of a previously ignored field in the fence instruction. It is introduced in the version 2.3 draft of the RISC-V specification after much work by the Memory Model Task Group. As clarified here <https://github.com/riscv/riscv-isa-manual/issues/186>, the fence.tso assembler mnemonic does not have operands. llvm-svn: 334278
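As a sketch of where the extra field sits, the encoder below packs a FENCE instruction word. It reflects my reading of the RISC-V specification rather than anything in the patch: the name `encodeFence` is hypothetical, the field is called `fm` and occupies bits 31:28, and `fence.tso` is taken to be `fm=0b1000` with predecessor and successor sets both `RW`.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical encoder for the RISC-V FENCE instruction word:
//   fm[31:28] pred[27:24] succ[23:20] rs1[19:15]=0 funct3[14:12]=0
//   rd[11:7]=0 opcode[6:0]=0001111 (MISC-MEM)
// pred/succ are 4-bit sets ordered I, O, R, W from the high bit down.
std::uint32_t encodeFence(unsigned fm, unsigned pred, unsigned succ) {
  assert(fm < 16 && pred < 16 && succ < 16);
  return (static_cast<std::uint32_t>(fm) << 28) |
         (static_cast<std::uint32_t>(pred) << 24) |
         (static_cast<std::uint32_t>(succ) << 20) | 0x0Fu;
}
```

Because the predecessor and successor sets are fixed for `fence.tso`, there is nothing left for the assembler mnemonic to take as operands.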
-
Pavel Labath authored
This also fixes a bug where SymbolFileDWARF was returning the same function multiple times - this can happen if both mangled and demangled names match the regex. Other lookup functions had code to handle this case, but it was forgotten here. llvm-svn: 334277
-
Simon Pilgrim authored
We have some combines/lowerings that attempt to use PACKSS-then-PACKUS and others that use PACKUS-then-PACKSS. PACKUS is much easier to combine with if we know the upper bits are zero as ComputeKnownBits can easily see through BITCASTs etc. especially now that rL333995 and rL334007 have landed. It also effectively works at byte level which further simplifies shuffle combines. The only (minor) annoyances are that ComputeKnownBits can sometimes take longer as it doesn't fail as quickly as ComputeNumSignBits (but I'm not seeing any actual regressions in tests) and PACKUSDW only became available after SSE41 so we have more codegen diffs between targets. llvm-svn: 334276
-
Florian Hahn authored
Reviewers: dsanders, craig.topper, stoklund, nhaehnle Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D47525 llvm-svn: 334275
-
Sam McCall authored
Reviewers: ilya-biryukov Subscribers: klimek, ioeric, MaskRay, jkorous, cfe-commits Differential Revision: https://reviews.llvm.org/D47707 llvm-svn: 334274
-
Pavel Labath authored
Summary: This patch implements the non-regex variant of GetFunctions. To share more code with the Apple implementation, I've extracted the common filtering code from that class into a utility function on the DWARFIndex base class. The new implementation also avoids searching the accelerator table multiple times -- previously it could happen that the apple table would return the same die more than once if one specified multiple search flags in name_type_mask. This way, I separate table iteration from filtering, and so we can be sure each die is inserted at most once. Reviewers: clayborg, JDevlieghere Subscribers: aprantl, lldb-commits Differential Revision: https://reviews.llvm.org/D47881 llvm-svn: 334273
-
David Carlier authored
pthread.h was missing for the pthread_key* functions. Reviewers: dberris Reviewed By: dberris Differential Revision: https://reviews.llvm.org/D47933 llvm-svn: 334272
-
Roman Shirokiy authored
There can be more than one PHI in the exit block using the same loop recurrence. Don't assume there is only one, and fix each user. Differential Revision: https://reviews.llvm.org/D47788 llvm-svn: 334271
-
Haojian Wu authored
Summary: This patch improves the check to match the desugared "string" type (so that it can handle custom-implemented string classes), see the newly-added test. Reviewers: alexfh Subscribers: klimek, xazax.hun, cfe-commits Differential Revision: https://reviews.llvm.org/D47704 llvm-svn: 334270
-
Matt Arsenault authored
These won't work as expected now, so error on them to avoid wasting time debugging this in the future. llvm-svn: 334269
-
Sam Parker authored
While trying to propagate AND masks back to loads, we currently allow one non-load node to be included as a leaf in chain. This fix now limits that node to produce only a single data value. Differential Revision: https://reviews.llvm.org/D47878 llvm-svn: 334268
-
Dean Michael Berris authored
This change uses 'const' for the retryingWriteAll(...) API and removes unnecessary 'static' local variables in getting the temporary filename. llvm-svn: 334267
-
Craig Topper authored
llvm-svn: 334266
-
Craig Topper authored
[X86] Add builtins for pshufd, pshuflw, and pshufhw to enable target feature and immediate range checking. llvm-svn: 334265
-
Craig Topper authored
This was caught by the Header tests, but not the CodeGen tests. llvm-svn: 334264
-
Hiroshi Inoue authored
llvm-svn: 334263
-
Dean Michael Berris authored
Summary: This fixes http://llvm.org/PR32274. This change adds a test to ensure that we're able to link XRay modes and the runtime to binaries that don't need to depend on the C++ standard library or a C++ ABI library. In particular, we ensure that this will work with C programs compiled+linked with XRay. To make the test pass, we need to change a few things in the XRay runtime implementations to remove the reliance on C++ ABI features. In particular, we change the thread-safe function-local-static initialisation to use pthread_* instead of the C++ features that ensure non-trivial thread-local/function-local-static initialisation. Depends on D47696. Reviewers: dblaikie, jfb, kpw, eizan Reviewed By: kpw Subscribers: echristo, eizan, kpw, mgorny, llvm-commits Differential Revision: https://reviews.llvm.org/D46998 llvm-svn: 334262
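The general technique of replacing a function-local static with `pthread_once` can be sketched as follows. This is not the XRay runtime code; `Config`, `getConfig`, and `initCallCount` are hypothetical names. The point is that the state lives in trivially-initialised globals, so the compiler does not need to emit the C++ ABI guard calls that a non-trivial function-local static would require.

```cpp
#include <cassert>
#include <pthread.h>

// Hypothetical state that would otherwise be a function-local static.
struct Config { int value; };

static pthread_once_t gOnce = PTHREAD_ONCE_INIT;
static Config gConfig;      // zero-initialised, no dynamic constructor
static int gInitCalls = 0;  // only here to observe how often init runs

static void initConfig() {
  gConfig.value = 42;
  ++gInitCalls;
}

// Thread-safe lazy initialisation without relying on C++ ABI guard variables.
Config &getConfig() {
  int rc = pthread_once(&gOnce, initConfig);
  assert(rc == 0 && "pthread_once failed");
  (void)rc;
  return gConfig;
}

int initCallCount() { return gInitCalls; }
```

Every caller gets the same object, and `initConfig` runs once no matter how many threads race on the first call.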
-
Craig Topper authored
[X86] Add subvector insert and extract builtins to enable target feature checking and immediate range checking. Test changes are due to differences in how we generate undef elements now. We also changed the types used for extractf128_si256/insertf128_si256 to match the signature of the builtin that previously existed which this patch resurrects. This also matches gcc. llvm-svn: 334261
-
Aaron Smith authored
Summary: The patch adds support for split functions (when MSVC is used with PGO) and for the function-level linking feature. The SymbolFilePDB::ParseCompileUnitLineTable function relies on the fact that ranges of compiled source files in the binary are continuous and don't intersect each other. The function creates a LineSequence for each file and inserts it into the LineTable, and the implementation of the latter relies on continuity of the sequence. But that is not always true when function-level linking is enabled, e.g. in the added input test file test-pdb-function-level-linking.exe there is xstring's std__basic_string_char_std__char_traits_char__std__allocator_char_____max_size (.00454820) between test-pdb-function-level-linking.cpp's foo (.00454770) and main (.004548F0). To fix the problem this patch renews the sequence on each address gap. Reviewers: asmith, zturner Reviewed By: asmith Subscribers: aleksandr.urakov, labath, mgorny, lldb-commits Differential Revision: https://reviews.llvm.org/D47708 llvm-svn: 334260
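The idea of renewing the sequence on each address gap can be sketched independently of LLDB. The types and the function `splitOnAddressGaps` below are hypothetical and much simpler than the real LineTable machinery; the sketch assumes each entry carries its start address and byte length and that entries arrive sorted by address.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical, simplified line entry: a byte range mapped to a source line.
struct LineEntry {
  std::uint64_t addr;
  std::uint32_t length;
  std::uint32_t line;
};

// Groups address-sorted entries into sequences, starting a new sequence
// whenever an entry does not begin exactly where the previous one ended.
std::vector<std::vector<LineEntry> >
splitOnAddressGaps(const std::vector<LineEntry> &entries) {
  std::vector<std::vector<LineEntry> > sequences;
  for (std::size_t i = 0; i < entries.size(); ++i) {
    const LineEntry &e = entries[i];
    bool contiguous = false;
    if (!sequences.empty()) {
      const LineEntry &prev = sequences.back().back();
      assert(prev.addr <= e.addr && "entries must be sorted by address");
      contiguous = (prev.addr + prev.length == e.addr);
    }
    if (!contiguous)
      sequences.push_back(std::vector<LineEntry>()); // gap: renew sequence
    sequences.back().push_back(e);
  }
  return sequences;
}
```

With entries for `foo` near `0x00454770` and `main` at `0x004548F0`, the code from another file in between leaves a gap, so the two functions end up in separate sequences.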
-
Raphael Isemann authored
Subscribers: lldb-commits Differential Revision: https://reviews.llvm.org/D47923 llvm-svn: 334259
-
Craig Topper authored
[X86] Improve some shuffle decoding code to remove a conditional from a loop and reduce the number of temporary variables. NFCI The NumControlBits variable was definitely sketchy. I think that only worked because the expected value was 1 or 2 and the number of lanes was 2 or 4. Had there been 8 lanes the number of bits should have been 3 not 4 as the previous code would have given. llvm-svn: 334258
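The arithmetic behind that remark is just a base-2 logarithm. The helper `controlBitsForLanes` below is a hypothetical illustration, not the code from the patch. A formula such as `numLanes / 2` (an assumption about the general shape of the old computation, inferred from the values quoted in the message) agrees with it for 2 and 4 lanes but gives 4 instead of 3 for 8 lanes.

```cpp
#include <cassert>

// Number of immediate bits needed to select one of numLanes lanes,
// i.e. log2(numLanes) for a power-of-two lane count.
unsigned controlBitsForLanes(unsigned numLanes) {
  assert(numLanes != 0 && (numLanes & (numLanes - 1)) == 0 &&
         "lane count must be a power of two");
  unsigned bits = 0;
  while ((1u << bits) < numLanes)
    ++bits;
  return bits;
}
```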
-