- Dec 27, 2016
-
-
Mehdi Amini authored
[doc] Add mention of the difference in optimization level between Release and RelWithDebInfo in Cmake.rst This is surprising to many people. llvm-svn: 290556
-
Chandler Carruth authored
pattern easier to write. Differential Revision: https://reviews.llvm.org/D28120 llvm-svn: 290555
-
Simon Pilgrim authored
PMULDQ/PMULUDQ vXi64 instructions only use the even numbered v2Xi32 input elements which SimplifyDemandedVectorElts should try and use. Differential Revision: https://reviews.llvm.org/D28119 llvm-svn: 290554
-
Chandler Carruth authored
wrapper. llvm-svn: 290553
-
- Dec 26, 2016
-
-
Richard Smith authored
llvm-svn: 290552
-
Daniel Berlin authored
llvm-svn: 290551
-
Daniel Berlin authored
Mostly use a bit more idiomatic C++ where we can, so we can combine some things later. Reviewers: davide Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D28111 llvm-svn: 290550
-
Daniel Berlin authored
llvm-svn: 290549
-
Davide Italiano authored
NewGVN can be tested passing `-mllvm -enable-newgvn` to clang. Differential Revision: https://reviews.llvm.org/D28059 llvm-svn: 290548
-
Simon Pilgrim authored
llvm-svn: 290547
-
Davide Italiano authored
The current GVN algorithm folds unconditional branches to, it claims, expose more PRE oportunities. The folding, if really needed, (which is not sure, as it's not really proved it improves analysis) can be done by an earlier cleanup pass instead of GVN itself. Ack'ed/SGTM'd by Daniel Berlin. Differential Revision: https://reviews.llvm.org/D28117 llvm-svn: 290546
-
Simon Pilgrim authored
llvm-svn: 290545
-
Simon Pilgrim authored
llvm-svn: 290544
-
Davide Italiano authored
llvm-svn: 290543
-
Bryant Wong authored
Differential Revision: https://reviews.llvm.org/D28110 llvm-svn: 290542
-
Marina Yatsina authored
llvm-svn: 290541
-
Marina Yatsina authored
Updated test according to commit 290539: According to extended asm syntax, a case where the clobber list includes a variable from the inputs or outputs should be an error - conflict. for example: const long double a = 0.0; int main() { char b; double t1 = a; __asm__ ("fucompp": "=a" (b) : "u" (t1), "t" (t1) : "cc", "st", "st(1)"); return 0; } This should conflict with the output - t1 which is st, and st which is st aswell. The patch fixes it. Commit on behald of Ziv Izhar. Differential Revision: https://reviews.llvm.org/D15075 llvm-svn: 290540
-
Marina Yatsina authored
According to extended asm syntax, a case where the clobber list includes a variable from the inputs or outputs should be an error - conflict. for example: const long double a = 0.0; int main() { char b; double t1 = a; __asm__ ("fucompp": "=a" (b) : "u" (t1), "t" (t1) : "cc", "st", "st(1)"); return 0; } This should conflict with the output - t1 which is st, and st which is st aswell. The patch fixes it. Commit on behald of Ziv Izhar. Differential Revision: https://reviews.llvm.org/D15075 llvm-svn: 290539
-
Tobias Grosser authored
This update improves isl's ability to coalesce different convex sets/maps, especially when the contain existentially quantified variables. llvm-svn: 290538
-
Chandler Carruth authored
systematically and document in the test what all is going on. This replaces the PR-named test that was the only coverage for GlobalDCE and comdats previously. I wrote this because I wasn't certain how comdat DCE was supposed to work and wanted to step through what GlobalDCE did to fully understand it. After talking to folks and reading the code and really staring at things it all makes sense but it seemed good to help write down some of this in a more explicit and fully covering test case. For example, it seemed like a bug that GlobalDCE didn't consider comdat participation of ifuncs. Specifically it seemed like an accident because testing didn't really cover that case. But in fact, ifuncs specifically cannot participate in a comdat despite having that API. The new test case covers this and explicitly documents that DCE gets to fire here even though there are comdats involved. Also, we didn't have any positive tests for the challenging cases such as usage cycles between comdat participants that might make them seem alive except that there is no external edge into the cycle. llvm-svn: 290537
-
Craig Topper authored
llvm-svn: 290536
-
Craig Topper authored
[AVX-512][InstCombine] Teach InstCombine to turn scalar add/sub/mul/div with rounding intrinsics into normal IR operations if the rounding mode is CUR_DIRECTION. Summary: I only do this for unmasked cases for now because isel is failing to fold the mask. I'll try to fix that soon. I'll do the same thing for packed add/sub/mul/div in a future patch. Reviewers: delena, RKSimon, zvi, craig.topper Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D27879 llvm-svn: 290535
-
Saleem Abdulrasool authored
llvm-svn: 290534
-
Saleem Abdulrasool authored
Use of these flags would result in the use of ELF-style PIE/PIC code which is incorrect on Windows. Windows is inherently PIC by means of the DLL slide that occurs at load. This also mirrors the behaviour on GCC for MinGW. Currently, the Windows x86_64 forces the relocation model to PIC (Level 2). This is unchanged for now, though we should remove any assumptions on that and change it to a static relocation model. llvm-svn: 290533
-
Craig Topper authored
[AVX-512] Don't assume that the rounding mode argument to intrinsics is a constant. While clang will guarantee this, nothing in the backend will. A non-constant value will now result in an isel error instead of just asserting or crashing due to a bad cast during lowering. llvm-svn: 290532
-
Chandler Carruth authored
llvm-svn: 290531
-
Craig Topper authored
[AVX-512][InstCombine] Teach InstCombine to converted masked vpermv intrinsics into shufflevector instructions Summary: This patch adds support for converting the masked vpermv intrinsics into shufflevector instructions if the indices are constants. We also need to wrap a select instruction around the shuffle to take care of the masking part. InstCombine will take care of optimizing the select if the mask is constant so I didn't bother checking for that. Reviewers: zvi, delena, spatel, RKSimon Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D27825 llvm-svn: 290530
-
Bryant Wong authored
Differential Revision: https://reviews.llvm.org/D26619 llvm-svn: 290529
-
Chandler Carruth authored
This recommits r290512 that was reverted when MSVC failed to compile it. Since then I've played with various approaches using rextester.com (where I was able to reproduce the failure) and think that I have a solution thanks in part to the help of Dave Blaikie! It seems MSVC just has a defective `decltype` in this version. Manually writing out the type seems to do the trick, even though it is .... quite complicated. Original commit message: This allows both defining convenience iterator/range accessors on types which walk across N different independent ranges within the object, and more direct and simple usages with range based for loops such as shown in the unittest. The same facilities are used for both. They end up quite small and simple as it happens. I've also switched an iterator on `Module` to use this. I would like to add another convenience iterator that includes even more sequences as part of it and seeing this one already present motivated me to actually abstract it away and introduce a general utility. Differential Revision: https://reviews.llvm.org/D28093 llvm-svn: 290528
-
Bryant Wong authored
Differential Revision: https://reviews.llvm.org/D26661 llvm-svn: 290527
-
- Dec 25, 2016
-
-
Bryant Wong authored
Differential Revision: https://reviews.llvm.org/D27034 llvm-svn: 290526
-
Daniel Berlin authored
Value number stores and memory states so we can detect when memory states are equivalent (IE store of same value to memory). Reviewers: davide Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D28084 llvm-svn: 290525
-
Daniel Berlin authored
llvm-svn: 290524
-
Lang Hames authored
multiple asynchronous RPC calls. ParallelCallGroup allows multiple asynchronous calls to be dispatched, and provides a wait method that blocks until all asynchronous calls have been executed on the remote and all return value handlers run on the local machine. This will allow, for example, the JIT client to issue memory allocation calls for all sections in parallel, then block until all memory has been allocated on the remote and the allocated addresses registered with the client, at which point the JIT client can proceed to applying relocations. llvm-svn: 290523
-
Richard Smith authored
llvm-svn: 290522
-
Kuba Mracek authored
This allows compiler-rt to be built on older macOS SDKs, where there symbols are not defined. Patch by Jeremy Huddleston Sequoia <jeremyhu@apple.com>. llvm-svn: 290521
-
Lang Hames authored
Some of the recent RPC call type-checking changes weren't formatted prior to commit. llvm-svn: 290520
-
Greg Clayton authored
llvm-svn: 290519
-
Roman Gareev authored
If the parameters of the target cache (i.e., cache level sizes, cache level associativities) are not specified or have wrong values, we use ones for parameters of the macro-kernel and do not perform data-layout optimizations of the matrix multiplication. In this patch we specify the default values of the cache parameters to be able to apply the pattern matching optimizations even in this case. Since there is no typical values of this parameters, we use the parameters of Intel Core i7-3820 SandyBridge that also help to attain the high-performance on IBM POWER System S822 and IBM Power 730 Express server. Reviewed-by:
Tobias Grosser <tobias@grosser.es> Differential Revision: https://reviews.llvm.org/D28090 llvm-svn: 290518
-
Michael Zuckerman authored
llvm-svn: 290517
-