- Nov 16, 2016
-
-
Tim Northover authored
One half of the shifts obviously needed conditional selection based on whether the shift amount is more than 32 bits, but leaving the other half as the natural shift isn't acceptable either: it's undefined behaviour to shift a 32-bit value by more than 31. llvm-svn: 287149
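As a rough illustration of the constraint described above (not the code from this commit), here is a minimal C++ sketch of a 64-bit left shift built from two 32-bit halves. The `Halves` struct and the `shl64` name are hypothetical. Every 32-bit shift in it uses an amount in the range 0..31.

```cpp
#include <cstdint>

// Hypothetical representation of a 64-bit value as two 32-bit halves.
struct Halves {
  uint32_t Lo;
  uint32_t Hi;
};

// Shift left by Amt without ever shifting a 32-bit value by 32 or more.
// Assumption of this sketch: amounts are reduced modulo 64.
Halves shl64(Halves V, unsigned Amt) {
  Amt &= 63;
  if (Amt == 0)
    return V; // Avoids the (32 - Amt) == 32 case below.
  if (Amt >= 32) {
    // All surviving bits come from the low half; Amt - 32 is in 0..31.
    return {0u, V.Lo << (Amt - 32)};
  }
  // 0 < Amt < 32, so both Amt and 32 - Amt are in 1..31.
  return {V.Lo << Amt, (V.Hi << Amt) | (V.Lo >> (32 - Amt))};
}
```

Both halves need a branch (or a select) on the shift amount: the high half because its source changes at 32, and the low half because `V.Lo << Amt` would otherwise be evaluated with an out-of-range amount.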
-
Rong Xu authored
We fail to produce bit-for-bit matching stage2 and stage3 compilers in a PGO bootstrap build. The reason is that LoopBlockSet is of SmallPtrSet type, whose iteration order depends on the pointer values. This patch fixes the issue by switching to SmallSetVector. Differential Revision: http://reviews.llvm.org/D26634 llvm-svn: 287148
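The idea behind such a change can be sketched with standard-library containers only. The `InsertionOrderedSet` class below is a hypothetical stand-in, not LLVM's SmallSetVector: it pairs a hash set (for uniqueness) with a vector (for a stable order), so iteration follows insertion order rather than pointer values.

```cpp
#include <cstddef>
#include <unordered_set>
#include <vector>

// Hypothetical insertion-ordered set; illustrative only.
template <typename T> class InsertionOrderedSet {
  std::unordered_set<T> Seen; // Answers "is it already present?"
  std::vector<T> Order;       // Remembers the order of first insertion.

public:
  // Returns true if V was newly inserted, false if it was already present.
  bool insert(const T &V) {
    if (!Seen.insert(V).second)
      return false;
    Order.push_back(V);
    return true;
  }

  typename std::vector<T>::const_iterator begin() const { return Order.begin(); }
  typename std::vector<T>::const_iterator end() const { return Order.end(); }
  std::size_t size() const { return Order.size(); }
};
```

Iterating over this container visits elements in the order they were first inserted, which does not change between runs even if the allocator hands out different addresses.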
-
Sanjay Patel authored
llvm-svn: 287147
-
Matt Arsenault authored
This fixes a probably unintended divergence from the default scheduler behavior. llvm-svn: 287146
-
Sanjay Patel authored
llvm-svn: 287145
-
Mike Aizatsky authored
Subscribers: kubabrecka Differential Revision: https://reviews.llvm.org/D26756 llvm-svn: 287144
-
Davide Italiano authored
Apparently this is wrong because it's legal to have a filename on UNIX which contains a backslash. Differential Revision: https://reviews.llvm.org/D26734 llvm-svn: 287143
-
Geoff Berry authored
Summary: Extend replaceZeroVectorStore to handle more vector type stores, floating point zero vectors and set alignment more accurately on split stores. This is a follow-up change to r286875. This change fixes PR31038. Reviewers: MatzeB Subscribers: mcrosier, aemerson, llvm-commits, rengolin Differential Revision: https://reviews.llvm.org/D26682 llvm-svn: 287142
-
Adrian Prantl authored
This removes checks that are irrelevant for what is being tested. llvm-svn: 287141
-
Rui Ueyama authored
TaskGroup has a fairly high overhead, so we don't want to partition tasks into too small tasks. This patch partitions tasks into up to 1024 tasks.

I compared this patch with the original LLD's parallel_for_each. I reverted r287042 locally for comparison. With this patch, time to self-link lld with debug info changed from 6.23 seconds to 4.62 seconds (-25.8%), with -threads and without -build-id. With both -threads and -build-id, it improved from 11.71 seconds to 4.94 seconds (-57.8%). Full results are below. BTW, GNU gold takes 11.65 seconds to link the same binary.

NOW

--no-threads --build-id=none
  6789.847776 task-clock (msec) # 1.000 CPUs utilized ( +- 1.86% )
  685 context-switches # 0.101 K/sec ( +- 2.82% )
  4 cpu-migrations # 0.001 K/sec ( +- 31.18% )
  1,424,690 page-faults # 0.210 M/sec ( +- 1.07% )
  21,339,542,522 cycles # 3.143 GHz ( +- 1.49% )
  13,092,260,230 stalled-cycles-frontend # 61.35% frontend cycles idle ( +- 2.23% )
  <not supported> stalled-cycles-backend
  21,462,051,828 instructions # 1.01 insns per cycle
                              # 0.61 stalled cycles per insn ( +- 0.41% )
  3,955,296,378 branches # 582.531 M/sec ( +- 0.39% )
  75,699,909 branch-misses # 1.91% of all branches ( +- 0.08% )
  6.787630744 seconds time elapsed ( +- 1.86% )

--threads --build-id=none
  14767.148697 task-clock (msec) # 3.196 CPUs utilized ( +- 2.56% )
  28,891 context-switches # 0.002 M/sec ( +- 1.99% )
  905 cpu-migrations # 0.061 K/sec ( +- 5.49% )
  1,262,122 page-faults # 0.085 M/sec ( +- 1.68% )
  43,116,163,217 cycles # 2.920 GHz ( +- 3.07% )
  33,690,171,242 stalled-cycles-frontend # 78.14% frontend cycles idle ( +- 3.67% )
  <not supported> stalled-cycles-backend
  22,836,731,536 instructions # 0.53 insns per cycle
                              # 1.48 stalled cycles per insn ( +- 1.13% )
  4,382,712,998 branches # 296.788 M/sec ( +- 1.33% )
  78,622,295 branch-misses # 1.79% of all branches ( +- 0.54% )
  4.621228056 seconds time elapsed ( +- 1.90% )

--threads --build-id=sha1
  24594.457135 task-clock (msec) # 4.974 CPUs utilized ( +- 1.78% )
  29,902 context-switches # 0.001 M/sec ( +- 2.62% )
  1,097 cpu-migrations # 0.045 K/sec ( +- 6.29% )
  1,313,947 page-faults # 0.053 M/sec ( +- 2.36% )
  70,516,415,741 cycles # 2.867 GHz ( +- 0.78% )
  47,570,262,296 stalled-cycles-frontend # 67.46% frontend cycles idle ( +- 0.86% )
  <not supported> stalled-cycles-backend
  73,124,599,029 instructions # 1.04 insns per cycle
                              # 0.65 stalled cycles per insn ( +- 0.33% )
  10,495,266,104 branches # 426.733 M/sec ( +- 0.41% )
  91,444,149 branch-misses # 0.87% of all branches ( +- 0.83% )
  4.944291711 seconds time elapsed ( +- 1.72% )

PREVIOUS

--threads --build-id=none
  7307.437544 task-clock (msec) # 1.160 CPUs utilized ( +- 2.34% )
  3,128 context-switches # 0.428 K/sec ( +- 4.37% )
  352 cpu-migrations # 0.048 K/sec ( +- 5.98% )
  1,354,450 page-faults # 0.185 M/sec ( +- 2.20% )
  22,081,733,098 cycles # 3.022 GHz ( +- 1.46% )
  13,709,991,267 stalled-cycles-frontend # 62.09% frontend cycles idle ( +- 1.77% )
  <not supported> stalled-cycles-backend
  21,634,468,895 instructions # 0.98 insns per cycle
                              # 0.63 stalled cycles per insn ( +- 0.86% )
  3,993,062,361 branches # 546.438 M/sec ( +- 0.83% )
  76,188,819 branch-misses # 1.91% of all branches ( +- 0.19% )
  6.298101157 seconds time elapsed ( +- 2.03% )

--threads --build-id=sha1
  12845.420265 task-clock (msec) # 1.097 CPUs utilized ( +- 1.95% )
  4,020 context-switches # 0.313 K/sec ( +- 2.89% )
  369 cpu-migrations # 0.029 K/sec ( +- 6.26% )
  1,464,822 page-faults # 0.114 M/sec ( +- 1.37% )
  40,668,449,813 cycles # 3.166 GHz ( +- 0.96% )
  18,863,982,388 stalled-cycles-frontend # 46.38% frontend cycles idle ( +- 1.82% )
  <not supported> stalled-cycles-backend
  71,560,499,058 instructions # 1.76 insns per cycle
                              # 0.26 stalled cycles per insn ( +- 0.14% )
  10,044,152,441 branches # 781.925 M/sec ( +- 0.19% )
  87,835,773 branch-misses # 0.87% of all branches ( +- 0.09% )
  11.711773314 seconds time elapsed ( +- 1.51% )

llvm-svn: 287140
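To make the partitioning idea concrete, the following is a minimal sketch, not the code from this patch, of splitting an index range into at most 1024 contiguous chunks so that each chunk can be handed to one task. The `partitionRange` name and its signature are hypothetical.

```cpp
#include <algorithm>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper: split [0, N) into at most MaxTasks half-open ranges.
std::vector<std::pair<std::size_t, std::size_t>>
partitionRange(std::size_t N, std::size_t MaxTasks = 1024) {
  std::vector<std::pair<std::size_t, std::size_t>> Ranges;
  if (N == 0 || MaxTasks == 0)
    return Ranges;
  // Ceiling division, so the number of chunks never exceeds MaxTasks.
  std::size_t ChunkSize = (N + MaxTasks - 1) / MaxTasks;
  for (std::size_t Begin = 0; Begin < N; Begin += ChunkSize)
    Ranges.emplace_back(Begin, std::min(Begin + ChunkSize, N));
  return Ranges;
}
```

With a small input every element gets its own chunk, while a large input is grouped so that the per-task overhead is paid at most 1024 times.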
-
Adrian Prantl authored
llvm-svn: 287139
-
Yaron Keren authored
llvm-svn: 287138
-
Rui Ueyama authored
Also add a comment saying that check() returns a value. llvm-svn: 287136
-
Mandeep Singh Grang authored
Summary: This patch fixes issues in codegen uncovered due to https://reviews.llvm.org/D26718 Reviewers: mssimpso Subscribers: llvm-commits, mzolotukhin Differential Revision: https://reviews.llvm.org/D26727 llvm-svn: 287135
-
Adrian Prantl authored
This was a latent bug that was recently uncovered by r286400. llvm-svn: 287134
-
George Rimar authored
This change moves all versioned locals into a separate list in config, which was suggested by Rafael and simplifies the logic a bit. Differential revision: https://reviews.llvm.org/D26754 llvm-svn: 287132
-
Tom Stellard authored
Summary: 1. Don't try to copy values to and from the same register class. 2. Replace copies of registers with immediate values with v_mov/s_mov instructions. The main purpose of this change is to make MachineSink do a better job of determining when it is beneficial to split a critical edge, since the pass assumes that copies will become move instructions. This prevents a regression in uniform-cfg.ll if we enable critical edge splitting for AMDGPU. Reviewers: arsenm Subscribers: arsenm, kzhuravl, llvm-commits Differential Revision: https://reviews.llvm.org/D23408 llvm-svn: 287131
-
Eugene Zelenko authored
llvm-svn: 287130
-
Sean Callanan authored
As outlined in a previous RFC, the test/ASTMerge/Inputs folder is getting full and the tests are starting to become interdependent. This is undesirable because
  - it makes it harder to write new tests
  - it makes it harder to figure out at a glance what old tests are doing, and
  - it adds the risk of breaking one test while changing a different one, because of the interdependencies.
To fix this, according to the conversation in the RFC, I have changed the layout from
  a.c
  Inputs/a1.c
  Inputs/a2.c
to
  a/test.c
  a/Inputs/a1.c
  a/Inputs/a2.c
for all existing tests. I have also eliminated interdependencies by replicating the input files for each test that uses them. https://reviews.llvm.org/D26571 llvm-svn: 287129
-
Benjamin Kramer authored
[Frontend] Allow attaching an external sema source to compiler instance and extra diags to TypoCorrections This can be used to append alternative typo corrections to an existing diag. include-fixer can use it to suggest includes to be added. Differential Revision: https://reviews.llvm.org/D26745 llvm-svn: 287128
-
Sanjay Patel authored
llvm-svn: 287127
-
Eugene Zelenko authored
[ExecutionEngine] Fix some Clang-tidy modernize-use-default, modernize-use-equals-delete and Include What You Use warnings; other minor fixes. Differential revision: https://reviews.llvm.org/D26729 llvm-svn: 287126
-
Rafael Espindola authored
Turns out some systems do define it. Not producing an error in this case matches gold and bfd. llvm-svn: 287125
-
George Rimar authored
Previously we did not support them; this patch implements this functionality. Differential revision: https://reviews.llvm.org/D26604 llvm-svn: 287124
-
George Rimar authored
Forgot about that, I am sorry. llvm-svn: 287123
-
Sanjay Patel authored
We can replace "scalar" FP-bitwise-logic with other forms of bitwise-logic instructions. Scalar SSE/AVX FP-logic instructions only exist in your imagination and/or the bowels of compilers, but logically equivalent int, float, and double variants of bitwise-logic instructions are reality in x86, and the float variant may be a shorter instruction depending on which flavor (SSE or AVX) of vector ISA you have...so just prefer float all the time. This is a preliminary step towards solving PR6137: https://llvm.org/bugs/show_bug.cgi?id=6137 Differential Revision: https://reviews.llvm.org/D26712 llvm-svn: 287122
-
Lang Hames authored
This unit test infinite-looped on s390x due to a thread_yield being optimized out. I've updated the QueueChannel class (where thread_yield was called) to use a condition variable instead. This should cause the unit test to behave correctly. llvm-svn: 287121
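The general technique of replacing a yield-based spin with a condition variable can be sketched as below. This is a minimal illustration using only the C++ standard library; the `BlockingQueue` class is hypothetical and is not the QueueChannel class mentioned in the commit.

```cpp
#include <condition_variable>
#include <mutex>
#include <queue>

// Hypothetical blocking byte queue; illustrative only.
class BlockingQueue {
  std::mutex M;
  std::condition_variable CV;
  std::queue<char> Q;

public:
  void push(char C) {
    {
      std::lock_guard<std::mutex> Lock(M);
      Q.push(C);
    }
    CV.notify_one(); // Wake one waiting consumer, if any.
  }

  char pop() {
    std::unique_lock<std::mutex> Lock(M);
    // Sleeps until the predicate holds, instead of spinning on a yield call.
    CV.wait(Lock, [this] { return !Q.empty(); });
    char C = Q.front();
    Q.pop();
    return C;
  }
};
```

Because the consumer blocks inside `wait` while holding no lock, progress no longer depends on a yield call in a busy loop being preserved by the optimizer.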
-
George Rimar authored
In particular, the "cannot preempt symbol" message is now extended with locations. Differential revision: https://reviews.llvm.org/D26738 llvm-svn: 287120
-
Rui Ueyama authored
Our build-id is a tree hash anyway, so I'll define this as a synonym for sha1. GNU gold takes this parameter, so this is for compatibility with that. llvm-svn: 287119
-
Eric Liu authored
[change-namespace] handle constructor initializer: Derived : Base::Base() {} and added conflict detections Summary: namespace nx { namespace ny { class Base { public: Base(int i) {} }; } } namespace na { namespace nb { class X : public nx::ny::Base { public: X() : Base::Base(1) {} }; } } When changing from na::nb to x::y, "Base::Base" will be changed to "nx::ny::Base" and "Base::" in "Base::Base" will be replaced with "nx::ny::Base" too, which causes a conflict. This conflict should've been detected when adding replacements but was hidden by `addOrMergeReplacement`. We now also detect conflicts when adding replacements where a conflict must not happen. The namespace lookup is tricky here; we simply replace "Base::Base()" with "nx::ny::Base()" as a workaround, which compiles but is not perfect. Reviewers: hokein Subscribers: bkramer, cfe-commits Differential Revision: https://reviews.llvm.org/D26637 llvm-svn: 287118
-
Reid Kleckner authored
If the global name doesn't start with __sancov_gen, ASan will insert unnecessary red zones around it. llvm-svn: 287117
-
Daniil Fukalov authored
llvm-svn: 287116
-
Pekka Jaaskelainen authored
llvm-svn: 287115
-
Simon Pilgrim authored
We only need to check that the bitstream entry is a Record. llvm-svn: 287114
-
Adrian McCarthy authored
With the cross-platform minidump plugin working, the Windows-specific one is no longer needed. This eliminates the unnecessary code. This does not eliminate the Windows-specific tests, as they hit a few cases the general tests don't. (The Windows-specific tests are currently passing.) I'll look into a separate patch to make sure we're not doing too much duplicate testing. After that I might do a little re-org in the Windows plugin, as there was some factoring there (Common & Live) that probably isn't necessary anymore. Differential Revision: https://reviews.llvm.org/D26697 llvm-svn: 287113
-
Pekka Jaaskelainen authored
llvm-svn: 287112
-
Pekka Jaaskelainen authored
llvm-svn: 287111
-
Simon Pilgrim authored
Shows missed opportunity to recognise reduced integer division result size llvm-svn: 287110
-
Eric Fiselier authored
llvm-svn: 287109
-
Simon Pilgrim authored
Both the (V)CVTDQ2PD (i32 to f64) and (V)CVTUDQ2PD (u32 to f64) conversion instructions are lossless and can be safely represented as generic SINT_TO_FP/UINT_TO_FP calls instead of x86 intrinsics without affecting final codegen. LLVM counterpart to D26686 Differential Revision: https://reviews.llvm.org/D26736 llvm-svn: 287108
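The losslessness claim can be checked with a small C++ sketch: a double has a 53-bit significand, so every 32-bit integer, signed or unsigned, converts exactly, whereas a float (24-bit significand) does not always. The helper names below are hypothetical and are not part of LLVM.

```cpp
#include <cstdint>

// Hypothetical helpers, for illustration only.

// True if converting V to double and back yields V again.
bool int32SurvivesDouble(int32_t V) {
  return static_cast<int32_t>(static_cast<double>(V)) == V;
}

bool uint32SurvivesDouble(uint32_t V) {
  return static_cast<uint32_t>(static_cast<double>(V)) == V;
}

// True if converting V to float did not round. The comparison is done in
// double, which represents every 32-bit integer exactly.
bool int32SurvivesFloat(int32_t V) {
  return static_cast<double>(static_cast<float>(V)) == static_cast<double>(V);
}
```

For example, 16777217 (2^24 + 1) is rounded when converted to float but not when converted to double, which is why the i32/u32 to f64 conversions can be modelled as generic integer-to-floating-point operations without changing results.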
-