Commits · 5ae83a21b57f8fe77628a6a7aa228c0aad8ad889 · Roger Ferrer / llvm-epi

Aug 20, 2018

[InstCombine] add tests for insertelement+binop; NFC · 5ae83a21
Sanjay Patel authored Aug 20, 2018
```
llvm-svn: 340184
```
5ae83a21

[llvm-mca] Make the LSUnit a HardwareUnit, and allow derived classes to implement a different memory consistency model. · 0875e759

Andrea Di Biagio authored Aug 20, 2018

The LSUnit is now a HardwareUnit, and it is owned by the mca::Context.
Derived classes can now implement a different consistency model by overriding
method `LSUnit::isReady()`.

This patch also slightly refactors the Scheduler interface in an attempt to
simplify the interaction between ExecuteStage and the underlying Scheduler.

llvm-svn: 340176

0875e759

[SelectionDAG] Reuse the Op's VT. NFCI. · 1a000422
Simon Pilgrim authored Aug 20, 2018
```
llvm-svn: 340173
```
1a000422

AMDGPU: fix compilation errors since r340171 · 216a2da5

Samuel Pitoiset authored Aug 20, 2018

Some buildbot slaves report compilation errors, but it
compiled fine on my side. Sorry for the breakage.

llvm-svn: 340172

216a2da5

AMDGPU: bump AS.MAX_COMMON_ADDRESS to 6 since 32-bit addr space · c95ef77d

Samuel Pitoiset authored Aug 20, 2018

32-bit constant address space is declared as 6, so the
maximum number of address spaces is 6, not 5.

Fixes "LLVM ERROR: Pointer address space out of range".

v3: use static_assert()
v2: add a very simple test for 32-bit addr space

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106630


Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
llvm-svn: 340171

c95ef77d

Fix an undefined behavior when storing an empty StringRef. · 54829bb3

Haojian Wu authored Aug 20, 2018

Summary: Passing a nullptr to memcpy is UB.
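
To illustrate the hazard: an empty StringRef may carry a null data pointer, and `memcpy` requires valid pointers even when the byte count is zero. The following is a minimal sketch of the usual guard, using a hypothetical `copyBytes` helper; it is not the code touched by this patch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper (an assumption for illustration, not from the patch):
// copies Size bytes, but skips memcpy entirely for empty input, because
// passing a null pointer to memcpy is undefined behavior even if Size is 0.
inline void copyBytes(char *Dest, const char *Src, std::size_t Size) {
  if (Size == 0)
    return; // Src may be null here; never hand it to memcpy.
  assert(Dest && Src && "non-empty copy needs valid pointers");
  std::memcpy(Dest, Src, Size);
}
```

Calling `copyBytes(Buffer, nullptr, 0)` leaves `Buffer` untouched and never reaches `memcpy`.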

Reviewers: ioeric

Subscribers: llvm-commits, cfe-commits

Differential Revision: https://reviews.llvm.org/D50966

llvm-svn: 340170

54829bb3

[SelectionDAG] Add partial sign-bit support to ComputeNumSignBits for BITCAST nodes · 5b78c9d5

Simon Pilgrim authored Aug 20, 2018

Only adds support to the existing 'large element' scalar/vector to 'small element' vector bitcasts.

Handle the case where the sign bit extends to only part of the small elements.
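
As a rough illustration of the partial case: if a wide element has a known run of sign bits, each narrow element produced by the bitcast inherits whatever part of that run overlaps it. The sketch below is a standalone approximation under that reading; `signBitsOfNarrowElt` and its parameters are hypothetical names, not the SelectionDAG code.

```cpp
#include <algorithm>
#include <cassert>

// Lower bound on the sign bits of one narrow element obtained by bitcasting a
// wide element with SrcSignBits known sign bits. SubIndex is the position of
// the narrow element inside the wide one (0 = first in memory order).
// Hypothetical helper for illustration only.
inline unsigned signBitsOfNarrowElt(unsigned SrcSignBits, unsigned SrcBits,
                                    unsigned DstBits, unsigned SubIndex,
                                    bool IsLittleEndian) {
  assert(SrcBits % DstBits == 0 && "only 'large to small' bitcasts");
  unsigned Scale = SrcBits / DstBits;
  assert(SubIndex < Scale && "sub-element index out of range");
  // Distance in bits from the top of the wide element to the top of this
  // narrow element. On little-endian, the last sub-element holds the top bits.
  unsigned Offset =
      (IsLittleEndian ? (Scale - 1 - SubIndex) : SubIndex) * DstBits;
  if (SrcSignBits <= Offset)
    return 1; // The sign-bit run never reaches this element.
  return std::min(DstBits, SrcSignBits - Offset);
}
```

For example, the 32-bit value 0xFFFFF000 has 20 sign bits. Viewed as two little-endian 16-bit elements, the high half (0xFFFF) is all sign bits, while the low half (0xF000) keeps only the remaining 4.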

llvm-svn: 340169

5b78c9d5

[X86][SSE] Fix PACKSS bitcast test from rL340166 · 11bec5b8
Simon Pilgrim authored Aug 20, 2018
```
We need the sign bits to extend to the lower 16 bits of the even elements

llvm-svn: 340167
```
11bec5b8

[X86][SSE] Add PACKSS test showing ComputeNumSignBits failure to handle a partial sign bits extension through a bitcast · cee9c648

Simon Pilgrim authored Aug 20, 2018

llvm-svn: 340166

cee9c648

[X86] Drop unnecessary exact qualifier from packss test · 686090a4
Simon Pilgrim authored Aug 20, 2018
```
llvm-svn: 340165
```
686090a4

[DWARF] Refactor DWARF classes to use unified error reporting. NFC. · cba595da

Victor Leschuk authored Aug 20, 2018

DWARF-related classes in lib/DebugInfo/DWARF contained
duplicated code for creating StringError instances, like:

template <typename... Ts>
static Error createError(char const *Fmt, const Ts &... Vals) {
  std::string Buffer;
  raw_string_ostream Stream(Buffer);
  Stream << format(Fmt, Vals...);
  return make_error<StringError>(Stream.str(), inconvertibleErrorCode());
}

A similar function was added to the Support library in https://reviews.llvm.org/D49824

This revision makes DWARF classes use this function
instead of their local implementation of it.

Reviewers: aprantl, dblaikie, probinson, wolfgangp, JDevlieghere, jhenderson

Reviewed By: JDevlieghere, jhenderson

Differential Revision: https://reviews.llvm.org/D49964

llvm-svn: 340163

cba595da

Use LLVM_BUILTIN_TRAP not __builtin_trap to appease windows builds. NFCI. · bbd2d15d
Simon Pilgrim authored Aug 20, 2018
```
llvm-svn: 340162
```
bbd2d15d

[AArch64][SVE] Asm: Add SVE System registers · 07db4322

Sander de Smalen authored Aug 20, 2018

This patch adds system registers for controlling aspects of SVE:
- ZCR_EL1  (r/w)   visible at EL1 and EL0.
- ZCR_EL2  (r/w)   visible at EL2 and Non-secure EL1 and EL0.
- ZCR_EL3  (r/w)   visible at all exception levels.

and a system register identifying SVE:
- ID_AA64ZFR0_EL1  (r)  SVE Feature identifier.

Reviewers: SjoerdMeijer, samparker, pbarrio, fhahn, javed.absar

Reviewed By: SjoerdMeijer

Differential Revision: https://reviews.llvm.org/D50885

llvm-svn: 340158

07db4322

[llvm] Make YAML serialization up to 2.5 times faster · 5f26a642

Kirill Bobyrev authored Aug 20, 2018

This patch significantly improves performance of the YAML serializer by
optimizing the `YAML::isNumeric` function. This function is called on
most strings and is highly inefficient for two reasons:

* It uses `Regex`, which is parsed and compiled each time this
  function is called
* It uses multiple passes which are not necessary

This patch introduces a stateful ad hoc YAML number parser which does not
rely on `Regex`. It also fixes a YAML number format inconsistency: the current
implementation supports the C-style octal number format (`01234567`), which
was present in the YAML 1.0 specification (http://yaml.org/spec/1.0/),
[Section 2.4. Tags, Example 2.19], but was deprecated and is no longer
present in the latest YAML 1.2 specification
(http://yaml.org/spec/1.2/spec.html), see [Section 10.3.2. Tag
Resolution]. Since the rest of the implementation does not
support other deprecated YAML 1.0 numeric features, such as sexagesimal
numbers and commas as delimiters, this is treated as an inconsistency and is
no longer supported. This patch also adds unit tests to ensure the validity
of the proposed implementation.
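
The idea of replacing a regex with a single left-to-right pass can be sketched as follows. This is a simplified illustration only: `looksNumeric` and `skipWhile` are hypothetical names, the accepted grammar (decimal integers and floats, `0x` hex, `0o` octal) is an assumption, and it does not claim to match the grammar that `YAML::isNumeric` accepts. Special values such as `.inf` and `.nan` are not handled.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string_view>

// Advances Pos over characters accepted by P; returns how many were consumed.
template <typename Pred>
std::size_t skipWhile(std::string_view S, std::size_t &Pos, Pred P) {
  std::size_t Start = Pos;
  while (Pos < S.size() && P(static_cast<unsigned char>(S[Pos])))
    ++Pos;
  return Pos - Start;
}

// Hypothetical single-pass numeric check (no regex, no backtracking).
inline bool looksNumeric(std::string_view S) {
  auto IsDigit = [](unsigned char C) { return std::isdigit(C) != 0; };
  auto IsHex = [](unsigned char C) { return std::isxdigit(C) != 0; };
  auto IsOct = [](unsigned char C) { return C >= '0' && C <= '7'; };

  std::size_t Pos = 0;
  // Prefixed integers: "0x" or "0o" followed by at least one valid digit.
  if (S.size() > 2 && S[0] == '0' && (S[1] == 'x' || S[1] == 'o')) {
    Pos = 2;
    if (S[1] == 'x')
      skipWhile(S, Pos, IsHex);
    else
      skipWhile(S, Pos, IsOct);
    return Pos == S.size();
  }

  // Optional sign, integer digits, optional fraction.
  if (Pos < S.size() && (S[Pos] == '+' || S[Pos] == '-'))
    ++Pos;
  std::size_t IntDigits = skipWhile(S, Pos, IsDigit);
  std::size_t FracDigits = 0;
  if (Pos < S.size() && S[Pos] == '.') {
    ++Pos;
    FracDigits = skipWhile(S, Pos, IsDigit);
  }
  if (IntDigits + FracDigits == 0)
    return false; // "", "+", "." and similar are not numbers.

  // Optional exponent, which needs at least one digit.
  if (Pos < S.size() && (S[Pos] == 'e' || S[Pos] == 'E')) {
    ++Pos;
    if (Pos < S.size() && (S[Pos] == '+' || S[Pos] == '-'))
      ++Pos;
    if (skipWhile(S, Pos, IsDigit) == 0)
      return false;
  }
  return Pos == S.size(); // Every character must have been consumed.
}
```

Each character is visited at most once and no pattern is compiled per call, which are the two costs the commit message attributes to the regex-based version.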

This performance bottleneck was identified while profiling Clangd's
global-symbol-builder tool with my colleague @ilya-biryukov. A
substantial part of the runtime was spent during a single-threaded Reduce
phase, which concludes with YAML serialization of the collected
symbols. Regex matching accounted for approximately 45% of the
whole runtime (which involves a sharded Map phase); now it is reduced to
18% (which is spent in `clang::clangd::CanonicalIncludes` and can
also be optimized because all used regexes are in fact either suffix
matches or exact matches).

`llvm-yaml-numeric-parser-fuzzer` was used to ensure the validity of the
proposed regex replacement. Fuzzing for ~60 hours using 10 threads did
not expose any bugs.

Benchmarking the `global-symbol-builder` tool (using `hyperfine --warmup 2
--min-runs 5 'command 1' 'command 2'`) on a reasonable
amount of code (26 source files matched by
`clang-tools-extra/clangd/*.cpp` with all transitive includes) confirmed
our understanding of the nature of the performance bottleneck, as the patch
speeds up the command by a factor of 1.6x:

| Command | Mean [s] | Min…Max [s] |
|---|---|---|
| this patch (D50839) | 84.7 ± 0.6 | 83.3…84.7 |
| master (rL339849) | 133.1 ± 0.8 | 132.4…134.6 |

Using smaller samples (e.g. by collecting symbols from
`clang-tools-extra/clangd/AST.cpp` only) yields an even better
improvement: the run is 2.05x faster. This is expected because the Map
phase takes less time relative to Reduce, so the change would significantly
improve the performance of standalone YAML serialization.

| Command | Mean [ms] | Min…Max [ms] |
|---|---|---|
| this patch (D50839) | 3702.2 ± 48.7 | 3635.1…3752.3 |
| master (rL339849) | 7607.6 ± 109.5 | 7533.3…7796.4 |

Reviewed by: zturner, ilya-biryukov

Differential revision: https://reviews.llvm.org/D50839

llvm-svn: 340154

5f26a642

[SimplifyCFG] Replace some uses of bitwise or with logical or · 6f1740d5
Justin Bogner authored Aug 20, 2018
```
It's clearer to use logical or for boolean values. Thanks to Steven
Zhang for noticing!

llvm-svn: 340153
```
6f1740d5
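
A small sketch of the difference the message alludes to: with logical or, the right operand is skipped once the left one is true, while bitwise or needs both values. `CountingCheck`, `eitherBitwise` and `eitherLogical` are hypothetical names used only for this illustration, not code from SimplifyCFG.

```cpp
#include <cassert>

// Hypothetical predicate that records how many times it was evaluated.
struct CountingCheck {
  int Calls = 0;
  bool operator()(bool Result) {
    ++Calls;
    return Result;
  }
};

// Bitwise or on bools: both operand values must be computed first.
inline bool eitherBitwise(CountingCheck &A, CountingCheck &B) {
  bool Left = A(true);
  bool Right = B(false); // Always evaluated.
  return Left | Right;
}

// Logical or: B is never called because A already returned true.
inline bool eitherLogical(CountingCheck &A, CountingCheck &B) {
  return A(true) || B(false);
}
```

Both functions return the same value; only the number of evaluations of the second check differs.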
[InstCombine] Move some variable declarations into a more appropriate scope. NFC · 24674ca7
Craig Topper authored Aug 20, 2018
```
llvm-svn: 340150
```
24674ca7

[PowerPC] Add a peephole post RA to transform the inst that fed by add · f8f9af7b

QingShan Zhang authored Aug 20, 2018

On P8, we select XFLOAD to load a floating-point value and then expand it post-RA to a VSX or non-VSX X-form instruction. This patch converts the X-form to D-form when one operand of the X-form instruction is the special Zero register and the other operand is fed by an add instruction, i.e.:
y = add imm, reg
LFDX. 0, y
-->
LFD imm(reg)

Reviewers: Nemanjai
Differential Revision: https://reviews.llvm.org/D49007

llvm-svn: 340149

f8f9af7b

[bindings/go] Add coroutine passes · fdca0c6d

whitequark authored Aug 19, 2018

Add Go bindings for CoroEarly, CoroSplit, CoroElide and CoroCleanup.

Differential Revision: https://reviews.llvm.org/D50951

llvm-svn: 340148

fdca0c6d

[LLVM-C] Add coroutine passes · c438ac23
whitequark authored Aug 19, 2018
```
Differential Revision: https://reviews.llvm.org/D50950

llvm-svn: 340147
```
c438ac23

[C-API][DIBuilder] Added DIFlags in LLVMDIBuilderCreateBasicType · b56a4d31

whitequark authored Aug 19, 2018

Added DIFlags in LLVMDIBuilderCreateBasicType to add optional DWARF
attributes, such as DW_AT_endianity.

Patch by Chirag Patel.

Differential Revision: https://reviews.llvm.org/D50832

llvm-svn: 340146

b56a4d31

Aug 19, 2018

[InstCombine] Add test cases for an icmp combine that is missing support for splat vector constants. · 5f695cc1
Craig Topper authored Aug 19, 2018
```
llvm-svn: 340144
```
5f695cc1

[SelectionDAG] Add basic demanded elements support to ComputeNumSignBits for BITCAST nodes · 5b936ec8

Simon Pilgrim authored Aug 19, 2018

Only adds support to the existing 'large element' scalar/vector to 'small element' vector bitcasts.

The next step would be to support cases where the large elements aren't all sign bits, and determine the small element equivalent based on the demanded elements.

llvm-svn: 340143

5b936ec8

[X86][SSE] Add PACKSS test showing ComputeNumSignBits failure to handle demanded elts through a bitcast · 0fd72ab4
Simon Pilgrim authored Aug 19, 2018
```
llvm-svn: 340139
```
0fd72ab4

[X86] Fix an issue in the matching for ADDUS. · 803912ea

Craig Topper authored Aug 19, 2018

We were basically assuming only one operand of the compare could be an ADD node and using that to swap operands. But we can have a normal add followed by a saturating add.

This rewrites the canonicalization to just be based on the condition code.
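
For context, an unsigned saturating add is commonly written as "add, compare, select". The scalar sketch below shows that idiom with the compare in both operand orders, plus a plain add feeding a saturating add, where both compare operands come from an add. The function names are hypothetical, and this is only an approximation of the input shapes, not the DAG patterns in the X86 backend.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Unsigned saturating add: if the wrapped sum is smaller than an operand,
// the add overflowed, so clamp to the maximum value.
inline uint16_t addSatUlt(uint16_t A, uint16_t B) {
  uint16_t Sum = static_cast<uint16_t>(A + B); // Wraps modulo 2^16.
  return Sum < A ? std::numeric_limits<uint16_t>::max() : Sum;
}

// Same operation with the compare operands swapped and the condition
// reversed (A > Sum instead of Sum < A).
inline uint16_t addSatUgt(uint16_t A, uint16_t B) {
  uint16_t Sum = static_cast<uint16_t>(A + B);
  return A > Sum ? std::numeric_limits<uint16_t>::max() : Sum;
}

// A normal (wrapping) add followed by a saturating add: the compare inside
// addSatUlt now sees an add result on both sides.
inline uint16_t addThenAddSat(uint16_t A, uint16_t B, uint16_t C) {
  uint16_t Inner = static_cast<uint16_t>(A + B);
  return addSatUlt(Inner, C);
}
```

Deciding which operand is the sum from the condition code, rather than from which operand happens to be an add, treats `addSatUlt` and `addSatUgt` the same way and stays unambiguous in the `addThenAddSat` case.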

llvm-svn: 340134

803912ea

[X86] Add a test case showing an issue in our addusw pattern matching. · a85d7e92

Craig Topper authored Aug 19, 2018

We are unable to handle a normal add followed by a saturating add with certain operand orders on the icmp.

llvm-svn: 340133

a85d7e92

Aug 18, 2018

Updating MergeFunctions.rst · 6373f5dd

Aditya Kumar authored Aug 18, 2018

Improving readability, removing redundant contents.

Reviewers: hiraditya
Differential Revision: https://reviews.llvm.org/D50686

llvm-svn: 340131

6373f5dd

[X86] Use SDValue::operator== instead of DAG.isEqualTo in strictly integer matching. · 2b03df9b
Craig Topper authored Aug 18, 2018
```
isEqualTo is more useful for floating point. operator== is sufficient for integer.

llvm-svn: 340130
```
2b03df9b
[X86] Simplify the PADDUS legality check in combineSelect to match PSUBUS. NFC · 3e299d89
Craig Topper authored Aug 18, 2018
```
While there, remove some trailing whitespace.

llvm-svn: 340129
```
3e299d89

[X86] Add support for using 512-bit PSUBUS to combineSelect. · 40c9559b

Craig Topper authored Aug 18, 2018

The code already supports 128 and 256 and even knows to split 256 for AVX1. So we really just needed to stop looking for specific VTs and subtarget features and just look for legal VTs with i8/i16 elements.

While there, add some curly braces around outer if statement bodies that contain only another if. It makes all the closing curly braces look more regular.

llvm-svn: 340128

40c9559b

[X86] Add test cases to show missed opportunities to use 512-bit PSUBUS. · b40a1d5f
Craig Topper authored Aug 18, 2018
```
llvm-svn: 340127
```
b40a1d5f

[MS Demangler] Resolve backreferences eagerly, not lazily. · d9e925fc

Zachary Turner authored Aug 18, 2018

A while back I submitted a patch to resolve backreferences
lazily, thinking that it was not always possible to know
in advance what type you were looking at until you had completed
a full pass over the input, and therefore it would be impossible
to resolve backreferences eagerly.

This was mistaken though, and turned out to be an unrelated
problem.  In fact, the reverse is true.  You *must* resolve
backreferences eagerly.  This is because certain types of nested
mangled symbols do not share a backreference context with their
parent symbol, and as such, if you try to resolve them lazily
their backreference context will have been lost by the time you
finish demangling the entire input.  On the other hand, resolving
them eagerly appears to always work, and enables us to port
many more tests over.

llvm-svn: 340126

d9e925fc

[RuntimeDyld] Fix a bug in RuntimeDyld::loadObjectImpl that was over-allocating · 8e296229

Lang Hames authored Aug 18, 2018

space for common symbols.

Patch by Dmitry Sidorov. Thanks Dmitry!

Differential revision: https://reviews.llvm.org/D50240

llvm-svn: 340125

8e296229

[X86] Replace all single match schedule class instregexs with instrs entries · 9c1761a6
Simon Pilgrim authored Aug 18, 2018
```
Helps reduce cost of instrw collection

llvm-svn: 340124
```
9c1761a6
[X86] Merge shift/rotate schedule class instregexs · ebfd6ebb
Simon Pilgrim authored Aug 18, 2018
```
Helps reduce cost of instrw collection

llvm-svn: 340123
```
ebfd6ebb

[DebugInfo] In FastISel, convert llvm.dbg.label to DBG_LABEL MI. · 68c706ce

Hsiangkai Wang authored Aug 18, 2018

Convert llvm.dbg.label(!label_metadata) to DBG_LABEL !label_metadata.

Differential Revision: https://reviews.llvm.org/D50622

llvm-svn: 340122

68c706ce

[X86] Add a signed test case for PR38622. Use nounwind to reduce the output on the unsigned test case. · 911efbb9
Craig Topper authored Aug 18, 2018
```
llvm-svn: 340121
```
911efbb9

[DAGCombiner] Allow divide by constant optimization on opaque constants. · cc5dbbf7

Craig Topper authored Aug 18, 2018

Summary:
I believe this restores the behavior we had before r339147.

Fixes PR38622.

Reviewers: RKSimon, chandlerc, spatel

Reviewed By: chandlerc

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D50936

llvm-svn: 340120

cc5dbbf7

Add the extended XMM registers mappings for AVX-512. · bc94ae43
Zachary Turner authored Aug 18, 2018
```
After this we should have the entire AVX-512 register set
mapping in place.

llvm-svn: 340118
```
bc94ae43
[ORC] Fix some parameter names. NFC. · 6ac2be0a
Lang Hames authored Aug 18, 2018
```
llvm-svn: 340116
```
6ac2be0a

[ORC] Rename 'finalize' to 'emit' to avoid potential confusion. · 76e21c97

Lang Hames authored Aug 18, 2018

An emitted symbol has had its contents written and its memory protections
applied, but it is not automatically ready to execute.

Prior to ORC supporting concurrent compilation, the term "finalized" could be
interpreted in two different (but effectively equivalent) ways: (1) The finalized
symbol's contents have been written and its memory protections applied, and (2)
the symbol is ready to run. Now that ORC supports concurrent compilation, sense
(1) no longer implies sense (2). We have already introduced a new term, 'ready',
to capture sense (2), so rename sense (1) to 'emitted' to avoid any lingering
confusion.

llvm-svn: 340115

76e21c97