Commits · 432b88e5f45aa69f63d73a7e86a0b330d955e98c · Lorenzo Albano / LLVM bpEVL

Sep 20, 2017

CodeGen: support SwiftError SwiftCC on Windows x64 · 432b88e5

Saleem Abdulrasool authored Sep 20, 2017

Add support for passing SwiftError through a register on the Windows x64
calling convention.  This allows the use of swifterror attributes on
parameters which is used by the swift front end for the `Error`
parameter.  This partially enables building the swift standard library
for Windows x86_64.

llvm-svn: 313791

432b88e5

Re-land "[DebugInfo] Insert DW_OP_deref when spilling indirect DBG_VALUEs" · 4e040287

Reid Kleckner authored Sep 20, 2017

After r313775, it's easier to maintain a parallel BitVector of spilled
locations indexed by location number.

I wasn't able to build a good reduced test case for this iteration of
the bug, but I added a more direct assertion that spilled values must
use frame index locations. If this bug reappears, it won't only fire on
the NEON vector code that we detected it on, but on medium-sized
integer-only programs as well.

llvm-svn: 313786

4e040287

[DebugInfo] Use a MapVector to coalesce MachineOperand locations · 92687d45

Reid Kleckner authored Sep 20, 2017

Summary:
The new code should be linear in the number of DBG_VALUEs, while the old
code was quadratic. NFC intended.

This is also hopefully a more direct expression of the problem, which is
to:

1. Rewrite all virtual register operands to stack slots or physical
   registers
2. Uniquely number those machine operands, assigning them location
   numbers
3. Rewrite all uses of the old location numbers in the interval map to
   use the new location numbers

In r313400, I attempted to track which locations were spilled in a
parallel bitvector indexed by location number. My code was broken
because these location numbers are not stable during rewriting.

Reviewers: aprantl, hans

Subscribers: hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D38068

llvm-svn: 313775

92687d45

Recommit [MachineCombiner] Update instruction depths incrementally for large BBs. · ceb44947

Florian Hahn authored Sep 20, 2017

This version of the patch fixes an off-by-one error causing PR34596. We
do not need to use std::next(BlockIter) when calling updateDepths, as
BlockIter already points to the next element.

Original commit message:
> For large basic blocks with lots of combinable instructions, the
> MachineTraceMetrics computations in MachineCombiner can dominate the compile
> time, as computing the trace information is quadratic in the number of
> instructions in a BB and it's relevant successors/predecessors.

> In most cases, knowing the instruction depth should be enough to make
> combination decisions. As we already iterate over all instructions in a basic
> block, the instruction depth can be computed incrementally. This reduces the
> cost of machine-combine drastically in cases where lots of instructions
> are combined. The major drawback is that AFAIK, computing the critical path
> length cannot be done incrementally. Therefore we only compute
> instruction depths incrementally, for basic blocks with more
> instructions than inc_threshold. The -machine-combiner-inc-threshold
> option can be used to set the threshold and allows for easier
> experimenting and checking if using incremental updates for all basic
> blocks has any impact on the performance.
>
> Reviewers: sanjoy, Gerolf, MatzeB, efriedma, fhahn
>
> Reviewed By: fhahn
>
> Subscribers: kiranchandramohan, javed.absar, efriedma, llvm-commits
>
> Differential Revision: https://reviews.llvm.org/D36619

llvm-svn: 313751

ceb44947

[MIRPrinter] Print empty successor lists when they cannot be guessed · d652aeb1

Quentin Colombet authored Sep 19, 2017

This re-applies commit r313685, this time with the proper updates to
the test cases.

Original commit message:
Unreachable blocks in the machine instr representation are these
weird empty blocks with no successors.
The MIR printer used to not print empty lists of successors. However,
the MIR parser now treats non-printed list of successors as "please
guess it for me". As a result, the parser tries to guess the list of
successors and given the block is empty, just assumes it falls through
the next block (if any).

For instance, the following test case used to fail the verifier.
The MIR printer would print

         entry
        /      \
   true (def)   false (no list of successors)
       |
 split.true (use)

The MIR parser would understand this:

         entry
        /      \
   true (def)   false
       |        /  <-- invalid edge
 split.true (use)

Because of the invalid edge, we get the "def does not
dominate all uses" error.

The fix consists in printing empty successor lists, so that the parser
knows what to do for unreachable blocks.

rdar://problem/34022159

llvm-svn: 313696

d652aeb1

CodeGen: use range based for loops (NFC) · 399a4e9b

Saleem Abdulrasool authored Sep 19, 2017

Simplify the RPOT traversal by using a range based for loop for the
iterator dereference.

llvm-svn: 313687

399a4e9b

Revert "[MIRPrinter] Print empty successor lists when they cannot be guessed" · 6888dbcd

Quentin Colombet authored Sep 19, 2017

This reverts commit r313685.

I thought I had ran ninja check, but apparently I didn't...
Need to update a bunch of mir tests.

llvm-svn: 313686

6888dbcd

Sep 19, 2017

[MIRPrinter] Print empty successor lists when they cannot be guessed · 7fdaa5e6

Quentin Colombet authored Sep 19, 2017

Unreachable blocks in the machine instr representation are these
weird empty blocks with no successors.
The MIR printer used to not print empty lists of successors. However,
the MIR parser now treats non-printed list of successors as "please
guess it for me". As a result, the parser tries to guess the list of
successors and given the block is empty, just assumes it falls through
the next block (if any).

For instance, the following test case used to fail the verifier.
The MIR printer would print
          entry
         /      \
    true (def)   false (no list of successors)
        |
  split.true (use)

The MIR parser would understand this:
          entry
         /      \
    true (def)   false
        |        /  <-- invalid edge
  split.true (use)

Because of the invalid edge, we get the "def does not
dominate all uses" error.

The fix consists in printing empty successor lists, so that the parser
knows what to do for unreachable blocks.

rdar://problem/34022159

llvm-svn: 313685

7fdaa5e6

Revert "[DebugInfo] Insert DW_OP_deref when spilling indirect DBG_VALUEs" · ffdf0874

Reid Kleckner authored Sep 19, 2017

This reverts r313640, originally r313400, one more time for essentially
the same issue. My BitVector of spilled location numbers isn't working
because we coalesce identical DBG_VALUE locations as we rewrite them,
invalidating the location numbers used to index the BitVector.

llvm-svn: 313679

ffdf0874

Re-land "Fix Bug 30978 by emitting cv file checksums." · 26fa1bf4

Reid Kleckner authored Sep 19, 2017

This reverts r313431 and brings back r313374 with a fix to write
checksums as binary data and not ASCII hex strings.

llvm-svn: 313657

26fa1bf4

Re-land r313400 "[DebugInfo] Insert DW_OP_deref when spilling indirect DBG_VALUEs" · 86c74dd7

Reid Kleckner authored Sep 19, 2017

I forgot to zero out the BitVector when reusing it between UserValues.

Later uses of the same location number for a different UserValue would
falsely indicate that they were spilled. Usually this would lead to
incorrect debug info, but in some cases they would indicate something
nonsensical like a memory location based on a vector register (Q8 on
ARM).

llvm-svn: 313640

86c74dd7

Revert r313400 "[DebugInfo] Insert DW_OP_deref when spilling indirect DBG_VALUEs" · 4de620ab

Hans Wennborg authored Sep 18, 2017

This caused asserts in Chromium. See http://crbug.com/766261

> Summary:
> This comes up in optimized debug info for C++ programs that pass and
> return objects indirectly by address. In these programs,
> llvm.dbg.declare survives optimization, which causes us to emit indirect
> DBG_VALUE instructions. The fast register allocator knows to insert
> DW_OP_deref when spilling indirect DBG_VALUE instructions, but the
> LiveDebugVariables did not until this change.
>
> This fixes part of PR34513. I need to look into why this doesn't work at
> -O0 and I'll send follow up patches to handle that.
>
> Reviewers: aprantl, dblaikie, probinson
>
> Subscribers: qcolombet, hiraditya, llvm-commits
>
> Differential Revision: https://reviews.llvm.org/D37911

llvm-svn: 313589

4de620ab

[DAGCombiner] fold assertzexts separated by trunc · f31b1a00

Sanjay Patel authored Sep 18, 2017

If we have an AssertZext of a truncated value that has already been AssertZext'ed, 
we can assert on the wider source op to improve the zext-y knowledge:
 assert (trunc (assert X, i8) to iN), i1 --> trunc (assert X, i1) to iN

This moves a fold from being Mips-specific to general combining, and x86 shows
improvements.

Differential Revision: https://reviews.llvm.org/D37017

llvm-svn: 313577

f31b1a00

Sep 18, 2017

[DAG, x86] allow store merging before and after legalization (PR34217) · 7765c93b

Sanjay Patel authored Sep 18, 2017

rL310710 allowed store merging to occur after legalization to catch stores that are created late,
but this exposes a logic hole seen in PR34217:
https://bugs.llvm.org/show_bug.cgi?id=34217

We will miss merging stores if the target lowers vector extracts into target-specific operations.
This patch allows store merging to occur both before and after legalization if the target chooses
to get maximum merging.

I don't think the potential regressions in the other tests are relevant. The tests are for
correctness of weird IR constructs rather than perf tests, and I think those are still correct.

Differential Revision: https://reviews.llvm.org/D37987

llvm-svn: 313564

7765c93b

[GlobalISel] Only build expensive remarks if they're enabled. NFC. · d630a92b

Ahmed Bougacha authored Sep 18, 2017

r313390 taught 'allowExtraAnalysis' to check whether remarks are
enabled at all.  Use that to only do the expensive instruction printing
if they are.

llvm-svn: 313552

d630a92b

[SelectionDAG] Add BITCAST handling to ComputeNumSignBits for splatted sign bits. · 0b21ef1f

Simon Pilgrim authored Sep 18, 2017

For cases where we are BITCASTing to vectors of smaller elements, then if the entire source was a splatted sign (src's NumSignBits == SrcBitWidth) we can say that the dst's NumSignBit == DstBitWidth, as we're just splitting those sign bits across multiple elements.

We could generalize this but at the moment the only use case I have is to peek through bitcasts to vector comparison results.

Differential Revision: https://reviews.llvm.org/D37849

llvm-svn: 313543

0b21ef1f

Sep 16, 2017

Revert "Fix Bug 30978 by emitting cv file checksums." · 913213c8

Eric Beckmann authored Sep 16, 2017

This reverts commit 6389e7aa724ea7671d096f4770f016c3d86b0d54.

There is a bug in this implementation where the string value of the
checksum is outputted, instead of the actual hex bytes.  Therefore the
checksum is incorrect, and this prevent pdbs from being loaded by visual
studio.  Revert this until the checksum is emitted correctly.

llvm-svn: 313431

913213c8

Name the sentinel value used for the location number of the undefined register NFC · eed09732
Reid Kleckner authored Sep 15, 2017
```
llvm-svn: 313405
```
eed09732

Sep 15, 2017

[DebugInfo] Insert DW_OP_deref when spilling indirect DBG_VALUEs · 3a66c1cb

Reid Kleckner authored Sep 15, 2017

Summary:
This comes up in optimized debug info for C++ programs that pass and
return objects indirectly by address. In these programs,
llvm.dbg.declare survives optimization, which causes us to emit indirect
DBG_VALUE instructions. The fast register allocator knows to insert
DW_OP_deref when spilling indirect DBG_VALUE instructions, but the
LiveDebugVariables did not until this change.

This fixes part of PR34513. I need to look into why this doesn't work at
-O0 and I'll send follow up patches to handle that.

Reviewers: aprantl, dblaikie, probinson

Subscribers: qcolombet, hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D37911

llvm-svn: 313400

3a66c1cb

[DebugInfo] Add missing DW_OP_deref when an NRVO pointer is spilled · 9e6c309e

Reid Kleckner authored Sep 15, 2017

Summary:
Fixes PR34513.

Indirect DBG_VALUEs typically come from dbg.declares of non-trivially
copyable C++ objects that must be passed by address. We were already
handling the case where the virtual register gets allocated to a
physical register and is later spilled. That's what usually happens for
normal parameters that aren't NRVO variables: they usually appear in
physical register parameters, and are spilled later in the function,
which would correctly add deref.

NRVO variables are different because the dbg.declare can come much later
after earlier instructions cause the incoming virtual register to be
spilled.

Also, clean up this code. We only need to look at the first operand of a
DBG_VALUE, which eliminates the operand loop.

Reviewers: aprantl, dblaikie, probinson

Subscribers: MatzeB, qcolombet, llvm-commits, hiraditya

Differential Revision: https://reviews.llvm.org/D37929

llvm-svn: 313399

9e6c309e

[WebAssembly] MC: Create wasm data segments based on MCSections · 759631c7

Sam Clegg authored Sep 15, 2017

This means that we can honor -fdata-sections rather than
always creating a segment for each symbol.

It also allows for a followup change to add .init_array and friends.

Differential Revision: https://reviews.llvm.org/D37876

llvm-svn: 313395

759631c7

Change encodeU/SLEB128 to pad to certain number of bytes · 66a99e41

Sam Clegg authored Sep 15, 2017

Previously the 'Padding' argument was the number of padding
bytes to add. However most callers that use 'Padding' know
how many overall bytes they need to write.  With the previous
code this would mean encoding the LEB once to find out how
many bytes it would occupy and then using this to calulate
the 'Padding' value.

See: https://reviews.llvm.org/D36595

Differential Revision: https://reviews.llvm.org/D37494

llvm-svn: 313393

66a99e41

Revert r313343 "[X86] PR32755 : Improvement in CodeGen instruction selection for LEAs." · 534bfbd3

Hans Wennborg authored Sep 15, 2017

This caused PR34629: asserts firing when building Chromium. It also broke some
buildbots building test-suite as reported on the commit thread.

> Summary:
>    1/  Operand folding during complex pattern matching for LEAs has been
>        extended, such that it promotes Scale to accommodate similar operand
>        appearing in the DAG.
>        e.g.
>           T1 = A + B
>           T2 = T1 + 10
>           T3 = T2 + A
>        For above DAG rooted at T3, X86AddressMode will no look like
>           Base = B , Index = A , Scale = 2 , Disp = 10
>
>    2/  During OptimizeLEAPass down the pipeline factorization is now performed over LEAs
>        so that if there is an opportunity then complex LEAs (having 3 operands)
>        could be factored out.
>        e.g.
>           leal 1(%rax,%rcx,1), %rdx
>           leal 1(%rax,%rcx,2), %rcx
>        will be factored as following
>           leal 1(%rax,%rcx,1), %rdx
>           leal (%rdx,%rcx)   , %edx
>
>    3/ Aggressive operand folding for AM based selection for LEAs is sensitive to loops,
>       thus avoiding creation of any complex LEAs within a loop.
>
> Reviewers: lsaba, RKSimon, craig.topper, qcolombet
>
> Reviewed By: lsaba
>
> Subscribers: spatel, igorb, llvm-commits
>
> Differential Revision: https://reviews.llvm.org/D35014

llvm-svn: 313376

534bfbd3

Fix Bug 30978 by emitting cv file checksums. · 349746f0

Eric Beckmann authored Sep 15, 2017

Summary:
The checksums had already been placed in the IR, this patch allows
MCCodeView to actually write it out to an MCStreamer.

Subscribers: llvm-commits, hiraditya

Differential Revision: https://reviews.llvm.org/D37157

llvm-svn: 313374

349746f0

Recommit "[RegAlloc] Make sure live-ranges reflect the state of the IR when · 6188f326

Jonas Paulsson authored Sep 15, 2017

         removing them"

This was temporarily reverted, but now that the fix has been commited (r313197)
it should be put back in place.

https://bugs.llvm.org/show_bug.cgi?id=34502

This reverts commit 9ef93d9dc4c51568e858cf8203cd2c5ce8dca796.

llvm-svn: 313349

6188f326

[X86] PR32755 : Improvement in CodeGen instruction selection for LEAs. · 908c8b37

Jatin Bhateja authored Sep 15, 2017

Summary:
   1/  Operand folding during complex pattern matching for LEAs has been
       extended, such that it promotes Scale to accommodate similar operand
       appearing in the DAG.
       e.g.
          T1 = A + B
          T2 = T1 + 10
          T3 = T2 + A
       For above DAG rooted at T3, X86AddressMode will no look like
          Base = B , Index = A , Scale = 2 , Disp = 10

   2/  During OptimizeLEAPass down the pipeline factorization is now performed over LEAs
       so that if there is an opportunity then complex LEAs (having 3 operands)
       could be factored out.
       e.g.
          leal 1(%rax,%rcx,1), %rdx
          leal 1(%rax,%rcx,2), %rcx
       will be factored as following
          leal 1(%rax,%rcx,1), %rdx
          leal (%rdx,%rcx)   , %edx

   3/ Aggressive operand folding for AM based selection for LEAs is sensitive to loops,
      thus avoiding creation of any complex LEAs within a loop.

Reviewers: lsaba, RKSimon, craig.topper, qcolombet

Reviewed By: lsaba

Subscribers: spatel, igorb, llvm-commits

Differential Revision: https://reviews.llvm.org/D35014

llvm-svn: 313343

908c8b37

[codeview] Use a type index of zero for static method "this" types · 87288b98
Reid Kleckner authored Sep 15, 2017
```
Otherwise VS won't show anything in the autos or watch window of static
methods.

llvm-svn: 313329
```
87288b98

Sep 14, 2017

Add AddresSpace to PseudoSourceValue. · 312ccf76
Jan Sjodin authored Sep 14, 2017
```
Differential Revision: https://reviews.llvm.org/D35089

llvm-svn: 313297
```
312ccf76

Remove usages of deprecated std::unary_function and std::binary_function. · 591aac7c

Benjamin Kramer authored Sep 14, 2017

These are removed in C++17. We still have some users of
unary_function::argument_type, so just spell that typedef out. No
functionality change intended.

Note that many of the argument types are actually wrong :)

llvm-svn: 313287

591aac7c

TableGen support for parameterized register class information · 779d98e1

Krzysztof Parzyszek authored Sep 14, 2017

This replaces TableGen's type inference to operate on parameterized
types instead of MVTs, and as a consequence, some interfaces have
changed:
- Uses of MVTs are replaced by ValueTypeByHwMode.
- EEVT::TypeSet is replaced by TypeSetByHwMode.

This affects the way that types and type sets are printed, and the
tests relying on that have been updated.

There are certain users of the inferred types outside of TableGen
itself, namely FastISel and GlobalISel. For those users, the way
that the types are accessed have changed. For typical scenarios,
these replacements can be used:
- TreePatternNode::getType(ResNo) -> getSimpleType(ResNo)
- TreePatternNode::hasTypeSet(ResNo) -> hasConcreteType(ResNo)
- TypeSet::isConcrete -> TypeSetByHwMode::isValueTypeByHwMode(false)

For more information, please refer to the review page.

Differential Revision: https://reviews.llvm.org/D31951

llvm-svn: 313271

779d98e1

[IfConversion] More simple, correct dead/kill liveness handling · 6ca02b25
Krzysztof Parzyszek authored Sep 14, 2017
```
Patch by Jesper Antonsson.

Differential Revision: https://reviews.llvm.org/D37611

llvm-svn: 313268
```
6ca02b25

[DAGCombine] (shl (or x, c1), c2) -> (or (shl x, c2), c1 << c2) · 8bd2d878

Simon Pilgrim authored Sep 14, 2017

We already have a combine for this pattern when the input to shl is add, so we just need to enable the transformation when the input is or.

Original patch by @tstellar

Differential Revision: https://reviews.llvm.org/D19325

llvm-svn: 313251

8bd2d878

[SelectionDAG] ComputeNumSignBits - cleanup ROTL/ROTR wrapping to match DAGCombine etc. · 523483e0

Simon Pilgrim authored Sep 14, 2017

Use RotAmt.urem(VTBits) instead of AND(RotAmt, VTBits - 1)

TBH I don't expect non-power-of-2 types to be created, but it makes the logic clearer and matches what we do in other rotation combines.

llvm-svn: 313245

523483e0

[XRay][CodeGen] Use the current function symbol as the associated symbol for... · 01fd7c8b

Dean Michael Berris authored Sep 14, 2017

[XRay][CodeGen] Use the current function symbol as the associated symbol for the instrumentation map

Summary:
XRay had been assuming that the previous section is the "text" section
of the function when lowering the instrumentation map. Unfortunately
this is not a safe assumption, because we may be coming from lowering
debug type information for the function being lowered.

This fixes an issue with combining -gsplit-dwarf, -generate-type-units,
-debug-compile and -fxray-instrument for sole member functions. When the
split dwarf section is stripped, we're left with references from the
xray_instr_map to the debug section. The change now uses the function's
symbol instead of the previous section's start symbol.

We found the bug while attempting to strip the split debug sections off
an XRay-instrumented object file, which had a peculiar edge-case for
single-function classes where the single function is being lowered.
Because XRay had assocaited the instrumentation map for a function to
the debug types section instead of the function's section, the objcopy
call will fail due to the misplaced reference from the xray_instr_map
section.

Reviewers: pcc, dblaikie, echristo

Subscribers: llvm-commits, aprantl

Differential Revision: https://reviews.llvm.org/D37791

llvm-svn: 313233

01fd7c8b

[codeview] Fold FIXME into comment, there's nothing to do. NFC · cd7bba02
Reid Kleckner authored Sep 13, 2017
```
llvm-svn: 313214
```
cd7bba02

Revert r312719 "[MachineCombiner] Update instruction depths incrementally for large BBs." · 06e2a384

Hans Wennborg authored Sep 13, 2017

This caused PR34596.

> [MachineCombiner] Update instruction depths incrementally for large BBs.
>
> Summary:
> For large basic blocks with lots of combinable instructions, the
> MachineTraceMetrics computations in MachineCombiner can dominate the compile
> time, as computing the trace information is quadratic in the number of
> instructions in a BB and it's relevant successors/predecessors.
>
> In most cases, knowing the instruction depth should be enough to make
> combination decisions. As we already iterate over all instructions in a basic
> block, the instruction depth can be computed incrementally. This reduces the
> cost of machine-combine drastically in cases where lots of instructions
> are combined. The major drawback is that AFAIK, computing the critical path
> length cannot be done incrementally. Therefore we only compute
> instruction depths incrementally, for basic blocks with more
> instructions than inc_threshold. The -machine-combiner-inc-threshold
> option can be used to set the threshold and allows for easier
> experimenting and checking if using incremental updates for all basic
> blocks has any impact on the performance.
>
> Reviewers: sanjoy, Gerolf, MatzeB, efriedma, fhahn
>
> Reviewed By: fhahn
>
> Subscribers: kiranchandramohan, javed.absar, efriedma, llvm-commits
>
> Differential Revision: https://reviews.llvm.org/D36619

llvm-svn: 313213

06e2a384

Allow target to decide when to cluster loads/stores in misched · 7fe9a5d9

Stanislav Mekhanoshin authored Sep 13, 2017

MachineScheduler when clustering loads or stores checks if base
pointers point to the same memory. This check is done through
comparison of base registers of two memory instructions. This
works fine when instructions have separate offset operand. If
they require a full calculated pointer such instructions can
never be clustered according to such logic.

Changed shouldClusterMemOps to accept base registers as well and
let it decide what to do about it.

Differential Revision: https://reviews.llvm.org/D37698

llvm-svn: 313208

7fe9a5d9

Sep 13, 2017

[codeview] VLAs and unsized arrays should use a size of zero · 89af112c

Reid Kleckner authored Sep 13, 2017

Previously we used a size of '1' for VLAs because we weren't sure what
MSVC did. However, MSVC does support declaring an array without a size,
for which it emits an array type with a size of zero. Clang emits the
same DI metadata for VLAs and arrays without bound, so we would describe
arrays without bound as having one element. This lead to Microsoft
debuggers only printing a single element.

Emitting a size of zero appears to cause these debuggers to search the
symbol information to find a definition of the variable with accurate
array bounds.

Fixes http://crbug.com/763580

llvm-svn: 313203

89af112c

[RegAlloc] Keep a copy of live interval for the spilled vregs in HoistSpillHelper. · c0d06646

Wei Mi authored Sep 13, 2017

This is to fix PR34502. After rL311401, the live range of spilled vreg will be
cleared. HoistSpill need to use the live range of the original vreg before splitting
to know the moving range of the spills. The patch saves a copy of live interval for
the spilled vreg inside of HoistSpillHelper.

Differential Revision: https://reviews.llvm.org/D37578

llvm-svn: 313197

c0d06646

[CodeGen] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC). · 618c555b
Eugene Zelenko authored Sep 13, 2017
```
llvm-svn: 313194
```
618c555b