  1. May 30, 2018
  2. May 29, 2018
    • [X86] Use VR128X instead of VR128 in EVEX instruction patterns. · dbd371e9
      Craig Topper authored
      llvm-svn: 333464
    • [X86] Rename the operands in the recently introduced MOVSS+FMA patterns so... · aba57bfe
      Craig Topper authored
      [X86] Rename the operands in the recently introduced MOVSS+FMA patterns so that the operand names in the output pattern are always in 1, 2, 3 order since those are the operand names in the instruction.
      
      The order should be controlled in the input pattern.
      
      llvm-svn: 333463
    • Fix build error introduced in rL333459 · f4f37509
      Sam Clegg authored
      The DEBUG macro was renamed LLVM_DEBUG.
      
      llvm-svn: 333462
    • [LoopInstSimplify] Re-implement the core logic of loop-instsimplify to · 4cbcbb07
      Chandler Carruth authored
      be both simpler and substantially more efficient.
      
      Rather than use a hand-rolled iteration technique that isn't quite the
      same as RPO, use the pre-built RPO loop body traversal utility.
      
      Because we visit the loop body in RPO, we reliably visit defs before
      their uses. Given that, the only reason to iterate is when simplifying
      a def that is used by a PHI node along a back-edge.
      With this patch, the first pass over the loop body is just a complete
      simplification of every instruction across the loop body. When we
      encounter a use of a simplified instruction that stems from a PHI node
      in the loop body that has already been visited (due to some cyclic CFG,
      potentially the loop itself, or a nested loop, or unstructured control
      flow), we recall that specific PHI node for the second iteration.
      Nothing else needs to be preserved from iteration to iteration.
      
      On the second and later iterations, only instructions known to have
      simplified inputs are considered, each time starting from a set of PHIs
      that had simplified inputs along the backedges.
      
      Dead instructions are collected along the way, but deleted in a batch at
      the end of each iteration making the iterations themselves substantially
      simpler. This uses a new batch API for recursively deleting dead
      instructions.
      
      This also changes the routine to visit subloops. Because simplification
      is fundamentally transitive, we may need to visit the entire loop body,
      including subloops, to handle knock-on simplification.
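The iteration scheme described above can be sketched with a toy model: one full RPO pass that simplifies everything, then later passes seeded only by PHIs whose back-edge inputs were simplified. The mini-IR and the constant-folding "simplification" below are invented for illustration; this is not LLVM's actual API.

```python
def simplify_loop(insts, rpo):
    """insts maps a name to ('const', k), ('add', a, b) or ('phi', [ins]).
    rpo lists names in reverse post-order, so defs precede uses except
    along back-edges into PHIs. Returns the names folded to constants."""
    folded = {}

    def try_fold(name):
        node = insts[name]
        if node[0] == 'const':
            folded[name] = node[1]
        elif node[0] == 'add':
            a, b = folded.get(node[1]), folded.get(node[2])
            if a is not None and b is not None:
                folded[name] = a + b
        elif node[0] == 'phi':
            vals = [folded.get(i) for i in node[1]]
            if None not in vals and len(set(vals)) == 1:
                folded[name] = vals[0]
        return name in folded

    def phi_users(name):
        return {n for n, node in insts.items()
                if node[0] == 'phi' and name in node[1]}

    # First pass: simplify every instruction once, in RPO, recording any
    # PHI that uses a simplified def (possibly along a back-edge).
    worklist = set()
    for name in rpo:
        if try_fold(name):
            worklist |= phi_users(name)

    # Later passes start only from the recorded PHIs; nothing else needs
    # to be carried from iteration to iteration.
    while worklist:
        seeds, worklist = worklist, set()
        changed = {p for p in seeds if p not in folded and try_fold(p)}
        for name in rpo:  # propagate to users of the changed values
            node = insts[name]
            if name in folded or node[0] != 'add':
                continue
            if node[1] in changed or node[2] in changed:
                if try_fold(name):
                    changed.add(name)
                    worklist |= phi_users(name)
    return folded
```

Here `p = phi [a, q]` cannot fold on the first pass because `q` is defined later and reaches `p` along a back-edge; only the second pass, seeded by `p`, completes the fold.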
      
      I've added a basic test file that helps demonstrate that all of these
      changes work. It includes both straightforward loops with
      simplifications as well as interesting PHI-structures, CFG-structures,
      and a nested loop case.
      
      Differential Revision: https://reviews.llvm.org/D47407
      
      llvm-svn: 333461
    • [X86] Fix a potential crash that could occur after r333419. · 5439b3d1
      Craig Topper authored
      The code could issue a truncate from a small type to a larger type. We need to extend in that case instead.
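A minimal model of the distinction, with an invented helper name rather than the actual SelectionDAG code: when the destination is wider than the source, the legalizer must extend, not truncate.

```python
def adjust_to_width(value, from_bits, to_bits):
    """Zero-extend or truncate an unsigned value of from_bits to to_bits.
    The bug was issuing a truncate even when to_bits > from_bits; the fix
    takes the extend path in that case instead."""
    value &= (1 << from_bits) - 1            # normalize the input width
    if to_bits < from_bits:
        return value & ((1 << to_bits) - 1)  # truncate to the narrow type
    return value                             # zero-extend: high bits stay 0
```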
      
      llvm-svn: 333460
    • [WebAssembly] Add more error checking to object file parsing · b7c62394
      Sam Clegg authored
      This should address some of the assert failures the fuzzer has been
      finding such as:
        https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=6719
      
      Differential Revision: https://reviews.llvm.org/D47086
      
      llvm-svn: 333459
    • AMDGPU: Fix broken check lines · 4b3829d8
      Matt Arsenault authored
      llvm-svn: 333458
    • AMDGPU: Fix typo in option description · 2e4d338d
      Matt Arsenault authored
      llvm-svn: 333457
    • AMDGPU: Round up kernel argument allocation size · 1ea0402e
      Matt Arsenault authored
      AFAIK the driver's allocation will actually have to round this up
      anyway. It is useful to track the rounded-up size so that the end of
      the kernel argument segment is known to be dereferenceable, allowing
      a wider s_load_dword to be used for a short argument at the end of
      the segment.
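The rounding itself is the standard power-of-two align-up; a sketch (the helper name is invented):

```python
def align_to(size, align):
    """Round size up to a multiple of align (align a power of two)."""
    return (size + align - 1) & ~(align - 1)

# A 2-byte argument at the end of the segment, rounded up to 4 bytes,
# sits in a region known to be dereferenceable for a full 4-byte
# s_load_dword rather than requiring a narrower load.
```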
      
      llvm-svn: 333456
    • [RISCV] Add peepholes for Global Address lowering patterns · 97684419
      Sameer AbuAsal authored
      Summary:
        Base and offset are always separated when a GlobalAddress node is lowered
        (rL332641) as an optimization to reduce instruction count. However, this
        optimization is not profitable if the Global Address ends up being used in
        only one instruction.

        This patch adds peephole optimizations that merge an offset of
        an address calculation into the LUI %hi and ADDI %lo of the lowering sequence.
      
        The peephole handles three patterns:
      
       1) ADDI (ADDI (LUI %hi(global)) %lo(global)), offset
           --->
            ADDI (LUI %hi(global + offset)) %lo(global + offset)

         This generates:

         lui  a0, %hi(global + offset)
         addi a0, a0, %lo(global + offset)

         Instead of:

         lui  a0, %hi(global)
         addi a0, a0, %lo(global)
         addi a0, a0, offset

         This pattern is for cases when the offset is small enough to fit in the
         12-bit signed immediate field of ADDI.

       2) ADD (ADDI (LUI %hi(global)) %lo(global)), (LUI hi_offset)
           --->
            offset = hi_offset << 12
            ADDI (LUI %hi(global + offset)) %lo(global + offset)

         Which generates the ASM:

         lui  a0, %hi(global + offset)
         addi a0, a0, %lo(global + offset)

         Instead of:

         lui  a0, %hi(global)
         addi a0, a0, %lo(global)
         lui  a1, hi_offset
         add  a0, a0, a1

         This pattern is for cases when the offset doesn't fit in the immediate
         field of ADDI but its lower 12 bits are all zero.

       3) ADD (ADDI (LUI %hi(global)) %lo(global)), (ADDI lo_offset, (LUI hi_offset))
           --->
            offset = (hi_offset << 12) + lo_offset
            ADDI (LUI %hi(global + offset)) %lo(global + offset)

         Which generates the ASM:

         lui  a1, %hi(global + offset)
         addi a1, a1, %lo(global + offset)

         Instead of:

         lui  a0, %hi(global)
         addi a0, a0, %lo(global)
         lui  a1, hi_offset
         addi a1, a1, lo_offset
         add  a0, a0, a1

         This pattern is for cases when the offset doesn't fit in the immediate
         field of ADDI and both its low 12 bits and high 20 bits are non-zero.
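All three patterns rely on the same %hi/%lo arithmetic: ADDI sign-extends its 12-bit immediate, so %hi rounds by 0x800 to compensate. A sketch of that arithmetic in an RV32 model (helper names are invented for illustration):

```python
MASK32 = 0xffffffff

def hi20(x):
    # %hi(x): upper 20 bits, rounded so that lui+addi reconstructs x
    return ((x + 0x800) >> 12) & 0xfffff

def lo12(x):
    # %lo(x): low 12 bits of x as a signed (sign-extended) immediate
    v = x & 0xfff
    return v - 0x1000 if v & 0x800 else v

def lui_addi(x):
    """Value produced by: lui a0, %hi(x); addi a0, a0, %lo(x)."""
    return ((hi20(x) << 12) + lo12(x)) & MASK32

def fits_simm12(off):
    # pattern 1 applies only when the offset fits ADDI's immediate field
    return -2048 <= off < 2048
```

Folding the offset into the relocations is value-correct because `lui_addi(g) + off` and `lui_addi(g + off)` agree modulo 2^32; the peepholes are about recognizing the DAG shapes where that fold is profitable.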
      
          Reviewers: asb
      
          Reviewed By: asb
      
          Subscribers: rbar, johnrusso, simoncook, jordy.potman.lists, apazos,
        niosHD, kito-cheng, shiva0217, zzheng, edward-jones, mgrang
      
      llvm-svn: 333455
    • [BasicAA] Teach the analysis about atomic memcpy · 3a6c50f4
      Daniel Neilson authored
      Summary:
      A simple change to derive mod/ref info from the atomic memcpy
      intrinsic in the same way as from the regular memcpy intrinsic.
      
      llvm-svn: 333454
    • Update CodeView register names in a test that was missed in r333421. · 99feb567
      Douglas Yung authored
      llvm-svn: 333453
    • Remove unused DWARFUnit::HasDIEsParsed() · a3ad3c48
      Jan Kratochvil authored
      It was not implemented correctly after https://reviews.llvm.org/D46810,
      but it has not been used anywhere anyway.
      
      llvm-svn: 333452
    • AMDGPU: Always set COMPUTE_PGM_RSRC2.ENABLE_TRAP_HANDLER to zero for AMDHSA as · 2ca6b1f2
      Konstantin Zhuravlyov authored
      it is set by CP
      
      Differential Revision: https://reviews.llvm.org/D47392
      
      llvm-svn: 333451
    • [COFF] Simplify symbol table output section computation · 4e518336
      Shoaib Meenai authored
      Rather than using a loop to compare symbol RVAs to the starting RVAs of
      sections to determine which section a symbol belongs to, just get the
      output section of a symbol directly via its chunk, and bail if the
      symbol doesn't have an output section, which avoids having to hardcode
      logic for handling dead symbols, CodeView symbols, etc. This was
      suggested by Reid Kleckner; thank you.
      
      This also fixes writing out symbol tables in the presence of RVA table
      input sections (e.g. .sxdata and .gfids). Such sections aren't written
      to the output file directly, so their RVA is 0, and the loop would thus
      fail to find an output section for them, resulting in a segfault. Extend
      some existing tests to cover this case.
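The difference between the two strategies can be sketched with a toy model (the data layout and names here are illustrative, not lld's actual classes):

```python
# Output sections as (name, start_rva, size).
SECTIONS = [(".text", 0x1000, 0x200), (".data", 0x2000, 0x100)]

def section_by_rva_scan(rva):
    """Old approach: compare the symbol's RVA against section ranges.
    Symbols in RVA-table sections (.sxdata, .gfids) keep RVA 0 because
    their chunks are never written to the output file, so no range
    matches and the loop has no answer."""
    for name, start, size in SECTIONS:
        if start <= rva < start + size:
            return name
    return None

def section_via_chunk(symbol):
    """New approach: ask the symbol's chunk for its output section
    directly, and bail out cleanly when there is none (dead symbols,
    CodeView symbols, RVA-table sections)."""
    chunk = symbol.get("chunk") or {}
    return chunk.get("output_section")
```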
      
      Fixes PR37584.
      
      Differential Revision: https://reviews.llvm.org/D47391
      
      llvm-svn: 333450
    • Fix compiler unused variable warning in DWARFUnit · 43b40939
      Jan Kratochvil authored
      Alex Langford has reported it from: https://reviews.llvm.org/D46810
      
      llvm-svn: 333449
    • [TableGen] Use explicit constructor for InstMemo · 33b6f9ac
      Florian Hahn authored
      This should fix a few buildbot failures with old
      GCC versions.
      
      llvm-svn: 333448
    • [CodeGen][Darwin] Set the calling-convention of thread-local variable · 1da9dbbc
      Akira Hatanaka authored
      initialization functions to 'cxx_fast_tlscc'.
      
      This fixes a bug where instructions calling initialization functions for
      thread-local static members of c++ template classes were using calling
      convention 'cxx_fast_tlscc' while the called functions weren't annotated
      with the calling convention.
      
      rdar://problem/40447463
      
      Differential Revision: https://reviews.llvm.org/D47354
      
      llvm-svn: 333447