- May 30, 2018
-
-
Lang Hames authored
Previously JITCompileCallbackManager only supported single-threaded code. This patch embeds a VSO (see include/llvm/ExecutionEngine/Orc/Core.h) in the callback manager. The VSO ensures that the compile callback is only executed once and that the resulting address is cached for use by subsequent re-entries. llvm-svn: 333490
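The "execute once, cache the address" behaviour described above can be sketched in isolation. This is only an illustration under stated assumptions: `OnceCompileCallback` and `getOrCompile` are hypothetical names, and a plain mutex stands in for the VSO that the patch actually uses.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <mutex>
#include <utility>

// Hypothetical illustration only; not the JITCompileCallbackManager API.
class OnceCompileCallback {
public:
  using Address = std::uint64_t;

  explicit OnceCompileCallback(std::function<Address()> Fn)
      : Compile(std::move(Fn)) {
    assert(Compile && "compile action must be callable");
  }

  // Runs the compile action on the first call only; later calls (from any
  // thread) return the cached address.
  Address getOrCompile() {
    std::lock_guard<std::mutex> Lock(M);
    if (!Done) {
      Cached = Compile();
      Done = true;
    }
    return Cached;
  }

private:
  std::function<Address()> Compile;
  std::mutex M;
  bool Done = false;
  Address Cached = 0;
};
```

Holding the lock while compiling keeps the sketch short; a real JIT would likely want finer-grained synchronisation so that unrelated callbacks do not serialise.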
-
Richard Smith authored
This helps especially when the collision is for a template specialization, where the template arguments are not available from anywhere else in the diagnostic, and are likely relevant to the problem. llvm-svn: 333489
-
Eric Fiselier authored
I missed adjusting a test under Misc in the last commit. This patch updates that test. llvm-svn: 333488
-
Shiva Chen authored
Resolve fixup_riscv_call in the assembler when linker relaxation is disabled and the function and call site are within the same compile unit. Also add a static_assert after the Infos array declaration to avoid missing any new fixup in MCFixupKindInfo in the future. Differential Revision: https://reviews.llvm.org/D47126 llvm-svn: 333487
-
Richard Trieu authored
llvm-svn: 333486
-
Eric Fiselier authored
Summary: This patch uses the newly added `%sub` diagnostic modifier to clean up repetition in the overload candidate diagnostics. I think this should be good to go. @rsmith: Some of the notes now emit `function template` where they only said `function` previously. It seems OK to me, but I would like your sign-off on it. Reviewers: rsmith, EricWF Reviewed By: EricWF Subscribers: cfe-commits, rsmith Differential Revision: https://reviews.llvm.org/D47101 llvm-svn: 333485
-
Yaxun Liu authored
This patch adds HIP toolchain to support HIP language mode. It includes: Create specific compiler jobs for HIP. Choose specific libraries for HIP. With contribution from Greg Rodgers. Differential Revision: https://reviews.llvm.org/D45212 llvm-svn: 333484
-
Yaxun Liu authored
To support separate compile/link and linking across device IR in different source files, a new HIP action builder is introduced. Basically, it compiles and links host and device code separately, and embeds the fat binary in the host linking stage through a linker script. Differential Revision: https://reviews.llvm.org/D46476 llvm-svn: 333483
-
Richard Smith authored
This is causing miscompiles and "definition with same mangled name as another definition" errors. llvm-svn: 333482
-
Eric Fiselier authored
r333467 updated the symbols exported by libc++.so/dylib by changing the ODR usage of __uncaught_exception/__uncaught_exceptions. This should not be a breaking change. llvm-svn: 333481
-
Peter Collingbourne authored
The comment only made sense a long time ago, when --thinlto-jobs was tied with --lto-partitions. That was changed in r283817, but the test wasn't updated at the same time. This patch does so. llvm-svn: 333480
-
JF Bastien authored
As discussed here: http://lists.llvm.org/pipermail/cfe-dev/2018-May/058116.html The tests fail on clang-5, as well as apple-clang-9. Mark them as such. llvm-svn: 333479
-
Tim Shen authored
See https://reviews.llvm.org/rL303907 for details about the bug. llvm-svn: 333478
-
Diego Caballero authored
Minor replacement. LLVM_ATTRIBUTE_USED was introduced to silence a warning but using #ifndef NDEBUG makes more sense in this case. Reviewers: dblaikie, fhahn, hsaito Reviewed By: dblaikie Differential Revision: https://reviews.llvm.org/D47498 llvm-svn: 333476
-
Craig Topper authored
We only need the extractelt that corresponds to the register we're trying to insert back into. We can't guarantee the others haven't been optimized out depending on how those operands were produced. So instead just look for an FR32/FR64 input and emit a COPY_TO_REGCLASS to VR128 in the output pattern. This matches what we do for ADD/SUB/MUL/DIV. llvm-svn: 333473
-
Shoaib Meenai authored
Peter Collingbourne suggested moving the switch to the top of the function, so that all the code that cares about the output section for a symbol is in the same place. Differential Revision: https://reviews.llvm.org/D47497 llvm-svn: 333472
-
Richard Trieu authored
-Warc-repeated-use-of-weak may trigger a segmentation fault when the Decl being checked is outside of a function scope, leaving the current function info pointer null. This adds a check before using the function info. llvm-svn: 333471
-
Petr Hosek authored
While this value is initialized with the DefaultTargetTriple, it can later be overridden using the -target flag, so TargetTriple is a more accurate name. This change also provides an accessor that can be used from ToolChain implementations. Differential Revision: https://reviews.llvm.org/D47357 llvm-svn: 333468
-
Marshall Clow authored
Fix embarrassing typo in uncaught_exceptions. Update tests to really test this. Thanks to Peter Klotz for calling my attention to this. llvm-svn: 333467
-
Davide Italiano authored
Not strictly necessary, but makes the test more robust in case we end up changing the defaults. <rdar://problem/40622096> llvm-svn: 333466
-
Davide Italiano authored
While I'm here, delete some dead code. <rdar://problem/40622096> llvm-svn: 333465
-
- May 29, 2018
-
-
Craig Topper authored
llvm-svn: 333464
-
Craig Topper authored
[X86] Rename the operands in the recently introduced MOVSS+FMA patterns so that the operand names in the output pattern are always in 1, 2, 3 order since those are the operand names in the instruction. The order should be controlled in the input pattern. llvm-svn: 333463
-
Sam Clegg authored
The DEBUG macro was renamed LLVM_DEBUG. llvm-svn: 333462
-
Chandler Carruth authored
be both simpler and substantially more efficient. Rather than use a hand-rolled iteration technique that isn't quite the same as RPO, use the pre-built RPO loop body traversal utility. Once visiting the loop body in RPO, we can assert that we visit defs before uses reliably. When this is the case, the only need to iterate is when simplifying a def that is used by a PHI node along a back-edge. With this patch, the first pass over the loop body is just a complete simplification of every instruction across the loop body. When we encounter a use of a simplified instruction that stems from a PHI node in the loop body that has already been visited (due to some cyclic CFG, potentially the loop itself, or a nested loop, or unstructured control flow), we recall that specific PHI node for the second iteration. Nothing else needs to be preserved from iteration to iteration. On the second and later iterations, only instructions known to have simplified inputs are considered, each time starting from a set of PHIs that had simplified inputs along the backedges. Dead instructions are collected along the way, but deleted in a batch at the end of each iteration, making the iterations themselves substantially simpler. This uses a new batch API for recursively deleting dead instructions. This also changes the routine to visit subloops. Because simplification is fundamentally transitive, we may need to visit the entire loop body, including subloops, to handle knock-on simplification. I've added a basic test file that helps demonstrate that all of these changes work. It includes both straightforward loops with simplifications as well as interesting PHI-structures, CFG-structures, and a nested loop case. Differential Revision: https://reviews.llvm.org/D47407 llvm-svn: 333461
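The property the message relies on, that reverse post-order (RPO) visits a block only after its predecessors except along back-edges, can be seen in a small standalone sketch. The `Graph` alias and both function names are hypothetical; this is a generic depth-first RPO over an adjacency list, not the LLVM loop-body traversal utility the patch uses.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Adjacency-list CFG: Succs[B] lists the successor blocks of block B.
using Graph = std::vector<std::vector<std::size_t>>;

// Depth-first walk that appends each block after all of its unvisited
// successors have been appended (post-order).
static void postOrder(const Graph &Succs, std::size_t Block,
                      std::vector<bool> &Seen,
                      std::vector<std::size_t> &Out) {
  Seen[Block] = true;
  for (std::size_t S : Succs[Block])
    if (!Seen[S])
      postOrder(Succs, S, Seen, Out);
  Out.push_back(Block);
}

// Reverse post-order starting at Entry. Every block appears after all of
// its predecessors, except for predecessors reached through a back-edge.
std::vector<std::size_t> reversePostOrder(const Graph &Succs,
                                          std::size_t Entry) {
  assert(Entry < Succs.size() && "entry block must exist");
  std::vector<bool> Seen(Succs.size(), false);
  std::vector<std::size_t> Order;
  postOrder(Succs, Entry, Seen, Order);
  std::reverse(Order.begin(), Order.end());
  return Order;
}
```

For a diamond with a back-edge from block 3 to block 0, block 3 is visited last, so only a PHI in block 0 can see a value defined later in the order. That is the single case the message says needs a second iteration.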
-
Craig Topper authored
The code could issue a truncate from a smaller type to a larger type. We need to extend in that case instead. llvm-svn: 333460
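The fix amounts to choosing the conversion from the relative bit widths instead of always truncating. A minimal sketch, with `CastKind` and `chooseCast` as hypothetical names that do not come from the patch:

```cpp
#include <cassert>

enum class CastKind { None, Truncate, Extend };

// Pick the conversion needed to turn a FromBits-wide value into a
// ToBits-wide one. Truncation is only valid when narrowing.
inline CastKind chooseCast(unsigned FromBits, unsigned ToBits) {
  assert(FromBits > 0 && ToBits > 0 && "bit widths must be non-zero");
  if (FromBits == ToBits)
    return CastKind::None;
  return FromBits > ToBits ? CastKind::Truncate : CastKind::Extend;
}
```

Whether the extension should be a sign or zero extension depends on the surrounding code, which the message does not state.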
-
Sam Clegg authored
This should address some of the assert failures the fuzzer has been finding such as: https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=6719 Differential Revision: https://reviews.llvm.org/D47086 llvm-svn: 333459
-
Matt Arsenault authored
llvm-svn: 333458
-
Matt Arsenault authored
llvm-svn: 333457
-
Matt Arsenault authored
AFAIK the driver's allocation will actually have to round this up anyway. It is useful to track the rounded-up size so that the end of the kernel segment is known to be dereferenceable, which allows a wider s_load_dword to be used for a short argument at the end of the segment. llvm-svn: 333456
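The rounding itself is ordinary align-up arithmetic. The sketch below assumes a 4-byte granule, which is a guess suggested by the dword-sized load mentioned above rather than something the message states; `alignUp` is a hypothetical helper, not an LLVM function.

```cpp
#include <cassert>
#include <cstdint>

// Round Size up to the next multiple of Align. Align must be a non-zero
// power of two for the mask trick to be valid.
inline std::uint64_t alignUp(std::uint64_t Size, std::uint64_t Align) {
  assert(Align != 0 && (Align & (Align - 1)) == 0 &&
         "alignment must be a power of two");
  return (Size + Align - 1) & ~(Align - 1);
}
```

With a 2-byte argument ending a 13-byte segment, `alignUp(13, 4)` gives 16, so a 4-byte load covering the last argument stays inside the tracked size.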
-
Sameer AbuAsal authored
Summary: Base and offset are always separated when a GlobalAddress node is lowered (rL332641) as an optimization to reduce instruction count. However, this optimization is not profitable if the global address ends up being used in only one instruction. This patch adds peephole optimizations that merge an offset of an address calculation into the LUI %hi and ADD %lo of the lowering sequence. The peephole handles three patterns:

1) ADDI (ADDI (LUI %hi(global)) %lo(global)), offset ---> ADDI (LUI %hi(global + offset)) %lo(global + offset).
This generates:
  lui a0, hi (global + offset)
  add a0, a0, lo (global + offset)
Instead of:
  lui a0, hi (global)
  addi a0, hi (global)
  addi a0, offset
This pattern is for cases when the offset is small enough to fit in the immediate field of ADDI (less than 12 bits).

2) ADD ((ADDI (LUI %hi(global)) %lo(global)), (LUI hi_offset)) ---> offset = hi_offset << 12 ADDI (LUI %hi(global + offset)) %lo(global + offset)
Which generates the ASM:
  lui a0, hi(global + offset)
  addi a0, lo(global + offset)
Instead of:
  lui a0, hi(global)
  addi a0, lo(global)
  lui a1, (offset)
  add a0, a0, a1
This pattern is for cases when the offset doesn't fit in an immediate field of ADDI but the lower 12 bits are all zeros.

3) ADD ((ADDI (LUI %hi(global)) %lo(global)), (ADDI lo_offset, (LUI hi_offset))) ---> offset = global + offhi20<<12 + offlo12 ADDI (LUI %hi(global + offset)) %lo(global + offset)
Which generates the ASM:
  lui a1, %hi(global + offset)
  addi a1, %lo(global + offset)
Instead of:
  lui a0, hi(global)
  addi a0, lo(global)
  lui a1, (offhi20)
  addi a1, (offlo12)
  add a0, a0, a1
This pattern is for cases when the offset doesn't fit in an immediate field of ADDI and both the lower 12 bits and high 20 bits are non-zero.

Reviewers: asb Reviewed By: asb Subscribers: rbar, johnrusso, simoncook, jordy.potman.lists, apazos, niosHD, kito-cheng, shiva0217, zzheng, edward-jones, mgrang llvm-svn: 333455
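Folding the offset works because the %hi/%lo split can be applied to any 32-bit value, including global + offset. The sketch below shows the usual split for an LUI plus ADDI pair on RV32: the low part is a sign-extended 12-bit immediate, so the high part is rounded by 0x800 to compensate. The names `hi20`, `lo12`, and `combine` are hypothetical helpers for illustration, not functions from the patch.

```cpp
#include <cassert>
#include <cstdint>

// Upper 20 bits as materialised by LUI, rounded so that adding the
// sign-extended low part reconstructs the full 32-bit value.
inline std::uint32_t hi20(std::uint32_t Value) {
  return (Value + 0x800u) >> 12;
}

// Low 12 bits interpreted as the signed ADDI immediate (-2048..2047).
inline std::int32_t lo12(std::uint32_t Value) {
  std::int32_t Low = static_cast<std::int32_t>(Value & 0xFFFu);
  return (Value & 0x800u) ? Low - 0x1000 : Low;
}

// Recombine the two parts the way LUI followed by ADDI would on RV32.
inline std::uint32_t combine(std::uint32_t Hi, std::int32_t Lo) {
  assert(Hi <= 0xFFFFFu && "LUI immediate is 20 bits wide");
  return (Hi << 12) + static_cast<std::uint32_t>(Lo);
}
```

For an address of 0x12345FFF the split is 0x12346 and -1, and `combine` returns the original value. The same holds after an offset is added to the address, which is why two instructions suffice in all three patterns.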
-
Daniel Neilson authored
Summary: A simple change to derive mod/ref info from the atomic memcpy intrinsic in the same way as from the regular memcpy intrinsic. llvm-svn: 333454
-
Douglas Yung authored
llvm-svn: 333453
-
Jan Kratochvil authored
It was not implemented correctly after https://reviews.llvm.org/D46810 but then it has not been used anywhere anyway. llvm-svn: 333452
-
Konstantin Zhuravlyov authored
it is set by CP Differential Revision: https://reviews.llvm.org/D47392 llvm-svn: 333451
-
Shoaib Meenai authored
Rather than using a loop to compare symbol RVAs to the starting RVAs of sections to determine which section a symbol belongs to, just get the output section of a symbol directly via its chunk, and bail if the symbol doesn't have an output section, which avoids having to hardcode logic for handling dead symbols, CodeView symbols, etc. This was suggested by Reid Kleckner; thank you. This also fixes writing out symbol tables in the presence of RVA table input sections (e.g. .sxdata and .gfids). Such sections aren't written to the output file directly, so their RVA is 0, and the loop would thus fail to find an output section for them, resulting in a segfault. Extend some existing tests to cover this case. Fixes PR37584. Differential Revision: https://reviews.llvm.org/D47391 llvm-svn: 333450
-
Jan Kratochvil authored
Alex Langford has reported it from: https://reviews.llvm.org/D46810 llvm-svn: 333449
-
Florian Hahn authored
This should fix a few buildbot failures with old GCC versions. llvm-svn: 333448
-
Akira Hatanaka authored
initialization functions to 'cxx_fast_tlscc'. This fixes a bug where instructions calling initialization functions for thread-local static members of c++ template classes were using calling convention 'cxx_fast_tlscc' while the called functions weren't annotated with the calling convention. rdar://problem/40447463 Differential Revision: https://reviews.llvm.org/D47354 llvm-svn: 333447
-
Craig Topper authored
llvm-svn: 333446
-