- Apr 11, 2018
-
-
David Blaikie authored
These aren't the .def style files used in LLVM that require a macro defined before their inclusion - they're just basic non-modular includes to stamp out command line flag variables. llvm-svn: 329840
-
Daniel Neilson authored
Summary: In preparation for a future commit, this regenerates the test checks for test/Transforms/DeadStoreElimination/OverwriteStoreBegin.ll test/Transforms/DeadStoreElimination/OverwriteStoreEnd.ll llvm-svn: 329839
-
Peter Collingbourne authored
Most importantly, we should not replace slashes with backslashes because that would invalidate the path. Differential Revision: https://reviews.llvm.org/D45473 llvm-svn: 329838
-
Simon Pilgrim authored
Atom is the only x86 target that still uses schedule itineraries, if we can remove this then we can begin the work on removing x86 itineraries. I've also found that it will help with PR36550. I've focussed on matching the existing model as closely as possible (relying on the schedule tests), PR36895 indicated a lot of these were incorrect but we can just as easily fix these after this patch as before. Hopefully we can get llvm-exegesis to help here, There are a few instructions that rely on itinerary scheduling (mainly push/pop/return) of multiple resource stages, but I don't think any of these are show stoppers. There are also a few codegen changes that seem related to the post-ra scheduler acting a little differently, I haven't tracked these down but they don't seem critical. NOTE: I don't have access to any Atom hardware, so this hasn't been tested in the wild. Differential Revision: https://reviews.llvm.org/D45486 llvm-svn: 329837
-
Andrea Di Biagio authored
[llvm-mca] Let the Scheduler notify dispatch stall events caused by the lack of scheduling resources. This patch moves part of the logic that notifies dispatch stall events from the DispatchUnit to the Scheduler. The main goal of this patch is to remove (yet another) dependency between the DispatchUnit and the Scheduler. Before this patch, the DispatchUnit had to know about `Scheduler::Event` and how to classify stalls due to the lack of scheduling resources. This patch removes that knowledge and simplifies the logic in DispatchUnit::checkScheduler. This is another change done in preparation for the work to fix PR36663. No functional change intended. llvm-svn: 329835
-
Simon Pilgrim authored
Pre-commit for D45486, don't rely on itinerary scheduler model to determine latencies for padding, use the generic TargetSchedModel::computeInstrLatency call. Also, replace hard coded (atom specific) 2*uop creation per padding cycle with a version based on the scheduler model's issue width. Differential Revision: https://reviews.llvm.org/D45486 llvm-svn: 329834
-
Artem Belevich authored
Differential Revision: https://reviews.llvm.org/D45061 llvm-svn: 329830
-
Artem Belevich authored
When NVPTX TARGET_BUILTIN specifies sm_XX or ptxYY as required feature, consider those features available if we're compiling for GPU >= sm_XX or have enabled PTX version >= ptxYY. Differential Revision: https://reviews.llvm.org/D45061 llvm-svn: 329829
-
Tim Renouf authored
Summary: This fixes the number of SGPRs and VGPRs in the *_RSRC1 register to allow for registers set up in wave dispatch, even if those registers are not used in the shader. Re-landed after noticing that the buildbot failure from 329808 seemed to be unrelated. Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D45503 Change-Id: I6575f0e0d2a528d1319d0b289f0ebe4510fa5771 llvm-svn: 329826
-
Daniel Neilson authored
Summary: In preparation for a future commit, this regenerates the test checks for test/Transforms/DeadStoreElimination/simple.ll test/Transforms/DeadStoreElimination/memintrinsics.ll llvm-svn: 329824
-
Reid Kleckner authored
This is causing compilation timeouts on code with long sequences of local values and calls (i.e. foo(1); foo(2); foo(3); ...). It turns out that code coverage instrumentation is a great way to create sequences like this, which how our users ran into the issue in practice. Intel has a tool that detects these kinds of non-linear compile time issues, and Andy Kaylor reported it as PR37010. The current sinking code scans the whole basic block once per local value sink, which happens before emitting each call. In theory, local values should only be introduced to be used by instructions between the current flush point and the last flush point, so we should only need to scan those instructions. llvm-svn: 329822
-
Sanjay Patel authored
llvm-svn: 329821
-
Paul Robinson authored
Previously the MD5 option of the .file directive provided the checksum as a quoted hex string; now it's a normal hex number with 0x prefix, same as the .octa directive accepts. Differential Revision: https://reviews.llvm.org/D45459 llvm-svn: 329820
-
Petar Jovanovic authored
Add the minimal support necessary to lower a function that returns the sum of two i32 values. Support argument/return lowering of i32 values through registers only. Add tablegen for regbankselect and instructionselect. Patch by Petar Avramovic. Differential Revision: https://reviews.llvm.org/D44304 llvm-svn: 329819
-
Haicheng Wu authored
llvm-svn: 329818
-
Yaxun Liu authored
Two issues were fixed: runtime has difficulty to allocate memory for an external symbol of a kernel and set the address of the external symbol, therefore make the runtime handle of an enqueued kernel an ordinary global variable. Runtime only needs to store the address of the loaded kernel to the handle and has verified that this approach works. handle the situation where __enqueue_kernel* gets inlined therefore the enqueued kernel may be used through a constant expr instead of an instruction. Differential Revision: https://reviews.llvm.org/D45187 llvm-svn: 329815
-
Andrea Di Biagio authored
It caused a buildbot failure (clang-ppc64le-linux-multistage - build #6424) llvm-svn: 329812
-
Tim Renouf authored
This reverts 329808. That change caused a report of a failure in test/CodeGen/MIR/AMDGPU/mir-canon-multi.mir that I didn't see. I suspect it is an expensive-check-only error. Change-Id: I8133f26f15e7d5ec2b09c687c12cd70e918461b0 llvm-svn: 329811
-
Sander de Smalen authored
Summary: Place parsing of a vector index into a separate function to reduce duplication, since the code is duplicated in both the parsing of a Neon vector register operand and a Neon vector list. This is patch [2/6] in a series to add assembler/disassembler support for SVE's contiguous ST1 (scalar+imm) instructions. Reviewers: fhahn, rengolin, javed.absar, huntergr, SjoerdMeijer, t.p.northover, echristo, evandro Reviewed By: rengolin Subscribers: kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D45428 llvm-svn: 329809
-
Tim Renouf authored
Summary: This fixes the number of SGPRs and VGPRs in the *_RSRC1 register to allow for registers set up in wave dispatch, even if those registers are not used in the shader. Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D45503 Change-Id: I6575f0e0d2a528d1319d0b289f0ebe4510fa5771 llvm-svn: 329808
-
Andrea Di Biagio authored
llvm-svn: 329807
-
Simon Pilgrim authored
Split variable index shuffles from immediate index shuffles WriteFVarShuffle - variable 'in-lane' shuffles (VPERMILPS/VPERMIL2PS etc.) WriteVarShuffle - variable 'in-lane' shuffles (PSHUFB/VPPERM etc.) WriteFVarShuffle256 - variable 'cross-lane' shuffles (VPERMPS etc.) WriteVarShuffle256 - variable 'cross-lane' shuffles (VPERMD etc.) Differential Revision: https://reviews.llvm.org/D45404 llvm-svn: 329806
-
Francis Visoiu Mistrih authored
Forgot to add a test case in the previous commit. llvm-svn: 329805
-
Simon Pilgrim authored
movhps/movlps test are still broken so we can't disable sse2 yet llvm-svn: 329802
-
Dmitry Preobrazhensky authored
See bug 36845: https://bugs.llvm.org/show_bug.cgi?id=36845 Differential Revision: https://reviews.llvm.org/D45443 Reviewers: artem.tamazov, arsenm, timcorringham llvm-svn: 329801
-
Francis Visoiu Mistrih authored
In r329691, we would choose FP even if the offset wouldn't fit, just because the offset is smaller than the one from BP. This made many accesses through FP need to scavenge a register, which resulted in slower and bigger code for no good reason. This patch now always picks the offset that fits first, even if FP is preferred. llvm-svn: 329797
-
Andrea Di Biagio authored
llvm-svn: 329796
-
Andrea Di Biagio authored
Also, removed flag -verbose in favor of flag -retire-stats. llvm-svn: 329794
-
Andrea Di Biagio authored
Added flag -scheduler-stats to print scheduler related statistics. llvm-svn: 329792
-
Artur Gainullin authored
Bitwise 'not' of the min/max could be eliminated in the pattern: %notx = xor i32 %x, -1 %cmp1 = icmp sgt[slt/ugt/ult] i32 %notx, %y %smax = select i1 %cmp1, i32 %notx, i32 %y %res = xor i32 %smax, -1 https://rise4fun.com/Alive/lCN Reviewers: spatel Reviewed by: spatel Subscribers: a.elovikov, llvm-commits Differential Revision: https://reviews.llvm.org/D45317 llvm-svn: 329791
-
Sjoerd Meijer authored
This is a follow up of rL327695 to instruction select more variants of VSELGT and VSELGE, for which it is necessary to custom lower SELECT. More work is required in this area, which will be addressed soon: - more variants need to be regression tested, but this depends on the next point. - first LowerConstantFP need to be adjusted for fp16 values. Differential Revision: https://reviews.llvm.org/D45205 llvm-svn: 329788
-
Clement Courbet authored
llvm-svn: 329783
-
Sander de Smalen authored
Summary: Merged 'tryMatchVectorRegister' (specific to Neon) and 'tryParseSVERegister' into a single 'tryParseVectorRegister' function, and created a generic 'parseVectorKind()' function that returns the #Elements and ElementWidth of a vector suffix. This reduces the duplication of this functionality between two the vector implementations. This is patch [1/6] in a series to add assembler/disassembler support for SVE's contiguous ST1 (scalar+imm) instructions. Reviewers: fhahn, rengolin, javed.absar, huntergr, SjoerdMeijer, t.p.northover, echristo, evandro Reviewed By: fhahn Subscribers: tschuett, llvm-commits, kristof.beyls Differential Revision: https://reviews.llvm.org/D45427 llvm-svn: 329782
-
Clement Courbet authored
Summary: Fixes PR37053. Reviewers: uabelho, gchatelet Subscribers: mgorny, tschuett, llvm-commits Differential Revision: https://reviews.llvm.org/D45436 llvm-svn: 329781
-
Petr Hosek authored
This was removed in D39932 but turned out this is actually needed because runtimes such as compiler-rt and libc++ rely on common options processing for setting certain flags such as -ffunction-sections and -fdata-sections. Differential Revision: https://reviews.llvm.org/D45507 llvm-svn: 329778
-
Craig Topper authored
[X86] Remove 128/256-bit masked pmaddubsw and pmaddwd intrinsics. Replace 512-bit masked intrinsic with unmasked intrinsic and a select. The 128/256-bit versions were no longer used by clang. It uses the legacy SSE/AVX2 version and a select. The 512-bit was changed to the same for consistency. llvm-svn: 329774
-
Craig Topper authored
[X86] In X86FlagsCopyLowering, when rewriting a memory setcc we need to emit an explicit MOV8mr instruction. Previously the code only knew how to handle setcc to a register. This should fix a crash in the chromium build. llvm-svn: 329771
-
Craig Topper authored
llvm-svn: 329769
-
Sriraman Tallam authored
With -fno-plt, for example, calls to printf when getting converted to puts still use the PLT. This patch checks for the metadata "RtLibUseGOT" and annotates the declaration with the right attributes. Differential Revision: https://reviews.llvm.org/D45180 llvm-svn: 329768
-
Rui Ueyama authored
llvm-svn: 329767
-