- Mar 18, 2019
-
-
Tim Renouf authored
Allow the clamp modifier on vop3 int arithmetic instructions in assembly and disassembly. This involved adding a clamp operand to the affected instructions in MIR and MC, and thus having to fix up several places in codegen and MIR tests. Differential Revision: https://reviews.llvm.org/D59267 Change-Id: Ic7775105f02a985b668fa658a0cd7837846a534e llvm-svn: 356399
-
Tim Renouf authored
This commit allows v_cndmask_b32_e64 with abs, neg source modifiers on src0, src1 to be assembled and disassembled. This does appear to be allowed, even though they are floating point modifiers and the operand type is b32. To do this, I added src0_modifiers and src1_modifiers to the MachineInstr, which involved fixing up several places in codegen and mir tests. Differential Revision: https://reviews.llvm.org/D59191 Change-Id: I69bf4a8c73ebc65744f6110bb8fc4e937d79fbea llvm-svn: 356398
-
Erik Pilkington authored
These diagnose overflowing calls to a subset of fortifiable functions. Some functions, like sprintf or strcpy, aren't supported right now, but we should probably support these in the future. We previously supported this kind of functionality with -Wbuiltin-memcpy-chk-size, but that diagnostic doesn't work with _FORTIFY implementations that use wrapper functions. Also unlike that diagnostic, we emit these warnings regardless of whether _FORTIFY_SOURCE is actually enabled, which is nice for programs that don't enable the runtime checks. Why not just use diagnose_if, like Bionic does? We can get better diagnostics in the compiler (i.e. mention the sizes), and we have the potential to diagnose sprintf and strcpy which is impossible with diagnose_if (at least, in languages that don't support C++14 constexpr). This approach also saves standard libraries from having to add diagnose_if. rdar://48006655 Differential revision: https://reviews.llvm.org/D58797 llvm-svn: 356397
-
Amara Emerson authored
After review comments, it was preferred to not teach MachineIRBuilder about non-generic instructions beyond using buildInstr(). For AArch64 I've changed the buildCopy() calls to buildInstr() + a separate addReg() call. This also relaxes the MachineIRBuilder's COPY checking more because it may not always have a SrcOp given to it. llvm-svn: 356396
-
Alexandre Ganea authored
Before, empty debug streams were written as 8 bytes (4 bytes signature + 4 bytes for the GlobalRefs count). With this patch, unused empty streams aren't emitted anymore. Modules now encode 65535 as an 'unused stream' value, by convention. Also fix the * Linker * contrib section which wasn't correctly emitted previously. Differential Revision: https://reviews.llvm.org/D59502 llvm-svn: 356395
-
Tim Renouf authored
This fixes a couple of unflushed raw_string_ostream bugs in recent commits that only show up on a bot building on Windows with expensive checks. Differential Revision: https://reviews.llvm.org/D59396 Change-Id: I9c6208325503b3ee0786b4b688e13fc24a15babf llvm-svn: 356394
-
Craig Topper authored
[X86] Rename imm8_su/imm16_su/imm32_su to relocImm8_su/relocImm16_su/relocImm32_su to accurately reflect what they are. llvm-svn: 356393
-
Warren Ristow authored
This reinstates r347934, along with a tweak to address a problem with PHI node ordering that that commit created (or exposed). (That commit was reverted at r348426, due to the PHI node issue.) Original commit message: r320789 suppressed moving the insertion point of SCEV expressions with div/rem operations to the loop header in non-loop-invariant situations. This, and similar, hoisting is also unsafe in the loop-invariant case, since there may be a guard against a zero denominator. This is an adjustment to the fix of r320789 to suppress the movement even in the loop-invariant case. This fixes PR30806. Differential Revision: https://reviews.llvm.org/D57428 llvm-svn: 356392
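To illustrate why hoisting is unsafe even when the division is loop-invariant, here is a minimal Python sketch. It is not LLVM code, and the function names `guarded_loop` and `hoisted_loop` are hypothetical. The division sits behind a zero check inside the loop, so moving it to the loop header executes it in cases where the original program never would.

```python
def guarded_loop(values, n, d):
    """Original shape: the loop-invariant division n // d sits behind a guard."""
    total = 0
    for v in values:
        if d != 0:              # guard against a zero denominator
            total += v + n // d  # only evaluated when d is non-zero
    return total


def hoisted_loop(values, n, d):
    """Unsafe shape: the invariant division has been moved above the guard."""
    q = n // d                  # now runs unconditionally, even when d == 0
    total = 0
    for v in values:
        if d != 0:
            total += v + q
    return total
```

With a non-zero `d` both functions agree; with `d == 0` the guarded version returns 0 while the hoisted version raises `ZeroDivisionError`, which is the kind of behavior change the commit message describes.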
-
Adhemerval Zanella authored
It uses the generic AArch64_IMM::expandMOVImm to get the correct number of instructions used in immediate materialization. Reviewers: efriedma Differential Revision: https://reviews.llvm.org/D58461 llvm-svn: 356391
-
Adhemerval Zanella authored
This patch follows some ideas from r352866 to optimize the floating point materialization even further. It changes isFPImmLegal to consider up to 2 mov instructions, or up to 5 in case the subtarget has fused literals. The rationale is the cost is the same for mov+fmov vs. adrp+ldr; but the mov+fmov sequence is always better because of the reduced d-cache pressure. The timings are still the same if you consider movw+movk+fmov vs. adrp+ldr will be fused (although one instruction longer). Reviewers: efriedma Differential Revision: https://reviews.llvm.org/D58460 llvm-svn: 356390
-
Adhemerval Zanella authored
This allows better code size for aarch64 floating point materialization in a future patch. Reviewers: evandro Differential Revision: https://reviews.llvm.org/D58690 llvm-svn: 356389
-
Alexey Bataev authored
The default scheduling for doacross loops is changed from static to static, 1. llvm-svn: 356388
-
Adhemerval Zanella authored
It splits the logic of actual instruction emission away from the logic that figures out the appropriate sequence in AArch64ExpandPseudo::expandMOVImm. The new function AArch64_IMM::expandMOVImm, which returns the list of the instructions to materialize the immediate constant, is implemented in a separate unit because it will be used in a subsequent patch to optimize floating point materialization. Reviewers: efriedma Differential Revision: https://reviews.llvm.org/D58915 llvm-svn: 356387
-
Louis Dionne authored
llvm-svn: 356386
-
Michael Liao authored
llvm-svn: 356385
-
Craig Topper authored
[X86] Remove the _alt forms of (V)CMP instructions. Use a combination of custom printing and custom parsing to achieve the same result and more. Similar to the previous change done for VPCOM and VPCMP. Differential Revision: https://reviews.llvm.org/D59468 llvm-svn: 356384
-
Sanjay Patel authored
llvm-svn: 356383
-
Nirav Dave authored
Delete temporarily constructed node uses for analysis after its use, holding onto original input nodes. Ideally this would be rewritten without making nodes, but this appears relatively complex. Reviewers: spatel, RKSimon, craig.topper Subscribers: jdoerfert, hiraditya, deadalnix, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D57921 llvm-svn: 356382
-
Michael Liao authored
llvm-svn: 356381
-
Nico Weber authored
It seems to pass fine on my Mac, and running it only on Windows made me miss it in r355959 and required r355959. When the test was added in r288992 we still used Win-only UnDecorateSymbolName() for demangling. Now we use LLVM's microsoftDemangle() which is cross-platform. Differential Revision: https://reviews.llvm.org/D59497 llvm-svn: 356380
-
Pavel Labath authored
Test hangs under heavy load. llvm-svn: 356379
-
Pavel Labath authored
gcc-8 diagnoses these. llvm-svn: 356378
-
Pavel Labath authored
Use floor-division for consistency across python versions. This fixes a couple of libstdc++ data formatter tests. llvm-svn: 356377
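As a brief illustration of the difference this kind of change addresses (a generic sketch, not the formatter code itself; the function names are hypothetical): in Python 2, `/` on two integers truncates to an integer, while in Python 3 it yields a float. The `//` operator gives an integer result in both.

```python
def midpoint_true_div(lo, hi):
    # "/" is version-dependent for ints: int result on Python 2, float on Python 3.
    return (lo + hi) / 2


def midpoint_floor_div(lo, hi):
    # "//" is floor division on both Python 2 and Python 3, so the result
    # stays an int and can safely be used as an index or element count.
    return (lo + hi) // 2
```

Using `//` wherever an integer index, offset, or count is expected keeps the behavior identical regardless of which interpreter runs the script.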
-
Louis Dionne authored
Even though the header makes the exact same check since https://llvm.org/D59063, the headers could conceivably change in the future and introduce a bug. llvm-svn: 356376
-
Siva Chandra authored
Reviewers: espindola Subscribers: emaste, arichardson, MaskRay, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D59275 llvm-svn: 356374
-
Neil Henning authored
Add an experimental buffer fat pointer address space that is currently unhandled in the backend. This commit reserves address space 7 as a non-integral pointer representing the 160-bit fat pointer (128-bit buffer descriptor + 32-bit offset) that is heavily used in graphics workloads using the AMDGPU backend. Differential Revision: https://reviews.llvm.org/D58957 llvm-svn: 356373
-
Sanjay Patel authored
Follow-up to: rL356338 rL356369 We can calculate an arbitrary vector constant minus the bitwidth, so there's no need to limit this transform to scalars and splats. llvm-svn: 356372
-
George Rimar authored
This fixes https://bugs.llvm.org/show_bug.cgi?id=40980. Previously, if string optimization occurred as a result of StringTableBuilder's finalize() method, the size wasn't updated. This hopefully also makes the interaction between sections during finalization processes a bit more clear. Differential revision: https://reviews.llvm.org/D59488 llvm-svn: 356371
-
Pavel Labath authored
s/iteritems/items llvm-svn: 356370
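For context, a small sketch of the portable form (the function name `format_children` is hypothetical and not taken from the changed file): `dict.iteritems()` exists only in Python 2, whereas `dict.items()` is available in both Python 2 and Python 3.

```python
def format_children(children):
    """Render a dict as sorted "key=value" strings in a version-neutral way."""
    # dict.items() works on Python 2 (returns a list) and Python 3 (returns a view);
    # dict.iteritems() would raise AttributeError on Python 3.
    return ["%s=%s" % (key, value) for key, value in sorted(children.items())]
```

On Python 2, `items()` builds a full list rather than a lazy iterator, which is usually an acceptable cost for small dictionaries.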
-
Sanjay Patel authored
Follow-up to: rL356338 Rotates are a special case of funnel shift where the 2 input operands are the same value, but that does not need to be a restriction for the canonicalization when the shift amount is a constant. llvm-svn: 356369
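The relationship between rotates and funnel shifts can be sketched in Python as follows. This is a model of the semantics under the assumption of fixed-width unsigned integers, not the InstCombine code; the helper names `fshl` and `rotl` are chosen here for illustration.

```python
def fshl(a, b, shift, width=32):
    """Funnel shift left: concatenate a:b, shift left, keep the high `width` bits."""
    mask = (1 << width) - 1
    shift %= width                       # shift amount is taken modulo the bit width
    concat = ((a & mask) << width) | (b & mask)
    return ((concat << shift) >> width) & mask


def rotl(x, shift, width=32):
    """Rotate left is a funnel shift whose two inputs are the same value."""
    return fshl(x, x, shift, width)
```

With distinct inputs, `fshl` pulls the top bits of `b` into the bottom of `a`; with identical inputs, the bits shifted out of the top reappear at the bottom, which is exactly a rotate.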
-
Simon Pilgrim authored
Pre-commit for D59363 (Add icmp UNDEF handling to SelectionDAG::FoldSetCC) Approved by @uweigand (Ulrich Weigand) llvm-svn: 356368
-
Sanjay Patel authored
llvm-svn: 356367
-
Fangrui Song authored
Summary: -ignore specifies a list of PP callbacks to ignore. It cannot express a whitelist, which may be more useful than a blacklist. Add a new option -callbacks to replace it. -ignore= (default) => -callbacks='*' (default) -ignore=FileChanged,FileSkipped => -callbacks='*,-FileChanged,-FileSkipped' -callbacks='Macro*' : print only MacroDefined,MacroExpands,MacroUndefined,... Reviewers: juliehockett, aaron.ballman, alexfh, ioeric Reviewed By: aaron.ballman Subscribers: nemanjai, kbarton, jsji, cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D59296 llvm-svn: 356366
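A glob list with negative entries can express both a whitelist and a blacklist. The following Python sketch shows one way such matching could work. It is not pp-trace's implementation: the function name `is_enabled` is hypothetical, and the rule that a later matching pattern overrides an earlier one is an assumption made for this example.

```python
from fnmatch import fnmatchcase


def is_enabled(name, spec):
    """Return True if callback `name` is selected by the comma-separated glob list."""
    enabled = False
    for pattern in spec.split(","):
        pattern = pattern.strip()
        if not pattern:
            continue
        negative = pattern.startswith("-")   # a leading "-" excludes matches
        if negative:
            pattern = pattern[1:]
        if fnmatchcase(name, pattern):
            enabled = not negative            # assumed: last matching pattern wins
    return enabled
```

Under these assumptions, `'*,-FileChanged,-FileSkipped'` reproduces the old blacklist behavior, while `'Macro*'` selects only the callbacks whose names start with `Macro`.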
-
Roman Lebedev authored
Results in much nicer -help output:
```
$ ./bin/llvm-exegesis -help
USAGE: llvm-exegesis [options]

OPTIONS:

Color Options:

  -color - Use colors in output (default=autodetect)

General options:

  -enable-cse-in-irtranslator - Should enable CSE in irtranslator
  -enable-cse-in-legalizer - Should enable CSE in Legalizer

Generic Options:

  -help - Display available options (-help-hidden for more)
  -help-list - Display list of available options (-help-list-hidden for more)
  -version - Display the version of this program

llvm-exegesis analysis options:

  -analysis-clustering-epsilon=<number> - dbscan epsilon for benchmark point clustering
  -analysis-clusters-output-file=<string> -
  -analysis-display-unstable-clusters - if there is more than one benchmark for an opcode, said benchmarks may end up not being clustered into the same cluster if the measured performance characteristics are different. by default all such opcodes are filtered out. this flag will instead show only such unstable opcodes
  -analysis-inconsistencies-output-file=<string> -
  -analysis-inconsistency-epsilon=<number> - epsilon for detection of when the cluster is different from the LLVM schedule profile values
  -analysis-numpoints=<uint> - minimum number of points in an analysis cluster

llvm-exegesis benchmark options:

  -ignore-invalid-sched-class - ignore instructions that do not define a sched class
  -mode=<value> - the mode to run
    =latency - Instruction Latency
    =inverse_throughput - Instruction Inverse Throughput
    =uops - Uop Decomposition
    =analysis - Analysis
  -num-repetitions=<uint> - number of time to repeat the asm snippet
  -opcode-index=<int> - opcode to measure, by index
  -opcode-name=<string> - comma-separated list of opcodes to measure, by name
  -snippets-file=<string> - code snippets to measure

llvm-exegesis options:

  -benchmarks-file=<string> - File to read (analysis mode) or write (latency/uops/inverse_throughput modes) benchmark results. “-” uses stdin/stdout.
  -mcpu=<string> - cpu name to use for pfm counters, leave empty to autodetect
```
llvm-svn: 356364
-
David Stenberg authored
Summary: Look past bitcasts when looking for parameter debug values that are described by frame-index loads in `EmitFuncArgumentDbgValue()`. In the attached test case we would be left with an undef `DBG_VALUE` for the parameter without this patch. A similar fix was done for parameters passed in registers in D13005. This fixes PR40777. Reviewers: aprantl, vsk, jmorse Reviewed By: aprantl Subscribers: bjope, javed.absar, jdoerfert, llvm-commits Tags: #debug-info, #llvm Differential Revision: https://reviews.llvm.org/D58831 llvm-svn: 356363
-
Pavel Labath authored
These warnings start to get emitted with gcc-8. llvm-svn: 356362
-
Pavel Labath authored
Summary: This is a preparatory step to enable adding of unwind plans by symbol file plugins. Although at the surface it seems that currently symbol files have nothing to do with unwinding, this isn't entirely correct even now. The mere act of adding a symbol file can have the effect of making more sections (typically .debug_frame) available to the unwinding machinery, so that it can have more unwind strategies to choose from. Up until now, we've had a bug, which went largely unnoticed, where unwind info in the manually added symbols files (target symbols add) was being ignored during unwinding. Reinitializing the UnwindTable fixes that bug too. Reviewers: clayborg, jasonmolenda, alexshap Subscribers: jdoerfert, lldb-commits Differential Revision: https://reviews.llvm.org/D58347 llvm-svn: 356361
-
Christof Douma authored
Fixes https://bugs.llvm.org/show_bug.cgi?id=35094

The Dead register definition pass should leave alone the atomicrmw instructions on AArch64 (LSE extension). The reason is the following statement in the Arm ARM: "The ST<OP> instructions, and LD<OP> instructions where the destination register is WZR or XZR, are not regarded as doing a read for the purpose of a DMB LD barrier."

A good example was given in the gcc thread by Will Deacon (linked in the bugzilla ticket 35094):

    P0 (atomic_int* y,atomic_int* x) {
      atomic_store_explicit(x,1,memory_order_relaxed);
      atomic_thread_fence(memory_order_release);
      atomic_store_explicit(y,1,memory_order_relaxed);
    }

    P1 (atomic_int* y,atomic_int* x) {
      atomic_fetch_add_explicit(y,1,memory_order_relaxed); // STADD
      atomic_thread_fence(memory_order_acquire);
      int r0 = atomic_load_explicit(x,memory_order_relaxed);
    }

    P2 (atomic_int* y) {
      int r1 = atomic_load_explicit(y,memory_order_relaxed);
    }

My understanding is that it is forbidden for r0 == 0 and r1 == 2 after this test has executed. However, if the relaxed add in P1 compiles to STADD and the subsequent acquire fence is compiled as DMB LD, then we don't have any ordering guarantees in P1 and the forbidden result could be observed.

Change-Id: I419f9f9df947716932038e1100c18d10a96408d0 llvm-svn: 356360
-
Craig Topper authored
llvm-svn: 356359
-
Alex Bradbury authored
llvm-svn: 356358
-