  1. Mar 18, 2019
    • Tim Renouf's avatar
      [AMDGPU] Asm/disasm clamp modifier on vop3 int arithmetic · cfdfba99
      Tim Renouf authored
      Allow the clamp modifier on vop3 int arithmetic instructions in assembly
      and disassembly.
      
      This involved adding a clamp operand to the affected instructions in MIR
      and MC, and thus having to fix up several places in codegen and MIR
      tests.
      
      Differential Revision: https://reviews.llvm.org/D59267
      
      Change-Id: Ic7775105f02a985b668fa658a0cd7837846a534e
      llvm-svn: 356399
      cfdfba99
    • Tim Renouf's avatar
      [AMDGPU] Asm/disasm v_cndmask_b32_e64 with abs/neg source modifiers · 2e94f6e5
      Tim Renouf authored
      This commit allows v_cndmask_b32_e64 with abs, neg source
      modifiers on src0, src1 to be assembled and disassembled.
      
      This does appear to be allowed, even though they are floating point
      modifiers and the operand type is b32.
      
      To do this, I added src0_modifiers and src1_modifiers to the
      MachineInstr, which involved fixing up several places in codegen and mir
      tests.
      
      Differential Revision: https://reviews.llvm.org/D59191
      
      Change-Id: I69bf4a8c73ebc65744f6110bb8fc4e937d79fbea
      llvm-svn: 356398
      2e94f6e5
    • Erik Pilkington's avatar
      [Sema] Add some compile time _FORTIFY_SOURCE diagnostics · b6e16ea0
      Erik Pilkington authored
      These diagnose overflowing calls to a subset of fortifiable functions. Some
      functions, like sprintf or strcpy, aren't supported right now, but we should
      probably support these in the future. We previously supported this kind of
      functionality with -Wbuiltin-memcpy-chk-size, but that diagnostic doesn't work
      with _FORTIFY implementations that use wrapper functions. Also unlike that
      diagnostic, we emit these warnings regardless of whether _FORTIFY_SOURCE is
      actually enabled, which is nice for programs that don't enable the runtime
      checks.
      
      Why not just use diagnose_if, like Bionic does? We can get better diagnostics in
      the compiler (i.e. mention the sizes), and we have the potential to diagnose
      sprintf and strcpy which is impossible with diagnose_if (at least, in languages
      that don't support C++14 constexpr). This approach also saves standard libraries
      from having to add diagnose_if.
      
      rdar://48006655
      
      Differential revision: https://reviews.llvm.org/D58797
      
      llvm-svn: 356397
      b6e16ea0
    • Amara Emerson's avatar
      Revert r356304: remove subreg parameter from MachineIRBuilder::buildCopy() · 8627178d
      Amara Emerson authored
      After review comments, it was preferred to not teach MachineIRBuilder about
      non-generic instructions beyond using buildInstr().
      
      For AArch64 I've changed the buildCopy() calls to buildInstr() + a
      separate addReg() call.
      
      This also relaxes the MachineIRBuilder's COPY checking more because it may
      not always have a SrcOp given to it.
      
      llvm-svn: 356396
      8627178d
    • Alexandre Ganea's avatar
      [DebugInfo][PDB] Don't write empty debug streams · 4aeea4cc
      Alexandre Ganea authored
      Before, empty debug streams were written as 8 bytes (4 bytes signature + 4 bytes for the GlobalRefs count).
      
      With this patch, unused empty streams aren't emitted anymore. Modules now encode 65535 as an 'unused stream' value, by convention.
      Also fix the * Linker * contrib section which wasn't correctly emitted previously.
      
      Differential Revision: https://reviews.llvm.org/D59502
      
      llvm-svn: 356395
      4aeea4cc
    • Tim Renouf's avatar
      [MsgPack][AMDGPU] Fix unflushed raw_string_ostream bugs on windows expensive checks bot · 8723a565
      Tim Renouf authored
      This fixes a couple of unflushed raw_string_ostream bugs in recent
      commits that only show up on a bot building on windows with expensive
      checks.
      
      Differential Revision: https://reviews.llvm.org/D59396
      
      Change-Id: I9c6208325503b3ee0786b4b688e13fc24a15babf
      llvm-svn: 356394
      8723a565
    • Craig Topper's avatar
      [X86] Rename imm8_su/imm16_su/imm32_su to... · f07062a7
      Craig Topper authored
      [X86] Rename imm8_su/imm16_su/imm32_su to relocImm8_su/relocImm16_su/relocImm32_su to accurately reflect what they are.
      
      llvm-svn: 356393
      f07062a7
    • Warren Ristow's avatar
      [SCEV] Guard movement of insertion point for loop-invariants · ad7d0ded
      Warren Ristow authored
      This reinstates r347934, along with a tweak to address a problem with
      PHI node ordering that that commit created (or exposed). (That commit
      was reverted at r348426, due to the PHI node issue.)
      
      Original commit message:
      
      r320789 suppressed moving the insertion point of SCEV expressions with
      div/rem operations to the loop header in non-loop-invariant situations.
      This, and similar, hoisting is also unsafe in the loop-invariant case,
      since there may be a guard against a zero denominator. This is an
      adjustment to the fix of r320789 to suppress the movement even in the
      loop-invariant case.
      
      This fixes PR30806.
      
      Differential Revision: https://reviews.llvm.org/D57428
      
      llvm-svn: 356392
      ad7d0ded
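The hazard described in the SCEV commit above can be illustrated with a loose Python analogy. This is only a sketch: the function names are hypothetical, and in LLVM IR a division by zero is undefined behaviour rather than an exception. The point is that a loop-invariant division is still unsafe to hoist when a guard protects it.

```python
def guarded_sum(n, d, trips):
    """Divide only under the guard, as the original loop does."""
    total = 0
    for _ in range(trips):
        if d != 0:           # guard against a zero denominator
            total += n // d  # n // d is loop-invariant
    return total


def hoisted_sum(n, d, trips):
    """Same loop after naively hoisting the invariant division."""
    q = n // d               # now evaluated even when d == 0
    total = 0
    for _ in range(trips):
        if d != 0:
            total += q
    return total
```

Both functions agree whenever `d` is non-zero; only the hoisted variant fails when `d` is zero, because the division has moved above its guard.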
    • Adhemerval Zanella's avatar
      [AArch64] Small fix for getIntImmCost · 270249de
      Adhemerval Zanella authored
      It uses the generic AArch64_IMM::expandMOVImm to get the correct
      number of instructions used in immediate materialization.
      
      Reviewers: efriedma
      
      Differential Revision: https://reviews.llvm.org/D58461
      
      llvm-svn: 356391
      270249de
    • Adhemerval Zanella's avatar
      [AArch64] Optimize floating point materialization · a3cefa5d
      Adhemerval Zanella authored
      This patch follows some ideas from r352866 to optimize the floating
      point materialization even further. It changes isFPImmLegal to
      consider up to 2 mov instructions, or up to 5 if the subtarget has
      fused literals.
      
      The rationale is the cost is the same for mov+fmov vs. adrp+ldr; but
      the mov+fmov sequence is always better because of the reduced d-cache
      pressure. The timings are still the same if you consider movw+movk+fmov
      vs. adrp+ldr will be fused (although one instruction longer).
      
      Reviewers: efriedma
      
      Differential Revision: https://reviews.llvm.org/D58460
      
      llvm-svn: 356390
      a3cefa5d
    • Adhemerval Zanella's avatar
      [TargetLowering] Add code size information on isFPImmLegal. NFC · 664c1ef5
      Adhemerval Zanella authored
      This allows better code size for aarch64 floating point materialization
      in a future patch.
      
      Reviewers: evandro
      
      Differential Revision: https://reviews.llvm.org/D58690
      
      llvm-svn: 356389
      664c1ef5
    • Alexey Bataev's avatar
      [OPENMP] Set scheduling for doacross loops as schedule, 1. · f6a53d63
      Alexey Bataev authored
      The default scheduling for doacross loops is changed from static to
      static, 1.
      
      llvm-svn: 356388
      f6a53d63
    • Adhemerval Zanella's avatar
      [AArch64] Refactor floating point materialization. NFC · 8a595b1d
      Adhemerval Zanella authored
      It splits the logic of actual instruction emission away from the logic
      that figures out the appropriate sequence in AArch64ExpandPseudo::expandMOVImm.
      The new function AArch64_IMM::expandMOVImm, which returns the list of
      instructions to materialize the immediate constant, is implemented in a
      separate unit because it will be used in a subsequent patch to optimize
      floating point materialization.
      
      Reviewers: efriedma
      
      Differential Revision: https://reviews.llvm.org/D58915
      
      llvm-svn: 356387
      8a595b1d
    • Louis Dionne's avatar
      [libc++][NFC] Promote CMake comment to an actual option description · 0c962cb5
      Louis Dionne authored
      llvm-svn: 356386
      0c962cb5
    • Michael Liao's avatar
      3c2aadbe
    • Craig Topper's avatar
      [X86] Remove the _alt forms of (V)CMP instructions. Use a combination of... · c2b35ebc
      Craig Topper authored
      [X86] Remove the _alt forms of (V)CMP instructions. Use a combination of custom printing and custom parsing to achieve the same result and more
      
      Similar to previous change done for VPCOM and VPCMP
      
      Differential Revision: https://reviews.llvm.org/D59468
      
      llvm-svn: 356384
      c2b35ebc
    • Sanjay Patel's avatar
      [InstCombine] add/adjust test for NaN checks; NFC · 08b5e68e
      Sanjay Patel authored
      llvm-svn: 356383
      08b5e68e
    • Nirav Dave's avatar
      [DAG] Cleanup unused node in SimplifySelectCC. · 55c921f4
      Nirav Dave authored
      Delete the temporarily constructed node used for analysis after its use,
      holding onto the original input nodes. Ideally this would be rewritten
      without making nodes, but this appears relatively complex.
      
      Reviewers: spatel, RKSimon, craig.topper
      
      Subscribers: jdoerfert, hiraditya, deadalnix, llvm-commits
      
      Tags: #llvm
      
      Differential Revision: https://reviews.llvm.org/D57921
      
      llvm-svn: 356382
      55c921f4
    • Michael Liao's avatar
      [MVT] Fix typos in comment. NFC. · c131e0e2
      Michael Liao authored
      llvm-svn: 356381
      c131e0e2
    • Nico Weber's avatar
      lld-link: Run conflict-mangled.test on all systems · 2b1dca79
      Nico Weber authored
      It seems to pass fine on my Mac, and running it only on Windows made
      me miss it in r355959 and required r355959.
      
      When the test was added in r288992 we still used Win-only
      UnDecorateSymbolName() for demangling. Now we use LLVM's
      microsoftDemangle() which is cross-platform.
      
      Differential Revision: https://reviews.llvm.org/D59497
      
      llvm-svn: 356380
      2b1dca79
    • Pavel Labath's avatar
      Skip TestVSCode_setFunctionBreakpoints on linux · 0e5012ea
      Pavel Labath authored
      Test hangs under heavy load.
      
      llvm-svn: 356379
      0e5012ea
    • Pavel Labath's avatar
      Fix some "variable 'foo' set but not used" warnings · 370e5dba
      Pavel Labath authored
      gcc-8 diagnoses these.
      
      llvm-svn: 356378
      370e5dba
    • Pavel Labath's avatar
      Fix libstdc++ data formatters for python3 · 22457e66
      Pavel Labath authored
      Use floor-division for consistency across python versions. This fixes a
      couple of libstdc++ data formatter tests.
      
      llvm-svn: 356377
      22457e66
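As a minimal sketch of why floor division matters here: in Python 3 the `/` operator on integers yields a float, while `//` yields an integer in both Python 2 and Python 3. The helper name and the byte counts below are hypothetical and are not taken from the formatter code.

```python
def element_count(total_bytes, element_size):
    """Return a whole number of elements on every Python version."""
    # `//` is floor division; `/` would produce a float on Python 3.
    return total_bytes // element_size


count = element_count(24, 8)
```

An integer result can be used directly as an index or a child count, which a float cannot.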
    • Louis Dionne's avatar
      [libc++] Add a test for PR40977 · 2bde5303
      Louis Dionne authored
      Even though the header makes the exact same check since https://llvm.org/D59063,
      the headers could conceivably change in the future and introduce a bug.
      
      llvm-svn: 356376
      2bde5303
    • Siva Chandra's avatar
      [ELF] Emit weak-undef symbols in .dynsym of a PIE binary only if linked against shared libs. · 1915e2be
      Siva Chandra authored
      Reviewers: espindola
      
      Subscribers: emaste, arichardson, MaskRay, llvm-commits
      
      Tags: #llvm
      
      Differential Revision: https://reviews.llvm.org/D59275
      
      llvm-svn: 356374
      1915e2be
    • Neil Henning's avatar
      [AMDGPU] Add an experimental buffer fat pointer address space. · 523dab07
      Neil Henning authored
      Add an experimental buffer fat pointer address space that is currently
      unhandled in the backend. This commit reserves address space 7 as a
      non-integral pointer representing the 160-bit fat pointer (128-bit buffer
      descriptor + 32-bit offset) that is heavily used in graphics workloads
      using the AMDGPU backend.
      
      Differential Revision: https://reviews.llvm.org/D58957
      
      llvm-svn: 356373
      523dab07
    • Sanjay Patel's avatar
      [InstCombine] allow general vector constants for funnel shift to shift transforms · 60633935
      Sanjay Patel authored
      Follow-up to:
      rL356338
      rL356369
      
      We can calculate an arbitrary vector constant minus the bitwidth, so there's
      no need to limit this transform to scalars and splats.
      
      llvm-svn: 356372
      60633935
    • George Rimar's avatar
      [llvm-objcopy] - Calculate the string table section sizes correctly. · faf308b1
      George Rimar authored
      This fixes https://bugs.llvm.org/show_bug.cgi?id=40980.
      
      Previously if string optimization occurred as a result of
      StringTableBuilder's finalize() method, the size wasn't updated.
      
      This hopefully also makes the interaction between sections during the
      finalization process a bit clearer.
      
      Differential revision: https://reviews.llvm.org/D59488
      
      llvm-svn: 356371
      faf308b1
    • Pavel Labath's avatar
      Fix TestCommandScriptImmediateOutput for python3 · 58e9ef13
      Pavel Labath authored
      s/iteritems/items
      
      llvm-svn: 356370
      58e9ef13
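A short sketch of the substitution named above, using a made-up dictionary: `dict.items()` exists on both Python 2 and Python 3, whereas `dict.iteritems()` was removed in Python 3.

```python
settings = {"a": 1, "b": 2}  # hypothetical data, for illustration only

# items() works on every Python version; sorted() makes the order explicit.
pairs = sorted(settings.items())

# On Python 3 this is False, which is why the old spelling had to go.
has_iteritems = hasattr(settings, "iteritems")
```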
    • Sanjay Patel's avatar
      [InstCombine] extend rotate-left-by-constant canonicalization to funnel shift · 84de8a30
      Sanjay Patel authored
      Follow-up to:
      rL356338
      
      Rotates are a special case of funnel shift where the 2 input operands
      are the same value, but that does not need to be a restriction for the
      canonicalization when the shift amount is a constant.
      
      llvm-svn: 356369
      84de8a30
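The relationship between rotates and funnel shifts can be sketched in Python on 32-bit values. The helper names mirror the intrinsic names, but this is an illustrative model and not LLVM code. It shows that a rotate is a funnel shift with both inputs equal, and that for a constant amount `c` with `0 < c < 32`, a left funnel shift by `c` equals a right funnel shift by `32 - c`, whether or not the two inputs are the same.

```python
BW = 32
MASK = (1 << BW) - 1


def fshl(x, y, c):
    """Funnel shift left: top BW bits of the concatenation x:y shifted left."""
    c %= BW
    concat = (x << BW) | y
    return ((concat << c) >> BW) & MASK


def fshr(x, y, c):
    """Funnel shift right: bottom BW bits of x:y shifted right."""
    c %= BW
    concat = (x << BW) | y
    return (concat >> c) & MASK


def rotl(x, c):
    """Rotate left is the special case where both inputs are the same."""
    return fshl(x, x, c)
```

For example, `rotl(0x80000001, 1)` evaluates to `0x00000003`.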
    • Simon Pilgrim's avatar
      [SystemZ] Remove icmp undef from reduced tests · f9ab4f5f
      Simon Pilgrim authored
      Pre-commit for D59363 (Add icmp UNDEF handling to SelectionDAG::FoldSetCC)
      
      Approved by @uweigand (Ulrich Weigand)
      
      llvm-svn: 356368
      f9ab4f5f
    • Sanjay Patel's avatar
      [InstCombine] add funnel shift tests with arbitrary constants; NFC · d7f15393
      Sanjay Patel authored
      llvm-svn: 356367
      d7f15393
    • Fangrui Song's avatar
      [pp-trace] Delete -ignore and add a new option -callbacks · 560a45a3
      Fangrui Song authored
      Summary:
      -ignore specifies a list of PP callbacks to ignore. It cannot express a
      whitelist, which may be more useful than a blacklist.
      Add a new option -callbacks to replace it.
      
      -ignore= (default) => -callbacks='*' (default)
      -ignore=FileChanged,FileSkipped => -callbacks='*,-FileChanged,-FileSkipped'
      
      -callbacks='Macro*' : print only MacroDefined,MacroExpands,MacroUndefined,...
      
      Reviewers: juliehockett, aaron.ballman, alexfh, ioeric
      
      Reviewed By: aaron.ballman
      
      Subscribers: nemanjai, kbarton, jsji, cfe-commits
      
      Tags: #clang
      
      Differential Revision: https://reviews.llvm.org/D59296
      
      llvm-svn: 356366
      560a45a3
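The glob syntax in the mapping above can be modelled with a small Python sketch. It assumes that entries are comma-separated, that a leading `-` excludes, and that the last matching entry wins; that precedence rule is an assumption made for illustration, and the function name is hypothetical. It does not reproduce the actual pp-trace implementation.

```python
from fnmatch import fnmatchcase


def callback_enabled(name, spec):
    """Decide whether callback `name` is selected by a -callbacks style spec."""
    enabled = False
    for item in spec.split(","):
        item = item.strip()
        if not item:
            continue
        negative = item.startswith("-")
        pattern = item[1:] if negative else item
        if fnmatchcase(name, pattern):
            # Assumption: a later matching entry overrides earlier ones.
            enabled = not negative
    return enabled
```

Under these assumptions, `'*,-FileChanged,-FileSkipped'` selects everything except the two named callbacks, and `'Macro*'` selects only names that start with `Macro`.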
    • Roman Lebedev's avatar
      [llvm-exegesis] Separate tool options into three categories. · 23629385
      Roman Lebedev authored
      Results in much nicer -help output:
      ```
      $ ./bin/llvm-exegesis -help
      USAGE: llvm-exegesis [options]
      
      OPTIONS:
      
      Color Options:
      
        -color                                         - Use colors in output (default=autodetect)
      
      General options:
      
        -enable-cse-in-irtranslator                    - Should enable CSE in irtranslator
        -enable-cse-in-legalizer                       - Should enable CSE in Legalizer
      
      Generic Options:
      
        -help                                          - Display available options (-help-hidden for more)
        -help-list                                     - Display list of available options (-help-list-hidden for more)
        -version                                       - Display the version of this program
      
      llvm-exegesis analysis options:
      
        -analysis-clustering-epsilon=<number>          - dbscan epsilon for benchmark point clustering
        -analysis-clusters-output-file=<string>        -
        -analysis-display-unstable-clusters            - if there is more than one benchmark for an opcode, said benchmarks may end up not being clustered into the same cluster if the measured performance characteristics are different. by default all such opcodes are filtered out. this flag will instead show only such unstable opcodes
        -analysis-inconsistencies-output-file=<string> -
        -analysis-inconsistency-epsilon=<number>       - epsilon for detection of when the cluster is different from the LLVM schedule profile values
        -analysis-numpoints=<uint>                     - minimum number of points in an analysis cluster
      
      llvm-exegesis benchmark options:
      
        -ignore-invalid-sched-class                    - ignore instructions that do not define a sched class
        -mode=<value>                                  - the mode to run
          =latency                                     -   Instruction Latency
          =inverse_throughput                          -   Instruction Inverse Throughput
          =uops                                        -   Uop Decomposition
          =analysis                                    -   Analysis
        -num-repetitions=<uint>                        - number of time to repeat the asm snippet
        -opcode-index=<int>                            - opcode to measure, by index
        -opcode-name=<string>                          - comma-separated list of opcodes to measure, by name
        -snippets-file=<string>                        - code snippets to measure
      
      llvm-exegesis options:
      
        -benchmarks-file=<string>                      - File to read (analysis mode) or write (latency/uops/inverse_throughput modes) benchmark results. “-” uses stdin/stdout.
        -mcpu=<string>                                 - cpu name to use for pfm counters, leave empty to autodetect
      ```
      
      llvm-svn: 356364
      23629385
    • David Stenberg's avatar
      [DebugInfo] Ignore bitcasts when lowering stack arg dbg.values · 8a2e4af7
      David Stenberg authored
      Summary:
      Look past bitcasts when looking for parameter debug values that are
      described by frame-index loads in `EmitFuncArgumentDbgValue()`.
      
      In the attached test case we would be left with an undef `DBG_VALUE`
      for the parameter without this patch.
      
      A similar fix was done for parameters passed in registers in D13005.
      
      This fixes PR40777.
      
      Reviewers: aprantl, vsk, jmorse
      
      Reviewed By: aprantl
      
      Subscribers: bjope, javed.absar, jdoerfert, llvm-commits
      
      Tags: #debug-info, #llvm
      
      Differential Revision: https://reviews.llvm.org/D58831
      
      llvm-svn: 356363
      8a2e4af7
    • Pavel Labath's avatar
      Fix "type qualifiers ignored on cast result type" warnings · f92ddfed
      Pavel Labath authored
      These warnings start to get emitted with gcc-8.
      
      llvm-svn: 356362
      f92ddfed
    • Pavel Labath's avatar
      Reinitialize UnwindTable when the SymbolFile changes · dec96392
      Pavel Labath authored
      Summary:
      This is a preparatory step to enable adding of unwind plans by symbol
      file plugins.
      
      Although at the surface it seems that currently symbol files have
      nothing to do with unwinding, this isn't entirely correct even now. The
      mere act of adding a symbol file can have the effect of making more
      sections (typically .debug_frame) available to the unwinding machinery,
      so that it can have more unwind strategies to choose from.
      
      Up until now, we've had a bug, which went largely unnoticed, where
      unwind info in manually added symbol files (target symbols add) was
      being ignored during unwinding. Reinitializing the UnwindTable fixes
      that bug too.
      
      Reviewers: clayborg, jasonmolenda, alexshap
      
      Subscribers: jdoerfert, lldb-commits
      
      Differential Revision: https://reviews.llvm.org/D58347
      
      llvm-svn: 356361
      dec96392
    • Christof Douma's avatar
      [AArch64] Fix bug 35094 atomicrmw on Armv8.1-A+lse · 8cfd91dc
      Christof Douma authored
      Fixes https://bugs.llvm.org/show_bug.cgi?id=35094
      
      The Dead register definition pass should leave alone the atomicrmw
      instructions on AArch64 (LSE extension). The reason is the following
      statement in the Arm ARM:
      
      "The ST<OP> instructions, and LD<OP> instructions where the destination
      register is WZR or XZR, are not regarded as doing a read for the purpose
      of a DMB LD barrier."
      
      A good example was given in the gcc thread by Will Deacon (linked in the
      bugzilla ticket 35094):
      
          P0 (atomic_int* y,atomic_int* x) {
            atomic_store_explicit(x,1,memory_order_relaxed);
            atomic_thread_fence(memory_order_release);
            atomic_store_explicit(y,1,memory_order_relaxed);
          }
      
          P1 (atomic_int* y,atomic_int* x) {
            atomic_fetch_add_explicit(y,1,memory_order_relaxed);  // STADD
            atomic_thread_fence(memory_order_acquire);
            int r0 = atomic_load_explicit(x,memory_order_relaxed);
          }
      
          P2 (atomic_int* y) {
            int r1 = atomic_load_explicit(y,memory_order_relaxed);
          }
      
          My understanding is that it is forbidden for r0 == 0 and r1 == 2 after
          this test has executed. However, if the relaxed add in P1 compiles to
          STADD and the subsequent acquire fence is compiled as DMB LD, then we
          don't have any ordering guarantees in P1 and the forbidden result could
          be observed.
      
      Change-Id: I419f9f9df947716932038e1100c18d10a96408d0
      llvm-svn: 356360
      8cfd91dc
    • Craig Topper's avatar
      ba898da1
    • Alex Bradbury's avatar
      [RISCV] Add ImmArg to intrinsics · 60444ad1
      Alex Bradbury authored
      llvm-svn: 356358
      60444ad1