  1. Feb 06, 2019
    • Shoaib Meenai's avatar
      [cmake] Add all subprojects to LLVM_ALL_PROJECTS · 351314a1
      Shoaib Meenai authored
      Make LLVM_ALL_PROJECTS reflect all top-level directories in the monorepo
      rather than an arbitrary subset. clang-tools-extra is technically
      unnecessary since it gets enabled by clang, but having it there for
      consistency shouldn't hurt either.
      
      Differential Revision: https://reviews.llvm.org/D57843
      
      llvm-svn: 353346
      351314a1
    • Roland Froese's avatar
      [PowerPC] Add vector truncate test to prep for D56507 NFC · 42f58498
      Roland Froese authored
      llvm-svn: 353344
      42f58498
    • Shoaib Meenai's avatar
      [cmake] Add openmp to LLVM_ALL_PROJECTS · af8eadd9
      Shoaib Meenai authored
      It'll get ignored in LLVM_ENABLE_PROJECTS after r353148 otherwise.
      
      llvm-svn: 353343
      af8eadd9
    • Jordan Rupprecht's avatar
      [libObject][NFC] Include filename in error message · d3a7e9d1
      Jordan Rupprecht authored
      llvm-svn: 353341
      d3a7e9d1
    • Alina Sbirlea's avatar
      [LICM/MSSA] Add promotion to scalars by building an AliasSetTracker with MemorySSA. · 6cba96ed
      Alina Sbirlea authored
      Summary:
      Experimentally we found that promotion to scalars carries fewer benefits
      than sinking and hoisting in LICM. When using MemorySSA, we build an
      AliasSetTracker on demand in order to reuse the current infrastructure.
      We only build it if fewer than AccessCapForMSSAPromotion accesses exist in the
      loop, a cap that is by default set to 250. This value ensures there are
      no runtime regressions, and there are small compile time gains for
      pathological cases. A much lower value (20) was found to yield a single
      regression in the llvm-test-suite and much higher benefits for compile
      times. Conservatively we set the current cap to a high value, but we will
      explore lowering it when MemorySSA is enabled by default.
      
      Reviewers: sanjoy, chandlerc
      
      Subscribers: nemanjai, jlebar, Prazek, george.burgess.iv, jfb, jsji, llvm-commits
      
      Differential Revision: https://reviews.llvm.org/D56625
      
      llvm-svn: 353339
      6cba96ed
    • Nirav Dave's avatar
      [DAG] Immediately cleanup unused nodes from extend-based combines. · b3506bf9
      Nirav Dave authored
      llvm-svn: 353338
      b3506bf9
    • Michael Berg's avatar
      Move IR flag handling directly into builder calls for cases translated from... · f0d81a31
      Michael Berg authored
      Move IR flag handling directly into builder calls for cases translated from Instructions in GlobalISel
      
      Reviewers: aditya_nandakumar, volkan
      
      Reviewed By: aditya_nandakumar
      
      Subscribers: rovka, kristof.beyls, volkan, Petar.Avramovic
      
      Differential Revision: https://reviews.llvm.org/D57630
      
      llvm-svn: 353336
      f0d81a31
    • Alina Sbirlea's avatar
      [AliasSetTracker] Pass MustAlias to addPointer more often. · 910c6bef
      Alina Sbirlea authored
      Summary:
      Pass the alias info to addPointer when available. Will save an alias()
      call for must sets when adding a known Must or May alias.
      [Part of a series of cleanup patches]
      
      Reviewers: reames, mkazantsev
      
      Subscribers: sanjoy, jlebar, llvm-commits
      
      Differential Revision: https://reviews.llvm.org/D56613
      
      llvm-svn: 353335
      910c6bef
    • Craig Topper's avatar
      1c7ee208
    • Nirav Dave's avatar
      [X86][DAG] Avoid creating dangling bitcast. · c6bfa103
      Nirav Dave authored
      combineExtractWithShuffle may leave a dangling bitcast which may
      prevent further optimization in later passes. Avoid constructing it
      unless it is used.
      
      llvm-svn: 353333
      c6bfa103
    • Sanjay Patel's avatar
      [x86] add tests for horizontal ops (PR38971, PR33758); NFC · 29a710be
      Sanjay Patel authored
      llvm-svn: 353332
      29a710be
    • Jonas Paulsson's avatar
      [SystemZ] Improved handling of the @llvm.ctlz intrinsic. · b21dde05
      Jonas Paulsson authored
      Since SystemZ supports counting of leading zeros with the FLOGR instruction,
      isCheapToSpeculateCtlz() should return true, which it now does.
      
      ISD::CTLZ_ZERO_UNDEF i32 is now handled the same way as ISD::CTLZ is, which
      is needed since promotion to i64 is required and CTLZ_ZERO_UNDEF is only
      expanded to CTLZ if it is Legal or Custom.
      
      Review: Ulrich Weigand
      https://reviews.llvm.org/D57710
      
      llvm-svn: 353330
      b21dde05
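      The promotion mentioned in the SystemZ ctlz entry above can be illustrated
      with a small sketch. It assumes a 64-bit count that returns 64 for a zero
      input (the defined-at-zero behaviour of ISD::CTLZ); the helper names
      `ctlz64` and `ctlz32ViaPromotion` are hypothetical and are not taken from
      the patch.

      ```cpp
      #include <cassert>
      #include <cstdint>

      // Portable 64-bit count of leading zeros. Returns 64 for a zero input,
      // which is the assumption this sketch relies on.
      inline unsigned ctlz64(std::uint64_t x) {
        unsigned n = 0;
        for (std::uint64_t bit = 1ULL << 63; bit != 0 && (x & bit) == 0; bit >>= 1)
          ++n;
        return n;
      }

      // 32-bit count obtained by zero-extending to 64 bits and subtracting the
      // 32 extra leading zeros that the extension introduces.
      inline unsigned ctlz32ViaPromotion(std::uint32_t x) {
        const unsigned wide = ctlz64(static_cast<std::uint64_t>(x));
        assert(wide >= 32 && "a zero-extended value has at least 32 leading zeros");
        return wide - 32;
      }
      ```

      Because the wide count is defined for zero, the promoted 32-bit result is
      defined for zero as well, so one code path can serve both the plain and
      the zero-undef form.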
    • Peter Collingbourne's avatar
      build: Remove the cmake check for malloc.h. · 02fc3c69
      Peter Collingbourne authored
      As far as I can tell, malloc.h is only being used here to provide
      a definition of mallinfo (malloc itself is declared in stdlib.h via
      cstdlib). We already have a macro for whether mallinfo is available,
      so switch to using that instead.
      
      Differential Revision: https://reviews.llvm.org/D57807
      
      llvm-svn: 353329
      02fc3c69
    • Jonas Paulsson's avatar
      [SystemZ] Wait with VGBM selection until after DAGCombine2. · 8cda83a5
      Jonas Paulsson authored
      Don't lower BUILD_VECTORs to BYTE_MASK, but instead expose the BUILD_VECTORs
      to the DAGCombiner and select them to VGBM in Select(). This allows the
      DAGCombiner to understand the constant vector values.
      
      For floating point, only all-zeros vectors are now generated with VGBM, as it
      turned out to be somewhat complicated to handle any arbitrary constants,
      while in practice this is very rare and hardly needed.
      
      The SystemZ ISD opcodes z_byte_mask, z_vzero and z_vones have been removed.
      
      Review: Ulrich Weigand
      https://reviews.llvm.org/D57152
      
      llvm-svn: 353325
      8cda83a5
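      As a rough illustration of the byte-mask idea in the VGBM entry above: a
      16-byte constant can be described by a 16-bit mask only when every byte
      is 0x00 or 0xFF. The sketch below is not the backend code; the name
      `tryByteMask` and the bit order (most significant mask bit maps to byte
      0) are assumptions made for the example.

      ```cpp
      #include <array>
      #include <cassert>  // assert() is used in the usage note below
      #include <cstddef>
      #include <cstdint>

      // Computes a 16-bit byte mask for a 16-byte constant.
      // Returns false (leaving `mask` untouched) if any byte is neither 0x00 nor 0xFF.
      // Assumption: mask bit 15 corresponds to byte 0, mask bit 0 to byte 15.
      inline bool tryByteMask(const std::array<std::uint8_t, 16> &bytes,
                              std::uint16_t &mask) {
        unsigned result = 0;
        for (std::size_t i = 0; i < bytes.size(); ++i) {
          if (bytes[i] == 0xFF)
            result |= 1u << (15 - i);
          else if (bytes[i] != 0x00)
            return false;
        }
        mask = static_cast<std::uint16_t>(result);
        return true;
      }
      ```

      For example, `assert(tryByteMask(zeros, m) && m == 0)` holds for an
      all-zero vector, the one floating-point case the entry says is still
      generated this way.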
    • Florian Hahn's avatar
      [opt-viewer] Add --filter option to select remarks for displaying. · 169f6423
      Florian Hahn authored
      This allows limiting the displayed remarks to the ones with names
      matching the filter (regular) expression.
      
      Generating HTML pages for a larger project with optimization remarks can
      result in huge HTML documents; --filter makes it possible to focus on a
      set of interesting remarks.
      
      Reviewers: hfinkel, anemet, thegameg, serge-sans-paille
      
      Reviewed By: anemet
      
      Differential Revision: https://reviews.llvm.org/D57827
      
      llvm-svn: 353322
      169f6423
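      The name-based filtering described in the opt-viewer entry above could be
      sketched as follows. opt-viewer itself is a Python tool, so this C++
      version is only an illustration of the idea; `filterRemarks` is a
      hypothetical helper, and matching anywhere in the name (search rather
      than full match) is an assumption.

      ```cpp
      #include <cassert>  // assert() is used in the usage note below
      #include <regex>
      #include <string>
      #include <vector>

      // Keeps only the remark names in which the filter expression matches
      // somewhere (assumed search semantics, not a full-string match).
      inline std::vector<std::string>
      filterRemarks(const std::vector<std::string> &names,
                    const std::string &pattern) {
        const std::regex filter(pattern);
        std::vector<std::string> kept;
        for (const std::string &name : names)
          if (std::regex_search(name, filter))
            kept.push_back(name);
        return kept;
      }
      ```

      With names `inline`, `licm` and `loop-vectorize`, the pattern `^l` keeps
      the last two, so `assert(filterRemarks(names, "^l").size() == 2)` holds.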
    • Bjorn Pettersson's avatar
      [SelectionDAG] Cleanup some code comments. NFC · 350352c8
      Bjorn Pettersson authored
      Don't repeat the function name in some doxygen
      comments.
      
      (Just a minor cleanup, while testing to push
      from the git monorepo setup.)
      
      llvm-svn: 353317
      350352c8
    • Jessica Paquette's avatar
      [GlobalISel][NFC] Gardening: Factor out code for simple unary intrinsics · e288c526
      Jessica Paquette authored
      There was a lot of repeated code wrt unary math intrinsics in
      translateKnownIntrinsic. This factors out the repeated MIRBuilder code into
      two functions: translateSimpleUnaryIntrinsic and getSimpleUnaryIntrinsicOpcode.
      
      This simplifies adding simple unary intrinsics, since after this, all you have
      to do is add the mapping to SimpleUnaryIntrinsicOpcodes.
      
      Differential Revision: https://reviews.llvm.org/D57774
      
      llvm-svn: 353316
      e288c526
    • James Henderson's avatar
      [yaml2obj]Allow number for ELF symbol type · c836e488
      James Henderson authored
      yaml2obj previously only recognised standard STT_* names, and didn't
      allow arbitrary numbers. This change allows the user to specify a number
      for the type instead. It also adds a test to verify the existing
      behaviour for obj2yaml for unknown symbol types.
      
      Reviewed by: grimar
      
      Differential Revision: https://reviews.llvm.org/D57822
      
      llvm-svn: 353315
      c836e488
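      A minimal sketch of the "name or number" parsing described in the
      yaml2obj entry above. This is not the yaml2obj code: `parseSymbolType`
      is a hypothetical helper, the name table lists only the generic STT_*
      values from the ELF specification, and rejecting values above 15 (the
      type is a 4-bit field of st_info) is a choice made for this example.

      ```cpp
      #include <cassert>  // assert() is used in the usage note below
      #include <cstdint>
      #include <cstdlib>
      #include <string>

      // Parses an ELF symbol type given either as a standard STT_* name or as
      // a number (decimal, octal or hex). Returns true and sets `type` on success.
      inline bool parseSymbolType(const std::string &text, std::uint8_t &type) {
        static const struct {
          const char *name;
          std::uint8_t value;
        } known[] = {
            {"STT_NOTYPE", 0}, {"STT_OBJECT", 1}, {"STT_FUNC", 2}, {"STT_SECTION", 3},
            {"STT_FILE", 4},   {"STT_COMMON", 5}, {"STT_TLS", 6},
        };
        for (const auto &entry : known) {
          if (text == entry.name) {
            type = entry.value;
            return true;
          }
        }
        if (text.empty())
          return false;
        char *end = nullptr;
        const unsigned long value = std::strtoul(text.c_str(), &end, 0);
        if (*end != '\0' || value > 15) // not a number, or too wide for 4 bits
          return false;
        type = static_cast<std::uint8_t>(value);
        return true;
      }
      ```

      Usage: `assert(parseSymbolType("STT_FUNC", t) && t == 2)` and
      `assert(parseSymbolType("7", t) && t == 7)` both hold, while an unknown
      name such as `"bogus"` is rejected.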
    • Sanjay Patel's avatar
      [InstCombine] X | C == C --> (X & ~C) == 0 · 68bc5fb0
      Sanjay Patel authored
      We should canonicalize to one of these forms,
      and compare-with-zero could be more conducive
      to follow-on transforms. This also leads to
      generally better codegen as shown in PR40611:
      https://bugs.llvm.org/show_bug.cgi?id=40611
      
      llvm-svn: 353313
      68bc5fb0
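      The equivalence behind the InstCombine fold above, `X | C == C` versus
      `(X & ~C) == 0`, can be checked exhaustively for 8-bit values. The
      function names below are hypothetical and exist only for this check; the
      sketch says nothing about how InstCombine implements the transform.

      ```cpp
      #include <cassert>  // assert() is used in the usage note below
      #include <cstdint>

      // Original form: (X | C) == C.
      inline bool orEqualsMask(std::uint8_t x, std::uint8_t c) {
        return static_cast<std::uint8_t>(x | c) == c;
      }

      // Canonical form: (X & ~C) == 0, i.e. X has no bits outside of C.
      inline bool andNotIsZero(std::uint8_t x, std::uint8_t c) {
        return static_cast<std::uint8_t>(x & ~c) == 0;
      }

      // Compares both forms for every pair of 8-bit X and C.
      inline bool formsAgreeForAllI8() {
        for (unsigned x = 0; x < 256; ++x)
          for (unsigned c = 0; c < 256; ++c)
            if (orEqualsMask(static_cast<std::uint8_t>(x), static_cast<std::uint8_t>(c)) !=
                andNotIsZero(static_cast<std::uint8_t>(x), static_cast<std::uint8_t>(c)))
              return false;
        return true;
      }
      ```

      `assert(formsAgreeForAllI8())` passes: both forms are true exactly when
      every set bit of X is also set in C.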
    • Sanjay Patel's avatar
      [InstCombine] add tests for PR40611 and regenerate checks; NFC · 51abb86f
      Sanjay Patel authored
      Lots of unrelated diffs here from the newer version of the script.
      
      llvm-svn: 353312
      51abb86f
    • Tim Northover's avatar
      AArch64: enforce even/odd register pairs for CASP instructions. · 474f5d9b
      Tim Northover authored
      ARMv8.1a CASP instructions need the first of the pair to be an even register
      (otherwise the encoding is unallocated). We enforced this during assembly, but
      not CodeGen before.
      
      llvm-svn: 353308
      474f5d9b
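      The register-pair constraint from the CASP entry above amounts to a
      simple predicate. In this sketch registers are plain numbers,
      `isValidCaspPair` is a hypothetical name, and the requirement that the
      second register is the one immediately following the first is an
      assumption added here; the entry itself only states that the first must
      be even.

      ```cpp
      #include <cassert>  // assert() is used in the usage note below

      // True if (first, second) forms an even/odd consecutive register pair.
      // Assumption: registers are identified by their index (x0 -> 0, x1 -> 1, ...).
      inline bool isValidCaspPair(unsigned first, unsigned second) {
        return first % 2 == 0 && second == first + 1;
      }
      ```

      So `assert(isValidCaspPair(0, 1))` holds, while the pair (1, 2) starts
      on an odd register and is rejected.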
    • Nirav Dave's avatar
      [InlineAsm][X86] Add backend support for X86 flag output parameters. · e5c37958
      Nirav Dave authored
      Allow custom handling of inline assembly output parameters and add X86
      flag parameter support.
      
      llvm-svn: 353307
      e5c37958
    • Nirav Dave's avatar
      [SelectionDAGBuilder] Refactor Inline Asm output check. NFCI. · 54511076
      Nirav Dave authored
      llvm-svn: 353305
      54511076
    • Ulrich Weigand's avatar
      [SystemZ] Do not return INT_MIN from strcmp/memcmp · 17a00126
      Ulrich Weigand authored
      The IPM sequence currently generated to compute the strcmp/memcmp
      result will return INT_MIN for the "less than zero" case.  While
      this is in compliance with the standard, strictly speaking, it
      turns out that common applications cannot handle this, e.g. because
      they negate a comparison result in order to implement reverse
      compares.
      
      This patch changes code to use a different sequence that will result
      in -2 for the "less than zero" case (same as GCC).  However, this
      requires that the two source operands of the compare instructions
      are inverted, which breaks the optimization in removeIPMBasedCompare.
      Therefore, I've removed this (and all of optimizeCompareInstr), and
      replaced it with a mostly equivalent optimization in combineCCMask
      at the DAGcombine level.
      
      llvm-svn: 353304
      17a00126
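      Why INT_MIN is a hazard for callers, as described in the SystemZ
      strcmp/memcmp entry above: `-INT_MIN` is not representable as an int, so
      a reverse compare written as "negate the result" overflows. The sketch
      below shows the check and an alternative that swaps the operands instead
      of negating; all helper names are hypothetical.

      ```cpp
      #include <cassert>  // assert() is used in the usage note below
      #include <climits>
      #include <cstring>

      // True if `-result` is representable as an int, i.e. negation cannot overflow.
      inline bool canNegate(int result) { return result != INT_MIN; }

      // Reverse comparison that never negates: swap the operands instead.
      inline int reverseCompare(const char *a, const char *b) {
        return std::strcmp(b, a);
      }

      // Reduces a comparison result to -1, 0 or 1.
      inline int sign(int v) { return (v > 0) - (v < 0); }
      ```

      `assert(!canNegate(INT_MIN))` and `assert(canNegate(-2))` both hold,
      which is the difference between the old and new "less than" results
      mentioned in the entry.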
    • Tim Northover's avatar
      AArch64: annotate atomics with dropped acquire semantics when printing. · 71025a2f
      Tim Northover authored
      A quirk of the v8.1a spec is that when the writeback register for an atomic
      read-modify-write instruction is wzr/xzr, the instruction no longer enforces
      acquire ordering. However, it's still written with the misleading 'a' mnemonic.
      
      So this adds an annotation when disassembling such instructions, mentioning the
      change.
      
      llvm-svn: 353303
      71025a2f
    • Sanjay Patel's avatar
      [x86] vectorize cast ops in lowering to avoid register file transfers · e84fbb67
      Sanjay Patel authored
      The proposal in D56796 may cross the line because we're trying to avoid vectorization 
      transforms in generic DAG combining. So this is an alternate, later, x86-specific 
      translation of that patch.
      
      There are several potential follow-ups to enhance this:
      1. Allow extraction from non-zero element index.
      2. Peek through extends of smaller width integers.
      3. Support x86-specific conversion opcodes like X86ISD::CVTSI2P
      
      Differential Revision: https://reviews.llvm.org/D56864
      
      llvm-svn: 353302
      e84fbb67
    • Andrea Di Biagio's avatar
      [MCA] Speedup ResourceManager queries. NFCI · 02974728
      Andrea Di Biagio authored
      When a resource unit R is released, the ResourceManager notifies groups that
      contain R. Before this patch, the logic in method ResourceManager::release()
      implemented a potentially slow iterative search of dependent groups on the
      entire set of processor resources.
      This patch replaces that logic with a simpler (and often faster) lookup on array
      `Resource2Groups`.  This patch gives an average speedup of ~3-4% (observed on a
      release build when testing for target btver2).
      No functional change intended.
      
      llvm-svn: 353301
      02974728
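      The idea in the MCA entry above, trading a scan over all groups for a
      precomputed per-unit table, might look roughly like this. It is a toy
      model, not llvm-mca's ResourceManager: units and groups are bit
      positions in 64-bit masks, and `GroupIndex` and `groupsContainingSlow`
      are hypothetical names.

      ```cpp
      #include <cassert>
      #include <cstddef>
      #include <cstdint>
      #include <vector>

      // Reference version: scan every group mask to find those containing `unit`.
      inline std::uint64_t
      groupsContainingSlow(unsigned unit, const std::vector<std::uint64_t> &groupMasks) {
        std::uint64_t result = 0;
        for (std::size_t g = 0; g < groupMasks.size(); ++g)
          if (groupMasks[g] & (1ULL << unit))
            result |= 1ULL << g;
        return result;
      }

      // Precomputed table: unitToGroups[u] has bit g set if group g contains unit u.
      // Assumption: at most 64 units and at most 64 groups.
      struct GroupIndex {
        std::vector<std::uint64_t> unitToGroups;

        GroupIndex(unsigned numUnits, const std::vector<std::uint64_t> &groupMasks)
            : unitToGroups(numUnits, 0) {
          for (unsigned u = 0; u < numUnits; ++u)
            unitToGroups[u] = groupsContainingSlow(u, groupMasks);
        }

        // Single array lookup on release instead of a scan over all groups.
        std::uint64_t groupsContaining(unsigned unit) const {
          assert(unit < unitToGroups.size() && "unknown resource unit");
          return unitToGroups[unit];
        }
      };
      ```

      The table is built once, so each later query costs one indexed load no
      matter how many groups exist.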
    • Nico Weber's avatar
      gn build: Merge r353265, r353237 · da2bb5d5
      Nico Weber authored
      llvm-svn: 353298
      da2bb5d5
    • Eugene Leviant's avatar
      Attempt to fix buildbot after r353289 · ef6eba24
      Eugene Leviant authored
      llvm-svn: 353294
      ef6eba24
    • Clement Courbet's avatar
      [DAGCombine][NFC] GatherAllAliases should take a LSBaseSDNode. · 5a6712b6
      Clement Courbet authored
      GatherAllAliases only makes sense for LSBaseSDNode. Enforce it with
      static typing instead of runtime cast.
      
      llvm-svn: 353291
      5a6712b6
    • Max Kazantsev's avatar
      [NFC] Simplify check in guard widening · cd48ac36
      Max Kazantsev authored
      llvm-svn: 353290
      cd48ac36
    • Eugene Leviant's avatar
      [llvm-objcopy] Allow regular expressions in name comparison · f324f6dc
      Eugene Leviant authored
      Differential revision: https://reviews.llvm.org/D57517
      
      llvm-svn: 353289
      f324f6dc
    • James Henderson's avatar
      [DebugInfo]Print correct value for special opcode address increment · b6b5b1a5
      James Henderson authored
      The wrong variable was being used when printing the address increment in
      verbose output of .debug_line. This patch fixes this.
      
      Reviewed by: JDevlieghere
      
      Differential Revision: https://reviews.llvm.org/D57693
      
      llvm-svn: 353288
      b6b5b1a5
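      For context on the value printed by the DebugInfo fix above, here is how
      a special opcode's address and line advance are derived in the DWARF
      line number program (the form without VLIW op_index). This is a sketch
      of the arithmetic from the DWARF specification, not the LLVM code; the
      struct and function names are hypothetical.

      ```cpp
      #include <cassert>
      #include <cstdint>

      // Header fields of a line table that the decoding depends on.
      struct LineTableParams {
        std::uint8_t opcodeBase;
        std::int8_t lineBase;
        std::uint8_t lineRange;
        std::uint8_t minInstLength;
      };

      struct Advance {
        std::uint64_t address; // address increment in bytes
        std::int32_t line;     // signed line increment
      };

      // Decodes a special opcode (opcode >= opcodeBase) into its two advances.
      inline Advance decodeSpecialOpcode(std::uint8_t opcode, const LineTableParams &p) {
        assert(opcode >= p.opcodeBase && p.lineRange != 0);
        const unsigned adjusted = static_cast<unsigned>(opcode) - p.opcodeBase;
        Advance a;
        a.address = static_cast<std::uint64_t>(adjusted / p.lineRange) * p.minInstLength;
        a.line = p.lineBase + static_cast<std::int32_t>(adjusted % p.lineRange);
        return a;
      }
      ```

      With opcode_base 13, line_base -5, line_range 14 and a minimum
      instruction length of 1, opcode 0x4b gives an adjusted opcode of 62, an
      address increment of 4 and a line increment of 1.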
    • James Henderson's avatar
      [DebugInfo][llvm-symbolizer]Add some tests for edge cases when symbolizing · cd1424ae
      James Henderson authored
      This patch adds half a dozen new tests that test various edge cases in
      the behaviour of the symbolizer and DWARF data parsing. All of them test
      the current behaviour.
      
      Reviewed by: JDevlieghere, aprantl
      
      Differential Revision: https://reviews.llvm.org/D57741
      
      llvm-svn: 353286
      cd1424ae
    • Roman Lebedev's avatar
      [yaml::BinaryRef] Slight perf tuning (for llvm-exegesis analysis mode) · 41828010
      Roman Lebedev authored
      Summary:
      llvm-exegesis uses this functionality to read its benchmark dumps.
      This reading of `.yaml`s takes ~60% of runtime for 14656 benchmark points (i.e. one sweep over all x86 instructions),
      but only 30% of time for 3x as many benchmark points.
      
      In particular, this `BinaryRef` appears to be an obvious pain point.
      Without patch:
      ```
      $ perf stat -r 25 ./bin/llvm-exegesis -mode=analysis -analysis-epsilon=1.0 -benchmarks-file=/tmp/benchmarks-inverse_throughput-onefull.yaml -analysis-clusters-output-file="" -analysis-inconsistencies-output-file=/tmp/clusters-orig.html
      no exegesis target for x86_64-unknown-linux-gnu, using default
      Parsed 14656 benchmark points
      Printing sched class consistency analysis results to file '/tmp/clusters-orig.html'
      ...
      no exegesis target for x86_64-unknown-linux-gnu, using default
      Parsed 14656 benchmark points
      Printing sched class consistency analysis results to file '/tmp/clusters-orig.html'
      
       Performance counter stats for './bin/llvm-exegesis -mode=analysis -analysis-epsilon=1.0 -benchmarks-file=/tmp/benchmarks-inverse_throughput-onefull.yaml -analysis-clusters-output-file= -analysis-inconsistencies-output-file=/tmp/clusters-orig.html' (25 runs):
      
                  972.86 msec task-clock                #    0.994 CPUs utilized            ( +-  0.25% )
                      30      context-switches          #   30.774 M/sec                    ( +- 21.74% )
                       0      cpu-migrations            #    0.370 M/sec                    ( +- 67.81% )
                   11873      page-faults               # 12211.512 M/sec                   ( +-  0.00% )
              3898373408      cycles                    # 4009682.186 GHz                   ( +-  0.25% )  (83.12%)
               360399748      stalled-cycles-frontend   #    9.24% frontend cycles idle     ( +-  0.54% )  (83.24%)
              1099450483      stalled-cycles-backend    #   28.20% backend cycles idle      ( +-  0.59% )  (33.63%)
              4910528820      instructions              #    1.26  insn per cycle
                                                        #    0.22  stalled cycles per insn  ( +-  0.13% )  (50.21%)
              1111976775      branches                  # 1143726625.854 M/sec              ( +-  0.10% )  (66.77%)
                23248474      branch-misses             #    2.09% of all branches          ( +-  0.19% )  (83.29%)
      
                 0.97850 +- 0.00647 seconds time elapsed  ( +-  0.66% )
      ```
      With the patch:
      ```
      $ perf stat -r 25 ./bin/llvm-exegesis -mode=analysis -analysis-epsilon=1.0 -benchmarks-file=/tmp/benchmarks-inverse_throughput-onefull.yaml -analysis-clusters-output-file="" -analysis-inconsistencies-output-file=/tmp/clusters-new.html
      no exegesis target for x86_64-unknown-linux-gnu, using default
      Parsed 14656 benchmark points
      Printing sched class consistency analysis results to file '/tmp/clusters-new.html'
      ...
      no exegesis target for x86_64-unknown-linux-gnu, using default
      Parsed 14656 benchmark points
      Printing sched class consistency analysis results to file '/tmp/clusters-new.html'
      
       Performance counter stats for './bin/llvm-exegesis -mode=analysis -analysis-epsilon=1.0 -benchmarks-file=/tmp/benchmarks-inverse_throughput-onefull.yaml -analysis-clusters-output-file= -analysis-inconsistencies-output-file=/tmp/clusters-new.html' (25 runs):
      
                  905.29 msec task-clock                #    0.999 CPUs utilized            ( +-  0.11% )
                      15      context-switches          #   16.533 M/sec                    ( +- 32.27% )
                       0      cpu-migrations            #    0.000 K/sec
                   11873      page-faults               # 13121.789 M/sec                   ( +-  0.00% )
              3627759720      cycles                    # 4009283.100 GHz                   ( +-  0.11% )  (83.19%)
               370401480      stalled-cycles-frontend   #   10.21% frontend cycles idle     ( +-  0.22% )  (83.19%)
              1007114438      stalled-cycles-backend    #   27.76% backend cycles idle      ( +-  0.34% )  (33.62%)
              4414014304      instructions              #    1.22  insn per cycle
                                                        #    0.23  stalled cycles per insn  ( +-  0.08% )  (50.36%)
              1003751700      branches                  # 1109314021.971 M/sec              ( +-  0.07% )  (66.97%)
                24611010      branch-misses             #    2.45% of all branches          ( +-  0.10% )  (83.41%)
      
                 0.90593 +- 0.00105 seconds time elapsed  ( +-  0.12% )
      ```
      So this decreases the overall run time of llvm-exegesis analysis mode (on one sweep) by roughly 7%.
      
      Note that the `BinaryRef::writeAsBinary()` change is the reason for the perf changes;
      usage of `llvm::isHexDigit()` instead of `isxdigit()` does not appear to have any perf impact,
      I have only changed it "for symmetry".

      The `writeAsBinary()` change is correct: it produces an identical de-hex-ified buffer, and the final output is thus identical:
      ```
      $ sha512sum /tmp/clusters-*
      db4bbd904fe8840853b589b032c5041bc060b91bcd9c27b914b56581fbc473550eea74b852238c79963b5adf2419f379e9f5db76784048b48e3937f9f3e732bf  /tmp/clusters-new.html
      db4bbd904fe8840853b589b032c5041bc060b91bcd9c27b914b56581fbc473550eea74b852238c79963b5adf2419f379e9f5db76784048b48e3937f9f3e732bf  /tmp/clusters-orig.html
      ```
      
      Reviewers: silvas, espindola, sbc100, zturner, courbet, gchatelet
      
      Reviewed By: gchatelet
      
      Subscribers: tschuett, RKSimon, llvm-commits
      
      Tags: #llvm
      
      Differential Revision: https://reviews.llvm.org/D57699
      
      llvm-svn: 353282
      41828010
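      The "de-hex-ifying" step discussed in the BinaryRef entry above converts
      pairs of hex digits into bytes. The sketch below shows that conversion
      in isolation; it is not the `BinaryRef::writeAsBinary()` implementation,
      and `hexDigitValue` and `hexToBytes` are hypothetical helpers.

      ```cpp
      #include <cassert>  // assert() is used in the usage note below
      #include <cstddef>
      #include <cstdint>
      #include <string>
      #include <vector>

      // Value of one hex digit, or -1 if the character is not a hex digit.
      inline int hexDigitValue(char c) {
        if (c >= '0' && c <= '9') return c - '0';
        if (c >= 'a' && c <= 'f') return c - 'a' + 10;
        if (c >= 'A' && c <= 'F') return c - 'A' + 10;
        return -1;
      }

      // Converts a hex string to bytes. Returns false (leaving `out` untouched)
      // on odd length or on any non-hex character.
      inline bool hexToBytes(const std::string &hex, std::vector<std::uint8_t> &out) {
        if (hex.size() % 2 != 0)
          return false;
        std::vector<std::uint8_t> bytes;
        bytes.reserve(hex.size() / 2);
        for (std::size_t i = 0; i < hex.size(); i += 2) {
          const int hi = hexDigitValue(hex[i]);
          const int lo = hexDigitValue(hex[i + 1]);
          if (hi < 0 || lo < 0)
            return false;
          bytes.push_back(static_cast<std::uint8_t>(hi * 16 + lo));
        }
        out.swap(bytes);
        return true;
      }
      ```

      Reserving the output once and decoding two digits per iteration keeps
      the loop free of per-byte allocations; whether that matches the source
      of the measured speedup is not something this sketch claims.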
    • Fangrui Song's avatar
      b8ee8c85
    • Max Kazantsev's avatar
      [NFC] Factor out detachment of dead blocks from their erasing · 36b392cb
      Max Kazantsev authored
      llvm-svn: 353277
      36b392cb
    • Max Kazantsev's avatar
      a4ccfc18
    • Max Kazantsev's avatar
      [NFC] Revert rL353274 · 0d7ad3c9
      Max Kazantsev authored
      llvm-svn: 353275
      0d7ad3c9
    • Max Kazantsev's avatar
      61e6ffc3