  1. Jul 29, 2017
  2. Jul 28, 2017
    • [Hexagon] Formatting changes, NFC · cfd8806b
      Krzysztof Parzyszek authored
      llvm-svn: 309442
      cfd8806b
    • [Inliner] Do not apply any bonus for cold callsites. · 51b809bf
      Easwaran Raman authored
      Summary:
      Inlining threshold is increased by application of bonuses when the
      callee has a single reachable basic block or is rich in vector
      instructions. Similarly, inlining cost is reduced by applying a large
      bonus when the last call to a static function is considered for
      inlining. This patch disables the application of these bonuses when the
      callsite or the callee is cold. The intention here is to prevent a large
      cold callsite from being inlined into a non-cold caller, which could in turn
      prevent that caller from being inlined. This is especially important when the
      cold callsite is a last call to a static since the associated bonus is
      very high.
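
The rule in the summary above can be sketched as a small model in C. This is only an illustration of the described behavior, not the code from the patch: the struct, the function names, and the bonus values are all assumptions made for the example.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical description of a callsite; not an LLVM data structure. */
struct callsite_info {
    bool callee_single_block;  /* callee has a single reachable basic block */
    bool callee_vector_heavy;  /* callee is rich in vector instructions */
    bool last_call_to_static;  /* last call to a static function */
    bool cold;                 /* callsite or callee is cold */
};

/* Illustrative bonus values, assumed for this sketch only. */
enum {
    SINGLE_BB_BONUS_PERCENT = 50,
    VECTOR_BONUS_PERCENT = 150,
    LAST_CALL_TO_STATIC_BONUS = 15000
};

/* Threshold after bonuses; cold callsites keep the base threshold. */
int effective_threshold(int base_threshold, const struct callsite_info *cs)
{
    assert(cs != 0);
    int threshold = base_threshold;
    if (cs->cold)
        return threshold;  /* no threshold bonus for cold callsites */
    if (cs->callee_single_block)
        threshold += base_threshold * SINGLE_BB_BONUS_PERCENT / 100;
    if (cs->callee_vector_heavy)
        threshold += base_threshold * VECTOR_BONUS_PERCENT / 100;
    return threshold;
}

/* Cost after the last-call-to-static bonus; skipped when cold. */
int effective_cost(int raw_cost, const struct callsite_info *cs)
{
    assert(cs != 0);
    if (!cs->cold && cs->last_call_to_static)
        return raw_cost - LAST_CALL_TO_STATIC_BONUS;
    return raw_cost;
}
```

In this model a cold callsite is compared against the unmodified threshold with its unmodified cost, so a large cold callee is much less likely to be accepted.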
      
      Reviewers: chandlerc, davidxl
      
      Subscribers: danielcdh, llvm-commits
      
      Differential Revision: https://reviews.llvm.org/D35823
      
      llvm-svn: 309441
      51b809bf
    • Remove the unused dbg.value offset from SelectionDAG (NFC) · a617576b
      Adrian Prantl authored
      Followup to r309426.
      rdar://problem/33580047
      
      llvm-svn: 309436
      a617576b
    • Reid Kleckner's avatar
      67de3489
    • [lit] Remove stale test inputs before running check-lit · 7895d2fc
      Reid Kleckner authored
      This should fix googletest-format test failures on the clang modules
      buildbots, which have a stale copy of the OneTest script in the build
      directory.
      
      llvm-svn: 309432
      7895d2fc
    • Reword sentence in LangRef · 1b842dad
      Adrian Prantl authored
      llvm-svn: 309431
      1b842dad
    • Remove the obsolete offset parameter from @llvm.dbg.value · abe04759
      Adrian Prantl authored
      There is no situation where this rarely-used argument cannot be
      substituted with a DIExpression and removing it allows us to simplify
      the DWARF backend. Note that this patch does not yet remove any of
      the newly dead code.
      
      rdar://problem/33580047
      Differential Revision: https://reviews.llvm.org/D35951
      
      llvm-svn: 309426
      abe04759
    • [SLP] Allow vectorization of the instruction from the same basic blocks only, NFC. · e109655c
      Alexey Bataev authored
      Summary:
      After some changes in the SLP vectorizer, we are missing some additional
      checks that limit which instructions are considered for vectorization. We
      should not analyze an instruction if its parent is not the same as the parent
      of the first instruction in the tree, or if it was already analyzed.
      
      Subscribers: mzolotukhin
      
      Differential Revision: https://reviews.llvm.org/D34881
      
      llvm-svn: 309425
      e109655c
    • Fix conditional tail call branch folding when both edges are the same · 9be82c31
      Reid Kleckner authored
      The conditional tail call logic did the wrong thing when both
      destinations of a conditional branch were the same:
      
      BB#1: derived from LLVM BB %entry
          Live Ins: %EFLAGS
          Predecessors according to CFG: BB#0
              JE_1 <BB#5>, %EFLAGS<imp-use,kill>
              JMP_1 <BB#5>
      
      BB#5: derived from LLVM BB %sw.epilog
          Predecessors according to CFG: BB#1
              TCRETURNdi64 <ga:@mergeable_conditional_tailcall>, 0, ...
      
      We would fold the JE_1 to a TCRETURNdi64cc, and then remove our BB#5
      successor. Then BB#5 would be deleted as it had no predecessors, leaving
      a dangling "JMP_1 <BB#5>" reference behind to cause assertions later.
      
      This patch checks that both conditional branch destinations are
      different before doing the transform. The standard branch folding logic
      is able to remove both the JMP_1 and the JE_1, and for my test case we
      end up forming a better conditional tail call later.
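
The added check can be modeled in a few lines of C. This is a simplified sketch of the condition described above, not the X86 backend code: the struct and function names are hypothetical, and blocks are represented only by their numbers.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a block ending in "Jcc taken; JMP fallthrough". */
struct cond_branch {
    int taken_block;        /* destination of the conditional jump */
    int fallthrough_block;  /* destination of the unconditional jump */
};

/*
 * Returns true if the conditional jump may be folded into a conditional
 * tail call to the call found in tail_call_block. The fold is refused when
 * both destinations are the same block, because removing that successor
 * would leave the unconditional jump pointing at a deleted block.
 */
bool may_fold_conditional_tail_call(const struct cond_branch *br,
                                    int tail_call_block)
{
    assert(br != 0);
    if (br->taken_block == br->fallthrough_block)
        return false;  /* the case from PR33980: leave it to branch folding */
    return br->taken_block == tail_call_block;
}
```

For the dump shown in the message (both jumps target BB#5), this model rejects the fold, leaving the redundant branches for the standard branch folding logic.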
      
      Fixes PR33980
      
      llvm-svn: 309422
      9be82c31
    • AMDGPU: Look through a bitcast user of an out argument · da9ab148
      Matt Arsenault authored
      This allows handling of a lot more of the interesting
      cases in Blender. Most of the large functions unlikely
      to be inlined have this pattern.
      
      This is a special case for what clang emits for OpenCL 3-element
      vectors. Annoyingly, these are emitted as
      <3 x elt>* pointers, but accessed as <4 x elt>* operations.
      This also needs to handle cases where a struct containing
      a single vector is used.
      
      llvm-svn: 309419
      da9ab148
    • [Value Tracking] Refactor icmp comparison logic into helper. NFC. · 2f49803c
      Chad Rosier authored
      llvm-svn: 309417
      2f49803c
    • AMDGPU: Add pass to replace out arguments · c06574ff
      Matt Arsenault authored
      It is better to return arguments directly in registers
      if we are making a call rather than introducing expensive
      stack usage. In a sample compile of one of
      Blender's many kernel variants, this fires on about
      20 different functions. Future improvements may be to
      recognize simple cases where the pointer is indexing a small
      array. This also fails when the store to the out argument
      is in a separate block from the return, which happens in
      a few of the Blender functions. This should also probably
      be using MemorySSA which might help with that.
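
At the source level, the rewrite described above corresponds roughly to the following pair of functions. The pass itself works on LLVM IR; this C sketch only shows the shape of the pattern, and the type and function names are made up for the example.

```c
#include <assert.h>

/* Hypothetical 4-component vector type used only for this illustration. */
struct vec4 {
    float x, y, z, w;
};

/*
 * Before: the result is written through an out pointer, so a caller
 * typically has to provide stack memory for it.
 */
void make_color_out(float base, struct vec4 *out)
{
    assert(out != 0);
    out->x = base;
    out->y = base * 2.0f;
    out->z = base * 4.0f;
    out->w = 1.0f;
}

/*
 * After: the same value is returned directly, which lets it travel in
 * registers. The store and the return are in the same block here, which
 * is the simple case the message says the pass handles.
 */
struct vec4 make_color_ret(float base)
{
    struct vec4 result;
    result.x = base;
    result.y = base * 2.0f;
    result.z = base * 4.0f;
    result.w = 1.0f;
    return result;
}
```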
      
      I'm not sure this is correct as a FunctionPass, but
      MemoryDependenceAnalysis seems to not work with
      a ModulePass.
      
      I'm also not sure where it should run. I think it should
      run before DeadArgumentElimination, so maybe either
      EP_CGSCCOptimizerLate or EP_ScalarOptimizerLate.
      
      llvm-svn: 309416
      c06574ff
    • [LVI] Constant-propagate a zero extension of the switch condition value through case edges · 1b179bc5
      Hiroshi Yamauchi authored
      Summary:
      LazyValueInfo currently computes the constant value of the switch condition through case edges, which allows the constant value to be propagated through the case edges.
      
      However, we have seen a case where a zero-extended value of the switch condition is used past the case edges, and constant propagation does not occur for it.
      
      This patch adds a small piece of logic to getEdgeValueLocal() to handle such a case.
      
      This is motivated by the Python 2.7 eval loop in PyEval_EvalFrameEx() where the lack of the constant propagation causes longer live ranges and more spill code than necessary.
      
      With this patch, we see that the code size of PyEval_EvalFrameEx() decreases by ~5.4% and a performance test improves by ~4.6%.
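
The source pattern in question looks roughly like the C function below. This is a hypothetical reduction, not code from Python or from the patch, and whether a compiler produces exactly the IR shape described depends on the frontend and on earlier passes.

```c
#include <assert.h>

/*
 * `wide` is a zero-extension of the switch condition and is computed
 * before the switch. On the edge for `case 3` the condition is known to
 * be 3, so `wide` is known to be 3 as well and `wide * 10` can fold to 30.
 */
int dispatch(unsigned char opcode)
{
    int wide = opcode;  /* zero-extended copy of the switch condition */
    assert(wide >= 0 && wide <= 255);  /* zero-extension keeps the 8-bit range */

    switch (opcode) {
    case 3:
        return wide * 10;   /* foldable to 30 once wide == 3 is known */
    case 7:
        return wide + 100;  /* foldable to 107 once wide == 7 is known */
    default:
        return -1;
    }
}
```

Without propagating the constant to `wide`, the extended value has to stay live across the switch, which matches the longer live ranges the message mentions.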
      
      Reviewers: wmi, dberlin, sanjoy
      
      Reviewed By: sanjoy
      
      Subscribers: davide, davidxl, llvm-commits
      
      Differential Revision: https://reviews.llvm.org/D34822
      
      llvm-svn: 309415
      1b179bc5
    • GlobalISel: map 128-bit values to an FPR by default. · a7f583e3
      Tim Northover authored
      Eventually we may want to allow a pair of GPRs but absolutely nothing in the
      entire world is ready for that yet.
      
      llvm-svn: 309404
      a7f583e3
    • [lit] Dump some FileCheck inputs to try to debug some failing tests · 125c74bc
      Reid Kleckner authored
      llvm-svn: 309400
      125c74bc
    • [lit] Fix shtest-format external_shell failures · 432914bb
      Reid Kleckner authored
      When using win32 cmd.exe, turn off command echoing at the beginning of
      the script (@echo off).
      
      Replace a bash shell script with a python script for the
      fail_with_bad_encoding test.
      
      llvm-svn: 309399
      432914bb
    • AMDGPU: Annotate implicitarg.ptr usage · 9166ce86
      Matt Arsenault authored
      We need to pass something to functions for this to work.
      It isn't derivable just from the kernarg segment pointer
      because the implicit arguments are placed after the
      kernel arguments.
      
      Also fixes missing test for the intrinsic.
      
      llvm-svn: 309398
      9166ce86
    • [GVN] Recommit the patch "Add phi-translate support in scalarpre" · 55c05e14
      Wei Mi authored
      Recommit after working around bug PR31652.
      
      Three bugs were fixed in previous recommits. The first fix is to use CurrentBlock
      instead of PREInstr's Parent as the parameter of performScalarPREInsertion, because
      the Parent of a cloned instruction may be uninitialized. The second is to stop
      PRE when the edge between CurrentBlock and its predecessor is a backedge and an
      operand of CurInst is defined inside CurrentBlock. A value defined inside the loop
      in the previous iteration cannot be regarded as available. The third is an
      out-of-bounds array access in a flipped if guard.
      
      Right now scalarpre doesn't have phi-translate support, so it misses some
      simple PRE opportunities. In the following testcase, for example, current scalarpre
      cannot recognize that the last "a * b" is fully redundant, because the a and b used
      by the last "a * b" expression are both defined by phis.
      
      long a[100], b[100], g1, g2, g3;
      __attribute__((pure)) long goo();
      
      void foo(long a, long b, long c, long d) {
      
        g1 = a * b;
        if (__builtin_expect(g2 > 3, 0)) {
          a = c;
          b = d;
          g2 = a * b;
        }
        g3 = a * b;      // fully redundant.
      
      }
      
      The patch adds phi-translate support in scalarpre. This is only a temporary
      solution before the newpre based on newgvn is available.
      
      Differential Revision: https://reviews.llvm.org/D32252
      
      llvm-svn: 309397
      55c05e14
    • [CMake] NFC. Add intrinsics_gen target to CMake Exports · 57265892
      Chris Bieneman authored
      By creating a dummy of this target in LLVMConfig.cmake, projects that can build against out-of-tree LLVM can freely depend on the target without needing conditionals on whether LLVM is in-tree or out-of-tree.
      
      llvm-svn: 309389
      57265892
    • [ValueTracking] Remove a number of unused arguments. NFC. · e42b44b8
      Chad Rosier authored
      llvm-svn: 309385
      e42b44b8
    • [AArch64] Standardize suffixes for LSE Atomics mnemonics (NFCI) · 08e88e8d
      Joel Jones authored
      This NFC changeset standardizes the suffixes used for LSE Atomics
      instructions.
      
      It changes the existing suffixes - 'b', 'h', 's', 'd' - to the existing
      standard 'B', 'H', 'W' and 'X'.
      
      This changeset is the result of the code review discussion for D35319.
      
      Patch by: steleman
      
      Differential Revision: https://reviews.llvm.org/D35927
      
      llvm-svn: 309384
      08e88e8d
    • [ARM] Add the option to directly access TLS pointer · 25e9e1b8
      Strahinja Petrovic authored
      This patch adds a choice of how to access the thread-local
      storage pointer (like '-mtp' in gcc).
      
      Differential Revision: https://reviews.llvm.org/D34408
      
      llvm-svn: 309381
      25e9e1b8
    • [X86] Add test case for PR33290 · 1ff3da72
      Simon Pilgrim authored
      llvm-svn: 309375
      1ff3da72
    • [X86][AVX] Cleanup shuffle combine tests - remove old prefixes. · 88d3bed3
      Simon Pilgrim authored
      llvm-svn: 309374
      88d3bed3
    • [ARM] Add test to check pcs of ARM ABI runtime floating point helpers · 5804364f
      Peter Smith authored
      The ARM Runtime ABI document (IHI0043) defines the AEABI floating point
      helper functions in section 4.1.2 The floating-point helper functions.
      The functions listed in this section must always use the base AAPCS calling
      convention.
      
      This test generates calls to all the helper functions that llvm supports
      and checks that the base AAPCS calling convention has been used. We test
      the equivalent of -mfloat-abi=soft, -mfloat-abi=softfp, -mfloat-abi=hardfp
      with an FPU that supports single and double precision, and one that only
      supports double precision.
      
      Differential Revision: https://reviews.llvm.org/D35904
      
      llvm-svn: 309371
      5804364f
    • [SCEV] Do not visit nodes twice in containsConstantSomewhere · fa496953
      Max Kazantsev authored
      This patch reworks the function that searches for constants in Add and Mul SCEV expression
      chains so that it no longer visits a node more than once, and also renames the function
      so that its name better matches its implementation and semantics.
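
The idea of visiting each node at most once can be sketched with a visited set over a small expression DAG. This is a generic C illustration under assumed types and names, not the ScalarEvolution implementation: node ids, the fixed-size limits, and the visit counter exist only for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_OPERANDS 4
#define MAX_NODES 64

/* Hypothetical expression node; shared operands make this a DAG. */
enum expr_kind { EXPR_CONSTANT, EXPR_ADD, EXPR_MUL, EXPR_UNKNOWN };

struct expr {
    enum expr_kind kind;
    int id;  /* unique index in [0, MAX_NODES), used by the visited set */
    size_t num_operands;
    const struct expr *operands[MAX_OPERANDS];
};

static bool search(const struct expr *e, bool visited[], size_t *visits)
{
    assert(e->id >= 0 && e->id < MAX_NODES);
    if (visited[e->id])
        return false;  /* already explored through another path */
    visited[e->id] = true;
    if (visits)
        ++*visits;

    if (e->kind == EXPR_CONSTANT)
        return true;
    if (e->kind != EXPR_ADD && e->kind != EXPR_MUL)
        return false;  /* only look through Add and Mul nodes */

    for (size_t i = 0; i < e->num_operands; ++i)
        if (search(e->operands[i], visited, visits))
            return true;
    return false;
}

/* Returns true if a constant is reachable through Add/Mul nodes only.
 * If visits is non-null, it receives the number of nodes explored. */
bool contains_constant_in_chain(const struct expr *root, size_t *visits)
{
    assert(root != NULL);
    bool visited[MAX_NODES] = { false };
    if (visits)
        *visits = 0;
    return search(root, visited, visits);
}
```

Without the visited set, a subexpression that is shared by several operands would be walked again for each use, which can become exponential in deeply shared chains.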
      
      Differential Revision: https://reviews.llvm.org/D35931
      
      llvm-svn: 309367
      fa496953
    • [MachineOutliner] NFC: Comment tidying · 4602c343
      Jessica Paquette authored
      The comment describing the suffix tree contained some
      out-of-date material about pruning.
      
      Also fixed some typos.
      
      llvm-svn: 309365
      4602c343
    • Revert rL309320 - "[OCaml] Respect CMAKE_C_FLAGS for OCaml C files" · a3ad81d2
      Michal Gorny authored
      This causes buildbot breakage for systems where OCaml files are built
      with a different compiler.
      
      llvm-svn: 309364
      a3ad81d2
    • test: require x86 backend · 61d81ec7
      Saleem Abdulrasool authored
      Ensure that the target is registered before using it.  Should fix the
      hexagon Bots.
      
      llvm-svn: 309363
      61d81ec7
    • MC: add support for cfi_return_column · a219b3d8
      Saleem Abdulrasool authored
      This adds support for the CFI pseudo-op return_column.  This specifies
      the frame table column which contains the return address.
      
      Addresses PR33953!
      
      llvm-svn: 309360
      a219b3d8
    • MC: clang-format enumeration (NFC) · b3c70c09
      Saleem Abdulrasool authored
      This was hard to insert elements into.  clang-format it so that it is
      easier.  NFC.
      
      llvm-svn: 309359
      b3c70c09
    • Revert "[SCEV] Cache results of computeExitLimit" · 843ab574
      Sanjoy Das authored
      This reverts commit r309080.  The patch needs to clear out the
      ScalarEvolution::ExitLimits cache in forgetMemoizedResults.
      
      I've replied on the commit thread for the patch with more details.
      
      llvm-svn: 309357
      843ab574