  1. Apr 19, 2017
    • Xin Tong's avatar
      Allow suppressing host and target info in VersionPrinter · 59cb7782
      Xin Tong authored
      Summary:
      VersionPrinter by default outputs information about the Host CPU
      and Default target. Printing this information requires linking in
      a large amount of data, such as supported target triples as C
      strings, which in turn bloats the binary size.
      
      Enable a new CMake option LLVM_VERSION_PRINTER_SHOW_HOST_TARGET_INFO
      which controls printing of the host and target info. This allows
      the target triple names to be dead-code stripped. This is a nice
      win for LLVM clients that wish to minimize their binary size, such
      as graphics drivers.
      
      By default this is ON, so there is no change in the default behavior.
      Clients who wish to suppress this printing can do so by setting this
      option to off via CMake.
      
      A test app on Linux that uses ParseCommandLineOptions() shows a binary
      size reduction of 23KB (from 149K to 126K) for a Release build, and 24KB
      (from 135K to 111K) in a MinSizeRel build.
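
A minimal sketch of how such a build-time switch can gate the output, assuming the CMake option reaches the code as a 0/1 preprocessor definition of the same name. The helpers `defaultTargetTriple()`, `hostCPUName()` and `versionText()` are hypothetical stand-ins, not the LLVM functions:

```cpp
#include <string>

// Assumption: the build system defines this to 0 or 1. Default to 1 (ON)
// so the sketch matches the default behavior described above.
#ifndef LLVM_VERSION_PRINTER_SHOW_HOST_TARGET_INFO
#define LLVM_VERSION_PRINTER_SHOW_HOST_TARGET_INFO 1
#endif

// Hypothetical helpers standing in for the code that drags in target data.
std::string defaultTargetTriple() { return "x86_64-unknown-linux-gnu"; }
std::string hostCPUName() { return "generic"; }

// Builds the version text; the host/target lines are compiled in only
// when the option is ON.
std::string versionText() {
  std::string Out = "LLVM version (sketch)\n";
#if LLVM_VERSION_PRINTER_SHOW_HOST_TARGET_INFO
  Out += "  Default target: " + defaultTargetTriple() + "\n";
  Out += "  Host CPU: " + hostCPUName() + "\n";
#endif
  return Out;
}
```

With the option OFF, nothing references the two helpers any more, which is what lets a linker drop them and the data they pull in.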
      
      Reviewers: klimek, beanz, bogner, chandlerc, compnerd
      
      Reviewed By: compnerd
      
      Patch by pammon (Peter Ammon)!
      
      Subscribers: llvm-commits
      
      Differential Revision: https://reviews.llvm.org/D30904
      
      llvm-svn: 300630
      59cb7782
    • Dylan McKay's avatar
      [AVR] Fix the build · eb24b850
      Dylan McKay authored
      'PointerSize' was renamed to 'CodePointerSize'.
      
      llvm-svn: 300629
      eb24b850
    • Dean Michael Berris's avatar
      [XRay][tools] Add option to llvm-xray extract to symbolize functions · 918802be
      Dean Michael Berris authored
      Summary:
      If the symbol names are available in the binary, this allows us to
      provide the function name in the YAML output.
      
      Reviewers: dblaikie, pelikan
      
      Subscribers: llvm-commits
      
      Differential Revision: https://reviews.llvm.org/D32153
      
      llvm-svn: 300624
      918802be
    • Craig Topper's avatar
      [ConstantRange] Optimize APInt creation in getSignedMax/getSignedMin. · 88c64f32
      Craig Topper authored
      We were creating an APInt at the top of these methods that isn't always returned. For ranges wider than 64 bits this results in an allocation and deallocation when it's not used.
      
      In getSignedMax we were creating Upper-1 to use in a compare and then creating it again for a return value. The compiler is unable to determine that these can be shared. So help it out and create the Upper-1 in a temporary that can be reused.
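
The reuse of `Upper-1` can be illustrated with a toy model. This is only a sketch: `Range8` is a hypothetical 8-bit half-open range `[Lower, Upper)` with wraparound, not the `ConstantRange`/`APInt` code, and it assumes two's complement conversion from `uint8_t` to `int8_t`:

```cpp
#include <cstdint>

// Hypothetical 8-bit stand-in for a wrapping half-open range [Lower, Upper).
struct Range8 {
  uint8_t Lower, Upper;

  // The range wraps around when Lower is above Upper in unsigned order.
  bool isWrapped() const { return Lower > Upper; }
  static int8_t asSigned(uint8_t V) { return static_cast<int8_t>(V); }

  int8_t getSignedMax() const {
    if (!isWrapped()) {
      // Compute Upper-1 once; use it for both the compare and the return.
      const uint8_t UpperMinusOne = static_cast<uint8_t>(Upper - 1);
      if (asSigned(Lower) <= asSigned(UpperMinusOne))
        return asSigned(UpperMinusOne);
      return INT8_MAX; // the range crosses the signed maximum
    }
    // Wrapped: if both bounds have the same sign, the range covers INT8_MAX.
    if ((asSigned(Lower) < 0) == (asSigned(Upper) < 0))
      return INT8_MAX;
    return asSigned(static_cast<uint8_t>(Upper - 1));
  }
};
```

For a wide `APInt` the temporary matters because each `Upper-1` would otherwise be a separate heap-allocated value; in this 8-bit model the saving is only illustrative.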
      
      This provides a little compile time improvement.
      
      llvm-svn: 300621
      88c64f32
    • Sanjay Patel's avatar
      [x86] add tests for potential andn optimization; NFC · ff981f92
      Sanjay Patel authored
      llvm-svn: 300617
      ff981f92
    • Reid Kleckner's avatar
      Fix crash in AttributeList::addAttributes, add test · fe64c013
      Reid Kleckner authored
      llvm-svn: 300614
      fe64c013
    • Sanjoy Das's avatar
      Add a getPointerOperandType() helper to LoadInst and StoreInst; NFC · f09c1e34
      Sanjoy Das authored
      I will use this in a later change.
      
      llvm-svn: 300613
      f09c1e34
  2. Apr 18, 2017
    • Craig Topper's avatar
      [MemoryBuiltins] Add isMallocOrCallocLikeFn so BasicAA can check for both at the same time · 09bb760b
      Craig Topper authored
      BasicAA wants to know if a function is either a malloc-like or calloc-like function. Currently we have to check both separately. This means both calls check if it's an intrinsic, query TLI, check the nobuiltin attribute, scan the AllocationFnData, etc.
      
      This patch adds an isMallocOrCallocLikeFn so we can go through all of the checks once per call.
      
      This also changes the one other location I saw that called both together.
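
One way to picture the combined query is a bitmask over allocation kinds, so a single lookup answers "malloc-like or calloc-like". The sketch below is an assumption-laden illustration: the `AllocKind` enum, the name table and both functions are hypothetical and keyed by function name, whereas the real checks work on call sites and TLI:

```cpp
#include <string>
#include <unordered_map>

// Hypothetical kind flags; each kind is one bit so kinds can be OR-ed.
enum AllocKind : unsigned {
  MallocLike = 1u << 0,
  CallocLike = 1u << 1,
  ReallocLike = 1u << 2
};

// Hypothetical table standing in for the allocation function data.
static const std::unordered_map<std::string, unsigned> &allocTable() {
  static const std::unordered_map<std::string, unsigned> Table = {
      {"malloc", MallocLike}, {"calloc", CallocLike}, {"realloc", ReallocLike}};
  return Table;
}

// Performs one lookup and tests the entry's kind against every kind in Mask.
bool isAllocKind(const std::string &Name, unsigned Mask) {
  auto It = allocTable().find(Name);
  return It != allocTable().end() && (It->second & Mask) != 0;
}

// Combined query: one pass through the checks instead of two.
bool isMallocOrCallocLike(const std::string &Name) {
  return isAllocKind(Name, MallocLike | CallocLike);
}
```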
      
      Differential Revision: https://reviews.llvm.org/D32188
      
      llvm-svn: 300608
      09bb760b
    • Davide Italiano's avatar
      80fe987b
    • Matt Arsenault's avatar
      DAG: Make mayBeEmittedAsTailCall parameter const · 3138075d
      Matt Arsenault authored
      llvm-svn: 300603
      3138075d
    • Matt Arsenault's avatar
      Fix typo · aa31dce3
      Matt Arsenault authored
      llvm-svn: 300597
      aa31dce3
    • Matt Arsenault's avatar
      AMDGPU: Make MFI fields private · 161e2b42
      Matt Arsenault authored
      llvm-svn: 300596
      161e2b42
    • Craig Topper's avatar
      [MemoryBuiltins] Use ImmutableCallSite instead of CallSite to remove a const_cast and const correct. NFCI · eae6db0e
      Craig Topper authored
      
      llvm-svn: 300585
      eae6db0e
    • Daniel Berlin's avatar
      NewGVN: Fix memory congruence verification. The return true should be a return false. Merge the appropriate if statements so it doesn't happen again. · 9d0042b4
      Daniel Berlin authored
      
      llvm-svn: 300584
      9d0042b4
    • Chih-Hung Hsieh's avatar
      [X86] Keep EXTRACT_VECTOR_ELT result type as f128 for Android x86_64. · 877923a8
      Chih-Hung Hsieh authored
      Android x86_64 target uses f128 type and stores f128 values in %xmm* registers.
      SoftenFloatRes_EXTRACT_VECTOR_ELT should not convert result value
      from f128 to i128.
      
      Differential Revision: http://reviews.llvm.org/D32102
      
      llvm-svn: 300583
      877923a8
    • Craig Topper's avatar
    • Simon Pilgrim's avatar
      9398649f
    • Easwaran Raman's avatar
      [SLP vectorizer] Allow phi node reordering in tryToVectorizeList. · 76aba5f6
      Easwaran Raman authored
      In tryToVectorizeList, under a very limited circumstance (when entered
      from tryToVectorizePair), the values may be reordered (swapped) and the
      SLP tree is built with the new order. This extends that to the case when
      starting from phis in vectorizeChainsInBlock when there are exactly two
      phis. The textual order of phi nodes shouldn't really matter. Without
      this change, the loop body in the accompanying test case is fully vectorized
      when we swap the order of the phis but not with this order. While this
      doesn't solve the phi-ordering problem in a general way (for more than 2
      phis), this is a simple fix that piggybacks on an existing mechanism and
      is useful in cases like multiplying two complex numbers.
      
      Differential revision: https://reviews.llvm.org/D32065
      
      llvm-svn: 300574
      76aba5f6
    • Simon Pilgrim's avatar
      [X86] Use for-range loop. NFCI. · e8ad1da4
      Simon Pilgrim authored
      llvm-svn: 300567
      e8ad1da4
    • Craig Topper's avatar
      [APInt] Use lshrInPlace to replace lshr where possible · fc947bcf
      Craig Topper authored
      This patch uses lshrInPlace to replace code where the object that lshr is called on is being overwritten with the result.
      
      This adds an lshrInPlace(const APInt &) version as well.
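
The difference between the copying and in-place forms can be sketched on a simple multi-word integer. `BigUInt` below is a hypothetical little-endian word vector, not `APInt`; it only shows why `X = X.lshr(N)` costs a temporary that `X.lshrInPlace(N)` avoids:

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical multi-word unsigned integer; Words[0] is the least
// significant 64-bit word.
struct BigUInt {
  std::vector<uint64_t> Words;

  // Logical shift right that overwrites this object; no allocation.
  void lshrInPlace(unsigned Shift) {
    const std::size_t WordShift = Shift / 64;
    const unsigned BitShift = Shift % 64;
    const std::size_t N = Words.size();
    // Walk from low to high: each destination word reads only words at the
    // same or a higher index, which have not been overwritten yet.
    for (std::size_t I = 0; I < N; ++I) {
      const std::size_t Src = I + WordShift;
      const uint64_t Lo = Src < N ? Words[Src] : 0;
      const uint64_t Hi = Src + 1 < N ? Words[Src + 1] : 0;
      Words[I] = BitShift == 0 ? Lo : (Lo >> BitShift) | (Hi << (64 - BitShift));
    }
  }

  // Copying form: builds a new value and leaves this object unchanged.
  BigUInt lshr(unsigned Shift) const {
    BigUInt Result = *this; // copies the word storage
    Result.lshrInPlace(Shift);
    return Result;
  }
};
```

When the result is assigned straight back to the same variable, the in-place call gives the same value without copying the word storage.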
      
      Differential Revision: https://reviews.llvm.org/D32155
      
      llvm-svn: 300566
      fc947bcf
    • Daniel Berlin's avatar
      NewGVN: Don't waste time value numbering unreachable blocks · ec9deb7f
      Daniel Berlin authored
      llvm-svn: 300565
      ec9deb7f
    • Nirav Dave's avatar
      [DAG] Improve store merge candidate pruning. · 855ef456
      Nirav Dave authored
      Remove non-consecutive stores from store merge candidate search as
      they cannot be merged and will prevent us from finding subsequent
      mergeable store cases.
      
      Reviewers: jyknight, bogner, javed.absar, spatel
      
      Subscribers: llvm-commits
      
      Differential Revision: https://reviews.llvm.org/D32086
      
      llvm-svn: 300561
      855ef456
    • Nirav Dave's avatar
      Add base-index-based store merge test · e50544cd
      Nirav Dave authored
      llvm-svn: 300559
      e50544cd
    • Zvi Rackover's avatar
      LoopRerollPass: Prefer Value::hasOneUse() over Value::getNumUses(). NFC. · d942397e
      Zvi Rackover authored
      getNumUses() can be more expensive as it iterates over all of the use list's elements.
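
The cost difference is easy to see on a singly linked list. The sketch below uses a hypothetical `ValueSketch` holding its uses in a `std::forward_list`; it is not the `llvm::Value` implementation:

```cpp
#include <cstddef>
#include <forward_list>
#include <iterator>

// Hypothetical value whose uses are kept in a singly linked list.
struct ValueSketch {
  std::forward_list<int> Uses;

  // Looks at no more than the first two nodes: constant time.
  bool hasOneUse() const {
    auto It = Uses.begin();
    return It != Uses.end() && std::next(It) == Uses.end();
  }

  // Walks the whole list: linear in the number of uses.
  std::size_t getNumUses() const {
    return static_cast<std::size_t>(std::distance(Uses.begin(), Uses.end()));
  }
};
```

A check written as `getNumUses() == 1` gives the same answer as `hasOneUse()` but pays for a full traversal first.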
      
      llvm-svn: 300558
      d942397e
    • Gil Rapaport's avatar
      [LV] Cache block mask values · fb1d915a
      Gil Rapaport authored
      This patch is part of D28975's breakdown.
      
      Add caching for block masks similar to the cache already used for edge masks,
      replacing generation per user with reusing the first generated value which
      dominates all uses.
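
The caching idea amounts to memoizing mask generation per block. A minimal sketch, with a hypothetical `MaskCache` that uses integer block ids and string placeholders for mask values; it does not model the dominance requirement mentioned above:

```cpp
#include <map>
#include <string>

// Hypothetical cache: one generated mask per block, reused by later users.
struct MaskCache {
  std::map<int, std::string> BlockMasks;
  unsigned Generated = 0; // counts how often a mask was actually built

  const std::string &getOrCreate(int BlockId) {
    auto It = BlockMasks.find(BlockId);
    if (It != BlockMasks.end())
      return It->second; // cache hit: reuse the first generated value
    ++Generated;
    return BlockMasks.emplace(BlockId, "mask.bb" + std::to_string(BlockId))
        .first->second;
  }
};
```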
      
      Differential Revision: https://reviews.llvm.org/D32054
      
      llvm-svn: 300557
      fb1d915a
    • Sanjay Patel's avatar
      [ConstantRange] fix doxygen comment formatting; NFC · 78d163c7
      Sanjay Patel authored
      llvm-svn: 300554
      78d163c7
    • Nikolai Bozhenov's avatar
      Make globalaa-retained.ll test catch more cases. · 95fc6441
      Nikolai Bozhenov authored
      Summary:
      * Add checks for store. That is needed because GlobalsAA is called
        twice in the current pipeline with different sets of Function passes
        following it. However, the loads are eliminated using instcombine
        which happens everywhere. On the other hand, DeadStoreElimination is
        performed only once so by checking for store we'll be able to catch
        more cases when GlobalsAA is invalidated unintentionally.
      * Add empty function above/below the test so that we don't depend on
        the relative order of instcombine/dead-store-elimination and the
        pass that invalidates the analysis (inside the same
        FunctionPassManager).
      
      Reviewers: kristof.beyls
      
      Reviewed By: kristof.beyls
      
      Subscribers: llvm-commits, n.bozhenov
      
      Differential Revision: https://reviews.llvm.org/D32015
      Patch by Andrei Elovikov <andrei.elovikov@intel.com>
      
      llvm-svn: 300553
      95fc6441
    • Nikolai Bozhenov's avatar
      [GVNHoist] Mark GlobalsAA as preserved by GVNHoist. · 9e4a1c39
      Nikolai Bozhenov authored
      Reviewers: sebpop, hiraditya
      
      Reviewed By: sebpop
      
      Subscribers: n.bozhenov, llvm-commits
      
      Differential Revision: https://reviews.llvm.org/D32158
      Patch by Andrei Elovikov <andrei.elovikov@intel.com>
      
      llvm-svn: 300552
      9e4a1c39
    • Nirav Dave's avatar
      Add store Merge test. · b9776849
      Nirav Dave authored
      llvm-svn: 300551
      b9776849
    • Oliver Stannard's avatar
      [ARM] Add hardware build attributes in assembler · 7ad2e8aa
      Oliver Stannard authored
      In the assembler, we should emit build attributes based on the target
      selected with command-line options. This matches the GNU assembler's
      behaviour. We only do this for build attributes which describe the
      hardware that is expected to be available, not the ones that describe
      ABI compatibility.
      
      This is done by moving some of the attribute emission code to
      ARMTargetStreamer, so that it can be shared between the assembly and
      code-generation code paths. Since the assembler only creates a
      MCSubtargetInfo, not an ARMSubtarget, the code had to be changed to
      check raw features, and not use the convenience functions in
      ARMSubtarget.
      
      If different attributes are later specified using the .eabi_attribute
      directive, then they will take precedence, as happens when the same
      .eabi_attribute is specified twice.
      
      This must be enabled by an option, because we don't want to do this when
      parsing inline assembly. The attributes would match the ones emitted at
      the start of the file, so wouldn't actually change the emitted object
      file, but the extra directives would be added to every inline assembly
      block when emitting assembly, which we'd like to avoid.
      
      The majority of the changes in the build-attributes.ll test are just
      re-ordering the directives, because the hardware attributes are now
      emitted before the ABI ones. However, I did fix one bug which I spotted:
      Tag_CPU_arch_profile was not being emitted for v6M.
      
      Differential revision: https://reviews.llvm.org/D31812
      
      llvm-svn: 300547
      7ad2e8aa
    • Diana Picus's avatar
      [ARM] GlobalISel: Add support for G_SUB · a3a0cccb
      Diana Picus authored
      Support G_SUB throughout the GlobalISel pipeline. It is exactly the same
      as G_ADD, nothing fancy.
      
      llvm-svn: 300546
      a3a0cccb
    • Andrea Di Biagio's avatar
      517e3fc3
    • Andrea Di Biagio's avatar
      [SampleProfile] Skip intrinsic calls when visiting callsites in InlineHotFunctions. · e3edef09
      Andrea Di Biagio authored
      Before this patch, we always called method 'findCalleeFunctionSamples()' on
      intrinsic calls. However, intrinsic calls like llvm.dbg.value() are not viable
      candidates for obvious reasons.
      
      No functional change intended.
      
      Differential Revision: https://reviews.llvm.org/D32008
      
      llvm-svn: 300541
      e3edef09
    • Kristof Beyls's avatar
      Revert "[GlobalISel] Support vector-of-pointers in LLT" · a4e79cca
      Kristof Beyls authored
      This reverts r300535 and r300537.
      The newly added tests in test/CodeGen/AArch64/GlobalISel/arm64-fallback.ll
      produce slightly different code depending on which compiler LLVM was built with.
      E.g., dependent on the compiler LLVM is built with, either one of the following
      can be produced:
      
      remark: <unknown>:0:0: unable to legalize instruction: %vreg0<def>(p0) = G_EXTRACT_VECTOR_ELT %vreg1, %vreg2; (in function: vector_of_pointers_extractelement)
      remark: <unknown>:0:0: unable to legalize instruction: %vreg2<def>(p0) = G_EXTRACT_VECTOR_ELT %vreg1, %vreg0; (in function: vector_of_pointers_extractelement)
      
      Non-determinism like this is clearly a bad thing, so reverting this until
      I can find and fix the root cause of the non-determinism.
      
      llvm-svn: 300538
      a4e79cca
    • Kristof Beyls's avatar
      Fix gcc build after r300535. · c10e6250
      Kristof Beyls authored
      llvm-svn: 300537
      c10e6250
    • Diana Picus's avatar
      [ARM] Check for correct HW div when lowering divmod · e2626bb7
      Diana Picus authored
      For subtargets that use the custom lowering for divmod, e.g. gnueabi,
      we used to check if the subtarget has hardware divide and then lower to
      a div-mul-sub sequence if true, or to a libcall if false.
      
      However, judging by the usage of hasDivide vs hasDivideInARMMode, it
      seems that hasDivide only refers to Thumb. For instance, in the
      ARMTargetLowering constructor, the code that specifies whether to use
      libcalls for (S|U)DIV looks like this:
      
      bool hasDivide = Subtarget->isThumb() ? Subtarget->hasDivide()
                                            : Subtarget->hasDivideInARMMode();
      
      In the case of divmod for arm-gnueabi, using only hasDivide() to
      determine what to do means that instead of lowering to __aeabi_idivmod
      to get the remainder, we lower to div-mul-sub and then further lower the
      div to __aeabi_idiv. Even worse, if we have hardware divide in ARM but
      not in Thumb, we generate a libcall instead of using it (this is not an
      issue in practice since AFAICT none of the cores that we support have
      hardware divide in ARM but not Thumb).
      
      This patch fixes the code dealing with custom lowering to take into
      account the mode (Thumb or ARM) when deciding whether or not hardware
      division is available.
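
The mode-aware decision can be sketched as follows. `SubtargetSketch`, its fields and `lowerDivMod` are hypothetical stand-ins for the subtarget queries quoted above, not the ARM backend code:

```cpp
// Hypothetical subtarget description; HasDivideInThumb plays the role of
// hasDivide() and HasDivideInARM that of hasDivideInARMMode().
struct SubtargetSketch {
  bool IsThumb;
  bool HasDivideInThumb;
  bool HasDivideInARM;
};

enum class DivModLowering { DivMulSub, LibCall };

// Picks the hardware-divide flag that matches the current mode.
bool hasHWDivideForMode(const SubtargetSketch &ST) {
  return ST.IsThumb ? ST.HasDivideInThumb : ST.HasDivideInARM;
}

// Uses div-mul-sub only when the divide is available in this mode;
// otherwise falls back to a single divmod libcall.
DivModLowering lowerDivMod(const SubtargetSketch &ST) {
  return hasHWDivideForMode(ST) ? DivModLowering::DivMulSub
                                : DivModLowering::LibCall;
}
```

In this model, ARM mode with a Thumb-only divider selects the libcall, which is the case the old Thumb-only check got wrong.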
      
      Differential Revision: https://reviews.llvm.org/D32005
      
      llvm-svn: 300536
      e2626bb7
    • Kristof Beyls's avatar
      [GlobalISel] Support vector-of-pointers in LLT · fb73eb03
      Kristof Beyls authored
      This fixes PR32471.
      
      As comment 10 on that bug report highlights
      (https://bugs.llvm.org//show_bug.cgi?id=32471#c10), there are quite a
      few different defensible design tradeoffs that could be made, including
      not representing pointers at all in LLT.
      
      I decided to go for representing vector-of-pointer as a concept in LLT,
      while keeping the size of the LLT type 64 bits (this is an increase from
      48 bits before). My rationale for keeping pointers explicit is that on
      some targets it is probably very handy to have the distinction between
      pointer and non-pointer (e.g. 68K has a different register bank for
      pointers IIRC). If we keep a scalar pointer, it probably is easiest to
      also have a vector-of-pointers to keep LLT relatively conceptually clean
      and orthogonal, while we don't have a very strong reason to break that
      orthogonality. Once we gain more experience on the use of LLT, we can
      of course reconsider this direction.
      
      Rejecting vector-of-pointer types in the IRTranslator is also an option
      to avoid the crash reported in PR32471, but that is only a very
      short-term solution; it also needs quite a few code tweaks in places,
      and is probably fragile. Therefore I didn't consider this the best
      option.
      
      llvm-svn: 300535
      fb73eb03
    • Leslie Zhai's avatar
      test commit · d6fe0db8
      Leslie Zhai authored
      llvm-svn: 300532
      d6fe0db8
    • Craig Topper's avatar
      [APInt] Cleanup the reverseBits slow case a little. · 9eaef075
      Craig Topper authored
      Use lshrInPlace. Use single bit extract and operator|=(uint64_t) to avoid a few temporary APInts.
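
A bit-at-a-time reversal in this style can be sketched on a plain 64-bit word. This is an illustration only: `reverseBits` below is a hypothetical free function on `uint64_t` with an explicit width, not the `APInt` member:

```cpp
#include <cassert>
#include <cstdint>

// Reverses the low Width bits of Val by repeatedly extracting the lowest
// bit, OR-ing it into the result, and shifting the source right in place.
uint64_t reverseBits(uint64_t Val, unsigned Width) {
  assert(Width >= 1 && Width <= 64 && "width must fit in one word");
  uint64_t Reversed = 0;
  for (unsigned I = 0; I < Width; ++I) {
    Reversed <<= 1;      // make room for the next bit
    Reversed |= Val & 1; // single-bit extract, OR-ed in as an integer
    Val >>= 1;           // in-place shift, no temporary
  }
  return Reversed;
}
```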
      
      llvm-svn: 300527
      9eaef075
    • Craig Topper's avatar
      [APInt] Make operator<<= shift in place. Improve the implementation of tcShiftLeft and use it to implement operator<<=. · a8a4f0db
      Craig Topper authored
      
      llvm-svn: 300526
      a8a4f0db