  1. Aug 25, 2016
  2. Aug 24, 2016
    • Todd Fiala's avatar
      fix darwin_log test errors on macOS < 10.12 · b17ac35f
      Todd Fiala authored
      The newer event-based tests I added neglected to do the
      macOS 10.12 check in the setup.  This caused earlier macOS
      test suite runs to attempt to compile code that doesn't exist.
      
      llvm-svn: 279672
      b17ac35f
    • Kyle Butt's avatar
      CodeGen: If Convert blocks that would form a diamond when tail-merged. · a8c7371d
      Kyle Butt authored
      The following function currently relies on tail-merging for
      if-conversion to succeed. The common tail of cond_true and cond_false is
      extracted, and this then forms a diamond pattern that can be
      successfully if-converted.
      
      If this block does not get extracted, either because tail-merging is
      disabled or the threshold is higher, we should still recognize this
      pattern and if-convert it.
      
      Fixed a regression in the original commit. Need to un-reverse branches after
      reversing them, or other conversions go awry.
      
      define i32 @t2(i32 %a, i32 %b) nounwind {
      entry:
              %tmp1434 = icmp eq i32 %a, %b           ; <i1> [#uses=1]
              br i1 %tmp1434, label %bb17, label %bb.outer
      
      bb.outer:               ; preds = %cond_false, %entry
              %b_addr.021.0.ph = phi i32 [ %b, %entry ], [ %tmp10, %cond_false ]
              %a_addr.026.0.ph = phi i32 [ %a, %entry ], [ %a_addr.026.0, %cond_false ]
              br label %bb
      
      bb:             ; preds = %cond_true, %bb.outer
              %indvar = phi i32 [ 0, %bb.outer ], [ %indvar.next, %cond_true ]
              %tmp. = sub i32 0, %b_addr.021.0.ph
              %tmp.40 = mul i32 %indvar, %tmp.
              %a_addr.026.0 = add i32 %tmp.40, %a_addr.026.0.ph
              %tmp3 = icmp sgt i32 %a_addr.026.0, %b_addr.021.0.ph
              br i1 %tmp3, label %cond_true, label %cond_false
      
      cond_true:              ; preds = %bb
              %tmp7 = sub i32 %a_addr.026.0, %b_addr.021.0.ph
              %tmp1437 = icmp eq i32 %tmp7, %b_addr.021.0.ph
              %indvar.next = add i32 %indvar, 1
              br i1 %tmp1437, label %bb17, label %bb
      
      cond_false:             ; preds = %bb
              %tmp10 = sub i32 %b_addr.021.0.ph, %a_addr.026.0
              %tmp14 = icmp eq i32 %a_addr.026.0, %tmp10
              br i1 %tmp14, label %bb17, label %bb.outer
      
      bb17:           ; preds = %cond_false, %cond_true, %entry
              %a_addr.026.1 = phi i32 [ %a, %entry ], [ %tmp7, %cond_true ], [ %a_addr.026.0, %cond_false ]
              ret i32 %a_addr.026.1
      }
      
      Without tail-merging or diamond-tail if conversion:
      LBB1_1:                                 @ %bb
                                              @ =>This Inner Loop Header: Depth=1
              cmp     r0, r1
              ble     LBB1_3
      @ BB#2:                                 @ %cond_true
                                              @   in Loop: Header=BB1_1 Depth=1
              subs    r0, r0, r1
              cmp     r1, r0
              it      ne
              cmpne   r0, r1
              bgt     LBB1_4
      LBB1_3:                                 @ %cond_false
                                              @   in Loop: Header=BB1_1 Depth=1
              subs    r1, r1, r0
              cmp     r1, r0
              bne     LBB1_1
      LBB1_4:                                 @ %bb17
              bx      lr
      
      With diamond-tail if conversion, but without tail-merging:
      @ BB#0:                                 @ %entry
              cmp     r0, r1
              it      eq
              bxeq    lr
      LBB1_1:                                 @ %bb
                                              @ =>This Inner Loop Header: Depth=1
              cmp     r0, r1
              ite     le
              suble   r1, r1, r0
              subgt   r0, r0, r1
              cmp     r1, r0
              bne     LBB1_1
      @ BB#2:                                 @ %bb17
              bx      lr
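
For orientation, the IR for @t2 reads like a subtraction-based loop. The C++ below is a reconstruction from that IR, not source taken from the commit: only the name t2 comes from the IR, and the positive-input precondition is an assumption added here because the loop may not terminate otherwise.

```cpp
#include <cassert>
#include <cstdint>

// Reconstructed from the IR for @t2 (an assumption, not the original source):
// repeatedly subtract the smaller value from the larger until both are equal.
int32_t t2(int32_t a, int32_t b) {
  assert(a > 0 && b > 0 && "sketch assumes positive inputs");
  while (a != b) {   // entry, and the compare that ends each arm
    if (a > b)
      a -= b;        // cond_true
    else
      b -= a;        // cond_false
  }
  return a;          // bb17
}
```

Both arms end in the same compare-and-branch back to the loop test. That shared ending is the common tail which, once recognized, leaves a diamond of two single subtractions to predicate.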
      
      llvm-svn: 279671
      a8c7371d
    • Kyle Butt's avatar
      IfConversion: Rescan diamonds. · 6262ca34
      Kyle Butt authored
      The cost of predicating a diamond is only the instructions that are not shared
      between the two branches. Additionally, if a predicate-clobbering instruction
      occurs in the shared portion of the branches (e.g. a cond move), it may still
      be possible to if-convert the sub-cfg. This change handles these two facts by
      rescanning the non-shared portion of a diamond sub-cfg to recalculate both the
      predication cost and whether both blocks are pred-clobbering.
      
      Fixed 2 bugs before recommitting. Branch instructions must be compared and found
      identical before diamond conversion. Also, predicate-clobbering instructions in
      the shared prefix disqualify a potential diamond conversion. Includes tests
      for both.
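
The cost idea can be sketched with a toy model in which a block is just a list of instruction strings. This is not LLVM's IfConversion code: Block, DiamondCost and scanDiamond are hypothetical names, and the sketch only counts instructions. It does not model predicate clobbering.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy model: a block is a list of instruction strings.
using Block = std::vector<std::string>;

struct DiamondCost {
  std::size_t sharedPrefix; // identical leading instructions
  std::size_t sharedSuffix; // identical trailing instructions
  std::size_t predicated;   // instructions that would need a predicate
};

// Count the instructions shared at the start and end of the two sides of a
// diamond; only the remainder has to be predicated.
DiamondCost scanDiamond(const Block &t, const Block &f) {
  const std::size_t limit = std::min(t.size(), f.size());
  std::size_t prefix = 0;
  while (prefix < limit && t[prefix] == f[prefix])
    ++prefix;
  std::size_t suffix = 0;
  while (suffix < limit - prefix &&
         t[t.size() - 1 - suffix] == f[f.size() - 1 - suffix])
    ++suffix;
  const std::size_t shared = prefix + suffix;
  assert(shared <= limit);
  return {prefix, suffix, (t.size() - shared) + (f.size() - shared)};
}
```

For two three-instruction blocks that differ only in their first instruction, the sketch reports a shared suffix of two and a predication cost of two, rather than six.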
      
      llvm-svn: 279670
      6262ca34
    • Jason Henline's avatar
      [StreamExecutor] Rename Executor to Device · bcc77b62
      Jason Henline authored
      Summary: This more clearly describes what the class is.
      
      Reviewers: jlebar
      
      Subscribers: jprice, parallel_libs-commits
      
      Differential Revision: https://reviews.llvm.org/D23851
      
      llvm-svn: 279669
      bcc77b62
    • Richard Smith's avatar
      Disable test under asan: it uses a lot of stack, and asan increases the · 571a6478
      Richard Smith authored
      per-frame stack usage enough to cause it to hit our stack limit. This is not
      ideal; we should find a better way of dealing with this, such as increasing
      our stack allocation when built with ASan.
      
      llvm-svn: 279668
      571a6478
    • Richard Smith's avatar
      PR29097: add an update record when we instantiate the default member · 4b054b26
      Richard Smith authored
      initializer of an imported field.
      
      llvm-svn: 279667
      4b054b26
    • Alexander Kornienko's avatar
    • Tim Northover's avatar
      ARM: don't diagnose cbz/cbnz to Thumb functions. · 9c3633f5
      Tim Northover authored
      A branch-distance to a Thumb function shouldn't be forced to be odd for
      CBZ/CBNZ instructions because (assuming it's within range), it's going to be a
      valid, even offset.
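
A minimal sketch of the arithmetic involved, assuming the usual Thumb conventions: CBZ/CBNZ branch forward only, by an even byte offset of 0 to 126, and the address of a Thumb function symbol carries bit 0 set. The helper names are hypothetical and this is not the code touched by the commit.

```cpp
#include <cassert>
#include <cstdint>

// CBZ/CBNZ can only encode a forward, even byte offset in the range 0..126.
bool isEncodableCbzOffset(int64_t offset) {
  return offset >= 0 && offset <= 126 && (offset & 1) == 0;
}

// Hypothetical helper: distance from the PC value seen by the branch to the
// target. Bit 0 of a Thumb function symbol marks the instruction set, not a
// byte position, so it is cleared before computing the distance.
int64_t cbzOffset(uint64_t targetSymbolValue, uint64_t pcValue) {
  assert(pcValue % 2 == 0 && "Thumb instructions are halfword aligned");
  const uint64_t target = targetSymbolValue & ~uint64_t{1};
  return static_cast<int64_t>(target) - static_cast<int64_t>(pcValue);
}
```

With the Thumb bit left in, the distance would be odd and look unencodable even though the real offset is even and in range.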
      
      llvm-svn: 279665
      9c3633f5
    • Kostya Serebryany's avatar
      [sanitizer] re-apply r279572 and r279595 reverted in r279643: change the... · 8e7ea9dd
      Kostya Serebryany authored
      [sanitizer] re-apply r279572 and r279595 reverted in r279643: change the 64-bit allocator to use a single array for freed chunks instead of a lock-free linked list of transfer batches. This change simplifies the code, makes the allocator more 'hardened', and will allow simpler code to release RAM to the OS. This may also slow down malloc stress tests due to lock contention, but I did not observe a noticeable slowdown on various real multi-threaded benchmarks.
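
The shape of the trade-off can be sketched as one array of freed chunk addresses behind a lock. This is an illustration only: FreeArray and its methods are hypothetical names, and std::vector and std::mutex stand in for whatever primitives the sanitizer runtime actually uses.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <mutex>
#include <vector>

// Toy free store for one size class: a single contiguous array of freed
// chunk addresses, guarded by a mutex instead of being lock-free.
class FreeArray {
public:
  // Return a batch of freed chunks to the shared array.
  void pushMany(const std::vector<std::uintptr_t> &chunks) {
    std::lock_guard<std::mutex> lock(mu_);
    free_.insert(free_.end(), chunks.begin(), chunks.end());
  }

  // Move up to n chunks from the end of the array into out.
  // Returns how many chunks were actually moved.
  std::size_t popMany(std::size_t n, std::vector<std::uintptr_t> &out) {
    std::lock_guard<std::mutex> lock(mu_);
    const std::size_t take = std::min(n, free_.size());
    out.insert(out.end(), free_.end() - static_cast<std::ptrdiff_t>(take),
               free_.end());
    free_.resize(free_.size() - take);
    return take;
  }

  std::size_t size() const {
    std::lock_guard<std::mutex> lock(mu_);
    return free_.size();
  }

private:
  mutable std::mutex mu_;
  std::vector<std::uintptr_t> free_;
};
```

Every push and pop takes the same lock, which is where the contention mentioned above would come from. In exchange, all free chunks of a size class sit in one place that is simple to inspect or trim.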
      
      llvm-svn: 279664
      8e7ea9dd
    • Bruno Cardoso Lopes's avatar
      [Sema][Comments] Support @param with c++ 'using' keyword · b09db225
      Bruno Cardoso Lopes authored
      Give appropriate warnings with -Wdocumentation for @param comments
      that refer to function aliases defined with 'using'. Very similar
      to typedef's behavior. This does not add support for
      TypeAliasTemplateDecl yet.
      
      Differential Revision: https://reviews.llvm.org/D23783
      
      rdar://problem/27300695
      
      llvm-svn: 279662
      b09db225
    • Kostya Serebryany's avatar
    • Changpeng Fang's avatar
      AMDGCN/SI: Implement readlane/readfirstlane intrinsics · 75f0968b
      Changpeng Fang authored
      Summary:
        This patch implements readlane/readfirstlane intrinsics.
      TODO: need to define a new register class to consider the case
      that the source could be a vector register or M0.
      
      Reviewed by:
        arsenm and tstellarAMD
      
      Differential Revision:
        http://reviews.llvm.org/D22489
      
      llvm-svn: 279660
      75f0968b
    • Eugene Zelenko's avatar
      Clang-tidy documentation style. Two Google checks are aliases. · 5f45722b
      Eugene Zelenko authored
      Differential revision: https://reviews.llvm.org/D23815
      
      llvm-svn: 279659
      5f45722b
    • Jason Henline's avatar
      [StreamExecutor] Fix allocateDeviceMemory · 3053bbf3
      Jason Henline authored
      Summary:
      The return value from PlatformExecutor::allocateDeviceMemory needs to be
      converted from Expected<GlobalDeviceMemoryBase> to
      Expected<GlobalDeviceMemory<T>> in Executor::allocateDeviceMemory.
      
      A similar bug is also fixed for Executor::allocateHostMemory.
      
      Thanks to jprice for identifying this bug.
      
      Reviewers: jprice, jlebar
      
      Subscribers: parallel_libs-commits
      
      Differential Revision: https://reviews.llvm.org/D23849
      
      llvm-svn: 279658
      3053bbf3
    • Michael Kruse's avatar
      Add %loadPolly to test command line. · 4a080de0
      Michael Kruse authored
      Required for out-of-tree builds of Polly.
      
      llvm-svn: 279657
      4a080de0
    • Matt Arsenault's avatar
      amdgcn: Also correct get_local_size type for HSA · d0a27522
      Matt Arsenault authored
      llvm-svn: 279656
      d0a27522
    • Rafael Espindola's avatar
      Use isTargetMachO instead of isTargetDarwin. · 70c6a397
      Rafael Espindola authored
      llvm-svn: 279655
      70c6a397
    • Jason Henline's avatar
      [StreamExecutor] Clean up device copy comments · 424fc7e6
      Jason Henline authored
      Summary:
      Consolidate Executor::synchronousCopy* and Stream::thenCopy* methods into
      Doxygen method groups and combine all their comments into one section.
      
      Also add a "doc" target to the build files to use Doxygen to build the
      documentation.
      
      Reviewers: jlebar
      
      Subscribers: jprice, parallel_libs-commits
      
      Differential Revision: https://reviews.llvm.org/D23845
      
      llvm-svn: 279654
      424fc7e6
    • Samuel Antao's avatar
      Fix offload bundler tests so that diagnostic can start with caps. · c4a62115
      Samuel Antao authored
      Windows requires that.
      
      llvm-svn: 279653
      c4a62115
    • Simon Pilgrim's avatar
      [X86][SSE] Add MINSD/MAXSD/MINSS/MAXSS intrinsic scalar load folding support · e14653e1
      Simon Pilgrim authored
      These are no different in load behaviour from the existing ADD/SUB/MUL/DIV scalar ops but were missing from isNonFoldablePartialRegisterLoad.
      
      llvm-svn: 279652
      e14653e1
    • David Blaikie's avatar
      DebugInfo: Add flag to CU to disable emission of inline debug info into the skeleton CU · a45c31a5
      David Blaikie authored
      In cases where .dwo/.dwp files are guaranteed to be available, skipping
      the extra online (in the .o file) inline info can save a substantial
      amount of space - see the original r221306 for more details there.
      
      llvm-svn: 279651
      a45c31a5
    • David Blaikie's avatar
      DebugInfo: Add flag to CU to disable emission of inline debug info into the skeleton CU · a01f2953
      David Blaikie authored
      In cases where .dwo/.dwp files are guaranteed to be available, skipping
      the extra online (in the .o file) inline info can save a substantial
      amount of space - see the original r221306 for more details there.
      
      llvm-svn: 279650
      a01f2953
    • Matthew Simpson's avatar
      [LV] Unify vector and scalar maps · abd2be1e
      Matthew Simpson authored
      This patch unifies the data structures we use for mapping instructions from the
      original loop to their corresponding instructions in the new loop. Previously,
      we maintained two distinct maps for this purpose: WidenMap and ScalarIVMap.
      WidenMap maintained the vector values each instruction from the old loop was
      represented with, and ScalarIVMap maintained the scalar values each scalarized
      induction variable was represented with. With this patch, all values created
      for the new loop are maintained in VectorLoopValueMap.
      
      The change allows for several simplifications. Previously, when an instruction
      was scalarized, we had to insert the scalar values into vectors in order to
      maintain the mapping in WidenMap. Then, if a user of the scalarized value was
      also scalar, we had to extract the scalar values from the temporary vector we
      created. We now avoid these unnecessary scalar-to-vector-to-scalar conversions.
      If a scalarized value is used by a scalar instruction, the scalar value is used
      directly. However, if the scalarized value is needed by a vector instruction,
      we generate the needed insertelement instructions on-demand.
      
      A common idiom in several locations in the code (including the scalarization
      code) is to first get the vector values an instruction from the original loop
      maps to, and then extract a particular scalar value. This patch adds
      getScalarValue for this purpose alongside getVectorValue as an interface into
      VectorLoopValueMap. These functions work together to return the requested
      values if they're available or to produce them if they're not.
      
      The mapping has also been made less permissive. Entries can be added to
      VectorLoopValueMap with the new initVector and initScalar functions.
      getVectorValue has been modified to return a constant reference to the mapped
      entries.
      
      There's no real functional change with this patch; however, in some cases we
      will generate slightly different code. For example, instead of an insertelement
      sequence following the definition of an instruction, it will now precede the
      first use of that instruction. This can be seen in the test case changes.
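
The on-demand behaviour can be sketched with a toy map. Only the function names initVector, initScalar, getVectorValue and getScalarValue come from the message above. ToyValueMap, the use of strings as keys, plain integers as lanes, and packCount are assumptions made for illustration, and the real signatures in the loop vectorizer will differ.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Toy stand-in for a unified value map. A "vector value" is a list of lanes;
// a scalarized value stores one entry per lane.
class ToyValueMap {
public:
  using Vector = std::vector<int>;

  // Entries may be set only once, mirroring the less permissive mapping.
  void initVector(const std::string &key, const Vector &v) {
    assert(!vectors_.count(key) && "vector entry already set");
    vectors_[key] = v;
  }
  void initScalar(const std::string &key, const Vector &lanes) {
    assert(!scalars_.count(key) && "scalar entry already set");
    scalars_[key] = lanes;
  }

  // Returns the vector form. If only scalars exist, they are packed on the
  // first request (standing in for an insertelement sequence) and cached.
  const Vector &getVectorValue(const std::string &key) {
    auto it = vectors_.find(key);
    if (it == vectors_.end()) {
      ++packs_;
      it = vectors_.emplace(key, scalars_.at(key)).first;
    }
    return it->second;
  }

  // Returns one lane: directly if the value was scalarized, otherwise by
  // extracting it from the vector form.
  int getScalarValue(const std::string &key, std::size_t lane) {
    auto it = scalars_.find(key);
    if (it != scalars_.end())
      return it->second.at(lane);
    return vectors_.at(key).at(lane);
  }

  int packCount() const { return packs_; }

private:
  std::map<std::string, Vector> vectors_;
  std::map<std::string, Vector> scalars_;
  int packs_ = 0;
};
```

In this sketch a scalar user of a scalarized value never triggers packing. Packing happens once, at the first vector use, which matches the description of the insertelement sequence moving from the definition to the first use.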
      
      Differential Revision: https://reviews.llvm.org/D23169
      
      llvm-svn: 279649
      abd2be1e