  1. Nov 17, 2017
    • David Blaikie's avatar
      Fix a bunch more layering of CodeGen headers that are in Target · b3bde2ea
      David Blaikie authored
      All these headers already depend on CodeGen headers so moving them into
      CodeGen fixes the layering (since CodeGen depends on Target, not the
      other way around).
      
      llvm-svn: 318490
      b3bde2ea
    • Craig Topper's avatar
      [X86] Don't remove sign extend of gather/scatter indices during SelectionDAGBuilder. · 242374e2
      Craig Topper authored
      The sign extend might be from an i16 or i8 type and was inserted by InstCombine to match the pointer width. X86 gather legalization isn't currently detecting this to reinsert a sign extend to make things legal.
      
      It's a bit weird for the SelectionDAGBuilder to do this kind of optimization in the first place. With this removed we can at least lean on InstCombine somewhat to ensure the index is i32 or i64.
      
      I'll work on trying to recover some of the test cases by removing sign extends in the backend when it's safe to do so with an understanding of the current legalizer capabilities.
      
      This should fix PR30690.
      
      llvm-svn: 318466
      242374e2
  2. Nov 16, 2017
    • Vedant Kumar's avatar
      Revert "[SelectionDAG] Consolidate (t|T)ransferDbgValues methods, NFC." · 53418797
      Vedant Kumar authored
      This reverts commit r318448. It looks like some of the asserts need to
      be weakened.
      
      http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-autoconf/builds/16296
      
      llvm-svn: 318455
      53418797
    • Craig Topper's avatar
      [DAGCombiner] Use cast instead of an unchecked dyn_cast. · b6b61dfb
      Craig Topper authored
      llvm-svn: 318450
      b6b61dfb
    • Vedant Kumar's avatar
      [SelectionDAG] Consolidate (t|T)ransferDbgValues methods, NFC. · 494814d5
      Vedant Kumar authored
      TransferDbgValues (capital 'T') is wired into ReplaceAllUsesWith, and
      transferDbgValues (lowercase 't') is used elsewhere (e.g. in Legalize).
      
      Both functions should be doing the exact same thing. This patch
      consolidates the logic into one place.
      
      Differential Revision: https://reviews.llvm.org/D40104
      
      llvm-svn: 318448
      494814d5
    • Yaxun Liu's avatar
      Fix pointer EVT in SelectionDAGBuilder::visitAlloca · 0844ff2a
      Yaxun Liu authored
      SelectionDAGBuilder::visitAlloca assumes alloca address space is 0, which is
      incorrect for triple amdgcn---amdgiz and causes isel failure.
      
      This patch fixes that.
      
      Differential Revision: https://reviews.llvm.org/D40095
      
      llvm-svn: 318392
      0844ff2a
    • Sam Parker's avatar
      [DAGCombine] Enable more srl -> load combines · 43fa5911
      Sam Parker authored
      Change the calculation for the desired ValueType for non-sign
      extending loads, as in those cases we don't care about the
      higher bits. This creates a smaller ExtVT and allows for such
      combinations as:
      (srl (zextload i16, [addr]), 8) -> (zextload i8, [addr + 1])
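
      The equivalence behind this combine can be checked with a small model. This is only a sketch, assuming a little-endian byte order; the function names are hypothetical and are not part of the DAGCombiner.

```cpp
#include <cassert>
#include <cstdint>

// Models (srl (zextload i16, [addr]), 8): a little-endian, zero-extending
// 16-bit load followed by a logical shift right by 8.
inline uint32_t wideLoadThenShift(const uint8_t *Addr) {
  assert(Addr != nullptr && "need a readable 2-byte buffer");
  uint32_t Wide = static_cast<uint32_t>(Addr[0]) |
                  (static_cast<uint32_t>(Addr[1]) << 8);
  return Wide >> 8;
}

// Models (zextload i8, [addr + 1]): a zero-extending 8-bit load of the
// high byte only.
inline uint32_t narrowLoad(const uint8_t *Addr) {
  assert(Addr != nullptr && "need a readable 2-byte buffer");
  return static_cast<uint32_t>(Addr[1]);
}
```

      Because the load is zero-extending, the bits above the loaded byte are known to be zero, so both forms yield the same value.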
      
      Differential Revision: https://reviews.llvm.org/D40034
      
      llvm-svn: 318390
      43fa5911
    • Benjamin Kramer's avatar
      Assert correct removal of SUnit in LatencyPriorityQueue · bd20e975
      Benjamin Kramer authored
      The LatencyPriorityQueue doesn't currently check whether the SU being removed really exists in the Queue.
      This method fails quietly when SU is not found and removes the last element from the Queue, leading to unexpected behavior.
      
      Unfortunately, this only occurs on our custom target, with the custom scheduler. In our case, when remove() is invoked, it removes the wrong SU at the end of the Queue, which is only discovered later when VerifyScheduledDAG() is invoked and finds that some nodes were not scheduled at all.
      
      As this is only reproducible with a lot of proprietary code, I'm hopeful this assert is straightforward enough to not necessitate a test.
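
      The failure mode can be illustrated with a simplified model of a swap-with-back removal. This is a sketch on a plain std::vector<int>, not the LatencyPriorityQueue code; removeChecked is a hypothetical name, and it returns false where the patch asserts.

```cpp
#include <algorithm>
#include <iterator>
#include <utility>
#include <vector>

// Removes Value from Queue using the swap-with-back idiom.
// Returns false and leaves Queue untouched when Value is absent. Without
// this check, the code below would go on to pop the last element even
// though it is not the one that was asked for.
inline bool removeChecked(std::vector<int> &Queue, int Value) {
  auto I = std::find(Queue.begin(), Queue.end(), Value);
  if (I == Queue.end())
    return false; // The patch asserts on this condition instead.
  if (I != std::prev(Queue.end()))
    std::swap(*I, Queue.back());
  Queue.pop_back();
  return true;
}
```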
      
      Patch by Ondrej Glasnak!
      
      Differential Revision: https://reviews.llvm.org/D40084
      
      llvm-svn: 318387
      bd20e975
    • Mikael Holmen's avatar
      [MachineRegisterInfo] Avoid having dbg.values affect code generation · 56e4abc2
      Mikael Holmen authored
      Summary:
      Use use_nodbg_empty() rather than use_empty() in
      MachineRegisterInfo::EmitLiveInCopies() when determining if a livein
      register has any uses or not. Otherwise a single dbg.value can make us
      generate different code, meaning -g would affect code generation.
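
      The difference between the two queries can be sketched with a toy model. The Use struct and both helpers below are hypothetical stand-ins, not the MachineRegisterInfo API.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical stand-in for one use of a register.
struct Use {
  bool IsDebug; // true when the use comes from a debug-value instruction
};

// Models use_empty(): any use at all, debug or not, counts.
inline bool useEmpty(const std::vector<Use> &Uses) { return Uses.empty(); }

// Models use_nodbg_empty(): only non-debug uses count.
inline bool useNoDbgEmpty(const std::vector<Use> &Uses) {
  return std::none_of(Uses.begin(), Uses.end(),
                      [](const Use &U) { return !U.IsDebug; });
}
```

      A register whose only use is a debug value is non-empty under the first query and empty under the second, which is why the first query lets -g change the generated code.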
      
      Found when compiling code for my out-of-tree target. Unfortunately I
      haven't been able to reproduce the problem on X86 or any of the other
      in-tree targets that I tried, so no test case.
      
      Reviewers: MatzeB
      
      Reviewed By: MatzeB
      
      Subscribers: llvm-commits
      
      Differential Revision: https://reviews.llvm.org/D39044
      
      llvm-svn: 318382
      56e4abc2
    • Craig Topper's avatar
      [SelectionDAG] Use report_fatal_error instead of llvm_unreachable in some code that can be reached if targets don't configure things correctly. · 36e8d66e
      Craig Topper authored
      
      For example, this is currently reachable by X86 if you use a masked store intrinsic with a v1iX type.
      
      Using a fatal error seems like a better user experience if someone were to encounter this on a release build. There are several other similar places that have been converted from unreachable to fatal error previously.
      
      llvm-svn: 318379
      36e8d66e
    • Yaxun Liu's avatar
      Fix APInt bit size in processDbgDeclares · 4d9a4d7a
      Yaxun Liu authored
      processDbgDeclares assumes the pointer size is the same for all addr spaces.
      It uses the pointer size of addr space 0 for all pointers, which triggers an assertion
      in stripAndAccumulateInBoundsConstantOffsets for amdgcn---amdgiz, since a
      pointer in addr space 5 has a different size than one in addr space 0.
      
      This patch fixes that.
      
      Differential Revision: https://reviews.llvm.org/D40085
      
      llvm-svn: 318370
      4d9a4d7a
    • Daniel Sanders's avatar
      [globalisel][tablegen] Generate rule coverage and use it to identify untested rules · f76f3154
      Daniel Sanders authored
      Summary:
      This patch adds a LLVM_ENABLE_GISEL_COV which, like LLVM_ENABLE_DAGISEL_COV,
      causes TableGen to instrument the generated table to collect rule coverage
      information. However, LLVM_ENABLE_GISEL_COV goes a bit further than
      LLVM_ENABLE_DAGISEL_COV. The information is written to files
      (${CMAKE_BINARY_DIR}/gisel-coverage-* by default). These files can then be
      concatenated into ${LLVM_GISEL_COV_PREFIX}-all after which TableGen will
      read this information and use it to emit warnings about untested rules.
      
      This technique could also be used by SelectionDAG and can be further
      extended to detect hot rules and give them priority over colder rules.
      
      Usage:
      * Enable LLVM_ENABLE_GISEL_COV in CMake
      * Build the compiler and run some tests
      * cat gisel-coverage-[0-9]* > gisel-coverage-all
      * Delete lib/Target/*/*GenGlobalISel.inc*
      * Build the compiler
      
      Known issues:
      * ${LLVM_GISEL_COV_PREFIX}-all must be generated as a manual
        step due to a lack of a portable 'cat' command. It should be the
        concatenation of all ${LLVM_GISEL_COV_PREFIX}-[0-9]* files.
      * There's no mechanism to discard coverage information when the ruleset
        changes
      
      Depends on D39742
      
      Reviewers: ab, qcolombet, t.p.northover, aditya_nandakumar, rovka
      
      Reviewed By: rovka
      
      Subscribers: vsk, arsenm, nhaehnle, mgorny, kristof.beyls, javed.absar, igorb, llvm-commits
      
      Differential Revision: https://reviews.llvm.org/D39747
      
      llvm-svn: 318356
      f76f3154
    • Rong Xu's avatar
      [CodeGen] Fix the branch probability assertion in r318202 · e4572c6b
      Rong Xu authored
      Due to integer precision, we might have numerator greater than denominator in
      the branch probability scaling. Add a check to prevent this from happening.
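
      One way to guard against this is to clamp the denominator so that it is never smaller than the numerator. The sketch below assumes a plain numerator/denominator pair; Probability and makeClamped are hypothetical names, and the actual check in the patch may differ.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical fixed-point probability: Numerator / Denominator.
struct Probability {
  uint32_t Numerator;
  uint32_t Denominator;
};

// Builds a probability from a numerator and a rescaled denominator.
// Integer rounding during rescaling can leave the denominator slightly
// smaller than the numerator, so clamp it to keep the ratio <= 1.
inline Probability makeClamped(uint32_t Numerator, uint32_t ScaledDenominator) {
  return {Numerator, std::max(Numerator, ScaledDenominator)};
}
```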
      
      llvm-svn: 318353
      e4572c6b
    • Aditya Nandakumar's avatar
      [GISel][NFC]: Move getOpcodeDef from the LegalizationArtifactCombiner into GlobalISel/Utils for use elsewhere · 954eea07
      Aditya Nandakumar authored
      
      llvm-svn: 318350
      954eea07
  3. Nov 15, 2017
  4. Nov 14, 2017
    • Aditya Nandakumar's avatar
      [GISel]: Rework legalization algorithm for better elimination of artifacts along with DCE · e6201c87
      Aditya Nandakumar authored
      
      Legalization artifacts are all those insts that exist only to make the
      type system happy. Currently, the target needs to say that all combinations
      of extends and truncs are legal, and there's no way of verifying that,
      post legalization, we only have *truly* legal instructions. This patch
      roughly changes the legalization algorithm to process all illegal insts
      in one go, and then to separately process all truncs/extends that were
      added to satisfy the type constraints, trying to combine trivial cases
      until they converge. This has the added benefit that the target
      legalizerinfo need only say which truncs and extends are okay, and the
      artifact combiner will combine away the other exts and truncs.
      
      Updated legalization algorithm to roughly the following pseudo code.
      
      WorkList Insts, Artifacts;
      collect_all_insts_and_artifacts(Insts, Artifacts);
      
      do {
        for (Inst in Insts)
          legalizeInstrStep(Inst, Insts, Artifacts);
        for (Artifact in Artifacts)
          tryCombineArtifact(Artifact, Insts, Artifacts);
      } while (!Insts.empty());
      
      Also wrote a simple wrapper equivalent to SetVector, except that
      erasing avoids moving all later elements over by one and instead just
      nulls the erased slot out.
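
      The erase-by-nulling idea can be sketched as follows. This is a minimal illustration and not the wrapper added by the patch; NullingWorkList and its members are hypothetical names, and elements are assumed to be non-null pointers.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_map>
#include <vector>

// Insertion-ordered set of pointers where erase() is O(1): the slot is set
// to nullptr instead of shifting every later element down by one.
template <typename T> class NullingWorkList {
  std::vector<T *> Items;                     // insertion order, may hold nulls
  std::unordered_map<T *, std::size_t> Index; // live element -> slot in Items

public:
  bool empty() const { return Index.empty(); }
  std::size_t size() const { return Index.size(); }

  // Inserts P if it is not already present; returns true on insertion.
  bool insert(T *P) {
    assert(P != nullptr && "null is reserved for erased slots");
    if (!Index.emplace(P, Items.size()).second)
      return false;
    Items.push_back(P);
    return true;
  }

  // Erases P by nulling its slot; returns false if P was not present.
  bool erase(T *P) {
    auto It = Index.find(P);
    if (It == Index.end())
      return false;
    Items[It->second] = nullptr;
    Index.erase(It);
    return true;
  }

  // Pops the most recently inserted live element, skipping erased slots.
  T *pop_back_val() {
    assert(!empty() && "pop from an empty worklist");
    while (Items.back() == nullptr)
      Items.pop_back();
    T *P = Items.back();
    Items.pop_back();
    Index.erase(P);
    return P;
  }
};
```

      The trade-off is that erased slots stay in the vector until they are popped off the end, so memory is reclaimed lazily in exchange for constant-time erasure.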
      
      llvm-svn: 318210
      e6201c87
    • Rong Xu's avatar
      [CodeGen] Peel off the dominant case in switch statement in lowering · 3573d8da
      Rong Xu authored
      This patch peels off the top case of a switch statement into a branch if its
      probability exceeds a threshold. This helps branch prediction and
      avoids the extra compares when lowering into a chain of branches.
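
      The selection step can be sketched as below. This is an illustration under assumptions: probabilities are modeled as integer percentages, and Case, pickCaseToPeel and the threshold parameter are hypothetical names rather than the code in the patch.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical switch case with a profile-derived probability in percent.
struct Case {
  int Value;
  unsigned ProbPercent;
};

// Returns the index of the most probable case if its probability exceeds
// ThresholdPercent, or -1 when no case is dominant enough to peel.
inline int pickCaseToPeel(const std::vector<Case> &Cases,
                          unsigned ThresholdPercent) {
  int Best = -1;
  unsigned BestProb = 0;
  for (std::size_t I = 0; I < Cases.size(); ++I) {
    if (Cases[I].ProbPercent > BestProb) {
      BestProb = Cases[I].ProbPercent;
      Best = static_cast<int>(I);
    }
  }
  return (Best >= 0 && BestProb > ThresholdPercent) ? Best : -1;
}
```

      The peeled case is then tested first with a single compare and branch, and the remaining cases are lowered as a smaller switch.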
      
      Differential Revision: http://reviews.llvm.org/D39262
      
      llvm-svn: 318202
      3573d8da
    • Hans Wennborg's avatar
      Rename CountingFunctionInserter and use for both mcount and cygprofile calls, before and after inlining · e1ecd61b
      Hans Wennborg authored
      
      Clang implements the -finstrument-functions flag inherited from GCC, which
      inserts calls to __cyg_profile_func_{enter,exit} on function entry and exit.
      
      This is useful for getting a trace of how the functions in a program are
      executed. Normally, the calls remain even if a function is inlined into another
      function, but it is useful to be able to turn this off for users who are
      interested in a lower-level trace, i.e. one that reflects what functions are
      called post-inlining. (We use this to generate link order files for Chromium.)
      
      LLVM already has a pass for inserting similar instrumentation calls to
      mcount(), which it does after inlining. This patch renames and extends that
      pass to handle calls both to mcount and the cygprofile functions, before and/or
      after inlining as controlled by function attributes.
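
      For readers unfamiliar with the hooks, a program built with -finstrument-functions can supply its own definitions along these lines. This is a sketch assuming a GCC- or Clang-compatible compiler and a single-threaded program; the Depth counter and the output format are illustrative choices, not part of the patch.

```cpp
#include <cstdio>

// Current call depth, used only to indent the trace (not thread-safe).
static int Depth = 0;

extern "C" {
// Called on entry to every instrumented function. The attribute keeps the
// hook itself from being instrumented, which would recurse forever.
__attribute__((no_instrument_function))
void __cyg_profile_func_enter(void *ThisFn, void *CallSite) {
  (void)CallSite;
  std::fprintf(stderr, "%*senter %p\n", Depth * 2, "", ThisFn);
  ++Depth;
}

// Called on exit from every instrumented function.
__attribute__((no_instrument_function))
void __cyg_profile_func_exit(void *ThisFn, void *CallSite) {
  (void)CallSite;
  --Depth;
  std::fprintf(stderr, "%*sexit  %p\n", Depth * 2, "", ThisFn);
}
}
```

      Whether an inlined callee still produces its own enter/exit pair is exactly what the pre- and post-inlining variants described above control.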
      
      Differential Revision: https://reviews.llvm.org/D39287
      
      llvm-svn: 318195
      e1ecd61b
    • Easwaran Raman's avatar
      [CodeGenPrepare] Disable div bypass when working set size is huge. · 0d55b55b
      Easwaran Raman authored
      Summary:
      Bypass of slow divs based on operand values is currently disabled for
      -Os. Do the same when profile summary is available and the working set
      size of the application is huge. This is similar to how loop peeling is
      guarded by hasHugeWorkingSetSize. In the div bypass case, the generated
      extra code (and the extra branch) tends to outweigh the benefits of the
      bypass. This results in noticeable performance improvement on an
      internal application.
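
      For context, the transformation being disabled has roughly the following shape. This is a hand-written sketch of the idea, assuming 64-bit division is the slow operation and 32-bit division the fast one; bypassDiv is a hypothetical name.

```cpp
#include <cassert>
#include <cstdint>

// Divides A by B, taking a cheaper 32-bit division when both operands
// happen to fit in 32 bits. The runtime check is the extra code and the
// extra branch that the commit message refers to.
inline uint64_t bypassDiv(uint64_t A, uint64_t B) {
  assert(B != 0 && "division by zero");
  if (((A | B) >> 32) == 0)
    return static_cast<uint32_t>(A) / static_cast<uint32_t>(B);
  return A / B;
}
```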
      
      Reviewers: davidxl
      
      Subscribers: llvm-commits
      
      Differential Revision: https://reviews.llvm.org/D39992
      
      llvm-svn: 318179
      0d55b55b
    • Yaxun Liu's avatar
      CodeGen: Fix TargetLowering::LowerCallTo for sret value type · 0b2f73fd
      Yaxun Liu authored
      TargetLowering::LowerCallTo assumes that the sret value type corresponds to a
      pointer in the default address space. That is incorrect: the sret value type
      should correspond to a pointer in the alloca address space, which may not
      be the default address space. This causes an assertion failure for the amdgcn
      target in the amdgiz environment.
      
      This patch fixes that.
      
      Differential Revision: https://reviews.llvm.org/D39996
      
      llvm-svn: 318167
      0b2f73fd
    • Sam Clegg's avatar
      [WebAssembly] Explicitly disable comdat support for wasm output · 99966076
      Sam Clegg authored
      For now at least.  We clearly need some kind of comdat or
      linkonce_odr support for wasm but currently COMDAT is not
      supported.
      
      Disable COMDAT support in the same way we do for Mach-O.  This
      also causes clang not to generate COMDATs.
      
      Differential Revision: https://reviews.llvm.org/D39873
      
      llvm-svn: 318123
      99966076
  5. Nov 13, 2017
  6. Nov 11, 2017
    • Daniel Sanders's avatar
      [globalisel][tablegen] Import signextload and zeroextload. · 7e523673
      Daniel Sanders authored
      Allow a pattern rewriter to be installed in CodeGenDAGPatterns and use it to
      correct situations where SelectionDAG and GlobalISel disagree on
      representation. For example, it would rewrite:
        (sextload:i32 $ptr)<<unindexedload>><<sextload>><<sextloadi16>>
      to:
        (sext:i32 (load:i16 $ptr)<<unindexedload>>)
      
      I'd have preferred to replace the fragments and have the expansion happen
      naturally as part of PatFrag expansion but the type inferencing system can't
      cope with loads of types narrower than those mentioned in register classes.
      This is because the SDTCisInt's on the sext constrain both the result and
      operand to the 'legal' integer types (where legal is defined as 'a register
      class can contain the type') which immediately rules the narrower types out.
      Several targets (those with only one legal integer type) would then go on to
      crash on the SDTCisOpSmallerThanOp<> when it removes all the possible types
      for the result of the extend.
      
      Also, improve isObviouslySafeToFold() slightly to automatically return true for
      neighbouring instructions. There can't be any re-ordering problems if
      re-ordering isn't happening. We'll need to improve it further to handle
      sign/zero-extending loads when the extend and load aren't immediate neighbours
      though.
      
      llvm-svn: 317971
      7e523673
    • Craig Topper's avatar
      [SelectionDAG] Make getUniformBase in SelectionDAGBuilder fail if any of the middle GEP indices are non-constant. · bdb8db45
      Craig Topper authored
      
      This is a fix for a bug in r317947. We were supposed to check that all the indices are constant 0, but instead we only made sure that the indices that are constant are 0. Non-constant indices were being ignored.
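
      The difference between the two checks can be sketched with a toy model. The Index struct and both functions are hypothetical and are not the SelectionDAGBuilder code.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical GEP index: either a known constant or a runtime value.
struct Index {
  bool IsConstant;
  int64_t Value; // meaningful only when IsConstant is true
};

// The buggy shape: rejects non-zero constants but silently accepts
// non-constant indices.
inline bool allConstantZeroBuggy(const std::vector<Index> &Indices) {
  for (const Index &I : Indices)
    if (I.IsConstant && I.Value != 0)
      return false;
  return true;
}

// The intended check: every index must be a constant and must be zero.
inline bool allConstantZeroFixed(const std::vector<Index> &Indices) {
  for (const Index &I : Indices)
    if (!I.IsConstant || I.Value != 0)
      return false;
  return true;
}
```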
      
      llvm-svn: 317950
      bdb8db45
  7. Nov 10, 2017
  8. Nov 09, 2017
  9. Nov 08, 2017
    • Adrian Prantl's avatar
      Let replaceVTableHolder accept any type. · a8e56458
      Adrian Prantl authored
      In Rust, a trait can be implemented for any type, and if a trait
      object pointer is used for the type, then a virtual table will be
      emitted for that trait/type combination.
      
      We would like debuggers to be able to inspect trait objects, which
      requires finding the concrete type associated with a given vtable.
      
      This patch changes LLVM so that any type can be passed to
      replaceVTableHolder. This allows the Rust compiler to emit the needed
      debug info -- associating a vtable with the concrete type for which it
      was emitted.
      
      This is a DWARF extension: DWARF only specifies the meaning of
      DW_AT_containing_type in one specific situation. This style of DWARF
      extension is routine, though, and LLVM already has one such case for
      DW_AT_containing_type.
      
      Patch by Tom Tromey!
      
      Differential Revision: https://reviews.llvm.org/D39503
      
      llvm-svn: 317730
      a8e56458
    • Dan Gohman's avatar
      Add an @llvm.sideeffect intrinsic · 2c74fe97
      Dan Gohman authored
      This patch implements Chandler's idea [0] for supporting languages that
      require support for infinite loops with side effects, such as Rust, providing
      part of a solution to bug 965 [1].
      
      Specifically, it adds an `llvm.sideeffect()` intrinsic, which has no actual
      effect, but which appears to optimization passes to have obscure side effects,
      such that they don't optimize away loops containing it. It also teaches
      several optimization passes to ignore this intrinsic, so that it doesn't
      significantly impact optimization in most cases.
      
      As discussed on llvm-dev [2], this patch is the first of two major parts.
      The second part, to change LLVM's semantics to have defined behavior
      on infinite loops by default, with a function attribute for opting into
      potential-undefined-behavior, will be implemented and posted for review in
      a separate patch.
      
      [0] http://lists.llvm.org/pipermail/llvm-dev/2015-July/088103.html
      [1] https://bugs.llvm.org/show_bug.cgi?id=965
      [2] http://lists.llvm.org/pipermail/llvm-dev/2017-October/118632.html
      
      Differential Revision: https://reviews.llvm.org/D38336
      
      llvm-svn: 317729
      2c74fe97
    • Reid Kleckner's avatar
      Revert "Correct dwarf unwind information in function epilogue for X86" · 7adb2fdb
      Reid Kleckner authored
      This reverts r317579, originally committed as r317100.
      
      There is a design issue with marking CFI instructions duplicatable. Not
      all targets support the CFIInstrInserter pass, and targets like Darwin
      can't cope with duplicated prologue setup CFI instructions. The compact
      unwind info emission fails.
      
      When the following code is compiled for arm64 on Mac at -O3, the CFI
      instructions end up getting tail duplicated, which causes compact unwind
      info emission to fail:
        int a, c, d, e, f, g, h, i, j, k, l, m;
        void n(int o, int *b) {
          if (g)
            f = 0;
          for (; f < o; f++) {
            m = a;
            if (l > j * k > i)
              j = i = k = d;
            h = b[c] - e;
          }
        }
      
      We get assembly that looks like this:
      ; BB#1:                                 ; %if.then
      Lloh3:
      	adrp	x9, _f@GOTPAGE
      Lloh4:
      	ldr	x9, [x9, _f@GOTPAGEOFF]
      	mov	 w8, wzr
      Lloh5:
      	str		wzr, [x9]
      	stp	x20, x19, [sp, #-16]!   ; 8-byte Folded Spill
      	.cfi_def_cfa_offset 16
      	.cfi_offset w19, -8
      	.cfi_offset w20, -16
      	cmp		w8, w0
      	b.lt	LBB0_3
      	b	LBB0_7
      LBB0_2:                                 ; %entry.if.end_crit_edge
      Lloh6:
      	adrp	x8, _f@GOTPAGE
      Lloh7:
      	ldr	x8, [x8, _f@GOTPAGEOFF]
      Lloh8:
      	ldr		w8, [x8]
      	stp	x20, x19, [sp, #-16]!   ; 8-byte Folded Spill
      	.cfi_def_cfa_offset 16
      	.cfi_offset w19, -8
      	.cfi_offset w20, -16
      	cmp		w8, w0
      	b.ge	LBB0_7
      LBB0_3:                                 ; %for.body.lr.ph
      
      Note the multiple .cfi_def* directives. Compact unwind info emission
      can't handle that.
      
      llvm-svn: 317726
      7adb2fdb