  1. Sep 20, 2017
  2. Sep 15, 2017
      Revert r313343 "[X86] PR32755 : Improvement in CodeGen instruction selection for LEAs." · 534bfbd3
      Hans Wennborg authored
      This caused PR34629: asserts firing when building Chromium. It also broke some
      buildbots building test-suite as reported on the commit thread.
      
      > Summary:
      >    1/  Operand folding during complex pattern matching for LEAs has been
      >        extended, such that it promotes Scale to accommodate similar operand
      >        appearing in the DAG.
      >        e.g.
      >           T1 = A + B
      >           T2 = T1 + 10
      >           T3 = T2 + A
      >        For above DAG rooted at T3, X86AddressMode will no look like
      >           Base = B , Index = A , Scale = 2 , Disp = 10
      >
      >    2/  During OptimizeLEAPass down the pipeline factorization is now performed over LEAs
      >        so that if there is an opportunity then complex LEAs (having 3 operands)
      >        could be factored out.
      >        e.g.
      >           leal 1(%rax,%rcx,1), %rdx
      >           leal 1(%rax,%rcx,2), %rcx
      >        will be factored as following
      >           leal 1(%rax,%rcx,1), %rdx
      >           leal (%rdx,%rcx)   , %edx
      >
      >    3/ Aggressive operand folding for AM based selection for LEAs is sensitive to loops,
      >       thus avoiding creation of any complex LEAs within a loop.
      >
      > Reviewers: lsaba, RKSimon, craig.topper, qcolombet
      >
      > Reviewed By: lsaba
      >
      > Subscribers: spatel, igorb, llvm-commits
      >
      > Differential Revision: https://reviews.llvm.org/D35014
      
      llvm-svn: 313376
      [X86] PR32755 : Improvement in CodeGen instruction selection for LEAs. · 908c8b37
      Jatin Bhateja authored
      Summary:
         1/  Operand folding during complex pattern matching for LEAs has been
             extended, such that it promotes Scale to accommodate similar operand
             appearing in the DAG.
             e.g.
                T1 = A + B
                T2 = T1 + 10
                T3 = T2 + A
             For the above DAG rooted at T3, X86AddressMode will now look like
                Base = B , Index = A , Scale = 2 , Disp = 10
      
         2/  During OptimizeLEAPass down the pipeline factorization is now performed over LEAs
             so that if there is an opportunity then complex LEAs (having 3 operands)
             could be factored out.
             e.g.
                leal 1(%rax,%rcx,1), %rdx
                leal 1(%rax,%rcx,2), %rcx
             will be factored as follows
                leal 1(%rax,%rcx,1), %rdx
                leal (%rdx,%rcx)   , %edx
      
         3/ Aggressive operand folding for AM based selection for LEAs is sensitive to loops,
            thus avoiding creation of any complex LEAs within a loop.
      
      Reviewers: lsaba, RKSimon, craig.topper, qcolombet
      
      Reviewed By: lsaba
      
      Subscribers: spatel, igorb, llvm-commits
      
      Differential Revision: https://reviews.llvm.org/D35014
      
      llvm-svn: 313343
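As a rough illustration of item 1/ in the commit message above, the following sketch folds a tree of additions into a base/index/scale/displacement address. It is not the LLVM implementation: `AddressMode`, `flatten`, `match_address`, and the tuple encoding of the expression are hypothetical names invented for this example, and only plain additions of registers and integer constants are handled.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AddressMode:
    """Toy stand-in for an x86 address: Base + Index * Scale + Disp."""
    base: Optional[str] = None
    index: Optional[str] = None
    scale: int = 1
    disp: int = 0


def flatten(expr, counts, disp=0):
    """Walk an ("add", lhs, rhs) tree, counting register leaves and summing constants."""
    if isinstance(expr, int):
        return disp + expr
    if isinstance(expr, str):
        counts[expr] = counts.get(expr, 0) + 1
        return disp
    op, lhs, rhs = expr
    if op != "add":
        raise ValueError(f"unsupported operation: {op}")
    disp = flatten(lhs, counts, disp)
    return flatten(rhs, counts, disp)


def match_address(expr):
    """Return an AddressMode for expr, or None if it does not fit one address."""
    counts = {}
    am = AddressMode(disp=flatten(expr, counts))
    for reg, n in counts.items():
        if n == 1 and am.base is None:
            am.base = reg
        elif n in (1, 2, 4, 8) and am.index is None:
            # A register that appears n times is promoted to Index with Scale = n.
            am.index, am.scale = reg, n
        else:
            return None  # simplification: give up instead of splitting further
    return am


# T1 = A + B; T2 = T1 + 10; T3 = T2 + A
t3 = ("add", ("add", ("add", "A", "B"), 10), "A")
print(match_address(t3))
```

For the DAG from the commit message, the sketch reports Base = B, Index = A, Scale = 2, Disp = 10, because A occurs twice and is promoted to the scaled index.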
  3. Sep 05, 2017
      Add llvm.codeview.annotation to implement MSVC __annotation · e33c94f1
      Reid Kleckner authored
      Summary:
      This intrinsic represents a label with a list of associated metadata
      strings. It is modelled as reading and writing inaccessible memory so
      that it won't be removed as dead code. I think the intention is that the
      annotation strings should appear at most once in the debug info, so I
      marked it noduplicate. We are allowed to inline code with annotations as
      long as we strip the annotation, but that can be done later.
      
      Reviewers: majnemer
      
      Subscribers: eraman, llvm-commits, hiraditya
      
      Differential Revision: https://reviews.llvm.org/D36904
      
      llvm-svn: 312569
  4. Aug 20, 2017
  5. Aug 07, 2017
      [SelectionDAG] reset NewNodesMustHaveLegalTypes flag between basic blocks · 5ca01695
      Guy Blank authored
      The NewNodesMustHaveLegalTypes flag is set to false at the beginning of CodeGenAndEmitDAG, and set to true after legalizing types.
      But before calling CodeGenAndEmitDAG we build the DAG for the basic block.
      So for the first basic block NewNodesMustHaveLegalTypes would be 'false' during the SDAG building, and for all other basic blocks it would be 'true'.
      
      This patch sets the flag to false before SDAG building each basic block.
      
      Differential Revision:
      https://reviews.llvm.org/D33435
      
      llvm-svn: 310239
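The ordering problem described in this commit can be reproduced with a small model. This is only a sketch under stated assumptions: `ToyDAG`, `codegen_and_emit`, and `select_blocks` are hypothetical stand-ins, not LLVM APIs, and the only thing modelled is when the flag is written relative to DAG building.

```python
class ToyDAG:
    """Hypothetical stand-in for SelectionDAG; models only the flag."""

    def __init__(self):
        self.new_nodes_must_have_legal_types = False


def codegen_and_emit(dag):
    # Mirrors the description: the flag is cleared on entry and set once
    # type legalization has run.
    dag.new_nodes_must_have_legal_types = False
    # ... pre-legalization combines would run here ...
    dag.new_nodes_must_have_legal_types = True


def select_blocks(blocks, reset_before_build):
    """Return the flag value that DAG building observes for each block."""
    dag = ToyDAG()
    observed = []
    for bb in blocks:
        if reset_before_build:
            # The fix: clear the flag before building the DAG for this block.
            dag.new_nodes_must_have_legal_types = False
        observed.append((bb, dag.new_nodes_must_have_legal_types))
        codegen_and_emit(dag)
    return observed


print(select_blocks(["entry", "loop", "exit"], reset_before_build=False))
print(select_blocks(["entry", "loop", "exit"], reset_before_build=True))
```

Without the reset, only the first block is built with the flag cleared; with the reset, every block sees the same value.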
  6. Aug 03, 2017
  7. Jul 29, 2017
  8. Jul 09, 2017
  9. Jul 04, 2017
      [FastISel][SelectionDAG]Teach fastISel about GC intrinsics · a66a98cc
      Anna Thomas authored
      Summary:
      We are crashing in LLC at O0 when gc intrinsics are present in the block.
      The reason is that FastISel performs basic block ISel by modifying GC.relocates
      to be the first instruction in the block. This can cause us to visit the GC
      relocate before its corresponding GC.statepoint is visited, which is incorrect.
      When we lower the statepoint, we record the base and derived pointers, along
      with the gc.relocates. After this we can visit the gc.relocate.
      
      This patch prevents FastISel from incorrectly creating the block with gc.relocate
      as the first instruction.
      
      Reviewers: qcolombet, skatkov, qikon, reames
      
      Reviewed by: skatkov
      
      Subscribers: llvm-commits
      
      Differential Revision: https://reviews.llvm.org/D34421
      
      llvm-svn: 307084
  10. Jun 17, 2017
  11. Jun 15, 2017
      ISel: Fix FastISel of swifterror values · ae9312c4
      Arnold Schwaighofer authored
      The code assumed that we process instructions in basic block order.  FastISel
      processes instructions in reverse basic block order. We need to pre-assign
      virtual registers before selecting otherwise we get def-use relationships wrong.
      
      This only affects code with swifterror registers.
      
      rdar://32659327
      
      llvm-svn: 305484
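A toy model may help show why the walk direction matters here. The sketch below is an assumption-laden illustration, not the actual FastISel code: the block encoding and the functions `preassign`, `track_bottom_up`, and `select_bottom_up` are invented for this example. It tracks one value whose "current" virtual register changes at every definition.

```python
def preassign(block, entry_vreg="%v0"):
    """Forward walk: record which virtual register is live at each instruction."""
    current, table, n = entry_vreg, {}, 0
    for i, (kind, _name) in enumerate(block):
        if kind == "def":
            n += 1
            current = f"%v{n}"
        table[i] = current
    return table


def track_bottom_up(block, entry_vreg="%v0"):
    """Buggy variant: same bookkeeping, but done while walking in reverse."""
    current, table, n = entry_vreg, {}, 0
    for i in reversed(range(len(block))):
        kind, _name = block[i]
        if kind == "def":
            n += 1
            current = f"%v{n}"
        table[i] = current
    return table


def select_bottom_up(block):
    """Fixed variant: pre-assign in forward order, then select in reverse."""
    table = preassign(block)
    return {i: table[i] for i in reversed(range(len(block)))}


block = [("def", "call1"), ("use", "load"), ("def", "call2"), ("use", "ret")]
print(track_bottom_up(block))   # the final use is paired with the entry register
print(select_bottom_up(block))  # each use is paired with the preceding def
```

In the buggy variant the last use sees the block-entry register rather than the one defined just before it, which is the kind of wrong def-use pairing the commit message describes.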
  12. Jun 08, 2017
  13. Jun 06, 2017
      Sort the remaining #include lines in include/... and lib/.... · 6bda14b3
      Chandler Carruth authored
      I did this a long time ago with a janky python script, but now
      clang-format has built-in support for this. I fed clang-format every
      line with a #include and let it re-sort things according to the precise
      LLVM rules for include ordering baked into clang-format these days.
      
      I've reverted a number of files where the result of sorting includes
      isn't healthy. Either places where we have legacy code relying on
      particular include ordering (where possible, I'll fix these separately)
      or where we have particular formatting around #include lines that
      I didn't want to disturb in this patch.
      
      This patch is *entirely* mechanical. If you get merge conflicts or
      anything, just ignore the changes in this patch and run clang-format
      over your #include lines in the files.
      
      Sorry for any noise here, but it is important to keep these things
      stable. I was seeing an increasing number of patches with irrelevant
      re-ordering of #include lines because clang-format was used. This patch
      at least isolates that churn, makes it easy to skip when resolving
      conflicts, and gets us to a clean baseline (again).
      
      llvm-svn: 304787
      [SelectionDAG] Update the dominator after splitting critical edges. · fb4d5c09
      Davide Italiano authored
      Running `llc -verify-dom-info` on the attached testcase results in a
      crash in the verifier, due to a stale dominator tree.
      
      i.e.
      
        DominatorTree is not up to date!
        Computed:
        =============================--------------------------------
        Inorder Dominator Tree:
          [1] %safe_mod_func_uint8_t_u_u.exit.i.i.i {0,7}
            [2] %lor.lhs.false.i61.i.i.i {1,2}
            [2] %safe_mod_func_int8_t_s_s.exit.i.i.i {3,6}
              [3] %safe_div_func_int64_t_s_s.exit66.i.i.i {4,5}
      
        Actual:
        =============================--------------------------------
        Inorder Dominator Tree:
          [1] %safe_mod_func_uint8_t_u_u.exit.i.i.i {0,9}
            [2] %lor.lhs.false.i61.i.i.i {1,2}
            [2] %safe_mod_func_int8_t_s_s.exit.i.i.i {3,8}
              [3] %safe_div_func_int64_t_s_s.exit66.i.i.i {4,5}
              [3] %safe_mod_func_int8_t_s_s.exit.i.i.i.lor.lhs.false.i61.i.i.i_crit_edge {6,7}
      
      This is because in `SelectionDAGIsel` we split critical edges without
      updating the corresponding dominator for the function (and we claim
      in `MachineFunctionPass::getAnalysisUsage()` that the domtree is preserved).
      
      We could either stop preserving the domtree in `getAnalysisUsage`
      or tell `splitCriticalEdge()` to update it.
      As the second option is easy to implement, that's the one I chose.
      
      Differential Revision:  https://reviews.llvm.org/D33800
      
      llvm-svn: 304742
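The second option can be sketched on a small CFG shaped like the one in the dump above. This is a simplified model, not LLVM's `SplitCriticalEdge`: the functions `dominators`, `idoms`, `dominated_by`, and `split_edge`, the shortened block names, and the dictionary encoding of the CFG are all assumptions made for illustration.

```python
def dominators(succs, entry):
    """Classic iterative dominator sets; assumes every block is reachable."""
    nodes = list(succs)
    preds = {n: [] for n in nodes}
    for n, ss in succs.items():
        for s in ss:
            preds[s].append(n)
    dom = {n: set(nodes) for n in nodes}
    dom[entry] = {entry}
    changed = True
    while changed:
        changed = False
        for n in nodes:
            if n == entry:
                continue
            new = {n} | set.intersection(*(dom[p] for p in preds[n]))
            if new != dom[n]:
                dom[n], changed = new, True
    return dom


def idoms(succs, entry):
    """Immediate dominator = the strict dominator with the largest dominator set."""
    dom = dominators(succs, entry)
    return {n: max(dom[n] - {n}, key=lambda d: len(dom[d]))
            for n in succs if n != entry}


def dominated_by(idom, node, candidate):
    """True if candidate is node or an ancestor of node in the dominator tree."""
    while node is not None:
        if node == candidate:
            return True
        node = idom.get(node)
    return False


def split_edge(succs, idom, src, dst):
    """Split src -> dst and update idom in place instead of recomputing it."""
    new = f"{src}.{dst}_crit_edge"
    succs[src] = [new if s == dst else s for s in succs[src]]
    succs[new] = [dst]
    idom[new] = src  # the new block's only predecessor is src
    others = [p for p, ss in succs.items() if dst in ss and p != new]
    if all(dominated_by(idom, p, dst) for p in others):
        idom[dst] = new  # every other path into dst goes through dst itself
    return new


cfg = {"entry": ["lor", "mod"], "lor": [], "mod": ["lor", "div"], "div": []}
tree = idoms(cfg, "entry")
split_edge(cfg, tree, "mod", "lor")
print(tree == idoms(cfg, "entry"))  # the updated tree matches a fresh computation
```

Comparing the incrementally updated tree with a freshly computed one is the same kind of check that `-verify-dom-info` performs in the report above.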
  14. Jun 02, 2017
  15. May 25, 2017
  16. May 10, 2017
      [CodeGen] Don't require AA in SDAGISel at -O0. · 604526fe
      Ahmed Bougacha authored
      Before r247167, the pass manager builder controlled which AA
      implementations were used, exporting them all in the AliasAnalysis
      analysis group.
      
      Now, AAResultsWrapperPass always uses BasicAA, but still uses other AA
      implementations if made available in the pass pipeline.
      
      But regardless, SDAGISel is required at O0, and really doesn't need to
      be doing fancy optimizations based on useful AA results.
      
      Don't require AA at CodeGenOpt::None, and only use it otherwise.
      
      This does have a functional impact (and one testcase is pessimized
      because we can't reuse a load).  But I think that's desirable no matter
      what.
      
      Note that this alone doesn't result in fewer DT computations: TwoAddress
      was previously able to reuse the DT we computed for SDAG.  That will be
      fixed separately.
      
      Differential Revision: https://reviews.llvm.org/D32766
      
      llvm-svn: 302611
  17. May 09, 2017
      Re-land "Use the frame index side table for byval and inalloca arguments" · 3a363fff
      Reid Kleckner authored
      This re-lands r302483. It was not the cause of PR32977.
      
      llvm-svn: 302544
      Revert "Use the frame index side table for byval and inalloca arguments" · 9f29914d
      Reid Kleckner authored
      This reverts r302483 and its follow-up fix.
      
      llvm-svn: 302493
      Use the frame index side table for byval and inalloca arguments · 45efcf0c
      Reid Kleckner authored
      Summary:
      For inalloca functions, this is a very common code pattern:
      
        %argpack = type <{ i32, i32, i32 }>
        define void @f(%argpack* inalloca %args) {
        entry:
          %a = getelementptr inbounds %argpack, %argpack* %args, i32 0, i32 0
          %b = getelementptr inbounds %argpack, %argpack* %args, i32 0, i32 1
          %c = getelementptr inbounds %argpack, %argpack* %args, i32 0, i32 2
          tail call void @llvm.dbg.declare(metadata i32* %a, ... "a")
          tail call void @llvm.dbg.declare(metadata i32* %c, ... "b")
          tail call void @llvm.dbg.declare(metadata i32* %b, ... "c")
      
      Even though these GEPs can be simplified to a constant offset from EBP
      or RSP, we don't do that at -O0, and each GEP is computed into a
      register. Registers used to compute argument addresses are typically
      spilled and clobbered very quickly after the initial computation, so
      live debug variable tracking loses information very quickly if we use
      DBG_VALUE instructions.
      
      This change moves processing of dbg.declare between argument lowering
      and basic block isel, so that we can ask if an argument has a frame
      index or not. If the argument lives in a register as is the case for
      byval arguments on some targets, then we don't put it in the side table
      and during ISel we emit DBG_VALUE instructions.
      
      Reviewers: aprantl
      
      Subscribers: llvm-commits
      
      Differential Revision: https://reviews.llvm.org/D32980
      
      llvm-svn: 302483
  18. Apr 28, 2017
  19. Mar 30, 2017
      [CodeGen] Pass SDAG an ORE, and replace FastISel stats with remarks. · 6dd60824
      Ahmed Bougacha authored
      In the long-term, we want to replace statistics with something
      finer-grained that lets us gather per-function data.
      Remarks are that replacement.
      
      Create an ORE instance in SelectionDAGISel, and pass it to
      SelectionDAG.
      
      SelectionDAG was used so that we can emit remarks from all
      SelectionDAG-related code, including TargetLowering and DAGCombiner.
      This isn't used in the current patch but Adam tells me he's interested
      for the fp-contract combines.
      
      Use the ORE instance to emit FastISel failures as remarks (instead of
      the mix of dbgs() dumps and statistics that we currently have).
      
      Eventually, we want to have an API that tells us whether remarks are
      enabled (http://llvm.org/PR32352) so that we don't emit expensive
      remarks (in this case, dumping IR) when it's not needed.  For now, use
      'isEnabled' as a crude replacement.
      
      This does mean that the replacement for '-fast-isel-verbose' is now
      '-pass-remarks-missed=isel'.  Additionally, clang users also need to
      enable remark diagnostics, using '-Rpass-missed=isel'.
      
      This also removes '-fast-isel-verbose2': there are no static statistics
      that we want to only enable in asserts builds, so we can always use
      the remarks regardless of the build type.
      
      Differential Revision: https://reviews.llvm.org/D31405
      
      llvm-svn: 299093
  20. Mar 14, 2017
      Recommitting Craig Topper's patch now that r296476 has been recommitted. · 4fc8401a
      Nirav Dave authored
      When checking if chain node is foldable, make sure the intermediate nodes have a single use across all results not just the result that was used to reach the chain node.
      
      This recovers a test case that was severely broken by r296476, by making sure we don't create ADD/ADC that loads and stores when there is also a flag dependency.
      
      llvm-svn: 297698
  21. Mar 03, 2017
      [SDAG] Revert r296476 (and r296486, r296668, r296690). · ce52b807
      Chandler Carruth authored
      This patch causes compile times for some patterns to explode. I have
      a (large, unreduced) test case that slows down by more than 20x and
      several test cases slow down by 2x. I'm sending some of the test cases
      directly to Nirav and following up with more details in the review log,
      but this should unblock anyone else hitting this.
      
      llvm-svn: 296862
  22. Mar 01, 2017
      Elide argument copies during instruction selection · f7c0980c
      Reid Kleckner authored
      Summary:
      Avoids tons of prologue boilerplate when arguments are passed in memory
      and left in memory. This can happen in a debug build or in a release
      build when an argument alloca is escaped.  This will dramatically affect
      the code size of x86 debug builds, because X86 fast isel doesn't handle
      arguments passed in memory at all. It only handles the x86_64 case of up
      to 6 basic register parameters.
      
      This is implemented by analyzing the entry block before ISel to identify
      copy elision candidates. A copy elision candidate is an argument that is
      used to fully initialize an alloca before any other possibly escaping
      uses of that alloca. If an argument is a copy elision candidate, we set
      a flag on the InputArg. If the target generates loads from a fixed
      stack object that matches the size and alignment requirements of the
      alloca, the SelectionDAG builder will delete the stack object created
      for the alloca and replace it with the fixed stack object. The load is
      left behind to satisfy any remaining uses of the argument value. The
      store is now dead and is therefore elided. The fixed stack object is
      also marked as mutable, as it may now be modified by the user, and it
      would be invalid to rematerialize the initial load from it.
      
      Supersedes D28388
      
      Fixes PR26328
      
      Reviewers: chandlerc, MatzeB, qcolombet, inglorion, hans
      
      Subscribers: igorb, llvm-commits
      
      Differential Revision: https://reviews.llvm.org/D29668
      
      llvm-svn: 296683
      [CodeGen] Remove dead FastISel code after SDAG emitted a tailcall. · 20b3e9a8
      Ahmed Bougacha authored
      When SDAGISel (top-down) selects a tail-call, it skips the remainder
      of the block.
      
      If, before that, FastISel (bottom-up) selected some of the (no-op) next
      few instructions, we can end up with dead instructions following the
      terminator (selected by SDAGISel).
      
      We need to erase them, as we know they aren't necessary (in addition to
      being incorrect).
      
      We already do this when FastISel falls back on the tail-call itself.
      Also remove the FastISel-emitted code if we fall back on the
      instructions between the tail-call and the return.
      
      llvm-svn: 296552
  23. Feb 28, 2017
      [DAGISel] When checking if chain node is foldable, make sure the intermediate... · 419f145e
      Craig Topper authored
      [DAGISel] When checking if chain node is foldable, make sure the intermediate nodes have a single use across all results not just the result that was used to reach the chain node.
      
      This recovers a test case that was severely broken by r296476, by making sure we don't create ADD/ADC that loads and stores when there is also a flag dependency.
      
      llvm-svn: 296486
  24. Feb 27, 2017
  25. Feb 14, 2017
      [Tablegen] Instrumenting table gen DAGGenISelDAG · bb0483bc
      Aditya Nandakumar authored
      To help assist in debugging ISEL or to prioritize GlobalISel backend
      work, this patch adds two more tables to <Target>GenISelDAGISel.inc -
      one which contains the patterns that are used during selection and the
      other containing the include source location of the patterns.
      Enabled through the CMake variable LLVM_ENABLE_DAGISEL_COV.
      
      llvm-svn: 295081
  26. Feb 13, 2017
      [FastISel] Add a diagnostic to warn on fallback. · fbae5fcb
      Quentin Colombet authored
      This is consistent with what we do for GlobalISel. That way, it is easy
      to see whether or not FastISel is able to fully select a function.
      At some point we may want to switch that to an optimization remark.
      
      llvm-svn: 294970
  27. Feb 10, 2017
      [DAGCombine] Allow vector constant folding of any value type before type legalization · bfb17478
      Simon Pilgrim authored
      The patch comes in 2 parts:
      
      1 - it makes use of the SelectionDAG::NewNodesMustHaveLegalTypes flag to tell when it can safely constant fold illegal types.
      
      2 - it correctly resets SelectionDAG::NewNodesMustHaveLegalTypes at the start of each call to SelectionDAGISel::CodeGenAndEmitDAG so all the pre-legalization stages can make use of it - not just the first basic block that gets handled.
      
      Fix for PR30760
      
      Differential Revision: https://reviews.llvm.org/D29568
      
      llvm-svn: 294749
      [SelectionDAG] Dump the DAG after legalizing vector ops and after the second type legalization · a9f11218
      Craig Topper authored
      Summary:
      With -debug, we aren't dumping the DAG after legalizing vector ops. In particular, on X86 with AVX1 only, we don't dump the DAG after we split 256-bit integer ops into pairs of 128-bit ADDs since this occurs during vector legalization.
      
      I'm only dumping if the legalize vector ops changes something since we don't print anything during legalize vector ops. So this dump shows up right after the first type-legalization dump happens. So if nothing changed this second dump is unnecessary.
      
      Having said that though, I think we should probably fix legalize vector ops to log what it's doing.
      
      Reviewers: RKSimon, eli.friedman, spatel, arsenm, chandlerc
      
      Reviewed By: RKSimon
      
      Subscribers: wdng, llvm-commits
      
      Differential Revision: https://reviews.llvm.org/D29554
      
      llvm-svn: 294711
  28. Feb 07, 2017
      [SDAGISel] Simplify some SDAGISel code, NFC · 0887d44a
      Reid Kleckner authored
      Hoist entry block code for arguments and swift error values out of the
      basic block instruction selection loop. Lowering arguments once up front
      seems much more readable than doing it conditionally inside the loop. It
      also makes it clear that argument lowering can update StaticAllocaMap
      because no instructions have been selected yet.
      
      Also use range-based for loops where possible.
      
      llvm-svn: 294329
  29. Feb 04, 2017
  30. Feb 03, 2017
  31. Jan 30, 2017