Commits · 9e658c974b0d2ad0a68be4bd32a15c2f28200b52 · Lorenzo Albano / LLVM bpEVL

Dec 01, 2017

Revert "[X86] Improvement in CodeGen instruction selection for LEAs." · 9e658c97
Matt Morehouse authored Dec 01, 2017
```
This reverts r319543, due to ASan bot breakage.

llvm-svn: 319591
```
9e658c97

[X86] Improvement in CodeGen instruction selection for LEAs. · 328199ec

Jatin Bhateja authored Dec 01, 2017

Summary:
1/  Operand folding during complex pattern matching for LEAs has been extended, such that it promotes Scale to
     accommodate similar operand appearing in the DAG  e.g.
                 T1 = A + B
                 T2 = T1 + 10
                 T3 = T2 + A
    For above DAG rooted at T3, X86AddressMode will now look like
                Base = B , Index = A , Scale = 2 , Disp = 10

2/  During OptimizeLEAPass down the pipeline factorization is now performed over LEAs so that if there is an opportunity
     then complex LEAs (having 3 operands) could be factored out  e.g.
                 leal 1(%rax,%rcx,1), %rdx
                 leal 1(%rax,%rcx,2), %rcx
     will be factored as following
                 leal 1(%rax,%rcx,1), %rdx
                 leal (%rdx,%rcx)   , %edx

3/ Aggressive operand folding for AM based selection for LEAs is sensitive to loops, thus avoiding creation of any complex LEAs within a loop.

4/ Simplify LEA converts (lea (BASE,1,INDEX,0)  --> add (BASE, INDEX) which offers better through put.

PR32755 will be taken care of by this pathc.

Previous patch revisions : r313343 , r314886

Reviewers: lsaba, RKSimon, craig.topper, qcolombet, jmolloy, jbhateja

Reviewed By: lsaba, RKSimon, jbhateja

Subscribers: jmolloy, spatel, igorb, llvm-commits

Differential Revision: https://reviews.llvm.org/D35014

llvm-svn: 319543

328199ec

Nov 22, 2017
- [SelectionDAG] Add a isel matcher op to check the type of node results other than result 0. · fb0d4cd4
  Craig Topper authored Nov 22, 2017
```
I plan to use this to check the type of the mask result of masked gathers in the X86 backend.

llvm-svn: 318820
```
  fb0d4cd4
Nov 17, 2017

Fix a bunch more layering of CodeGen headers that are in Target · b3bde2ea

David Blaikie authored Nov 17, 2017

All these headers already depend on CodeGen headers so moving them into
CodeGen fixes the layering (since CodeGen depends on Target, not the
other way around).

llvm-svn: 318490

b3bde2ea

Nov 16, 2017

Fix APInt bit size in processDbgDeclares · 4d9a4d7a

Yaxun Liu authored Nov 16, 2017

processDbgDeclares assumes pointer size is the same for different addr spaces.
It uses pointer size for addr space 0 for all pointers, which causes assertion
in stripAndAccumulateInBoundsConstantOffsets for amdgcn---amdgiz since
pointer in addr space 5 has different size than in addr space 0.

This patch fixes that.

Differential Revision: https://reviews.llvm.org/D40085

llvm-svn: 318370

4d9a4d7a

Nov 08, 2017

Target/TargetInstrInfo.h -> CodeGen/TargetInstrInfo.h to match layering · 3f833edc

David Blaikie authored Nov 08, 2017

This header includes CodeGen headers, and is not, itself, included by
any Target headers, so move it into CodeGen to match the layering of its
implementation.

llvm-svn: 317647

3f833edc

Oct 25, 2017

Implement salavageDebugInfo functionality for SelectionDAG. · 2eb7cbf9

Adrian Prantl authored Oct 24, 2017

Similar to how llvm::salvagDebugInfo hooks into InstCombine, this adds
a hook that can be invoked before an SDNode that is associated with an
SDDbgValue is erased to capture the effect of the deleted node in a
DIExpression.

The motivating example is an SDDebugValue attached to an ADD operation
that gets folded into a LOAD+OFFSET operation.

rdar://problem/32121503

llvm-svn: 316525

2eb7cbf9

Oct 16, 2017
- Add iterator range MachineRegisterInfo::liveins(), adopt users, NFC · 72518eaa
  Krzysztof Parzyszek authored Oct 16, 2017
```
llvm-svn: 315927
```
  72518eaa
Oct 10, 2017

Rename OptimizationDiagnosticInfo.* to OptimizationRemarkEmitter.* · 0965da20

Adam Nemet authored Oct 09, 2017

Sync it up with the name of the class actually defined here.  This has been
bothering me for a while...

llvm-svn: 315249

0965da20

Oct 04, 2017

Revert r314886 "[X86] Improvement in CodeGen instruction selection for LEAs... · 2a6c9adb

Hans Wennborg authored Oct 04, 2017

Revert r314886 "[X86] Improvement in CodeGen instruction selection for LEAs (re-applying post required revision changes.)"

It broke the Chromium / SQLite build; see PR34830.

> Summary:
>    1/  Operand folding during complex pattern matching for LEAs has been
>        extended, such that it promotes Scale to accommodate similar operand
>        appearing in the DAG.
>        e.g.
>          T1 = A + B
>          T2 = T1 + 10
>          T3 = T2 + A
>        For above DAG rooted at T3, X86AddressMode will no look like
>          Base = B , Index = A , Scale = 2 , Disp = 10
>
>    2/  During OptimizeLEAPass down the pipeline factorization is now performed over LEAs
>        so that if there is an opportunity then complex LEAs (having 3 operands)
>        could be factored out.
>        e.g.
>          leal 1(%rax,%rcx,1), %rdx
>          leal 1(%rax,%rcx,2), %rcx
>        will be factored as following
>          leal 1(%rax,%rcx,1), %rdx
>          leal (%rdx,%rcx)   , %edx
>
>    3/ Aggressive operand folding for AM based selection for LEAs is sensitive to loops,
>       thus avoiding creation of any complex LEAs within a loop.
>
> Reviewers: lsaba, RKSimon, craig.topper, qcolombet, jmolloy
>
> Reviewed By: lsaba
>
> Subscribers: jmolloy, spatel, igorb, llvm-commits
>
>     Differential Revision: https://reviews.llvm.org/D35014

llvm-svn: 314919

2a6c9adb

[X86] Improvement in CodeGen instruction selection for LEAs (re-applying post... · 3c29bacd

Jatin Bhateja authored Oct 04, 2017

[X86] Improvement in CodeGen instruction selection for LEAs (re-applying post required revision changes.)

Summary:
   1/  Operand folding during complex pattern matching for LEAs has been
       extended, such that it promotes Scale to accommodate similar operand
       appearing in the DAG.
       e.g.
         T1 = A + B
         T2 = T1 + 10
         T3 = T2 + A
       For above DAG rooted at T3, X86AddressMode will no look like
         Base = B , Index = A , Scale = 2 , Disp = 10

   2/  During OptimizeLEAPass down the pipeline factorization is now performed over LEAs
       so that if there is an opportunity then complex LEAs (having 3 operands)
       could be factored out.
       e.g.
         leal 1(%rax,%rcx,1), %rdx
         leal 1(%rax,%rcx,2), %rcx
       will be factored as following
         leal 1(%rax,%rcx,1), %rdx
         leal (%rdx,%rcx)   , %edx

   3/ Aggressive operand folding for AM based selection for LEAs is sensitive to loops,
      thus avoiding creation of any complex LEAs within a loop.

Reviewers: lsaba, RKSimon, craig.topper, qcolombet, jmolloy

Reviewed By: lsaba

Subscribers: jmolloy, spatel, igorb, llvm-commits

    Differential Revision: https://reviews.llvm.org/D35014

llvm-svn: 314886

3c29bacd

Sep 20, 2017

CodeGen: support SwiftError SwiftCC on Windows x64 · 432b88e5

Saleem Abdulrasool authored Sep 20, 2017

Add support for passing SwiftError through a register on the Windows x64
calling convention.  This allows the use of swifterror attributes on
parameters which is used by the swift front end for the `Error`
parameter.  This partially enables building the swift standard library
for Windows x86_64.

llvm-svn: 313791

432b88e5

CodeGen: use range based for loops (NFC) · 399a4e9b

Saleem Abdulrasool authored Sep 19, 2017

Simplify the RPOT traversal by using a range based for loop for the
iterator dereference.

llvm-svn: 313687

399a4e9b

Sep 15, 2017

Revert r313343 "[X86] PR32755 : Improvement in CodeGen instruction selection for LEAs." · 534bfbd3

Hans Wennborg authored Sep 15, 2017

This caused PR34629: asserts firing when building Chromium. It also broke some
buildbots building test-suite as reported on the commit thread.

> Summary:
>    1/  Operand folding during complex pattern matching for LEAs has been
>        extended, such that it promotes Scale to accommodate similar operand
>        appearing in the DAG.
>        e.g.
>           T1 = A + B
>           T2 = T1 + 10
>           T3 = T2 + A
>        For above DAG rooted at T3, X86AddressMode will no look like
>           Base = B , Index = A , Scale = 2 , Disp = 10
>
>    2/  During OptimizeLEAPass down the pipeline factorization is now performed over LEAs
>        so that if there is an opportunity then complex LEAs (having 3 operands)
>        could be factored out.
>        e.g.
>           leal 1(%rax,%rcx,1), %rdx
>           leal 1(%rax,%rcx,2), %rcx
>        will be factored as following
>           leal 1(%rax,%rcx,1), %rdx
>           leal (%rdx,%rcx)   , %edx
>
>    3/ Aggressive operand folding for AM based selection for LEAs is sensitive to loops,
>       thus avoiding creation of any complex LEAs within a loop.
>
> Reviewers: lsaba, RKSimon, craig.topper, qcolombet
>
> Reviewed By: lsaba
>
> Subscribers: spatel, igorb, llvm-commits
>
> Differential Revision: https://reviews.llvm.org/D35014

llvm-svn: 313376

534bfbd3

[X86] PR32755 : Improvement in CodeGen instruction selection for LEAs. · 908c8b37

Jatin Bhateja authored Sep 15, 2017

Summary:
   1/  Operand folding during complex pattern matching for LEAs has been
       extended, such that it promotes Scale to accommodate similar operand
       appearing in the DAG.
       e.g.
          T1 = A + B
          T2 = T1 + 10
          T3 = T2 + A
       For above DAG rooted at T3, X86AddressMode will no look like
          Base = B , Index = A , Scale = 2 , Disp = 10

   2/  During OptimizeLEAPass down the pipeline factorization is now performed over LEAs
       so that if there is an opportunity then complex LEAs (having 3 operands)
       could be factored out.
       e.g.
          leal 1(%rax,%rcx,1), %rdx
          leal 1(%rax,%rcx,2), %rcx
       will be factored as following
          leal 1(%rax,%rcx,1), %rdx
          leal (%rdx,%rcx)   , %edx

   3/ Aggressive operand folding for AM based selection for LEAs is sensitive to loops,
      thus avoiding creation of any complex LEAs within a loop.

Reviewers: lsaba, RKSimon, craig.topper, qcolombet

Reviewed By: lsaba

Subscribers: spatel, igorb, llvm-commits

Differential Revision: https://reviews.llvm.org/D35014

llvm-svn: 313343

908c8b37

Sep 05, 2017

Add llvm.codeview.annotation to implement MSVC __annotation · e33c94f1

Reid Kleckner authored Sep 05, 2017

Summary:
This intrinsic represents a label with a list of associated metadata
strings. It is modelled as reading and writing inaccessible memory so
that it won't be removed as dead code. I think the intention is that the
annotation strings should appear at most once in the debug info, so I
marked it noduplicate. We are allowed to inline code with annotations as
long as we strip the annotation, but that can be done later.

Reviewers: majnemer

Subscribers: eraman, llvm-commits, hiraditya

Differential Revision: https://reviews.llvm.org/D36904

llvm-svn: 312569

e33c94f1

Aug 20, 2017
- Move helper classes into anonymous namespaces. · 49a49fe8
  Benjamin Kramer authored Aug 20, 2017
```
No functionality change intended.

llvm-svn: 311288
```
  49a49fe8
Aug 07, 2017

[SelectionDAG] reset NewNodesMustHaveLegalTypes flag between basic blocks · 5ca01695

Guy Blank authored Aug 07, 2017

The NewNodesMustHaveLegalTypes flag is set to false at the beginning of CodeGenAndEmitDAG, and set to true after legalizing types.
But before calling CodeGenAndEmitDAG we build the DAG for the basic block.
So for the first basic block NewNodesMustHaveLegalTypes would be 'false' during the SDAG building, and for all other basic blocks it would be 'true'.

This patch sets the flag to false before SDAG building each basic block.

Differential Revision:
https://reviews.llvm.org/D33435

llvm-svn: 310239

5ca01695

Aug 03, 2017
- DAG: Provide access to Pass instance from SelectionDAG · a52391f2
  Matt Arsenault authored Aug 03, 2017
```
This allows accessing an analysis pass during lowering.

llvm-svn: 309991
```
  a52391f2
Jul 29, 2017
- Remove the unused offset from DBG_VALUE (NFC) · 8b9bb534
  Adrian Prantl authored Jul 28, 2017
```
Followup to r309426.
rdar://problem/33580047

llvm-svn: 309450
```
  8b9bb534
Jul 09, 2017

[FastISel] fix a fallback diagnostic. · b80b44b7

Igor Breger authored Jul 09, 2017

Summary: FastISel was marked as failed in case instruction selection succeeded.

Reviewers: qcolombet, zvi, rovka, ab

Reviewed By: zvi

Subscribers: javed.absar, ab, qcolombet, bogner, llvm-commits

Differential Revision: https://reviews.llvm.org/D34438

llvm-svn: 307489

b80b44b7

Jul 04, 2017

[FastISel][SelectionDAG]Teach fastISel about GC intrinsics · a66a98cc

Anna Thomas authored Jul 04, 2017

Summary:
We are crashing in LLC at O0 when gc intrinsics are present in the block.
The reason being FastISel performs basic block ISel by modifying GC.relocates
to be the first instruction in the block. This can cause us to visit the GC
relocate before it's corresponding GC.statepoint is visited, which is incorrect.
When we lower the statepoint, we record the base and derived pointers, along
with the gc.relocates. After this we can visit the gc.relocate.

This patch avoids fastISel from incorrectly creating the block with gc.relocate
as the first instruction.

Reviewers: qcolombet, skatkov, qikon, reames

Reviewed by: skatkov

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D34421

llvm-svn: 307084

a66a98cc

Jun 17, 2017
- [SelectionDAG] Update Loop info after splitting critical edges. · 9382c556
  Davide Italiano authored Jun 17, 2017
```
The analysis is expected to be preserved by SelectionDAG.

llvm-svn: 305621
```
  9382c556
Jun 15, 2017

ISel: Fix FastISel of swifterror values · ae9312c4

Arnold Schwaighofer authored Jun 15, 2017

The code assumed that we process instructions in basic block order.  FastISel
processes instructions in reverse basic block order. We need to pre-assign
virtual registers before selecting otherwise we get def-use relationships wrong.

This only affects code with swifterror registers.

rdar://32659327

llvm-svn: 305484

ae9312c4

Jun 08, 2017
- [CodeGen] Fix some Clang-tidy modernize-use-using and Include What You Use... · 6ac7a348
  Eugene Zelenko authored Jun 07, 2017
```
[CodeGen] Fix some Clang-tidy modernize-use-using and Include What You Use warnings; other minor fixes (NFC).

llvm-svn: 304954
```
  6ac7a348
Jun 06, 2017

Sort the remaining #include lines in include/... and lib/.... · 6bda14b3

Chandler Carruth authored Jun 06, 2017

I did this a long time ago with a janky python script, but now
clang-format has built-in support for this. I fed clang-format every
line with a #include and let it re-sort things according to the precise
LLVM rules for include ordering baked into clang-format these days.

I've reverted a number of files where the results of sorting includes
isn't healthy. Either places where we have legacy code relying on
particular include ordering (where possible, I'll fix these separately)
or where we have particular formatting around #include lines that
I didn't want to disturb in this patch.

This patch is *entirely* mechanical. If you get merge conflicts or
anything, just ignore the changes in this patch and run clang-format
over your #include lines in the files.

Sorry for any noise here, but it is important to keep these things
stable. I was seeing an increasing number of patches with irrelevant
re-ordering of #include lines because clang-format was used. This patch
at least isolates that churn, makes it easy to skip when resolving
conflicts, and gets us to a clean baseline (again).

llvm-svn: 304787

6bda14b3

[SelectionDAG] Update the dominator after splitting critical edges. · fb4d5c09

Davide Italiano authored Jun 05, 2017

Running `llc -verify-dom-info` on the attached testcase results in a
crash in the verifier, due to a stale dominator tree.

i.e.

  DominatorTree is not up to date!
  Computed:
  =============================--------------------------------
  Inorder Dominator Tree:
    [1] %safe_mod_func_uint8_t_u_u.exit.i.i.i {0,7}
      [2] %lor.lhs.false.i61.i.i.i {1,2}
      [2] %safe_mod_func_int8_t_s_s.exit.i.i.i {3,6}
        [3] %safe_div_func_int64_t_s_s.exit66.i.i.i {4,5}

  Actual:
  =============================--------------------------------
  Inorder Dominator Tree:
    [1] %safe_mod_func_uint8_t_u_u.exit.i.i.i {0,9}
      [2] %lor.lhs.false.i61.i.i.i {1,2}
      [2] %safe_mod_func_int8_t_s_s.exit.i.i.i {3,8}
        [3] %safe_div_func_int64_t_s_s.exit66.i.i.i {4,5}
        [3] %safe_mod_func_int8_t_s_s.exit.i.i.i.lor.lhs.false.i61.i.i.i_crit_edge {6,7}

This is because in `SelectionDAGIsel` we split critical edges without
updating the corresponding dominator for the function (and we claim
in `MachineFunctionPass::getAnalysisUsage()` that the domtree is preserved).

We could either stop preserving the domtree in `getAnalysisUsage`
or tell `splitCriticalEdge()` to update it.
As the second option is easy to implement, that's the one I chose.

Differential Revision:  https://reviews.llvm.org/D33800

llvm-svn: 304742

fb4d5c09

Jun 02, 2017

[SelectionDAG] Get rid of recursion in findNonImmUse · 4d8748a9

Max Kazantsev authored Jun 02, 2017

The recursive implementation of findNonImmUse may overflow stack
on extremely long use chains. This patch replaces it with an equivalent
iterative implementation.

Reviewed By: bogner

Differential Revision: https://reviews.llvm.org/D33775

llvm-svn: 304522

4d8748a9

May 25, 2017
- Add constrained intrinsics for some libm-equivalent operations · f466001e
  Andrew Kaylor authored May 25, 2017
```
Differential revision: https://reviews.llvm.org/D32319

llvm-svn: 303922
```
  f466001e
May 10, 2017

[CodeGen] Don't require AA in SDAGISel at -O0. · 604526fe

Ahmed Bougacha authored May 10, 2017

Before r247167, the pass manager builder controlled which AA
implementations were used, exporting them all in the AliasAnalysis
analysis group.

Now, AAResultsWrapperPass always uses BasicAA, but still uses other AA
implementations if made available in the pass pipeline.

But regardless, SDAGISel is required at O0, and really doesn't need to
be doing fancy optimizations based on useful AA results.

Don't require AA at CodeGenOpt::None, and only use it otherwise.

This does have a functional impact (and one testcase is pessimized
because we can't reuse a load).  But I think that's desirable no matter
what.

Note that this alone doesn't result in less DT computations: TwoAddress
was previously able to reuse the DT we computed for SDAG.  That will be
fixed separately.

Differential Revision: https://reviews.llvm.org/D32766

llvm-svn: 302611

604526fe

May 09, 2017

Re-land "Use the frame index side table for byval and inalloca arguments" · 3a363fff
Reid Kleckner authored May 09, 2017
```
This re-lands r302483. It was not the cause of PR32977.

llvm-svn: 302544
```
3a363fff
Revert "Use the frame index side table for byval and inalloca arguments" · 9f29914d
Reid Kleckner authored May 09, 2017
```
This reverts r302483 and it's follow up fix.

llvm-svn: 302493
```
9f29914d

Use the frame index side table for byval and inalloca arguments · 45efcf0c

Reid Kleckner authored May 08, 2017

Summary:
For inalloca functions, this is a very common code pattern:

  %argpack = type <{ i32, i32, i32 }>
  define void @f(%argpack* inalloca %args) {
  entry:
    %a = getelementptr inbounds %argpack, %argpack* %args, i32 0, i32 0
    %b = getelementptr inbounds %argpack, %argpack* %args, i32 0, i32 1
    %c = getelementptr inbounds %argpack, %argpack* %args, i32 0, i32 2
    tail call void @llvm.dbg.declare(metadata i32* %a, ... "a")
    tail call void @llvm.dbg.declare(metadata i32* %c, ... "b")
    tail call void @llvm.dbg.declare(metadata i32* %b, ... "c")

Even though these GEPs can be simplified to a constant offset from EBP
or RSP, we don't do that at -O0, and each GEP is computed into a
register. Registers used to compute argument addresses are typically
spilled and clobbered very quickly after the initial computation, so
live debug variable tracking loses information very quickly if we use
DBG_VALUE instructions.

This change moves processing of dbg.declare between argument lowering
and basic block isel, so that we can ask if an argument has a frame
index or not. If the argument lives in a register as is the case for
byval arguments on some targets, then we don't put it in the side table
and during ISel we emit DBG_VALUE instructions.

Reviewers: aprantl

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D32980

llvm-svn: 302483

45efcf0c

Apr 28, 2017

TargetLowering: Add finalizeLowering() function; NFC · 744c215e

Matthias Braun authored Apr 28, 2017

Adds a new method finalizeLowering to TargetLoweringBase. This is in
preparation for an upcoming commit.

This function is meant for target specific adjustments to
MachineFrameInfo or register reservations.

Move the freezeRegisters() and the hasCopyImplyingStackAdjustment()
handling into the new function to prove the concept. As an added bonus
GlobalISel no longer missed the hasCopyImplyingStackAdjustment()
handling with this.

Differential Revision: https://reviews.llvm.org/D32621

llvm-svn: 301679

744c215e

[SelectionDAG] Use KnownBits struct in DAG's computeKnownBits and simplifyDemandedBits · d0af7e8a

Craig Topper authored Apr 28, 2017

This patch replaces the separate APInts for KnownZero/KnownOne with a single KnownBits struct. This is similar to what was done to ValueTracking's version recently.

This is largely a mechanical transformation from KnownZero to Known.Zero.

Differential Revision: https://reviews.llvm.org/D32569

llvm-svn: 301620

d0af7e8a

[SelectionDAG] Use various APInt methods to reduce temporary APInt creation · 0e03e74e

Craig Topper authored Apr 28, 2017

This patch uses various APInt methods to reduce the number of temporary APInts. These were all found while working through converting SelectionDAG's computeKnownBits to also use the KnownBits struct recently added to the ValueTracking version.

llvm-svn: 301618

0e03e74e

Mar 30, 2017

[CodeGen] Pass SDAG an ORE, and replace FastISel stats with remarks. · 6dd60824

Ahmed Bougacha authored Mar 30, 2017

In the long-term, we want to replace statistics with something
finer-grained that lets us gather per-function data.
Remarks are that replacement.

Create an ORE instance in SelectionDAGISel, and pass it to
SelectionDAG.

SelectionDAG was used so that we can emit remarks from all
SelectionDAG-related code, including TargetLowering and DAGCombiner.
This isn't used in the current patch but Adam tells me he's interested
for the fp-contract combines.

Use the ORE instance to emit FastISel failures as remarks (instead of
the mix of dbgs() dumps and statistics that we currently have).

Eventually, we want to have an API that tells us whether remarks are
enabled (http://llvm.org/PR32352) so that we don't emit expensive
remarks (in this case, dumping IR) when it's not needed.  For now, use
'isEnabled' as a crude replacement.

This does mean that the replacement for '-fast-isel-verbose' is now
'-pass-remarks-missed=isel'.  Additionally, clang users also need to
enable remark diagnostics, using '-Rpass-missed=isel'.

This also removes '-fast-isel-verbose2': there are no static statistics
that we want to only enable in asserts builds, so we can always use
the remarks regardless of the build type.

Differential Revision: https://reviews.llvm.org/D31405

llvm-svn: 299093

6dd60824

Mar 14, 2017

Recommitting Craig Topper's patch now that r296476 has been recommitted. · 4fc8401a

Nirav Dave authored Mar 14, 2017

When checking if chain node is foldable, make sure the intermediate nodes have a single use across all results not just the result that was used to reach the chain node.

This recovers a test case that was severely broken by r296476, my making sure we don't create ADD/ADC that loads and stores when there is also a flag dependency.

llvm-svn: 297698

4fc8401a

Mar 03, 2017

[SDAG] Revert r296476 (and r296486, r296668, r296690). · ce52b807

Chandler Carruth authored Mar 03, 2017

This patch causes compile times for some patterns to explode. I have
a (large, unreduced) test case that slows down by more than 20x and
several test cases slow down by 2x. I'm sending some of the test cases
directly to Nirav and following up with more details in the review log,
but this should unblock anyone else hitting this.

llvm-svn: 296862

ce52b807

Mar 01, 2017

Elide argument copies during instruction selection · f7c0980c

Reid Kleckner authored Mar 01, 2017

Summary:
Avoids tons of prologue boilerplate when arguments are passed in memory
and left in memory. This can happen in a debug build or in a release
build when an argument alloca is escaped.  This will dramatically affect
the code size of x86 debug builds, because X86 fast isel doesn't handle
arguments passed in memory at all. It only handles the x86_64 case of up
to 6 basic register parameters.

This is implemented by analyzing the entry block before ISel to identify
copy elision candidates. A copy elision candidate is an argument that is
used to fully initialize an alloca before any other possibly escaping
uses of that alloca. If an argument is a copy elision candidate, we set
a flag on the InputArg. If the the target generates loads from a fixed
stack object that matches the size and alignment requirements of the
alloca, the SelectionDAG builder will delete the stack object created
for the alloca and replace it with the fixed stack object. The load is
left behind to satisfy any remaining uses of the argument value. The
store is now dead and is therefore elided. The fixed stack object is
also marked as mutable, as it may now be modified by the user, and it
would be invalid to rematerialize the initial load from it.

Supersedes D28388

Fixes PR26328

Reviewers: chandlerc, MatzeB, qcolombet, inglorion, hans

Subscribers: igorb, llvm-commits

Differential Revision: https://reviews.llvm.org/D29668

llvm-svn: 296683

f7c0980c