Commits · 7f3e84c9fdb9628ba81ecffcc7d40ee1a1de7939 · Roger Ferrer / llvm-epi-0.8

Jul 23, 2012

Fixed DAGCombine optimizations which generate select_cc for targets · 9056076c

Nadav Rotem authored Jul 23, 2012

that do not support it (X86 does not lower select_cc).

PR: 13428

Together with Michael Kuperstein <michael.m.kuperstein@intel.com>

llvm-svn: 160619

9056076c

Tidy up. Fix indentation and remove trailing whitespace. · 2694c05e
Craig Topper authored Jul 23, 2012
```
llvm-svn: 160617
```
2694c05e

Change llvm_unreachable in SplitVectorResult to report_fatal_error. Keeps... · b49546a3

Craig Topper authored Jul 23, 2012

Change llvm_unreachable in SplitVectorResult to report_fatal_error. Keeps release builds from crashing if code uses an intrinsic with an illegal type. For instance 256-bit AVX intrinsics without having AVX enabled.

llvm-svn: 160616

b49546a3

Jul 21, 2012
- Remove unused private member variables uncovered by the recent changes to... · 5be8f601
  Benjamin Kramer authored Jul 20, 2012
```
Remove unused private member variables uncovered by the recent changes to clang's -Wunused-private-field.

llvm-svn: 160583
```
  5be8f601
Jul 20, 2012

Avoid folding loads that are unsafe to move. · e2cfd0d4

Jakob Stoklund Olesen authored Jul 20, 2012

LiveRangeEdit::foldAsLoad() can eliminate a register by folding a load
into its only use. Only do that when the load is safe to move, and it
won't extend any live ranges.

This fixes PR13414.

llvm-svn: 160575

e2cfd0d4

Split loop exiting edges more aggressively. · f62c07f1

Jakob Stoklund Olesen authored Jul 20, 2012

PHIElimination splits critical edges when it predicts it can resolve
interference and eliminate copies. It doesn't split the edge if the
interference wouldn't be resolved because the phi-use register is
live in the critical edge anyway.

Teach PHIElimination to split loop exiting edges with interference, even
if it wouldn't resolve the interference. This moves the necessary
copies out of the loop, which is still an improvement over injecting the
copies into the loop.

The test case demonstrates the improvement. Before:

LBB0_1:
  cmpb  $0, (%rdx)
  leaq  1(%rdx), %rdx
  movl  %esi, %eax
  je  LBB0_1

After:

LBB0_1:
  cmpb  $0, (%rdx)
  leaq  1(%rdx), %rdx
  je  LBB0_1

  movl  %esi, %eax

llvm-svn: 160571

f62c07f1

Fix crash in machine verifier when trying to print the def of a register which has no def · dcf94db6
Pete Cooper authored Jul 19, 2012
```
llvm-svn: 160531
```
dcf94db6

Jul 19, 2012
- Replace some explicit compare loops with std::equal. · f364a63c
  Benjamin Kramer authored Jul 19, 2012
```
No functionality change.

llvm-svn: 160501
```
  f364a63c
- Fixed a few warnings. · aaf97359
  Galina Kistanova authored Jul 19, 2012
```
llvm-svn: 160493
```
  aaf97359
- Remove tabs. · d163405d
  Bill Wendling authored Jul 19, 2012
```
llvm-svn: 160475
```
  d163405d
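
The std::equal cleanup in f364a63c can be pictured with a small standalone example. The sketch below is not taken from the commit: `equalByLoop` and `equalByStd` are hypothetical names, and the containers are plain `std::vector<int>` rather than LLVM types.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: hand-written element-by-element comparison.
bool equalByLoop(const std::vector<int> &A, const std::vector<int> &B) {
  if (A.size() != B.size())
    return false;
  for (std::size_t I = 0, E = A.size(); I != E; ++I)
    if (A[I] != B[I])
      return false;
  return true;
}

// The same check written with std::equal. The size test stays, because the
// three-iterator overload assumes the second range is at least as long as
// the first.
bool equalByStd(const std::vector<int> &A, const std::vector<int> &B) {
  return A.size() == B.size() && std::equal(A.begin(), A.end(), B.begin());
}
```

Both helpers return the same result for any pair of vectors, which is what "no functionality change" means for a rewrite of this kind.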
Jul 18, 2012

Fix a somewhat nasty crasher in PR13378. This crashes inside of · 985454e0

Chandler Carruth authored Jul 18, 2012

LiveIntervals due to the two-addr pass generating bogus MI code.

The crux of the issue was a loop nesting problem. The intent of the code
which attempts to transform instructions before converting them to
two-addr form is to defer and reprocess any transformed instructions as
the second processing is likely to have more opportunities to coalesce
copies, etc. Unfortunately, there was one section of processing that was
not deferred -- the INSERT_SUBREG rewriting. Due to quirks of how this
rewriting proceeded, not only did it occur early, it removed the bits of
information needed for the deferred processing to correctly generate the
necessary two address form (specifically inserting a copy), but didn't
trigger any immediate assertions and produced what appeared to be
already valid two-address form code. Thus, the assertion only fired much
later in the pipeline.

The fix is to hoist the transformation logic up a layer to where it can
more firmly defer all further processing, and to teach the normal
processing to handle an edge case previously handled as part of the
transformation logic. This edge case (already matched tied register
operands) needs to *not* defer any steps.

As has been brought up repeatedly in the process: wow does this code
need refactoring. I *may* squeeze in some time to at least bring sanity
to this loop... but wow... =]

Thanks to Jakob for helpful hints on the way here, and the review.

llvm-svn: 160443

985454e0

ignore 'invoke @llvm.donothing', but still keep the edge to the continuation BB · 2151497d
Nuno Lopes authored Jul 18, 2012
```
llvm-svn: 160411
```
2151497d

Jul 17, 2012

Back out r160101 and instead implement a dag combine to recover from the instcombine transformation. · e6a3b03e
Evan Cheng authored Jul 17, 2012
```
llvm-svn: 160387
```
e6a3b03e
Add some trace output to TwoAddressInstructionPass. · 0ef03118
Jakob Stoklund Olesen authored Jul 17, 2012
```
llvm-svn: 160380
```
0ef03118
Remove unused variable. · 7c1598ca
Benjamin Kramer authored Jul 17, 2012
```
llvm-svn: 160372
```
7c1598ca

Fix a crash in the legalization of large vectors. · 277a40bc

Nadav Rotem authored Jul 17, 2012

When truncating a result of a vector that is split we need
to use the result of the split vector, and not re-split the dead node.

llvm-svn: 160357

277a40bc

Implement r160312 as target independent dag combine. · 780f9b5f
Evan Cheng authored Jul 17, 2012
```
llvm-svn: 160354
```
780f9b5f
Make sure constant bitwidth is <= 64 bit before calling getSExtValue(). · 47d7be95
Evan Cheng authored Jul 17, 2012
```
llvm-svn: 160350
```
47d7be95

This is another case where instcombine demanded bits optimization created · f579beca

Evan Cheng authored Jul 17, 2012

large immediates. Add dag combine logic to recover in case a large
immediate doesn't fit in the cmp immediate operand field.

int foo(unsigned long l) {
  return (l >> 47) == 1;
}

we produce

  %shr.mask = and i64 %l, -140737488355328
  %cmp = icmp eq i64 %shr.mask, 140737488355328
  %conv = zext i1 %cmp to i32
  ret i32 %conv

which codegens to

movq    $0xffff800000000000,%rax
andq    %rdi,%rax
movq    $0x0000800000000000,%rcx
cmpq    %rcx,%rax
sete    %al
movzbl    %al,%eax
ret

TargetLowering::SimplifySetCC now transforms
(X & -256) == 256 -> (X >> 8) == 1
if the immediate fails the isLegalICmpImmediate() test. For x86,
that means any immediate that does not fit in a signed 32-bit field.

Based on a patch by Eli Friedman.

PR10328
rdar://9758774

llvm-svn: 160346

f579beca
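
As a rough illustration of the SimplifySetCC rewrite described in f579beca, the sketch below checks in plain C++ that the masked and shifted comparisons agree for unsigned 64-bit values. `maskedCompare`, `shiftedCompare`, and `fitsInSigned32` are hypothetical helpers, not LLVM APIs; `fitsInSigned32` only stands in for the x86 behaviour of isLegalICmpImmediate() as the commit message describes it.

```cpp
#include <cassert>
#include <cstdint>

// Masked form: keep bits [Shift, 63] of X and compare against 1 << Shift.
bool maskedCompare(std::uint64_t X, unsigned Shift) {
  assert(Shift < 64 && "shift amount must be smaller than the bit width");
  const std::uint64_t Bit = std::uint64_t{1} << Shift; // 256 when Shift == 8
  const std::uint64_t Mask = ~(Bit - 1);               // bit pattern of -256
  return (X & Mask) == Bit;
}

// Shifted form: the comparison constant shrinks to 1.
bool shiftedCompare(std::uint64_t X, unsigned Shift) {
  assert(Shift < 64 && "shift amount must be smaller than the bit width");
  return (X >> Shift) == 1;
}

// Assumed legality rule (hypothetical stand-in): the immediate must fit in
// a signed 32-bit field.
bool fitsInSigned32(std::int64_t Imm) {
  return Imm >= INT32_MIN && Imm <= INT32_MAX;
}
```

With Shift == 47, as in the `foo` example, the masked form needs the constant 140737488355328, which fails `fitsInSigned32`, while the shifted form compares against 1.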

Jul 16, 2012

Minor cleanup and docs. · 60f7904d
Nadav Rotem authored Jul 16, 2012
```
llvm-svn: 160311
```
60f7904d

· 839a06e9

Nadav Rotem authored Jul 16, 2012

Make ComputeDemandedBits return a deterministic result when computing an AssertZext value.
In the added testcase the constant 55 was behind an AssertZext of type i1, and ComputeDemandedBits
reported that some of the bits were both known to be one and known to be zero.

Together with Michael Kuperstein <michael.m.kuperstein@intel.com>

llvm-svn: 160305

839a06e9
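
The contradiction described in 839a06e9 can be reproduced with two plain bit masks. The sketch below is a simplified model, not the LLVM code: `KnownBits`, `assertZextNaive`, and `assertZextConsistent` are hypothetical names, and clearing the known-one bits is only one plausible way to make the result consistent, assumed here for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a KnownZero/KnownOne pair of masks.
struct KnownBits {
  std::uint64_t Zero; // bits proven to be 0
  std::uint64_t One;  // bits proven to be 1
};

// Every bit of a constant is known.
KnownBits knownBitsOfConstant(std::uint64_t C) { return KnownBits{~C, C}; }

// Mask of all bits at or above position FromBits.
std::uint64_t highBitsMask(unsigned FromBits) {
  assert(FromBits > 0 && FromBits < 64);
  return ~std::uint64_t{0} << FromBits;
}

// Naive AssertZext handling: mark the high bits as zero and leave the
// known-one mask alone. A bit can then be claimed as both 0 and 1.
KnownBits assertZextNaive(KnownBits In, unsigned FromBits) {
  In.Zero |= highBitsMask(FromBits);
  return In;
}

// Assumed resolution: the assertion wins, so known-one bits that overlap
// the known-zero mask are dropped.
KnownBits assertZextConsistent(KnownBits In, unsigned FromBits) {
  In.Zero |= highBitsMask(FromBits);
  In.One &= ~In.Zero;
  return In;
}

// True when some bit is reported as both known zero and known one.
bool hasConflict(KnownBits K) { return (K.Zero & K.One) != 0; }
```

For the constant 55 behind an assertion of width 1, the naive version reports a conflict on bits 1, 2, 4, and 5, while the consistent version keeps only bit 0 as known one.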

Jul 15, 2012

Fix a bug in the scalarization of BUILD_VECTOR. BUILD_VECTOR elements may be... · 3050e071

Nadav Rotem authored Jul 15, 2012

Fix a bug in the scalarization of BUILD_VECTOR. BUILD_VECTOR elements may be wider than the output element type. Make sure to trunc them if needed.

Together with Michael Kuperstein <michael.m.kuperstein@intel.com>

llvm-svn: 160235

3050e071

Refactor the code that checks that all operands of a node are UNDEFs. · a62368c9

Nadav Rotem authored Jul 15, 2012

Add a micro-optimization to getNode of CONCAT_VECTORS when both operands are undefs.
Can't find a testcase for this because VECTOR_SHUFFLE already handles undef operands, but Duncan suggested that we add this.

Together with Michael Kuperstein <michael.m.kuperstein@intel.com>

llvm-svn: 160229

a62368c9

Reapply r160194, switching to use LV information for finding local kills. · db5536f0

Chandler Carruth authored Jul 15, 2012

The notable fix is to look at any dependencies attached to the kill
instruction (or other instructions between MI and the kill) where the
dependencies are specific to the register in question.

The old code implicitly handled this by rejecting the transform if *any*
other uses were found within the block, but after the start point. The
new code directly finds the kill, and has to re-use the existing
dependency scan to check for non-kill uses.

This was caught by self-host, but I found the bug via inspection and use
of absurd assert scaffolding to compute the kills in two ways and
compare them. So I have no useful testcase for this other than
"bootstrap". I'd work harder to reduce a test case if this particular
code were likely to live for a long time.

Thanks to Benjamin Kramer for reviewing the fix itself.

llvm-svn: 160228

db5536f0

Jul 14, 2012

Add a dagcombine optimization to convert concat_vectors of undefs into a single undef. · 01892100
Nadav Rotem authored Jul 14, 2012
```
The unoptimized concat_vectors isd prevented the canonicalization of the vector_shuffle node.

llvm-svn: 160221
```
01892100
Account for early-clobber reload instructions. · 8f324a2c
Jakob Stoklund Olesen authored Jul 14, 2012
```
No test case, there are no in-tree targets that require this.

llvm-svn: 160219
```
8f324a2c

Be more verbose when detecting dominance problems. · 3d604ab9

Jakob Stoklund Olesen authored Jul 13, 2012

Catch uses of undefined physregs that haven't been added to basic block
live-in lists. Run the verifier to pinpoint the problem.

Also run the verifier when a virtual register use is not jointly
dominated by defs.

llvm-svn: 160207

3d604ab9

Revert r160194, which switched to use LV information for finding local · 9c97cd56
Chandler Carruth authored Jul 13, 2012
```
kills.

This is causing miscompiles that I'm working on tracking down.

llvm-svn: 160196
```
9c97cd56

Jul 13, 2012

Use the LiveVariables information to efficiently get local kills. This · 58c470dc

Chandler Carruth authored Jul 13, 2012

removes the largest scaling problem in the test cases from PR13225 when
ASan is switched to insert basic blocks in the natural CFG order.

It may also solve some scaling problems for more normal code with large
numbers of basic blocks and variables.

llvm-svn: 160194

58c470dc

Provide function name in 'Cannot select' fatal error. · 1af8c806

Jim Grosbach authored Jul 13, 2012

When dumping the DAG for a fatal 'Cannot select' back-end error, also
provide the name of the function the construct is in. Useful when dealing
with large testcases, as the next step is to llvm-extract the function
in question to get a small(er) testcase.

llvm-svn: 160152

1af8c806

The end of the prologue should be marked with is_stmt. · bf57091f
Eric Christopher authored Jul 12, 2012
```
Fixes PR13303.

Patch by Paul Robinson!

llvm-svn: 160148
```
bf57091f

Jul 12, 2012

The result type of EXTRACT_VECTOR_ELT doesn't have to match the element type of · 671cc257

Duncan Sands authored Jul 12, 2012

the input vector, it can be bigger (this is helpful for powerpc where <2 x i16>
is a legal vector type but i16 isn't a legal type, IIRC).  However this wasn't
being taken into account by ExpandRes_EXTRACT_VECTOR_ELT, causing PR13220.
Lightly tweaked version of a patch by Michael Liao.

llvm-svn: 160116

671cc257

Jul 11, 2012

InstrEmitter::EmitSubregNode() optimizes extract_subreg in this case: · b1712285

Evan Cheng authored Jul 11, 2012

r1025 = s/zext r1024, 4
r1026 = extract_subreg r1025, 4

to a copy:
r1026 = copy r1024

This is correct. However, it uses TII->isCoalescableExtInstr(), which can return
true for instructions that essentially do a sext_in_reg, so this can end up
with an illegal copy where the source and destination register classes do not
match. Add a check to avoid it. Sorry, no test case possible at this time.

rdar://11849816

llvm-svn: 160059

b1712285

Rename many of the Tmp1, Tmp2, Tmp3 variables to names such as Chain, Value, Ptr, etc. · 2a148668
Nadav Rotem authored Jul 11, 2012
```
No functionality change.

llvm-svn: 160042
```
2a148668
Remove unused variable. · 9488100d
Benjamin Kramer authored Jul 11, 2012
```
llvm-svn: 160040
```
9488100d
Refactor the DAG Legalizer by extracting the legalization of · de6fd282
Nadav Rotem authored Jul 11, 2012
```
Load and Store nodes into their own functions.
No functional change.

llvm-svn: 160037
```
de6fd282

Only apply the SETCC+SITOFP -> SELECTCC optimization when the SETCC returns an... · b8844d67

Owen Anderson authored Jul 11, 2012

Only apply the SETCC+SITOFP -> SELECTCC optimization when the SETCC returns an MVT::i1, i.e. before type legalization.
This is a speculative fix for a problem on Mips reported by Akira Hatanaka.

llvm-svn: 160036

b8844d67

Require and preserve LoopInfo for early if-conversion. · bc90a4ea
Jakob Stoklund Olesen authored Jul 10, 2012
```
It will surely be needed by heuristics.

llvm-svn: 160027
```
bc90a4ea

Teach the LiveInterval::join function to use the fast merge algorithm, · 2207f76c

Chandler Carruth authored Jul 10, 2012

generalizing its implementation sufficiently to support this value
number scenario as well.

This cuts out another significant performance hit in large functions
(over 10k basic blocks, etc), especially those with "natural" CFG
structures.

llvm-svn: 160026

2207f76c

Run early if-conversion in domtree post-order. · 02638392

Jakob Stoklund Olesen authored Jul 10, 2012

This ordering allows nested if-conversion without using a work list, and
it makes it possible to update the dominator tree on the fly as well.

Any erased basic blocks will always be dominated by the current
post-order position, so the domtree can be pruned without invalidating
the iterator.

llvm-svn: 160025

02638392
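
The ordering argument in 02638392 rests on a property of post-order: every node dominated by N is visited before N itself. The sketch below shows that on a toy dominator tree stored as child lists. `DomTree`, `postOrder`, and `exampleOrder` are hypothetical names, and the seven-block tree is made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Children[N] lists the nodes immediately dominated by node N.
using DomTree = std::vector<std::vector<int>>;

// Append the subtree rooted at N to Out in post-order: children first,
// then N itself.
void postOrder(const DomTree &Children, int N, std::vector<int> &Out) {
  assert(N >= 0 && static_cast<std::size_t>(N) < Children.size());
  for (int C : Children[N])
    postOrder(Children, C, Out);
  Out.push_back(N);
}

// Toy tree: block 0 heads an outer if, and block 1 heads a nested if.
std::vector<int> exampleOrder() {
  DomTree Children(7);
  Children[0] = {1, 2, 3}; // outer if: two arms and the join block
  Children[1] = {4, 5, 6}; // nested if inside block 1
  std::vector<int> Order;
  postOrder(Children, 0, Order);
  return Order;
}
```

In this model the nested if headed by block 1 is reached before the outer one headed by block 0, and by the time a head is visited, every block it dominates has already been processed, so erasing those blocks does not disturb the nodes still to come.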