Commits · 02845410f9d5fabb516054547a1a331bcbefd19b · Roger Ferrer / llvm-epi-0.8

Nov 23, 2011

Jakob Stoklund Olesen authored Nov 23, 2011

This was a bug in keeping track of the available domains when merging
domain values.

The wrong domain mask caused ExecutionDepsFix to try to move VANDPSYrr
to the integer domain which is only available in AVX2.

Also add an assertion to catch future attempts at emitting AVX2
instructions.

llvm-svn: 145096

02845410

Fix a crash in block placement due to an inner loop that happened to be · 4a87aa0c

Chandler Carruth authored Nov 23, 2011

reversed in the function's original ordering, and we happened to
encounter it while handling an outer unnatural CFG structure.

Thanks to the test case reduced from GCC's source by Benjamin Kramer.
This may also fix a crasher in gzip that Duncan reduced for me, but
I haven't yet gotten to testing that one.

llvm-svn: 145094

4a87aa0c

Nov 22, 2011

Fix a devilish miscompile exposed by block placement. The · ee54feb6

Chandler Carruth authored Nov 22, 2011

updateTerminator code didn't correctly handle EH terminators in one very
specific case. AnalyzeBranch would find no terminator instruction, and
so the fallback in updateTerminator is to assume fallthrough. This is
correct, but the destination of the fallthrough was assumed to be the
first successor.

This is *almost always* true, but in certain cases the loop
transformations will cause the landing pad to be the first successor!
Instead of this brittle logic, actually look through the successors for
a non-landing-pad successor, and assert if more than one is found.

This will hopefully fix some (if not all) of the self host miscompiles
with block placement. Thanks to Benjamin Kramer for reporting, Nick
Lewycky for an initial stab at a reduction, and Duncan for endless
advice on EH (which I know nothing about) as well as reviewing the
actual fix.
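
The successor scan described above can be sketched roughly as follows. This is a standalone illustration and not the code from the commit: `Block` and `findFallthroughSuccessor` are hypothetical names, and the real logic operates on LLVM's machine basic blocks.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a machine basic block (not an LLVM type).
struct Block {
  int id;
  bool isLandingPad;
};

// Returns the index of the only non-landing-pad successor, or -1 if
// there is none. Asserts if a second candidate shows up, since the
// fallthrough destination would then be ambiguous.
int findFallthroughSuccessor(const std::vector<Block>& successors) {
  int found = -1;
  for (std::size_t i = 0; i < successors.size(); ++i) {
    if (successors[i].isLandingPad)
      continue;  // landing pads are never the fallthrough target
    assert(found == -1 && "more than one non-landing-pad successor");
    found = static_cast<int>(i);
  }
  return found;
}
```

With the landing pad listed first, as in the failing case, the scan still picks the second successor instead of assuming index 0.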

llvm-svn: 145062

ee54feb6

Fix an obvious omission in the SelectionDAGBuilder where we were · e2530dc8

Chandler Carruth authored Nov 22, 2011

dropping weights on the floor for invokes. This was impeding my writing
further test cases for invoke when interacting with probabilities and
block placement.

No test case as there doesn't appear to be a way to test this stuff. =/
Suggestions for a test case of course welcome. I hope to be able to add
test cases that indirectly cover this eventually by adding probabilities
to the exceptional edge and reordering blocks as a result.

llvm-svn: 145060

e2530dc8

If a register is both an early clobber and part of a tied use, handle the use · 2021f382
Rafael Espindola authored Nov 22, 2011
```
before the clobber so that we copy the value if needed.

Fixes pr11415.

llvm-svn: 145056
```
2021f382

Nov 20, 2011

The logic for breaking the CFG in the presence of hot successors didn't · 18dfac38

Chandler Carruth authored Nov 20, 2011

properly account for the *global* probability of the edge being taken.
This manifested as a very large number of unconditional branches to
blocks being merged against the CFG even though they weren't
particularly hot within the CFG.

The fix is to check whether the edge being merged is both locally hot
relative to other successors for the source block, and globally hot
compared to other (unmerged) predecessors of the destination block.
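
A minimal sketch of such a two-sided check is below. Everything in it is an assumption made for illustration: the `Pred` struct, the `isHotEdge` name, the use of plain `double` frequencies, and the 0.8 threshold are not taken from the commit.

```cpp
#include <cassert>
#include <vector>

// Hypothetical description of another predecessor of the destination block.
struct Pred {
  double blockFreq;  // how often the predecessor block executes
  double edgeProb;   // probability it branches to the destination
  bool merged;       // already placed in a chain, so ignored below
};

// An edge counts as hot only if it is hot locally (relative to the
// source's other successors) and globally (relative to the destination's
// other unmerged predecessors). hotProb is an illustrative threshold.
bool isHotEdge(double edgeProb, double srcFreq,
               const std::vector<Pred>& otherPreds, double hotProb = 0.8) {
  assert(edgeProb >= 0.0 && edgeProb <= 1.0);
  if (edgeProb < hotProb)
    return false;  // not locally hot
  const double edgeFreq = srcFreq * edgeProb;
  double totalIncoming = edgeFreq;
  for (const Pred& p : otherPreds) {
    if (!p.merged)
      totalIncoming += p.blockFreq * p.edgeProb;
  }
  return edgeFreq >= hotProb * totalIncoming;  // globally hot
}
```

An unconditional branch out of a cold block passes the local test but fails the global one whenever a much hotter unmerged predecessor also reaches the destination.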

This introduces a new crasher on GCC single-source, but it's currently
behind a flag, and Ben has offered to work on the reduction. =]

llvm-svn: 145010

18dfac38

Nov 19, 2011

Move the handling of unanalyzable branches out of the loop-driven chain · f3dc9eff

Chandler Carruth authored Nov 19, 2011

formation phase and into the initial walk of the basic blocks. We
essentially pre-merge all blocks where unanalyzable fallthrough exists,
as we won't be able to update the terminators effectively after any
reorderings. This is quite a bit more principled as there may be CFGs
where the second half of the unanalyzable pair has some analyzable
predecessor that gets placed first. Then it may get placed next,
implicitly breaking the unanalyzable branch even though we never even
looked at the part that isn't analyzable. I've included a test case that
triggers this (thanks Benjamin yet again!), and I'm hoping to synthesize
some more general ones as I dig into related issues.

Also, to make this new scheme work we have to be able to handle branches
into the middle of a chain, so add this check. We always fall back on the
incoming ordering.

Finally, this starts to really underscore a known limitation of the
current implementation -- we don't consider broken predecessors when
merging successors. This can cause major missed opportunities, and is
something I'm planning on looking at next (modulo more bug reports).

llvm-svn: 144994

f3dc9eff

Nov 18, 2011

DISubrange supports unsigned lower/upper array bounds, so let's not fake it in... · 107e8ec3

Devang Patel authored Nov 17, 2011

DISubrange supports unsigned lower/upper array bounds, so let's not fake it in the end while emitting DWARF. If a FE needs to encode signed lower/upper array bounds then we need to extend DISubrange or add DISignedSubrange.

llvm-svn: 144937

107e8ec3

Nov 17, 2011

When fast iseling a GEP, accumulate the offset rather than emitting a series of · f83ab704

Chad Rosier authored Nov 17, 2011

ADDs.  MaxOffs is used as a threshold to limit the size of the offset. Tradeoffs
being: (1) If we can't materialize the large constant then we'll cause fast-isel
to bail. (2) Too large of an offset can't be directly encoded in the ADD
resulting in a MOV+ADD.  Generally not a bad thing because otherwise we would
have had ADD+ADD, but on Thumb this turns into a MOVS+MOVT+ADD. Working on a fix
for that. (3) Conversely, with too low a threshold we'll miss opportunities to
coalesce ADDs.
rdar://10412592
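
The accumulate-then-flush idea can be sketched as below. This is a simplified model, not fast-isel code: `coalesceOffsets` is a hypothetical helper that takes constant byte offsets and returns the immediates of the ADDs that would be emitted for a given threshold.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Sums constant offsets and emits one ADD whenever the running total
// reaches maxOffs, instead of one ADD per GEP index. The return value
// lists the immediates of the emitted ADDs, in order.
std::vector<std::uint64_t> coalesceOffsets(
    const std::vector<std::uint64_t>& offsets, std::uint64_t maxOffs) {
  assert(maxOffs > 0 && "threshold must be positive");
  std::vector<std::uint64_t> adds;
  std::uint64_t total = 0;
  for (std::uint64_t off : offsets) {
    total += off;
    if (total >= maxOffs) {  // too large to keep accumulating: flush
      adds.push_back(total);
      total = 0;
    }
  }
  if (total != 0)
    adds.push_back(total);  // flush whatever is left at the end
  return adds;
}
```

With a high threshold, three small offsets collapse into a single ADD. With a threshold of 1, every nonzero offset is flushed immediately, which is the missed-coalescing case from point (3).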

llvm-svn: 144886

f83ab704

Make sure to replace the chain properly when DAGCombining a... · ff1eaa75

Eli Friedman authored Nov 16, 2011

Make sure to replace the chain properly when DAGCombining a LOAD+EXTRACT_VECTOR_ELT into a single LOAD.  Fixes PR10747/PR11393.

llvm-svn: 144863

ff1eaa75

Nov 16, 2011

Add fast-isel stats to determine who's doing all the work, the · ff40b1e1
Chad Rosier authored Nov 16, 2011
```
target-independent selector or the target-specific selector.

llvm-svn: 144833
```
ff40b1e1

Fix the stats collection for fast-isel. The failed count was only accounting · cfd0d10e

Chad Rosier authored Nov 16, 2011

for a single miss and not all predecessor instructions that get selected by
the selection DAG instruction selector.  This is still not exact (e.g., it
overstates misses when folded/dead instructions are present), but it is a step in
the right direction.

llvm-svn: 144832

cfd0d10e

Disable expensive two-address optimizations at -O0. rdar://10453055 · 822ddde5
Evan Cheng authored Nov 16, 2011
```
llvm-svn: 144806
```
822ddde5
Disable the assertion again. Looks like fastisel is still generating bad kill markers. · 624eb2af
Evan Cheng authored Nov 16, 2011
```
llvm-svn: 144804
```
624eb2af

Sink codegen optimization level into MCCodeGenInfo along side relocation model · ecb2908b

Evan Cheng authored Nov 16, 2011

and code model. This eliminates the need to pass OptLevel flag all over the
place and makes it possible for any codegen pass to use this information.

llvm-svn: 144788

ecb2908b

Record landing pads with a SmallSetVector to avoid multiple entries. · cca9aa58

Bob Wilson authored Nov 16, 2011

There may be many invokes that share one landing pad, and the previous code
would record the landing pad once for each invoke.  Besides the wasted
effort, a pair of volatile loads gets inserted every time the landing pad is
processed.  The rest of the code can get optimized away when a landing pad
is processed repeatedly, but the volatile loads remain, resulting in code like:

LBB35_18:
Ltmp483:
        ldr     r2, [r7, #-72]
        ldr     r2, [r7, #-68]
        ldr     r2, [r7, #-72]
        ldr     r2, [r7, #-68]
        ldr     r2, [r7, #-72]
        ldr     r2, [r7, #-68]
        ldr     r2, [r7, #-72]
        ldr     r2, [r7, #-68]
        ldr     r2, [r7, #-72]
        ldr     r2, [r7, #-68]
        ldr     r2, [r7, #-72]
        ldr     r2, [r7, #-68]
        ldr     r2, [r7, #-72]
        ldr     r2, [r7, #-68]
        ldr     r2, [r7, #-72]
        ldr     r2, [r7, #-68]
        ldr     r4, [r7, #-72]
        ldr     r2, [r7, #-68]
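
For illustration, the de-duplicating behaviour of a set-vector can be approximated with standard containers. This sketch is not LLVM's SmallSetVector, and `recordLandingPads` is a hypothetical helper; landing pads are represented as plain integers.

```cpp
#include <cassert>
#include <unordered_set>
#include <vector>

// Records each landing pad once, in first-seen order, no matter how
// many invokes share it.
std::vector<int> recordLandingPads(const std::vector<int>& invokeLandingPads) {
  std::vector<int> ordered;      // keeps a deterministic iteration order
  std::unordered_set<int> seen;  // fast membership test
  for (int pad : invokeLandingPads) {
    if (seen.insert(pad).second)  // true only on first insertion
      ordered.push_back(pad);
  }
  assert(ordered.size() == seen.size());
  return ordered;
}
```

Five invokes that share two landing pads yield two entries, so each pad is processed only once.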

llvm-svn: 144787

cca9aa58

Update the SP in the SjLj jmpbuf whenever it changes. <rdar://problem/10444602> · 643e63c4

Bob Wilson authored Nov 16, 2011

This same basic code was in the older version of the SjLj exception handling,
but it was removed in the recent revisions to that code.  It needs to be there.

llvm-svn: 144782

643e63c4

Revert r144568 now that r144730 has fixed the fast-isel kill marker bug. · 4ac36c8e
Evan Cheng authored Nov 16, 2011
```
llvm-svn: 144776
```
4ac36c8e

If the 2addr instruction has other kills, don't move it below any other uses... · b8c55a53

Evan Cheng authored Nov 16, 2011

If the 2addr instruction has other kills, don't move it below any other uses since we don't want to extend other live ranges.

llvm-svn: 144772

b8c55a53

RescheduleKillAboveMI() must backtrack to before the rescheduled DBG_VALUE... · 59f8156e
Evan Cheng authored Nov 16, 2011
```
RescheduleKillAboveMI() must backtrack to before the rescheduled DBG_VALUE instructions. rdar://10451185

llvm-svn: 144771
```
59f8156e
Process all uses first before defs to accurately capture register liveness. rdar://10449480 · 9ddd69a8
Evan Cheng authored Nov 16, 2011
```
llvm-svn: 144770
```
9ddd69a8
CONCAT_VECTORS can have more than two operands. PR11389. · 87f92512
Eli Friedman authored Nov 16, 2011
```
llvm-svn: 144768
```
87f92512

Add a couple asserts so it will be easier to debug if we accidentally pass... · d257a464

Eli Friedman authored Nov 16, 2011

Add a couple asserts so it will be easier to debug if we accidentally pass indexed loads/stores to the legalizer.

llvm-svn: 144767

d257a464

Rename MVT::untyped to MVT::Untyped to match similar nomenclature. · ca2f78a9
Owen Anderson authored Nov 16, 2011
```
llvm-svn: 144747
```
ca2f78a9
Stabilize the output of the dwarf accelerator tables. Fixes a comparison · 0abbd0ef
Eric Christopher authored Nov 15, 2011
```
failure during bootstrap with it turned on.

llvm-svn: 144731
```
0abbd0ef

GEPs with all zero indices are trivially coalesced by fast-isel. For example, · 291ce47d

Chad Rosier authored Nov 15, 2011

%arrayidx135 = getelementptr inbounds [4 x [4 x [4 x [4 x i32]]]]* %M0, i32 0, i64 0
%arrayidx136 = getelementptr inbounds [4 x [4 x [4 x i32]]]* %arrayidx135, i32 0, i64 %idxprom134

Prior to this commit, the GEP instruction that defines %arrayidx136 thought that 
%arrayidx135 was a trivial kill.  The GEP that defines %arrayidx135 doesn't 
generate any code and thus %M0 gets folded into the second GEP.  Thus, we need
to look through GEPs with all zero indices.
rdar://10443319

llvm-svn: 144730

291ce47d

Nov 15, 2011

Added custom lowering for load->dec->store sequence in x86 when the EFLAGS register is used · 7c7ba1ba
Pete Cooper authored Nov 15, 2011
```
by later instructions.

Only done for DEC64m right now.

Fixes <rdar://problem/6172640>

llvm-svn: 144705
```
7c7ba1ba
Insert modified DBG_VALUE into LiveDbgValueMap. · 43bde96a
Devang Patel authored Nov 15, 2011
```
llvm-svn: 144696
```
43bde96a

We currently use a callback to handle an IL pass deleting a BB that still · f11e7f13

Rafael Espindola authored Nov 15, 2011

has a reference to it. Unfortunately, that doesn't work for codegen passes
since we don't get notified of MBB's being deleted (the original BB stays).

Use that fact to our advantage and after printing a function, check if
any of the IL BBs corresponds to a symbol that was not printed. This fixes
pr11202.

llvm-svn: 144674

f11e7f13

Remove all remaining uses of Value::getNameStr(). · 1f97a5a6
Benjamin Kramer authored Nov 15, 2011
```
llvm-svn: 144648
```
1f97a5a6
Twinify GraphWriter a little bit. · 4c93d15f
Benjamin Kramer authored Nov 15, 2011
```
llvm-svn: 144647
```
4c93d15f
Check all overlaps when looking for used registers. · e14ef7e6
Jakob Stoklund Olesen authored Nov 15, 2011
```
A function using any RC alias is enough to enable the ExeDepsFix pass.

llvm-svn: 144636
```
e14ef7e6
Make use of MachinePointerInfo::getFixedStack. · ab9ebd35
Jay Foad authored Nov 15, 2011
```
llvm-svn: 144635
```
ab9ebd35
Remove some unnecessary includes of PseudoSourceValue.h. · 70679df6
Jay Foad authored Nov 15, 2011
```
llvm-svn: 144634
```
70679df6
Set SeenStore to true to prevent loads from being moved; also eliminates a... · 7098c4e5
Evan Cheng authored Nov 15, 2011
```
Set SeenStore to true to prevent loads from being moved; also eliminates a non-deterministic behavior.

llvm-svn: 144628
```
7098c4e5

Rather than trying to use the loop block sequence *or* the function · 9b548a7f

Chandler Carruth authored Nov 15, 2011

block sequence when recovering from unanalyzable control flow
constructs, *always* use the function sequence. I'm not sure why I ever
went down the path of trying to use the loop sequence; it is
fundamentally not the correct sequence to use. We're trying to preserve
the incoming layout in the cases of unreasonable control flow, and that
is only encoded at the function level. We already have a filter to
select *exactly* the sub-set of blocks within the function that we're
trying to form into a chain.

The resulting code layout is also significantly better because of this.
In several places we were ending up with completely unreasonable control
flow constructs due to the ordering chosen by the loop structure for its
internal storage. This change removes a completely wasteful vector of
basic blocks, saving memory allocation in the common case even though it
costs us CPU in the fairly rare case of unnatural loops. Finally, it
fixes the latest crasher reduced out of GCC's single source. Thanks
again to Benjamin Kramer for the reduction; my bugpoint skills failed at
it.

llvm-svn: 144627

9b548a7f

Break false dependencies before partial register updates. · f8ad336b

Jakob Stoklund Olesen authored Nov 15, 2011

Two new TargetInstrInfo hooks let the target tell ExecutionDepsFix
about instructions with partial register updates causing false unwanted
dependencies.

The ExecutionDepsFix pass will break the false dependencies if the
updated register was written in the previous N instructions.

The small loop added to sse-domains.ll runs twice as fast with
dependency-breaking instructions inserted.
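
A toy model of the "written in the previous N instructions" test is sketched below. The `Instr` struct and `findBreakPoints` are hypothetical. Treating a distance of exactly N as inside the window is an assumption about the boundary, not something the commit message specifies.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_map>
#include <vector>

// Hypothetical instruction: the register it writes, and whether the
// write is a partial update that depends on the old register value.
struct Instr {
  int reg;
  bool partialUpdate;
};

// Returns the indices of instructions before which a dependency-breaking
// instruction would be inserted: partial updates whose register was last
// written at most `clearance` instructions earlier.
std::vector<std::size_t> findBreakPoints(const std::vector<Instr>& code,
                                         std::size_t clearance) {
  assert(clearance > 0 && "a zero window never triggers");
  std::unordered_map<int, std::size_t> lastDef;  // register -> index of last write
  std::vector<std::size_t> breaks;
  for (std::size_t i = 0; i < code.size(); ++i) {
    const Instr& in = code[i];
    auto it = lastDef.find(in.reg);
    if (in.partialUpdate && it != lastDef.end() && i - it->second <= clearance)
      breaks.push_back(i);
    lastDef[in.reg] = i;
  }
  return breaks;
}
```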

llvm-svn: 144602

f8ad336b

Track register ages more accurately. · 543bef6e

Jakob Stoklund Olesen authored Nov 15, 2011

Keep track of the last instruction to define each register individually
instead of per DomainValue.  This lets us track more accurately when a
register was last written.

Also track register ages across basic blocks.  When entering a new
basic block, use the least stale predecessor def as a worst case
estimate for register age.

The register age is used to arbitrate between conflicting domains. The
most recently defined register wins.
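
One way to read the two rules above is sketched here. The code is hypothetical and not the pass's implementation: an age is modelled as the number of instructions since the last definition, so "least stale" means the smallest age, and ties in the arbitration go to the first register by assumption.

```cpp
#include <algorithm>
#include <cassert>
#include <limits>
#include <vector>

// Age = instructions since the register was last defined (smaller is fresher).
const unsigned kNeverDefined = std::numeric_limits<unsigned>::max();

// On entry to a block, assume the register is as fresh as its freshest
// definition among the predecessors (the worst case for dependencies).
unsigned ageAtBlockEntry(const std::vector<unsigned>& agesAtPredecessorExits) {
  unsigned best = kNeverDefined;
  for (unsigned age : agesAtPredecessorExits)
    best = std::min(best, age);
  return best;
}

// Arbitrates between two registers in conflicting domains: the most
// recently defined one wins. Returns 0 for the first, 1 for the second.
int pickWinner(unsigned ageA, unsigned ageB) {
  return ageB < ageA ? 1 : 0;
}
```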

llvm-svn: 144601

543bef6e

Nov 14, 2011
- Avoid dereferencing off the beginning of lists. · f2fc508d
  Evan Cheng authored Nov 14, 2011
```
llvm-svn: 144569
```
  f2fc508d
- At -O0, multiple uses of a virtual register in the same BB are being marked · 28ffb7e4
  Evan Cheng authored Nov 14, 2011
```
"kill". This looks like a bug upstream. Since that's going to take some time
to understand, loosen the assertion and disable the optimization when
multiple kills are seen.

llvm-svn: 144568
```
  28ffb7e4