Commits · f4528ae0635c3dcba0f0b842c2b67f368d7b702f · Roger Ferrer / llvm-epi-0.8

Sep 13, 2011

Extract live range calculations from SplitKit. · 487f2a37

Jakob Stoklund Olesen authored Sep 13, 2011

SplitKit will soon need two copies of these data structures, and the
algorithms will also be useful when LiveIntervalAnalysis becomes
independent of LiveVariables.

llvm-svn: 139572

487f2a37

Sep 12, 2011

Introduce a bit of a hack. · ac5a8836

Bill Wendling authored Sep 12, 2011

Splitting a landing pad takes considerable care because of PHIs and other
nasties. The problem is that the jump table needs to jump to the landing pad
block. However, the landing pad block can be jumped to only by an invoke
instruction. So we clone the landingpad instruction into its own basic block,
have the invoke jump to there. The landingpad instruction's basic block's
successor is now the target for the jump table.

But because of PHI nodes, we need to create another basic block for the jump
table to jump to. This is definitely a hack, because the values for the PHI
nodes may not be defined on the edge from the jump table. But that's okay,
because the jump table is simply a construct to mimic what is happening in the
CFG. So the values are mysteriously there, even though there is no value for the
PHI from the jump table's edge (hence calling this a hack).

llvm-svn: 139545

ac5a8836

Remove the -compact-regions flag. · 45df7e0f

Jakob Stoklund Olesen authored Sep 12, 2011

It has been enabled by default for a while; it was only there to allow
performance comparisons.

llvm-svn: 139501

45df7e0f

Add an interface for SplitKit complement spill modes. · eecb2fb1

Jakob Stoklund Olesen authored Sep 12, 2011

SplitKit always computes a complement live range to cover the places
where the original live range was live, but no explicit region has been
allocated.

Currently, the complement live range is created to be as small as
possible - it never overlaps any of the regions.  This minimizes
register pressure, but if the complement is going to be spilled anyway,
that is not very important.  The spiller will eliminate redundant
spills, and hoist others by making the spill slot live range overlap
some of the regions created by splitting.  Stack slots are cheap.

This patch adds the interface to enable spill modes in SplitKit.  In
spill mode, SplitKit will assume that the complement is going to spill,
so it will allow it to overlap regions in order to avoid back-copies.
By doing some of the spiller's work early, the complement live range
becomes simpler.  In some cases, it can become much simpler because no
extra PHI-defs are required.  This will speed up both splitting and
spilling.

This is only the interface to enable spill modes, no implementation yet.

llvm-svn: 139500

eecb2fb1

Update comments to reflect some (not so) recent changes. · 72c0ddfb

Jakob Stoklund Olesen authored Sep 12, 2011

llvm-svn: 139498

72c0ddfb

Sep 10, 2011

Fix asserts in CodeGen from: · 78a812bf

Richard Trieu authored Sep 10, 2011

  assert("error");

to:

  assert(0 && "error");

llvm-svn: 139449

78a812bf

tidy up a bit · e74e0c80

Chris Lattner authored Sep 09, 2011

llvm-svn: 139419

e74e0c80

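The assert change in 78a812bf matters because a bare string literal is a non-null pointer, so `assert("error")` can never fire. The sketch below illustrates the two conditions in isolation; the function names are hypothetical and are not taken from the LLVM sources.

```cpp
#include <cassert>

// Old form: assert("error"). The string literal decays to a pointer,
// and a pointer to a literal is never null, so the condition is true
// and the assertion silently passes.
bool oldConditionHolds() {
  const char *message = "error";
  return message != nullptr;
}

// New form: assert(0 && "error"). The leading 0 makes the whole
// expression false; the literal only supplies text for the diagnostic.
bool newConditionHolds() {
  const char *message = "error";
  return false && message != nullptr;
}
```

With these definitions, `oldConditionHolds()` is true (the assertion would never trigger) and `newConditionHolds()` is false (the assertion always triggers when reached).
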
Sep 09, 2011

Make the SelectionDAG verify that all the operands of BUILD_VECTOR have the... · b7910b79

Eli Friedman authored Sep 09, 2011

Make the SelectionDAG verify that all the operands of BUILD_VECTOR have the same type.  Teach DAGCombiner::visitINSERT_VECTOR_ELT not to make invalid BUILD_VECTORs.  Fixes PR10897.

llvm-svn: 139407

b7910b79

Reapply r139247: Cache intermediate results during traceSiblingValue. · 278bf025

Jakob Stoklund Olesen authored Sep 09, 2011

In some cases such as interpreters using indirectbr, the CFG can be very
complicated, and live range splitting may be forced to insert a large
number of phi-defs.  When that happens, traceSiblingValue can spend a
lot of time zipping around in the CFG looking for defs and reloads.

This patch causes more information to be cached in SibValues, and the
cached values are used to terminate searches early.  This speeds up
spilling by 20x in one interpreter test case.  For more typical code,
this is just a 10% speedup of spilling.

The previous version had bugs that caused miscompilations. They have
been fixed.

llvm-svn: 139378

278bf025

Directly point debug info to the stack slot of the argument, instead of trying... · 9d904e1a

Devang Patel authored Sep 08, 2011

Directly point debug info to the stack slot of the argument, instead of trying to keep track of the vreg into which the argument is copied. The LiveDebugVariable can keep track of variable's ranges.

llvm-svn: 139330

9d904e1a

Sep 07, 2011

Revert r139247 "Cache intermediate results during traceSiblingValue." · 946e0a46

Jakob Stoklund Olesen authored Sep 07, 2011

It broke the self host and clang-x86_64-darwin10-RA.

llvm-svn: 139259

946e0a46

Cache intermediate results during traceSiblingValue. · b77d5c14

Jakob Stoklund Olesen authored Sep 07, 2011

In some cases such as interpreters using indirectbr, the CFG can be very
complicated, and live range splitting may be forced to insert a large
number of phi-defs.  When that happens, traceSiblingValue can spend a
lot of time zipping around in the CFG looking for defs and reloads.

This patch causes more information to be cached in SibValues, and the
cached values are used to terminate searches early.  This speeds up
spilling by 20x in one interpreter test case.  For more typical code,
this is just a 10% speedup of spilling.

llvm-svn: 139247

b77d5c14

Refactor instprinter and mcdisassembler to take a SubtargetInfo. Add -mattr=... · 4c493e80

James Molloy authored Sep 07, 2011

Refactor instprinter and mcdisassembler to take a SubtargetInfo. Add -mattr= handling to llvm-mc. Reviewed by Owen Anderson.

llvm-svn: 139237

4c493e80

Relax the MemOperands on atomics a bit. Fixes -verify-machineinstrs failures... · e978d2f6

Eli Friedman authored Sep 07, 2011

Relax the MemOperands on atomics a bit.  Fixes -verify-machineinstrs failures for atomic load/store on ARM.

(The fix for the related failures on x86 is going to be nastier because we actually need Acquire memoperands attached to the atomic load instrs, etc.)

llvm-svn: 139221

e978d2f6

While sinking machine instructions, sink matching DBG_VALUEs also; otherwise... · 9de7a7db

Devang Patel authored Sep 07, 2011

While sinking machine instructions, sink matching DBG_VALUEs also; otherwise the live debug variable pass will drop DBG_VALUEs on the floor.

llvm-svn: 139208

9de7a7db

Sep 06, 2011

Add codegen support for vector select (in the IR this means a select · f2641e1b

Duncan Sands authored Sep 06, 2011

with a vector condition); such selects become VSELECT codegen nodes.
This patch also removes VSETCC codegen nodes, unifying them with SETCC
nodes (codegen was actually often using SETCC for vector SETCC already).
This ensures that various DAG combiner optimizations kick in for vector
comparisons.  Passes dragonegg bootstrap with no testsuite regressions
(nightly testsuite as well as "make check-all").  Patch mostly by
Nadav Rotem.

llvm-svn: 139159

f2641e1b

Split the init.trampoline intrinsic, which currently combines GCC's · a098436b

Duncan Sands authored Sep 06, 2011

init.trampoline and adjust.trampoline intrinsics, into two intrinsics
like in GCC. While having one combined intrinsic is tempting, it is
not natural because typically the trampoline initialization needs to
be done in one function, and the result of adjust trampoline is needed
in a different (nested) function. To get around this llvm-gcc hacks the
nested function lowering code to insert an additional parent variable
holding the adjust.trampoline result that can be accessed from the child
function. Dragonegg doesn't have the luxury of tweaking GCC code, so it
stored the result of adjust.trampoline in the memory GCC set aside for
the trampoline itself (this is always available in the child function),
and set up some new memory (using an alloca) to hold the trampoline.
Unfortunately this breaks Go which allocates trampoline memory on the
heap and wants to use it even after the parent has exited (!). Rather
than doing even more hacks to get Go working, it seemed best to just use
two intrinsics like in GCC. Patch mostly by Sanjoy Das.

llvm-svn: 139140

a098436b

Sep 03, 2011

Fix a truly heinous bug in DAGCombine related to AssertZext. · 40d756ea

Owen Anderson authored Sep 03, 2011

If we have a chain of zext -> assert_zext -> zext -> use, the first zext would get simplified away because of the later zext, and then the later zext would get simplified away because of the assert. The solution is to teach SimplifyDemandedBits that assert_zext demands all of the high bits of its input, rather than only those demanded by its users. No testcase because the only example I have manifests as llvm-gcc miscompiling LLVM, and I haven't found a smaller case that reproduces this problem.
Fixes <rdar://problem/10063365>.

llvm-svn: 139059

40d756ea

Sep 02, 2011

Simplify by using isFullCopy(). · 97fe09ad

Jakob Stoklund Olesen authored Sep 02, 2011

llvm-svn: 139019

97fe09ad

Darwin wants ctors/dtors to be ordered the other way round to linux. · 5c04c627

Duncan Sands authored Sep 02, 2011

llvm-svn: 139015

5c04c627

Revert r131152, r129796, r129761. This code is currently considered · 3767be9a

Dan Gohman authored Sep 01, 2011

to be unreliable on platforms which require memcpy calls, and it is
complicating broader legalize cleanups. It is hoped that these cleanups
will make memcpy byval easier to implement in the future.

llvm-svn: 138977

3767be9a

Don't drop alignment info on local common symbols. · 6397051e

Benjamin Kramer authored Sep 01, 2011

- On COFF the .lcomm directive has an alignment argument.
- On ELF we fall back to .local + .comm

Based on a patch by NAKAMURA Takumi.

Fixes PR9337, PR9483 and PR10128.

llvm-svn: 138976

6397051e

Sep 01, 2011

Permit remat of partial register defs when it is safe. · 5dc87d0f

Jakob Stoklund Olesen authored Sep 01, 2011

An instruction may define part of a register where the other bits are
undefined. In that case, it is safe to rematerialize the instruction.
For example:

  %vreg2:ssub_0<def> = VLDRS <cp#0>, 0, pred:14, pred:%noreg, %vreg2<imp-def>

The extra <imp-def> operand indicates that the instruction does not read
the other parts of the virtual register, so a remat is safe.

This patch simply allows multiple def operands for the virtual register.
It is MI->readsVirtualRegister() that determines if we depend on a
previous value so remat is impossible.

llvm-svn: 138953

5dc87d0f

Revert r138794, "Do not try to rematerialize a value from a partial definition." · e417273f

Jakob Stoklund Olesen authored Sep 01, 2011

The problem is fixed for all register allocators by r138944, so this
patch is no longer necessary.

<rdar://problem/10032939>

llvm-svn: 138945

e417273f

Prevent remat of partial register redefinitions. · 6357fa2f

Jakob Stoklund Olesen authored Sep 01, 2011

An instruction that redefines only part of a larger register can never
be rematerialized since the virtual register value depends on the old
value in other parts of the register.

This was fixed for the inline spiller in r138794.  This patch fixes the
problem for all register allocators, and includes a small test case.

<rdar://problem/10032939>

llvm-svn: 138944

6357fa2f

Teach MachineLICM reg pressure tracking code to deal with MVT::untyped. Sorry,... · 90da66bb

Evan Cheng authored Sep 01, 2011

Teach MachineLICM reg pressure tracking code to deal with MVT::untyped. Sorry, I can't come up with a small test case. rdar://10043690

llvm-svn: 138934

90da66bb

PreRA scheduler should avoid cloning compares. · 832a6a19

Andrew Trick authored Sep 01, 2011

Added canClobberReachingPhysRegUse() to handle a particular pattern in
which a two-address instruction could be forced to interfere with
EFLAGS, causing a compare to be unnecessarily cloned.
Fixes rdar://problem/5875261

llvm-svn: 138924

832a6a19

Aug 31, 2011

Fix Size Typing · 7df940d6

David Greene authored Aug 31, 2011

Stores sizes as uint64_t to avoid possible truncation.

llvm-svn: 138901

7df940d6

Misc cleanup; addresses Duncan's comments on r138877. · ae1acddb

Eli Friedman authored Aug 31, 2011

llvm-svn: 138887

ae1acddb

Fill in type legalization for MERGE_VALUES in all the various cases. Patch by... · e839ecb7

Eli Friedman authored Aug 31, 2011

Fill in type legalization for MERGE_VALUES in all the various cases.  Patch by Micah Villmow.  (No testcase because the issue only showed up in an out-of-tree backend.)

llvm-svn: 138877

e839ecb7

Generic expansion for atomic load/store into cmpxchg/atomicrmw xchg;... · 7c3bdede

Eli Friedman authored Aug 31, 2011

Generic expansion for atomic load/store into cmpxchg/atomicrmw xchg; implements 64-bit atomic load/store for ARM.

llvm-svn: 138872

7c3bdede

Compress Repeated Byte Output · cdef71f4

David Greene authored Aug 31, 2011

Emit a repeated sequence of bytes using .zero.  This saves an enormous
amount of asm file space for certain programs.

llvm-svn: 138864

cdef71f4

Spelling and grammar fixes to problems found by Duncan. · 6e31dfea

Rafael Espindola authored Aug 31, 2011

llvm-svn: 138858

6e31dfea

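The idea behind 7c3bdede, expressing an atomic load or store in terms of compare-exchange and exchange, can be illustrated at the source level with `std::atomic`. This is only a sketch of the general technique, not the LLVM lowering code: the function names are hypothetical, and the pairing of load with cmpxchg and store with xchg is an assumption read from the commit title.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Atomic load built from compare-exchange: try to swap 0 for 0.
// If the cell holds 0, it is rewritten with the same value; otherwise
// the exchange fails and `expected` receives the current contents.
// Either way `expected` ends up holding the value that was in the cell.
std::uint64_t loadViaCmpXchg(std::atomic<std::uint64_t> &cell) {
  std::uint64_t expected = 0;
  cell.compare_exchange_strong(expected, 0);
  return expected;
}

// Atomic store built from exchange: swap in the new value and
// discard the old one.
void storeViaXchg(std::atomic<std::uint64_t> &cell, std::uint64_t value) {
  (void)cell.exchange(value);
}
```

One consequence of this approach is that the "load" may perform a write, so it needs writable memory; that trade-off is inherent to the technique rather than specific to this sketch.
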
Aug 30, 2011

Emit segmented-stack specific code into function prologues for · c2174211

Rafael Espindola authored Aug 30, 2011

X86. Modify the pass added in the previous patch to call this new
code.

The new prologues will call a libgcc routine (__morestack)
to allocate more stack space from the heap when required.

Patch by Sanjoy Das.

llvm-svn: 138812

c2174211

Follow up to r138791. · e6fba779

Evan Cheng authored Aug 30, 2011

Add an instruction flag: hasPostISelHook which tells the pre-RA scheduler to
call a target hook to adjust the instruction. For ARM, this is used to
adjust instructions which may be setting the 's' flag. ADC, SBC, RSB, and RSC
instructions have implicit def of CPSR (required since it now uses CPSR physical
register dependency rather than "glue"). If the carry flag is used, then the
target hook will *fill in* the optional operand with CPSR. Otherwise, the hook
will remove the CPSR implicit def from the MachineInstr.

llvm-svn: 138810

e6fba779

Do not try to rematerialize a value from a partial definition. · 358a5f6a

Bob Wilson authored Aug 30, 2011

I don't currently have a good testcase for this; will try to get one
tomorrow.  <rdar://problem/10032939>

llvm-svn: 138794

358a5f6a

Thumb2 parsing and encoding for IT blocks. · ed16ec42

Jim Grosbach authored Aug 29, 2011

llvm-svn: 138773

ed16ec42

Aug 28, 2011

Fix PR5329: pay attention to constructor/destructor priority · 4d63542b

Duncan Sands authored Aug 28, 2011

when outputting them.  With this, the entire LLVM testsuite
passes when built with dragonegg.

llvm-svn: 138724

4d63542b
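Honoring constructor priority when emitting the list amounts to ordering the entries by their priority value. The sketch below shows one way to do that, assuming lower numbers run first (as with GCC's constructor priorities) and that entries with equal priority keep their original order. `StructorEntry` and `orderByPriority` are hypothetical names, not the code from 4d63542b.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical record for one constructor or destructor entry.
struct StructorEntry {
  int priority;      // lower value = runs earlier (assumption)
  std::string name;  // symbol name of the function
};

// Returns the entries ordered by ascending priority. stable_sort keeps
// the relative order of entries that share a priority.
std::vector<StructorEntry> orderByPriority(std::vector<StructorEntry> entries) {
  std::stable_sort(entries.begin(), entries.end(),
                   [](const StructorEntry &a, const StructorEntry &b) {
                     return a.priority < b.priority;
                   });
  return entries;
}
```

A stable sort is used so that entries without an explicit priority, which typically share one default value, are emitted in the order they were declared.
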

Aug 27, 2011

These splits should be done whether they are critical edges or not. · 4707d37a

Bill Wendling authored Aug 27, 2011

llvm-svn: 138697

4707d37a

Aug 26, 2011

Update the dominator tree with the correct dominator for the new 'unwind' block. · 71fce2c8

Bill Wendling authored Aug 26, 2011

llvm-svn: 138664

71fce2c8