Commits · 9be547cfd3747e30963204187a99b59a11668cc7 · Roger Ferrer / llvm-epi-0.8

Jan 14, 2011

Add a possibility to switch between CFI directives- and table-based frame... · 9be547cf

Anton Korobeynikov authored Jan 14, 2011

Add a possibility to switch between CFI directives- and table-based frame description emission. Currently all the backends use table-based stuff.

llvm-svn: 123476

9be547cf

Cleanup · 4d9de6be
Anton Korobeynikov authored Jan 14, 2011
```
llvm-svn: 123475
```
4d9de6be
Add CFI directives-based frame information emission. Not hooked yet. · b46ef57d
Anton Korobeynikov authored Jan 14, 2011
```
llvm-svn: 123474
```
b46ef57d
Split stuff as a preparation for CFI directives-based frame information emission · 61d167e9
Anton Korobeynikov authored Jan 14, 2011
```
llvm-svn: 123473
```
61d167e9
Use common style for .cfi directives · e2bea1c8
Anton Korobeynikov authored Jan 14, 2011
```
llvm-svn: 123472
```
e2bea1c8

Support for precise scheduling of the instruction selection DAG, · 9ccce778

Andrew Trick authored Jan 14, 2011

disabled in this checkin. Sorry for the large diffs due to
refactoring. New functionality is all guarded by EnableSchedCycles.

Scheduling the isel DAG is inherently imprecise, but we give it a best
effort:
- Added MayReduceRegPressure to allow stalled nodes in the queue only
  if there is a regpressure need.
- Added BUHasStall to allow checking for either dependence stalls due to
  latency or resource stalls due to pipeline hazards.
- Added BUCompareLatency to encapsulate and standardize the heuristics
  for minimizing stall cycles (vs. reducing register pressure).
- Modified the bottom-up heuristic (now in BUCompareLatency) to
  prioritize nodes by their depth rather than height. As long as it
  doesn't stall, height is irrelevant. Depth represents the critical
  path to the DAG root.
- Added hybrid_ls_rr_sort::isReady to filter stalled nodes before
  adding them to the available queue.

Related Cleanup: most of the register reduction routines do not need
to be templates.

llvm-svn: 123468

9ccce778

switch SRoA to use LoadAndStorePromoter instead of its own copy of the code. · b498f9af
Chris Lattner authored Jan 14, 2011
```
llvm-svn: 123457
```
b498f9af
Add a new LoadAndStorePromoter class, which implements the general · 95294b87
Chris Lattner authored Jan 14, 2011
```
"promote a bunch of load and stores" logic, allowing the code to
be shared and reused.

llvm-svn: 123456
```
95294b87
OperandTraits<>::Layout isn't used for anything. Remove it. · cbe15056
Jay Foad authored Jan 14, 2011
```
llvm-svn: 123452
```
cbe15056
Update llvm-gcc's tests. · b1ebba9e
Rafael Espindola authored Jan 14, 2011
```
llvm-svn: 123447
```
b1ebba9e
Reorder macros on config.h.cmake to easily compare it against · 959d2534
Oscar Fuentes authored Jan 14, 2011
```
config.h.in.

Patch by arrowdodger!

llvm-svn: 123445
```
959d2534
Disable debug mode. · 610c41e7
Devang Patel authored Jan 14, 2011
```
llvm-svn: 123443
```
610c41e7

Turn X-(X-Y) into Y. According to my auto-simplifier this is the most common · d6f1a958

Duncan Sands authored Jan 14, 2011

simplification present in fully optimized code (I think instcombine fails to
transform some of these when "X-Y" has more than one use). Fires here and
there all over the test-suite, for example it eliminates 8 subtractions in
the final IR for 445.gobmk, 2 subs in 447.dealII, 2 in paq8p etc.

llvm-svn: 123442

d6f1a958
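
The fold in the commit above relies on X-(X-Y) being equal to Y even when the inner subtraction wraps around. A minimal sketch of that identity follows, assuming 32-bit wrapping arithmetic; `subOfSub` and `checkSubOfSub` are hypothetical helper names used only for illustration and are not part of LLVM.

```cpp
#include <cassert>
#include <cstdint>

// Evaluates X - (X - Y) with wrapping 32-bit arithmetic, the way the
// unsimplified expression would be computed. (Hypothetical helper.)
inline std::uint32_t subOfSub(std::uint32_t x, std::uint32_t y) {
  std::uint32_t inner = x - y; // X - Y, may wrap around
  return x - inner;            // X - (X - Y)
}

// Checks the identity on a few sample values, including ones that wrap.
inline void checkSubOfSub() {
  const std::uint32_t samples[] = {0u,          1u,          7u,
                                   0x7fffffffu, 0x80000000u, 0xffffffffu};
  for (std::uint32_t x : samples)
    for (std::uint32_t y : samples)
      assert(subOfSub(x, y) == y); // always folds to Y
}
```

Calling `checkSubOfSub()` exercises the identity on the sample grid. The sketch only illustrates why the result is Y; it does not show how InstructionSimplify matches the pattern.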

Factorize common code out of the InstructionSimplify shift logic. Add in · 571fd9a6

Duncan Sands authored Jan 14, 2011

threading of shifts over selects and phis while there. This fires here and
there in the testsuite, to not much effect. For example when compiling spirit
it fires 5 times, during early-cse, resulting in 6 more cse simplifications,
and 3 more terminators being folded by jump threading, but the final bitcode
doesn't change in any interesting way: other optimizations would have caught
the opportunity anyway, only later.

llvm-svn: 123441

571fd9a6
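
"Threading" a shift over a select, as mentioned in the commit above, can be pictured as applying the shift to each arm separately. If both arms then reduce to the same value, the select no longer matters. The following is a rough sketch of that idea on plain integers, not the InstructionSimplify code itself; all function names are hypothetical, and shift amounts are assumed to be below 32.

```cpp
#include <cassert>
#include <cstdint>

// Shift applied to the result of a select: (c ? a : b) >> s.
// Assumes s < 32. (Hypothetical helper.)
inline std::uint32_t shiftOfSelect(bool c, std::uint32_t a, std::uint32_t b,
                                   unsigned s) {
  return (c ? a : b) >> s;
}

// The same shift threaded into both arms: c ? (a >> s) : (b >> s).
inline std::uint32_t selectOfShifts(bool c, std::uint32_t a, std::uint32_t b,
                                    unsigned s) {
  return c ? (a >> s) : (b >> s);
}

// Checks that both forms agree on a small grid of inputs.
inline void checkThreading() {
  const bool conds[] = {false, true};
  const std::uint32_t vals[] = {0u, 1u, 2u, 0x80000000u, 0xffffffffu};
  const unsigned shifts[] = {0u, 1u, 16u, 31u};
  for (bool c : conds)
    for (std::uint32_t a : vals)
      for (std::uint32_t b : vals)
        for (unsigned s : shifts)
          assert(shiftOfSelect(c, a, b, s) == selectOfShifts(c, a, b, s));
}
```

For instance, `shiftOfSelect(c, 1, 2, 31)` is 0 for either value of `c`, because both arms shift down to 0. That is the kind of case where the threaded form lets the whole expression be replaced by a constant.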

Rename this test. · c3eb0f4b
Duncan Sands authored Jan 14, 2011
```
llvm-svn: 123440
```
c3eb0f4b

switch the second scalarrepl pass to use SSAUpdater. We run two scalarrepl passes: one · 8d7716a2

Chris Lattner authored Jan 14, 2011

early in the cleanup code and one late interlaced with the inliner. The second one is
important because inlining and other scalar optzns can unpin allocas, allowing them to
be split up and promoted. While important for performance, this is also relatively
rare, and we would previously force a (non-lazy) computation of DomFrontiers, which
happened even if nothing became unpinned.

With this patch, the first pass of scalarrepl still promotes the vast bulk of allocas
in programs, but the second pass has changed to use SSAUpdater, which is more "sparse"
and lazy. This speeds up opt -O3 time on kimwitu++ (a c++ app) by about 1%. The
numbers are interesting: the first pass promotes ~17500 allocas. The second pass
promotes about 1600. For non-C++ codes, the compile time win should be greater,
because the second pass of scalarrepl does less.

llvm-svn: 123437

8d7716a2

split SROA into two passes: one that uses DomFrontiers (-scalarrepl) · 9987a6f4
Chris Lattner authored Jan 14, 2011
```
and one that uses SSAUpdater (-scalarrepl-ssa)

llvm-svn: 123436
```
9987a6f4

Remove casts between Value** and Constant**, which won't work if a · 1d4a8fe1

Jay Foad authored Jan 14, 2011

static_cast from Constant* to Value* has to adjust the "this" pointer.
This is groundwork for PR889.

llvm-svn: 123435

1d4a8fe1
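
The concern in the commit above is that a pointer-to-pointer cast reinterprets every element without adjusting it, whereas a `static_cast` between related class pointers may change the address. A small sketch with hypothetical types (`Base1`, `Base2`, `Derived`; these are stand-ins, not LLVM's `Value` and `Constant`) shows the effect using multiple inheritance, where at least one base subobject cannot sit at the start of the object.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in types; not LLVM classes.
struct Base1 { int a = 1; };
struct Base2 { int b = 2; };
struct Derived : Base1, Base2 { int c = 3; };

// True when converting Derived* to at least one of its bases changes the
// address. Two non-empty bases cannot both start at the object's address.
inline bool someBaseCastAdjustsPointer() {
  Derived d;
  auto whole = reinterpret_cast<std::uintptr_t>(&d);
  auto first = reinterpret_cast<std::uintptr_t>(static_cast<Base1 *>(&d));
  auto second = reinterpret_cast<std::uintptr_t>(static_cast<Base2 *>(&d));
  return first != whole || second != whole;
}

// Element-wise conversion is the safe alternative to casting Derived** to
// Base2**: each pointer is converted (and adjusted) individually.
inline int sumViaConvertedArray() {
  Derived d0, d1;
  Derived *derivedPtrs[2] = {&d0, &d1};
  Base2 *basePtrs[2];
  for (int i = 0; i < 2; ++i)
    basePtrs[i] = derivedPtrs[i]; // implicit conversion adjusts each pointer
  assert(basePtrs[0]->b == 2 && basePtrs[1]->b == 2);
  return basePtrs[0]->b + basePtrs[1]->b;
}
```

Casting `derivedPtrs` directly to `Base2 **` would skip the per-element adjustment and read the wrong subobject whenever the offset is non-zero, which is why array casts of this kind are fragile once the class layout changes.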

Implement full support for promoting allocas to registers using SSAUpdater · 543384ef

Chris Lattner authored Jan 14, 2011

instead of DomTree/DomFrontier.  This may be interesting for reducing compile 
time.  This is currently disabled, but seems to work just fine.

When this is enabled, we eliminate two runs of dominator frontier, one in the
"early per-function" optimizations and one in the "interlaced with inliner"
function passes.

llvm-svn: 123434

543384ef

relax testcase a bit. · 5e0fef85
Chris Lattner authored Jan 14, 2011
```
llvm-svn: 123433
```
5e0fef85
Try for the third time to teach getFirstTerminator() about debug values. · ab3d6ecb
Jakob Stoklund Olesen authored Jan 14, 2011
```
This time let's rephrase to trick gcc-4.3 into not miscompiling.

llvm-svn: 123432
```
ab3d6ecb
revert my fastisel patch again which apparently still gives the · e93e4f11
Chris Lattner authored Jan 14, 2011
```
llvm-gcc-i386-linux-selfhost buildbot heartburn...

llvm-svn: 123431
```
e93e4f11
reapply r123414 now that the botz are calmed down and the fix is already in. · 5ca13910
Chris Lattner authored Jan 14, 2011
```
llvm-svn: 123427
```
5ca13910
indentation · 90f3a9a1
Chris Lattner authored Jan 14, 2011
```
llvm-svn: 123426
```
90f3a9a1

Completed :lower16: / :upper16: support for movw / movt pairs on Darwin. · d4a5c05c

Evan Cheng authored Jan 14, 2011

- Fixed :upper16: fix up routine. It should be shifting down the top 16 bits first.
- Added support for Thumb2 :lower16: and :upper16: fix up.
- Added :upper16: and :lower16: relocation support to mach-o object writer.

llvm-svn: 123424

d4a5c05c
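
The first bullet in the commit above says the :upper16: fixup has to shift the top 16 bits down before using them. A minimal sketch of the value split behind a movw / movt pair follows. The helper names are hypothetical, and the sketch ignores how the 16-bit immediate is scattered across the actual ARM or Thumb2 instruction encoding.

```cpp
#include <cassert>
#include <cstdint>

// :lower16: keeps the bottom 16 bits of the 32-bit value. (Hypothetical helper.)
inline std::uint16_t lower16(std::uint32_t value) {
  return static_cast<std::uint16_t>(value & 0xffffu);
}

// :upper16: shifts the top 16 bits down first, then keeps 16 bits.
inline std::uint16_t upper16(std::uint32_t value) {
  return static_cast<std::uint16_t>((value >> 16) & 0xffffu);
}

// Models what the instruction pair builds in a register:
// movw writes the low half and clears the high half, movt fills the high half.
inline std::uint32_t materialize(std::uint32_t value) {
  std::uint32_t reg = lower16(value);                       // movw
  reg = (reg & 0xffffu) |
        (static_cast<std::uint32_t>(upper16(value)) << 16); // movt
  assert(reg == value); // the two halves reassemble the original value
  return reg;
}
```

Without the shift, masking the unshifted value with 0xffff would return the low half again, and the movt would load the wrong bits.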

Revert r123419. It still breaks llvm-gcc-i386-linux-selfhost. · c3810288
Jakob Stoklund Olesen authored Jan 14, 2011
```
llvm-svn: 123423
```
c3810288
r123414 broke llvm-gcc bootstrap apparently, revert · 21a64979
Chris Lattner authored Jan 14, 2011
```
llvm-svn: 123422
```
21a64979
Set the insertion point correctly for instructions generated by load folding: · 3be81e9b
Chris Lattner authored Jan 14, 2011
```
they should go *before* the new instruction not after it. 

llvm-svn: 123420
```
3be81e9b
Try again to teach getFirstTerminator() about debug values. · c0767e02
Jakob Stoklund Olesen authored Jan 14, 2011
```
Fix some callers to better deal with debug values.

llvm-svn: 123419
```
c0767e02

Rather than doing early instcombine, try doing early CSE instead. This should still handle · e3ed20ce

Owen Anderson authored Jan 14, 2011

most important simplifications, as well as resolving phase ordering issues where instcombine
would inhibit important CSE'ing opportunities, for instance on BitBench/drop3.

llvm-svn: 123418

e3ed20ce

Move some shift transforms out of instcombine and into InstructionSimplify. · 7f60dc1e

Duncan Sands authored Jan 14, 2011

While there, I noticed that the transform "undef >>a X -> undef" was wrong.
For example if X is 2 then the top two bits must be equal, so the result cannot
be an arbitrary value. I fixed this in the constant folder as well. Also, I made
the transform for "X << undef" stronger: it now folds to undef always, even
though X might be zero. This is in accordance with the LangRef, but I must
admit that it is fairly aggressive. Also, I added "i32 X << 32 -> undef"
following the LangRef and the constant folder, likewise fairly aggressive.

llvm-svn: 123417

7f60dc1e
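
The argument above about "undef >>a X" can be checked by brute force on a narrow type: after an arithmetic shift right by 2, the top bits of the result are copies of the sign bit, so some bit patterns can never appear. The sketch below assumes an 8-bit width for brevity; `ashr8` and `reachable` are hypothetical helpers, not LLVM functions.

```cpp
#include <cassert>
#include <cstdint>

// Arithmetic shift right on an 8-bit value, written out portably.
// Assumes amount is in [0, 7]. (Hypothetical helper.)
inline std::uint8_t ashr8(std::uint8_t v, unsigned amount) {
  std::uint8_t shifted = static_cast<std::uint8_t>(v >> amount);
  if (v & 0x80u) // sign bit set: fill the vacated top bits with ones
    shifted = static_cast<std::uint8_t>(shifted | (0xffu << (8 - amount)));
  return shifted;
}

// Returns true if some 8-bit input produces `target` under ashr by `amount`.
inline bool reachable(std::uint8_t target, unsigned amount) {
  for (unsigned v = 0; v < 256; ++v)
    if (ashr8(static_cast<std::uint8_t>(v), amount) == target)
      return true;
  return false;
}

// 0x40 has differing top two bits, so no input can shift to it.
inline void checkAshrRange() {
  assert(!reachable(0x40u, 2));
  assert(reachable(0xe0u, 2)); // 0x80 >>a 2
  assert(reachable(0x1fu, 2)); // 0x7f >>a 2
}
```

Since a value such as 0x40 is unreachable, folding the shift to undef would permit results the real operation cannot produce, which matches the reasoning given in the commit message.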

Don't bother conditionalizing the use of SROA in -O1 mode. We're already... · ae6ce377

Owen Anderson authored Jan 14, 2011

Don't bother conditionalizing the use of SROA in -O1 mode.  We're already running it unconditionally
later in the pipeline.

llvm-svn: 123416

ae6ce377

fix PR8961 - a fast isel miscompilation where we'd insert a new instruction · 0c34cb42

Chris Lattner authored Jan 14, 2011

after sext's generated for addressing that got folded.  Previously we compiled
test5 into:

_test5:                                 ## @test5
## BB#0:
        movq    -8(%rsp), %rax          ## 8-byte Reload
        movq    (%rdi,%rax), %rdi
        addq    %rdx, %rdi
        movslq  %esi, %rax
        movq    %rax, -8(%rsp)          ## 8-byte Spill
        movq    %rdi, %rax
        ret

which is insane and wrong.  Now we produce:

_test5:                                 ## @test5
## BB#0:
	movslq	%esi, %rax
	movq	(%rdi,%rax), %rax
	addq	%rdx, %rax
	ret

llvm-svn: 123414

0c34cb42

Better terminator avoidance. · 088b30aa

Jakob Stoklund Olesen authored Jan 13, 2011

This approach also works when the terminator doesn't have a slot index. (Which
can happen??)

llvm-svn: 123413

088b30aa

Add comment about Thumb2 fixup comments being completely bogus. · 52899a9c
Evan Cheng authored Jan 13, 2011
```
llvm-svn: 123411
```
52899a9c

Add single entry / single exit accessors. · b1d11c19

Tobias Grosser authored Jan 13, 2011

Add methods for accessing the (single) entry / exit edge of a region. If no such
edge exists, null is returned.  Both accessors return the start block of the
corresponding edge. The edge can finally be formed by utilizing
Region::getEntry() or Region::getExit().

Contributed by: Andreas Simbuerger <simbuerg@fim.uni-passau.de>

llvm-svn: 123410

b1d11c19

Jan 13, 2011
Recognize alternative register names like ip -> r12. · a098d150
Owen Anderson authored Jan 13, 2011
```
Fixes <rdar://problem/8857982>.

llvm-svn: 123409
```
a098d150
Fix a few more places that should use MBB::getLastNonDebugInstr(). · bbb1a54b
Jakob Stoklund Olesen authored Jan 13, 2011
```
llvm-svn: 123408
```
bbb1a54b
As far as I can tell, unified syntax uses c0-c15 instead of cr0-cr15 for mcr and friends. · ec47597e
Owen Anderson authored Jan 13, 2011
```
llvm-svn: 123407
```
ec47597e
typo · b6c3aff1
Chris Lattner authored Jan 13, 2011
```
llvm-svn: 123406
```
b6c3aff1