- Jan 15, 2011
-
Chris Lattner authored
This fixes rdar://8808586 which observed that we used to compile:

    union xy {
        struct x { _Bool b[15]; } x;
        __attribute__((packed)) struct y {
            __attribute__((packed)) unsigned long b0to7;
            __attribute__((packed)) unsigned int b8to11;
            __attribute__((packed)) unsigned short b12to13;
            __attribute__((packed)) unsigned char b14;
        } y;
    };

    struct x foo(union xy *xy) {
        return xy->x;
    }

into:

    _foo:                               ## @foo
        movq    (%rdi), %rax
        movabsq $1095216660480, %rcx    ## imm = 0xFF00000000
        andq    %rax, %rcx
        movabsq $-72057594037927936, %rdx ## imm = 0xFF00000000000000
        andq    %rax, %rdx
        movzbl  %al, %esi
        orq     %rdx, %rsi
        movq    %rax, %rdx
        andq    $65280, %rdx            ## imm = 0xFF00
        orq     %rsi, %rdx
        movq    %rax, %rsi
        andq    $16711680, %rsi         ## imm = 0xFF0000
        orq     %rdx, %rsi
        movl    %eax, %edx
        andl    $-16777216, %edx        ## imm = 0xFFFFFFFFFF000000
        orq     %rsi, %rdx
        orq     %rcx, %rdx
        movabsq $280375465082880, %rcx  ## imm = 0xFF0000000000
        movq    %rax, %rsi
        andq    %rcx, %rsi
        orq     %rdx, %rsi
        movabsq $71776119061217280, %r8 ## imm = 0xFF000000000000
        andq    %r8, %rax
        orq     %rsi, %rax
        movzwl  12(%rdi), %edx
        movzbl  14(%rdi), %esi
        shlq    $16, %rsi
        orl     %edx, %esi
        movq    %rsi, %r9
        shlq    $32, %r9
        movl    8(%rdi), %edx
        orq     %r9, %rdx
        andq    %rdx, %rcx
        movzbl  %sil, %esi
        shlq    $32, %rsi
        orq     %rcx, %rsi
        movl    %edx, %ecx
        andl    $-16777216, %ecx        ## imm = 0xFFFFFFFFFF000000
        orq     %rsi, %rcx
        movq    %rdx, %rsi
        andq    $16711680, %rsi         ## imm = 0xFF0000
        orq     %rcx, %rsi
        movq    %rdx, %rcx
        andq    $65280, %rcx            ## imm = 0xFF00
        orq     %rsi, %rcx
        movzbl  %dl, %esi
        orq     %rcx, %rsi
        andq    %r8, %rdx
        orq     %rsi, %rdx
        ret

We now compile this into:

    _foo:                               ## @foo
    ## BB#0:                            ## %entry
        movzwl  12(%rdi), %eax
        movzbl  14(%rdi), %ecx
        shlq    $16, %rcx
        orl     %eax, %ecx
        shlq    $32, %rcx
        movl    8(%rdi), %edx
        orq     %rcx, %rdx
        movq    (%rdi), %rax
        ret

A small improvement :-)

llvm-svn: 123520
-
Chris Lattner authored
llvm-svn: 123519
-
Chris Lattner authored
these would try hard to match constants by inverting the bits and recursively matching. There are two problems with this:

  1) some patterns would match when we didn't want them to (theoretical)
  2) this is insanely expensive to do, and most often pointless.

This was apparently useful in just 2 instcombine cases, which I added code to handle explicitly. This change speeds up 'opt' time on 176.gcc by 1% and produces bitwise identical code.

llvm-svn: 123518
-
Chris Lattner authored
no functionality change currently. llvm-svn: 123517
-
Chris Lattner authored
llvm-svn: 123516
-
Chris Lattner authored
means that are about to disappear. llvm-svn: 123515
-
Chris Lattner authored
llvm-svn: 123514
-
Eric Christopher authored
llvm-svn: 123505
-
Chris Lattner authored
to use it. llvm-svn: 123501
-
Bob Wilson authored
llvm-svn: 123497
-
Eric Christopher authored
llvm-svn: 123494
-
- Jan 14, 2011
-
Ted Kremenek authored
llvm-svn: 123491
-
Bob Wilson authored
This is needed to allow an InstAlias for an instruction with an "OptionalDef" result register (like ARM's cc_out) where you want to set the optional register to reg0. llvm-svn: 123490
-
Ted Kremenek authored
llvm-svn: 123487
-
Ted Kremenek authored
declaration and its assignments. Found by clang static analyzer. llvm-svn: 123486
-
Owen Anderson authored
llvm-svn: 123480
-
Dan Gohman authored
comments. llvm-svn: 123479
-
Owen Anderson authored
bitcasts, at least in simple cases. This fixes clang's CodeGenCXX/virtual-base-dtor.cpp llvm-svn: 123477
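For illustration, a hedged C++ analogue of the pattern (the commit itself operates on LLVM IR, where the casts below become pointer bitcasts; the function and variable names here are made up):

    #include <cassert>

    // A store through one pointer type followed by a load through a
    // round-tripped ("bitcast") pointer to the same address: the load
    // can be answered directly from the earlier store.
    int load_after_bitcast() {
      int x;
      int *pi = &x;
      *pi = 7;                            // store through int*
      void *pv = pi;                      // lose the type (a bitcast in IR)
      int *pi2 = static_cast<int *>(pv);  // cast back
      return *pi2;                        // forwarded from the store: 7
    }

    int main() { assert(load_after_bitcast() == 7); }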
-
Anton Korobeynikov authored
Add the ability to switch between CFI-directive-based and table-based frame description emission. Currently all backends use the table-based approach. llvm-svn: 123476
-
Anton Korobeynikov authored
llvm-svn: 123475
-
Anton Korobeynikov authored
llvm-svn: 123474
-
Anton Korobeynikov authored
llvm-svn: 123473
-
Anton Korobeynikov authored
llvm-svn: 123472
-
Andrew Trick authored
disabled in this checkin. Sorry for the large diffs due to refactoring. New functionality is all guarded by EnableSchedCycles. Scheduling the isel DAG is inherently imprecise, but we give it a best effort:

- Added MayReduceRegPressure to allow stalled nodes in the queue only if there is a regpressure need.
- Added BUHasStall to allow checking for either dependence stalls due to latency or resource stalls due to pipeline hazards.
- Added BUCompareLatency to encapsulate and standardize the heuristics for minimizing stall cycles (vs. reducing register pressure).
- Modified the bottom-up heuristic (now in BUCompareLatency) to prioritize nodes by their depth rather than height. As long as it doesn't stall, height is irrelevant. Depth represents the critical path to the DAG root.
- Added hybrid_ls_rr_sort::isReady to filter stalled nodes before adding them to the available queue.

Related cleanup: most of the register reduction routines do not need to be templates.

llvm-svn: 123468
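A hedged sketch of the depth-vs-height point (toy types and names, not the real SUnit/BUCompareLatency code):

    #include <cassert>

    struct Node {
      unsigned Depth;   // longest path to the DAG root (the critical path)
      unsigned Height;  // longest path to a leaf
      bool Stalls;      // would issuing this node stall the pipeline?
    };

    // Return true if A should be scheduled before B: avoid stalls first,
    // then chase the critical path via depth; height only matters through
    // the stall check.
    bool preferByLatency(const Node &A, const Node &B) {
      if (A.Stalls != B.Stalls)
        return !A.Stalls;
      return A.Depth > B.Depth;
    }

    int main() {
      Node A{/*Depth=*/8, /*Height=*/2, /*Stalls=*/false};
      Node B{/*Depth=*/5, /*Height=*/9, /*Stalls=*/false};
      assert(preferByLatency(A, B));  // deeper node wins when neither stalls
    }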
-
Chris Lattner authored
llvm-svn: 123457
-
Chris Lattner authored
"promote a bunch of load and stores" logic, allowing the code to be shared and reused. llvm-svn: 123456
-
Jay Foad authored
llvm-svn: 123452
-
Rafael Espindola authored
llvm-svn: 123447
-
Oscar Fuentes authored
config.h.in. Patch by arrowdodger! llvm-svn: 123445
-
Devang Patel authored
llvm-svn: 123443
-
Duncan Sands authored
simplification present in fully optimized code (I think instcombine fails to transform some of these when "X-Y" has more than one use). Fires here and there all over the test-suite, for example it eliminates 8 subtractions in the final IR for 445.gobmk, 2 subs in 447.dealII, 2 in paq8p etc. llvm-svn: 123442
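If the simplification in question has the shape (X - Y) + Y ==> X (a hedged guess from the description), here is a minimal C++ illustration of why extra uses of X - Y don't block a simplifier, unlike a rewriter such as instcombine:

    #include <cassert>

    int sink;  // gives the subtraction a second use

    int simplified(int x, int y) {
      int t = x - y;  // "X - Y" with more than one use...
      sink = t;
      return t + y;   // ...yet the add still folds to plain x: a
    }                 // simplifier only *returns* x, it rewrites nothing

    int main() { assert(simplified(10, 3) == 10); }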
-
Duncan Sands authored
threading of shifts over selects and phis while there. This fires here and there in the testsuite, to not much effect. For example when compiling spirit it fires 5 times, during early-cse, resulting in 6 more cse simplifications, and 3 more terminators being folded by jump threading, but the final bitcode doesn't change in any interesting way: other optimizations would have caught the opportunity anyway, only later. llvm-svn: 123441
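A hedged C++ illustration of what threading a shift over a select means (hypothetical values; the real work happens on LLVM IR):

    #include <cassert>

    // (c ? A : B) >> N can be "threaded" into c ? (A >> N) : (B >> N),
    // which pays off when both shifted arms fold to constants:
    unsigned threaded(bool c) {
      unsigned v = c ? 32u : 64u;  // a select in IR
      return v >> 5;               // threads to: c ? 1u : 2u
    }

    int main() { assert(threaded(true) == 1 && threaded(false) == 2); }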
-
Duncan Sands authored
llvm-svn: 123440
-
Chris Lattner authored
early in the cleanup code and one late interlaced with the inliner. The second one is important because inlining and other scalar optzns can unpin allocas, allowing them to be split up and promoted. While important for performance, this is also relatively rare, and we would previously force a (non-lazy) computation of DomFrontiers, which happened even if nothing became unpinned. With this patch, the first pass of scalarrepl still promotes the vast bulk of allocas in programs, but the second pass has changed to use SSAUpdater, which is more "sparse" and lazy. This speeds up opt -O3 time on kimwitu++ (a c++ app) by about 1%. The numbers are interesting: the first pass promotes ~17500 allocas. The second pass promotes about 1600. For non-C++ codes, the compile time win should be greater, because the second pass of scalarrepl does less. llvm-svn: 123437
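A hedged C++ sketch of an alloca being "unpinned" by inlining (hypothetical code, just to illustrate why a second, late scalarrepl run pays off):

    static void init(int *p) { *p = 42; }  // takes the address...

    int caller() {
      int tmp;       // pinned: &tmp escapes into the call below,
      init(&tmp);    // but once init() is inlined the store is direct
      return tmp;    // and the second scalarrepl run can promote tmp
    }

    int main() { return caller() == 42 ? 0 : 1; }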
-
Chris Lattner authored
and one that uses SSAUpdater (-scalarrepl-ssa) llvm-svn: 123436
-
Jay Foad authored
static_cast from Constant* to Value* has to adjust the "this" pointer. This is groundwork for PR889. llvm-svn: 123435
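For context, a hedged sketch of the general C++ rule at play (hypothetical class layout, not LLVM's actual hierarchy): under multiple inheritance a base subobject can sit at a nonzero offset, so an upcast must add that offset to the pointer.

    #include <cstdio>

    struct User  { int Ops; };
    struct Value { int ID; };
    struct Constant : User, Value {};  // Value is the *second* base

    int main() {
      Constant C;
      Value *VP = static_cast<Value *>(&C);  // adjusted past the User part
      std::printf("Constant* %p vs Value* %p\n",
                  static_cast<void *>(&C), static_cast<void *>(VP));
    }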
-
Chris Lattner authored
instead of DomTree/DomFrontier. This may be interesting for reducing compile time. This is currently disabled, but seems to work just fine. When this is enabled, we eliminate two runs of dominator frontier, one in the "early per-function" optimizations and one in the "interlaced with inliner" function passes. llvm-svn: 123434
-
Chris Lattner authored
llvm-svn: 123433
-
Jakob Stoklund Olesen authored
This time let's rephrase to trick gcc-4.3 into not miscompiling. llvm-svn: 123432
-
Chris Lattner authored
llvm-gcc-i386-linux-selfhost buildbot heartburn... llvm-svn: 123431
-