- Jan 15, 2011
-
Chris Lattner authored
This fixes rdar://8808586 which observed that we used to compile:

    union xy {
        struct x { _Bool b[15]; } x;
        __attribute__((packed)) struct y {
            __attribute__((packed)) unsigned long b0to7;
            __attribute__((packed)) unsigned int b8to11;
            __attribute__((packed)) unsigned short b12to13;
            __attribute__((packed)) unsigned char b14;
        } y;
    };

    struct x foo(union xy *xy) {
        return xy->x;
    }

into:

    _foo:                               ## @foo
        movq    (%rdi), %rax
        movabsq $1095216660480, %rcx    ## imm = 0xFF00000000
        andq    %rax, %rcx
        movabsq $-72057594037927936, %rdx ## imm = 0xFF00000000000000
        andq    %rax, %rdx
        movzbl  %al, %esi
        orq     %rdx, %rsi
        movq    %rax, %rdx
        andq    $65280, %rdx            ## imm = 0xFF00
        orq     %rsi, %rdx
        movq    %rax, %rsi
        andq    $16711680, %rsi         ## imm = 0xFF0000
        orq     %rdx, %rsi
        movl    %eax, %edx
        andl    $-16777216, %edx        ## imm = 0xFFFFFFFFFF000000
        orq     %rsi, %rdx
        orq     %rcx, %rdx
        movabsq $280375465082880, %rcx  ## imm = 0xFF0000000000
        movq    %rax, %rsi
        andq    %rcx, %rsi
        orq     %rdx, %rsi
        movabsq $71776119061217280, %r8 ## imm = 0xFF000000000000
        andq    %r8, %rax
        orq     %rsi, %rax
        movzwl  12(%rdi), %edx
        movzbl  14(%rdi), %esi
        shlq    $16, %rsi
        orl     %edx, %esi
        movq    %rsi, %r9
        shlq    $32, %r9
        movl    8(%rdi), %edx
        orq     %r9, %rdx
        andq    %rdx, %rcx
        movzbl  %sil, %esi
        shlq    $32, %rsi
        orq     %rcx, %rsi
        movl    %edx, %ecx
        andl    $-16777216, %ecx        ## imm = 0xFFFFFFFFFF000000
        orq     %rsi, %rcx
        movq    %rdx, %rsi
        andq    $16711680, %rsi         ## imm = 0xFF0000
        orq     %rcx, %rsi
        movq    %rdx, %rcx
        andq    $65280, %rcx            ## imm = 0xFF00
        orq     %rsi, %rcx
        movzbl  %dl, %esi
        orq     %rcx, %rsi
        andq    %r8, %rdx
        orq     %rsi, %rdx
        ret

We now compile this into:

    _foo:                               ## @foo
    ## BB#0:                            ## %entry
        movzwl  12(%rdi), %eax
        movzbl  14(%rdi), %ecx
        shlq    $16, %rcx
        orl     %eax, %ecx
        shlq    $32, %rcx
        movl    8(%rdi), %edx
        orq     %rcx, %rdx
        movq    (%rdi), %rax
        ret

A small improvement :-)

llvm-svn: 123520
-
Chris Lattner authored
llvm-svn: 123519
-
Chris Lattner authored
these would try hard to match constants by inverting the bits and recursively matching. There are two problems with this:

  1) some patterns would match when we didn't want them to (theoretical)
  2) this is insanely expensive to do, and most often pointless.

This was apparently useful in just 2 instcombine cases, which I added code to handle explicitly. This change speeds up 'opt' time on 176.gcc by 1% and produces bitwise identical code.

llvm-svn: 123518
-
Chris Lattner authored
no functionality change currently. llvm-svn: 123517
-
Chris Lattner authored
llvm-svn: 123516
-
Chris Lattner authored
means that are about to disappear. llvm-svn: 123515
-
Chris Lattner authored
llvm-svn: 123514
-
Eric Christopher authored
llvm-svn: 123505
-
Chris Lattner authored
to use it. llvm-svn: 123501
-
Bob Wilson authored
llvm-svn: 123497
-
Eric Christopher authored
llvm-svn: 123494
-
- Jan 14, 2011
-
Ted Kremenek authored
llvm-svn: 123491
-
Bob Wilson authored
This is needed to allow an InstAlias for an instruction with an "OptionalDef" result register (like ARM's cc_out) where you want to set the optional register to reg0. llvm-svn: 123490
-
Ted Kremenek authored
llvm-svn: 123487
-
Ted Kremenek authored
declaration and its assignments. Found by clang static analyzer. llvm-svn: 123486
-
Owen Anderson authored
llvm-svn: 123480
-
Dan Gohman authored
comments. llvm-svn: 123479
-
Owen Anderson authored
bitcasts, at least in simple cases. This fixes clang's CodeGenCXX/virtual-base-dtor.cpp llvm-svn: 123477
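For illustration, a hedged C++ analogue of the pattern (the commit itself operates on LLVM IR, where the casts below become pointer bitcasts; the function and variable names here are made up):

    #include <cassert>

    // A store through one pointer type followed by a load through a
    // round-tripped ("bitcast") pointer to the same address: the load
    // can be answered directly from the earlier store.
    int load_after_bitcast() {
      int x;
      int *pi = &x;
      *pi = 7;                            // store through int*
      void *pv = pi;                      // lose the type (a bitcast in IR)
      int *pi2 = static_cast<int *>(pv);  // cast back
      return *pi2;                        // forwarded from the store: 7
    }

    int main() { assert(load_after_bitcast() == 7); }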
-
Anton Korobeynikov authored
Add the ability to switch between CFI-directive-based and table-based frame description emission. Currently all backends use the table-based approach. llvm-svn: 123476
-
Anton Korobeynikov authored
llvm-svn: 123475
-
Anton Korobeynikov authored
llvm-svn: 123474
-
Anton Korobeynikov authored
llvm-svn: 123473
-
Anton Korobeynikov authored
llvm-svn: 123472
-
Andrew Trick authored
disabled in this checkin. Sorry for the large diffs due to refactoring. New functionality is all guarded by EnableSchedCycles. Scheduling the isel DAG is inherently imprecise, but we give it a best effort:

- Added MayReduceRegPressure to allow stalled nodes in the queue only if there is a regpressure need.
- Added BUHasStall to allow checking for either dependence stalls due to latency or resource stalls due to pipeline hazards.
- Added BUCompareLatency to encapsulate and standardize the heuristics for minimizing stall cycles (vs. reducing register pressure).
- Modified the bottom-up heuristic (now in BUCompareLatency) to prioritize nodes by their depth rather than height. As long as it doesn't stall, height is irrelevant. Depth represents the critical path to the DAG root.
- Added hybrid_ls_rr_sort::isReady to filter stalled nodes before adding them to the available queue.

Related cleanup: most of the register reduction routines do not need to be templates.

llvm-svn: 123468
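A hedged sketch of the depth-vs-height point (toy types and names, not the real SUnit/BUCompareLatency code):

    #include <cassert>

    struct Node {
      unsigned Depth;   // longest path to the DAG root (the critical path)
      unsigned Height;  // longest path to a leaf
      bool Stalls;      // would issuing this node stall the pipeline?
    };

    // Return true if A should be scheduled before B: avoid stalls first,
    // then chase the critical path via depth; height only matters through
    // the stall check.
    bool preferByLatency(const Node &A, const Node &B) {
      if (A.Stalls != B.Stalls)
        return !A.Stalls;
      return A.Depth > B.Depth;
    }

    int main() {
      Node A{/*Depth=*/8, /*Height=*/2, /*Stalls=*/false};
      Node B{/*Depth=*/5, /*Height=*/9, /*Stalls=*/false};
      assert(preferByLatency(A, B));  // deeper node wins when neither stalls
    }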
-
Chris Lattner authored
llvm-svn: 123457
-
Chris Lattner authored
"promote a bunch of load and stores" logic, allowing the code to be shared and reused. llvm-svn: 123456
-
Jay Foad authored
llvm-svn: 123452
-
Rafael Espindola authored
llvm-svn: 123447
-
Oscar Fuentes authored
config.h.in. Patch by arrowdodger! llvm-svn: 123445
-
Devang Patel authored
llvm-svn: 123443
-
Duncan Sands authored
simplification present in fully optimized code (I think instcombine fails to transform some of these when "X-Y" has more than one use). Fires here and there all over the test-suite, for example it eliminates 8 subtractions in the final IR for 445.gobmk, 2 subs in 447.dealII, 2 in paq8p etc. llvm-svn: 123442
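If the simplification in question has the shape (X - Y) + Y ==> X (a hedged guess from the description), here is a minimal C++ illustration of why extra uses of X - Y don't block a simplifier, unlike a rewriter such as instcombine:

    #include <cassert>

    int sink;  // gives the subtraction a second use

    int simplified(int x, int y) {
      int t = x - y;  // "X - Y" with more than one use...
      sink = t;
      return t + y;   // ...yet the add still folds to plain x: a
    }                 // simplifier only *returns* x, it rewrites nothing

    int main() { assert(simplified(10, 3) == 10); }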
-
Duncan Sands authored
threading of shifts over selects and phis while there. This fires here and there in the testsuite, to not much effect. For example when compiling spirit it fires 5 times, during early-cse, resulting in 6 more cse simplifications, and 3 more terminators being folded by jump threading, but the final bitcode doesn't change in any interesting way: other optimizations would have caught the opportunity anyway, only later. llvm-svn: 123441
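A hedged C++ illustration of what threading a shift over a select means (hypothetical values; the real work happens on LLVM IR):

    #include <cassert>

    // (c ? A : B) >> N can be "threaded" into c ? (A >> N) : (B >> N),
    // which pays off when both shifted arms fold to constants:
    unsigned threaded(bool c) {
      unsigned v = c ? 32u : 64u;  // a select in IR
      return v >> 5;               // threads to: c ? 1u : 2u
    }

    int main() { assert(threaded(true) == 1 && threaded(false) == 2); }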
-
Duncan Sands authored
llvm-svn: 123440
-
Chris Lattner authored
early in the cleanup code and one late interlaced with the inliner. The second one is important because inlining and other scalar optzns can unpin allocas, allowing them to be split up and promoted. While important for performance, this is also relatively rare, and we would previously force a (non-lazy) computation of DomFrontiers, which happened even if nothing became unpinned. With this patch, the first pass of scalarrepl still promotes the vast bulk of allocas in programs, but the second pass has changed to use SSAUpdater, which is more "sparse" and lazy. This speeds up opt -O3 time on kimwitu++ (a c++ app) by about 1%. The numbers are interesting: the first pass promotes ~17500 allocas. The second pass promotes about 1600. For non-C++ codes, the compile time win should be greater, because the second pass of scalarrepl does less. llvm-svn: 123437
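A hedged C++ sketch of an alloca being "unpinned" by inlining (hypothetical code, just to illustrate why a second, late scalarrepl run pays off):

    static void init(int *p) { *p = 42; }  // takes the address...

    int caller() {
      int tmp;       // pinned: &tmp escapes into the call below,
      init(&tmp);    // but once init() is inlined the store is direct
      return tmp;    // and the second scalarrepl run can promote tmp
    }

    int main() { return caller() == 42 ? 0 : 1; }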
-
Chris Lattner authored
and one that uses SSAUpdater (-scalarrepl-ssa) llvm-svn: 123436
-
Jay Foad authored
static_cast from Constant* to Value* has to adjust the "this" pointer. This is groundwork for PR889. llvm-svn: 123435
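For context, a hedged sketch of the general C++ rule at play (hypothetical class layout, not LLVM's actual hierarchy): under multiple inheritance a base subobject can sit at a nonzero offset, so an upcast must add that offset to the pointer.

    #include <cstdio>

    struct User  { int Ops; };
    struct Value { int ID; };
    struct Constant : User, Value {};  // Value is the *second* base

    int main() {
      Constant C;
      Value *VP = static_cast<Value *>(&C);  // adjusted past the User part
      std::printf("Constant* %p vs Value* %p\n",
                  static_cast<void *>(&C), static_cast<void *>(VP));
    }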
-
Chris Lattner authored
instead of DomTree/DomFrontier. This may be interesting for reducing compile time. This is currently disabled, but seems to work just fine. When this is enabled, we eliminate two runs of dominator frontier, one in the "early per-function" optimizations and one in the "interlaced with inliner" function passes. llvm-svn: 123434
-
Chris Lattner authored
llvm-svn: 123433
-
Jakob Stoklund Olesen authored
This time let's rephrase to trick gcc-4.3 into not miscompiling. llvm-svn: 123432
-
Chris Lattner authored
llvm-gcc-i386-linux-selfhost buildbot heartburn... llvm-svn: 123431
-