- Jan 14, 2011
-
-
Anton Korobeynikov authored
llvm-svn: 123473
-
Anton Korobeynikov authored
llvm-svn: 123472
-
Andrew Trick authored
disabled in this checkin. Sorry for the large diffs due to refactoring. New functionality is all guarded by EnableSchedCycles.

Scheduling the isel DAG is inherently imprecise, but we give it a best effort:
- Added MayReduceRegPressure to allow stalled nodes in the queue only if there is a regpressure need.
- Added BUHasStall to allow checking for either dependence stalls due to latency or resource stalls due to pipeline hazards.
- Added BUCompareLatency to encapsulate and standardize the heuristics for minimizing stall cycles (vs. reducing register pressure).
- Modified the bottom-up heuristic (now in BUCompareLatency) to prioritize nodes by their depth rather than height. As long as it doesn't stall, height is irrelevant. Depth represents the critical path to the DAG root.
- Added hybrid_ls_rr_sort::isReady to filter stalled nodes before adding them to the available queue.

Related Cleanup: most of the register reduction routines do not need to be templates.

llvm-svn: 123468
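To make the new heuristics concrete, here is a minimal C++ sketch of a stall check and a depth-based comparison in the spirit of the list above. The struct and field names are illustrative only, not the actual SUnit/scheduler code, and the tie-breaking direction is simplified:

    // Hypothetical, simplified node record; not the real llvm::SUnit.
    struct SchedNode {
      unsigned Depth;       // longest path back to the DAG root
      unsigned ReadyCycle;  // earliest cycle at which the node can issue
    };

    // BUHasStall-style check (latency only; pipeline-hazard checks omitted):
    // a node stalls if it is not ready at the current cycle.
    static bool hasStall(const SchedNode &N, unsigned CurCycle) {
      return N.ReadyCycle > CurCycle;
    }

    // BUCompareLatency-style ordering: a non-stalling node always beats a
    // stalling one; otherwise order by depth rather than height (the exact
    // tie-breaking in the real heuristic is more involved).
    static bool betterCandidate(const SchedNode &A, const SchedNode &B,
                                unsigned CurCycle) {
      bool StallA = hasStall(A, CurCycle), StallB = hasStall(B, CurCycle);
      if (StallA != StallB)
        return !StallA;
      return A.Depth > B.Depth;
    }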
-
Chris Lattner authored
llvm-svn: 123457
-
Chris Lattner authored
"promote a bunch of load and stores" logic, allowing the code to be shared and reused. llvm-svn: 123456
-
Duncan Sands authored
simplification present in fully optimized code (I think instcombine fails to transform some of these when "X-Y" has more than one use). Fires here and there all over the test-suite; for example, it eliminates 8 subtractions in the final IR for 445.gobmk, 2 subs in 447.dealII, 2 in paq8p, etc. llvm-svn: 123442
-
Duncan Sands authored
threading of shifts over selects and phis while there. This fires here and there in the testsuite, to not much effect. For example when compiling spirit it fires 5 times, during early-cse, resulting in 6 more cse simplifications, and 3 more terminators being folded by jump threading, but the final bitcode doesn't change in any interesting way: other optimizations would have caught the opportunity anyway, only later. llvm-svn: 123441
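As a source-level analogy for the shift threading mentioned above (hypothetical functions; the real transform operates on LLVM IR and only fires when the threaded shifts simplify):

    // (c ? 2 : 4) << 3 can be threaded to c ? (2 << 3) : (4 << 3),
    // and both arms then fold to constants, eliminating the shift.
    unsigned before(bool c) {
      unsigned x = c ? 2u : 4u;
      return x << 3;            // shift of a select
    }

    unsigned after(bool c) {
      return c ? 16u : 32u;     // select of the folded, shifted arms
    }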
-
Chris Lattner authored
and one that uses SSAUpdater (-scalarrepl-ssa) llvm-svn: 123436
-
Jay Foad authored
static_cast from Constant* to Value* has to adjust the "this" pointer. This is groundwork for PR889. llvm-svn: 123435
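The adjustment referred to is ordinary C++ multiple-inheritance behavior; a standalone illustration with made-up classes (not LLVM's actual Constant/Value layout):

    #include <cstdio>

    struct Use   { void *Val = nullptr; };   // some other, non-empty base
    struct Value { int ID = 0; };

    // With more than one base class, a cast to a non-first base must
    // adjust the pointer past the preceding subobject.
    struct Constant : Use, Value {};

    int main() {
      Constant C;
      Constant *CP = &C;
      Value *VP = static_cast<Value *>(CP);  // "this" adjusted past Use
      std::printf("Constant* %p vs Value* %p\n", (void *)CP, (void *)VP);
      // On typical layouts the two addresses differ, so Constant* and
      // Value* cannot be treated as interchangeable raw pointers.
    }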
-
Chris Lattner authored
instead of DomTree/DomFrontier. This may be interesting for reducing compile time. This is currently disabled, but seems to work just fine. When this is enabled, we eliminate two runs of dominator frontier, one in the "early per-function" optimizations and one in the "interlaced with inliner" function passes. llvm-svn: 123434
-
Jakob Stoklund Olesen authored
This time let's rephrase to trick gcc-4.3 into not miscompiling. llvm-svn: 123432
-
Chris Lattner authored
llvm-gcc-i386-linux-selfhost buildbot heartburn... llvm-svn: 123431
-
Chris Lattner authored
llvm-svn: 123427
-
Chris Lattner authored
llvm-svn: 123426
-
Evan Cheng authored
- Fixed :upper16: fix up routine. It should be shifting down the top 16 bits first.
- Added support for Thumb2 :lower16: and :upper16: fix up.
- Added :upper16: and :lower16: relocation support to mach-o object writer.

llvm-svn: 123424
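For reference, the arithmetic the two fixups need, as a tiny standalone sketch (helper names are illustrative, not the MC fixup code):

    #include <cstdint>

    // :upper16: wants the high half, so the value is shifted down by 16
    // first; :lower16: just masks off the low half.
    static inline uint16_t upper16(uint32_t V) { return uint16_t(V >> 16); }
    static inline uint16_t lower16(uint32_t V) { return uint16_t(V & 0xffffu); }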
-
Jakob Stoklund Olesen authored
llvm-svn: 123423
-
Chris Lattner authored
llvm-svn: 123422
-
Chris Lattner authored
they should go *before* the new instruction not after it. llvm-svn: 123420
-
Jakob Stoklund Olesen authored
Fix some callers to better deal with debug values. llvm-svn: 123419
-
Duncan Sands authored
While there, I noticed that the transform "undef >>a X -> undef" was wrong. For example if X is 2 then the top two bits must be equal, so the result cannot be an arbitrary value. I fixed this in the constant folder as well.

Also, I made the transform for "X << undef" stronger: it now folds to undef always, even though X might be zero. This is in accordance with the LangRef, but I must admit that it is fairly aggressive.

Also, I added "i32 X << 32 -> undef" following the LangRef and the constant folder, likewise fairly aggressive.

llvm-svn: 123417
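A quick standalone check of the reasoning about "undef >>a 2": after an arithmetic shift right by 2, the top result bits are copies of the sign bit, so they are always equal and the result cannot be fully unconstrained. This is plain C++, not the constant-folder code; signed right shift is arithmetic on the usual targets and guaranteed so since C++20:

    #include <cstdint>
    #include <cstdio>

    int main() {
      for (int v = 0; v < 256; ++v) {
        int8_t x = int8_t(v);
        int8_t y = int8_t(x >> 2);   // arithmetic shift right by 2
        bool topTwoEqual = ((y >> 7) & 1) == ((y >> 6) & 1);
        if (!topTwoEqual)
          std::printf("counterexample: %d\n", v);   // never reached
      }
      std::puts("bits 7 and 6 of (x >>a 2) are always equal");
    }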
-
Chris Lattner authored
after sext's generated for addressing that got folded. Previously we compiled test5 into:

    _test5:                       ## @test5
    ## BB#0:
        movq    -8(%rsp), %rax    ## 8-byte Reload
        movq    (%rdi,%rax), %rdi
        addq    %rdx, %rdi
        movslq  %esi, %rax
        movq    %rax, -8(%rsp)    ## 8-byte Spill
        movq    %rdi, %rax
        ret

which is insane and wrong. Now we produce:

    _test5:                       ## @test5
    ## BB#0:
        movslq  %esi, %rax
        movq    (%rdi,%rax), %rax
        addq    %rdx, %rax
        ret

llvm-svn: 123414
-
Jakob Stoklund Olesen authored
This approach also works when the terminator doesn't have a slot index. (Which can happen??) llvm-svn: 123413
-
Evan Cheng authored
llvm-svn: 123411
-
Tobias Grosser authored
Add methods for accessing the (single) entry / exit edge of a region. If no such edge exists, null is returned. Both accessors return the start block of the corresponding edge. The edge can finally be formed by utilizing Region::getEntry() or Region::getExit().

Contributed by: Andreas Simbuerger <simbuerg@fim.uni-passau.de>

llvm-svn: 123410
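A sketch of how such accessors would typically be used to reconstruct the edges. The names getEnteringBlock()/getExitingBlock() are assumed here for illustration, since the commit message does not spell them out:

    #include "llvm/Analysis/RegionInfo.h"
    using namespace llvm;

    // Assumed accessors: each returns the start block of the edge, or null
    // when the region has no single entering/exiting edge.
    struct RegionEdges {
      BasicBlock *EnteringSrc, *EntryDst;   // entry edge, if any
      BasicBlock *ExitingSrc, *ExitDst;     // exit edge, if any
    };

    static RegionEdges getRegionEdges(const Region &R) {
      RegionEdges E = {nullptr, nullptr, nullptr, nullptr};
      if (BasicBlock *Entering = R.getEnteringBlock()) {
        E.EnteringSrc = Entering;
        E.EntryDst = R.getEntry();     // edge is Entering -> getEntry()
      }
      if (BasicBlock *Exiting = R.getExitingBlock()) {
        E.ExitingSrc = Exiting;
        E.ExitDst = R.getExit();       // edge is Exiting -> getExit()
      }
      return E;
    }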
-
- Jan 13, 2011
-
-
Owen Anderson authored
Fixes <rdar://problem/8857982>. llvm-svn: 123409
-
Jakob Stoklund Olesen authored
llvm-svn: 123408
-
Chris Lattner authored
llvm-svn: 123406
-
Chris Lattner authored
llvm-svn: 123405
-
Owen Anderson authored
the symbolic immediate names used for these instructions, fixing their pretty-printers, and adding proper encoding information for them. With this, we can properly pretty-print and encode assembly like:

    mrc p15, #0, r3, c13, c0, #3

Fixes <rdar://problem/8857858>.

llvm-svn: 123404
-
Evan Cheng authored
Relax an assertion. On archs like ARM, an immediate field may be scattered. So it's possible for some bits of every 8 bits to be encoded already, and the rest still needs to be fixed up. llvm-svn: 123403
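A standalone sketch of the situation the relaxed assertion has to allow (not the actual MC code): the fixup only needs the bits it wants to write to be clear, rather than the whole byte.

    #include <cstdint>
    #include <cassert>

    // OR fixed-up value bits into an instruction byte that may already carry
    // some encoded bits; only the bits being written must still be zero.
    static void applyFixupByte(uint8_t &InstByte, uint8_t FixupBits) {
      assert((InstByte & FixupBits) == 0 &&
             "fixup would overwrite bits already encoded in the instruction");
      InstByte |= FixupBits;
    }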
-
Jakob Stoklund Olesen authored
llvm-svn: 123400
-
Jakob Stoklund Olesen authored
llvm-svn: 123399
-
Bob Wilson authored
llvm-svn: 123397
-
Bob Wilson authored
llvm-svn: 123396
-
Kevin Enderby authored
directional local labels like 1f and 2b. llvm-svn: 123393
-
Devang Patel authored
llvm-svn: 123389
-
Jim Grosbach authored
set up the source operands. The original instr has an immediate operand that should be replaced with the frame reg operand rather than just adding the reg operand. Previously, the instruction ended up with too many operands causing an assert() when adding the default predicate. rdar://8825456 llvm-svn: 123387
-
Jakob Stoklund Olesen authored
It will still return an iterator that points to the first terminator or end(), but there may be DBG_VALUE instructions following the first terminator. llvm-svn: 123384
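A hedged sketch of what callers now have to tolerate: the instructions following getFirstTerminator() are no longer guaranteed to all be terminators, since DBG_VALUEs can be interleaved after it.

    #include "llvm/CodeGen/MachineBasicBlock.h"
    #include "llvm/CodeGen/MachineInstr.h"
    using namespace llvm;

    // Count the real terminators of a block, skipping any DBG_VALUE
    // instructions that may now appear after the first terminator.
    static unsigned countTerminators(MachineBasicBlock &MBB) {
      unsigned N = 0;
      for (MachineBasicBlock::iterator I = MBB.getFirstTerminator(),
                                       E = MBB.end(); I != E; ++I)
        if (!I->isDebugValue())
          ++N;
      return N;
    }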
-
Bob Wilson authored
llvm-svn: 123383
-
Bob Wilson authored
This is a minor extension of SROA to handle a special case that is important for some ARM NEON operations. Some of the NEON intrinsics return multiple values, which are handled as struct types containing multiple elements of the same vector type. The corresponding return types declared in the arm_neon.h header have equivalent arrays. We need SROA to recognize that it can split up those arrays and structs into separate vectors, even though they are not always accessed with the same type. SROA already handles loads and stores of an entire alloca by using insertvalue/extractvalue to access the individual pieces, and that code works the same regardless of whether the type is a struct or an array. So, all that needs to be done is to check for compatible arrays and homogeneous structs. llvm-svn: 123381
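A rough C++ analogue of the kind of aggregate involved, using the Clang/GCC vector extension as a stand-in for the NEON types from arm_neon.h (illustrative only, not the intrinsics themselves):

    // A homogeneous struct whose members are the same vector type, similar
    // in shape to a NEON multi-vector return type such as float32x4x2_t.
    typedef float v4f32 __attribute__((vector_size(16)));

    struct f32x4x2 {
      v4f32 val[2];   // an array of two identical vector elements
    };

    // Each half is used independently, so once the aggregate is split up
    // (the SROA case described above) nothing needs to stay in memory.
    v4f32 sum_halves(const f32x4x2 &p) {
      return p.val[0] + p.val[1];
    }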
-