- May 26, 2011
Stuart Hastings authored
llvm-svn: 132108
Stuart Hastings authored
rdar://problem/6920088 llvm-svn: 132105
- May 24, 2011
Evan Cheng authored
Teach X86 cmov optimization to eliminate the cmov from ctlz, cttz extension when the source of X86ISD::BSR / X86ISD::BSF is proven to be non-zero. rdar://9490949 llvm-svn: 131948
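A minimal C++ illustration of the pattern this commit targets (function names are hypothetical): BSR/BSF leave their result undefined for a zero input, so a plain ctlz needs a CMOV to select the fallback value; once the input is provably non-zero, that CMOV can be dropped.

    #include <cstdint>

    // BSR is undefined for 0, so the generic lowering guards the zero
    // case, typically with a CMOV.
    unsigned clz_guarded(uint32_t x) {
        return x ? __builtin_clz(x) : 32;
    }

    // Here the source of the BSR is provably non-zero (x | 1), so the
    // optimization described above lets codegen drop the CMOV entirely.
    unsigned clz_nonzero(uint32_t x) {
        return __builtin_clz(x | 1);
    }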
- May 20, 2011
Chad Rosier authored
llvm-svn: 131709
- May 19, 2011
Eric Christopher authored
Fixes rdar://9218925 Fixes PR9601 llvm-svn: 131682
- May 18, 2011
Chad Rosier authored
Enables vararg functions that pass all arguments via registers to be optimized into tail calls when possible. llvm-svn: 131560
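A hedged sketch of the situation this enables (names illustrative): a call to a variadic function whose arguments all travel in registers can now be emitted as a tail call.

    #include <cstdarg>

    // A variadic callee (hypothetical).
    int vsum(int n, ...) {
        va_list ap;
        va_start(ap, n);
        int s = 0;
        for (int i = 0; i < n; ++i)
            s += va_arg(ap, int);
        va_end(ap);
        return s;
    }

    // Every argument to vsum fits in registers and the call sits in tail
    // position, so it can become a jmp rather than a call/ret pair.
    int wrapper(int a, int b) {
        return vsum(2, a, b);
    }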
- May 17, 2011
Eli Friedman authored
llvm-svn: 131471
Stuart Hastings authored
llvm-svn: 131469
Stuart Hastings authored
passed as the fifth parameter, ensure it's passed correctly (in R9). rdar://problem/6920088 llvm-svn: 131467
Nadav Rotem authored
Fix a bug in PerformEXTRACT_VECTOR_ELTCombine. The code created an ADD SDNode whose two operands had different types, in cases where the index and the pointer had different types. llvm-svn: 131461
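The shape of that fix in plain, standalone C++ terms (an analogy, not the actual SelectionDAG code): the two operands of an address addition must have the same width, so a narrower index gets extended to pointer width first.

    #include <cstdint>

    // A 32-bit index added to a 64-bit pointer: the index has to be
    // widened before the addition so both operands share a type, which is
    // what the combine now guarantees at the SDNode level.
    float load_lane(const float *base, uint32_t idx) {
        uint64_t wide = static_cast<uint64_t>(idx);  // match pointer width
        return base[wide];
    }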
- May 16, 2011
Eli Friedman authored
llvm-svn: 131424
- May 11, 2011
Nadav Rotem authored
Add custom lowering of X86 vector SRA/SRL/SHL when the shift amount is a splat vector. llvm-svn: 131179
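A sketch of source that gives rise to such shifts (assuming the loop vectorizes): every lane is shifted by the same run-time amount, so the shift-amount operand becomes a splat vector and can map onto a single x86 vector shift.

    // Each element is shifted by the same amount s; vectorized, the shift
    // amount is a splat vector, which the custom lowering can select as
    // one vector shift instead of scalarizing.
    void shift_left(int *a, int n, int s) {
        for (int i = 0; i < n; ++i)
            a[i] <<= s;
    }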
- May 06, 2011
Eli Friedman authored
llvm-svn: 131012
- Apr 20, 2011
Daniel Dunbar authored
triple component. llvm-svn: 129838
- Apr 19, 2011
Daniel Dunbar authored
llvm-svn: 129813
- Apr 15, 2011
Chris Lattner authored
Luis Felipe Strano Moraes! llvm-svn: 129558
- Mar 31, 2011
Evan Cheng authored
llvm-svn: 128586
- Mar 26, 2011
Benjamin Kramer authored
llvm-svn: 128338
- Mar 24, 2011
NAKAMURA Takumi authored
FIXME: Some cleanups would be needed. llvm-svn: 128206
Andrew Trick authored
I'm backing this out for the second time. It was supposed to be fixed by r128164, but the mingw self-host must be defeating the fix. llvm-svn: 128181
- Mar 23, 2011
Andrew Trick authored
(target-specific branchless method for double-width relational comparisons on x86) llvm-svn: 128175
- Mar 21, 2011
Evan Cheng authored
Re-apply r127953 with fixes: eliminate the empty return block if it has no predecessors; update the dominator tree if the CFG is modified. llvm-svn: 127981
- Mar 19, 2011
Daniel Dunbar authored
to canonicalize IR", it broke a lot of things. llvm-svn: 127954
Evan Cheng authored
to have single return block (at least getting there) for optimizations. This is general goodness but it would prevent some tailcall optimizations. One specific case is code like this:

    int f1(void);
    int f2(void);
    int f3(void);
    int f4(void);
    int f5(void);
    int f6(void);

    int foo(int x) {
      switch(x) {
        case 1: return f1();
        case 2: return f2();
        case 3: return f3();
        case 4: return f4();
        case 5: return f5();
        case 6: return f6();
      }
    }

=>

    LBB0_2:   ## %sw.bb
      callq   _f1
      popq    %rbp
      ret
    LBB0_3:   ## %sw.bb1
      callq   _f2
      popq    %rbp
      ret
    LBB0_4:   ## %sw.bb3
      callq   _f3
      popq    %rbp
      ret

This patch teaches codegenprep to duplicate returns when the return value is a phi and where the phi operands are produced by tail calls followed by an unconditional branch:

    sw.bb7:                           ; preds = %entry
      %call8 = tail call i32 @f5() nounwind
      br label %return
    sw.bb9:                           ; preds = %entry
      %call10 = tail call i32 @f6() nounwind
      br label %return
    return:
      %retval.0 = phi i32 [ %call10, %sw.bb9 ], [ %call8, %sw.bb7 ], ... [ 0, %entry ]
      ret i32 %retval.0

This allows codegen to generate better code like this:

    LBB0_2:   ## %sw.bb
      jmp _f1   ## TAILCALL
    LBB0_3:   ## %sw.bb1
      jmp _f2   ## TAILCALL
    LBB0_4:   ## %sw.bb3
      jmp _f3   ## TAILCALL

rdar://9147433 llvm-svn: 127953
Nadav Rotem authored
not have native support for this operation (such as X86). The legalized code uses two vector INT_TO_FP operations and is faster than scalarizing. llvm-svn: 127951
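A hedged example of source that reaches this path: converting integer vectors to floating point on a target lacking a native conversion at that width now becomes two vector INT_TO_FP operations rather than one scalar conversion per element.

    // Vectorized, the conversion below is a vector INT_TO_FP; where no
    // single native instruction exists, the legalizer now splits it into
    // two vector conversions instead of scalarizing.
    void int_to_float(const int *in, float *out, int n) {
        for (int i = 0; i < n; ++i)
            out[i] = static_cast<float>(in[i]);
    }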
- Mar 18, 2011
Eli Friedman authored
llvm-svn: 127909
Eli Friedman authored
comparisons on x86. Essentially, the way this works is that SUB+SBB sets the relevant flags the same way a double-width CMP would. This is a substantial improvement over the generic lowering in LLVM. The output is also shorter than the gcc-generated output; I haven't done any detailed benchmarking, though. llvm-svn: 127852
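A minimal example of the comparison in question: on 32-bit x86, comparing two 64-bit integers is a double-width comparison. The lowering described above subtracts the low words and then SBBs the high words, leaving the flags exactly as a 64-bit CMP would, with no branch.

    #include <cstdint>

    // On a 32-bit target this is a double-width relational comparison;
    // the new lowering emits SUB on the low halves and SBB on the high
    // halves, then reads the flags, instead of branching on the high word.
    bool less_than(uint64_t a, uint64_t b) {
        return a < b;
    }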
- Mar 17, 2011
Cameron Zwarich authored
llvm-svn: 127809
Cameron Zwarich authored
llvm-svn: 127807
- Mar 16, 2011
Cameron Zwarich authored
rather than an int. Thankfully, this only causes LLVM to miss optimizations, not generate incorrect code. This just fixes the zext at the return. We still insert an i32 ZextAssert when reading a function's arguments, but it is followed by a truncate and another i8 ZextAssert so it is not optimized. llvm-svn: 127766
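For context, a hedged sketch of the kind of function involved: an i8/bool return value is zero-extended per the ABI, and recording that extension with the correct type (i8 rather than i32) lets a caller fold away its own redundant zext.

    // The bool result is zero-extended at the return; modeling that zext
    // with the right width allows a caller's widening of the result to be
    // optimized away.
    bool is_even(int x) {
        return (x & 1) == 0;
    }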
- Mar 11, 2011
Eric Christopher authored
corresponding testcases back to the previous versions. Fixes some performance regressions only seen on 32-bit. llvm-svn: 127441
- Mar 10, 2011
Stuart Hastings authored
llvm-svn: 127382
- Mar 09, 2011
NAKAMURA Takumi authored
llvm-svn: 127328
- Mar 08, 2011
Benjamin Kramer authored
Found by inspection. llvm-svn: 127247
Eric Christopher authored
testcases accordingly. Some are currently xfailed and will be filed as bugs to be fixed or understood.

Performance results:
- roughly neutral on SPEC
- some micro benchmarks in the llvm suite are up between 100 and 150%; only a pair of regressions, which are due to be investigated
- john-the-ripper saw:
  - 10% improvement in traditional DES
  - 8% improvement in BSDI DES
  - 59% improvement in FreeBSD MD5
  - 67% improvement in OpenBSD Blowfish
  - 14% improvement in LM DES

Small compile time impact. llvm-svn: 127208
- Mar 07, 2011
Cameron Zwarich authored
llvm-svn: 127175
- Mar 05, 2011
Andrew Trick authored
regs. This is the only change in this checkin that may affect the default scheduler. With better register tracking and heuristics, it doesn't make sense to artificially lower the register limit so much.

Added -sched-high-latency-cycles and X86InstrInfo::isHighLatencyDef to give the scheduler a way to account for div and sqrt on targets that don't have an itinerary. It currently defaults to 10 (the actual number doesn't matter much) but only takes effect on the non-default schedulers: list-hybrid and list-ilp.

Added several heuristics that can be individually disabled for the non-default sched=list-ilp mode. This helps us determine how much better we can do on a given benchmark than the default scheduler. Certain compute-intensive loops run much faster in this mode with the right set of heuristics, and it doesn't seem to have much negative impact elsewhere. Not all of the heuristics are needed, but we still need to experiment to decide which should be disabled by default for sched=list-ilp. llvm-svn: 127067