- Mar 24, 2011
-
-
Andrew Trick authored
I'm backing this out for the second time. It was supposed to be fixed by r128164, but the mingw self-host must be defeating the fix. llvm-svn: 128181
-
- Mar 23, 2011
-
-
Andrew Trick authored
(target-specific branchless method for double-width relational comparisons on x86) llvm-svn: 128175
-
- Mar 22, 2011
-
-
Dan Gohman authored
outside of the current basic block. This fixes PR9500, rdar://9156159. llvm-svn: 128041
-
- Mar 21, 2011
-
-
Bill Wendling authored
the alias of an InstAlias instead of the thing being aliased. Because we need to know the features that are valid for an InstAlias. This is part of a work-in-progress. llvm-svn: 127986
-
Evan Cheng authored
Re-apply r127953 with fixes: eliminate empty return block if it has no predecessors; update dominator tree if cfg is modified. llvm-svn: 127981
-
- Mar 19, 2011
-
-
Daniel Dunbar authored
to canonicalize IR", it broke a lot of things. llvm-svn: 127954
-
Evan Cheng authored
to have single return block (at least getting there) for optimizations. This is general goodness but it would prevent some tailcall optimizations. One specific case is code like this:

    int f1(void);
    int f2(void);
    int f3(void);
    int f4(void);
    int f5(void);
    int f6(void);
    int foo(int x) {
      switch(x) {
      case 1: return f1();
      case 2: return f2();
      case 3: return f3();
      case 4: return f4();
      case 5: return f5();
      case 6: return f6();
      }
    }

=>

    LBB0_2: ## %sw.bb
      callq _f1
      popq %rbp
      ret
    LBB0_3: ## %sw.bb1
      callq _f2
      popq %rbp
      ret
    LBB0_4: ## %sw.bb3
      callq _f3
      popq %rbp
      ret

This patch teaches codegenprep to duplicate returns when the return value is a phi and where the phi operands are produced by tail calls followed by an unconditional branch:

    sw.bb7: ; preds = %entry
      %call8 = tail call i32 @f5() nounwind
      br label %return
    sw.bb9: ; preds = %entry
      %call10 = tail call i32 @f6() nounwind
      br label %return
    return:
      %retval.0 = phi i32 [ %call10, %sw.bb9 ], [ %call8, %sw.bb7 ], ... [ 0, %entry ]
      ret i32 %retval.0

This allows codegen to generate better code like this:

    LBB0_2: ## %sw.bb
      jmp _f1 ## TAILCALL
    LBB0_3: ## %sw.bb1
      jmp _f2 ## TAILCALL
    LBB0_4: ## %sw.bb3
      jmp _f3 ## TAILCALL

rdar://9147433 llvm-svn: 127953
-
Nadav Rotem authored
not have native support for this operation (such as X86). The legalized code uses two vector INT_TO_FP operations and is faster than scalarizing. llvm-svn: 127951
-
- Mar 18, 2011
-
-
Eli Friedman authored
llvm-svn: 127909
-
Joerg Sonnenberger authored
For now, only the default segments are supported. llvm-svn: 127875
-
Eli Friedman authored
comparisons on x86. Essentially, the way this works is that SUB+SBB sets the relevant flags the same way a double-width CMP would. This is a substantial improvement over the generic lowering in LLVM. The output is also shorter than the gcc-generated output; I haven't done any detailed benchmarking, though. llvm-svn: 127852
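The flag behaviour this entry relies on can be modelled in portable C++. The sketch below is only an illustration, not the lowering code from the commit: `U128`, `borrowAfterSub`, `borrowAfterSbb` and `ultDoubleWidth` are hypothetical names, and only the unsigned less-than case is shown. It tracks the borrow (the x86 carry flag) through a SUB on the low halves and an SBB on the high halves, and the final borrow is set exactly when the double-width value `a` is below `b`.

```cpp
#include <cstdint>

// Hypothetical 128-bit value split into two 64-bit halves.
struct U128 {
    uint64_t lo;
    uint64_t hi;
};

// Borrow (carry flag) produced by SUB a, b: set when a < b.
constexpr bool borrowAfterSub(uint64_t a, uint64_t b) {
    return a < b;
}

// Borrow produced by SBB a, b with an incoming borrow:
// set when a < b + borrowIn, written so that nothing overflows.
constexpr bool borrowAfterSbb(uint64_t a, uint64_t b, bool borrowIn) {
    return a < b || (a == b && borrowIn);
}

// Unsigned double-width a < b: SUB the low halves, SBB the high halves,
// then read the final borrow, just as a single wide CMP would set it.
constexpr bool ultDoubleWidth(U128 a, U128 b) {
    return borrowAfterSbb(a.hi, b.hi, borrowAfterSub(a.lo, b.lo));
}
```

No branch is needed: the result falls out of the borrow chain, which is what makes the approach attractive compared with comparing the high halves and then conditionally comparing the low halves.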
-
- Mar 17, 2011
-
-
Cameron Zwarich authored
llvm-svn: 127809
-
Cameron Zwarich authored
llvm-svn: 127807
-
Eli Friedman authored
llvm-svn: 127786
-
- Mar 16, 2011
-
-
Cameron Zwarich authored
rather than an int. Thankfully, this only causes LLVM to miss optimizations, not generate incorrect code. This just fixes the zext at the return. We still insert an i32 ZextAssert when reading a function's arguments, but it is followed by a truncate and another i8 ZextAssert so it is not optimized. llvm-svn: 127766
-
- Mar 15, 2011
-
-
Sean Callanan authored
in the instruction tables and fixed a few bugs that were causing decode conflicts. Rudimentary tests are coming up in the next patch. llvm-svn: 127646
-
Sean Callanan authored
instruction set. This code adds support for the VEX prefix and for the YMM registers accessible on AVX-enabled architectures. Instruction table support that enables AVX instructions for the disassembler is in an upcoming patch. llvm-svn: 127644
-
- Mar 11, 2011
-
-
Eric Christopher authored
corresponding testcases back to the previous versions. Fixes some performance regressions only seen on 32-bit. llvm-svn: 127441
-
- Mar 10, 2011
-
-
Stuart Hastings authored
llvm-svn: 127382
-
Evan Cheng authored
llvm-svn: 127380
-
Evan Cheng authored
llvm-svn: 127376
-
- Mar 09, 2011
-
-
Evan Cheng authored
flexible. If it returns a register class that's different from the input, then that's the register class used for cross-register class copies. If it returns a register class that's the same as the input, then no cross-register class copies are needed (normal copies would do). If it returns null, then it's not at all possible to copy registers of the specified register class. llvm-svn: 127368
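The three outcomes described in this entry can be sketched as a small classifier. The hook's name is not visible in the entry, so everything named below (`RegClass`, `CopyKind`, `classifyCopy`, and the two sample classes) is hypothetical and only models how a caller might read the returned pointer.

```cpp
// Hypothetical stand-in for a target register class.
struct RegClass {
    const char *name;
};

// How a caller could interpret the hook's return value.
enum class CopyKind {
    Normal,      // hook returned the input class: ordinary copies suffice
    CrossClass,  // hook returned a different class: copy through that class
    Impossible   // hook returned null: registers of this class cannot be copied
};

// Classify the result of the (unnamed) hook for a given input class.
constexpr CopyKind classifyCopy(const RegClass *input, const RegClass *hookResult) {
    return hookResult == nullptr ? CopyKind::Impossible
         : hookResult == input   ? CopyKind::Normal
                                 : CopyKind::CrossClass;
}

// Illustrative register classes; these names are assumptions, not LLVM's.
constexpr RegClass kFlagsClass{"FLAGS"};
constexpr RegClass kGprClass{"GPR"};
```

For example, a caller holding `&kFlagsClass` that gets `&kGprClass` back would route the copy through a general-purpose register, while a null result tells it to give up on copying.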
-
Benjamin Kramer authored
llvm-svn: 127365
-
-
Jan Sjödin authored
Add createELFObjectTargetWriter method to TargetAsmBackend, which enables construction of non-standard ELFObjectWriters that can be used in MCJIT. llvm-svn: 127346
-
NAKAMURA Takumi authored
llvm-svn: 127328
-
- Mar 08, 2011
-
-
Benjamin Kramer authored
Found by inspection. llvm-svn: 127247
-
Eric Christopher authored
testcases accordingly. Some are currently xfailed and will be filed as bugs to be fixed or understood.

Performance results:
- roughly neutral on SPEC
- some micro benchmarks in the llvm suite are up between 100 and 150%, only a pair of regressions that are due to be investigated
- john-the-ripper saw:
  - 10% improvement in traditional DES
  - 8% improvement in BSDI DES
  - 59% improvement in FreeBSD MD5
  - 67% improvement in OpenBSD Blowfish
  - 14% improvement in LM DES

Small compile time impact. llvm-svn: 127208
-
- Mar 07, 2011
-
-
Cameron Zwarich authored
llvm-svn: 127175
-
- Mar 05, 2011
-
-
Andrew Trick authored
regs. This is the only change in this checkin that may affect the default scheduler. With better register tracking and heuristics, it doesn't make sense to artificially lower the register limit so much. Added -sched-high-latency-cycles and X86InstrInfo::isHighLatencyDef to give the scheduler a way to account for div and sqrt on targets that don't have an itinerary. It currently defaults to 10 (the actual number doesn't matter much), but only takes effect on non-default schedulers: list-hybrid and list-ilp. Added several heuristics that can be individually disabled for the non-default sched=list-ilp mode. This helps us determine how much better we can do on a given benchmark than the default scheduler. Certain compute-intensive loops run much faster in this mode with the right set of heuristics, and it doesn't seem to have much negative impact elsewhere. Not all of the heuristics are needed, but we still need to experiment to decide which should be disabled by default for sched=list-ilp. llvm-svn: 127067
-
Andrew Trick authored
llvm-svn: 127065
-
- Mar 04, 2011
-
-
Eli Friedman authored
llvm-svn: 126970
-
- Mar 03, 2011
-
-
Tilmann Scheller authored
llvm-svn: 126934
-
- Mar 02, 2011
-
-
Tilmann Scheller authored
llvm-svn: 126862
-
David Greene authored
missing patterns for them. Add a SIMD test subdirectory to hold tests for SIMD instruction selection correctness and quality. llvm-svn: 126845
-
- Mar 01, 2011
-
-
Duncan Sands authored
llvm-svn: 126780
-
- Feb 28, 2011
-
-
Chris Lattner authored
llvm-svn: 126682
-
David Greene authored
[AVX] Add decode support for VUNPCKLPS/D instructions, both 128-bit and 256-bit forms. Because the number of elements in a vector does not determine the vector type (4 elements could be v4f32 or v4f64), pass the full type of the vector to decode routines. llvm-svn: 126664
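To illustrate why the element count alone is not enough, here is a minimal sketch of an unpack-low shuffle-mask decoder. It is not the routine added by this commit: `decodeUnpcklMask` and its parameters are hypothetical, and it assumes the AVX behaviour that 256-bit unpack instructions operate independently within each 128-bit lane. Indices below `numElts` select from the first source, and indices from `numElts` upward select from the second.

```cpp
#include <cassert>
#include <vector>

// Decode an unpack-low shuffle mask for a vector of numElts elements,
// each eltBits wide. Total width is assumed to be 128 or 256 bits.
std::vector<int> decodeUnpcklMask(unsigned numElts, unsigned eltBits) {
    const unsigned totalBits = numElts * eltBits;
    assert(totalBits == 128 || totalBits == 256);

    const unsigned lanes = totalBits / 128;    // 128-bit lanes in the vector
    const unsigned perLane = numElts / lanes;  // elements in each lane

    std::vector<int> mask;
    for (unsigned lane = 0; lane < lanes; ++lane) {
        const unsigned base = lane * perLane;
        // Interleave the low half of this lane from both sources.
        for (unsigned i = 0; i < perLane / 2; ++i) {
            mask.push_back(static_cast<int>(base + i));            // first source
            mask.push_back(static_cast<int>(base + i + numElts));  // second source
        }
    }
    return mask;
}
```

With four elements, `decodeUnpcklMask(4, 32)` (v4f32, one lane) and `decodeUnpcklMask(4, 64)` (v4f64, two lanes) produce different masks, which is the ambiguity the entry describes.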
-
- Feb 27, 2011
-
-
Benjamin Kramer authored
llvm-svn: 126578
-
NAKAMURA Takumi authored
Target/X86: Always emit "push/pop GPRs" in prologue/epilogue and emit "spill/reload frames" for XMMs. It improves Win64's prologue/epilogue but it would not affect ia32 and amd64 (lack of nonvolatile XMMs). llvm-svn: 126568
-