- Jun 08, 2012
-
-
Hal Finkel authored
As Chris points out, this can now be removed! TODO: check if the associated section on viterbi's inner loop can also be removed. llvm-svn: 158224
-
Hal Finkel authored
Thanks to Jakob's help, this now causes no new test suite failures! Over the entire test suite, this gives an average 1% speedup. The largest speedups are: SingleSource/Benchmarks/Misc/pi - 108% SingleSource/Benchmarks/CoyoteBench/lpbench - 54% MultiSource/Benchmarks/Prolangs-C/unix-smail/unix-smail - 50% SingleSource/Benchmarks/Shootout/ary3 - 32% SingleSource/Benchmarks/Shootout-C++/matrix - 30% The largest slowdowns are: MultiSource/Benchmarks/mediabench/gsm/toast/toast - -30% MultiSource/Benchmarks/Prolangs-C/bison/mybison - -25% MultiSource/Benchmarks/BitBench/uuencode/uuencode - -22% MultiSource/Applications/d/make_dparser - -14% SingleSource/Benchmarks/Shootout-C++/ary - -13% In light of these slowdowns, additional profiling work is obviously needed! llvm-svn: 158223
-
Hal Finkel authored
Marking these classes as non-alocatable allows CTR loop generation to work correctly with the block placement passes, etc. These register classes are currently used only by some unused TCRETURN patterns. In future cleanup, these will be removed. Thanks again to Jakob for suggesting this fix to the CTR loop problem! llvm-svn: 158221
-
Manman Ren authored
llvm-svn: 158220
-
Andrew Trick authored
Bulk move of TargetInstrInfo implementation into TargetInstrInfoImpl. This is dirty because the code isn't part of TargetInstrInfoImpl class, nor should it be, because the methods are not target hooks. However, it's the current mechanism for keeping libTarget useful outside the backend. You'll get a not-so-nice link error if you invoke a TargetInstrInfo method that depends on CodeGen. The TargetInstrInfoImpl class should probably be removed since it doesn't really solve this problem. To really fix this, we probably need separate interfaces for the CodeGen/nonCodeGen sides of TargetInstrInfo. llvm-svn: 158212
-
Hal Finkel authored
The pass itself works well, but the something in the Machine* infrastructure does not understand terminators which define registers. Without the ability to use the block-placement pass, etc. this causes performance regressions (and so is turned off by default). Turning off the analysis turns off the problems with the Machine* infrastructure. llvm-svn: 158206
-
Hal Finkel authored
The code which tests for an induction operation cannot assume that any ADDI instruction will have a register operand because the operand could also be a frame index; for example: %vreg16<def> = ADDI8 <fi#0>, 0; G8RC:%vreg16 llvm-svn: 158205
-
Hal Finkel authored
Add the PPCCTRLoops pass: a PPC machine-code-level optimization pass to form CTR-based loop branching code. This pass is derived from the Hexagon HardwareLoops pass. The only significant enhancement over the Hexagon pass is that PPCCTRLoops will also attempt to delete the replaced add and compare operations if they are no longer otherwise used. Also, invalid preheader DebugLoc is not used. llvm-svn: 158204
-
Manman Ren authored
This patch will generate the following for integer ABS: movl %edi, %eax negl %eax cmovll %edi, %eax INSTEAD OF movl %edi, %ecx sarl $31, %ecx leal (%rdi,%rcx), %eax xorl %ecx, %eax There exists a target-independent DAG combine for integer ABS, which converts integer ABS to sar+add+xor. For X86, we match this pattern back to neg+cmov. This is implemented in PerformXorCombine. rdar://10695237 llvm-svn: 158175
-
- Jun 07, 2012
-
-
Nadav Rotem authored
Do not optimize the used bits of the x86 vselect condition operand, when the condition operand is a vector of 1-bit predicates. This may happen on MIC devices. llvm-svn: 158168
-
Andrew Trick authored
llvm-svn: 158164
-
Andrew Trick authored
Match expectations of the new latency API. Cleanup and make the logic consistent. llvm-svn: 158163
-
Andrew Trick authored
llvm-svn: 158162
-
Andrew Trick authored
llvm-svn: 158161
-
Manman Ren authored
It will cause assertion failure later on. llvm-svn: 158160
-
Rafael Espindola authored
Fixes pr13048. llvm-svn: 158158
-
Manman Ren authored
This patch will optimize the following movq %rdi, %rax subq %rsi, %rax cmovsq %rsi, %rdi movq %rdi, %rax to cmpq %rsi, %rdi cmovsq %rsi, %rdi movq %rdi, %rax Perform this optimization if the actual result of SUB is not used. rdar: 11540023 llvm-svn: 158126
-
Manman Ren authored
The commit is intended to fix rdar://11540023. It is implemented as part of peephole optimization. We can actually implement this in the SelectionDAG lowering phase. llvm-svn: 158122
-
- Jun 06, 2012
-
-
Benjamin Kramer authored
LLVM is now -Wunused-private-field clean except for - lib/MC/MCDisassembler/Disassembler.h. Not sure why it keeps all those unaccessible fields. - gtest. llvm-svn: 158096
-
Benjamin Kramer authored
There are some that I didn't remove this round because they looked like obvious stubs. There are dead variables in gtest too, they should be fixed upstream. llvm-svn: 158090
-
Chad Rosier authored
X86. rdar://11496434 llvm-svn: 158087
-
Richard Barton authored
llvm-svn: 158055
-
Craig Topper authored
llvm-svn: 158049
-
- Jun 05, 2012
-
-
Andrew Trick authored
Minimum latency determines per-cycle scheduling groups. Expected latency determines critical path and cost. llvm-svn: 158021
-
Yuan Lin authored
llvm-svn: 158013
-
Roman Divacky authored
llvm-svn: 158004
-
Andrew Trick authored
llvm-svn: 157981
-
Andrew Trick authored
llvm-svn: 157980
-
Andrew Trick authored
This allows a subtarget to explicitly specify the issue width and other properties without providing pipeline stage details for every instruction. llvm-svn: 157979
-
Andrew Trick authored
llvm-svn: 157976
-
Joel Jones authored
llvm-svn: 157972
-
Joel Jones authored
when a compile time constant is known. This occurs when implicitly zero extending function arguments from 16 bits to 32 bits. <rdar://problem/11481151> llvm-svn: 157966
-
- Jun 04, 2012
-
-
Akira Hatanaka authored
inserted after the shift-left-logical node. llvm-svn: 157937
-
Roman Divacky authored
llvm-svn: 157935
-
Hans Wennborg authored
This was mostly done already in r156162, but I missed one place. llvm-svn: 157929
-
Hans Wennborg authored
llvm-svn: 157920
-
Craig Topper authored
llvm-svn: 157917
-
Craig Topper authored
Add VFMADDSUB and VFMSUBADD FMA instructions to folding tables. Also add 213 forms of scalar FMA instructions. llvm-svn: 157914
-
Hal Finkel authored
llvm-svn: 157912
-
Hal Finkel authored
It seems that this no longer causes test suite failures on PPC64 (after r157159), and often gives a performance benefit, so it can be enabled by default. llvm-svn: 157911
-