- Jun 20, 2012
-
-
Andrew Trick authored
-stable-loops enables a new algorithm for generating the Loop forest. It differs from the original algorithm in a few respects: - Not determined by use-list order. - Initially guarantees RPO order of block and subloops. - Linear in the number of CFG edges. - Nonrecursive. I didn't want to change the LoopInfo API yet, so the block lists are still inclusive. This seems strange to me, and it means that building LoopInfo is not strictly linear, but it may not be a problem in practice. At least the block lists start out in RPO order now. In the future we may add an attribute or wrapper analysis that allows other passes to assume RPO order. The primary motivation of this work was not to optimize LoopInfo, but to allow reproducing performance issues by decomposing the compilation stages. I'm often unable to do this with the current LoopInfo, because the loop tree order determines Loop pass order. Serializing the IR tends to invert the order, which reverses the optimization order. This makes it nearly impossible to debug interdependent loop optimizations such as LSR. I also believe this will provide more stable performance results across time. llvm-svn: 158790
-
Francois Pichet authored
llvm-svn: 158788
-
Andrew Trick authored
The implementation only needs inclusion from LoopInfo.cpp and MachineLoopInfo.cpp. Clients of the interface should only include the interface. This makes the interface readable and speeds up rebuilds after modifying the implementation. llvm-svn: 158787
-
Nick Kledzik authored
Add permissions(), map_file_pages(), and unmap_file_pages() to llvm::sys::fs and add unit test. Unix is implemented. Windows side needs to be implemented. llvm-svn: 158770
-
Kaelyn Uhrain authored
llvm::RawMemoryObject handles empty ranges just fine, and the assert can be triggered in the wild by e.g. invoking clang with a file that included an empty pre-compiled header file when clang has been built with assertions enabled. Without assertions enabled, clang will properly report that the empty file is not a valid PCH. llvm-svn: 158769
-
Jakob Stoklund Olesen authored
When LiveIntervals is tracking fixed interference in regunits, make sure to update those intervals as well. Currently guarded by -live-regunits. llvm-svn: 158766
-
Chad Rosier authored
llvm-svn: 158762
-
Chad Rosier authored
ensureAlignment() in MachineFunction). Also, drop setMaxAlignment() in favor of this new function. This creates a main entry point to setting MaxAlignment, which will be helpful for future work. No functionality change intended. llvm-svn: 158758
-
Lang Hames authored
This patch adds DAG combines to form FMAs from pairs of FADD + FMUL or FSUB + FMUL. The combines are performed when: (a) Either AllowExcessFPPrecision option (-enable-excess-fp-precision for llc) OR UnsafeFPMath option (-enable-unsafe-fp-math) are set, and (b) TargetLoweringInfo::isFMAFasterThanMulAndAdd(VT) is true for the type of the FADD/FSUB, and (c) The FMUL only has one user (the FADD/FSUB). If your target has fast FMA instructions you can make use of these combines by overriding TargetLoweringInfo::isFMAFasterThanMulAndAdd(VT) to return true for types supported by your FMA instruction, and adding patterns to match ISD::FMA to your FMA instructions. llvm-svn: 158757
-
Jakob Stoklund Olesen authored
llvm-svn: 158755
-
- Jun 19, 2012
-
-
Jakob Stoklund Olesen authored
The PPC::EXTSW instruction preserves the low 32 bits of its input, just like some of the x86 instructions. Use it to reduce register pressure when the low 32 bits have multiple uses. This requires a small change to PeepholeOptimizer since EXTSW takes a 64-bit input register. This is related to PR5997. llvm-svn: 158743
-
Jakob Stoklund Olesen authored
No functional change. llvm-svn: 158742
-
Chandler Carruth authored
StringMap suffered from the same bug as DenseMap: when you explicitly construct it with a small number of buckets, you can arrange for the tombstone-based growth path to be followed when the number of buckets was less than '8'. In that case, even with a full map, it would compare '0' as not less than '0', and refuse to grow the table, leading to inf-loops trying to find an empty bucket on the next insertion. The fix is very simple: use '<=' as the comparison. The same fix was applied to DenseMap as well during its recent refactoring. Thanks to Alex Bolz for the great report and test case. =] llvm-svn: 158725
-
Benjamin Kramer authored
Should silence warnings when compiling the X86 disassembler. llvm-svn: 158723
-
Jan Wen Voung authored
The condition code didn't actually matter for arm "b" instructions, unlike "bl". It should just use the R_ARM_JUMP24 reloc. llvm-svn: 158722
-
Hal Finkel authored
For processors with the G5-like instruction-grouping scheme, this helps avoid early group termination due to a write-after-write dependency within the group. It should also help on pipelined embedded cores. On POWER7, over the test suite, this gives an average 0.5% speedup. The largest speedups are: SingleSource/Benchmarks/Stanford/Quicksort - 33% MultiSource/Applications/d/make_dparser - 21% MultiSource/Benchmarks/FreeBench/analyzer/analyzer - 12% MultiSource/Benchmarks/MiBench/telecomm-FFT/telecomm-fft - 12% Largest slowdowns: SingleSource/Benchmarks/Stanford/Bubblesort - 23% MultiSource/Benchmarks/Prolangs-C++/city/city - 21% MultiSource/Benchmarks/BitBench/uuencode/uuencode - 16% MultiSource/Benchmarks/mediabench/mpeg2/mpeg2dec/mpeg2decode - 13% llvm-svn: 158719
-
Michael J. Spencer authored
llvm-svn: 158704
-
Akira Hatanaka authored
llvm-svn: 158702
-
Akira Hatanaka authored
MipsCodeEmitter.cpp. llvm-svn: 158701
-
Hal Finkel authored
PPC will now generate STWUX and friends. llvm-svn: 158698
-
Rafael Espindola authored
TargetLoweringObjectFileELF. Use this to support it on X86. Unlike ARM, on X86 it is not easy to find out if .init_array should be used or not, so the decision is made via TargetOptions and defaults to off. Add a command line option to llc that enables it. llvm-svn: 158692
-
Nuno Lopes authored
revert r158660, since Chris has some issues with this patch (namely using code to reprent information only used by the compiler) Original commit msg: add the 'alloc' metadata node to represent the size of offset of buffers pointed to by pointers. This metadata can be attached to any instruction returning a pointer llvm-svn: 158688
-
Manman Ren authored
This change is to be enabled in clang. rdar://9877866 llvm-svn: 158684
-
- Jun 18, 2012
-
-
Hal Finkel authored
This patch changes the type used to hold the FU bitset from unsigned to uint64_t. This will be needed for some upcoming PowerPC itineraries. llvm-svn: 158679
-
Marshall Clow authored
llvm-svn: 158675
-
Jim Grosbach authored
The NOP, WFE, WFI, SEV and YIELD instructions are all hints w/ a different immediate value in bits [7,0]. Define a generic HINT instruction and refactor NOP, WFI, WFI, SEV and YIELD to be assembly aliases of that. rdar://11600518 llvm-svn: 158674
-
Nuno Lopes authored
This metadata can be attached to any instruction returning a pointer llvm-svn: 158660
-
Joel Jones authored
when a compile time constant is known. This occurs when implicitly zero extending function arguments from 16 bits to 32 bits. The 8 bit case doesn't need to be handled, as the 8 bit constants are encoded directly, thereby not needing a separate load instruction to form the constant into a register. <rdar://problem/11481151> llvm-svn: 158659
-
Chandler Carruth authored
This patch causes problems when both dynamic stack realignment and dynamic allocas combine in the same function. With this patch, we no longer build the epilog correctly, and silently restore registers from the wrong position in the stack. Thanks to Matt for tracking this down, and getting at least an initial test case to Chad. I'm going to try to check a variation of that test case in so we can easily track the fixes required. llvm-svn: 158654
-
- Jun 17, 2012
-
-
Pete Cooper authored
Now that SROA can form alloca's for dynamic vector accesses, further improve it to be able to replace operations on these vector alloca's with insert/extract element insts llvm-svn: 158623
-
- Jun 16, 2012
-
-
Benjamin Kramer authored
llvm-svn: 158610
-
Benjamin Kramer authored
llvm-svn: 158608
-
Hal Finkel authored
This cleans up the method used to find trip counts in order to form CTR loops on PPC. This refactoring allows the pass to find loops which have a constant trip count but also happen to end with a comparison to zero. This also adds explicit FIXMEs to mark two different classes of loops that are currently ignored. In addition, we now search through all potential induction operations instead of just the first. Also, we check the predicate code on the conditional branch and abort the transformation if the code is not EQ or NE, and we then make sure that the branch to be transformed matches the condition register defined by the comparison (multiple possible comparisons will be considered). llvm-svn: 158607
-
Hal Finkel authored
The present implementation handles only TBAA and FP metadata, discarding everything else. For debug metadata, the current behavior is maintained (the debug metadata associated with one of the instructions will be kept, discarding that attached to the other). This should address PR 13040. llvm-svn: 158606
-
Hal Finkel authored
There are other passes, BBVectorize specifically, that also need some of this functionality. llvm-svn: 158605
-
Rafael Espindola authored
llvm-svn: 158604
-
Kay Tiong Khoo authored
llvm-svn: 158603
-
NAKAMURA Takumi authored
llvm-svn: 158602
-
Evan Cheng authored
It's not deterministic to iterate over SmallPtrSet. Replace it with SmallSetVector. Patch by Daniel Reynaud. rdar://11671029 llvm-svn: 158594
-
Pete Cooper authored
Dynamic GEPs created by SROA needed to insert extra "i32 0" operands to index through structs and arrays to get to the vector being indexed. llvm-svn: 158590
-