- Jun 24, 2013
-
-
Michael Gottesman authored
llvm-svn: 184713
-
Michael Gottesman authored
llvm-svn: 184712
-
Michael Gottesman authored
[APFloat] Removed a assert from significandParts() which says that one can only access the significand of FiniteNonZero/NaN floats. The method significandParts() is a helper method meant to ease access to APFloat's significand by allowing the user to not need to be aware of whether or not the APFloat is using memory allocated in the instance itself or in an external array. This assert says that one can only access the significand of FiniteNonZero/NaN floats. This makes it cumbersome and more importantly dangerous when one wishes to zero out the significand of a zero/infinity value since one will have to deal with the aforementioned quandary related to how the memory in APFloat is allocated. llvm-svn: 184711
-
Michael Gottesman authored
[APFloat] Rename macro convolve => PackCategoriesIntoKey so that it is clear what APFloat is actually using said macro for. In the context of APFloat, seeing a macro called convolve suggests that APFloat is using said value in some sort of convolution somewhere in the source code. This is misleading. I also added a documentation comment to the macro. llvm-svn: 184710
-
Amaury de la Vieuville authored
When encoded to thumb, VFP instruction and VMOV/VDUP between scalar and core registers, must have their predicate bit to 0b1110. llvm-svn: 184707
-
Amaury de la Vieuville authored
llvm-svn: 184706
-
Andrew Trick authored
Sorry for the unit test churn. I'll try to make the change permanently next time. llvm-svn: 184705
-
Amaury de la Vieuville authored
In thumb1, NOP is a pseudo-instruction equivalent to mov r8, r8. However the disassembler should not use this alias. llvm-svn: 184703
-
Amaury de la Vieuville authored
mask == 0 -> UNPRED llvm-svn: 184702
-
Amaury de la Vieuville authored
llvm-svn: 184701
-
Chandler Carruth authored
CGSCC pass manager. This should insulate the inlining decisions from the vectorization decisions, however it may have both compile time and code size problems so it is just an experimental option right now. Adding this based on a discussion with Arnold and it seems at least worth having this flag for us to both run some experiments to see if this strategy is workable. It may solve some of the regressions seen with the loop vectorizer. llvm-svn: 184698
-
Arnold Schwaighofer authored
This reverts commit cbfa1ca993363ca5c4dbf6c913abc957c584cbac. We are seeing a stage2 and stage3 miscompare on some dragonegg bots. llvm-svn: 184690
-
Michael Gottesman authored
exponent_t is only used internally in APFloat and no exponent_t values are exposed via the APFloat API. In light of such conditions it does not make any sense to gum up the llvm namespace with said type. Plus it makes it clearer that exponent_t is associated with APFloat. llvm-svn: 184686
-
Arnold Schwaighofer authored
We now no longer need alias analysis - the cases that alias analysis would handle are now handled as accesses with a large dependence distance. We can now vectorize loops with simple constant dependence distances. for (i = 8; i < 256; ++i) { a[i] = a[i+4] * a[i+8]; } for (i = 8; i < 256; ++i) { a[i] = a[i-4] * a[i-8]; } We would be able to vectorize about 200 more loops (in many cases the cost model instructs us no to) in the test suite now. Results on x86-64 are a wash. I have seen one degradation in ammp. Interestingly, the function in which we now vectorize a loop is never executed so we probably see some instruction cache effects. There is a 2% improvement in h264ref. There is one or the other TSCV loop kernel that speeds up. radar://13681598 llvm-svn: 184685
-
Arnold Schwaighofer authored
This class checks dependences by subtracting two Scalar Evolution access functions allowing us to catch very simple linear dependences. The checker assumes source order in determining whether vectorization is safe. We currently don't reorder accesses. Positive true dependencies need to be a multiple of VF otherwise we impede store-load forwarding. llvm-svn: 184684
-
Arnold Schwaighofer authored
Sets of dependent accesses are built by unioning sets based on underlying objects. This class will be used by the upcoming dependence checker. llvm-svn: 184683
-
Nadav Rotem authored
Untill now we detected the vectorizable tree and evaluated the cost of the entire tree. With this patch we can decide to trim-out branches of the tree that are not profitable to vectorizer. Also, increase the max depth from 6 to 12. In the worse possible case where all of the code is made of diamond-shaped graph this can bring the cost to 2**10, but diamonds are not very common. llvm-svn: 184681
-
Andrew Trick authored
This makes it possible to write unit tests that are less susceptible to minor code motion, particularly copy placement. block-placement.ll covers this case with -pre-RA-sched=source which will soon be default. One incorrectly named block is already fixed, but without this fix, enabling new coalescing and scheduling would cause more failures. llvm-svn: 184680
-
- Jun 23, 2013
-
-
Nadav Rotem authored
Make sure that we don't replace and RAUW two sequences if one does not dominate the other. llvm-svn: 184674
-
Nadav Rotem authored
The RAII builder location guard is saving a reference to instructions, so we can't erase instructions during vectorization. llvm-svn: 184671
-
David Blaikie authored
llvm-svn: 184669
-
Andrew Trick authored
This is an awful implementation of the target hook. But we don't have abstractions yet for common machine ops, and I don't see any quick way to make it table-driven. llvm-svn: 184664
-
Nadav Rotem authored
llvm-svn: 184660
-
- Jun 22, 2013
-
-
Nadav Rotem authored
Rewrote the SLP-vectorization as a whole-function vectorization pass. It is now able to vectorize chains across multiple basic blocks. It still does not vectorize PHIs, but this should be easy to do now that we scan the entire function. I removed the support for extracting values from trees. We are now able to vectorize more programs, but there are some serious regressions in many workloads (such as flops-6 and mandel-2). llvm-svn: 184647
-
David Blaikie authored
llvm-svn: 184643
-
Chad Rosier authored
llvm-svn: 184642
-
Benjamin Kramer authored
It doesn't work as I intended it to. This reverts commit r184638. llvm-svn: 184641
-
Benjamin Kramer authored
It has become an expensive operation. No functionality change. llvm-svn: 184638
-
Sean Silva authored
Although in reality the symbol table in ELF resides in a section, the standard requires that there be no more than one SHT_SYMTAB. To enforce this constraint, it is cleaner to group all the symbols under a top-level `Symbols` key on the object file. llvm-svn: 184627
-
Andrew Trick authored
We have no targets on trunk that bundle before regalloc. However, we have been advertising regalloc as bundle safe for use with out-of-tree targets. We need to at least contain the parts of the code that are still unsafe. llvm-svn: 184620
-
David Blaikie authored
A FastISel optimization was causing us to emit no information for such parameters & when they go missing we end up emitting a different function type. By avoiding that shortcut we not only get types correct (very important) but also location information (handy) - even if it's only live at the start of a function & may be clobbered later. Reviewed/discussion by Evan Cheng & Dan Gohman. llvm-svn: 184604
-
- Jun 21, 2013
-
-
Michael Gottesman authored
Thanks to Bill Wendling for pointing this out! llvm-svn: 184593
-
Kevin Enderby authored
that have been run through the 'C' pre-processor. The implementation of SrcMgr.FindLineNumber() is slow but OK if it uses its cache when called multiple times with an SMLoc that is forward of the previous call. In the case of generating dwarf for assembly source files that have been run through the 'C' pre-processor we need to calculate the logical line number based on the last parsed cpp hash file line comment. And the current code calls SrcMgr.FindLineNumber() twice to do this causing its cache not to work and results in very slow compile times: % time /Volumes/SandBox/build-llvm/Debug+Asserts/bin/llvm-mc -triple thumbv7-apple-ios -filetype=obj -o /tmp/x.o mscorlib.dll.E -g 672.542u 0.299s 11:13.15 99.9% 0+0k 0+2io 2106pf+0w So we save the info from the last parsed cpp hash file line comment to avoid making the second call to SrcMgr.FindLineNumber() most times and end up with compile times like: % time /Volumes/SandBox/build-llvm/Debug+Asserts/bin/llvm-mc -triple thumbv7-apple-ios -filetype=obj -o /tmp/x.o mscorlib.dll.E -g 3.404u 0.104s 0:03.80 92.1% 0+0k 0+3io 2105pf+0w rdar://14156934 llvm-svn: 184592
-
Benjamin Kramer authored
Revert "BlockFrequency: Saturate at 1 instead of 0 when multiplying a frequency with a branch probability." This reverts commit r184584. Breaks PPC selfhost. llvm-svn: 184590
-
Michael Gottesman authored
[objc-arc-opts] Now that PtrState.RRI is encapsulated in PtrState, make PtrState.RRI private and delete the TODO. llvm-svn: 184587
-
Michael Gottesman authored
[objc-arc-opts] Encapsulated PtrState.RRI.{Calls,ReverseInsertPts} into several methods on PtrState. llvm-svn: 184586
-
Benjamin Kramer authored
Zero is used by BlockFrequencyInfo as a special "don't know" value. It also causes a sink for frequencies as you can't ever get off a zero frequency with more multiplies. This recovers a 10% regression on MultiSource/Benchmarks/7zip. A zero frequency was propagated into an inner loop causing excessive spilling. PR16402. llvm-svn: 184584
-
Michael Gottesman authored
[objcarcopts] Encapsulated PtrState.RRI.IsTrackingImpreciseRelease() => PtrState.IsTrackingImpreciseRelease(). llvm-svn: 184583
-
Michael Gottesman authored
[objcarcopts] Encapsulate PtrState.RRI.CFGHazardAfflicted via methods PtrState.{IsCFGHazardAfflicted,SetCFGHazardAfflicted}. llvm-svn: 184582
-
Justin Holewinski authored
IR for CUDA should use "nvptx[64]-nvidia-cuda", and IR for NV OpenCL should use "nvptx[64]-nvidia-nvcl" llvm-svn: 184579
-