- Mar 28, 2016
-
-
Mike Spertus authored
llvm-svn: 264602
-
Mike Spertus authored
When using Visual Studio 2015, cmake now puts the native visualizers in llvm.sln, so the developer automatically sees custom visualizations. Much thanks to ariccio who provided extensive help on this change. (manual installation still needed on VS2013) llvm-svn: 264601
-
Hal Finkel authored
The A2 cores support the popcntw/popcntd instructions, but they're microcoded, and slower than our default software emulation. Specifically, popcnt[dw] take approximately 74 cycles, whereas our software emulation takes only 24-28 cycles. I've added a new target feature to indicate a slow popcnt[dw], instead of just removing the existing target feature from the a2/a2q processor models, because: 1. This allows us to return more accurate information via the TTI interface (I recognize that this currently makes no practical difference) 2. Is hopefully easier to understand (it allows the core's features to match its manual while still having the desired effect). llvm-svn: 264600
-
David Blaikie authored
llvm-svn: 264599
-
Eugene Zelenko authored
Differential revision: http://reviews.llvm.org/D18469 llvm-svn: 264598
-
Hyojin Sung authored
When eliminating or merging almost empty basic blocks, the existence of non-trivial PHI nodes is currently used to recognize potential loops of which the block is the header and keep the block. However, the current algorithm fails if the loops' exit condition is evaluated only with volatile values hence no PHI nodes in the header. Especially when such a loop is an outer loop of a nested loop, the loop is collapsed into a single loop which prevent later optimizations from being applied (e.g., transforming nested loops into simplified forms and loop vectorization). The patch augments the existing PHI node-based check by adding a pre-test if the BB actually belongs to a set of loop headers and not eliminating it if yes. llvm-svn: 264596
-
Hemant Kulkarni authored
llvm-svn: 264595
-
Rong Xu authored
Don't set the function hotness attribute on the fly. This changes the CFG branch probability of the caller function, which leads to inconsistent BB ordering. This patch moves the attribute setting to a separated loop after the counts in all functions are populated. Fixes PR27024 - PGO instrumentation profile data is not reflected in correct basic blocks. Differential Revision: http://reviews.llvm.org/D18491 llvm-svn: 264594
-
Derek Schuff authored
MachineFunctionProperties represents a set of properties that a MachineFunction can have at particular points in time. Existing examples of this idea are MachineRegisterInfo::isSSA() and MachineRegisterInfo::tracksLiveness() which will eventually be switched to use this mechanism. This change introduces the AllVRegsAllocated property; i.e. the property that all virtual registers have been allocated and there are no VReg operands left. With this mechanism, passes can declare that they require a particular property to be set, or that they set or clear properties by implementing e.g. MachineFunctionPass::getRequiredProperties(). The MachineFunctionPass base class verifies that the requirements are met, and handles the setting and clearing based on the delcarations. Passes can also directly query and update the current properties of the MF if they want to have conditional behavior. This change annotates the target-independent post-regalloc passes; future changes will also annotate target-specific ones. Reviewers: qcolombet, hfinkel Differential Revision: http://reviews.llvm.org/D18421 llvm-svn: 264593
-
Hemant Kulkarni authored
Differential Revision: http://reviews.llvm.org/D16820 llvm-svn: 264591
-
Vedant Kumar authored
This reverts commit r264587. Reverting to investigate 6 unexpected failures on the ppc bot: http://lab.llvm.org:8011/builders/clang-ppc64be-linux/builds/2822 llvm-svn: 264590
-
Tom Stellard authored
Summary: This helps prevent load clustering from drastically increasing register pressure by trying to cluster 4 SMRDx8 loads together. The limit of 16 bytes was chosen, because it seems like that was the original intent of setting the limit to 4 instructions, but more analysis could show that a different limit is better. This fixes yields small decreases in register usage with shader-db, but also helps avoid a large increase in register usage when lane mask tracking is enabled in the machine scheduler, because lane mask tracking enables more opportunities for load clustering. shader-db stats: 2379 shaders in 477 tests Totals: SGPRS: 49744 -> 48600 (-2.30 %) VGPRS: 34120 -> 34076 (-0.13 %) Code Size: 1282888 -> 1283184 (0.02 %) bytes LDS: 28 -> 28 (0.00 %) blocks Scratch: 495616 -> 492544 (-0.62 %) bytes per wave Max Waves: 6843 -> 6853 (0.15 %) Wait states: 0 -> 0 (0.00 %) Reviewers: nhaehnle, arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D18451 llvm-svn: 264589
-
Davide Italiano authored
llvm-svn: 264588
-
Vedant Kumar authored
Function names in ObjC can have spaces in them. This interacts poorly with name compression, which uses spaces to separate PGO names. Fix the issue by using a different separator and update a test. I chose "\01" as the separator because 1) it's non-printable, 2) we strip it from PGO names, and 3) it's the next natural choice once "\00" is discarded (that one's overloaded). Differential Revision: http://reviews.llvm.org/D18516 llvm-svn: 264587
-
Vedant Kumar authored
Patch suggested by David Li! llvm-svn: 264586
-
Krzysztof Parzyszek authored
llvm-svn: 264584
-
James Y Knight authored
llvm-svn: 264583
-
Krzysztof Parzyszek authored
llvm-svn: 264581
-
Krzysztof Parzyszek authored
- Do not optimize stack slots in optnone functions. - Get aligned-base register from HexagonMachineFunctionInfo instead of looking for ALIGNA instruction in the function's body. llvm-svn: 264580
-
Douglas Katzman authored
Differential Revision: http://reviews.llvm.org/D18463 llvm-svn: 264579
-
Jacques Pienaar authored
Add the Lanai backend to lib/Target. General Lanai backend discussion on llvm-dev thread "[RFC] Lanai backend" (http://lists.llvm.org/pipermail/llvm-dev/2016-February/095118.html). Differential Revision: http://reviews.llvm.org/D17011 llvm-svn: 264578
-
Hal Finkel authored
llvm-svn: 264573
-
Hal Finkel authored
We require C++11 to build, so remove a few remaining preprocessor checks for '__cplusplus >= 201103L'. This should always be true. llvm-svn: 264572
-
Chuang-Yu Cheng authored
This patch implements the following altivec instructions: - Decimal Convert From/to National/Zoned/Signed-QWord: bcdcfn. bcdcfz. bcdctn. bcdctz. bcdcfsq. bcdctsq. - Decimal Copy-Sign/Set-Sign: bcdcpsgn. bcdsetsgn. - Decimal Shift/Unsigned-Shift/Shift-and-Round: bcds. bcdus. bcdsr. - Decimal (Unsigned) Truncate: bcdtrunc. bcdutrunc. Total 13 instructions Thanks Amehsan's advice! Thanks Kit's great help! Reviewers: hal, nemanja, kbarton, tjablin, amehsan http://reviews.llvm.org/D17838 llvm-svn: 264568
-
Chuang-Yu Cheng authored
[Power9] Implement new vsx instructions: insert, extract, test data class, min/max, reverse, permute, splat This change implements the following vsx instructions: - Scalar Insert/Extract xsiexpdp xsiexpqp xsxexpdp xsxsigdp xsxexpqp xsxsigqp - Vector Insert/Extract xviexpdp xviexpsp xvxexpdp xvxexpsp xvxsigdp xvxsigsp xxextractuw xxinsertw - Scalar/Vector Test Data Class xststdcdp xststdcsp xststdcqp xvtstdcdp xvtstdcsp - Maximum/Minimum xsmaxcdp xsmaxjdp xsmincdp xsminjdp - Vector Byte-Reverse/Permute/Splat xxbrd xxbrh xxbrq xxbrw xxperm xxpermr xxspltib 30 instructions Thanks Nemanja for invaluable discussion! Thanks Kit's great help! Reviewers: hal, nemanja, kbarton, tjablin, amehsan http://reviews.llvm.org/D16842 llvm-svn: 264567
-
Elena Demikhovsky authored
ICMP instruction selection fails on SKX and KNL for i1 operand. I use XOR to resolve: (A == B) is equivalent to (A xor B) == 0 Differential Revision: http://reviews.llvm.org/D18511 llvm-svn: 264566
-
Chuang-Yu Cheng authored
This change implements the following vsx instructions: - quad-precision move xscpsgnqp, xsabsqp, xsnegqp, xsnabsqp - quad-precision fp-arithmetic xsaddqp(o) xsdivqp(o) xsmulqp(o) xssqrtqp(o) xssubqp(o) xsmaddqp(o) xsmsubqp(o) xsnmaddqp(o) xsnmsubqp(o) 22 instructions Thanks Nemanja and Kit for careful review and invaluable discussion! Reviewers: hal, nemanja, kbarton, tjablin, amehsan http://reviews.llvm.org/D16110 llvm-svn: 264565
-
NAKAMURA Takumi authored
llvm-svn: 264561
-
Vedant Kumar authored
When emitting coverage mappings for functions with local linkage and an unknown filename, we use "<unknown>:func" for the PGO function name. The problem is that we don't strip "<unknown>" from the name when loading coverage data, like we do for other file names. Fix that and add a test. llvm-svn: 264559
-
Duncan P. N. Exon Smith authored
The caller of ValueEnumerator::EnumerateOperandType never sends in metadata. Assert that, and remove the unnecessary logic. llvm-svn: 264558
-
Duncan P. N. Exon Smith authored
Change writeFunctionMetadata to call writeMetadataRecords. For now there's no functionality change, but makes it easy to serialize other types of metadata in the function block in the future. llvm-svn: 264557
-
Duncan P. N. Exon Smith authored
To match writeMetadataRecords, writeNamedMetadata and writeMetadataStrings, change: WriteModuleMetadata => writeModuleMetadata WriteFunctionLocalMetadata => writeFunctionMetadata Write##CLASS => write##CLASS The only major change is "FunctionLocal" => "Function". The point is to be less specific, in preparation for emitting normal metadata records inside function metadata blocks (currently we only emit `LocalAsMetadata` there). llvm-svn: 264556
-
Duncan P. N. Exon Smith authored
Besides being a nice cleanup, this is preparation for reusing the code in function metadata blocks. llvm-svn: 264555
-
Duncan P. N. Exon Smith authored
Use an early return to simplify logic. llvm-svn: 264554
-
Duncan P. N. Exon Smith authored
make_unique => llvm::make_unique llvm-svn: 264553
-
Duncan P. N. Exon Smith authored
We don't really need a separate vector here; instead, point at a range inside the main MDs array. This matches how r264551 references the ranges of strings and non-strings. llvm-svn: 264552
-
Duncan P. N. Exon Smith authored
Spiritually reapply commit r264409 (reverted in r264410), albeit with a bit of a redesign. Firstly, avoid splitting the big blob into multiple chunks of strings. r264409 imposed an arbitrary limit to avoid a massive allocation on the shared 'Record' SmallVector. The bug with that commit only reproduced when there were more than "chunk-size" strings. A test for this would have been useless long-term, since we're liable to adjust the chunk-size in the future. Thus, eliminate the motivation for chunk-ing by storing the string sizes in the blob. Here's the layout: vbr6: # of strings vbr6: offset-to-blob blob: [vbr6]: string lengths [char]: concatenated strings Secondly, make the output of llvm-bcanalyzer readable. I noticed when debugging r264409 that llvm-bcanalyzer was outputting a massive blob all in one line. Past a small number, the strings were impossible to split in my head, and the lines were way too long. This version adds support in llvm-bcanalyzer for pretty-printing. <STRINGS abbrevid=4 op0=3 op1=9/> num-strings = 3 { 'abc' 'def' 'ghi' } From the original commit: Inspired by Mehdi's similar patch, http://reviews.llvm.org/D18342, this should (a) slightly reduce bitcode size, since there is less record overhead, and (b) greatly improve reading speed, since blobs are super cheap to deserialize. llvm-svn: 264551
-
Duncan P. N. Exon Smith authored
Split helper out of EmitRecordWithAbbrevImpl called emitBlob to reduce code duplication, and add a few tests for it. No functionality change intended. llvm-svn: 264550
-
Duncan P. N. Exon Smith authored
The implementation is fairly obvious. This is preparation for using some blobs in bitcode. For clarity (and perhaps future-proofing?), I moved the call to JumpToBit in BitstreamCursor::readRecord ahead of calling MemoryObject::getPointer, since JumpToBit can theoretically (a) read bytes, which (b) invalidates the blob pointer. This isn't strictly necessary the two memory objects we have: - The return of RawMemoryObject::getPointer is valid until the memory object is destroyed. - StreamingMemoryObject::getPointer is valid until the next chunk is read from the stream. Since the JumpToBit call is only going ahead to a word boundary, we'll never load another chunk. However, reordering makes it clear by inspection that the blob returned by BitstreamCursor::readRecord will be valid. I added some tests for StreamingMemoryObject::getPointer and BitstreamCursor::readRecord. llvm-svn: 264549
-
Duncan P. N. Exon Smith authored
Change the filename to indicate this is a test, rename the tests, move them into an anonymous namespace, and rename some variables. All to match our usual style before making further changes. llvm-svn: 264548
-