Commits · 1c80677834cfe16793671469f1ae6dfa8518fb23 · Roger Ferrer / llvm-epi

Mar 28, 2016

Forgot to commit this file in revision 264601 · 1c806778
Mike Spertus authored Mar 28, 2016
```
llvm-svn: 264602
```
1c806778

Use VS2015 Project Support for Natvis to eliminate the need to manually install natvis files · 0b96a2e8

Mike Spertus authored Mar 28, 2016

When using Visual Studio 2015, cmake now puts the native visualizers in llvm.sln, so the developer automatically sees custom visualizations.
Many thanks to ariccio, who provided extensive help on this change. (Manual installation is still needed on VS2013.)

llvm-svn: 264601

0b96a2e8

[PowerPC] On the A2, popcnt[dw] are very slow · 7059d416

Hal Finkel authored Mar 28, 2016

The A2 cores support the popcntw/popcntd instructions, but they're microcoded,
and slower than our default software emulation. Specifically, popcnt[dw] take
approximately 74 cycles, whereas our software emulation takes only 24-28
cycles.

I've added a new target feature to indicate a slow popcnt[dw], instead of just
removing the existing target feature from the a2/a2q processor models, because:
  1. This allows us to return more accurate information via the TTI interface
     (I recognize that this currently makes no practical difference)
  2. It is hopefully easier to understand (it allows the core's features to match
     its manual while still having the desired effect).
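
For readers unfamiliar with what a software popcount looks like, here is a minimal C++ sketch of the classic bit-parallel technique. It is an illustration only: the name `popcount32` is hypothetical, and the exact instruction sequence LLVM emits for its expansion may differ.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not LLVM code): counts the set bits of a 32-bit word
// with the well-known bit-parallel reduction.
uint32_t popcount32(uint32_t X) {
  X = X - ((X >> 1) & 0x55555555u);                 // 2-bit partial sums
  X = (X & 0x33333333u) + ((X >> 2) & 0x33333333u); // 4-bit partial sums
  X = (X + (X >> 4)) & 0x0F0F0F0Fu;                 // 8-bit partial sums
  return (X * 0x01010101u) >> 24;                   // add the four bytes
}
```

The sequence is a short chain of shifts, masks, adds and one multiply, which is why an expansion of this kind can beat a microcoded instruction.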

llvm-svn: 264600

7059d416

Remove else after return · b805f73a
David Blaikie authored Mar 28, 2016
```
llvm-svn: 264599
```
b805f73a
Fix Clang-tidy modernize-deprecated-headers warnings in some files; other minor fixes. · 35623fb7
Eugene Zelenko authored Mar 28, 2016
```
Differential revision: http://reviews.llvm.org/D18469

llvm-svn: 264598
```
35623fb7

[SimplifyCFG] Prevent passes from destroying canonical loop structure, especially for nested loops · 0ada5b0d

Hyojin Sung authored Mar 28, 2016

When eliminating or merging almost-empty basic blocks, the existence of non-trivial PHI nodes
is currently used to recognize potential loops of which the block is the header, and to keep the block.
However, the current algorithm fails if the loop's exit condition is evaluated only with volatile
values, so that there are no PHI nodes in the header. In particular, when such a loop is the outer loop
of a nested loop, the nest is collapsed into a single loop, which prevents later optimizations from being
applied (e.g., transforming nested loops into simplified forms and loop vectorization).

The patch augments the existing PHI node-based check with a pre-test of whether the BB actually
belongs to the set of loop headers; if it does, the block is not eliminated.
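
The loop-header pre-test can be sketched as follows. This is a simplified model under stated assumptions, not the patch itself: the `CFG` type and the functions `findLoopHeaders` and `mayEliminateBlock` are hypothetical, blocks are plain integers, and control flow is assumed to be reducible so that back-edge targets coincide with loop headers.

```cpp
#include <cassert>
#include <set>
#include <vector>

// Hypothetical CFG model: Succs[B] lists the successors of block B.
using CFG = std::vector<std::vector<int>>;

// Depth-first walk that records the target of every back edge, i.e. an edge
// to a block that is still on the DFS stack. Those targets head a loop.
static void visit(const CFG &Succs, int B, std::vector<int> &State,
                  std::set<int> &Headers) {
  State[B] = 1; // on the DFS stack
  for (int S : Succs[B]) {
    if (State[S] == 1)
      Headers.insert(S);
    else if (State[S] == 0)
      visit(Succs, S, State, Headers);
  }
  State[B] = 2; // finished
}

std::set<int> findLoopHeaders(const CFG &Succs, int Entry) {
  std::vector<int> State(Succs.size(), 0);
  std::set<int> Headers;
  visit(Succs, Entry, State, Headers);
  return Headers;
}

// Pre-test in the spirit of the patch: never fold a block that heads a loop,
// whether or not it contains PHI nodes.
bool mayEliminateBlock(int B, const std::set<int> &Headers) {
  return Headers.count(B) == 0;
}
```

For a nest where block 1 heads the outer loop and block 2 the inner one, both are reported as headers and therefore kept, even if neither has a PHI node.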

llvm-svn: 264596

0ada5b0d

[llvm-readobj] NFC Replace case by macros for PT_* enums · 7d564ba1
Hemant Kulkarni authored Mar 28, 2016
```
llvm-svn: 264595
```
7d564ba1

[PGO] Don't set the function hotness attribute when populating counters · 6090afd7

Rong Xu authored Mar 28, 2016

Don't set the function hotness attribute on the fly. This changes the CFG
branch probability of the caller function, which leads to inconsistent BB
ordering. This patch moves the attribute setting to a separate loop after
the counts in all functions are populated.

Fixes PR27024 - PGO instrumentation profile data is not reflected in correct
basic blocks.

Differential Revision: http://reviews.llvm.org/D18491

llvm-svn: 264594

6090afd7

Introduce MachineFunctionProperties and the AllVRegsAllocated property · ad154c83

Derek Schuff authored Mar 28, 2016

MachineFunctionProperties represents a set of properties that a MachineFunction
can have at particular points in time. Existing examples of this idea are
MachineRegisterInfo::isSSA() and MachineRegisterInfo::tracksLiveness() which
will eventually be switched to use this mechanism.
This change introduces the AllVRegsAllocated property; i.e. the property that
all virtual registers have been allocated and there are no VReg operands
left.

With this mechanism, passes can declare that they require a particular property
to be set, or that they set or clear properties by implementing e.g.
MachineFunctionPass::getRequiredProperties(). The MachineFunctionPass base class
verifies that the requirements are met, and handles the setting and clearing
based on the declarations. Passes can also directly query and update the current
properties of the MF if they want to have conditional behavior.
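
The require/set/clear mechanism can be pictured with a small stand-alone model. This is a sketch of the idea only, not the LLVM classes: `PropertySet`, `PassDecl` and `runPass` are hypothetical names, and the real interface may look quite different.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>

// Simplified stand-in for the idea, not the LLVM classes themselves.
enum class Property { IsSSA, TracksLiveness, AllVRegsAllocated, NumProperties };

class PropertySet {
  std::bitset<static_cast<std::size_t>(Property::NumProperties)> Bits;

public:
  PropertySet &set(Property P) {
    Bits.set(static_cast<std::size_t>(P));
    return *this;
  }
  bool has(Property P) const { return Bits.test(static_cast<std::size_t>(P)); }
  // True if every property in Required is also present here.
  bool satisfies(const PropertySet &Required) const {
    return (Bits & Required.Bits) == Required.Bits;
  }
  // Apply a pass's declared effects: first the sets, then the clears.
  void apply(const PropertySet &Set, const PropertySet &Cleared) {
    Bits |= Set.Bits;
    Bits &= ~Cleared.Bits;
  }
};

// What a pass declares about itself.
struct PassDecl {
  PropertySet Required, Set, Cleared;
};

// What a base class could do around each pass: verify, run, then update.
bool runPass(PropertySet &Current, const PassDecl &Pass) {
  if (!Current.satisfies(Pass.Required))
    return false; // requirement not met
  // ... the pass body would run here ...
  Current.apply(Pass.Set, Pass.Cleared);
  return true;
}
```

In this model a post-regalloc pass that requires `AllVRegsAllocated` is rejected until some earlier pass has declared that it sets the property.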

This change annotates the target-independent post-regalloc passes; future
changes will also annotate target-specific ones.

Reviewers: qcolombet, hfinkel

Differential Revision: http://reviews.llvm.org/D18421

llvm-svn: 264593

ad154c83

[llvm-size] Implement --common option · 274457e5
Hemant Kulkarni authored Mar 28, 2016
```
Differential Revision: http://reviews.llvm.org/D16820

llvm-svn: 264591
```
274457e5

Revert "[PGO] Fix name encoding for ObjC-like functions" · 088a726f

Vedant Kumar authored Mar 28, 2016

This reverts commit r264587. Reverting to investigate 6 unexpected
failures on the ppc bot:

http://lab.llvm.org:8011/builders/clang-ppc64be-linux/builds/2822

llvm-svn: 264590

088a726f

AMDGPU/SI: Limit load clustering to 16 bytes instead of 4 instructions · a76bcc2e

Tom Stellard authored Mar 28, 2016

Summary:
This helps prevent load clustering from drastically increasing register
pressure by trying to cluster 4 SMRDx8 loads together.  The limit of 16
bytes was chosen, because it seems like that was the original intent
of setting the limit to 4 instructions, but more analysis could show
that a different limit is better.

This yields small decreases in register usage with shader-db, but
also helps avoid a large increase in register usage when lane mask
tracking is enabled in the machine scheduler, because lane mask tracking
enables more opportunities for load clustering.
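
The difference between the two limits can be sketched with a pair of predicates. These are hypothetical functions, not the actual AMDGPU hook, and the sketch assumes that each SMRDx8 load reads eight 32-bit values (32 bytes).

```cpp
#include <cassert>

// Hypothetical predicates, not the actual AMDGPU code.
// Old heuristic: cap the cluster at a fixed number of load instructions.
bool shouldClusterByCount(unsigned NumLoads) { return NumLoads <= 4; }

// New heuristic: cap the total number of bytes the cluster would load.
bool shouldClusterByBytes(unsigned NumLoads, unsigned BytesPerLoad) {
  const unsigned LimitBytes = 16;
  return NumLoads * BytesPerLoad <= LimitBytes;
}
```

Under these assumptions four single-dword loads (4 x 4 bytes) still cluster, while four 32-byte loads, which the count-based rule accepted, no longer do.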

shader-db stats:

2379 shaders in 477 tests
Totals:
SGPRS: 49744 -> 48600 (-2.30 %)
VGPRS: 34120 -> 34076 (-0.13 %)
Code Size: 1282888 -> 1283184 (0.02 %) bytes
LDS: 28 -> 28 (0.00 %) blocks
Scratch: 495616 -> 492544 (-0.62 %) bytes per wave
Max Waves: 6843 -> 6853 (0.15 %)
Wait states: 0 -> 0 (0.00 %)

Reviewers: nhaehnle, arsenm

Subscribers: arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D18451

llvm-svn: 264589

a76bcc2e

[SimplifyLibCalls] Transform printf("%s", "a") -> putchar('a'). · 6db1dcbf
Davide Italiano authored Mar 28, 2016
```
llvm-svn: 264588
```
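
As a small illustration of why the two calls are interchangeable, the sketch below writes the same single byte both ways. The wrapper names are hypothetical. Note that the return values differ (`printf` returns the number of characters written, `putchar` returns the character), so a rewrite of this kind is presumably only safe when the result of `printf` is unused.

```cpp
#include <cassert>
#include <cstdio>

// Before the transform: a formatted call whose only output is one character.
int emitWithPrintf() { return std::printf("%s", "a"); }

// After the transform: the same byte written directly.
int emitWithPutchar() { return std::putchar('a'); }
```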
6db1dcbf

[PGO] Fix name encoding for ObjC-like functions · e44e0be8

Vedant Kumar authored Mar 28, 2016

Function names in ObjC can have spaces in them. This interacts poorly
with name compression, which uses spaces to separate PGO names. Fix the
issue by using a different separator and update a test.

I chose "\01" as the separator because 1) it's non-printable, 2) we
strip it from PGO names, and 3) it's the next natural choice once "\00"
is discarded (that one's overloaded).
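
The problem with a space separator can be reproduced with a toy join/split pair. This is a sketch only: `joinNames` and `splitNames` are hypothetical helpers, not the functions used by the name compression code.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Concatenate names with a single separator character between them.
std::string joinNames(const std::vector<std::string> &Names, char Sep) {
  std::string Out;
  for (std::size_t I = 0; I < Names.size(); ++I) {
    if (I)
      Out += Sep;
    Out += Names[I];
  }
  return Out;
}

// Split on every occurrence of the separator character.
std::vector<std::string> splitNames(const std::string &Joined, char Sep) {
  std::vector<std::string> Names;
  std::string::size_type Start = 0;
  while (true) {
    std::string::size_type End = Joined.find(Sep, Start);
    if (End == std::string::npos) {
      Names.push_back(Joined.substr(Start));
      return Names;
    }
    Names.push_back(Joined.substr(Start, End - Start));
    Start = End + 1;
  }
}
```

With an ObjC-style name such as `-[Foo bar:]`, splitting on a space yields one name too many, whereas splitting on `'\01'` round-trips.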

Differential Revision: http://reviews.llvm.org/D18516

llvm-svn: 264587

e44e0be8

[Coverage] Strip <unknown> from PGO names if no filenames are available · 43a8565b
Vedant Kumar authored Mar 28, 2016
```
Patch suggested by David Li!

llvm-svn: 264586
```
43a8565b
[Hexagon] Improve handling of unaligned vector loads and stores · 2d65ea74
Krzysztof Parzyszek authored Mar 28, 2016
```
llvm-svn: 264584
```
2d65ea74
NFC: skip FenceInst up-front in AtomicExpandPass. · 01f2ca56
James Y Knight authored Mar 28, 2016
```
llvm-svn: 264583
```
01f2ca56
[Hexagon] Only use restore functions for single register at -Oz · bb63f666
Krzysztof Parzyszek authored Mar 28, 2016
```
llvm-svn: 264581
```
bb63f666

[Hexagon] Speed up frame lowering when no optimizations are enabled · a34901aa

Krzysztof Parzyszek authored Mar 28, 2016

- Do not optimize stack slots in optnone functions.
- Get aligned-base register from HexagonMachineFunctionInfo instead of
  looking for ALIGNA instruction in the function's body.

llvm-svn: 264580

a34901aa

Sparc: silently ignore .proc assembler directive · d0c11cf7
Douglas Katzman authored Mar 28, 2016
```
Differential Revision: http://reviews.llvm.org/D18463

llvm-svn: 264579
```
d0c11cf7

[lanai] Add Lanai backend. · fcef3e46

Jacques Pienaar authored Mar 28, 2016

Add the Lanai backend to lib/Target.

General Lanai backend discussion on llvm-dev thread "[RFC] Lanai backend" (http://lists.llvm.org/pipermail/llvm-dev/2016-February/095118.html).

Differential Revision: http://reviews.llvm.org/D17011

llvm-svn: 264578

fcef3e46

[SROA] Fix typo in comment · 5c83a090
Hal Finkel authored Mar 28, 2016
```
llvm-svn: 264573
```
5c83a090

C++11 is required, remove some preprocessor checks for it · 29f5131d

Hal Finkel authored Mar 28, 2016

We require C++11 to build, so remove a few remaining preprocessor checks for
'__cplusplus >= 201103L'. This should always be true.

llvm-svn: 264572

29f5131d

[Power9] Implement new altivec instructions: bcd* series · d5eb774e

Chuang-Yu Cheng authored Mar 28, 2016

This patch implements the following altivec instructions:

- Decimal Convert From/to National/Zoned/Signed-QWord:
    bcdcfn. bcdcfz. bcdctn. bcdctz. bcdcfsq. bcdctsq.

- Decimal Copy-Sign/Set-Sign:
    bcdcpsgn. bcdsetsgn.

- Decimal Shift/Unsigned-Shift/Shift-and-Round:
    bcds. bcdus. bcdsr.

- Decimal (Unsigned) Truncate:
    bcdtrunc. bcdutrunc.

Total 13 instructions

Thanks to Amehsan for the advice! Thanks to Kit for the great help!
Reviewers: hal, nemanja, kbarton, tjablin, amehsan

http://reviews.llvm.org/D17838

llvm-svn: 264568

d5eb774e

[Power9] Implement new vsx instructions: insert, extract, test data class,... · 80722719

Chuang-Yu Cheng authored Mar 28, 2016

[Power9] Implement new vsx instructions: insert, extract, test data class, min/max, reverse, permute, splat

This change implements the following vsx instructions:

- Scalar Insert/Extract
    xsiexpdp xsiexpqp xsxexpdp xsxsigdp xsxexpqp xsxsigqp

- Vector Insert/Extract
    xviexpdp xviexpsp xvxexpdp xvxexpsp xvxsigdp xvxsigsp
    xxextractuw xxinsertw

- Scalar/Vector Test Data Class
    xststdcdp xststdcsp xststdcqp
    xvtstdcdp xvtstdcsp

- Maximum/Minimum
    xsmaxcdp xsmaxjdp
    xsmincdp xsminjdp

- Vector Byte-Reverse/Permute/Splat
    xxbrd xxbrh xxbrq xxbrw
    xxperm xxpermr
    xxspltib

30 instructions

Thanks to Nemanja for the invaluable discussion! Thanks to Kit for the great help!
Reviewers: hal, nemanja, kbarton, tjablin, amehsan

http://reviews.llvm.org/D16842

llvm-svn: 264567

80722719

AVX-512: Fixed ICMP instruction selection for i1 operands · 83f0647d

Elena Demikhovsky authored Mar 28, 2016

ICMP instruction selection fails on SKX and KNL for i1 operands.
I use XOR to resolve this:
(A == B) is equivalent to (A xor B) == 0
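
The identity is easy to check exhaustively for one-bit values. In the sketch below `bool` stands in for i1, and the function names are hypothetical.

```cpp
#include <cassert>

// i1 values modeled as bool; hypothetical helper names.
bool eqDirect(bool A, bool B) { return A == B; }

// Equality expressed through XOR: equal bits XOR to zero.
bool eqViaXor(bool A, bool B) { return (A ^ B) == 0; }
```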

Differential Revision: http://reviews.llvm.org/D18511

llvm-svn: 264566

83f0647d

[Power9] Implement new vsx instructions: quad-precision move, fp-arithmetic · 56638489

Chuang-Yu Cheng authored Mar 28, 2016

This change implements the following vsx instructions:

- quad-precision move
    xscpsgnqp, xsabsqp, xsnegqp, xsnabsqp

- quad-precision fp-arithmetic
    xsaddqp(o) xsdivqp(o) xsmulqp(o) xssqrtqp(o) xssubqp(o)
    xsmaddqp(o) xsmsubqp(o) xsnmaddqp(o) xsnmsubqp(o)

22 instructions

Thanks Nemanja and Kit for careful review and invaluable discussion!
Reviewers: hal, nemanja, kbarton, tjablin, amehsan

http://reviews.llvm.org/D16110

llvm-svn: 264565

56638489

llvm/test/Transforms/FunctionImport/funcimport.ll: -stats REQUIRES +Asserts. · a51d6ea9
NAKAMURA Takumi authored Mar 28, 2016
```
llvm-svn: 264561
```
a51d6ea9

[Coverage] Fix the way we load "<unknown>:func" records · 141ed944

Vedant Kumar authored Mar 28, 2016

When emitting coverage mappings for functions with local linkage and an
unknown filename, we use "<unknown>:func" for the PGO function name. The
problem is that we don't strip "<unknown>" from the name when loading
coverage data, like we do for other file names. Fix that and add a test.
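
The intended effect can be sketched with a hypothetical helper. `stripUnknownPrefix` is not the function the patch touches; it only illustrates the name handling described above.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: drops a leading "<unknown>:" so that a record named
// "<unknown>:func" is looked up as "func". Other names are returned as-is.
std::string stripUnknownPrefix(const std::string &Name) {
  const std::string Prefix = "<unknown>:";
  if (Name.compare(0, Prefix.size(), Prefix) == 0)
    return Name.substr(Prefix.size());
  return Name;
}
```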

llvm-svn: 264559

141ed944

BitcodeWriter: Replace dead code with an assertion, NFC · 544e4f97

Duncan P. N. Exon Smith authored Mar 28, 2016

The caller of ValueEnumerator::EnumerateOperandType never sends in
metadata.  Assert that, and remove the unnecessary logic.

llvm-svn: 264558

544e4f97

BitcodeWriter: Reuse writeMetadataRecords, NFC · b42fa2e5

Duncan P. N. Exon Smith authored Mar 27, 2016

Change writeFunctionMetadata to call writeMetadataRecords.  For now
there's no functionality change, but this makes it easy to serialize other
types of metadata in the function block in the future.

llvm-svn: 264557

b42fa2e5

BitcodeWriter: Rename some functions for consistency, NFC · cffd8cb9

Duncan P. N. Exon Smith authored Mar 27, 2016

To match writeMetadataRecords, writeNamedMetadata and
writeMetadataStrings, change:

    WriteModuleMetadata        => writeModuleMetadata
    WriteFunctionLocalMetadata => writeFunctionMetadata
    Write##CLASS               => write##CLASS

The only major change is "FunctionLocal" => "Function".  The point is to
be less specific, in preparation for emitting normal metadata records
inside function metadata blocks (currently we only emit
`LocalAsMetadata` there).

llvm-svn: 264556

cffd8cb9

BitcodeWriter: Split out writeMetadataRecords, NFC · 80d153f6

Duncan P. N. Exon Smith authored Mar 27, 2016

Besides being a nice cleanup, this is preparation for reusing the code
in function metadata blocks.

llvm-svn: 264555

80d153f6

BitcodeWriter: Restructure WriteFunctionLocalMetadata, NFC · 5465f0ad
Duncan P. N. Exon Smith authored Mar 27, 2016
```
Use an early return to simplify logic.

llvm-svn: 264554
```
5465f0ad
Bitcode: Fix MSVC bot failure from r264549 · 0f571458
Duncan P. N. Exon Smith authored Mar 27, 2016
```
make_unique => llvm::make_unique

llvm-svn: 264553
```
0f571458

BitcodeWriter: Simplify tracking of function-local metadata, NFC · 2766e4d4

Duncan P. N. Exon Smith authored Mar 27, 2016

We don't really need a separate vector here; instead, point at a range
inside the main MDs array.  This matches how r264551 references the
ranges of strings and non-strings.

llvm-svn: 264552

2766e4d4

Reapply ~"Bitcode: Collect all MDString records into a single blob" · 6565a0d4

Duncan P. N. Exon Smith authored Mar 27, 2016

Spiritually reapply commit r264409 (reverted in r264410), albeit with a
bit of a redesign.

Firstly, avoid splitting the big blob into multiple chunks of strings.

r264409 imposed an arbitrary limit to avoid a massive allocation on the
shared 'Record' SmallVector.  The bug with that commit only reproduced
when there were more than "chunk-size" strings.  A test for this would
have been useless long-term, since we're liable to adjust the chunk-size
in the future.

Thus, eliminate the motivation for chunk-ing by storing the string sizes
in the blob.  Here's the layout:

    vbr6: # of strings
    vbr6: offset-to-blob
    blob:
       [vbr6]: string lengths
       [char]: concatenated strings
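
The layout can be modeled roughly as follows. This is a sketch under simplifying assumptions, not the bitcode writer: `StringBlob`, `packStrings` and `unpackStrings` are hypothetical names, VBR6 chunks are stored one per byte rather than bit-packed, and the offset-to-blob field is left out.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// VBR6: 5 payload bits per chunk; bit 5 is set when another chunk follows.
std::vector<uint8_t> encodeVBR6(uint64_t Value) {
  std::vector<uint8_t> Chunks;
  while (Value >= 32) {
    Chunks.push_back(static_cast<uint8_t>((Value & 31) | 32));
    Value >>= 5;
  }
  Chunks.push_back(static_cast<uint8_t>(Value));
  return Chunks;
}

struct StringBlob {
  uint64_t NumStrings = 0;
  std::vector<uint8_t> LengthChunks; // all string lengths, VBR6-encoded
  std::string Chars;                 // all strings, concatenated
};

StringBlob packStrings(const std::vector<std::string> &Strings) {
  StringBlob Blob;
  Blob.NumStrings = Strings.size();
  for (const std::string &S : Strings) {
    for (uint8_t Chunk : encodeVBR6(S.size()))
      Blob.LengthChunks.push_back(Chunk);
    Blob.Chars += S;
  }
  return Blob;
}

std::vector<std::string> unpackStrings(const StringBlob &Blob) {
  std::vector<std::string> Strings;
  std::size_t ChunkIdx = 0, CharPos = 0;
  for (uint64_t I = 0; I < Blob.NumStrings; ++I) {
    // Reassemble one length from its chunks, low bits first.
    uint64_t Length = 0;
    unsigned Shift = 0;
    uint8_t Chunk;
    do {
      Chunk = Blob.LengthChunks[ChunkIdx++];
      Length |= static_cast<uint64_t>(Chunk & 31) << Shift;
      Shift += 5;
    } while (Chunk & 32);
    Strings.push_back(Blob.Chars.substr(CharPos, Length));
    CharPos += Length;
  }
  return Strings;
}
```

Because the lengths travel with the characters, a reader needs only the string count to slice the blob, so no chunk-size limit is involved.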

Secondly, make the output of llvm-bcanalyzer readable.

I noticed when debugging r264409 that llvm-bcanalyzer was outputting a
massive blob all in one line.  Past a small number, the strings were
impossible to split in my head, and the lines were way too long.  This
version adds support in llvm-bcanalyzer for pretty-printing.

    <STRINGS abbrevid=4 op0=3 op1=9/> num-strings = 3 {
      'abc'
      'def'
      'ghi'
    }

From the original commit:

Inspired by Mehdi's similar patch, http://reviews.llvm.org/D18342, this
should (a) slightly reduce bitcode size, since there is less record
overhead, and (b) greatly improve reading speed, since blobs are super
cheap to deserialize.

llvm-svn: 264551

6565a0d4

BitcodeWriter: Simplify and test writing blobs, NFC · 376fa260

Duncan P. N. Exon Smith authored Mar 27, 2016

Split a helper called emitBlob out of EmitRecordWithAbbrevImpl to reduce
code duplication, and add a few tests for it.

No functionality change intended.

llvm-svn: 264550

376fa260

Support: Implement StreamingMemoryObject::getPointer · 456c9968

Duncan P. N. Exon Smith authored Mar 27, 2016

The implementation is fairly obvious.  This is preparation for using
some blobs in bitcode.

For clarity (and perhaps future-proofing?), I moved the call to
JumpToBit in BitstreamCursor::readRecord ahead of calling
MemoryObject::getPointer, since JumpToBit can theoretically (a) read
bytes, which (b) invalidates the blob pointer.

This isn't strictly necessary for the two memory objects we have:

  - The return of RawMemoryObject::getPointer is valid until the memory
    object is destroyed.

  - StreamingMemoryObject::getPointer is valid until the next chunk is
    read from the stream.  Since the JumpToBit call is only going ahead
    to a word boundary, we'll never load another chunk.

However, reordering makes it clear by inspection that the blob returned
by BitstreamCursor::readRecord will be valid.

I added some tests for StreamingMemoryObject::getPointer and
BitstreamCursor::readRecord.

llvm-svn: 264549

456c9968

Support: Move StreamingMemoryObject{,Test}.cpp, NFC · 6648a081

Duncan P. N. Exon Smith authored Mar 27, 2016

Change the filename to indicate this is a test, rename the tests, move
them into an anonymous namespace, and rename some variables.  All to
match our usual style before making further changes.

llvm-svn: 264548

6648a081