Commits · 0a7a8dee2bf83dbfff3a36702b8f1ed472bd18ce · Roger Ferrer / llvm-epi

May 20, 2016

[X86] Fix some AVX patterns to only be disabled if VLX and BWI are supported.... · 0a7a8dee

Craig Topper authored May 20, 2016

[X86] Fix some AVX patterns to only be disabled if VLX and BWI are supported. Without this we get isel failures on the avx-intrinsics-x86.ll test in AVX512VL.

llvm-svn: 270174

0a7a8dee

[LibFuzzer] Fix implementation of ``GetPeakRSSMb()`` on Mac OSX. · 11565444

Dan Liew authored May 20, 2016

On Linux ``rusage.ru_maxrss`` is in KiB but on Mac OSX it is in bytes.
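
The unit difference can be sketched as follows. This is a minimal Python model, not the LibFuzzer source: the function name `get_peak_rss_mb` is hypothetical, and treating every non-macOS platform as reporting KiB is an assumption made for illustration.

```python
import resource
import sys


def get_peak_rss_mb() -> int:
    """Return the peak resident set size of this process in MiB."""
    usage = resource.getrusage(resource.RUSAGE_SELF)
    if sys.platform == "darwin":
        # Mac OSX reports ru_maxrss in bytes.
        return usage.ru_maxrss >> 20
    # Linux reports ru_maxrss in KiB (assumed for all other platforms here).
    return usage.ru_maxrss >> 10


print(get_peak_rss_mb())
```

Using the Linux shift on Mac OSX would overstate the peak by a factor of 1024, which is the kind of error the commit describes.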

Differential Revision: http://reviews.llvm.org/D20410

llvm-svn: 270173

11565444

[LibFuzzer] Fix ``NumberOfCpuCores()`` on Mac OSX. · e6ac1fd0

Dan Liew authored May 20, 2016

The ``nproc`` command does not exist under Mac OSX so use
``sysctl`` instead on that platform.

Whilst I'm here

* Use ``pclose()`` instead of ``fclose()``, as the ``popen()``
  documentation says should be used.
* Check for errors that were previously unhandled.
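
A rough Python sketch of the per-platform command choice with error handling. The exact `sysctl` arguments (`-n hw.ncpu`), the Linux command name `nproc`, and the fallback value of 1 are assumptions for illustration; the commit message does not spell them out.

```python
import subprocess
import sys


def number_of_cpu_cores() -> int:
    """Ask the platform's command-line tool for the CPU count."""
    if sys.platform == "darwin":
        cmd = ["sysctl", "-n", "hw.ncpu"]  # assumed sysctl key
    else:
        cmd = ["nproc"]
    try:
        result = subprocess.run(cmd, capture_output=True, text=True, check=True)
        count = int(result.stdout.strip())
    except (OSError, subprocess.CalledProcessError, ValueError):
        # Command missing, non-zero exit status, or unparsable output.
        return 1
    return max(count, 1)


print(number_of_cpu_cores())
```

`subprocess.run` waits for the child and collects its exit status, which is the job `pclose()` does for a stream opened with `popen()`.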

Differential Revision: http://reviews.llvm.org/D20409

llvm-svn: 270172

e6ac1fd0

Add AVRTargetStreamers · 7ec6f560

Dylan McKay authored May 20, 2016

Reviewed by Matt Arsenault in http://reviews.llvm.org/D16311

llvm-svn: 270171

7ec6f560

Re-alphabetize this file list. · b391930b
Richard Smith authored May 20, 2016
```
llvm-svn: 270170
```
b391930b
Revert incorrect module map changes in r269907 and replace them with the · f5c3a63c
Richard Smith authored May 20, 2016
```
appropriate changes.

llvm-svn: 270169
```
f5c3a63c

[RegBankSelect] Refactor the code to split the repairing and mapping of · d84d00ba

Quentin Colombet authored May 20, 2016

an instruction.

Use the previously introduced RepairingPlacement class to split the code
computing the repairing placement from the code doing the actual
placement. That way, we will be able to consider different placements
and then apply only the best one.

llvm-svn: 270168

d84d00ba

[RegBankSelect] Add helper class for repairing code placement. · 55650754

Quentin Colombet authored May 20, 2016

When assigning the register banks we may have to insert repairing code
to move already assigned values across register banks.

Introduce a few helper classes to keep track of what is involved in the
repairing of an operand:
- InsertPoint and its derived classes record the positions, in the CFG,
  where repairing has to be inserted.
- RepairingPlacement holds all the insert points for the repairing of an
  operand plus the kind of action that is required to do the repairing.

This is going to be used to keep track of how the repairing should be
done, while comparing different solutions for an instruction. Indeed, we
will need the repairing placement to capture the cost of a solution and
we do not want to compute it a second time when we do the actual
repairing.

llvm-svn: 270167

55650754

[RegBankSelect] Refactor assignmentMatch to avoid testing the current · 0d77da4e

Quentin Colombet authored May 20, 2016

register bank twice.

Prior to this change, we were checking if the assignment for the current
machine operand was matching, then we would check if the mismatch
required inserting repair code.
We actually already have this information from the first check, so just
pass it along.

NFCI.

llvm-svn: 270166

0d77da4e

Fix pr27728. · 78d947b4

Rafael Espindola authored May 20, 2016

Sorry for the lack of a testcase. There is one in the pr, but it depends on
std::sort and the .ll version is 110 lines, so I don't think it is
worth it.

The bug was that we were sorting after adding a terminator, and the
sorting algorithm could end up putting the terminator in the middle of
the List vector.

With that we would create a Spans map entry keyed on nullptr which would
then be added to CUs and fail in that sorting.

llvm-svn: 270165

78d947b4

Avoid depending on test inputs that aren't in Inputs · a769fd50

Reid Kleckner authored May 20, 2016

Some people have weird CI systems that run each test subdirectory
independently without access to other parallel trees.

Unfortunately, this means we have to suffer some duplication until Art
can sort out how to share these types.

llvm-svn: 270164

a769fd50

[RegBankSelect] Introduce MappingCost helper class. · cfd97b93

Quentin Colombet authored May 20, 2016

This helper class will be used to represent the cost of mapping an
instruction to a specific register bank.
The particularity of these costs is that they are mostly local, thus the
frequency of the basic block is irrelevant. However, for a few
instructions (e.g., phis and terminators), the cost may be non-local and
we then need to account for the frequency of the involved basic blocks.
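
One way to picture such a cost is sketched below in Python. The class layout, field names, and the way the two parts are combined are all hypothetical and are not taken from the LLVM implementation.

```python
from dataclasses import dataclass


@dataclass
class MappingCostSketch:
    """Hypothetical cost of mapping one instruction to a register bank."""

    local_freq: int          # frequency of the instruction's own basic block
    local_cost: int = 0      # cost paid in that block
    non_local_cost: int = 0  # cost paid in other blocks, already scaled

    def add_local_cost(self, cost: int) -> None:
        self.local_cost += cost

    def add_non_local_cost(self, cost: int, block_freq: int) -> None:
        # Non-local work (e.g. repairing on a phi's incoming edge) is
        # weighted by the frequency of the block where it is inserted.
        self.non_local_cost += cost * block_freq

    def total(self) -> int:
        return self.local_cost * self.local_freq + self.non_local_cost

    def __lt__(self, other: "MappingCostSketch") -> bool:
        return self.total() < other.total()


cheap = MappingCostSketch(local_freq=10)
cheap.add_local_cost(3)
hot_edge = MappingCostSketch(local_freq=10)
hot_edge.add_local_cost(1)
hot_edge.add_non_local_cost(1, block_freq=100)
print(cheap < hot_edge)
```

When two candidate mappings have only local costs, `local_freq` is a common factor and drops out of the comparison, which is why the block frequency is irrelevant in the common case.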

This will be used by the greedy mode I am working on.

llvm-svn: 270163

cfd97b93

Some changes to prevent searching down the stack for saved register · 1ebb2c92

Jason Molenda authored May 20, 2016

values for the pc or return address register.

On iOS with arm64 and a binary that has multiple functions without
individual symbol boundaries, we end up with an assembly profile
unwind plan that says lr=<same> - that is, the link register contents
are unmodified from the caller's value.  This sends the unwinder into
a loop.

When we're off the 0th frame, we never want to look to a caller for
a pc or return-address register value.

Add checks to ReadGPRValue and ReadRegister to prevent both the pc
and ra register values from recursing.

If this causes problems with backtraces on Android, let me know or
back it out and I'll look into it -- but I think these are
straightforward and don't expect problems.

<rdar://problem/24610365> 

llvm-svn: 270162

1ebb2c92

Restore ASCIIbetical order. · dcccd929
Richard Smith authored May 20, 2016
```
llvm-svn: 270161
```
dcccd929

[Lexer] Don't merge macro args from different macro files · 95a2a7f2

Vedant Kumar authored May 19, 2016

The lexer sets the end location of macro arguments incorrectly *if*,
while merging consecutive args to fit into a single SLocEntry, it finds
args which come from different macro files.

Fix the issue by using separate SLocEntries in this situation.

This fixes a code coverage crasher (rdar://problem/26181005). Because
the lexer reported end locations for certain macro args incorrectly, we
would generate bogus coverage mappings with negative line offsets.

Reviewed-by: akyrtzi

Differential Revision: http://reviews.llvm.org/D20401

llvm-svn: 270160

95a2a7f2

[obj2yaml] [yaml2obj] Adding a test for r270124 · db373bed
Chris Bieneman authored May 19, 2016
```
This test covers strings after load command structs and zero fill bytes.

llvm-svn: 270159
```
db373bed

[yaml2obj] Removing debug code that scribbled 0xDEADBEEF · 1abf005f

Chris Bieneman authored May 19, 2016

Now that MachO load command fields are fully covered we can fill unaccounted for bytes with 0. That allows us to sparsely specify YAML to simplify tests.

Simplifying load_commands test accordingly.

llvm-svn: 270158

1abf005f

[RuntimeDyld][MachO] Add support for SUBTRACTOR relocations between anonymous · 45bd7ca7
Lang Hames authored May 19, 2016
```
symbols on x86-64.

llvm-svn: 270157
```
45bd7ca7
clang-format. NFC. · 0a78f8c4
Rafael Espindola authored May 19, 2016
```
llvm-svn: 270156
```
0a78f8c4
Add const qualifiers to appease bots; NFC · 23519758
Sanjoy Das authored May 19, 2016
```
llvm-svn: 270155
```
23519758

[analyzer] Fix for PR23790 : constrain return value of strcmp() rather than... · 8a88b908

Anton Yartsev authored May 19, 2016

[analyzer] Fix for PR23790 : constrain return value of strcmp() rather than returning a concrete value.

The function strcmp() can return any value, not just {-1,0,1} : "The strcmp(const char *s1, const char *s2) function returns an integer greater than, equal to, or less than zero, accordingly as the string pointed to by s1 is greater than, equal to, or less than the string pointed to by s2." [C11 7.24.4.2p3]
https://llvm.org/bugs/show_bug.cgi?id=23790
http://reviews.llvm.org/D16317
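
To illustrate why only the sign of the result is meaningful, here is a small Python model of one common implementation strategy (returning the difference of the first mismatching bytes). It is not the C library itself, and `strcmp_model` is a hypothetical name.

```python
def strcmp_model(s1: bytes, s2: bytes) -> int:
    """Model of a strcmp that returns the first byte difference."""
    a = s1 + b"\x00"  # emulate the C string terminator
    b = s2 + b"\x00"
    for x, y in zip(a, b):
        if x != y or x == 0:
            return x - y
    return 0


result = strcmp_model(b"a", b"c")
print(result)          # negative, but not -1 under this model
print(result < 0)      # portable check: compare against zero
print(result == -1)    # non-portable check
```

Code that tests `strcmp(...) == -1` or `== 1` depends on one particular implementation, so an analyzer that models the return value as exactly -1, 0, or 1 can reach wrong conclusions about such code.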

llvm-svn: 270154

8a88b908

Allow -inline-threshold to override default threshold. · bb578ef0

Easwaran Raman authored May 19, 2016

Before r257832, the threshold used by SimpleInliner was explicitly specified or generated from opt levels and passed to the base class Inliner's constructor. There, it was first overridden by explicitly specified -inline-threshold. The refactoring in r257832 did not preserve this behavior for all opt levels. This change brings back the original behavior.

Differential Revision: http://reviews.llvm.org/D20452

llvm-svn: 270153

bb578ef0

Forgotten file from r269992. · ad940f23
Richard Smith authored May 19, 2016
```
llvm-svn: 270152
```
ad940f23

[GuardWidening] Introduce range check merging · f5f0331a

Sanjoy Das authored May 19, 2016

Sequences of range checks expressed using guards, like

  guard((I - 2) u< L)
  guard((I - 1) u< L)
  guard((I + 0) u< L)
  guard((I + 1) u< L)
  guard((I + 2) u< L)

can sometimes be combined into a smaller sequence:

  guard((I - 2) u< L AND (I + 2) u< L)

if we can prove that (I - 2) u< L AND (I + 2) u< L implies all of the
checks expressed in the previous sequence.

This change teaches GuardWidening to do this kind of merging when
feasible.
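
The "if we can prove" condition matters because unsigned arithmetic wraps. The brute-force Python check below uses 8-bit values to keep the search small; the bit width and the bound on L are choices made for this illustration, not details taken from the patch.

```python
BITS = 8
MASK = (1 << BITS) - 1
OFFSETS = (-2, -1, 0, 1, 2)


def check(i: int, offset: int, length: int) -> bool:
    """Model of guard((I + offset) u< L) with wrapping arithmetic."""
    return ((i + offset) & MASK) < length


def merged_implies_all(max_length: int) -> bool:
    """True if the two outer checks imply all five for every L <= max_length."""
    for length in range(max_length + 1):
        for i in range(1 << BITS):
            merged = check(i, -2, length) and check(i, 2, length)
            if merged and not all(check(i, off, length) for off in OFFSETS):
                return False
    return True


print(merged_implies_all(127))  # L restricted to the non-negative signed range
print(merged_implies_all(255))  # L unrestricted
```

With L unrestricted, I = 0 and L = 255 passes both outer checks (254 u< 255 and 2 u< 255) while (I - 1) u< L fails (255 u< 255 is false), so the merge is only valid when something is known about the range of L.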

llvm-svn: 270151

f5f0331a

[CUDA] Implement __ldg using intrinsics. · 2e4ecfde

Justin Lebar authored May 19, 2016

Summary:
Previously it was implemented as inline asm in the CUDA headers.

This change allows us to use the [addr+imm] addressing mode when
executing ld.global.nc instructions.  This translates into a 1.3x
speedup on some benchmarks that call this instruction from within an
unrolled loop.

Reviewers: tra, rsmith

Subscribers: jhen, cfe-commits, jholewinski

Differential Revision: http://reviews.llvm.org/D19990

llvm-svn: 270150

2e4ecfde

Reapply r263460: [SpillPlacement] Fix a quadratic behavior in spill placement. · b926bdac

Quentin Colombet authored May 19, 2016

Using Chandler's words from r265331:
This commit was greatly exacerbating PR17409 and effectively regressed
build time for a lot of (very large) code when compiled with ASan or MSan.

PR17409 is fixed by r269249, so it is fine to reapply r263460.

Original commit message:
The bad behavior happens when we have a function with a long linear
chain of basic blocks, and have a live range spanning most of this
chain, but with very few uses.

Let's say we have only 2 uses.

The Hopfield network is only seeded with two active blocks where the
uses are, and each iteration of the outer loop in
`RAGreedy::growRegion()` only adds two new nodes to the network due to
the completely linear shape of the CFG.  Meanwhile,
`SpillPlacer->iterate()` visits the whole set of discovered nodes, which
adds up to a quadratic algorithm.

This is the effect of a historical accident from r129188.

When the Hopfield network is expanding, most of the action is happening
on the frontier where new nodes are being added. The internal nodes in
the network are not likely to be flip-flopping much, or they will at
least settle down very quickly. This means that while
`SpillPlacer->iterate()` is recomputing all the nodes in the network, it
is probably only the two frontier nodes that are changing their output.

Instead of recomputing the whole network on each iteration, we can
maintain a SparseSet of nodes that need to be updated:

- `SpillPlacement::activate()` adds the node to the todo list.
- When a node changes value (i.e., `update()` returns true), its
  neighbors are added to the todo list.
- `SpillPlacement::iterate()` only updates the nodes in the list.
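
The todo-list scheme can be sketched on a toy network in Python. The class below is a simplified model, assuming each node takes the sign of its bias plus its neighbors' values; the names, the update rule, and the iteration limit are illustrative and do not mirror the real SpillPlacement code.

```python
from collections import deque


class HopfieldSketch:
    """Toy network updated through a worklist instead of full sweeps."""

    def __init__(self, num_nodes, edges):
        self.bias = [0] * num_nodes
        self.value = [0] * num_nodes
        self.neighbors = [[] for _ in range(num_nodes)]
        for a, b in edges:
            self.neighbors[a].append(b)
            self.neighbors[b].append(a)
        self.todo = deque()
        self.queued = set()  # avoids duplicate entries in the todo list

    def _enqueue(self, node):
        if node not in self.queued:
            self.queued.add(node)
            self.todo.append(node)

    def activate(self, node, bias):
        self.bias[node] = bias
        self._enqueue(node)

    def update(self, node):
        """Recompute one node; return True if its value changed."""
        total = self.bias[node] + sum(self.value[n] for n in self.neighbors[node])
        new_value = (total > 0) - (total < 0)
        changed = new_value != self.value[node]
        self.value[node] = new_value
        return changed

    def iterate(self, limit=1000):
        """Update only queued nodes; a change re-queues the neighbors."""
        while self.todo and limit > 0:
            limit -= 1
            node = self.todo.popleft()
            self.queued.discard(node)
            if self.update(node):
                for n in self.neighbors[node]:
                    self._enqueue(n)


# A linear chain of six nodes, seeded at one end.
chain = HopfieldSketch(6, [(i, i + 1) for i in range(5)])
chain.activate(0, 1)
chain.iterate()
print(chain.value)
```

Each change touches only the node's immediate neighbors, so on a linear chain the work per step stays bounded instead of growing with the number of nodes discovered so far.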

The result of Hopfield iterations is not necessarily exact. It should
converge to a local minimum, but there is no guarantee that it will find
a global minimum. It is possible that updating nodes in a different
order will cause us to switch to a different local minimum. In other
words, this is not NFC, but although I saw a few runtime improvements
and regressions when I benchmarked this change, those were side effects
and actually the performance change is in the noise as expected.

Huge thanks to Jakob Stoklund Olesen <stoklund@2pi.dk> for his
feedback, guidance and time for the review.

llvm-svn: 270149

b926bdac

Remove an extra assignment to a variable that should have been deleted. · d9e02c4f

Jim Ingham authored May 19, 2016

Also fix up the formatting a bit; it looks like something was inserting
actual tabs.  Replace with 4 spaces.

llvm-svn: 270148

d9e02c4f

Record a TargetMachine instead of a Reloc::Model. · ab03eb00
Rafael Espindola authored May 19, 2016
```
Addresses r270095's code review.

llvm-svn: 270147
```
ab03eb00

[LibFuzzer] · 3868e468

Dan Liew authored May 19, 2016

Work around crashes in ``__sanitizer_malloc_hook()`` under Mac OSX.

Under Mac OSX we intercept calls to malloc before thread local
storage is initialised, leading to a crash when accessing
``AllocTracer``. To work around this, ``AllocTracer`` is only accessed
in the hook under Linux. For symmetry ``__sanitizer_free_hook()``
is also modified in the same way.

To support this change a set of new macros
LIBFUZZER_LINUX and LIBFUZZER_APPLE has been defined which can be
used to check the target being compiled for.

Differential Revision: http://reviews.llvm.org/D20402

llvm-svn: 270145

3868e468

May 19, 2016

[Sema] Fix use after move. Found by ubsan. · 97d7a662
Benjamin Kramer authored May 19, 2016
```
llvm-svn: 270144
```
97d7a662

Remove specializations of ProfileSummary · 7cefdb81

Easwaran Raman authored May 19, 2016

This removes the subclasses of ProfileSummary and moves the members of the derived classes to the base class.

Differential Revision: http://reviews.llvm.org/D20390

llvm-svn: 270143

7cefdb81

[ARM, AArch64] Match additional patterns to ldN instructions · 476c0afc

Matthew Simpson authored May 19, 2016

When matching an interleaved load to an ldN pattern, the interleaved access
pass checks that all users of the load are shuffles. If the load is used by an
instruction other than a shuffle, the pass gives up and an ldN is not
generated. This patch considers users of the load that are extractelement
instructions. It attempts to modify the extracts to use one of the available
shuffles rather than the load. After the transformation, the load is only used
by shuffles and will then be matched with an ldN pattern.

Differential Revision: http://reviews.llvm.org/D20250

llvm-svn: 270142

476c0afc

[profile] entry eviction support in value profiler · 5f153e68
Xinliang David Li authored May 19, 2016
```
Differential revision: http://reviews.llvm.org/D20408

llvm-svn: 270141
```
5f153e68
AMDGPU: Remove pointless conversions · 4e3d383c
Matt Arsenault authored May 19, 2016
```
llvm-svn: 270139
```
4e3d383c
[WebAssembly] Simplify code that never has to handle physical registers. NFC. · 847afa22
Dan Gohman authored May 19, 2016
```
llvm-svn: 270137
```
847afa22

Move ProfileSummary to IR. · e5a17e3f

Easwaran Raman authored May 19, 2016

This splits ProfileSummary into two classes: a ProfileSummary class that has methods to convert from/to metadata and a ProfileSummaryBuilder class, which is in ProfileData, that computes the profile summary.

Differential Revision: http://reviews.llvm.org/D20314

llvm-svn: 270136

e5a17e3f

[InstCombine] Avoid combining the bitcast of a var that is used as both... · b1d37199

Guozhi Wei authored May 19, 2016

[InstCombine] Avoid combining the bitcast of a var that is used as both address and result of load instructions

This patch fixes https://llvm.org/bugs/show_bug.cgi?id=27703.

If there is a sequence of one or more load instructions in which each loaded value is used as the address of a later load instruction, the bitcast is necessary to change the value type, so don't optimize it.

llvm-svn: 270135

b1d37199

comment out line that is causing UBSAN bot failures · cfe75fa7
Sanjay Patel authored May 19, 2016
```
Patch is awaiting review here:
http://reviews.llvm.org/D20434

llvm-svn: 270128
```
cfe75fa7

[obj2yaml] [yaml2obj] Support for MachO Load Command data · 9f243e9a

Chris Bieneman authored May 19, 2016

This re-applies r270115.

Many of the MachO load commands can have data appended after the command structure. This data is frequently strings, but can actually be anything. This patch adds support for three optional fields on load command yaml descriptions.

The new PayloadString YAML field is populated with the data after load commands known to have strings as extra data.

The new ZeroPadBytes YAML field is a count of zero'd bytes after the end of the load command structure before the next command. This can apply anywhere in the file. MachO2YAML verifies that bytes are zero before populating this field, and YAML2MachO will add zero'd bytes.

The new PayloadBytes YAML field stores all bytes after the end of the load command structure before the next command if they are non-zero. This is a catch-all for all unhandled bytes. If MachO2Yaml populates PayloadBytes it will not populate ZeroPadBytes; instead, zero'd bytes will be in the PayloadBytes structure.
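
The choice between ZeroPadBytes and PayloadBytes can be sketched in Python as below. The function name and the dictionary representation are hypothetical, and the PayloadString case for commands known to carry strings is deliberately left out.

```python
def describe_trailing_bytes(data: bytes) -> dict:
    """Classify the bytes between a load command struct and the next command."""
    if not data:
        return {}
    if all(byte == 0 for byte in data):
        # Only zeros: record how many, not the bytes themselves.
        return {"ZeroPadBytes": len(data)}
    # Anything non-zero: keep every byte, including embedded zeros.
    return {"PayloadBytes": list(data)}


print(describe_trailing_bytes(b"\x00\x00\x00"))
print(describe_trailing_bytes(b"\x01\x00"))
```

Keeping the two fields mutually exclusive means a round trip through YAML reproduces the original bytes without having to track where padding ends and payload begins.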

llvm-svn: 270124

9f243e9a

Revert "[obj2yaml] [yaml2obj] Support for MachO Load Command data" · f605d10a
Chris Bieneman authored May 19, 2016
```
This reverts commit r270115.

This failed on several builders using GCC.

llvm-svn: 270121
```
f605d10a