Commits · 88f4919a0c2a5ca3a70aa8f5b5fe58808753ae82 · Roger Ferrer / llvm-epi

Nov 28, 2015

Keno Fischer authored Nov 28, 2015

This is the autoconf analog of r251201. I realize autoconf is
deprecated, but while it's in tree, it should at least be kept working.

Also add the deprecation message to configure.ac such that AutoRegen
actually picks it up.

llvm-svn: 254215

88f4919a

Pass .ll directly to llvm-link. · 5aafbac0
Rafael Espindola authored Nov 27, 2015
```
llvm-svn: 254214
```
5aafbac0
Pass .ll directly to llvm-link · 57e61231
Rafael Espindola authored Nov 27, 2015
```
llvm-svn: 254213
```
57e61231

SamplePGO - Add initial support for inliner annotations. · 84f06cc8

Diego Novillo authored Nov 27, 2015

This adds two thresholds to the sample profiler to affect inlining
decisions: the concept of global hotness and coldness.

Functions that have accumulated more than a certain fraction of samples at
runtime are annotated with the InlineHint attribute. Conversely,
functions that accumulate less than a certain fraction of samples are
annotated with the Cold attribute.

This is very similar to the hints emitted by Clang when using
instrumentation profiles.

Notice that this is a very blunt instrument. A function may have
globally collected a significant fraction of samples, but that does not
necessarily mean that every callsite for that function is hot.

Ideally, we would annotate each callsite with the samples collected at
that callsite. This way, the inliner can incorporate all these weights
into its cost model.

Once the inliner offers this functionality, we can change the hints
emitted here to a more precise per-callsite annotation. For now, this is
providing some measure of speedups with our internal benchmarks. I've
observed speedups of up to 23% (though the geo mean is about 3%). I expect
these numbers to improve as the inliner gets better annotations.

llvm-svn: 254212

84f06cc8
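The global hot/cold classification described in 84f06cc8 can be sketched roughly as follows. This is a minimal Python illustration, not the LLVM implementation: the function name `classify_functions` and both threshold values are assumptions made up for the example. Only the attribute names `InlineHint` and `Cold` come from the commit message.

```python
def classify_functions(samples, hot_fraction=0.1, cold_fraction=0.01):
    """Map each function name to "InlineHint", "Cold", or None.

    samples: dict of function name -> samples accumulated at runtime.
    hot_fraction / cold_fraction: hypothetical thresholds, expressed as a
    fraction of all samples in the profile (not the values LLVM uses).
    """
    total = sum(samples.values())
    if total == 0:
        # Without any samples there is nothing to base a hint on.
        return {name: None for name in samples}

    annotations = {}
    for name, count in samples.items():
        fraction = count / total
        if fraction > hot_fraction:
            annotations[name] = "InlineHint"   # globally hot
        elif fraction < cold_fraction:
            annotations[name] = "Cold"         # globally cold
        else:
            annotations[name] = None           # no annotation
    return annotations
```

The sketch annotates whole functions, which is exactly the bluntness the commit message points out: a function that is hot overall still gets a single hint for all of its callsites.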

SamplePGO - Fix default threshold for hot callsites. · b5792408

Diego Novillo authored Nov 27, 2015

Based on testing of internal benchmarks, I'm lowering this threshold to
a value of 0.1%.  This means that SamplePGO will respect 99.9% of the
original inline decisions when following a profile.

The performance difference is noticeable in some tests. With the
previous threshold, the speedup over baseline -O2 was about 0.63%. With
the new default, the speedups are around 3% on average.

The point of this threshold is not to do more aggressive inlining. When
an inlined callsite crosses this threshold, SamplePGO will redo the
inline decision so that it can better apply the input profile.

By respecting most original inline decisions, we can apply more of the
input profile because the shape of the code follows the profile more
closely.

In the next series, I'll be looking at adding some inline hints for the
cold callsites and for toplevel functions that are hot/cold as well.

llvm-svn: 254211

b5792408

Modernize the test a bit · 138f8956
Rafael Espindola authored Nov 27, 2015
```
Remove out of date comment.
Pass .ll files to llvm-link.

llvm-svn: 254210
```
138f8956

Nov 27, 2015

Simplify the linking of recursive data. · 19b52383

Rafael Espindola authored Nov 27, 2015

Now the ValueMapper has two callbacks. The first one maps the
declaration. The ValueMapper records the mapping and then materializes
the body/initializer.

llvm-svn: 254209

19b52383

Follow-up fix for r254201 · f01a59f9
Artyom Skrobov authored Nov 27, 2015
```
llvm-svn: 254202
```
f01a59f9

[ARM] Generate ABI_optimization_goals build attribute, as described in the ARM ARM. · b955b905

Artyom Skrobov authored Nov 27, 2015

Summary:
Since this build attribute corresponds to a whole module, and
different functions in a module may differ in the optimizations
enabled for them, this attribute is emitted after all functions,
and only in the case that the optimization goals for all
functions match.

Reviewers: logan, hans

Subscribers: aemerson, rengolin, llvm-commits

Differential Revision: http://reviews.llvm.org/D14934

llvm-svn: 254201

b955b905
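The rule stated in b955b905 (emit the module-level attribute only when every function agrees) can be illustrated with a small Python sketch. The function name `module_optimization_goal` and the use of plain integers for goals are assumptions for illustration. The real logic lives in the ARM backend and is not reproduced here.

```python
def module_optimization_goal(function_goals):
    """Return the goal shared by all functions, or None if there is no
    single shared goal (functions disagree, or the module is empty).

    function_goals: iterable of per-function optimization goals
    (hypothetical integer codes).
    """
    goals = set(function_goals)
    if len(goals) == 1:
        return goals.pop()
    return None  # caller would skip emitting the build attribute
```

Because the answer depends on every function in the module, it can only be computed once all functions have been seen, which matches the commit's note that the attribute is emitted after all functions.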

[AArch64] Add ARMv8.2-A FP16 scalar instructions · b25914e0

Oliver Stannard authored Nov 27, 2015

ARMv8.2-A adds 16-bit floating point versions of all existing VFP
floating-point instructions. This is an optional extension, so all of
these instructions require the FeatureFullFP16 subtarget feature.

Most of these instructions are the same as the 32- and 64-bit versions,
but with the type field (bits 23-22) set to 0b11. Previously the top bit
of the size field was always 0, so the instruction classes only provided
a 1-bit size field, which I have widened to 2 bits.

Differential Revision: http://reviews.llvm.org/D15014

llvm-svn: 254198

b25914e0
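The encoding detail in b25914e0 (a two-bit type field in bits 23-22, set to 0b11 for the 16-bit variants) amounts to simple bit manipulation. The Python sketch below is illustrative only: the helper names and the sample word `0x1E222820` are assumptions, and the word is used purely as a 32-bit value whose type field is 0b00.

```python
TYPE_SHIFT = 22                    # type field occupies bits 23-22
TYPE_MASK = 0b11 << TYPE_SHIFT


def type_field(word):
    """Extract the 2-bit type field from a 32-bit instruction word."""
    return (word & TYPE_MASK) >> TYPE_SHIFT


def with_type_field(word, type_bits):
    """Return `word` with bits 23-22 replaced by `type_bits`."""
    if not 0 <= type_bits <= 0b11:
        raise ValueError("type field is only 2 bits wide")
    cleared = word & ~TYPE_MASK & 0xFFFFFFFF
    return cleared | (type_bits << TYPE_SHIFT)


sample = 0x1E222820                # hypothetical word with type == 0b00
half = with_type_field(sample, 0b11)
```

A 1-bit field can only express the values 0b00 and 0b01 here, which is why the commit had to widen the field to 2 bits before 0b11 could be encoded.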

[sanitizer] [dfsan] Unify aarch64 mapping · d93c0c4d

Adhemerval Zanella authored Nov 27, 2015

This patch changes the DFSan instrumentation for aarch64: instead of
using the fixed application mask defined by SANITIZER_AARCH64_VMA, it
reads the application shadow mask value from compiler-rt. The value
is initialized based on runtime VMA detection.

Along with this patch a compiler-rt one will also be added to export
the shadow mask variable.

llvm-svn: 254196

d93c0c4d

[SimplifyLibCalls] Use range-based loop. NFC. · ac0953a2
Davide Italiano authored Nov 27, 2015
```
llvm-svn: 254193
```
ac0953a2

[TableGen] Sort pattern predicates before concatenating into a string so that... · 8985efe5

Craig Topper authored Nov 27, 2015

[TableGen] Sort pattern predicates before concatenating into a string so that different orders of the same set will produce the same string. This can reduce the number of unique predicates in the isel tables. NFC

llvm-svn: 254192

8985efe5
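The effect described in 8985efe5 is easy to see in a small Python sketch. The `" && "` join format and the function names are assumptions for illustration, not TableGen's actual output. The predicate names are borrowed from the X86 commit that follows.

```python
def unsorted_key(predicates):
    """Concatenate predicates in whatever order they were given."""
    return " && ".join(f"({p})" for p in predicates)


def sorted_key(predicates):
    """Sort first, so any ordering of the same set yields one string."""
    return " && ".join(f"({p})" for p in sorted(predicates))


patterns = [
    ["HasAVX512", "NoVLX"],
    ["NoVLX", "HasAVX512"],   # same set, different order
]

unique_unsorted = {unsorted_key(p) for p in patterns}
unique_sorted = {sorted_key(p) for p in patterns}
```

Here `unique_unsorted` holds two distinct strings while `unique_sorted` holds one, which is how sorting can shrink the number of unique predicate checks.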

[X86] Pair a NoVLX with HasAVX512 to match the others and remove a unique... · e38c57a4

Craig Topper authored Nov 27, 2015

[X86] Pair a NoVLX with HasAVX512 to match the others and remove a unique predicate check in the isel tables. NFC

llvm-svn: 254191

e38c57a4

test: bail early if tool_path is None · 522eb9c5
Andrew Wilkins authored Nov 27, 2015
```
tool_path will be None for llvm-go if Go cannot be found

llvm-svn: 254190
```
522eb9c5
test: check if go_executable is set · 572fe6e9
Andrew Wilkins authored Nov 27, 2015
```
llvm-svn: 254189
```
572fe6e9

Use $GO_EXECUTABLE in Go-based lit tests · caa3b51a

Andrew Wilkins authored Nov 27, 2015

Summary:
When running tests, pass the GO_EXECUTABLE CMake
cache variable to llvm-go. The "go" binary may
not be in $PATH, or may be different to the one
passed to CMake.

Reviewers: pcc

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D14041

llvm-svn: 254187

caa3b51a

Test both input file orders. · 8e8183b8
Rafael Espindola authored Nov 27, 2015
```
llvm-svn: 254186
```
8e8183b8
Add missing file. · 60b57863
Rafael Espindola authored Nov 27, 2015
```
llvm-svn: 254185
```
60b57863
Make the test a bit more interesting. · 1d3465f6
Rafael Espindola authored Nov 27, 2015
```
It now covers a regular function replacing an available_externally one.

llvm-svn: 254184
```
1d3465f6

MC: Simplify handling of temporary symbols in COFF writer. · 8359a6a8

Peter Collingbourne authored Nov 26, 2015

The COFF object writer was previously adding unnecessary symbols to its
temporary data structures and cleaning them up later. This made the code
harder to understand and caused a bug (aliases classed as temporary symbols
would cause an assertion failure). A much simpler way of handling such
symbols is to ask the layout for their section-relative position when needed.

Tested with a bootstrap on Windows and by building Chrome.

Differential Revision: http://reviews.llvm.org/D14975

llvm-svn: 254183

8359a6a8

Nov 26, 2015

[X86][FMA] Begun adding AVX512 FMA tests · 1d881ae2
Simon Pilgrim authored Nov 26, 2015
```
As discussed on D14909

llvm-svn: 254180
```
1d881ae2

[LoopVectorize] Use MapVector rather than DenseMap for MinBWs. · 54336a5a

Charlie Turner authored Nov 26, 2015

The order in which instructions are truncated in truncateToMinimalBitwidths
affects code generation. Switch to a map with a deterministic order, since the
iteration order over a DenseMap is not defined.

This code is not hot, so the difference in container performance isn't
interesting.

Many thanks to David Blaikie for making me aware of MapVector!

Fixes PR25490.

Differential Revision: http://reviews.llvm.org/D14981

llvm-svn: 254179

54336a5a
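The idea behind the container switch in 54336a5a can be sketched in Python as a vector of entries plus a key-to-index map, so that iteration always follows insertion order. This is a simplified illustration of the technique, not the `llvm::MapVector` API, and the key names are made up. Python's built-in `dict` already iterates in insertion order, so the class exists only to show the structure.

```python
class MapVector:
    """Insertion-ordered map: a list of [key, value] entries plus a
    dict mapping each key to its position in that list."""

    def __init__(self):
        self._entries = []   # preserves insertion order
        self._index = {}     # key -> position in _entries

    def __setitem__(self, key, value):
        if key in self._index:
            # Existing key: update in place, keep its original position.
            self._entries[self._index[key]][1] = value
        else:
            self._index[key] = len(self._entries)
            self._entries.append([key, value])

    def __getitem__(self, key):
        return self._entries[self._index[key]][1]

    def __iter__(self):
        # Deterministic: always yields entries in insertion order.
        return ((key, value) for key, value in self._entries)


min_bws = MapVector()
min_bws["inst_b"] = 16
min_bws["inst_a"] = 8
min_bws["inst_b"] = 32       # update does not move the entry
```

Lookups cost one extra indirection compared with a plain hash map, which is the container overhead the commit message considers uninteresting for code that is not hot.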

[X86] Now that X86VPermt2 is used in all the avx512_perm_t_sizes just hardcode... · a47576f2

Craig Topper authored Nov 26, 2015

[X86] Now that X86VPermt2 is used in all the avx512_perm_t_sizes just hardcode it into the patterns instead of passing as an argument. NFC

llvm-svn: 254177

a47576f2

[X86] Merge X86VPermt2Fp and X86VPermt2Int back together by weakening them... · 05858f52

Craig Topper authored Nov 26, 2015

[X86] Merge X86VPermt2Fp and X86VPermt2Int back together by weakening them just enough. The SDTCisSameSizeAs introduced in r254138 helps here.

llvm-svn: 254176

05858f52

Add a few passing lto tests. · b38f7b5a

Rafael Espindola authored Nov 26, 2015

I found these while trying to get a prototype to bootstrap.

They cover things like
* Handling of non linker visible stuff (append, available_externally)
* Type merging
* Alias to dropped globals
* Dropping linkage when converting to a declaration.

These should hopefully be generally useful for anyone refactoring the
plugin.

llvm-svn: 254174

b38f7b5a

[X86] Split ISD node for Vfpclass and Vfpclasss so that we can write strong... · 00096563

Craig Topper authored Nov 26, 2015

[X86] Split ISD node for Vfpclass and Vfpclasss so that we can write strong type constraints for each that don't cause ambiguous isel.

llvm-svn: 254172

00096563

[bugpoint] Fix "Alias must point to a definition" problems · 28ad2b47

Hal Finkel authored Nov 26, 2015

GlobalAliases may reference function definitions, but not function declarations.

bugpoint would sometimes create invalid IR by deleting a function's body (thus
mutating a function definition into a declaration) without first 'fixing' any
GlobalAliases that reference that function definition.

This change iteratively prevents that issue. Before deleting a function's body,
it scans the module for GlobalAliases which reference that function. When
found, it eliminates them using replaceAllUsesWith.

Fixes PR20788.

Patch by Nick Johnson!

llvm-svn: 254171

28ad2b47
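The fix in 28ad2b47 can be modelled with a toy module in Python. Everything here is an assumption for illustration: the dictionary layout, the function name `delete_function_body`, and the treatment of alias chains. Rewriting entries in the `uses` list stands in for `replaceAllUsesWith`, and the code is not bugpoint's.

```python
def delete_function_body(module, name):
    """Turn function `name` into a declaration without leaving any alias
    pointing at it.

    module: {"functions": {name: body or None},
             "aliases":   {alias name: target name},
             "uses":      [symbol names referenced elsewhere]}
    """
    while True:
        doomed = [a for a, target in module["aliases"].items()
                  if target == name]
        if not doomed:
            break
        for alias in doomed:
            # Analogue of replaceAllUsesWith: every use of the alias
            # now refers to the function directly.
            module["uses"] = [name if use == alias else use
                              for use in module["uses"]]
            for other, target in list(module["aliases"].items()):
                if target == alias:
                    module["aliases"][other] = name
            del module["aliases"][alias]
    module["functions"][name] = None   # body removed: now a declaration


module = {
    "functions": {"f": "body", "g": "body"},
    "aliases": {"a1": "f", "a2": "a1", "b": "g"},
    "uses": ["a1", "a2", "b", "f"],
}
delete_function_body(module, "f")
```

The loop repeats because removing one alias can expose another that now points at the function, which is one reading of the "iteratively" in the commit message.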

Disallow aliases to available_externally. · 89345771

Rafael Espindola authored Nov 26, 2015

They are as much trouble as aliases to declarations. They require
the code generator to define a symbol with the same value as another
symbol, but the second symbol is undefined.

If representing this is important for some optimization, we could add
support for available_externally aliases. They would be *required* to
point to a declaration (or available_externally definition).

llvm-svn: 254170

89345771

[X86] Revert part of r254167 to recover bots. · ff2f1473
Craig Topper authored Nov 26, 2015
```
llvm-svn: 254169
```
ff2f1473
[Hexagon] Lowering of V60/HVX vector types · 08ff8883
Krzysztof Parzyszek authored Nov 26, 2015
```
llvm-svn: 254168
```
08ff8883
[X86] Strengthen more type constraints to reduce isel table size. · 9d1deb4b
Craig Topper authored Nov 26, 2015
```
llvm-svn: 254167
```
9d1deb4b
[Hexagon] Hexagon V60 HVX intrinsic definitions · 4eb6d4d1
Krzysztof Parzyszek authored Nov 26, 2015
```
Author: Ron Lieberman <ronl@codeaurora.org>
llvm-svn: 254165
```
4eb6d4d1

[mips][ias] Range check uimm5 operands and fix several bugs this revealed. · daa4b6fb

Daniel Sanders authored Nov 26, 2015

Summary:
The bugs were:
* append, prepend, and balign were not tested
* balign takes a uimm2 not a uimm5.
* drotr32 was correctly implemented with a uimm5 but the tests expected
  '52' to be valid.
* li/la were implemented with a uimm5 instead of simm32. simm32 isn't
  completely correct either but I'll fix that when I get to simm32.

Notable omissions are some of the shift instructions. Several of these
have been implemented using a single uimm6 instruction (rather than two
uimm5 instructions and a CodeGen-only uimm6 pseudo). These will be updated
in the uimm6 patch.

Reviewers: vkalintiris

Subscribers: llvm-commits, dsanders

Differential Revision: http://reviews.llvm.org/D14712

llvm-svn: 254164

daa4b6fb
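The range checks behind daa4b6fb boil down to testing whether a value fits in an unsigned N-bit immediate. The Python helper below is a sketch with an assumed name (`is_uimm`). It is not the MIPS assembler's code, which expresses these checks through operand classes.

```python
def is_uimm(value, bits):
    """True if `value` fits in an unsigned immediate of `bits` bits."""
    return 0 <= value < (1 << bits)


# uimm5 accepts 0..31, so the '52' the old drotr32 tests expected is rejected.
uimm5_accepts_31 = is_uimm(31, 5)
uimm5_accepts_52 = is_uimm(52, 5)

# balign takes a uimm2, so only 0..3 are in range.
uimm2_accepts_3 = is_uimm(3, 2)
uimm2_accepts_4 = is_uimm(4, 2)
```

Checking balign against uimm5 instead of uimm2 would have let values 4 through 31 through, which is the kind of bug the stricter checks exposed.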

[AArch64] Add ARMv8.2-A new AT instruction variants · 64c167db

Oliver Stannard authored Nov 26, 2015

ARMv8.2-A adds new variants of the "at" (address translate) system
instruction, which take the PSTATE.PAN bit (added in ARMv8.1-A). These
are a required part of ARMv8.2-A, so no additional subtarget features
are required.

Differential Revision: http://reviews.llvm.org/D15018

llvm-svn: 254159

64c167db

ARM: address WOA unsigned division overflow crash · d1229248

Martell Malone authored Nov 26, 2015

Building on r253865 the crash is not limited to signed overflows.

Disable custom handling of unsigned 32-bit and 64-bit integer divide.
Add test cases for both 32-bit and 64-bit unsigned integer overflow.

llvm-svn: 254158

d1229248

[AArch64] Add ARMv8.2-A UAO PSTATE bit · 911ea20f

Oliver Stannard authored Nov 26, 2015

ARMv8.2-A adds a new PSTATE bit, PSTATE.UAO, which allows the LDTR/STTR
instructions to behave the same as LDR/STR with respect to execute-only
pages at higher privilege levels. New variants of the MSR/MRS
instructions are added to allow reading and writing this bit. It is a
required part of ARMv8.2-A, so no additional subtarget features are
required.

Differential Revision: http://reviews.llvm.org/D15020

llvm-svn: 254157

911ea20f

[AArch64] Add ARMv8.2-A persistent memory instruction · 1a81cc9f

Oliver Stannard authored Nov 26, 2015

ARMv8.2-A adds the "dc cvap" instruction, which is a system instruction
that cleans caches to the point of persistence (for systems that have
persistent memory). It is a required part of ARMv8.2-A, so no additional
subtarget features are required.

Differential Revision: http://reviews.llvm.org/D15016

llvm-svn: 254156

1a81cc9f

[AArch64] Add ARMv8.2-A ID_A64MMFR2_EL1 register · 48b43741

Oliver Stannard authored Nov 26, 2015

ARMv8.2-A adds a new ID register, ID_A64MMFR2_EL1, which behaves in the
same way as ID_A64MMFR0_EL1 and ID_A64MMFR1_EL1. It is a required part
of ARMv8.2-A, so no additional subtarget features are required.

Differential Revision: http://reviews.llvm.org/D15017

llvm-svn: 254155

48b43741

[AArch64] Add subtarget features for ARMv8.2-A · 7cc0c4e6

Oliver Stannard authored Nov 26, 2015

This adds subtarget features for ARMv8.2-A, which builds on (and
requires the features from) ARMv8.1-A. Most assembler-visible features
of ARMv8.2-A are system instructions, and are all required parts of the
architecture, so just depend on the HasV8_2aOps subtarget feature. There
is also one large, optional feature, which adds 16-bit floating point
versions of all existing floating-point instructions (VFP and SIMD);
this is represented by the FeatureFullFP16 subtarget feature.

Differential Revision: http://reviews.llvm.org/D15013

llvm-svn: 254154

7cc0c4e6