Commits · 8e67000f1a0a9a03b6f8a002a01d7c3fd7b945b6 · Roger Ferrer / llvm-epi

Nov 28, 2016

Always create a PT_ARM_EXIDX if needed. · 8e67000f

Rafael Espindola authored Nov 28, 2016

Unfortunately PT_ARM_EXIDX is special. There is no way to create it
from linker scripts, so we have to create it even if PHDRS is used.

This matches bfd and is required for the lld output to survive bfd's strip.

llvm-svn: 288012

8e67000f

Nov 27, 2016

[X86][FMA4] Remove isCommutable from FMA4 scalar intrinsics. They aren't... · 17786f77

Craig Topper authored Nov 27, 2016

[X86][FMA4] Remove isCommutable from FMA4 scalar intrinsics. They aren't commutable as operand 0 should pass its upper bits through to the output.

llvm-svn: 288011

17786f77

[X86][FMA] Add missing Predicates qualifier around scalar FMA intrinsic patterns. · 13b27a27

Craig Topper authored Nov 27, 2016

llvm-svn: 288010

13b27a27

[X86][FMA4] Add load folding support for FMA4 scalar intrinsic instructions. · ff9d4587

Craig Topper authored Nov 27, 2016

llvm-svn: 288009

ff9d4587

[X86][FMA4] Add test cases to demonstrate missed folding opportunities for FMA4 scalar intrinsics. · b00872b9

Craig Topper authored Nov 27, 2016

llvm-svn: 288008

b00872b9

[X86] Add SHL by 1 to the load folding tables. · 3674f44e

Craig Topper authored Nov 27, 2016

I don't think isel selects these today, favoring adding the register to itself instead. But the load folding tables shouldn't be so concerned with what isel will use and just represent the relationships.

llvm-svn: 288007

3674f44e

[X86][SSE] Add support for combining target shuffles to 128/256-bit PSLL/PSRL bit shifts · 91d6f5fb

Simon Pilgrim authored Nov 27, 2016

llvm-svn: 288006

91d6f5fb

[InstSimplify] allow integer vector types to use computeKnownBits · 8ca30ab0

Sanjay Patel authored Nov 27, 2016

Note that the non-splat lshr+lshr test folded, but that does not
work in general. Something is missing or wrong in computeKnownBits
as the non-splat shl+shl test still shows.

llvm-svn: 288005

8ca30ab0

[AVX-512] Add integer and fp unpck instructions to load folding tables. · 4fab4872

Craig Topper authored Nov 27, 2016

llvm-svn: 288004

4fab4872

[X86][SSE] Split lowerVectorShuffleAsShift ready for combines. NFCI. · cdb2ce66

Simon Pilgrim authored Nov 27, 2016

Moved most of matching code into matchVectorShuffleAsShift to share with target shuffle combines (in a future commit).

llvm-svn: 288003

cdb2ce66

Add parallel_for and use it where appropriate. · 1dd86a66

Rui Ueyama authored Nov 27, 2016

When we iterate over numbers as opposed to iterable elements,
parallel_for fits better than parallel_for_each.

llvm-svn: 288002

1dd86a66
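
The distinction drawn in 1dd86a66 can be sketched as follows. This is a minimal illustration only: `parallelForEach`, `parallelFor`, `squaresByIndex` and `doubled` are hypothetical names, not lld's implementation, and the helpers run serially here so the sketch has no threading dependency. Only the difference in call shape (element callback versus index callback) is the point.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-ins, NOT lld's code. A real version would hand chunks of
// the range to worker threads; these run serially to stay dependency-free.
template <class Iter, class Fn>
void parallelForEach(Iter begin, Iter end, Fn fn) {
  for (; begin != end; ++begin)
    fn(*begin); // the callback receives an element
}

template <class Fn>
void parallelFor(std::size_t begin, std::size_t end, Fn fn) {
  assert(begin <= end && "index range must not be reversed");
  for (std::size_t i = begin; i < end; ++i)
    fn(i); // the callback receives an index
}

// Index-based work: the loop variable is a number, so parallelFor fits.
std::vector<std::size_t> squaresByIndex(std::size_t n) {
  std::vector<std::size_t> out(n);
  parallelFor(0, n, [&](std::size_t i) { out[i] = i * i; });
  return out;
}

// Element-based work: we iterate over a container, so parallelForEach fits.
std::vector<int> doubled(std::vector<int> v) {
  parallelForEach(v.begin(), v.end(), [](int &x) { x *= 2; });
  return v;
}
```

With the index-based helper there is no need to materialise a container of indices just to have something to iterate over.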

[X86] Add TB_NO_REVERSE to entries in the load folding table where the... · 7ad961cc

Craig Topper authored Nov 27, 2016

[X86] Add TB_NO_REVERSE to entries in the load folding table where the instruction's load size is smaller than the register size.

If we were to unfold these, the load size would be increased to the register size. This is not safe to do since the enlarged load can do things like cross a page boundary into a page that doesn't exist.

I probably missed some instructions, but this should be a large portion of them.

llvm-svn: 288001

7ad961cc
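
To illustrate the page-boundary concern described in 7ad961cc, here is a minimal sketch, assuming 4 KiB pages. `kPageSize` and `crossesPage` are hypothetical names and are not part of LLVM. An 8-byte load that ends exactly at a page boundary stays within one page, while the same access widened to 16 bytes reaches into the next page, which may not be mapped.

```cpp
#include <cstdint>

// Assumed page size for illustration; real targets may differ.
constexpr std::uint64_t kPageSize = 4096;

// Returns true if a load of `size` bytes starting at `addr` touches two pages.
// Assumes size >= 1 and that size does not exceed one page.
constexpr bool crossesPage(std::uint64_t addr, std::uint64_t size) {
  return addr / kPageSize != (addr + size - 1) / kPageSize;
}

// The last 8 bytes of the first page: the narrow load is contained in it.
static_assert(!crossesPage(kPageSize - 8, 8), "8-byte load stays in the page");

// Widening the same access to a 16-byte register-sized load crosses over.
static_assert(crossesPage(kPageSize - 8, 16), "16-byte load spills over");
```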

[X86][SSE] Added tests showing missed combines for shuffle to shifts. · 4571157d

Simon Pilgrim authored Nov 27, 2016

llvm-svn: 288000

4571157d

Adjust type-trait evaluation to properly handle Using(Shadow)Decls · fec83451

Hal Finkel authored Nov 27, 2016

Since r274049, for an inheriting constructor declaration, the name of the using
declaration (and of the using shadow declaration, which takes its name from the
using declaration) is the name of a derived class, not the base class (lines
8225-8232 of lib/Sema/SemaDeclCXX.cpp in https://reviews.llvm.org/rL274049).
Because of this, the name-based lookup performed inside
Sema::LookupConstructors returns not only CXXConstructorDecls but also
Using(Shadow)Decls, which results in the assertion failure reported in PR29087.

Patch by Taewook Oh, thanks!

Differential Revision: https://reviews.llvm.org/D23765

llvm-svn: 287999

fec83451

add tests to show missing analysis; NFC · dc2917b9

Sanjay Patel authored Nov 27, 2016

llvm-svn: 287998

dc2917b9

fix formatting; NFC · da9f7bf0

Sanjay Patel authored Nov 27, 2016

llvm-svn: 287997

da9f7bf0

Also skip regular symbol assignment at the start of a script. · 5fcc99c2

Rafael Espindola authored Nov 27, 2016

Unfortunately some scripts look like

  kernphys = ...
  . = ....

and the expectation is that every orphan section is placed after the
assignment.

llvm-svn: 287996

5fcc99c2

[AVX-512] Add masked EVEX vpmovzx/sx instructions to load folding tables. · c3b3926f

Craig Topper authored Nov 27, 2016

llvm-svn: 287995

c3b3926f

Don't put an orphan before the first . assignment. · 7fe4ec9b

Rafael Espindola authored Nov 27, 2016

This is a horrible special case, but it seems to match bfd's behaviour
and is important for avoiding placing an orphan section before the
expected start of the file.

llvm-svn: 287994

7fe4ec9b

[SLP] Add new and update existing lit tests for providing more context to... · 2f5cb60b

Mohammad Shahid authored Nov 27, 2016

[SLP] Add new and update existing lit tests for providing more context to the incoming patch for vectorization of jumbled loads

Change-Id: Ifb9091bb0f84c1937c2c8bd2fc345734f250d2f9
llvm-svn: 287992

2f5cb60b

[X86] Remove alignment restrictions from load folding table for some... · fb64a25b

Craig Topper authored Nov 27, 2016

[X86] Remove alignment restrictions from load folding table for some instructions that don't have a restriction.

Most of these are the SSE4.1 PMOVZX/PMOVSX instructions, which all read fewer than 128 bits. The only other was MOVUPD, which by definition is an unaligned load.

llvm-svn: 287991

fb64a25b

Nov 26, 2016

[DOXYGEN] Updated instruction names corresponding to avxintrin.h intrinsics. · 4c77e894

Ekaterina Romanova authored Nov 26, 2016

Documentation for some of the avxintrin.h intrinsics erroneously said that
non-VEX-prefixed instructions could be generated. This was fixed.

I tried several different solutions to achieve pretty printing of unordered lists (nested and non-nested) in param sections in doxygen.

llvm-svn: 287990

4c77e894

[tsan] Fix the lit expansion of %deflake not to eat a space · 3a481cf0

Kuba Mracek authored Nov 26, 2016

The lit expansion of "%deflake " (notice the space after) expands in a way that removes the space; this fixes that.

Differential Revision: https://reviews.llvm.org/D27139

llvm-svn: 287989

3a481cf0

Implement conjunction/disjunction/negation for LFTS v2. Same code and tests for the ones in std:: · 13320a50

Marshall Clow authored Nov 26, 2016

llvm-svn: 287988

13320a50
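
As a rough illustration of what these three traits do, here is a minimal sketch. Since the commit notes that the code and tests match the ones in std::, it uses the C++17 `std::conjunction`, `std::disjunction` and `std::negation` rather than the `std::experimental` versions, and it assumes a C++17 compiler. The variable templates `allIntegral`, `anyPointer` and `notVoid` are hypothetical names introduced only for this example.

```cpp
#include <type_traits>

// Logical AND over a pack of traits: true only if every Ts is integral.
template <class... Ts>
constexpr bool allIntegral = std::conjunction<std::is_integral<Ts>...>::value;

// Logical OR over a pack of traits: true if at least one Ts is a pointer.
template <class... Ts>
constexpr bool anyPointer = std::disjunction<std::is_pointer<Ts>...>::value;

// Logical NOT of a single trait.
template <class T>
constexpr bool notVoid = std::negation<std::is_void<T>>::value;

static_assert(allIntegral<int, long, char>, "all three are integral");
static_assert(!allIntegral<int, double>, "double is not integral");
static_assert(anyPointer<int, char *>, "char* is a pointer");
static_assert(!anyPointer<int, double>, "no pointer in the pack");
static_assert(notVoid<int>, "int is not void");
static_assert(!notVoid<void>, "void is void");
```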

[X86] Remove hasOneUse check that is redundant with the one in IsProfitableToFold. · 837ff25d

Craig Topper authored Nov 26, 2016

llvm-svn: 287987

837ff25d

[X86] Fix the zero extending load detection in... · e266e126

Craig Topper authored Nov 26, 2016

[X86] Fix the zero extending load detection in X86DAGToDAGISel::selectScalarSSELoad to pass the load node to IsProfitableToFold and IsLegalToFold.

Previously we were passing the SCALAR_TO_VECTOR node.

llvm-svn: 287986

e266e126

[X86] Simplify control flow. NFCI · d3ab1a39

Craig Topper authored Nov 26, 2016

llvm-svn: 287985

d3ab1a39

[ScopInfo] Use SCEVRewriteVisitor to simplify SCEVSensitiveParameterRewriter [NFC] · 278f9e7d

Tobias Grosser authored Nov 26, 2016

llvm-svn: 287984

278f9e7d

[X86] Add a hasOneUse check to selectScalarSSELoad to keep the same load from... · 991d1ca3

Craig Topper authored Nov 26, 2016

[X86] Add a hasOneUse check to selectScalarSSELoad to keep the same load from being folded multiple times.

Summary: When selectScalarSSELoad is looking for a scalar_to_vector of a scalar load, it makes sure the load is only used by the scalar_to_vector. But it doesn't make sure the scalar_to_vector is only used once. This can cause the same load to be folded multiple times. This can be bad for performance. This also causes the chain output to be duplicated, but not connected to anything so chain dependencies will not be satisfied.

Reviewers: RKSimon, zvi, delena, spatel

Subscribers: andreadb, llvm-commits

Differential Revision: https://reviews.llvm.org/D26790

llvm-svn: 287983

991d1ca3
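
The situation described in the summary of 991d1ca3 can be modelled with a toy sketch. This is not LLVM's SelectionDAG: `Node`, `loadOnlyCheck` and `withOneUseCheck` are hypothetical names, and the model only tracks use counts. It shows a load with a single user (the scalar_to_vector) where the scalar_to_vector itself has two users, which is the case the added check rejects.

```cpp
// Toy model, NOT LLVM's SelectionDAG: each node records how many users it has.
struct Node {
  int numUses;
  const Node *operand; // nullptr for leaf nodes such as the load
};

// Models the earlier condition: only the load's use count is inspected.
constexpr bool loadOnlyCheck(const Node &scalarToVector) {
  return scalarToVector.operand != nullptr &&
         scalarToVector.operand->numUses == 1;
}

// Models the added condition: the scalar_to_vector must also have one use,
// otherwise each of its users could fold the same load again.
constexpr bool withOneUseCheck(const Node &scalarToVector) {
  return loadOnlyCheck(scalarToVector) && scalarToVector.numUses == 1;
}

// A load used once, feeding a scalar_to_vector that has two users.
constexpr Node kLoad{1, nullptr};
constexpr Node kScalarToVector{2, &kLoad};

static_assert(loadOnlyCheck(kScalarToVector), "earlier check accepts the fold");
static_assert(!withOneUseCheck(kScalarToVector), "added check declines it");
```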

[InstCombine] add test to show missing vector optimization; NFC · 12a2af44

Sanjay Patel authored Nov 26, 2016

llvm-svn: 287982

12a2af44

Implement the 'detection idiom' from LFTS v2 · 3b3352de

Marshall Clow authored Nov 26, 2016

llvm-svn: 287981

3b3352de
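
For readers unfamiliar with the detection idiom, here is a minimal sketch of its core, assuming a C++17 compiler for `std::void_t`. It follows the commonly published formulation and is not copied from this commit. The namespace `sketch` and the alias `size_member_t` are names chosen for this example only.

```cpp
#include <type_traits>
#include <utility>
#include <vector>

namespace sketch {
// Primary template: chosen when Op<Args...> is ill-formed.
template <class AlwaysVoid, template <class...> class Op, class... Args>
struct detector : std::false_type {};

// Partial specialization: chosen when Op<Args...> is a valid type.
template <template <class...> class Op, class... Args>
struct detector<std::void_t<Op<Args...>>, Op, Args...> : std::true_type {};

// is_detected<Op, Args...>::value is true iff Op<Args...> names a valid type.
template <template <class...> class Op, class... Args>
using is_detected = detector<void, Op, Args...>;
} // namespace sketch

// The operation to detect: does `const T` have a callable member size()?
template <class T>
using size_member_t = decltype(std::declval<const T &>().size());

static_assert(sketch::is_detected<size_member_t, std::vector<int>>::value,
              "std::vector has size()");
static_assert(!sketch::is_detected<size_member_t, int>::value,
              "int has no size()");
```

The appeal of the idiom is that each new query only needs a one-line alias such as `size_member_t`, rather than a hand-written SFINAE trait.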

[InstCombine] don't drop metadata in FoldOpIntoSelect() · 8bd69b7e

Sanjay Patel authored Nov 26, 2016

llvm-svn: 287980

8bd69b7e

Change return types of split{Non,}Strings. · e8a077ba

Rui Ueyama authored Nov 26, 2016

They return new vectors, but at the same time they mutate other vectors,
so returning values doesn't make much sense. We should just mutate two
vectors.

llvm-svn: 287979

e8a077ba

Make getColorDiagnostics return a boolean value instead of an enum. · 72b1ee25

Rui Ueyama authored Nov 26, 2016

Config->ColorDiagnostics was of type enum before. Now it is just a
boolean flag. Thanks to Rafael for the suggestion.

llvm-svn: 287978

72b1ee25

Split MergeOutputSection::finalize. · 1880bbed

Rui Ueyama authored Nov 26, 2016

llvm-svn: 287977

1880bbed

add optional param to copy metadata when creating selects; NFC · 91e73a7b

Sanjay Patel authored Nov 26, 2016

There are other spots where we can use this; we're currently dropping 
metadata in some places, and there are proposed changes where we will
want to propagate metadata.

IRBuilder's CreateSelect() already has a parameter like this, so this
change makes the regular 'Create' API line up with that.

llvm-svn: 287976

91e73a7b

[AVX-512] Add unmasked EVEX vpmovzx/sx instructions to load folding tables. · 10d5eec1

Craig Topper authored Nov 26, 2016

llvm-svn: 287975

10d5eec1

[AVX-512] Add masked 128/256-bit integer add/sub instructions to load folding tables. · 97169ea5

Craig Topper authored Nov 26, 2016

llvm-svn: 287974

97169ea5

[ScopDetect] Expand statistics of the detected scops · b45ae560

Tobias Grosser authored Nov 26, 2016

We now collect:

  Number of total loops
  Number of loops in scops
  Number of scops
  Number of scops with maximal loop depth 1
  Number of scops with maximal loop depth 2
  Number of scops with maximal loop depth 3
  Number of scops with maximal loop depth 4
  Number of scops with maximal loop depth 5
  Number of scops with maximal loop depth 6 and larger
  Number of loops in scops (profitable scops only)
  Number of scops (profitable scops only)
  Number of scops with maximal loop depth 1 (profitable scops only)
  Number of scops with maximal loop depth 2 (profitable scops only)
  Number of scops with maximal loop depth 3 (profitable scops only)
  Number of scops with maximal loop depth 4 (profitable scops only)
  Number of scops with maximal loop depth 5 (profitable scops only)
  Number of scops with maximal loop depth 6 and larger (profitable scops only)

These statistics are certainly not completely accurate, as we might drop scops
when building up their polyhedral representation, but they should give a good
indication of the number of scops we detect.

llvm-svn: 287973

b45ae560

[AVX-512] Add masked 512-bit integer add/sub instructions to load folding tables. · 53b33de1

Craig Topper authored Nov 26, 2016

llvm-svn: 287972

53b33de1