Commits · fb501330761d3f121e530add1055face976ddb27 · Roger Ferrer / llvm-epi

Nov 28, 2016

[CMAKE] fix LLVM_OPTIMIZED_TABLEGEN for Visual Studio · fb501330

Daniil Fukalov authored Nov 28, 2016

At the moment, building the optimized tablegen is controlled by the LLVM_USE_HOST_TOOLS variable, which is not set for Visual Studio since LLVM_ENABLE_ASSERTIONS depends on the CMAKE_BUILD_TYPE value, which is not equal to "DEBUG" when a Visual Studio solution is generated.

Modified so that it does not depend on the LLVM_ENABLE_ASSERTIONS value in the VS and Xcode cases.

Reviewers: beanz

Subscribers: RKSimon, llvm-commits, mgorny

Differential Revision: https://reviews.llvm.org/D27135

llvm-svn: 288042


[LTO] Move finishOptimizationRemarks after codegen · 58951d38
Adam Nemet authored Nov 28, 2016
```
This addresses the comment in D26832.

llvm-svn: 288041
```

[X86][SSE] Added support for combining bit-shifts with shuffles. · 3f10e669

Simon Pilgrim authored Nov 28, 2016

Bit-shifts by a whole number of bytes can be represented as a shuffle mask suitable for combining.

Added a 'getFauxShuffleMask' function to allow us to create shuffle masks from other suitable operations.
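
To illustrate the idea in the message above, here is a minimal sketch of how a per-lane shift by a whole number of bytes corresponds to a byte shuffle mask. It is not the `getFauxShuffleMask` implementation: the function names, the `ZERO` sentinel value, and the little-endian lane layout are assumptions made for this example.

```python
ZERO = -2  # hypothetical sentinel: "this output byte is zero"


def shift_to_byte_shuffle_mask(num_bytes, lane_bytes, shift_bits, left):
    """Build a byte shuffle mask equivalent to a per-lane logical shift.

    Assumes little-endian lanes, so byte 0 of a lane is its least
    significant byte.
    """
    if shift_bits % 8 != 0:
        raise ValueError("only whole-byte shifts map to a byte shuffle")
    if num_bytes % lane_bytes != 0:
        raise ValueError("vector size must be a multiple of the lane size")
    shift_bytes = shift_bits // 8
    mask = []
    for out in range(num_bytes):
        pos = out % lane_bytes      # byte position inside the lane
        lane_base = out - pos       # index of the lane's first byte
        # A left shift moves bytes towards higher positions, so the
        # source byte sits at a lower position (and vice versa).
        src = pos - shift_bytes if left else pos + shift_bytes
        mask.append(lane_base + src if 0 <= src < lane_bytes else ZERO)
    return mask


def apply_byte_shuffle(data, mask):
    """Apply a mask produced above to a bytes object."""
    return bytes(0 if m == ZERO else data[m] for m in mask)
```

Once a shift is expressed as a mask like this, it can in principle be composed with neighbouring shuffle masks by ordinary index substitution, which is what makes it "suitable for combining".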

llvm-svn: 288040


[OPENMP] Fix for PR31137: Wrong DSA for members in struct. · 7fcacd8e

Alexey Bataev authored Nov 28, 2016

If a member expression is used in the task region, the base expression
is a DeclRefExpr, and the variable used in this reference expression is private,
it should be marked as implicitly firstprivate inside this region. This patch
fixes the issue.

llvm-svn: 288039


Fix floating point register reads x86_64 linux on targets with no AVX support · 6ec13991

Pavel Labath authored Nov 28, 2016

Summary:
On 64-bit targets, the correct register set for reading the fxsave area is
NT_PRFPREG (only 32-bit targets need NT_PRXFPREG, presumably for historic
reasons). Reference:
<https://github.com/torvalds/linux/blob/v4.8/arch/x86/kernel/ptrace.c#L1261>.
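
The selection rule described in the summary can be sketched as follows. This is an illustration only, not the lldb code: the function name and its boolean parameter are hypothetical, while the two note-type values are the ones defined in the Linux `elf.h` header.

```python
# ELF note types as defined in the Linux elf.h header.
NT_PRFPREG = 2             # floating-point register set
NT_PRXFPREG = 0x46E62B7F   # fxsave-format register set used on 32-bit x86


def fxsave_regset_note_type(target_is_64bit):
    """Pick the register set that holds the fxsave area (hypothetical helper).

    64-bit targets expose the fxsave area through NT_PRFPREG; only
    32-bit targets need NT_PRXFPREG.
    """
    return NT_PRFPREG if target_is_64bit else NT_PRXFPREG
```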

Reviewers: tberghammer, valentinagiusti

Subscribers: lldb-commits

Differential Revision: https://reviews.llvm.org/D27161

llvm-svn: 288038


[X86][SSE] Added tests showing missed combines of shifts with shuffles. · 3def9e11
Simon Pilgrim authored Nov 28, 2016
```
llvm-svn: 288037
```
Test commit · 59168e28
Daniel Cederman authored Nov 28, 2016
```
llvm-svn: 288036
```
Revert "[DAG] Improve loads-from-store forwarding to handle TokenFactor" · a4133617
Nirav Dave authored Nov 28, 2016
```
This reverts commit r287773 which caused issues with ppc64le builds.

llvm-svn: 288035
```
ClangMoveTests.cpp: Fix a bogus comparison of iterator. · 5843abc9
NAKAMURA Takumi authored Nov 28, 2016
```
msc Debug build detected it.

llvm-svn: 288034
```

[SystemZ] Fix build bot fallout from r288030 · a29bf16e

Ulrich Weigand authored Nov 28, 2016

Remove unused variable that came in due to a copy-and-paste bug
and caused build bot failures.

llvm-svn: 288033


XFAIL: TestNoreturnUnwind on android x86_64 · 1776986d
Pavel Labath authored Nov 28, 2016
```
llvm-svn: 288032
```

[SystemZ] Support execution hint instructions · 84404f30

Ulrich Weigand authored Nov 28, 2016

This adds assembler support for the instructions provided by the
execution-hint facility (NIAI and BP(R)P).  This required adding
support for the new relocation types for 12-bit and 24-bit PC-
relative offsets used by the BP(R)P instructions.

llvm-svn: 288031


[SystemZ] Support load-and-trap instructions · 2d9e3d9d

Ulrich Weigand authored Nov 28, 2016

This adds support for the instructions provided with the
load-and-trap facility.

llvm-svn: 288030


[SystemZ] Add remaining branch instructions · 75839913

Ulrich Weigand authored Nov 28, 2016

This patch adds assembler support for the remaining branch instructions:
the non-relative branch on count variants, and all variants of branch
on index.

The only one of those that can be readily exploited for code generation
is BRCTH (branch on count using a high 32-bit register as count).  Do
use it, however, it is necessary to also introduce a new CHIMux pseudo
to allow comparisons of a 32-bit value against a short immediate to go
into a high register as well (implemented via CHI/CIH).

This causes a bit of codegen changes overall, but those have proven to
be neutral (or even beneficial) in performance measurements.

llvm-svn: 288029


[SystemZ] Improve use of conditional instructions · 524f276c

Ulrich Weigand authored Nov 28, 2016

This patch moves formation of LOC-type instructions from (late)
IfConversion to the early if-conversion pass, and in some cases
additionally creates them directly from select instructions
during DAG instruction selection.

To make early if-conversion work, the patch implements the
canInsertSelect / insertSelect callbacks.  It also implements
the commuteInstructionImpl and FoldImmediate callbacks to
enable generation of the full range of LOC instructions.

Finally, the patch adds support for all instructions of the
load-store-on-condition-2 facility, which allows using LOC
instructions also for high registers.

Due to the use of the GRX32 register class to enable high registers,
we now also have to handle the cases where there are still no single
hardware instructions (conditional move from a low register to a high
register or vice versa).  These are converted back to a branch sequence
after register allocation.  Since the expandRAPseudos callback is not
allowed to create new basic blocks, this requires a simple new pass,
modelled after the ARM/AArch64 ExpandPseudos pass.

Overall, this patch causes significantly more LOC-type instructions
to be used, and results in a measurable performance improvement.

llvm-svn: 288028


skip android in @skipIfHostIncompatibleWithRemote · 79724fc0

Pavel Labath authored Nov 28, 2016

The current implementation of the decorator does not skip if the android target
arch is the same as host arch (as in both cases the platform comes out as linux).
Nonetheless android x86_64 binaries are not compatible with linux ones.

Technically this should be "skip if target is android and host is *not* android",
but currently nobody runs lldb test suite on an android host, so we don't even
have a way of specifying that the host is android.
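
The decision described above can be sketched as a plain predicate. This is not the decorator's actual code from the lldb test suite: the function name and all parameters are hypothetical and only model the rule stated in the message.

```python
def should_skip_for_host_remote_mismatch(host_arch, host_platform,
                                         target_arch, target_platform,
                                         target_is_android):
    """Return True if a test that builds host binaries should be skipped.

    Hypothetical model of the rule in the commit message: an android
    target reports its platform as linux, so the arch/platform comparison
    alone cannot detect the incompatibility and android is skipped
    unconditionally.
    """
    if target_is_android:
        # Strictly this should also require a non-android host, but
        # there is currently no way to express an android host.
        return True
    return host_arch != target_arch or host_platform != target_platform
```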

llvm-svn: 288027


Fix a crash in ProcessPOSIXLog · 4fd57542

Pavel Labath authored Nov 28, 2016

We are getting a null pointer for the list of categories here (presumably due to
the args refactor).

llvm-svn: 288026


[Sema] Set range end of constructors and destructors in template instantiations · 57ae8575

Malcolm Parsons authored Nov 28, 2016

Summary:
clang-tidy checks frequently use source ranges of functions.
The source range of constructors and destructors in template instantiations
is currently a single token.
The factory method for constructors and destructors does not allow the
end source location to be specified.
Set end location manually after creating instantiation.

Reviewers: aaron.ballman, rsmith, arphaman

Subscribers: arphaman, cfe-commits

Differential Revision: https://reviews.llvm.org/D26849

llvm-svn: 288025


[InlineCost] Reduce inline thresholds to compensate for cost changes · 6bed13c5

James Molloy authored Nov 28, 2016

In r286814, the algorithm for calculating inline costs changed. This
caused more inlining to take place which is especially apparent
in optsize and minsize modes.

As the cost calculation removed a skewed behaviour (we were inconsistent
about the cost of calls) it isn't possible to update the thresholds to
get exactly the same behaviour as before. However, this threshold change
accounts for the very common case where an inline candidate has no
calls within it. In this case, r286814 would inline around 5-6 more (IR)
instructions.

The changes to -Oz have been heavily benchmarked. The "obvious" value
for the inline threshold at -Oz is zero, but due to inaccuracies in the
inline heuristics this can actually cause code size increases due to
not inlining key thunk functions (that then disappear). Experimentally,
5 was the sweet spot for code size over the test-suite.

For -Os, this change removes the outlier results shown up by green dragon
(http://104.154.54.203/db_default/v4/nts/13248).

Fixes D26848.

llvm-svn: 288024


[PM] Remove weird marking of invalidated analyses as "preserved". · 0c6efff1

Chandler Carruth authored Nov 28, 2016

This never made a lot of sense. They've been invalidated for one IR unit
but they aren't really preserved in any normal sense. It seemed like it
would be an elegant way of communicating to outer IR units that pass
managers and adaptors had already handled invalidation, but we've since
ended up adding sets that model this more clearly: we're now using
the 'AllAnalysesOn<IRUnitT>' set to handle cases where the trick of
"preserving" invalidated analyses didn't work.

This patch moves to rely on that technique exclusively and removes the
cumbersome API aspect of updating the preserved set when doing
invalidation. This in turn will simplify a *number* of upcoming patches.

This has a side benefit of exposing a number of places where we were
failing to mark the 'AllAnalysesOn<IRUnitT>' set as preserved. This
patch fixes those, and with those fixes shouldn't change any observable
behavior.
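
As a rough illustration of the model the message describes, the toy sketch below treats a preserved set as a collection of names and lets invalidation consult it without ever updating it. The class, the function, and the string-based names are assumptions made for this example; it is not the LLVM `PreservedAnalyses` API.

```python
# Name of the set, as written in the commit message.
ALL_ANALYSES_ON_FUNCTION = "AllAnalysesOn<Function>"


class PreservedAnalyses:
    """Toy stand-in holding preserved analysis names and preserved set names."""

    def __init__(self, preserved=()):
        self._preserved = frozenset(preserved)

    def is_preserved(self, analysis, containing_set):
        # An analysis survives if it is preserved itself or if the
        # whole set it belongs to is preserved.
        return analysis in self._preserved or containing_set in self._preserved


def invalidate(cache, preserved, containing_set):
    """Return the cached results that survive.

    `preserved` is only queried here, never modified: invalidated
    analyses are not marked as "preserved" afterwards.
    """
    return {name: result for name, result in cache.items()
            if preserved.is_preserved(name, containing_set)}
```

In this sketch an outer layer that has already handled function-level invalidation would report the `AllAnalysesOn<Function>` set as preserved, so every cached function analysis survives without being listed individually.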

llvm-svn: 288023


[ELF] - Do not put non exec sections first when -no-rosegment · 1642c5d8

George Rimar authored Nov 28, 2016

This unifies the handling of the cases where we have SECTIONS and where
-no-rosegment is given in compareSectionsNonScript().

Config->SingleRoRx is now used for the check; a test case is provided.

llvm-svn: 288022


[ELF] - Set Config->SingleRoRx differently. NFC. · 18a30962

George Rimar authored Nov 28, 2016

Previously Config->SingleRoRx was set in
createFiles() and used HasSections.

This change moves it to readConfigs, where the common flags
are handled, and adds logic that sets this flag separately
from ScriptParser if SECTIONS is present.

llvm-svn: 288021


[ELF] - Implemented -no-rosegment. · 63bf0110

George Rimar authored Nov 28, 2016

--no-rosegment: Do not put read-only non-executable sections in their own segment
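
The effect of the option can be pictured with a toy layout model. The permission strings, function names, and the grouping by permission are assumptions made for this sketch; it does not reflect how lld actually assigns sections to segments.

```python
def segment_permissions(section_perms, single_ro_rx):
    """Map a section's permissions to the permissions of its segment.

    With single_ro_rx set (the --no-rosegment behaviour), read-only
    non-executable sections share the executable segment.
    """
    if single_ro_rx and section_perms == "r":
        return "rx"
    return section_perms


def layout_segments(sections, single_ro_rx):
    """Group (name, perms) pairs into segments keyed by permissions."""
    segments = {}
    for name, perms in sections:
        key = segment_permissions(perms, single_ro_rx)
        segments.setdefault(key, []).append(name)
    return segments
```

With the sections `.rodata` (`r`), `.text` (`rx`), and `.data` (`rw`), the default layout in this model yields three segments, while the `single_ro_rx` layout yields two because `.rodata` joins `.text`.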

Differential revision: https://reviews.llvm.org/D26889

llvm-svn: 288020


[ELF] Print file:line for 'undefined section' errors · ed30ce7a
Eugene Leviant authored Nov 28, 2016
```
Differential revision: https://reviews.llvm.org/D27108

llvm-svn: 288019
```
[ThreadPool] Rollback recent changes until I figure out the breakage. · 0f0d5d8f
Davide Italiano authored Nov 28, 2016
```
llvm-svn: 288018
```
[ThreadPool] Remove outdated comment after r288016. · 3dd87dad
Davide Italiano authored Nov 28, 2016
```
llvm-svn: 288017
```
[ThreadPool] Simplify the interface. NFCI. · 3ea0bfa7
Davide Italiano authored Nov 28, 2016
```
The callers don't use the return value. Found by Michael
Spencer.

llvm-svn: 288016
```
Revert "Improve error handling in YAML parsing" · 43c24282
Mehdi Amini authored Nov 28, 2016
```
This reverts commit r288014; the unittest isn't passing.

llvm-svn: 288015
```

Improve error handling in YAML parsing · c54281be

Mehdi Amini authored Nov 28, 2016

Some scanner errors were not checked and reported by the parser.

Fix PR30934

Patch by: Serge Guelton <serge.guelton@telecom-bretagne.eu>

Differential Revision: https://reviews.llvm.org/D26419

llvm-svn: 288014


[PM] Add an ASCII-art diagram for the call graph in the CGSCC unit test. · 4cf2c898
Chandler Carruth authored Nov 28, 2016
```
No functionality changed.

llvm-svn: 288013
```

Always create a PT_ARM_EXIDX if needed. · 8e67000f

Rafael Espindola authored Nov 28, 2016

Unfortunately, PT_ARM_EXIDX is special. There is no way to create it
from linker scripts, so we have to create it even if PHDRS is used.

This matches bfd and is required for the lld output to survive bfd's strip.

llvm-svn: 288012


Nov 27, 2016
- [X86][FMA4] Remove isCommutable from FMA4 scalar intrinsics. They aren't... · 17786f77
  Craig Topper authored Nov 27, 2016
```
[X86][FMA4] Remove isCommutable from FMA4 scalar intrinsics. They aren't commutable as operand 0 should pass its upper bits through to the output.

llvm-svn: 288011
```
- [X86][FMA] Add missing Predicates qualifier around scalar FMA intrinsic patterns. · 13b27a27
  Craig Topper authored Nov 27, 2016
```
llvm-svn: 288010
```
- [X86][FMA4] Add load folding support for FMA4 scalar intrinsic instructions. · ff9d4587
  Craig Topper authored Nov 27, 2016
```
llvm-svn: 288009
```
- [X86][FMA4] Add test cases to demonstrate missed folding opportunities for FMA4 scalar intrinsics. · b00872b9
  Craig Topper authored Nov 27, 2016
```
llvm-svn: 288008
```
- [X86] Add SHL by 1 to the load folding tables. · 3674f44e
  Craig Topper authored Nov 27, 2016
```
I don't think isel selects these today, favoring adding the register to itself instead. But the load folding tables shouldn't be so concerned with what isel will use and just represent the relationships.

llvm-svn: 288007
```
- [X86][SSE] Add support for combining target shuffles to 128/256-bit PSLL/PSRL bit shifts · 91d6f5fb
  Simon Pilgrim authored Nov 27, 2016
```
llvm-svn: 288006
```
- [InstSimplify] allow integer vector types to use computeKnownBits · 8ca30ab0
  Sanjay Patel authored Nov 27, 2016
```
Note that the non-splat lshr+lshr test folded, but that does not
work in general. Something is missing or wrong in computeKnownBits
as the non-splat shl+shl test still shows.

llvm-svn: 288005
```
- [AVX-512] Add integer and fp unpck instructions to load folding tables. · 4fab4872
  Craig Topper authored Nov 27, 2016
```
llvm-svn: 288004
```
- [X86][SSE] Split lowerVectorShuffleAsShift ready for combines. NFCI. · cdb2ce66
  Simon Pilgrim authored Nov 27, 2016
```
Moved most of matching code into matchVectorShuffleAsShift to share with target shuffle combines (in a future commit).

llvm-svn: 288003
```