- Feb 09, 2017
-
Kevin Enderby authored
that it works when the ObjC metadata sections end up in the __DATA_CONST or __DATA_DIRTY segments. rdar://26315238 llvm-svn: 294599
-
Simon Pilgrim authored
llvm-svn: 294598
-
Greg Clayton authored
llvm-svn: 294597
-
Simon Pilgrim authored
llvm-svn: 294596
-
Kostya Kortchinsky authored
Summary: Documentation update to reflect the changes that occurred in the allocator:
- support for additional architectures;
- modification of the header;
- default option values for 32 & 64-bit.
Reviewers: kcc, alekseyshl Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D29592 llvm-svn: 294595
-
Saleem Abdulrasool authored
Add a note about the reason for the divergence from the specification for ld64. Addresses post-commit review comments from Davide. NFC. llvm-svn: 294594
-
David Bozier authored
Revert: "[Stack Protection] Add diagnostic information for why stack protection was applied to a function" this reverts revision r294590 as it broke some buildbots. llvm-svn: 294593
-
Rafael Espindola authored
In particular, this should allow us to gc unused asan metadata. llvm-svn: 294592
-
Artur Pilipenko authored
If some of the trailing or leading bytes of a load combine pattern are zeroes, we can combine the pattern into a load + zext and shift. Currently we don't support this, so the tests check the current codegen without the load combine. This change will make the upcoming patch supporting this kind of combine a bit clearer. llvm-svn: 294591
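For illustration, a minimal example of the pattern in question (our sketch, not a test from the patch; assumes a little-endian target):

    #include <cstdint>

    // Only bytes 2 and 3 of the 32-bit result come from memory; the low
    // two bytes are zero. The planned combine would turn this into a
    // 16-bit load, zero-extended and shifted left by 16.
    uint32_t high_bytes_only(const uint8_t *p) {
        return (uint32_t(p[0]) << 16) | (uint32_t(p[1]) << 24);
    }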
-
David Bozier authored
Stack Smash Protection is not completely free, so in hot code the overhead it introduces can cause performance issues. By adding diagnostic information about which functions have SSP and why, a user can quickly determine what they can do to stop SSP being applied to a specific hot function. This change adds an SSP-specific DiagnosticInfo class to the Stack Protection code, along with uses of it. A subsequent change to clang will cause the remarks to be emitted when enabled. Patch by: James Henderson Differential Revision: https://reviews.llvm.org/D29023 llvm-svn: 294590
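For context, a sketch (ours, not from the patch; the exact remark text is defined by the patch and not shown here) of the kind of function the stack-protector heuristics typically fire on:

    #include <cstring>

    // The local character array is what typically causes a canary to be
    // inserted under -fstack-protector; the new remark is meant to name
    // this function and the reason SSP was applied to it.
    void copy_name(const char *src) {
        char buf[32];
        std::strncpy(buf, src, sizeof(buf) - 1);
        buf[sizeof(buf) - 1] = '\0';
    }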
-
Rafael Espindola authored
This will make it possible to add support for gcing user metadata (asan for example). llvm-svn: 294589
-
Pierre Gousseau authored
In combineOrCmpEqZeroToCtlzSrl, replace "getConstantOperand == 0" by "isNullConstant" to account for floating point constants. Differential Revision: https://reviews.llvm.org/D29756 llvm-svn: 294588
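Roughly the shape of source code that exercises this combine, as we understand it (our sketch):

    // On an LZCNT-capable x86 target, a 32-bit `x == 0` test can be
    // lowered as `lzcnt(x) >> 5`: bit 5 of the leading-zero count is set
    // only when the count is 32, i.e. when x is zero. ORing several such
    // tests lets the shifts be merged into one.
    bool any_zero(unsigned a, unsigned b) {
        return (a == 0) | (b == 0);   // -> (lzcnt(a) | lzcnt(b)) >> 5
    }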
-
Simon Pilgrim authored
Test for expected codegen for NR (Newton-Raphson) reciprocal cases with/without FMA llvm-svn: 294587
-
David Bozier authored
1. Added missing substitutions to the documentation in docs/TestingGuide.rst.
2. Modified docs/CommandGuide/lit.rst to only document the "base" set of substitutions and to refer the reader to docs/TestingGuide.rst for more detailed info on substitutions.
Patch by bd1976llvm Differential Revision: https://reviews.llvm.org/D29281 llvm-svn: 294586
-
Joerg Sonnenberger authored
llvm-svn: 294585
-
Diana Picus authored
Both for aapcscc and aapcs_vfpcc. We currently filter out soft float targets because we don't support libcalls yet. llvm-svn: 294584
-
Ismail Donmez authored
llvm-svn: 294583
-
Artur Pilipenko authored
Enable folding patterns which load the value from a non-zero offset:

    i8 *a = ...
    i32 val = a[4] | (a[5] << 8) | (a[6] << 16) | (a[7] << 24)
  =>
    i32 val = *((i32*)(a+4))

Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D29394 llvm-svn: 294582
-
Simon Pilgrim authored
LowerBuildVectorv16i8/LowerBuildVectorv8i16 insert values into an UNDEF vector if the build vector doesn't contain any zero elements, resulting in register dependencies with a previous use of the register. This patch attempts to break the register dependency either by always zeroing the vector beforehand or (if we're inserting into the 0th element) by using VZEXT_MOVL(SCALAR_TO_VECTOR(i32 AEXT(Elt))), which lowers to (V)MOVD and performs a similar function. Additionally, (V)MOVD is a shorter instruction than PINSRB/PINSRW. We already do something similar for SSE41 PINSRD. On pre-SSE41 LowerBuildVectorv16i8 we go a little further and use VZEXT_MOVL(SCALAR_TO_VECTOR(i32 ZEXT(Elt))) if the build vector contains zeros, avoiding the vector zeroing at the cost of a scalar zero extension; this can probably be brought over to some of the other cases in a future patch (load folding etc.). Differential Revision: https://reviews.llvm.org/D29720 llvm-svn: 294581
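A minimal illustration (our example) of the zero-extending scalar-to-vector move the patch prefers:

    #include <emmintrin.h>

    // _mm_cvtsi32_si128 lowers to (V)MOVD: the scalar lands in element 0
    // and the upper elements are zeroed, so there is no dependency on the
    // previous contents of the destination register, unlike a PINSR*
    // insertion into an undefined or stale vector.
    __m128i scalar_to_vector(int x) {
        return _mm_cvtsi32_si128(x);
    }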
-
Ismail Donmez authored
llvm-svn: 294580
-
Peter Smith authored
Much of the code in PltSection and IPltSection is similar. We identify the IPlt by a HeaderSize of 0 and alter our behaviour in the member functions appropriately:
- the Iplt does not have a header
- the Iplt always follows after the Plt
Differential Revision: https://reviews.llvm.org/D29664 llvm-svn: 294579
-
Alexander Kornienko authored
llvm-svn: 294578
-
Peter Smith authored
When we need a copy relocation we create a synthetic SHT_NOBITS section that contains the right amount of ZI and assign it to either .bss or .rel.ro.bss as appropriate. This allows the dynamic relocation to be placed on the InputSection, removing the last case where a dynamic relocation is stored as an offset from the OutputSection. This has the side effect that we can run assignOffsets() after scanRelocs() without losing the additional ZI needed for the copy relocations. Differential Revision: https://reviews.llvm.org/D29637 llvm-svn: 294577
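A sketch of when such a copy relocation arises (our example; file names are hypothetical):

    // libvalue.cpp, built into a shared library libvalue.so:
    int shared_value = 42;

    // main.cpp, built as a non-PIC executable against libvalue.so. The
    // direct reference forces the static linker to reserve zero-initialized
    // space for the variable in .bss (or .rel.ro.bss for read-only data)
    // and to emit a copy relocation against it:
    extern int shared_value;
    int main() { return shared_value; }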
-
Tobias Grosser authored
During SCoP construction we sometimes inspect the underlying IR by looking at the base address of a MemoryAccess. In such cases, we always want the original base address. Make this clear by calling getOriginalBaseAddr(). This is a non-functional change as getBaseAddr maps to getOriginalBaseAddr at the moment. This change removes unnecessary uses of MemoryAccess::getBaseAddr() in preparation for https://reviews.llvm.org/D28518. llvm-svn: 294576
-
Tobias Grosser authored
The base address of a memory access is already an llvm::Value. Hence, there is no need to go through SCEV, but we can directly work with the llvm::Value. Also use 'Value *' instead of 'auto' for cases where the type is not obvious. llvm-svn: 294575
-
Tobias Grosser authored
Instead of iterating over statements and their memory accesses to extract the set of available base pointers, just directly iterate over all ScopArray objects. This reflects more the actual intent of the code: collect all arrays (and their base pointers) to emit alias information that specifies that accesses to different arrays cannot alias. This change removes unnecessary uses of MemoryAccess::getBaseAddr() in preparation for https://reviews.llvm.org/D28518. llvm-svn: 294574
-
Asiri Rathnayake authored
Different platforms implement the wait/sleep functions in different ways. It makes sense to externalize this into the threading API. Differential revision: https://reviews.llvm.org/D29630 Reviewers: EricWF, joerg llvm-svn: 294573
-
Vitaly Buka authored
BlockValueStack can be reallocated, making the reference e invalid. llvm-svn: 294572
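The general C++ hazard behind this fix (a sketch, not LVI's actual code):

    #include <vector>

    void reallocation_hazard() {
        std::vector<int> values{1, 2, 3};
        int &ref = values.back();  // reference into the current allocation
        values.push_back(4);       // may reallocate; `ref` now dangles
        // reading or writing through `ref` here is undefined behavior
    }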
-
Roman Gareev authored
There are problems with using the machine information to derive the precise vector size on polly-amd64-linux and polly-arm-linux. We temporarily disable the problematic run lines. llvm-svn: 294571
-
Krasimir Georgiev authored
llvm-svn: 294570
-
Roman Gareev authored
size. llvm-svn: 294569
-
Tobias Grosser authored
Before this change we used the name of the base pointer to mark reductions. This is imprecise as the canonical reference is the ScopArray itself and not the base pointer of a reduction. Using the base pointer of reductions is problematic in cases where a single ScopArray is referenced through two different base pointers. This change removes unnecessary uses of MemoryAccess::getBaseAddr() in preparation for https://reviews.llvm.org/D28518. llvm-svn: 294568
-
Tobias Grosser authored
When computing reduction dependences we first identify all ScopArrays which are part of reductions and then compute, for these ScopArrays only, the more detailed data dependences that allow us to identify reductions and optimize across them. Instead of using the base pointer as identifier of a ScopArray, it is clearer and more understandable to directly use the ScopArray as identifier. This change implements such a switch. This change removes unnecessary uses of MemoryAccess::getBaseAddr() in preparation for https://reviews.llvm.org/D28518. llvm-svn: 294567
-
Tobias Grosser authored
When regenerating code in the BlockGenerator we copy instructions that may reference scalar values, for which the new value of a given scalar is looked up in BBMap using the original scalar llvm::Value as index. It is consequently necessary that (re)loaded scalar values are made available in BBMap using the original llvm::Value as key, independently of whether the llvm::Value was (re)loaded from the original scalar or a new access function caused the value to be reloaded from an array with a different base address. We make this clear by using MemoryAccess::getOriginalBaseAddr() instead of MemoryAccess::getBaseAddr() as index to BBMap. This change removes unnecessary uses of MemoryAccess::getBaseAddr() in preparation for https://reviews.llvm.org/D28518. llvm-svn: 294566
-
Igor Breger authored
llvm-svn: 294565
-
Roman Gareev authored
optimization Isolate a set of partial tile prefixes to allow hoisting and sinking out of the unrolled innermost loops produced by the optimization of the matrix multiplication. In case it cannot be proved that the number of loop iterations can be evenly divided by the tile sizes, and we tile and unroll the point loop, isl generates conditional expressions. Subsequently, these conditional expressions can prevent stores and loads of the unrolled loops from being sunk and hoisted. The patch isolates a set of partial tile prefixes, which have exactly Mr x Nr iterations of the two innermost loops, the result of the loop tiling performed by the matrix multiplication optimization, where Mr and Nr are parameters of the micro-kernel. This helps to get rid of the conditional expressions of the unrolled innermost loops. This approach can probably be replaced with padding in the future. In the case of, for example, the gemm from Polybench/C 3.2 and parametric loop bounds, it helps to increase the performance from 7.98 GFlops (27.71% of theoretical peak) to 21.47 GFlops (74.57% of theoretical peak). Hence, we get the same performance as in the case of scalar loop bounds. It also causes a compile-time regression: the compile time increased from 0.795 seconds to 0.837 seconds in the case of scalar loop bounds and from 1.222 seconds to 1.490 seconds in the case of parametric loop bounds. Reviewed-by: Michael Kruse <llvm@meinersbur.de> Differential Revision: https://reviews.llvm.org/D29244 llvm-svn: 294564
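For intuition, a hand-written sketch (ours, not Polly's output) of why separating full tiles from partial ones matters:

    // Mr x Nr micro-tiles over an M x N space. The full-tile loop nest
    // has constant trip counts and no bounds checks, so after unrolling
    // its loads/stores can be hoisted/sunk; only the remainder loops
    // keep the conditions.
    void tiled_update(float *C, int M, int N) {
        const int Mr = 4, Nr = 8;                   // micro-kernel parameters
        const int MF = M - M % Mr, NF = N - N % Nr; // full-tile extent
        for (int i = 0; i < MF; i += Mr)
            for (int j = 0; j < NF; j += Nr)
                for (int ii = 0; ii < Mr; ++ii)      // constant trip count
                    for (int jj = 0; jj < Nr; ++jj)  // constant trip count
                        C[(i + ii) * N + (j + jj)] += 1.0f;
        for (int i = 0; i < M; ++i)                  // partial tiles
            for (int j = (i < MF ? NF : 0); j < N; ++j)
                C[i * N + j] += 1.0f;
    }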
-
Dean Michael Berris authored
Summary: Fixing a bug I found when testing a reader for the FDR format. The function ID is now correctly packed into the 28 bits documented for it, instead of being masked to all ones. Reviewers: dberris, pelikan, eugenis Reviewed By: dberris Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D29698 llvm-svn: 294563
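An illustrative sketch of the kind of packing described (the field layout here is ours, not the exact FDR record layout):

    #include <cstdint>

    // Keep exactly 28 bits of the function ID with a mask; the bug fixed
    // above effectively masked the field to all ones instead.
    uint32_t pack_function_record(uint32_t fn_id, uint32_t type_bits) {
        return (type_bits & 0xFu) | ((fn_id & 0x0FFFFFFFu) << 4);
    }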
-
Craig Topper authored
We only implemented it for one of the three HLE instructions, and that instruction is also covered by the RTM flag. Clang only implements the RTM flag on its command line. llvm-svn: 294562
-
Craig Topper authored
[X86] Remove INVPCID and SMAP feature flags. They aren't currently used by any instructions and are not tested. If we implement intrinsics for their instructions in the future, the feature flags can be added back with proper testing. llvm-svn: 294561
-
Craig Topper authored
llvm-svn: 294560
-