- Feb 09, 2017
-
Kevin Enderby authored
that it works when the ObjC metadata sections end up in the __DATA_CONST or __DATA_DIRTY segments. rdar://26315238 llvm-svn: 294599
-
Simon Pilgrim authored
llvm-svn: 294598
-
Greg Clayton authored
llvm-svn: 294597
-
Simon Pilgrim authored
llvm-svn: 294596
-
Kostya Kortchinsky authored
Summary: Documentation update to reflect the changes that occurred in the allocator:
- support for additional architectures;
- modification of the header;
- default option values for 32 & 64-bit.
Reviewers: kcc, alekseyshl Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D29592 llvm-svn: 294595
-
Saleem Abdulrasool authored
Add a note about the reason for the divergence from the specification for ld64. Addresses post-commit review comments from Davide. NFC. llvm-svn: 294594
-
David Bozier authored
Revert: "[Stack Protection] Add diagnostic information for why stack protection was applied to a function" this reverts revision r294590 as it broke some buildbots. llvm-svn: 294593
-
Rafael Espindola authored
In particular, this should allow us to gc unused asan metadata. llvm-svn: 294592
-
Artur Pilipenko authored
If some of the trailing or leading bytes of a load combine pattern are zeroes, we can combine the pattern into a load + zext and shift. Currently we don't support this, so the tests check the current codegen without the load combine. This change will make the upcoming patch supporting this kind of combine a bit clearer. llvm-svn: 294591
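For illustration, a minimal example of the pattern in question (our sketch, not a test from the patch; assumes a little-endian target):

    #include <cstdint>

    // Only bytes 2 and 3 of the 32-bit result come from memory; the low
    // two bytes are zero. The planned combine would turn this into a
    // 16-bit load, zero-extended and shifted left by 16.
    uint32_t high_bytes_only(const uint8_t *p) {
        return (uint32_t(p[0]) << 16) | (uint32_t(p[1]) << 24);
    }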
-
David Bozier authored
Stack Smash Protection is not completely free, so in hot code the overhead it introduces can cause performance issues. By adding diagnostic information about which functions have SSP and why, a user can quickly determine what they can do to stop SSP being applied to a specific hot function. This change adds an SSP-specific DiagnosticInfo class to the Stack Protection code, along with uses of it. A subsequent change to clang will cause the remarks to be emitted when enabled. Patch by: James Henderson Differential Revision: https://reviews.llvm.org/D29023 llvm-svn: 294590
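For context, a sketch (ours, not from the patch; the exact remark text is defined by the patch and not shown here) of the kind of function the stack-protector heuristics typically fire on:

    #include <cstring>

    // The local character array is what typically causes a canary to be
    // inserted under -fstack-protector; the new remark is meant to name
    // this function and the reason SSP was applied to it.
    void copy_name(const char *src) {
        char buf[32];
        std::strncpy(buf, src, sizeof(buf) - 1);
        buf[sizeof(buf) - 1] = '\0';
    }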
-
Rafael Espindola authored
This will make it possible to add support for gcing user metadata (asan for example). llvm-svn: 294589
-
Pierre Gousseau authored
In combineOrCmpEqZeroToCtlzSrl, replace "getConstantOperand == 0" by "isNullConstant" to account for floating point constants. Differential Revision: https://reviews.llvm.org/D29756 llvm-svn: 294588
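Roughly the shape of source code that exercises this combine, as we understand it (our sketch):

    // On an LZCNT-capable x86 target, a 32-bit `x == 0` test can be
    // lowered as `lzcnt(x) >> 5`: bit 5 of the leading-zero count is set
    // only when the count is 32, i.e. when x is zero. ORing several such
    // tests lets the shifts be merged into one.
    bool any_zero(unsigned a, unsigned b) {
        return (a == 0) | (b == 0);   // -> (lzcnt(a) | lzcnt(b)) >> 5
    }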
-
Simon Pilgrim authored
Test for expected codegen for NR (Newton-Raphson) reciprocal cases with/without FMA llvm-svn: 294587
-
David Bozier authored
1. Added missing substitutions to the documentation in docs/TestingGuide.rst.
2. Modified docs/CommandGuide/lit.rst to only document the "base" set of substitutions and to refer the reader to docs/TestingGuide.rst for more detailed info on substitutions.
Patch by bd1976llvm Differential Revision: https://reviews.llvm.org/D29281 llvm-svn: 294586
-
Joerg Sonnenberger authored
llvm-svn: 294585
-
Diana Picus authored
Both for aapcscc and aapcs_vfpcc. We currently filter out soft float targets because we don't support libcalls yet. llvm-svn: 294584
-
Ismail Donmez authored
llvm-svn: 294583
-
Artur Pilipenko authored
Enable folding patterns which load the value from a non-zero offset:

    i8 *a = ...
    i32 val = a[4] | (a[5] << 8) | (a[6] << 16) | (a[7] << 24)
  =>
    i32 val = *((i32*)(a+4))

Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D29394 llvm-svn: 294582
-
Simon Pilgrim authored
LowerBuildVectorv16i8/LowerBuildVectorv8i16 insert values into an UNDEF vector if the build vector doesn't contain any zero elements, resulting in register dependencies with a previous use of the register. This patch attempts to break the register dependency either by always zeroing the vector beforehand or (if we're inserting into the 0th element) by using VZEXT_MOVL(SCALAR_TO_VECTOR(i32 AEXT(Elt))), which lowers to (V)MOVD and performs a similar function. Additionally, (V)MOVD is a shorter instruction than PINSRB/PINSRW. We already do something similar for SSE41 PINSRD. On pre-SSE41 LowerBuildVectorv16i8 we go a little further and use VZEXT_MOVL(SCALAR_TO_VECTOR(i32 ZEXT(Elt))) if the build vector contains zeros, avoiding the vector zeroing at the cost of a scalar zero extension; this can probably be brought over to some of the other cases in a future patch (load folding etc.). Differential Revision: https://reviews.llvm.org/D29720 llvm-svn: 294581
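A minimal illustration (our example) of the zero-extending scalar-to-vector move the patch prefers:

    #include <emmintrin.h>

    // _mm_cvtsi32_si128 lowers to (V)MOVD: the scalar lands in element 0
    // and the upper elements are zeroed, so there is no dependency on the
    // previous contents of the destination register, unlike a PINSR*
    // insertion into an undefined or stale vector.
    __m128i scalar_to_vector(int x) {
        return _mm_cvtsi32_si128(x);
    }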
-
Ismail Donmez authored
llvm-svn: 294580
-
Peter Smith authored
Much of the code in PltSection and IPltSection is similar. We identify the IPlt by a HeaderSize of 0 and alter our behaviour in the member functions appropriately:
- the Iplt does not have a header
- the Iplt always follows after the Plt
Differential Revision: https://reviews.llvm.org/D29664 llvm-svn: 294579
-
Alexander Kornienko authored
llvm-svn: 294578
-
Peter Smith authored
When we need a copy relocation we create a synthetic SHT_NOBITS section that contains the right amount of ZI and assign it to either .bss or .rel.ro.bss as appropriate. This allows the dynamic relocation to be placed on the InputSection, removing the last case where a dynamic relocation is stored as an offset from the OutputSection. This has the side effect that we can run assignOffsets() after scanRelocs() without losing the additional ZI needed for the copy relocations. Differential Revision: https://reviews.llvm.org/D29637 llvm-svn: 294577
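A sketch of when such a copy relocation arises (our example; file names are hypothetical):

    // libvalue.cpp, built into a shared library libvalue.so:
    int shared_value = 42;

    // main.cpp, built as a non-PIC executable against libvalue.so. The
    // direct reference forces the static linker to reserve zero-initialized
    // space for the variable in .bss (or .rel.ro.bss for read-only data)
    // and to emit a copy relocation against it:
    extern int shared_value;
    int main() { return shared_value; }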
-
Tobias Grosser authored
During SCoP construction we sometimes inspect the underlying IR by looking at the base address of a MemoryAccess. In such cases, we always want the original base address. Make this clear by calling getOriginalBaseAddr(). This is a non-functional change as getBaseAddr maps to getOriginalBaseAddr at the moment. This change removes unnecessary uses of MemoryAccess::getBaseAddr() in preparation for https://reviews.llvm.org/D28518. llvm-svn: 294576
-
Tobias Grosser authored
The base address of a memory access is already an llvm::Value. Hence, there is no need to go through SCEV, but we can directly work with the llvm::Value. Also use 'Value *' instead of 'auto' for cases where the type is not obvious. llvm-svn: 294575
-
Tobias Grosser authored
Instead of iterating over statements and their memory accesses to extract the set of available base pointers, just directly iterate over all ScopArray objects. This reflects more the actual intent of the code: collect all arrays (and their base pointers) to emit alias information that specifies that accesses to different arrays cannot alias. This change removes unnecessary uses of MemoryAccess::getBaseAddr() in preparation for https://reviews.llvm.org/D28518. llvm-svn: 294574
-
Asiri Rathnayake authored
Different platforms implement the wait/sleep functions in different ways. It makes sense to externalize this into the threading API. Differential revision: https://reviews.llvm.org/D29630 Reviewers: EricWF, joerg llvm-svn: 294573
-
Vitaly Buka authored
BlockValueStack can be reallocated, making the reference e invalid. llvm-svn: 294572
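The general C++ hazard behind this fix (a sketch, not LVI's actual code):

    #include <vector>

    void reallocation_hazard() {
        std::vector<int> values{1, 2, 3};
        int &ref = values.back();  // reference into the current allocation
        values.push_back(4);       // may reallocate; `ref` now dangles
        // reading or writing through `ref` here is undefined behavior
    }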
-
Roman Gareev authored
There are problems with using the machine information to derive the precise vector size on polly-amd64-linux and polly-arm-linux. We temporarily disable the problematic run lines. llvm-svn: 294571
-
Krasimir Georgiev authored
llvm-svn: 294570
-
Roman Gareev authored
size. llvm-svn: 294569
-
Tobias Grosser authored
Before this change we used the name of the base pointer to mark reductions. This is imprecise as the canonical reference is the ScopArray itself and not the base pointer of a reduction. Using the base pointer of reductions is problematic in cases where a single ScopArray is referenced through two different base pointers. This change removes unnecessary uses of MemoryAccess::getBaseAddr() in preparation for https://reviews.llvm.org/D28518. llvm-svn: 294568
-
Tobias Grosser authored
When computing reduction dependences we first identify all ScopArrays which are part of reductions and then compute, for these ScopArrays only, the more detailed data dependences that allow us to identify reductions and optimize across them. Instead of using the base pointer as identifier of a ScopArray, it is clearer and more understandable to directly use the ScopArray as identifier. This change implements such a switch. This change removes unnecessary uses of MemoryAccess::getBaseAddr() in preparation for https://reviews.llvm.org/D28518. llvm-svn: 294567
-
Tobias Grosser authored
When regenerating code in the BlockGenerator we copy instructions that may reference scalar values, for which the new value of a given scalar is looked up in BBMap using the original scalar llvm::Value as index. It is consequently necessary that (re)loaded scalar values are made available in BBMap using the original llvm::Value as key, independently of whether the llvm::Value was (re)loaded from the original scalar or a new access function caused the value to be reloaded from an array with a different base address. We make this clear by using MemoryAccess::getOriginalBaseAddr() instead of MemoryAccess::getBaseAddr() as index to BBMap. This change removes unnecessary uses of MemoryAccess::getBaseAddr() in preparation for https://reviews.llvm.org/D28518. llvm-svn: 294566
-
Igor Breger authored
llvm-svn: 294565
-
Roman Gareev authored
optimization Isolate a set of partial tile prefixes to allow hoisting and sinking out of the unrolled innermost loops produced by the optimization of the matrix multiplication. In case it cannot be proved that the number of loop iterations can be evenly divided by the tile sizes, and we tile and unroll the point loop, isl generates conditional expressions. Subsequently, these conditional expressions can prevent stores and loads of the unrolled loops from being sunk and hoisted. The patch isolates a set of partial tile prefixes, which have exactly Mr x Nr iterations of the two innermost loops, the result of the loop tiling performed by the matrix multiplication optimization, where Mr and Nr are parameters of the micro-kernel. This helps to get rid of the conditional expressions of the unrolled innermost loops. This approach can probably be replaced with padding in the future. In the case of, for example, the gemm from Polybench/C 3.2 and parametric loop bounds, it helps to increase the performance from 7.98 GFlops (27.71% of theoretical peak) to 21.47 GFlops (74.57% of theoretical peak). Hence, we get the same performance as in the case of scalar loop bounds. It also causes a compile-time regression: the compile time increased from 0.795 seconds to 0.837 seconds in the case of scalar loop bounds and from 1.222 seconds to 1.490 seconds in the case of parametric loop bounds. Reviewed-by: Michael Kruse <llvm@meinersbur.de> Differential Revision: https://reviews.llvm.org/D29244 llvm-svn: 294564
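For intuition, a hand-written sketch (ours, not Polly's output) of why separating full tiles from partial ones matters:

    // Mr x Nr micro-tiles over an M x N space. The full-tile loop nest
    // has constant trip counts and no bounds checks, so after unrolling
    // its loads/stores can be hoisted/sunk; only the remainder loops
    // keep the conditions.
    void tiled_update(float *C, int M, int N) {
        const int Mr = 4, Nr = 8;                   // micro-kernel parameters
        const int MF = M - M % Mr, NF = N - N % Nr; // full-tile extent
        for (int i = 0; i < MF; i += Mr)
            for (int j = 0; j < NF; j += Nr)
                for (int ii = 0; ii < Mr; ++ii)      // constant trip count
                    for (int jj = 0; jj < Nr; ++jj)  // constant trip count
                        C[(i + ii) * N + (j + jj)] += 1.0f;
        for (int i = 0; i < M; ++i)                  // partial tiles
            for (int j = (i < MF ? NF : 0); j < N; ++j)
                C[i * N + j] += 1.0f;
    }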
-
Dean Michael Berris authored
Summary: Fixing a bug I found when testing a reader for the FDR format. The function ID is now correctly packed into the 28 bits documented for it, instead of being masked to all ones. Reviewers: dberris, pelikan, eugenis Reviewed By: dberris Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D29698 llvm-svn: 294563
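An illustrative sketch of the kind of packing described (the field layout here is ours, not the exact FDR record layout):

    #include <cstdint>

    // Keep exactly 28 bits of the function ID with a mask; the bug fixed
    // above effectively masked the field to all ones instead.
    uint32_t pack_function_record(uint32_t fn_id, uint32_t type_bits) {
        return (type_bits & 0xFu) | ((fn_id & 0x0FFFFFFFu) << 4);
    }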
-
Craig Topper authored
We only implemented it for one of the three HLE instructions, and that instruction is also covered by the RTM flag. Clang only implements the RTM flag on its command line. llvm-svn: 294562
-
Craig Topper authored
[X86] Remove INVPCID and SMAP feature flags. They aren't currently used by any instructions and are not tested. If we implement intrinsics for their instructions in the future, the feature flags can be added back with proper testing. llvm-svn: 294561
-
Craig Topper authored
llvm-svn: 294560
-