- Jun 26, 2015
-
-
Sergey Dmitrouk authored
Without an explicit exception for the path, it matches the projects/* rule. llvm-svn: 240771
-
David Majnemer authored
No functionality changed, just keeping things clean. llvm-svn: 240762
-
Hao Liu authored
llvm-svn: 240760
-
Hao Liu authored
[ARM] Lower interleaved memory accesses to vldN/vstN intrinsics. This patch also adds a function to calculate the cost of interleaved memory accesses.

E.g. Lower an interleaved load:
  %wide.vec = load <8 x i32>, <8 x i32>* %ptr, align 4
  %v0 = shuffle %wide.vec, undef, <0, 2, 4, 6>
  %v1 = shuffle %wide.vec, undef, <1, 3, 5, 7>
into:
  %vld2 = { <4 x i32>, <4 x i32> } call llvm.arm.neon.vld2(%ptr, 4)
  %vec0 = extractelement { <4 x i32>, <4 x i32> } %vld2, i32 0
  %vec1 = extractelement { <4 x i32>, <4 x i32> } %vld2, i32 1

E.g. Lower an interleaved store:
  %i.vec = shuffle <8 x i32> %v0, <8 x i32> %v1, <0, 4, 8, 1, 5, 9, 2, 6, 10, 3, 7, 11>
  store <12 x i32> %i.vec, <12 x i32>* %ptr, align 4
into:
  %sub.v0 = shuffle <8 x i32> %v0, <8 x i32> %v1, <0, 1, 2, 3>
  %sub.v1 = shuffle <8 x i32> %v0, <8 x i32> %v1, <4, 5, 6, 7>
  %sub.v2 = shuffle <8 x i32> %v0, <8 x i32> %v1, <8, 9, 10, 11>
  call void llvm.arm.neon.vst3(%ptr, %sub.v0, %sub.v1, %sub.v2, 4)

Differential Revision: http://reviews.llvm.org/D10533

llvm-svn: 240755
-
Hao Liu authored
[AArch64] Lower interleaved memory accesses to ldN/stN intrinsics. This patch also adds a function to calculate the cost of interleaved memory accesses.

E.g. Lower an interleaved load:
  %wide.vec = load <8 x i32>, <8 x i32>* %ptr
  %v0 = shuffle %wide.vec, undef, <0, 2, 4, 6>
  %v1 = shuffle %wide.vec, undef, <1, 3, 5, 7>
into:
  %ld2 = { <4 x i32>, <4 x i32> } call llvm.aarch64.neon.ld2(%ptr)
  %vec0 = extractelement { <4 x i32>, <4 x i32> } %ld2, i32 0
  %vec1 = extractelement { <4 x i32>, <4 x i32> } %ld2, i32 1

E.g. Lower an interleaved store:
  %i.vec = shuffle <8 x i32> %v0, <8 x i32> %v1, <0, 4, 8, 1, 5, 9, 2, 6, 10, 3, 7, 11>
  store <12 x i32> %i.vec, <12 x i32>* %ptr
into:
  %sub.v0 = shuffle <8 x i32> %v0, <8 x i32> %v1, <0, 1, 2, 3>
  %sub.v1 = shuffle <8 x i32> %v0, <8 x i32> %v1, <4, 5, 6, 7>
  %sub.v2 = shuffle <8 x i32> %v0, <8 x i32> %v1, <8, 9, 10, 11>
  call void llvm.aarch64.neon.st3(%sub.v0, %sub.v1, %sub.v2, %ptr)

Differential Revision: http://reviews.llvm.org/D10533

llvm-svn: 240754
-
Hao Liu authored
[InterleavedAccess] Add a pass InterleavedAccess to identify interleaved memory accesses and transform them into target-specific intrinsics.

E.g. An interleaved load (Factor = 2):
  %wide.vec = load <8 x i32>, <8 x i32>* %ptr
  %v0 = shuffle <8 x i32> %wide.vec, <8 x i32> undef, <0, 2, 4, 6>
  %v1 = shuffle <8 x i32> %wide.vec, <8 x i32> undef, <1, 3, 5, 7>
It can be transformed into an ld2 intrinsic in the AArch64 backend or a vld2 intrinsic in the ARM backend.

E.g. An interleaved store (Factor = 3):
  %i.vec = shuffle <8 x i32> %v0, <8 x i32> %v1, <0, 4, 8, 1, 5, 9, 2, 6, 10, 3, 7, 11>
  store <12 x i32> %i.vec, <12 x i32>* %ptr
It can be transformed into an st3 intrinsic in the AArch64 backend or a vst3 intrinsic in the ARM backend.

Differential Revision: http://reviews.llvm.org/D10533

llvm-svn: 240751
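To make the shuffle-mask pattern above concrete, here is a rough standalone sketch of the strided-mask check such a pass has to perform; the function name and types are illustrative, not the actual LLVM API:

  #include <cstddef>
  #include <vector>

  // Hypothetical helper: returns true if Mask selects every Factor-th
  // element starting at Start, i.e. <Start, Start+Factor, Start+2*Factor, ...>.
  // Masks of this shape are what mark a shuffle as de-interleaving a wide
  // load (e.g. <0, 2, 4, 6> and <1, 3, 5, 7> for Factor = 2 above).
  static bool isStridedMask(const std::vector<int> &Mask, int Start, int Factor) {
    for (std::size_t I = 0; I != Mask.size(); ++I)
      if (Mask[I] != Start + static_cast<int>(I) * Factor)
        return false;
    return true;
  }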
-
Duncan P. N. Exon Smith authored
r240748 seems to be on the right path. Be more explicit. http://lab.llvm.org:8011/builders/clang-x64-ninja-win7/builds/1961/ llvm-svn: 240750
-
Duncan P. N. Exon Smith authored
Try to placate bots by explicitly scoping a conversion constructor from `iterator` to `const_iterator`. http://lab.llvm.org:8011/builders/sanitizer-windows/builds/5931/ llvm-svn: 240748
-
Matthias Braun authored
Revert until http://llvm.org/PR23955 is investigated. This reverts commit r239309. llvm-svn: 240746
-
Matthias Braun authored
llvm-svn: 240745
-
Matthias Braun authored
llvm-svn: 240744
-
Alexey Samsonov authored
It can be more robust than copying debug info from the first non-alloca instruction in the entry basic block. We use the same strategy in coverage instrumentation. llvm-svn: 240738
-
Duncan P. N. Exon Smith authored
Replace the `std::vector<>` for `DIE::Children` with an intrusively linked list. This is a strict memory improvement: it requires no auxiliary storage, and reduces `sizeof(DIE)` by one pointer. It also factors out the DIE-related malloc traffic. This drops llc memory usage from 735 MB down to 718 MB, or ~2.3%. (I'm looking at `llc` memory usage on `verify-uselistorder.lto.opt.bc`; see r236629 for details.) llvm-svn: 240736
-
Duncan P. N. Exon Smith authored
Change `DIE::Values` to a singly linked list, where each node is allocated on a `BumpPtrAllocator`. In order to support `push_back()`, the list is circular, and points at the tail element instead of the head. I abstracted the core list logic out to `IntrusiveBackList` so that it can be reused for `DIE::Children`, which also cares about `push_back()`. This drops llc memory usage from 799 MB down to 735 MB, about 8%. (I'm looking at `llc` memory usage on `verify-uselistorder.lto.opt.bc`; see r236629 for details.) llvm-svn: 240733
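As an illustration of the trick described above (a minimal sketch, not the actual IntrusiveBackList code): a circular singly linked list that stores only a tail pointer still reaches both ends in O(1), because the tail's next pointer wraps around to the head.

  // Minimal sketch, assuming a node type that embeds its own link.
  struct Node {
    Node *Next = nullptr;
  };

  struct BackList {
    Node *Tail = nullptr;        // Tail->Next wraps around to the head.

    bool empty() const { return Tail == nullptr; }
    Node *front() const { return Tail->Next; }
    Node *back() const { return Tail; }

    void push_back(Node &N) {
      if (!Tail) {
        N.Next = &N;             // Sole element links to itself.
      } else {
        N.Next = Tail->Next;     // New tail adopts the old head...
        Tail->Next = &N;         // ...and the old tail links to it.
      }
      Tail = &N;                 // List now ends at N.
    }
  };

Storing the tail rather than the head is what makes push_back() cheap while keeping the per-list overhead at a single pointer.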
-
Michael J. Spencer authored
llvm-svn: 240731
-
Michael J. Spencer authored
llvm-svn: 240730
-
Alexey Samsonov authored
llvm-svn: 240729
-
NAKAMURA Takumi authored
llvm-svn: 240727
-
Adrian Prantl authored
llvm-svn: 240726
-
Anna Zaks authored
Do not instrument globals that are placed in sections containing "__llvm" in their name. This fixes a bug in ASan / PGO interoperability. ASan interferes with LLVM's PGO, which places its globals into a special section, which is memcpy-ed by the linker as a whole. When those globals are instrumented, ASan's memcpy wrapper reports an issue. http://reviews.llvm.org/D10541 llvm-svn: 240723
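The check itself is simple; a hedged sketch of the rule as described (the helper name is hypothetical, not ASan's actual code):

  #include <string>

  // Skip instrumentation for globals whose section name contains "__llvm",
  // e.g. the PGO counter sections that the linker copies wholesale.
  static bool shouldSkipSection(const std::string &SectionName) {
    return SectionName.find("__llvm") != std::string::npos;
  }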
-
Anna Zaks authored
It makes LLVM run out of registers even on 64-bit platforms. For example, the following test case fails on darwin:

  clang -cc1 -O0 -triple x86_64-apple-macosx10.10.0 -emit-obj -fsanitize=address -mstackrealign -o ~/tmp/ex.o -x c ex.c
  error: inline assembly requires more registers than available

  void TestInlineAssembly(const unsigned char *S, unsigned int pS,
                          unsigned char *D, unsigned int pD, unsigned int h) {
    unsigned int sr = 4, pDiffD = pD - 5;
    unsigned int pDiffS = (pS << 1) - 5;
    char flagSA = ((pS & 15) == 0), flagDA = ((pD & 15) == 0);

    asm volatile (
      "mov %0, %%"PTR_REG("si")"\n"
      "mov %2, %%"PTR_REG("cx")"\n"
      "mov %1, %%"PTR_REG("di")"\n"
      "mov %8, %%"PTR_REG("ax")"\n"
      :
      : "m" (S), "m" (D), "m" (pS), "m" (pDiffS), "m" (pDiffD), "m" (sr),
        "m" (flagSA), "m" (flagDA), "m" (h)
      : "%"PTR_REG("si"), "%"PTR_REG("di"), "%"PTR_REG("ax"),
        "%"PTR_REG("cx"), "%"PTR_REG("dx"), "memory"
    );
  }

http://reviews.llvm.org/D10719

llvm-svn: 240722
-
Adrian Prantl authored
While looking at a couple of bugs in the debug info output for bitfields I noticed that there wasn't a single regression test to test my changes against, so here's a start. llvm-svn: 240717
-
Matt Arsenault authored
llvm-svn: 240709
-
Rafael Espindola authored
This allows user code to say Sym.getSize() instead of having to manually fetch the object. llvm-svn: 240708
-
- Jun 25, 2015
-
-
Frederic Riss authored
r224810 fixed the handling of macro debug locations in AsmParser. This patch fixes the logic to actually do what was intended: it uses the first macro of the macro stack instead of the last one. The updated testcase shows that the current scheme doesn't work when macro instantiations are nested and multiple files are used.

Reviewers: compnerd

Differential Revision: http://reviews.llvm.org/D10463

llvm-svn: 240705
-
Michael J. Spencer authored
llvm-svn: 240703
-
Duncan P. N. Exon Smith authored
Split out code to patch up the `DW_AT_stmt_list` for the cloned DIE, and reorganize it so that it doesn't depend on `DIE::values_begin()` and `DIE::values_end()` (which I'm trying to kill off). David Blaikie and I talked about adding a range-algorithm version of `std::find_if()`, but the assertion *still* required getting at the end iterator. IMO, a separate helper function with an early return is easier to reason about here. A follow-up commit that removes `DIE::setValue()` and mutates the `DIEValue` directly is coming shortly. llvm-svn: 240701
-
Sanjay Patel authored
llvm-svn: 240699
-
Rafael Espindola authored
This matches the behavior of GNU nm. Fixes PR23930. llvm-svn: 240695
-
Pete Cooper authored
A number of places had explicit loops over Constant::operands(). Just use foreach loops where possible. llvm-svn: 240694
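For reference, the shape of this cleanup (a sketch against the LLVM API of the time; `visit` is a stand-in for whatever each call site actually did with the operand):

  #include "llvm/IR/Constant.h"
  #include "llvm/IR/Use.h"
  using namespace llvm;

  // Hypothetical per-operand action, for illustration only.
  static void visit(const Value *V);

  static void walkOperands(const Constant *C) {
    // Before: explicit index-based loop.
    // for (unsigned I = 0, E = C->getNumOperands(); I != E; ++I)
    //   visit(C->getOperand(I));

    // After: range-based for over Constant::operands().
    for (const Use &U : C->operands())
      visit(U.get());
  }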
-
Rafael Espindola authored
llvm-svn: 240684
-
Jingyue Wu authored
Summary:
Fixes PR23809. Without passing the context to SimplifyICmpInst, we would use the assume to prove that the condition feeding the assume is trivially true (see isValidAssumeForContext in ValueTracking.cpp), causing the removal of the assume, which may be useful for later optimizations.

Test Plan: pr23800.ll

Reviewers: hfinkel, majnemer

Reviewed By: hfinkel

Subscribers: henryhu, llvm-commits, wengxt, broune, meheff, eliben

Differential Revision: http://reviews.llvm.org/D10695

llvm-svn: 240683
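To illustrate the failure mode in source terms (a hypothetical reduction, not the pr23800.ll test itself):

  void f(int x) {
    // clang lowers this roughly to: %cmp = icmp sgt i32 %x, 0
    //                               call void @llvm.assume(i1 %cmp)
    __builtin_assume(x > 0);
    // If the simplifier is allowed to use the assume as context for the
    // very icmp that feeds it, %cmp folds to true, the assume becomes
    // assume(true) and is deleted, losing the x > 0 fact for later passes.
  }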
-
Rafael Espindola authored
We already disallowed .global .Lfoo so this is reasonable. This is a small cherry pick from r240130. llvm-svn: 240681
-
Paul Robinson authored
It was matching at EOF regardless of whether the section was present. llvm-svn: 240679
-
Yaron Keren authored
llvm-svn: 240678
-
Douglas Katzman authored
llvm-svn: 240673
-
Matt Arsenault authored
MemIntrinsicSDNode is already a subclass of MemSDNode, so the MemSDNode check is sufficient. llvm-svn: 240672
-
Peter Collingbourne authored
This previously caused miscompilations as a result of phi nodes receiving undef incoming values from blocks dominated by such successors. Differential Revision: http://reviews.llvm.org/D10726 llvm-svn: 240670
-
Rafael Espindola authored
llvm-svn: 240656
-
Rafael Espindola authored
This matches GNU nm and has the advantage that there is an upper case N. llvm-svn: 240655
-