- Oct 28, 2014
-
-
Robert Khasanov authored
After the commit at rev219046, lowering of 512-bit broadcasts became non-optimal. Most of the tests on broadcasting and embedded broadcasting were changed, and they no longer produce efficient code. The example below is from the commit's changes (it's the first test from test/CodeGen/X86/avx512-vbroadcast.ll):

 define <16 x i32> @_inreg16xi32(i32 %a) {
 ; CHECK-LABEL: _inreg16xi32:
 ; CHECK: ## BB#0:
-; CHECK-NEXT: vpbroadcastd %edi, %zmm0
+; CHECK-NEXT: vmovd %edi, %xmm0
+; CHECK-NEXT: vpbroadcastd %xmm0, %ymm0
+; CHECK-NEXT: vinserti64x4 $1, %ymm0, %zmm0, %zmm0
 ; CHECK-NEXT: retq
   %b = insertelement <16 x i32> undef, i32 %a, i32 0
   %c = shufflevector <16 x i32> %b, <16 x i32> undef, <16 x i32> zeroinitializer
   ret <16 x i32> %c
 }

Here, a 256-bit broadcast was generated instead of a 512-bit one.

In this patch I:
1) Added vector-shuffle lowering through broadcasts
2) Removed asserts and branches like the following, because they are incorrect: assert(Subtarget->hasDQI() && "We can only lower v8i64 with AVX-512-DQI");
3) Fixed lowering tests

llvm-svn: 220774
-
NAKAMURA Takumi authored
llvm-svn: 220773
-
NAKAMURA Takumi authored
llvm-svn: 220772
-
NAKAMURA Takumi authored
llvm-svn: 220771
-
NAKAMURA Takumi authored
llvm-svn: 220770
-
Eric Fiselier authored
[libcxx] Delay evaluation of __make_tuple_types to prevent blowing the max template instantiation depth. Fixes Bug #18345

Summary: http://llvm.org/bugs/show_bug.cgi?id=18345

Tuple's constructor and assignment operators for "tuple-like" types evaluate __make_tuple_types unnecessarily. In the case of a large array this can blow the template instantiation depth. Ex:

```
#include <array>
#include <tuple>
#include <memory>

typedef std::array<int, 1256> array_t;
typedef std::tuple<array_t> tuple_t;

int main() {
  array_t a;
  tuple_t t(a); // broken
  t = a; // broken
  // make_shared uses tuple behind the scenes. This bug breaks this code.
  std::make_shared<array_t>(a);
}
```

To prevent this from happening we delay the instantiation of `__make_tuple_types` until after we perform the length check. Currently `__make_tuple_types` is instantiated at the same time as the length check.

Test Plan: Two tests have been added. One for the "tuple-like" constructors and another for the "tuple-like" assignment operator.

Reviewers: mclow.lists, EricWF
Reviewed By: EricWF
Subscribers: K-ballo, cfe-commits
Differential Revision: http://reviews.llvm.org/D4467
llvm-svn: 220769
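The idea of checking the length first and only then touching the expensive metafunction can be sketched as follows. This is a minimal illustration, not libc++'s actual code: `expensive_types`, `convertible_after_length_check`, and `tuple_like_convertible` are hypothetical names, and the `static_assert` inside `expensive_types` merely stands in for "instantiating this on a large type would exceed the instantiation depth".

```cpp
#include <array>
#include <cstddef>
#include <tuple>
#include <type_traits>

// Hypothetical stand-in for an expensive metafunction such as
// __make_tuple_types. The static_assert simulates the failure that
// occurs when it is instantiated for a very large tuple-like type.
template <class T>
struct expensive_types {
    static_assert(std::tuple_size<T>::value <= 64,
                  "simulated: template instantiation depth exceeded");
    typedef std::true_type type;  // pretend result of the real check
};

// Primary template: selected when the length check fails. It never
// names expensive_types<From>, so that template is not instantiated.
template <bool LengthsMatch, class From, class To>
struct convertible_after_length_check : std::false_type {};

// Partial specialization: only instantiated once the lengths match.
template <class From, class To>
struct convertible_after_length_check<true, From, To>
    : expensive_types<From>::type {};

// Hypothetical trait: the length check is the first template argument,
// so it is evaluated before the expensive part is ever requested.
template <class From, class To>
struct tuple_like_convertible
    : convertible_after_length_check<
          std::tuple_size<From>::value == std::tuple_size<To>::value,
          From, To> {};
```

With this shape, asking whether `std::array<int, 1256>` is tuple-like-convertible to `std::tuple<std::array<int, 1256>>` resolves to `false` on the length mismatch alone (1256 versus 1), without instantiating the expensive template.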
-
Rui Ueyama authored
test/elf/Mips/hilo16-*.test depends on llvm-mc, so we need to make CMake build it before running the tests. llvm-svn: 220768
-
Peter Zotov authored
We don't care about pre-3.12.1 anymore. llvm-svn: 220767
-
Peter Zotov authored
llvm-svn: 220766
-
Richard Trieu authored
llvm-svn: 220763
-
Jason Molenda authored
PlatformLinux::GetSoftwareBreakpointTrapOpcode. Patch by Stephane Sezer. http://reviews.llvm.org/D5923 llvm-svn: 220762
-
Jason Molenda authored
<rdar://problem/18786645> llvm-svn: 220761
-
Saleem Abdulrasool authored
The option is '--allow-multiple-definition' not '--allow-multiple-definitions'. llvm-svn: 220760
-
David Blaikie authored
llvm-svn: 220759
-
Reid Kleckner authored
This is a Microsoft calling convention that supports both x86 and x86_64 subtargets. It passes vector and floating point arguments in XMM0-XMM5, and passes them indirectly once those registers are consumed. Homogeneous vector aggregates of up to four elements can be passed in sequential vector registers, but this part is not implemented in LLVM and will be handled in Clang. On 32-bit x86, it is similar to fastcall in that it uses ecx:edx as integer register parameters and is callee cleanup. On x86_64, it delegates to the normal win64 calling convention. Reviewers: majnemer Differential Revision: http://reviews.llvm.org/D5943 llvm-svn: 220745
-
Tim Northover authored
Benchmarks have shown that it's harmless to the performance there, and having a unified set of passes between the two cores where possible helps big.LITTLE deployment. Patch by Z. Zheng. llvm-svn: 220744
-
Jim Ingham authored
llvm-svn: 220743
-
Rafael Espindola authored
llvm-svn: 220742
-
Rafael Espindola authored
I noticed that it was untested, and forcing it on caused some tests to fail: LLVM :: Linker/metadata-a.ll LLVM :: Linker/prefixdata.ll LLVM :: Linker/type-unique-odr-a.ll LLVM :: Linker/type-unique-simple-a.ll LLVM :: Linker/type-unique-simple2-a.ll LLVM :: Linker/type-unique-simple2.ll LLVM :: Linker/type-unique-type-array-a.ll LLVM :: Linker/unnamed-addr1-a.ll LLVM :: Linker/visibility1.ll If it is to be resurrected, it has to be fixed and we should probably have a -preserve-source command line option in llvm-mc and run tests with and without it. llvm-svn: 220741
-
Fariborz Jahanian authored
to error. rdar://18768214. llvm-svn: 220740
-
NAKAMURA Takumi authored
llvm-svn: 220739
-
Richard Smith authored
llvm-svn: 220738
-
Adam Nemet authored
This is implemented via a multiclass that derives from the vperm imm multiclass. Fixes <rdar://problem/18426089> llvm-svn: 220737
-
Adam Nemet authored
No functionality change. No change in X86.td.expanded except that we only set the CD8 attributes for the memory variants. (This shouldn't be used unless we have a memory operand.) llvm-svn: 220736
-
Adam Nemet authored
This used to derive from avx512_pshuf_imm which is confusing. NFC. Compared X86.td.expanded. llvm-svn: 220735
-
Adam Nemet authored
1) i512mem -> f512mem (this is the packed FP input being permuted) 2) element size is 64 bits in EVEX_CD8 for PD. (A good illustration why X86VectorVTInfo is useful) llvm-svn: 220734
-
Rafael Espindola authored
llvm-svn: 220733
-
Rafael Espindola authored
llvm-svn: 220732
-
Richard Smith authored
llvm-svn: 220731
-
- Oct 27, 2014
-
-
Tim Northover authored
llvm-svn: 220730
-
Jon Roelofs authored
http://reviews.llvm.org/D6006 llvm-svn: 220729
-
Pete Cooper authored
For a call not to return into the stackmap shadow, the shadow must end with the call. To do this, we must insert any required nops *before* the call, and not after it. llvm-svn: 220728
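The padding arithmetic can be sketched in a simplified model. This is only an illustration under stated assumptions, not LLVM's implementation: `nopsBeforeCall` and its parameters are hypothetical, and sizes are counted in bytes from the start of the shadow.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: how many nop bytes to place *before* a call so
// that the call's final byte is also the final byte of the shadow.
//   shadowBytes  - size of the stackmap shadow that must be filled
//   emittedBytes - bytes already emitted inside the shadow
//   callBytes    - encoded size of the call instruction
std::size_t nopsBeforeCall(std::size_t shadowBytes,
                           std::size_t emittedBytes,
                           std::size_t callBytes) {
    std::size_t used = emittedBytes + callBytes;
    // If the call already reaches or passes the end of the shadow, its
    // return address lies outside the shadow and no padding is needed.
    return used < shadowBytes ? shadowBytes - used : 0;
}
```

Padding before the call means the return address is the first byte after the shadow. Had the nops been placed after the call, the call would return to an address still inside the shadow.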
-
Fariborz Jahanian authored
to C type a collection literal. rdar://18768214 llvm-svn: 220727
-
David Majnemer authored
GCC doesn't do this and it seems weird to include a file that we can't open. This fixes PR21362. llvm-svn: 220726
-
Hans Wennborg authored
Looks like some builds were not happy with the potentially-throwing move constructor that was added in r220723, and reached for the implicitly deleted copy constructor instead. llvm-svn: 220725
-
Eric Fiselier authored
For targets that end in `redhat-linux` and `suse-linux`, manually add the `-gnu` section of the target, since `linux-gnu` is needed in the testsuite. This patch also moves the removal of minor and patchlevel numbers from OSX triples to be handled when deducing the triple instead of when adding available features. llvm-svn: 220724
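The `-gnu` suffix rule can be sketched as a small string transformation. This is a hedged illustration written in C++ for consistency with the other examples here, not the test suite's actual configuration code. `endsWith` and `addGnuEnvironment` are hypothetical names.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: true if `s` ends with `suffix`.
static bool endsWith(const std::string &s, const std::string &suffix) {
    return s.size() >= suffix.size() &&
           s.compare(s.size() - suffix.size(), suffix.size(), suffix) == 0;
}

// Hypothetical helper: append "-gnu" to triples that end in
// "redhat-linux" or "suse-linux"; leave every other triple unchanged.
std::string addGnuEnvironment(const std::string &triple) {
    if (endsWith(triple, "redhat-linux") || endsWith(triple, "suse-linux"))
        return triple + "-gnu";
    return triple;
}
```

For example, `addGnuEnvironment("x86_64-redhat-linux")` yields `x86_64-redhat-linux-gnu`, while a triple that already ends in `linux-gnu` passes through unchanged.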
-
Hans Wennborg authored
llvm-svn: 220723
-
Eric Fiselier authored
llvm-svn: 220722
-
Jingyue Wu authored
to be consistent with its definition in ScalarEvolution.cpp llvm-svn: 220721
-
Greg Clayton authored
Make sure OTHER_CFLAGS and OTHER_LDFLAGS are inherited from the Xcode project so you can easily add to the flags of all targets. llvm-svn: 220720
-