- Sep 17, 2013
-
-
Sean Callanan authored
constants before using them in the IR interpreter. Patch by Félix Cloutier. llvm-svn: 190877
-
Arnold Schwaighofer authored
Upcoming SLP vectorization improvements will want to be able to estimate costs of horizontal reductions. Add infrastructure to support this.

We model reductions as a series of (shufflevector,add) tuples ultimately followed by an extractelement. For example, for an add-reduction of <4 x float> we could generate the following sequence:

  (v0, v1, v2, v3)
    \   \  /  /
     \   \ /
       +  +
  (v0+v2, v1+v3, undef, undef)
      \      /
  ((v0+v2) + (v1+v3), undef, undef)

  %rdx.shuf = shufflevector <4 x float> %rdx, <4 x float> undef, <4 x i32> <i32 2, i32 3, i32 undef, i32 undef>
  %bin.rdx = fadd <4 x float> %rdx, %rdx.shuf
  %rdx.shuf7 = shufflevector <4 x float> %bin.rdx, <4 x float> undef, <4 x i32> <i32 1, i32 undef, i32 undef, i32 undef>
  %bin.rdx8 = fadd <4 x float> %bin.rdx, %rdx.shuf7
  %r = extractelement <4 x float> %bin.rdx8, i32 0

This commit adds a cost model interface "getReductionCost(Opcode, Ty, Pairwise)" that will allow clients to ask for the cost of such a reduction (as backends might generate more efficient code than the cost of the individual instructions summed up). This interface is exercised by the CostModel analysis pass, which looks for reduction patterns like the one above, starting at extractelements, and calls the cost model interface if it sees a matching sequence.

We will also support a second form of pairwise reduction that is well supported on common architectures (haddps, vpadd, faddp).

  (v0, v1, v2, v3)
    \   /    \  /
  (v0+v1, v2+v3, undef, undef)
      \      /
  ((v0+v1)+(v2+v3), undef, undef, undef)

  %rdx.shuf.0.0 = shufflevector <4 x float> %rdx, <4 x float> undef, <4 x i32> <i32 0, i32 2, i32 undef, i32 undef>
  %rdx.shuf.0.1 = shufflevector <4 x float> %rdx, <4 x float> undef, <4 x i32> <i32 1, i32 3, i32 undef, i32 undef>
  %bin.rdx.0 = fadd <4 x float> %rdx.shuf.0.0, %rdx.shuf.0.1
  %rdx.shuf.1.0 = shufflevector <4 x float> %bin.rdx.0, <4 x float> undef, <4 x i32> <i32 0, i32 undef, i32 undef, i32 undef>
  %rdx.shuf.1.1 = shufflevector <4 x float> %bin.rdx.0, <4 x float> undef, <4 x i32> <i32 1, i32 undef, i32 undef, i32 undef>
  %bin.rdx.1 = fadd <4 x float> %rdx.shuf.1.0, %rdx.shuf.1.1
  %r = extractelement <4 x float> %bin.rdx.1, i32 0

llvm-svn: 190876
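The two reduction orders described in this commit message can be modeled on plain arrays. The following is a minimal sketch, not LLVM code: the function names reduceSplitting and reducePairwise are hypothetical, and each inner loop stands in for one shufflevector/fadd step, with the final return playing the role of the extractelement at index 0.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: true when n is a non-zero power of two.
static bool isPowerOfTwo(std::size_t n) { return n != 0 && (n & (n - 1)) == 0; }

// First form: at each step, add the upper half of the live lanes onto the
// lower half (for 4 lanes: v0+v2, v1+v3, then lane0+lane1).
float reduceSplitting(std::vector<float> v) {
  assert(isPowerOfTwo(v.size()) && "sketch assumes a power-of-two lane count");
  for (std::size_t half = v.size() / 2; half >= 1; half /= 2)
    for (std::size_t i = 0; i < half; ++i)
      v[i] += v[i + half];
  return v[0]; // models extractelement at index 0
}

// Second (pairwise) form: at each step, add adjacent lanes
// (for 4 lanes: v0+v1, v2+v3, then lane0+lane1).
float reducePairwise(std::vector<float> v) {
  assert(isPowerOfTwo(v.size()) && "sketch assumes a power-of-two lane count");
  for (std::size_t n = v.size(); n > 1; n /= 2)
    for (std::size_t i = 0; i < n / 2; ++i)
      v[i] = v[2 * i] + v[2 * i + 1];
  return v[0]; // models extractelement at index 0
}
```

Both functions take log2(N) add steps for N lanes; they differ only in which lanes are paired, which is the distinction the Pairwise flag of getReductionCost is meant to capture.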
-
Ed Maste authored
llvm-svn: 190875
-
Fariborz Jahanian authored
the ObjectiveC object of an @synchronized statement. // rdar://14993814 llvm-svn: 190874
-
Ed Maste authored
We cannot use "GetMaxU64Bitfield" for non-power-of-two sizes, so just use the same code that handles N > 8 for these. Review: http://llvm-reviews.chandlerc.com/D1699 llvm-svn: 190873
-
Andrew Kaylor authored
llvm-svn: 190872
-
Arnold Schwaighofer authored
We can't insert an insertelement after an invoke. We would have to split a critical edge. So when we see a phi node that uses an invoke we just give up. radar://14990770 llvm-svn: 190871
-
Quentin Colombet authored
other in memory. The motivation was to get rid of truncate and shift right instructions that get in the way of paired load or floating point load. E.g., consider the following example:

  struct Complex { float real; float imm; };

When accessing a complex, llvm was generating a 64-bit load and the imm field was obtained by a trunc(lshr) sequence, resulting in poor code generation, at least for x86. The idea is to declare that two load instructions are the canonical form for loading two arithmetic types that are next to each other in memory. Two scalar loads at a constant offset from each other are pretty easy to detect for the sorts of passes that like to mess with loads. <rdar://problem/14477220> llvm-svn: 190870
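The difference between the two access patterns can be sketched in plain C++. This is an illustration only, not the optimizer's code: the helper names are hypothetical, and the 64-bit value is assembled by hand to model a little-endian wide load (as on x86) regardless of the host.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

struct Complex { float real; float imm; };

static_assert(sizeof(float) == 4, "sketch assumes 32-bit float");

// Hypothetical helpers: reinterpret a float's bits without undefined behavior.
static std::uint32_t bitsOf(float f) {
  std::uint32_t b;
  std::memcpy(&b, &f, sizeof b);
  return b;
}
static float floatOf(std::uint32_t b) {
  float f;
  std::memcpy(&f, &b, sizeof f);
  return f;
}

// Models one 64-bit load of the whole struct on a little-endian target:
// 'real' lands in the low 32 bits, 'imm' in the high 32 bits.
std::uint64_t wideLoad(const Complex &c) {
  return std::uint64_t(bitsOf(c.real)) | (std::uint64_t(bitsOf(c.imm)) << 32);
}

// The old form: the field is recovered with trunc(lshr wide, 32).
float immViaWideLoad(const Complex &c) {
  std::uint64_t wide = wideLoad(c);
  return floatOf(static_cast<std::uint32_t>(wide >> 32));
}

// The canonical form described above: a scalar load at a constant offset.
float immViaScalarLoad(const Complex &c) { return c.imm; }
```

Both paths yield the same value; the scalar form simply avoids the integer shift and truncate that sit between the load and the floating point use.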
-
Preston Gurd authored
llvm-svn: 190869
-
Daniel Malea authored
- searches frames beginning from the current frame, stops when an equivalent context is found
- not using GetStackFrameCount() for performance reasons
- fixes TestInlineStepping (clang/gcc buildbots)
llvm-svn: 190868
-
Daniel Malea authored
- now fails due to llvm.org/pr15415 (partial stack trace while stopped inside read() call) llvm-svn: 190867
-
Serge Pavlov authored
llvm-svn: 190866
-
Daniel Malea authored
- original bug llvm.org/pr14323 is long closed llvm-svn: 190865
-
Ben Langmuir authored
Add llvm.x86.* intrinsics for all of the Intel SHA Extensions instructions, as well as tests. Also remove mayLoad and hasSideEffects, which can be inferred from the instruction patterns. llvm-svn: 190864
-
Kostya Serebryany authored
[asan] inline the calls to __asan_stack_free_* with small sizes. Yet another 10%-20% speedup for use-after-return llvm-svn: 190863
-
Joey Gouly authored
llvm-svn: 190862
-
Daniel Jasper authored
This fixes llvm.org/PR17265. Before: Foo::Foo() #ifdef BAR : baz(0) #endif { } After: Foo::Foo() #ifdef BAR : baz(0) #endif { } llvm-svn: 190861
-
Alexey Samsonov authored
llvm-svn: 190860
-
Stepan Dyatkovskiy authored
Wrong cast operation: MergeFunctions emitted a Bitcast where a pointer-to-integer operation was required. The patch fixes the MergeFunctions::writeThunk function. It replaces the unconditional Bitcast creation with a "Value* createCast(...)" method that checks the operand types and selects the proper instruction. See the unit test for an example. llvm-svn: 190859
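The idea of choosing the cast from the operand types can be sketched without any LLVM dependency. The enums and the selectCast function below are hypothetical stand-ins, not the actual createCast from the patch, which operates on LLVM values and may handle more cases than the integer/pointer ones modeled here.

```cpp
#include <cassert>

// Hypothetical, simplified model of the type categories involved.
enum class TypeKind { Integer, Pointer };

// Hypothetical model of the cast instructions to choose from.
enum class CastOp { BitCast, PtrToInt, IntToPtr };

// Pick the cast from the source and destination type kinds instead of
// unconditionally emitting a bitcast.
CastOp selectCast(TypeKind From, TypeKind To) {
  if (From == TypeKind::Pointer && To == TypeKind::Integer)
    return CastOp::PtrToInt;
  if (From == TypeKind::Integer && To == TypeKind::Pointer)
    return CastOp::IntToPtr;
  // Same kind on both sides: a plain bitcast is the fallback in this sketch.
  return CastOp::BitCast;
}
```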
-
Daniel Jasper authored
llvm-svn: 190858
-
Joerg Sonnenberger authored
llvm-svn: 190857
-
Alexey Samsonov authored
llvm-svn: 190856
-
Daniel Jasper authored
Before: if () { } else { } After: if () { } else { } This fixed llvm.org/PR17262. llvm-svn: 190855
-
Daniel Jasper authored
Before (with column limit 60): aaaaaaaaaaaaaaaaaaaaaaaaaaaa(aaaaaaaaaaaaaaaaaaaaaaaaaaaaa > > aaaaa); After: aaaaaaaaaaaaaaaaaaaaaaaaaaaa( aaaaaaaaaaaaaaaaaaaaaaaaaaaaa >> aaaaa); (Not sure how that could have stayed in that long without being detected..) llvm-svn: 190854
-
Alexey Samsonov authored
llvm-svn: 190853
-
Kostya Serebryany authored
llvm-svn: 190852
-
Elena Demikhovsky authored
llvm-svn: 190851
-
Craig Topper authored
llvm-svn: 190850
-
Craig Topper authored
llvm-svn: 190849
-
Craig Topper authored
Push contents of X86TargetInfo::setFeatureEnabled down to a static function called by the virtual version and all the places in getDefaultFeatures. This way getDefaultFeatures doesn't make so many virtual calls. llvm-svn: 190847
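The refactoring pattern described here can be sketched as follows. This is an illustration under assumptions, not the clang source: the class names, the setFeatureEnabledImpl name, the map-based feature set, and the chosen default features are all hypothetical; only the shape (a virtual entry point forwarding to a static function that internal callers use directly) follows the commit message.

```cpp
#include <cassert>
#include <map>
#include <string>

using FeatureMap = std::map<std::string, bool>;

// Hypothetical base class with the virtual entry point.
class TargetInfoBase {
public:
  virtual ~TargetInfoBase() = default;
  virtual void setFeatureEnabled(FeatureMap &Features, const std::string &Name,
                                 bool Enabled) const = 0;
};

class DemoX86TargetInfo : public TargetInfoBase {
  // All of the logic lives in a static function, so internal callers
  // do not need to go through virtual dispatch.
  static void setFeatureEnabledImpl(FeatureMap &Features,
                                    const std::string &Name, bool Enabled) {
    Features[Name] = Enabled;
    // Illustrative implication rule: enabling sse2 also enables sse.
    if (Enabled && Name == "sse2")
      Features["sse"] = true;
  }

public:
  // The virtual version only forwards to the static implementation.
  void setFeatureEnabled(FeatureMap &Features, const std::string &Name,
                         bool Enabled) const override {
    setFeatureEnabledImpl(Features, Name, Enabled);
  }

  // Calls the static function directly for every default feature.
  FeatureMap getDefaultFeatures() const {
    FeatureMap Features;
    setFeatureEnabledImpl(Features, "mmx", true);
    setFeatureEnabledImpl(Features, "sse2", true);
    return Features;
  }
};
```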
-
Craig Topper authored
llvm-svn: 190846
-
Eli Friedman authored
AssignConvertType::IncompatibleVectors means the two types are in fact compatible. :) No testcase; I don't think the extra init list has any actual visible effect other than making the resulting AST dump look a bit strange. llvm-svn: 190845
-
Eli Friedman authored
Like any other type, an init list for a vector can have the same type as the vector itself; handle that case. <rdar://problem/14990460> llvm-svn: 190844
-
Craig Topper authored
llvm-svn: 190843
-
Tobias Grosser authored
llvm-svn: 190842
-
Tobias Grosser authored
Instead of defining the relevant functions inline, we now just keep the declarations in the class itself. This makes the class declaration a lot easier to read as all functions can be seen at once. We also use this opportunity to privatize all functions not used in the public interface of the class. llvm-svn: 190841
-
Shankar Easwaran authored
This sets the sectionChoice property for DefinedAtoms. The output section name is derived from this property of the atom. This also decreases native file size. Adds a test. llvm-svn: 190840
-
Kevin Qin authored
llvm-svn: 190839
-
Jim Ingham authored
llvm-svn: 190838
-
Howard Hinnant authored
llvm-svn: 190837
-