- Dec 27, 2012
-
-
Craig Topper authored
Fix operands and encoding form for ARPL instruction. Register form had its operands reversed. Memory form writes memory, but was marked as MRMSrcMem. llvm-svn: 171123
-
Craig Topper authored
llvm-svn: 171122
-
- Dec 26, 2012
-
-
Craig Topper authored
llvm-svn: 171121
-
Craig Topper authored
Mark all the _REV instructions as not having side effects. They aren't really emitted by the backend, but it reduces the number of instructions in the output files with unmodelled side effects to make auditing easier. llvm-svn: 171118
-
Craig Topper authored
Remove a special conditional setting of neverHasSideEffects if the instruction didn't have a pattern. This was left over from when tablegen used to complain if things were already inferred from patterns. llvm-svn: 171117
-
Craig Topper authored
llvm-svn: 171103
-
Craig Topper authored
llvm-svn: 171102
-
Craig Topper authored
llvm-svn: 171097
-
Craig Topper authored
llvm-svn: 171096
-
Craig Topper authored
llvm-svn: 171095
-
Craig Topper authored
llvm-svn: 171093
-
Craig Topper authored
Use an additional multiclass to merge the 128/256-bit SSE/AVX instruction definitions for a bunch of SSE2 integer arithmetic instructions. llvm-svn: 171092
-
Nadav Rotem authored
llvm-svn: 171091
-
Craig Topper authored
Use an additional multiclass to merge the 128/256-bit SSE/AVX instruction definitions for PAND/POR/PXOR/PANDN llvm-svn: 171087
-
Craig Topper authored
llvm-svn: 171086
-
Craig Topper authored
llvm-svn: 171085
-
Craig Topper authored
llvm-svn: 171082
-
Craig Topper authored
Remove alignment from folding table for VMOVUPD. As an unaligned instruction, it shouldn't require alignment. llvm-svn: 171081
-
Craig Topper authored
Remove alignment requirements from (V)EXTRACTPS. This instruction does 32-bit stores which aren't required to be aligned on SSE or AVX. llvm-svn: 171080
-
Craig Topper authored
Remove alignment requirement from VCVTSS2SD in folding tables. Reverting r171049. This instruction doesn't require alignment. llvm-svn: 171078
-
- Dec 25, 2012
-
-
Hal Finkel authored
Use of store or load with the atomic specifier on 64-bit types would cause instruction-selection failures. As with the 32-bit case, these can use the default expansion in terms of cmp-and-swap. llvm-svn: 171072
-
Benjamin Kramer authored
llvm-svn: 171064
-
Benjamin Kramer authored
pcmpeqd, pshufd, pshufd, pand is cheaper than unpack + cmpq, sbbq, cmpq, sbbq + pack. Small speedup on loop-vectorized viterbi (-march=core2). llvm-svn: 171063
-
Nadav Rotem authored
llvm-svn: 171049
-
- Dec 24, 2012
-
-
Nick Lewycky authored
llvm-svn: 171044
-
Benjamin Kramer authored
This lets us use std::string's allocation routines and rely on the destructor for memory management. Switching to that also means that we can use operator==(const std::string&, const char *) to perform the string comparison rather than resorting to libc functionality (i.e. strcmp). Patch by Saleem Abdulrasool! Differential Revision: http://llvm-reviews.chandlerc.com/D230 llvm-svn: 171042
-
Nadav Rotem authored
support for the insert-subvector and extract-subvector kinds. llvm-svn: 171027
-
Nadav Rotem authored
Some x86 instructions can load/store one of the operands to memory. On SSE, this memory needs to be aligned. When these instructions are encoded in VEX (on AVX) there is no such requirement. This changes the folding tables and removes the alignment restrictions from VEX-encoded instructions. llvm-svn: 171024
-
Nadav Rotem authored
Change the codegen Cost Model API for shuffles. This patch removes the API for broadcast and adds a more general API that accepts an enum of known shuffles. llvm-svn: 171022
-
- Dec 23, 2012
-
-
Nadav Rotem authored
the cost of arithmetic functions. We now assume that the cost of arithmetic operations that are marked as Legal or Promote is low, but that the cost of ops marked as Custom is higher. llvm-svn: 171002
-
Nadav Rotem authored
llvm-svn: 170997
-
Nadav Rotem authored
llvm-svn: 170996
-
Nadav Rotem authored
them more expensive. llvm-svn: 170995
-
- Dec 22, 2012
-
-
Benjamin Kramer authored
pmuludq is slow, but it turns out that all the unpacking and packing of the scalarized mul is even slower. 10% speedup on loop-vectorized paq8p. llvm-svn: 170985
-
Benjamin Kramer authored
Also loosen the SSSE3 dependency a bit; expanded pshufb + psra is still better than scalarized loads. Fixes PR14590. llvm-svn: 170984
-
Nadav Rotem authored
The only way to read the eflags is using push and pop. If we don't adjust the stack then we run over the first frame index. This is not something that we want to do, so we have to make sure that our machine function does not copy the flags. If it does then we have to emit the prolog that adjusts the stack. rdar://12896831 llvm-svn: 170961
-
Akira Hatanaka authored
instructions. llvm-svn: 170956
-
Akira Hatanaka authored
llvm-svn: 170955
-
Akira Hatanaka authored
llvm-svn: 170954
-
Akira Hatanaka authored
was not catching the error. llvm-svn: 170953
-