- Sep 13, 2012
-
-
Jakob Stoklund Olesen authored
The patch caused "Wrong topological sorting" assertions. llvm-svn: 163810
-
Craig Topper authored
Add a new compression type to the ModRM table that detects when the memory modRM bytes represent 8 instructions and the reg modRM bytes represent up to 64 instructions. Reduces the modRM table from 43k entries to 25k entries. Based on a patch from Manman Ren. llvm-svn: 163774
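The compression described above can be sketched roughly as follows. This is a minimal illustration, not the table generator's actual code: `InstrId`, `ModRMTable`, `memoryFormsDependOnlyOnReg`, `compressSplit`, and `lookupSplit` are hypothetical names. It assumes that "8 instructions" for the memory forms means the instruction is selected by the reg field (bits 5-3) alone, while each of the 64 register-form bytes (0xC0-0xFF) keeps its own entry, giving 8 + 64 = 72 entries instead of 256.

```cpp
#include <array>
#include <cstdint>
#include <vector>

using InstrId = std::uint16_t;                // hypothetical instruction identifier
using ModRMTable = std::array<InstrId, 256>;  // one entry per possible ModRM byte

// True when every memory-form byte (mod != 3, i.e. 0x00-0xBF) picks its
// instruction from the reg field (bits 5-3) alone.
bool memoryFormsDependOnlyOnReg(const ModRMTable &table) {
  for (unsigned byte = 0; byte < 0xC0; ++byte) {
    unsigned reg = (byte >> 3) & 7;
    // Compare against the representative byte with mod = 0 and rm = 0.
    if (table[byte] != table[reg << 3])
      return false;
  }
  return true;
}

// Builds the 72-entry form: 8 memory entries (indexed by reg) followed by
// 64 register entries (indexed by byte - 0xC0). Only valid when
// memoryFormsDependOnlyOnReg(table) holds.
std::vector<InstrId> compressSplit(const ModRMTable &table) {
  std::vector<InstrId> out;
  out.reserve(72);
  for (unsigned reg = 0; reg < 8; ++reg)
    out.push_back(table[reg << 3]);
  for (unsigned byte = 0xC0; byte < 0x100; ++byte)
    out.push_back(table[byte]);
  return out;
}

// Looks up a ModRM byte in the compressed form.
InstrId lookupSplit(const std::vector<InstrId> &compressed, std::uint8_t modrm) {
  if (modrm >= 0xC0)
    return compressed[8 + (modrm - 0xC0)];
  return compressed[(modrm >> 3) & 7];
}
```

A table that fails the check would have to fall back to a less compact encoding, such as the full 256-entry form.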
-
Jakob Stoklund Olesen authored
We don't have enough GR64_TC registers when calling a varargs function with 6 arguments. Since %al holds the number of vector registers used, only %r11 is available as a scratch register. This means that addressing modes using both base and index registers can't be folded into TCRETURNmi64. <rdar://problem/12282281> llvm-svn: 163761
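The register-counting argument above can be checked with a small sketch. The contents of `kTailCallRegs` are an assumption about which registers the GR64_TC class contains (the commit message only names %r11 and %al), and `freeScratchRegs` and `canFoldBaseIndex` are hypothetical helpers, not LLVM functions. The argument registers follow the x86-64 System V calling convention.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// ASSUMPTION: registers usable for a tail-call target address (GR64_TC-like).
const std::vector<std::string> kTailCallRegs = {
    "rax", "rcx", "rdx", "rsi", "rdi", "r8", "r9", "r11"};

// Returns the tail-call registers that are not carrying outgoing arguments.
std::vector<std::string> freeScratchRegs(unsigned numIntArgs, bool isVarArg) {
  // Integer argument registers in System V x86-64 order.
  static const std::vector<std::string> argRegs = {
      "rdi", "rsi", "rdx", "rcx", "r8", "r9"};
  std::size_t used = std::min<std::size_t>(numIntArgs, argRegs.size());
  std::vector<std::string> live(argRegs.begin(), argRegs.begin() + used);
  if (isVarArg)
    live.push_back("rax");  // %al holds the number of vector registers used

  std::vector<std::string> freeRegs;
  for (const std::string &reg : kTailCallRegs)
    if (std::find(live.begin(), live.end(), reg) == live.end())
      freeRegs.push_back(reg);
  return freeRegs;
}

// A base + index addressing mode needs two free registers.
bool canFoldBaseIndex(unsigned numIntArgs, bool isVarArg) {
  return freeScratchRegs(numIntArgs, isVarArg).size() >= 2;
}
```

Under these assumptions, a varargs call with 6 integer arguments leaves only r11 free, so a memory operand that needs both a base and an index register cannot be folded into the tail call.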
-
- Sep 12, 2012
-
-
Michael Liao authored
- BlockAddress has no support for the BA + offset form and there is no way to propagate that offset into a machine operand; - Add BA + offset support and a new interface 'getTargetBlockAddress' to simplify forming target block addresses; - All targets are modified to use the new interface and the X86 backend is enhanced to support BA + offset addressing. llvm-svn: 163743
-
Chad Rosier authored
llvm-svn: 163729
-
Roman Divacky authored
llvm-svn: 163710
-
Craig Topper authored
llvm-svn: 163682
-
Manman Ren authored
"#if !defined(NDEBUG) || defined(LLVM_ENABLE_DUMP)" No functional change. Update r163339. llvm-svn: 163653
-
- Sep 11, 2012
-
-
Chad Rosier authored
llvm-svn: 163649
-
Chad Rosier authored
llvm-svn: 163648
-
Craig Topper authored
llvm-svn: 163596
-
Craig Topper authored
llvm-svn: 163594
-
Chad Rosier authored
llvm-svn: 163561
-
Chad Rosier authored
llvm-svn: 163557
-
Chad Rosier authored
llvm-svn: 163556
-
- Sep 10, 2012
-
-
Dmitri Gribenko authored
llvm-svn: 163547
-
Chad Rosier authored
and update the printOperand() function accordingly. llvm-svn: 163544
-
Chad Rosier authored
llvm-svn: 163542
-
Michael Liao authored
- Fix a remaining issue of PR11674 as well llvm-svn: 163528
-
Michael Liao authored
- If a boolean value is generated from CMOV and tested as a boolean value, simplify the use of the test result by referencing the original condition. The RDRAND intrinsic is one such case. llvm-svn: 163516
-
Elena Demikhovsky authored
The VPSHUFB 256-bit instruction may be generated when one of the input vectors is undefined or zeroinitializer. I've added the "zeroinitializer" case in this patch. llvm-svn: 163506
-
Nick Lewycky authored
llvm-svn: 163484
-
- Sep 08, 2012
-
-
Craig Topper authored
llvm-svn: 163473
-
Craig Topper authored
llvm-svn: 163463
-
Craig Topper authored
llvm-svn: 163461
-
Craig Topper authored
Set operation action for FFLOOR to Expand for all vector types for X86. Set FFLOOR of v4f32 to Expand for ARM. v2f64 was already correct. llvm-svn: 163458
-
- Sep 07, 2012
-
-
Benjamin Kramer authored
gas accepts this and it seems to be common enough to be worth supporting. This doesn't affect the parsing of reg operands outside of .cfi directives. llvm-svn: 163390
-
- Sep 06, 2012
-
-
Manman Ren authored
No functional change. llvm-svn: 163339
-
Elena Demikhovsky authored
Added generation of the VPSHUFB instruction for <32 x i8> vector shuffle when possible. llvm-svn: 163312
-
Michael Liao authored
llvm-svn: 163295
-
Craig Topper authored
Use iPTR instead of i32 for extract_subvector/insert_subvector index in lowering and patterns. This makes it consistent with the incoming DAG nodes from the DAG builder. llvm-svn: 163293
-
Craig Topper authored
Add patterns for converting stores of subvector_extracts of lower 128-bits of a 256-bit vector to VMOVAPSmr/VMOVUPSmr. llvm-svn: 163292
-
Roman Divacky authored
llvm-svn: 163258
-
- Sep 05, 2012
-
-
Roman Divacky authored
by casting. Found with gcc48. llvm-svn: 163247
-
Craig Topper authored
Remove some of the patterns added in r163196. Increasing the complexity on insert_subvector into undef accomplishes the same thing. llvm-svn: 163198
-
Craig Topper authored
Add patterns for integer forms of VINSERTF128/VINSERTI128 folded with loads. Also add patterns to turn subvector inserts with loads to index 0 of an undef into VMOVAPS. llvm-svn: 163196
-
Craig Topper authored
Convert vextracti128/vextractf128 intrinsics to extract_subvector at DAG build time. Similar was previously done for vinserti128/vinsertf128. Add patterns for folding these extract_subvectors with stores. llvm-svn: 163192
-
Chad Rosier authored
llvm-svn: 163187
-
- Sep 04, 2012
-
-
Preston Gurd authored
- CodeGenPrepare pass for identifying div/rem ops - Backend specifies the type mapping using addBypassSlowDivType - Enabled only for Intel Atom with O2 32-bit -> 8-bit - Replace IDIV with instructions which test its value and use DIVB if the value is positive and less than 256. - In the case when the quotient and remainder of a divide are both used, a DIV and a REM instruction will be present in the IR. In the non-Atom case they are both lowered to IDIVs and CSE removes the redundant IDIV instruction, using the quotient and remainder from the first IDIV. However, due to this optimization CSE is not able to eliminate redundant IDIV instructions because they are located in different basic blocks. This is overcome by calculating both the quotient (DIV) and remainder (REM) in each basic block that is inserted by the optimization and reusing the result values when a subsequent DIV or REM instruction uses the same operands. - Test cases check for the presence of the optimization when calculating either the quotient, remainder, or both. Patch by Tyler Nowicki! llvm-svn: 163150
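The effect of the transformation can be sketched at the source level. This is an illustration of the idea only, not the pass itself: `DivRem` and `bypassSlowDivRem` are hypothetical names, and the single OR-and-mask test is one possible way to check that both operands are non-negative and less than 256. Both paths compute the quotient and the remainder together, mirroring the description above.

```cpp
#include <cstdint>

struct DivRem {
  std::int32_t quot;
  std::int32_t rem;
  bool usedFastPath;  // illustration only: records which path was taken
};

// Computes a / b and a % b for 32-bit signed operands, using an 8-bit
// unsigned divide when both operands fit in 8 bits.
// As with plain division, b == 0 is undefined behavior.
DivRem bypassSlowDivRem(std::int32_t a, std::int32_t b) {
  // If any bit above bit 7 is set in either operand (including the sign bit),
  // at least one operand is negative or >= 256.
  if ((static_cast<std::uint32_t>(a | b) & 0xFFFFFF00u) == 0) {
    // Fast path: stands in for the 8-bit DIVB instruction.
    std::uint8_t na = static_cast<std::uint8_t>(a);
    std::uint8_t nb = static_cast<std::uint8_t>(b);
    return {static_cast<std::int32_t>(na / nb),
            static_cast<std::int32_t>(na % nb), true};
  }
  // Slow path: stands in for the full-width IDIV.
  return {a / b, a % b, false};
}
```

Because each path produces both results, a later remainder of the same operands can reuse the values instead of triggering a second divide.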
-
Elena Demikhovsky authored
Since this specific shuffle is widely used in many workloads we see a ~10% performance improvement on them. shufflevector <8 x float> %A, <8 x float> %B, <8 x i32> <i32 0, i32 8, i32 2, i32 10, i32 4, i32 12, i32 6, i32 14> vmovaps (%rdx), %ymm0 vshufps $8, %ymm0, %ymm0, %ymm0 vmovaps (%rcx), %ymm1 vshufps $8, %ymm0, %ymm1, %ymm1 vunpcklps %ymm0, %ymm1, %ymm0 vmovaps (%rcx), %ymm0 vmovsldup (%rdx), %ymm1 vblendps $85, %ymm0, %ymm1, %ymm0 llvm-svn: 163134
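The following sketch checks that the final three-instruction sequence (vmovaps, vmovsldup, vblendps $85) produces the same lanes as the shufflevector mask. It models the instructions on plain arrays and is not compiler code. The helper names are hypothetical, and it assumes that %A is the vector loaded from (%rcx) and %B the one loaded from (%rdx), which the commit message does not state explicitly.

```cpp
#include <array>

using V8 = std::array<float, 8>;

// Models LLVM's shufflevector: indices 0-7 select from a, 8-15 from b.
V8 shuffleVector(const V8 &a, const V8 &b, const std::array<int, 8> &mask) {
  V8 r{};
  for (int i = 0; i < 8; ++i)
    r[i] = mask[i] < 8 ? a[mask[i]] : b[mask[i] - 8];
  return r;
}

// Models vmovsldup: every odd lane receives a copy of the even lane below it.
V8 movsldup(const V8 &v) {
  V8 r{};
  for (int i = 0; i < 8; ++i)
    r[i] = v[i & ~1];
  return r;
}

// Models vblendps: bit i of imm selects src2[i], otherwise src1[i].
V8 blendps(const V8 &src1, const V8 &src2, unsigned imm) {
  V8 r{};
  for (int i = 0; i < 8; ++i)
    r[i] = ((imm >> i) & 1) ? src2[i] : src1[i];
  return r;
}

// vmovaps (%rcx), %ymm0; vmovsldup (%rdx), %ymm1; vblendps $85, %ymm0, %ymm1, %ymm0
// In AT&T operand order, %ymm1 is the first source and %ymm0 the second.
V8 blendSequence(const V8 &a, const V8 &b) {
  return blendps(movsldup(b), a, 85);
}
```

The immediate 85 is binary 01010101, so even lanes come from %A and odd lanes come from the duplicated even lanes of %B, which is exactly the lane order 0, 8, 2, 10, 4, 12, 6, 14.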
-