- Aug 09, 2010
-
-
Bruno Cardoso Lopes authored
Add VCVTPD2PS, VCVTPS2DQ, VCVTPS2PDY, VCVTTPD2DQY, VCVTTPS2DQ and VCVTPD2DQ 256-bit conversion intrinsics llvm-svn: 110608
-
Bruno Cardoso Lopes authored
Add patterns for AVX conversion instructions. Do that instead of declaring more instructions whenever possible; more coming. llvm-svn: 110605
-
Oscar Fuentes authored
Next time the build is broken due to wrong library dependencies, just try building again (if you are on some Unix and are building all LLVM targets) or ask someone to commit the regenerated LLVMLibDeps.cmake. llvm-svn: 110593
-
Bruno Cardoso Lopes authored
llvm-svn: 110582
-
Bruno Cardoso Lopes authored
llvm-svn: 110580
-
- Aug 07, 2010
-
-
Dale Johannesen authored
form of CMPSD (etc.) Matching a 128-bit memory operand is wrong; the instruction uses only 64 bits (same as ADDSD etc.). 8193553. llvm-svn: 110491
-
Bruno Cardoso Lopes authored
llvm-svn: 110480
-
- Aug 06, 2010
-
-
Bruno Cardoso Lopes authored
llvm-svn: 110468
-
Owen Anderson authored
llvm-svn: 110460
-
Bruno Cardoso Lopes authored
llvm-svn: 110427
-
Bruno Cardoso Lopes authored
llvm-svn: 110425
-
Owen Anderson authored
llvm-svn: 110410
-
Eric Christopher authored
llvm-svn: 110404
-
Owen Anderson authored
ID member as the sole unique type identifier. Clean up APIs related to this change. llvm-svn: 110396
-
Bruno Cardoso Lopes authored
Support very basic (doesn't include ABI support in the front-end, varargs, ...) 256-bit argument passing and return for AVX llvm-svn: 110394
-
- Aug 05, 2010
-
-
Eric Christopher authored
llvm-svn: 110371
-
Eric Christopher authored
llvm-svn: 110369
-
Eric Christopher authored
instructions. llvm-svn: 110360
-
Eric Christopher authored
llvm-svn: 110359
-
Eric Christopher authored
uses. llvm-svn: 110274
-
Eli Friedman authored
llvm-svn: 110268
-
- Aug 04, 2010
-
-
Devang Patel authored
llvm-svn: 110224
-
Benjamin Kramer authored
llvm-svn: 110200
-
Benjamin Kramer authored
- The COFF backend doesn't support MinGW/Cygwin at the moment; it'll report an error, but that's still much better than random assertions from the MachO backend.
- We want to make ELF the default eventually; it's what the majority of targets use.
llvm-svn: 110197
-
Chris Lattner authored
llvm-svn: 110164
-
- Jul 31, 2010
-
-
Michael J. Spencer authored
llvm-svn: 109949
-
Michael J. Spencer authored
llvm-svn: 109947
-
- Jul 30, 2010
-
-
Bruno Cardoso Lopes authored
declared during the addition of the assembler support, the additional changes are:
  - Add missing intrinsics
  - Move all SSE conversion instructions in X86InstInfo64.td to the SSE.td file.
  - Duplicate some patterns to AVX mode.
  - Step into PCMPEST/PCMPIST custom inserter and add AVX versions.
llvm-svn: 109878
-
Bruno Cardoso Lopes authored
llvm-svn: 109877
-
- Jul 29, 2010
-
-
Jakob Stoklund Olesen authored
We do sometimes load from a too small stack slot when dealing with x86 arguments (varargs and smaller-than-32-bit args). It looks like we know what we are doing in those cases, so I am going to remove the assert instead of artificially enlarging stack slot sizes. The assert in storeRegToStackSlot stays in. We don't want to write beyond the bounds of a stack slot. llvm-svn: 109764
-
- Jul 28, 2010
-
-
Jakob Stoklund Olesen authored
The size of this object isn't used for anything - technically it is of variable size. This avoids a false positive from the assert in X86InstrInfo::loadRegFromStackSlot, and fixes PR7735. llvm-svn: 109652
-
Nate Begeman authored
This is about 4x faster and smaller than the existing scalarization. llvm-svn: 109566
-
Nate Begeman authored
~40% faster vector shl <4 x i32> on SSE 4.1
Larger improvements for smaller types coming in future patches.

For:

    define <2 x i64> @shl(<4 x i32> %r, <4 x i32> %a) nounwind readnone ssp {
    entry:
      %shl = shl <4 x i32> %r, %a                 ; <<4 x i32>> [#uses=1]
      %tmp2 = bitcast <4 x i32> %shl to <2 x i64> ; <<2 x i64>> [#uses=1]
      ret <2 x i64> %tmp2
    }

We get:

    _shl:                                   ## @shl
        pslld       $23, %xmm1
        paddd       LCPI0_0, %xmm1
        cvttps2dq   %xmm1, %xmm1
        pmulld      %xmm1, %xmm0
        ret

Instead of:

    _shl:                                   ## @shl
        pshufd      $3, %xmm0, %xmm2
        movd        %xmm2, %eax
        pshufd      $3, %xmm1, %xmm2
        movd        %xmm2, %ecx
        shll        %cl, %eax
        movd        %eax, %xmm2
        pshufd      $1, %xmm0, %xmm3
        movd        %xmm3, %eax
        pshufd      $1, %xmm1, %xmm3
        movd        %xmm3, %ecx
        shll        %cl, %eax
        movd        %eax, %xmm3
        punpckldq   %xmm2, %xmm3
        movd        %xmm0, %eax
        movd        %xmm1, %ecx
        shll        %cl, %eax
        movd        %eax, %xmm2
        movhlps     %xmm0, %xmm0
        movd        %xmm0, %eax
        movhlps     %xmm1, %xmm1
        movd        %xmm1, %ecx
        shll        %cl, %eax
        movd        %eax, %xmm0
        punpckldq   %xmm0, %xmm2
        movdqa      %xmm2, %xmm0
        punpckldq   %xmm3, %xmm0
        ret

llvm-svn: 109549
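The four-instruction sequence appears to compute each lane's shift as a multiply by a power of two, where the power of two is built by placing the shift amount in the exponent field of a single-precision float. The scalar Python model below is a minimal sketch of that idea, not the backend's code. It assumes each lane of the constant `LCPI0_0` holds `0x3F800000` (the bit pattern of `1.0f`), which the commit message does not state, and it only covers shift amounts from 0 to 31. The function names are hypothetical.

```python
import struct

MASK32 = 0xFFFFFFFF
# Assumption: each lane of LCPI0_0 is the bit pattern of 1.0f.
ONE_F32_BITS = 0x3F800000


def pow2_via_float(a):
    """Model pslld $23 + paddd + cvttps2dq for one lane (0 <= a <= 31)."""
    # pslld $23 moves the shift amount into the float exponent field;
    # paddd adds the exponent bias, giving the bit pattern of 2.0**a.
    bits = ((a << 23) + ONE_F32_BITS) & MASK32
    f = struct.unpack("<f", struct.pack("<I", bits))[0]
    # cvttps2dq truncates to a signed 32-bit integer; out-of-range inputs
    # produce 0x80000000, which is also the bit pattern needed for a == 31.
    if f >= 2.0 ** 31:
        return 0x80000000
    return int(f) & MASK32


def shl_lane(r, a):
    """Model pmulld for one lane: keep the low 32 bits of the product."""
    return (r * pow2_via_float(a)) & MASK32


def vector_shl(r, a):
    """Apply the per-lane model to two equal-length lists of 32-bit values."""
    return [shl_lane(x, y) for x, y in zip(r, a)]
```

Calling `vector_shl([1, 2, 3, 0xFFFFFFFF], [0, 1, 4, 31])` models one `<4 x i32>` shift. Shift amounts of 32 or more are outside what this sketch models.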
-
- Jul 27, 2010
-
-
Michael J. Spencer authored
llvm-svn: 109494
-
Jakob Stoklund Olesen authored
subregister operands like this:

    %reg1040:sub_32bit<def> = MOV32rm <fi#-2>, 1, %reg0, 0, %reg0, %reg1040<imp-def>; mem:LD4[FixedStack-2](align=8)

Make them return false when subreg operands are present. VirtRegRewriter is making bad assumptions otherwise. This fixes PR7713. llvm-svn: 109489
-
Jakob Stoklund Olesen authored
with a too-big register class. llvm-svn: 109488
-
- Jul 26, 2010
-
-
Evan Cheng authored
llvm-svn: 109450
-
Bruno Cardoso Lopes authored
we are using AVX and no AVX version of the desired instruction is present; this is better for incremental dev (without fallbacks it's easier to spot what's missing). Not sure this is the best hack though (we can also disable all HasSSE* predicates by dynamically marking them 'false' if AVX is present) llvm-svn: 109434
-
- Jul 24, 2010
-
-
Evan Cheng authored
appropriate for targets without detailed instruction itineraries. The scheduler schedules for increased instruction level parallelism in low register pressure situations; it schedules to reduce register pressure when the register pressure becomes high. On x86_64, this is a win for all tests in CFP2000. It also sped up 256.bzip2 by 16%. llvm-svn: 109300
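The two-mode policy described here can be illustrated with a toy selection function over a ready list. This is only a sketch of the general idea, not the scheduler added by the commit: the `Node` fields, the `reg_limit` threshold, and the use of critical-path height as a stand-in for instruction level parallelism are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    name: str
    height: int          # hypothetical: longest latency path from this node to the block end
    pressure_delta: int  # hypothetical: net change in live registers if scheduled now


def pick_next(ready, live_regs, reg_limit):
    """Pick the next node from a non-empty ready list.

    Under the limit, prefer the node on the longest path (an ILP proxy).
    At or over the limit, prefer the node that frees the most registers.
    """
    if live_regs >= reg_limit:
        # High pressure: the smallest pressure change wins; height breaks ties.
        return min(ready, key=lambda n: (n.pressure_delta, -n.height))
    # Low pressure: the greatest height wins; lower pressure change breaks ties.
    return max(ready, key=lambda n: (n.height, -n.pressure_delta))
```

With a ready list holding a long-latency load that adds a live register and a store that frees one, the sketch picks the load while registers are plentiful and the store once `live_regs` reaches `reg_limit`.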
-
Bruno Cardoso Lopes authored
llvm-svn: 109295
-