- Jul 21, 2011
-
-
Bruno Cardoso Lopes authored
instruction introduced in AVX, which can operate on 128 and 256-bit vectors. It considers a 256-bit vector as two independent 128-bit lanes. It can permute any 32 or 64 elements inside a lane, and restricts the second lane to have the same permutation of the first one. With the improved splat support introduced early today, adding codegen for this instruction enable more efficient 256-bit code: Instead of: vextractf128 $0, %ymm0, %xmm0 punpcklbw %xmm0, %xmm0 punpckhbw %xmm0, %xmm0 vinsertf128 $0, %xmm0, %ymm0, %ymm1 vinsertf128 $1, %xmm0, %ymm1, %ymm0 vextractf128 $1, %ymm0, %xmm1 shufps $1, %xmm1, %xmm1 movss %xmm1, 28(%rsp) movss %xmm1, 24(%rsp) movss %xmm1, 20(%rsp) movss %xmm1, 16(%rsp) vextractf128 $0, %ymm0, %xmm0 shufps $1, %xmm0, %xmm0 movss %xmm0, 12(%rsp) movss %xmm0, 8(%rsp) movss %xmm0, 4(%rsp) movss %xmm0, (%rsp) vmovaps (%rsp), %ymm0 We get: vextractf128 $0, %ymm0, %xmm0 punpcklbw %xmm0, %xmm0 punpckhbw %xmm0, %xmm0 vinsertf128 $0, %xmm0, %ymm0, %ymm1 vinsertf128 $1, %xmm0, %ymm1, %ymm0 vpermilps $85, %ymm0, %ymm0 llvm-svn: 135662
-
Bruno Cardoso Lopes authored
refactor the code and add a bunch of comments. The final shuffle emitted by handling 256-bit types is suitable for the VPERM shuffle instruction which is going to be introduced in a next commit (with a testcase which cover this commit) llvm-svn: 135661
-
Bruno Cardoso Lopes authored
llvm-svn: 135660
-
Bruno Cardoso Lopes authored
llvm-svn: 135659
-
Bruno Cardoso Lopes authored
llvm-svn: 135658
-
Bruno Cardoso Lopes authored
llvm-svn: 135657
-
Bruno Cardoso Lopes authored
llvm-svn: 135656
-
Bill Wendling authored
llvm-svn: 135645
-
Bill Wendling authored
llvm-svn: 135635
-
Bill Wendling authored
llvm-svn: 135634
-
- Jul 20, 2011
-
-
Evan Cheng authored
There is still a bit more refactoring left to do in Targets. But we are now very close to fixing all the layering issues in MC. llvm-svn: 135611
-
Eli Friedman authored
llvm-svn: 135607
-
Evan Cheng authored
- Introduce JITDefault code model. This tells targets to set different default code model for JIT. This eliminates the ugly hack in TargetMachine where code model is changed after construction. llvm-svn: 135580
-
NAKAMURA Takumi authored
X86Subtarget.h: Assume "x86_64-cygwin", though it has not been released yet, to appease test/CodeGen/X86 on cygwin. llvm-svn: 135564
-
- Jul 19, 2011
-
-
Evan Cheng authored
(including compilation, assembly). Move relocation model Reloc::Model from TargetMachine to MCCodeGenInfo so it's accessible even without TargetMachine. llvm-svn: 135468
-
Evan Cheng authored
better location welcome). llvm-svn: 135438
-
- Jul 18, 2011
-
-
Evan Cheng authored
to MCRegisterInfo. Also initialize the mapping at construction time. This patch eliminate TargetRegisterInfo from TargetAsmInfo. It's another step towards fixing the layering violation. llvm-svn: 135424
-
Bruno Cardoso Lopes authored
definitions. llvm-svn: 135407
-
Bruno Cardoso Lopes authored
llvm-svn: 135404
-
Chris Lattner authored
llvm-svn: 135375
-
- Jul 16, 2011
-
-
Bruno Cardoso Lopes authored
llvm-svn: 135332
-
Bruno Cardoso Lopes authored
1) Make non-legal 256-bit loads to be promoted to v4i64. This lets us canonize the loads and handle things the same way we use to handle for 128-bit registers. Despite of what one of the removed comments explained, the load promotion would not mess with VPERM, it's only a matter of doing the appropriate bitcasts when this instructions comes to be introduced. Also make LOAD v8i32 legal. 2) Doing 1) exposed two bugs: - v4i64 was being promoted to itself for several opcodes (introduced in r124447 by David Greene) causing endless recursion and the stack to explode. - there was no support for allOnes BUILD_VECTORs and ANDNP would fail to match because it was generating early target constant pools during lowering. 3) The testcases are already checked-in, doing 1) exposed the bugs in the current testcases. 4) Tidy up code to be more clear and explicit about AVX. llvm-svn: 135313
-
Bruno Cardoso Lopes authored
comming together with other tests. llvm-svn: 135312
-
- Jul 15, 2011
-
-
Eli Friedman authored
llvm-svn: 135303
-
Chandler Carruth authored
was really intended, and it may have been required prior to some of the recent refactors. Including it however causes LLVMX86Desc to need symbols from LLVMX86CodeGen, forming a dependency cycle. This was masked in almost all builds: Clang, and GCC w/ optimizations didn't actually emit the symbols! llvm-svn: 135242
-
Evan Cheng authored
solution but it is a small step towards removing the horror that is TargetAsmInfo. llvm-svn: 135237
-
Chandler Carruth authored
backend. Moved some MCAsmInfo files down into the MCTargetDesc sublibraries, removed some (i suspect long) dead files from other parts of the CMake build, etc. Also copied the include directory hack from the Makefile. Finally, updated the lib deps. I spot checked this, and think its correct, but review appreciated there. llvm-svn: 135234
-
Evan Cheng authored
Rename createAsmInfo to createMCAsmInfo and move registration code to MCTargetDesc to prepare for next round of changes. llvm-svn: 135219
-
Bill Wendling authored
unwind library expects. * Comment the permutation encoding for frameless stacks. llvm-svn: 135202
-
- Jul 14, 2011
-
-
Benjamin Kramer authored
llvm-svn: 135198
-
Evan Cheng authored
registeration and creation code into XXXMCDesc libraries. llvm-svn: 135184
-
Eric Christopher authored
when determining validity of matching constraint. Allow i1 types access to the GR8 reg class for x86. Fixes PR10352 and rdar://9777108 llvm-svn: 135180
-
Bruno Cardoso Lopes authored
llvm-svn: 135171
-
Nadav Rotem authored
[VECTOR-SELECT] During type legalization we often use the SIGN_EXTEND_INREG SDNode. When this SDNode is legalized during the LegalizeVector phase, it is scalarized because non-simple types are automatically marked to be expanded. In this patch we add support for lowering SIGN_EXTEND_INREG manually. This fixes CodeGen/X86/vec_sext.ll when running with the '-promote-elements' flag. llvm-svn: 135144
-
Eli Friedman authored
Fix up assertion in r135018 so it doesn't trigger on 32-bit; when we're in 32-bit, it doesn't matter whether the operation overflows because the computed address is not wider than the immediate. llvm-svn: 135120
-
Bill Wendling authored
The frameless unwind stack has a special encoding, the algorithm for which is in "permuteEncode". llvm-svn: 135103
-
- Jul 13, 2011
-
-
Bruno Cardoso Lopes authored
general version of X86ISD::ANDNP also opened the room for a little bit of refactoring. llvm-svn: 135088
-
Bruno Cardoso Lopes authored
it's later selected to a ANDNPD/ANDNPS instruction instead of the PANDN instruction. Rename it. llvm-svn: 135087
-
Eli Friedman authored
Make sure we don't combine a large displacement and a frame index in the same addressing mode on x86-64. It can overflow, leading to a crash/miscompile. <rdar://problem/9763308> llvm-svn: 135084
-
Eli Friedman authored
Refactor out checking for displacements on x86-64 addressing modes. No functionality change. Refactoring in preparation for an additional safety check in FoldOffsetIntoAddress. Part of <rdar://problem/9763308>. llvm-svn: 135079
-