- Sep 01, 2010
-
-
Bill Wendling authored
int x(int t) {
  if (t & 256)
    return -26;
  return 0;
}

We generate this:

    tst.w   r0, #256
    mvn     r0, #25
    it      eq
    moveq   r0, #0

while gcc generates this:

    ands    r0, r0, #256
    it      ne
    mvnne   r0, #25
    bx      lr

Scandalous really!

During ISel time, we can look for this particular pattern: one where we have a "MOVCC" that uses the flag off of a CMPZ that itself is comparing an AND instruction to 0. Something like this (greatly simplified):

    %r0 = ISD::AND ...
    ARMISD::CMPZ %r0, 0         @ sets [CPSR]
    %r0 = ARMISD::MOVCC 0, -26  @ reads [CPSR]

All we have to do is convert the "ISD::AND" into an "ARM::ANDS" that sets [CPSR] when it's zero. The zero value will already be in the %r0 register and we only need to change it if the AND wasn't zero. Easy! llvm-svn: 112664
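To see why the two instruction sequences in this commit are interchangeable, here is a minimal C sketch that models each one step by step. The function names `x_tst_model`, `x_ands_model`, and `check_models_agree` are hypothetical and only illustrate the reasoning; they are not part of LLVM, and the Z flag is modeled as a plain `int`.

```c
#include <assert.h>

/* The C source quoted in the commit message. */
int x(int t) {
  if (t & 256)
    return -26;
  return 0;
}

/* Model of the tst.w / mvn / it eq / moveq sequence. */
int x_tst_model(int t) {
  int z = ((t & 256) == 0); /* tst.w r0, #256: sets Z, r0 is left unchanged */
  int r0 = ~25;             /* mvn r0, #25: r0 = -26 */
  if (z)
    r0 = 0;                 /* it eq; moveq r0, #0 */
  return r0;
}

/* Model of the ands / it ne / mvnne sequence. */
int x_ands_model(int t) {
  int r0 = t & 256;         /* ands r0, r0, #256: r0 holds the AND result */
  int z = (r0 == 0);        /* ...and Z is set when that result is zero */
  if (!z)
    r0 = ~25;               /* it ne; mvnne r0, #25 */
  return r0;                /* when Z is set, r0 is already 0 */
}

/* Compare both models against the source function over a range of inputs. */
void check_models_agree(void) {
  for (int t = -1024; t <= 1024; ++t) {
    assert(x_tst_model(t) == x(t));
    assert(x_ands_model(t) == x(t));
  }
}
```

The second model needs one instruction fewer because the AND result doubles as the zero return value, which is the observation the commit message relies on.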
-
Bruno Cardoso Lopes authored
llvm-svn: 112661
-
Bruno Cardoso Lopes authored
llvm-svn: 112657
-
Bill Wendling authored
llvm-svn: 112654
-
- Aug 31, 2010
-
-
Jakob Stoklund Olesen authored
No CCR virtual registers should exist, and %EFLAGS is used in ways that can surprise RegAllocFast. llvm-svn: 112650
-
Bruno Cardoso Lopes authored
llvm-svn: 112644
-
Bruno Cardoso Lopes authored
llvm-svn: 112642
-
Jim Grosbach authored
determining if they're likely to be in range of the SP when resolving frame references. llvm-svn: 112624
-
Jim Grosbach authored
the offset is legally encodable, not actually trying to do the encoding. llvm-svn: 112622
-
Bill Wendling authored
- Convert {0,1} and friends into 0b01, which is identical and more consistent. llvm-svn: 112593
-
Bruno Cardoso Lopes authored
Use X86ISD::MOVSS and MOVSD to represent the movl mask pattern. Also fix the handling of those nodes when searching for scalars inside vector shuffles. llvm-svn: 112570
-
Eric Christopher authored
llvm-svn: 112568
-
Eric Christopher authored
things we can't handle. llvm-svn: 112559
-
Anton Korobeynikov authored
scheduling opportunities (an extra instruction can go in between the MOVT / MOVW pair, removing the stall). llvm-svn: 112546
-
Bill Wendling authored
is meant to do exactly the same thing. Thanks to Jim Grosbach for pointing this out! :-) llvm-svn: 112538
-
- Aug 30, 2010
-
-
Jakob Stoklund Olesen authored
kill flag. This could cause duplicate kill flags when the same register was used twice in a continuous sequence of STRs. There is no small test case. <rdar://problem/8218046> llvm-svn: 112534
-
Bob Wilson authored
Auto-upgrade the old intrinsic and update tests. llvm-svn: 112507
-
Jim Grosbach authored
Make ARM add rN, sp, #imm instructions rematerializable. That's how the address of locals is calculated, so this should help relieve register pressure a bit. Recalculating the local address is almost always going to be better than spilling. llvm-svn: 112503
-
Bob Wilson authored
operand is killed, add it to the expanded instruction as an implicit kill operand instead of marking the individual subregs with kill flags. This should work better in general and also handles the case for VST3 where one of the subregs was not referenced in the expanded instruction and so was not marked killed. llvm-svn: 112494
-
Bill Wendling authored
optional modified register (instead of reg0). Along with r112461 it will make sure that the optional define of CPSR is marked as "def" and will thus mark the instructions using these classes (t2ANDS*) as setting the 's' flag. llvm-svn: 112462
-
- Aug 29, 2010
-
-
Kalle Raiskila authored
The IDX was treated as byte index, not element index. llvm-svn: 112422
-
Bill Wendling authored
llvm-svn: 112421
-
Bob Wilson authored
IR add/sub operations with one or both operands sign- or zero-extended. Auto-upgrade the old intrinsics. llvm-svn: 112416
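As a rough illustration of what "add with one or both operands sign- or zero-extended" means per lane, here is a small C sketch. The beginning of this commit message is cut off in the log, so the lane count (4), the element widths, and the function names `add_sext_both` and `add_zext_one` are all assumptions made for the example rather than anything stated by the commit.

```c
#include <stdint.h>

/* Both operands are sign-extended from 16 to 32 bits before the add,
   so the sum cannot wrap in the narrow type. */
void add_sext_both(const int16_t a[4], const int16_t b[4], int32_t out[4]) {
  for (int i = 0; i < 4; ++i)
    out[i] = (int32_t)a[i] + (int32_t)b[i];
}

/* Only the narrow operand is zero-extended; the other is already wide. */
void add_zext_one(const uint32_t wide[4], const uint16_t narrow[4],
                  uint32_t out[4]) {
  for (int i = 0; i < 4; ++i)
    out[i] = wide[i] + (uint32_t)narrow[i];
}
```

Expressing the operation as an extend followed by an ordinary add is what lets generic IR optimizations see through it, instead of treating it as an opaque intrinsic call.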
-
Eli Friedman authored
llvm-svn: 112411
-
Bill Wendling authored
- Create T2I_bin_sw_irs to be like T2I_bin_w_irs, except that it sets the S bit. llvm-svn: 112399
-
Chris Lattner authored
llvm-svn: 112397
-
Bill Wendling authored
llvm-svn: 112395
-
Bill Wendling authored
llvm-svn: 112394
-
Bill Wendling authored
it sets the CPSR register. llvm-svn: 112393
-
- Aug 28, 2010
-
-
Chris Lattner authored
times. This patch causes llc and llvm-mc (which both default to verbose-asm) to print out comments after a few common shuffle instructions which indicate the shuffle mask, e.g.:

    insertps $113, %xmm3, %xmm0   ## xmm0 = zero,xmm0[1,2],xmm3[1]
    unpcklps %xmm1, %xmm0         ## xmm0 = xmm0[0],xmm1[0],xmm0[1],xmm1[1]
    pshufd   $1, %xmm1, %xmm1     ## xmm1 = xmm1[1,0,0,0]

This is carefully factored to keep the information extraction (of the shuffle mask) separate from the printing logic. I plan to move the extraction part out somewhere else at some point for other parts of the x86 backend that want to introspect on the behavior of shuffles. llvm-svn: 112387
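The split between mask extraction and printing described in this commit can be sketched for the pshufd case as follows. This is a minimal illustration, not the code the patch added: `decode_pshufd_mask` and `format_pshufd_comment` are hypothetical names. It relies only on the documented pshufd encoding, where each pair of immediate bits selects the source element for one destination element.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Extraction step: destination element i takes the source element
   selected by bits [2*i+1 : 2*i] of the 8-bit immediate. */
void decode_pshufd_mask(unsigned imm, int mask[4]) {
  for (int i = 0; i < 4; ++i)
    mask[i] = (int)((imm >> (2 * i)) & 3u);
}

/* Printing step: render a decoded mask in the comment style quoted in the
   commit message. Returns what snprintf returns. */
int format_pshufd_comment(char *buf, size_t size, const char *dst,
                          const char *src, unsigned imm) {
  int m[4];
  assert(buf != NULL && size > 0);
  decode_pshufd_mask(imm, m);
  return snprintf(buf, size, "%s = %s[%d,%d,%d,%d]", dst, src,
                  m[0], m[1], m[2], m[3]);
}
```

Calling `format_pshufd_comment(buf, sizeof buf, "xmm1", "xmm1", 1)` fills `buf` with `xmm1 = xmm1[1,0,0,0]`, matching the pshufd example in the commit message. Keeping the decoder separate means other code could reuse the mask without going through the string form.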
-
Chris Lattner authored
insertp[sd] $0, which is a noop. Before:

    _f32:                                   ## @f32
        pshufd   $1, %xmm1, %xmm2
        pshufd   $1, %xmm0, %xmm3
        addss    %xmm2, %xmm3
        addss    %xmm1, %xmm0
                                            ## kill: XMM0<def> XMM0<kill> XMM0<def>
        insertps $0, %xmm0, %xmm0
        insertps $16, %xmm3, %xmm0
        ret

after:

    _f32:                                   ## @f32
        movdqa   %xmm0, %xmm2
        addss    %xmm1, %xmm2
        pshufd   $1, %xmm1, %xmm1
        pshufd   $1, %xmm0, %xmm3
        addss    %xmm1, %xmm3
        movdqa   %xmm2, %xmm0
        insertps $16, %xmm3, %xmm0
        ret

The extra movs are due to a random (poor) scheduling decision. llvm-svn: 112379
-
Chris Lattner authored
when the top elements of a vector are undefined. This happens all the time for X86-64 ABI stuff because only the low 2 elements of a 4 element vector are defined. For example, on:

    _Complex float f32(_Complex float A, _Complex float B) {
      return A+B;
    }

We used to produce (with SSE2, SSE4.1+ uses insertps):

    _f32:                                   ## @f32
        movdqa   %xmm0, %xmm2
        addss    %xmm1, %xmm2
        pshufd   $16, %xmm2, %xmm2
        pshufd   $1, %xmm1, %xmm1
        pshufd   $1, %xmm0, %xmm0
        addss    %xmm1, %xmm0
        pshufd   $16, %xmm0, %xmm1
        movdqa   %xmm2, %xmm0
        unpcklps %xmm1, %xmm0
        ret

We now produce:

    _f32:                                   ## @f32
        movdqa   %xmm0, %xmm2
        addss    %xmm1, %xmm2
        pshufd   $1, %xmm1, %xmm1
        pshufd   $1, %xmm0, %xmm3
        addss    %xmm1, %xmm3
        movaps   %xmm2, %xmm0
        unpcklps %xmm3, %xmm0
        ret

This implements rdar://8368414 llvm-svn: 112378
-
Chris Lattner authored
a new EltStride variable instead of reusing NumElems variable for a non-obvious purpose. No functionality change. llvm-svn: 112377
-
Chris Lattner authored
and hasn't kept up with ToT. Approved by Anton. llvm-svn: 112375
-
Bob Wilson authored
llvm-svn: 112357
-
Chris Lattner authored
being actively maintained, improved, or extended. llvm-svn: 112356
-
Bruno Cardoso Lopes authored
Also teach this logic how to handle target-specific shuffles if needed; this is necessary while searching recursively for zeroed scalar elements in vector shuffle operands. llvm-svn: 112348
-
Bob Wilson authored
llvm-svn: 112336
-
Bob Wilson authored
the special values that for ARM would be used with IB or DA modes. Fall through and consider materializing a new base address if it would be profitable. llvm-svn: 112329
-
Bob Wilson authored
all the other LDM/STM instructions. This fixes asm printer crashes when compiling with -O0. I've changed one of the NEON tests (vst3.ll) to run with -O0 to check this in the future.

Prior to this change VLDM/VSTM used addressing mode #5, but not really. The offset field was used to hold a count of the number of registers being loaded or stored, and the AM5 opcode field was expanded to specify the IA or DB mode, instead of the standard ADD/SUB specifier. Much of the backend was not aware of these special cases.

The crashes occurred when rewriting a frameindex caused the AM5 offset field to be changed so that it did not have a valid submode. I don't know exactly what changed to expose this now. Maybe we've never done much with -O0 and NEON. Regardless, there's no longer any reason to keep a count of the VLDM/VSTM registers, so we can use addressing mode #4 and clean things up in a lot of places. llvm-svn: 112322
-