- Mar 25, 2010
-
-
Jakob Stoklund Olesen authored
On Nehalem and newer CPUs there is a 2-cycle latency penalty for using a register in a different domain than the one where it was defined. Some instructions have equivalents for different domains, like por/orps/orpd. The SSEDomainFix pass tries to minimize the number of domain crossings by switching between equivalent opcodes where possible. This is a work in progress; in particular, the pass doesn't do anything yet. SSE instructions are tagged with their execution domain in TableGen using the last two bits of TSFlags. Note that not all instructions are tagged correctly. Life just isn't that simple. The SSE execution domain issue is very similar to the ARM NEON/VFP pipeline issue handled by NEONMoveFixPass. This pass may become target independent to handle both. llvm-svn: 99524
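As a rough illustration of the idea in this commit message, the sketch below reads a two-bit domain tag out of a flags word and swaps an opcode for its equivalent in another domain. It is not the pass itself: the bit position (DOMAIN_SHIFT), the domain numbering, and the names get_domain and opcode_for_domain are all assumptions made for the example. Only the por/orps/orpd equivalence comes from the message.

```python
# Hypothetical layout: the execution domain is stored in the top two bits
# of a 32-bit flags word. The real TSFlags layout is not shown in this log.
DOMAIN_SHIFT = 30
DOMAIN_MASK = 0b11

# Assumed domain numbering, for illustration only.
GENERIC, PACKED_SINGLE, PACKED_DOUBLE, PACKED_INT = 0, 1, 2, 3


def get_domain(ts_flags):
    """Extract the two-bit execution domain tag from a flags word."""
    return (ts_flags >> DOMAIN_SHIFT) & DOMAIN_MASK


# Each row lists opcodes that compute the same result, indexed by
# (domain - 1): packed single, packed double, packed integer.
EQUIVALENTS = [
    ("orps", "orpd", "por"),
]


def opcode_for_domain(opcode, domain):
    """Return the equivalent of `opcode` in `domain`, or `opcode` unchanged
    when no equivalent is known or the domain is generic."""
    if domain == GENERIC:
        return opcode
    for row in EQUIVALENTS:
        if opcode in row:
            return row[domain - 1]
    return opcode
```

For example, if the operands of a por were defined by packed-single instructions, opcode_for_domain("por", PACKED_SINGLE) picks orps, which avoids the domain crossing described above.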
-
Johnny Chen authored
instead of the current N2V. The format of NVDupLane instances is currently set to NEONFrm. llvm-svn: 99518
-
Bob Wilson authored
opcode values fitting in one byte (svn r99494). llvm-svn: 99514
-
Chris Lattner authored
handles dead implicit results more aggressively. More to come, I think this is now just a data entry problem. llvm-svn: 99486
-
Evan Cheng authored
  addl $12, %esp
  popl %esi
  popl %edi
  popl %ebx
  popl %ebp
  jmpl *__Block_deallocator-L1$pb(%esi) # TAILCALL
The problem is that the global base register is assigned the GR32 register class. TCRETURNmi needs the registers making up the address mode to have the GR32_TC register class. The *proper* fix is for X86DAGToDAGISel::getGlobalBaseReg() to return a copy from the global base register of the machine function rather than returning the register itself. But that has the potential of causing it to be coalesced to a more restrictive register class: GR32_TC. It can introduce additional copies and spills. For something as important as the PIC base, it's not worth it, especially since this is not an issue on 64-bit. llvm-svn: 99455
-
Bob Wilson authored
--- Reverse-merging r99440 into '.':
U test/MC/AsmParser/X86/x86_32-bit_cat.s
U test/MC/AsmParser/X86/x86_32-encoding.s
U include/llvm/IntrinsicsX86.td
U include/llvm/CodeGen/SelectionDAGNodes.h
U lib/Target/X86/X86InstrSSE.td
U lib/Target/X86/X86ISelLowering.h
llvm-svn: 99450
-
- Mar 24, 2010
-
-
Kevin Enderby authored
llvm-svn: 99440
-
Jim Grosbach authored
Preliminary testing shows significant performance wins by not using these instructions. llvm-svn: 99436
-
Kevin Enderby authored
not get an "Unknown immediate size" assert failure when used. All instructions of this form have an 8-bit immediate. Also added a test case of an example instruction that is of this form. llvm-svn: 99435
-
Nate Begeman authored
llvm-svn: 99434
-
Johnny Chen authored
llvm-svn: 99428
-
Nate Begeman authored
llvm-svn: 99423
-
Johnny Chen authored
NVCVTFrm will later be used to describe "vcvt with fractional bits". llvm-svn: 99415
-
Johnny Chen authored
N3VX instructions using special case code. llvm-svn: 99409
-
Jim Grosbach authored
llvm-svn: 99402
-
Johnny Chen authored
llvm-svn: 99376
-
Chris Lattner authored
and defining the add pattern with Pat<>, eliminating a use of parallel. llvm-svn: 99375
-
Johnny Chen authored
respectively, and add some more comments. llvm-svn: 99373
-
Chris Lattner authored
llvm-svn: 99370
-
Chris Lattner authored
ISD node. The only changes in the generated isel code are comments like:
  < // Src: (X86dec_flag:i16 GR16:i16:$src)
  ---
  > // Src: (X86dec_flag:i16:i32 GR16:i16:$src)
because now it knows that X86dec_flag returns both an i16 (for the result) and an i32 (for EFLAGS) in this case. Wewt. llvm-svn: 99369
-
Chris Lattner authored
llvm-svn: 99360
-
Chris Lattner authored
llvm-svn: 99359
-
Chris Lattner authored
llvm-svn: 99358
-
Jim Grosbach authored
test run performance numbers say as to whether it helps. llvm-svn: 99355
-
Jakob Stoklund Olesen authored
This reverts commit 99345. It was breaking buildbots. llvm-svn: 99352
-
Chris Lattner authored
just use an empty result list. llvm-svn: 99346
-
Jakob Stoklund Olesen authored
This is work in progress. So far, SSE execution domain tables are added to X86InstrInfo, and a skeleton pass is enabled with -sse-domain-fix. llvm-svn: 99345
-
Johnny Chen authored
llvm-svn: 99344
-
- Mar 23, 2010
-
-
Johnny Chen authored
llvm-svn: 99328
-
Johnny Chen authored
llvm-svn: 99327
-
Johnny Chen authored
Converted some of the NEON vcvt instructions to this format. llvm-svn: 99326
-
Johnny Chen authored
llvm-svn: 99322
-
Evan Cheng authored
Teach isSafeToClobberEFLAGS to ignore dbg_values. Do we need a MachineBasicBlock::iterator that does this automatically? llvm-svn: 99320
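The "iterator that does this automatically" idea can be sketched as a wrapper that hides debug-value pseudo-instructions from a forward scan, so the scan gives the same answer with and without debug info. Everything here is a hypothetical model for illustration: the Instr class, the skip_debug helper, the four-instruction window, and the conservative end-of-block answer are assumptions, not the actual LLVM code.

```python
from dataclasses import dataclass


@dataclass
class Instr:
    """Toy stand-in for a machine instruction (hypothetical model)."""
    opcode: str
    reads_flags: bool = False
    writes_flags: bool = False

    @property
    def is_debug_value(self):
        return self.opcode == "DBG_VALUE"


def skip_debug(instrs):
    """Yield only real instructions, silently dropping debug values."""
    for mi in instrs:
        if not mi.is_debug_value:
            yield mi


def is_safe_to_clobber_flags(instrs, window=4):
    """Scan up to `window` real instructions after the insertion point.

    Clobbering the flags is treated as safe only if they are redefined
    before anything reads them. The window size is an assumed value.
    """
    for n, mi in enumerate(skip_debug(instrs)):
        if n >= window:
            return False  # gave up looking: be conservative
        if mi.reads_flags:
            return False  # the old flags value is still live
        if mi.writes_flags:
            return True   # flags are dead up to this definition
    return False  # reached the end of the block: be conservative
```

Because the debug values never reach the loop body, they do not consume slots in the scan window, which is the property the commit is after.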
-
Bob Wilson authored
These instructions are only needed for codegen, so I've removed all the explicit encoding bits for now; they should be set in the same way as for VLDMD and VSTMD whenever we add encodings for VFP. The use of addrmode5 requires that the instructions be custom-selected so that the number of registers can be set in the AM5Opc value. llvm-svn: 99309
-
Bob Wilson authored
llvm-svn: 99295
-
Johnny Chen authored
Ref: A7.4.6 One register and a modified immediate value. llvm-svn: 99288
-
Bob Wilson authored
llvm-svn: 99266
-
Bob Wilson authored
of D registers. Add a separate VST1q instruction with a Q register source operand for use by storeRegToStackSlot. llvm-svn: 99265
-
Bob Wilson authored
of D registers. Add a separate VLD1q instruction with a Q register destination operand for use by loadRegFromStackSlot. llvm-svn: 99261
-
Daniel Dunbar authored
MC: Add TargetAsmBackend::MayNeedRelaxation, for checking whether a particular instruction + fixups might need relaxation. llvm-svn: 99249
-