- Mar 26, 2010
-
-
Johnny Chen authored
It doesn't seem to be used anywhere. llvm-svn: 99566
-
Jim Grosbach authored
llvm-svn: 99565
-
Gabor Greif authored
llvm-svn: 99564
-
- Mar 25, 2010
-
-
Daniel Dunbar authored
exactly two passes in that case, and don't ever need to recompute any layout, so this is a nice baseline for relaxation performance. llvm-svn: 99563
-
Johnny Chen authored
llvm-svn: 99557
-
Jim Grosbach authored
llvm-svn: 99549
-
Johnny Chen authored
expect a Format arg. N2VCvtD/N2VCvtQ are modified to use the NVCVTFrm format. llvm-svn: 99548
-
Evan Cheng authored
llvm-svn: 99544
-
Daniel Dunbar authored
- Still O(N^2), just a faster form, and now it's the MCAsmLayout's fault.

On the .s I am tuning against (combine.s from 403.gcc):
--
ddunbar@lordcrumb:MC$ diff stats-before.txt stats-after.txt
5,10c5,10
< 1728 assembler - Number of assembler layout and relaxation steps
< 7707 assembler - Number of emitted assembler fragments
< 120588 assembler - Number of emitted object file bytes
< 2233448 assembler - Number of evaluated fixups
< 1727 assembler - Number of relaxed instructions
< 6723845 mcexpr - Number of MCExpr evaluations
---
> 3 assembler - Number of assembler layout and relaxation steps
> 7707 assembler - Number of emitted assembler fragments
> 120588 assembler - Number of emitted object file bytes
> 14796 assembler - Number of evaluated fixups
> 1727 assembler - Number of relaxed instructions
> 67889 mcexpr - Number of MCExpr evaluations
--
Feel free to LOL at the -before numbers, if you like.

I am a little surprised we make more than 2 relaxation passes. It's pretty trivial for us to do relaxation out-of-order if that would give a speedup. llvm-svn: 99543
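To put the before/after statistics in proportion, the short Python sketch below computes how many times smaller each changed counter became. The numbers are copied from the diff in the commit message; the dictionary keys and the `reduction_factors` helper are names made up for this illustration.

```python
# Counters that changed between stats-before.txt and stats-after.txt,
# copied from the diff quoted in the commit message.
before = {
    "layout and relaxation steps": 1728,
    "evaluated fixups": 2233448,
    "MCExpr evaluations": 6723845,
}
after = {
    "layout and relaxation steps": 3,
    "evaluated fixups": 14796,
    "MCExpr evaluations": 67889,
}


def reduction_factors(before, after):
    """Return before/after for every counter (hypothetical helper)."""
    return {name: before[name] / after[name] for name in before}


factors = reduction_factors(before, after)
for name, factor in factors.items():
    print(f"{name}: {factor:.1f}x fewer")
```

The unchanged counters (emitted fragments, emitted bytes, relaxed instructions) are left out because their ratio is 1.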
-
Daniel Dunbar authored
llvm-svn: 99542
-
Jakob Stoklund Olesen authored
llvm-svn: 99540
-
Jakob Stoklund Olesen authored
Remove much horribleness from X86InstrFormats as a result. Similar simplifications are probably possible for other targets. llvm-svn: 99539
-
Chris Lattner authored
the custom insertion hook deletes the instruction, then we try to set dead flags on it. Neither the code that I added nor the code that was there before was safe. llvm-svn: 99538
-
Evan Cheng authored
llvm-svn: 99537
-
Daniel Dunbar authored
llvm-svn: 99529
-
Daniel Dunbar authored
llvm-svn: 99528
-
Jakob Stoklund Olesen authored
On Nehalem and newer CPUs there is a 2-cycle latency penalty on using a register in a different domain than where it was defined. Some instructions have equivalents for different domains, like por/orps/orpd. The SSEDomainFix pass tries to minimize the number of domain crossings by changing between equivalent opcodes where possible. This is a work in progress; in particular, the pass doesn't do anything yet. SSE instructions are tagged with their execution domain in TableGen using the last two bits of TSFlags. Note that not all instructions are tagged correctly. Life just isn't that simple. The SSE execution domain issue is very similar to the ARM NEON/VFP pipeline issue handled by NEONMoveFixPass. This pass may become target independent to handle both. llvm-svn: 99524
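The idea of the pass can be sketched in a few lines of Python. This is an illustration only, not the pass's code: the domain numbering, the `DOMAIN_SHIFT` bit position, and the helper names are assumptions made for the example. Only the por/orps/orpd equivalence and the use of two TSFlags bits come from the commit message.

```python
# Hypothetical domain numbering; the real X86 backend values may differ.
GENERIC, PACKED_SINGLE, PACKED_DOUBLE, PACKED_INT = 0, 1, 2, 3

# Assumed position of the two domain bits in a 32-bit TSFlags word.
DOMAIN_SHIFT = 30
DOMAIN_MASK = 0b11


def domain_of(ts_flags):
    """Extract the two-bit execution domain from a TSFlags value."""
    return (ts_flags >> DOMAIN_SHIFT) & DOMAIN_MASK


# One row per group of equivalent opcodes, keyed by domain.
# The single row shown is the example named in the commit message.
EQUIVALENTS = [
    {PACKED_SINGLE: "orps", PACKED_DOUBLE: "orpd", PACKED_INT: "por"},
]


def retarget(opcode, wanted_domain):
    """Return the equivalent of `opcode` in `wanted_domain`.

    Falls back to `opcode` itself when no equivalent is known, which is
    the case where a domain crossing cannot be avoided by renaming.
    """
    for row in EQUIVALENTS:
        if opcode in row.values():
            return row.get(wanted_domain, opcode)
    return opcode
```

For example, `retarget("por", PACKED_SINGLE)` yields `"orps"`, so a bitwise OR fed by single-precision instructions could stay in that domain.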
-
Johnny Chen authored
instead of the current N2V. The format of NVDupLane instances is set to NEONFrm currently. llvm-svn: 99518
-
Bob Wilson authored
opcode values fitting in one byte (svn r99494). llvm-svn: 99514
-
Devang Patel authored
llvm-svn: 99507
-
Daniel Dunbar authored
llvm-svn: 99504
-
Evan Cheng authored
Scheduler assumes SDDbgValue nodes are in source order. That's true currently. But add an assertion to verify it. llvm-svn: 99501
-
Daniel Dunbar authored
llvm-svn: 99500
-
Daniel Dunbar authored
llvm-svn: 99499
-
Chris Lattner authored
bytes instead of one byte. This is important because we're running up to too many opcodes to fit in a byte and it is aggravated by FIRST_TARGET_MEMORY_OPCODE making the numbering sparse. This just bites the bullet and bloats out the table. In practice, this increases the size of the x86 isel table from 74.5K to 76K. I think we'll cope :) This fixes rdar://7791648 llvm-svn: 99494
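As a rough illustration of the trade-off, the Python sketch below stores an opcode number in a byte table using two bytes instead of one. The little-endian byte order and the function names are assumptions for the example; the commit message only says that opcodes now take two bytes.

```python
def emit_opcode(table, opcode):
    """Append `opcode` to `table` (a bytearray) as two bytes, low byte first."""
    if not 0 <= opcode <= 0xFFFF:
        raise ValueError("opcode does not fit in two bytes")
    table.append(opcode & 0xFF)  # low byte
    table.append(opcode >> 8)    # high byte


def read_opcode(table, index):
    """Decode the two-byte opcode at `index`; return (opcode, next_index)."""
    opcode = table[index] | (table[index + 1] << 8)
    return opcode, index + 2


table = bytearray()
emit_opcode(table, 300)  # would not fit in a single byte
opcode, next_index = read_opcode(table, 0)
```

Every opcode entry costs one extra byte, which is consistent with the modest table growth reported above.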
-
Devang Patel authored
llvm-svn: 99490
-
Evan Cheng authored
llvm-svn: 99489
-
Chris Lattner authored
llvm-svn: 99488
-
Evan Cheng authored
llvm-svn: 99487
-
Chris Lattner authored
handles dead implicit results more aggressively. More to come, I think this is now just a data entry problem. llvm-svn: 99486
-
Chris Lattner authored
happening. Enhance scheduling to set the DEAD flag on implicit defs more aggressively. Before, we'd set an implicit def operand to dead if it were present in the SDNode corresponding to the machineinstr but had no use. Now we do it in this case AND if the implicit def does not exist in the SDNode at all. This exposes a couple of problems: one is the FIXME, which causes a live intervals crash on CodeGen/X86/sibcall.ll. The second is that it makes machinecse and licm more aggressive (which is a good thing) but also exposes a case where licm hoists a set0 and then it doesn't get resunk. Talking to codegen folks about both these issues, but I need this patch in in the meantime. llvm-svn: 99485
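The old and new rules for marking an implicit def dead can be written as two small predicates. The data model here (a set of implicit-def registers present on the SDNode and a set of those that have a use) is a simplification invented for this sketch, not the scheduler's real representation.

```python
def is_dead_before(reg, node_defs, used_defs):
    """Old rule: dead only if the def is on the SDNode but has no use."""
    return reg in node_defs and reg not in used_defs


def is_dead_after(reg, node_defs, used_defs):
    """New rule: the old case, plus defs that are absent from the SDNode."""
    return is_dead_before(reg, node_defs, used_defs) or reg not in node_defs


# Hypothetical instruction that implicitly defines EFLAGS, where the
# SDNode does not model that result at all.
node_defs = set()
used_defs = set()
old = is_dead_before("EFLAGS", node_defs, used_defs)
new = is_dead_after("EFLAGS", node_defs, used_defs)
```

In this case `old` is `False` and `new` is `True`, which is the extra aggressiveness the message describes.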
-
Devang Patel authored
llvm-svn: 99484
-
Eric Christopher authored
instead of InlineFunction. llvm-svn: 99483
-
Chris Lattner authored
r99453. llvm-svn: 99482
-
Daniel Dunbar authored
llvm-svn: 99474
-
Daniel Dunbar authored
llvm-svn: 99473
-
Evan Cheng authored
Change how dbg_value sdnodes are converted into machine instructions. Their placement should be determined by the relative order of incoming llvm instructions. The scheduler will now use the SDNode ordering information to determine where to insert them. A dbg_value instruction is inserted after the instruction with the last highest source order and before the instruction with the next highest source order. It will optimize the placement by inserting right after the instruction that produces the value if they have consecutive order numbers.

Here is a theoretical example that illustrates why the placement is important.

tmp1 = store tmp1 -> x
...
tmp2 = add ...
...
call
...
store tmp2 -> x

Now mem2reg comes along:

tmp1 = dbg_value (tmp1 -> x)
...
tmp2 = add ...
...
call
...
dbg_value (tmp2 -> x)

When the debugger examines the value of x after the add instruction but before the call, it should have the value of tmp1. Furthermore, for dbg_value's that reference constants, they should not be emitted at the beginning of the block (since they do not have "producers").

This patch also cleans up how SDISel manages DbgValue nodes. It allows an SDNode to be referenced by multiple SDDbgValue nodes. When an SDNode is deleted, it uses the information to find the SDDbgValues and invalidate them. They are not deleted until the corresponding SelectionDAG is destroyed. llvm-svn: 99469
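The ordering rule can be sketched as a merge by source order number. This is a simplified Python illustration under assumed inputs: the `(order, text)` tuples, the order numbers, and the `place_dbg_values` name are invented for the example, and the "right after the producer" optimization is not modeled.

```python
import bisect


def place_dbg_values(instrs, dbg_values):
    """Merge dbg_values into instrs by source order.

    instrs:     list of (order, text), already sorted by order.
    dbg_values: list of (order, text) in any order.
    Each dbg_value lands after the last entry whose order is <= its own
    and before the first entry with a higher order.
    """
    merged = list(instrs)
    for order, text in sorted(dbg_values):
        keys = [o for o, _ in merged]
        pos = bisect.bisect_right(keys, order)
        merged.insert(pos, (order, text))
    return [text for _, text in merged]


# Hypothetical order numbers for the example in the commit message.
instrs = [(1, "tmp1"), (3, "tmp2 = add"), (5, "call")]
dbg_values = [(6, "dbg_value (tmp2 -> x)"), (2, "dbg_value (tmp1 -> x)")]
placed = place_dbg_values(instrs, dbg_values)
```

With these inputs the first dbg_value stays ahead of the add and the second stays after the call, so a debugger stopped between the add and the call still sees x as tmp1.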
-
Daniel Dunbar authored
llvm-svn: 99467
-
Daniel Dunbar authored
MC: Fix refacto in MCExpr evaluation, I mistakenly replaced a fragment address with a symbol address. - This fixes the integrated-as nightly test regressions. llvm-svn: 99466
-
Evan Cheng authored
llvm-svn: 99465
-