- Dec 19, 2011
-
-
Evan Cheng authored
unpredicated. That is, turn subeq r0, r1, #1 addne r0, r1, #1 into sub r0, r1, #1 addne r0, r1, #1 For targets where conditional instructions are always executed, this may be beneficial. It may remove pseudo anti-dependency in out-of-order execution CPUs. e.g. op r1, ... str r1, [r10] ; end-of-life of r1 as div result cmp r0, #65 movne r1, #44 ; raw dependency on previous r1 moveq r1, #12 If movne is unpredicated, then op r1, ... str r1, [r10] cmp r0, #65 mov r1, #44 ; r1 written unconditionally moveq r1, #12 Both mov and moveq are no longer depdendent on the first instruction. This gives the out-of-order execution engine more freedom to reorder them. This has passed entire LLVM test suite. But it has not been enabled for any ARM variant pending more performance evaluation. rdar://8951196 llvm-svn: 146914
-
- Dec 07, 2011
-
-
Evan Cheng authored
generator to it. For non-bundle instructions, these behave exactly the same as the MC layer API. For properties like mayLoad / mayStore, look into the bundle and if any of the bundled instructions has the property it would return true. For properties like isPredicable, only return true if *all* of the bundled instructions have the property. For properties like canFoldAsLoad, isCompare, conservatively return false for bundles. llvm-svn: 146026
-
- Nov 05, 2011
-
-
- Aug 04, 2011
-
-
Jakub Staszak authored
llvm-svn: 136828
-
Jakub Staszak authored
llvm-svn: 136826
-
- Jul 22, 2011
-
-
Jakub Staszak authored
llvm-svn: 135738
-
Jakub Staszak authored
llvm-svn: 135734
-
Jakub Staszak authored
llvm-svn: 135724
-
- Jul 10, 2011
-
-
Jakub Staszak authored
llvm-svn: 134858
-
Jakub Staszak authored
llvm-svn: 134856
-
- Jun 29, 2011
-
-
Evan Cheng authored
llvm-svn: 134049
-
- Jun 28, 2011
-
-
Evan Cheng authored
sink them into MC layer. - Added MCInstrInfo, which captures the tablegen generated static data. Chang TargetInstrInfo so it's based off MCInstrInfo. llvm-svn: 134021
-
- May 12, 2011
-
-
Evan Cheng authored
markers. In some cases a register def is dead on one path, but not on another. This is passing Clang self-hosting. llvm-svn: 131214
-
- May 11, 2011
-
-
Rafael Espindola authored
to provide a reduced testcase. llvm-svn: 131176
-
Evan Cheng authored
at the start of basic blocks to their common predecessor. It's actually quite common (e.g. about 50 times in JM/lencod) and has shown to be a nice code size benefit. e.g. pushq %rax testl %edi, %edi jne LBB0_2 ## BB#1: xorb %al, %al popq %rdx ret LBB0_2: xorb %al, %al callq _foo popq %rdx ret => pushq %rax xorb %al, %al testl %edi, %edi je LBB0_2 ## BB#1: callq _foo LBB0_2: popq %rdx ret rdar://9145558 llvm-svn: 131172
-
- Apr 27, 2011
-
-
Evan Cheng authored
successors) and use inverse depth first search to traverse the BBs. However that doesn't work when the CFG has infinite loops. Simply do a linear traversal of all BBs work just fine. rdar://9344645 llvm-svn: 130324
-
- Nov 06, 2010
-
-
Benjamin Kramer authored
llvm-svn: 118342
-
- Nov 03, 2010
-
-
Evan Cheng authored
1. Fix pre-ra scheduler so it doesn't try to push instructions above calls to "optimize for latency". Call instructions don't have the right latency and this is more likely to use introduce spills. 2. Fix if-converter cost function. For ARM, it should use instruction latencies, not # of micro-ops since multi-latency instructions is completely executed even when the predicate is false. Also, some instruction will be "slower" when they are predicated due to the register def becoming implicit input. rdar://8598427 llvm-svn: 118135
-
- Oct 26, 2010
-
-
Bob Wilson authored
do not double-count the duplicate instructions by counting once from the beginning and again from the end. Keep track of where the duplicates from the beginning ended and don't go past that point when counting duplicates at the end. Radar 8589805. This change causes one of the MC/ARM/simple-fp-encoding tests to produce different (better!) code without the vmovne instruction being tested. I changed the test to produce vmovne and vmoveq instructions but moving between register files in the opposite direction. That's not quite the same but predicated versions of those instructions weren't being tested before, so at least the test coverage is not any worse, just different. llvm-svn: 117333
-
Bob Wilson authored
instructions separately from the count of non-predicated instructions. The instruction count is used in places to determine how many instructions to copy, predicate, etc. and things get confused if that count includes the extra cost for microcoded ops. llvm-svn: 117332
-
- Oct 19, 2010
-
-
Owen Anderson authored
Get rid of static constructors for pass registration. Instead, every pass exposes an initializeMyPassFunction(), which must be called in the pass's constructor. This function uses static dependency declarations to recursively initialize the pass's dependencies. Clients that only create passes through the createFooPass() APIs will require no changes. Clients that want to use the CommandLine options for passes will need to manually call the appropriate initialization functions in PassInitialization.h before parsing commandline arguments. I have tested this with all standard configurations of clang and llvm-gcc on Darwin. It is possible that there are problems with the static dependencies that will only be visible with non-standard options. If you encounter any crash in pass registration/creation, please send the testcase to me directly. llvm-svn: 116820
-
- Oct 12, 2010
-
-
Owen Anderson authored
perform initialization without static constructors AND without explicit initialization by the client. For the moment, passes are required to initialize both their (potential) dependencies and any passes they preserve. I hope to be able to relax the latter requirement in the future. llvm-svn: 116334
-
- Oct 08, 2010
-
-
Owen Anderson authored
llvm-svn: 115996
-
- Oct 02, 2010
-
-
Owen Anderson authored
Thread the determination of branch prediction hit rates back through the if-conversion heuristic APIs. For now, stick with a constant estimate of 90% (branch predictors are good!), but we might find that we want to provide more nuanced estimates in the future. llvm-svn: 115364
-
- Sep 30, 2010
-
-
Benjamin Kramer authored
llvm-svn: 115097
-
- Sep 28, 2010
-
-
Owen Anderson authored
estimates. llvm-svn: 114981
-
Owen Anderson authored
Rather than having arbitrary cutoffs, actually try to cost model the conversion. For now, the constants are tuned to more or less match our existing behavior, but these will be changed to reflect realistic values as this work proceeds. llvm-svn: 114973
-
- Sep 10, 2010
-
-
Evan Cheng authored
take multiple cycles to decode. For the current if-converter clients (actually only ARM), the instructions that are predicated on false are not nops. They would still take machine cycles to decode. Micro-coded instructions such as LDM / STM can potentially take multiple cycles to decode. If-converter should take treat them as non-micro-coded simple instructions. llvm-svn: 113570
-
- Aug 06, 2010
-
-
Owen Anderson authored
llvm-svn: 110460
-
Owen Anderson authored
llvm-svn: 110410
-
Owen Anderson authored
ID member as the sole unique type identifier. Clean up APIs related to this change. llvm-svn: 110396
-
- Jul 22, 2010
-
-
Owen Anderson authored
llvm-svn: 109045
-
- Jun 29, 2010
-
-
Bob Wilson authored
There are 2 changes relative to the previous version of the patch: 1) For the "simple" if-conversion case, there's no need to worry about RemoveExtraEdges not handling an unanalyzable branch. Predicated terminators are ignored in this context, so RemoveExtraEdges does the right thing. This might break someday if we ever treat indirect branches (BRIND) as predicable, but for now, I just removed this part of the patch, because in the case where we do not add an unconditional branch, we rely on keeping the fall-through edge to CvtBBI (which is empty after this transformation). The change relative to the previous patch is: @@ -1036,10 +1036,6 @@ IterIfcvt = false; } - // RemoveExtraEdges won't work if the block has an unanalyzable branch, - // which is typically the case for IfConvertSimple, so explicitly remove - // CvtBBI as a successor. - BBI.BB->removeSuccessor(CvtBBI->BB); RemoveExtraEdges(BBI); // Update block info. BB can be iteratively if-converted. 2) My patch exposed a bug in the code for merging the tail of a "diamond", which had previously never been exercised. The code was simply checking that the tail had a single predecessor, but there was a case in MultiSource/Benchmarks/VersaBench/dbms where that single predecessor was neither edge of the diamond. I added the following change to check for that: @@ -1276,7 +1276,18 @@ // tail, add a unconditional branch to it. if (TailBB) { BBInfo TailBBI = BBAnalysis[TailBB->getNumber()]; - if (TailBB->pred_size() == 1 && !TailBBI.HasFallThrough) { + bool CanMergeTail = !TailBBI.HasFallThrough; + // There may still be a fall-through edge from BBI1 or BBI2 to TailBB; + // check if there are any other predecessors besides those. + unsigned NumPreds = TailBB->pred_size(); + if (NumPreds > 1) + CanMergeTail = false; + else if (NumPreds == 1 && CanMergeTail) { + MachineBasicBlock::pred_iterator PI = TailBB->pred_begin(); + if (*PI != BBI1->BB && *PI != BBI2->BB) + CanMergeTail = false; + } + if (CanMergeTail) { MergeBlocks(BBI, TailBBI); TailBBI.IsDone = true; } else { With these fixes, I was able to run all the SingleSource and MultiSource tests successfully. llvm-svn: 107110
-
- Jun 28, 2010
-
-
Jim Grosbach authored
llvm-svn: 107060
-
Daniel Dunbar authored
block, not...", it caused a bunch of nightly test regressions. llvm-svn: 107009
-
- Jun 26, 2010
-
-
Bob Wilson authored
regressions. --- Reverse-merging r106939 into '.': U test/CodeGen/Thumb2/thumb2-ifcvt3.ll U lib/CodeGen/IfConversion.cpp llvm-svn: 106951
-
Bob Wilson authored
if-conversion. The RemoveExtraEdges function doesn't work for blocks that end with unanalyzable branches, so in those cases, the "extra" edges must be explicitly removed. The CopyAndPredicateBlock and MergeBlocks methods can also avoid copying successor edges due to branches that have already been removed. The latter case is especially helpful when MergeBlocks is called for handling "diamond" if-conversions, where otherwise you can end up with some weird intermediate states in the CFG. Unfortunately I've been unable to find cases where this cleanup actually makes a significant difference in the code. There is one test where we manage to remove an empty block at the end of a function. Radar 6911268. llvm-svn: 106939
-
Jim Grosbach authored
just at the head, when doing diamond if-conversion. rdar://7797940 llvm-svn: 106907
-
Evan Cheng authored
llvm-svn: 106901
-
Jim Grosbach authored
llvm-svn: 106894
-