  1. Dec 19, 2011
    • Add an if-conversion optimization that allows the 'true' side of a diamond to be · 4266a793
      Evan Cheng authored
      unpredicated. That is, turn
       subeq  r0, r1, #1
       addne  r0, r1, #1
      into
       sub    r0, r1, #1
       addne  r0, r1, #1
      
      For targets where conditional instructions are always executed, this may be
      beneficial. It may remove pseudo anti-dependency in out-of-order execution
      CPUs. e.g.
       op    r1, ...
       str   r1, [r10]        ; end-of-life of r1 as div result
       cmp   r0, #65
       movne r1, #44  ; raw dependency on previous r1
       moveq r1, #12
      
      If movne is unpredicated, then
       op    r1, ...
       str   r1, [r10]
       cmp   r0, #65
       mov   r1, #44  ; r1 written unconditionally
       moveq r1, #12
      
      Both mov and moveq are no longer dependent on the first instruction. This gives
      the out-of-order execution engine more freedom to reorder them.
      
      This has passed the entire LLVM test suite, but it has not been enabled for any ARM
      variant pending more performance evaluation.
      
      rdar://8951196
      
      llvm-svn: 146914
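
The equivalence claimed in the commit message above can be checked with a small toy model. This is only a sketch under stated assumptions: the two functions and the `z` flag parameter are hypothetical illustrations, not LLVM code. They model the sub/add pair from the message as plain C++ on unsigned 32-bit values.

```cpp
#include <cstdint>

// Hypothetical toy model, not LLVM code. `z` stands for the condition flag:
// true selects the "eq" side of the diamond, false the "ne" side.

// Fully predicated form:
//   subeq r0, r1, #1
//   addne r0, r1, #1
inline std::uint32_t predicatedDiamond(std::uint32_t r0, std::uint32_t r1, bool z) {
  if (z)  r0 = r1 - 1;  // subeq: writes r0 only when the condition holds
  if (!z) r0 = r1 + 1;  // addne: writes r0 only when the condition fails
  return r0;
}

// Form with the 'true' side unpredicated:
//   sub   r0, r1, #1
//   addne r0, r1, #1
inline std::uint32_t unpredicatedTrueSide(std::uint32_t r0, std::uint32_t r1, bool z) {
  r0 = r1 - 1;          // sub: always executed, so the old r0 is never needed
  if (!z) r0 = r1 + 1;  // addne: overwrites the result on the "ne" path
  return r0;
}
```

In this model both forms return the same value for either setting of `z`. The second form never reads the incoming `r0`, which corresponds to the removed pseudo anti-dependency described above.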
  2. Dec 07, 2011
    • Add bundle aware API for querying instruction properties and switch the code · 7f8e563a
      Evan Cheng authored
      generator to it. For non-bundle instructions, these behave exactly the same
      as the MC layer API.
      
      For properties like mayLoad / mayStore, look into the bundle and return true if
      any of the bundled instructions has the property.
      For properties like isPredicable, only return true if *all* of the bundled
      instructions have the property.
      For properties like canFoldAsLoad, isCompare, conservatively return false for
      bundles.
      
      llvm-svn: 146026
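
The three query rules above (any-of, all-of, conservatively false) can be sketched as follows. The `InstrProps` and `Instr` types are hypothetical stand-ins invented for this illustration and are not the LLVM MachineInstr or MC layer classes. The sketch only shows how each kind of property could be combined across a bundle.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical stand-ins, not the LLVM types.
struct InstrProps {
  bool mayLoad;
  bool isPredicable;
  bool canFoldAsLoad;
};

struct Instr {
  InstrProps props;                 // properties of a standalone instruction
  std::vector<InstrProps> bundled;  // non-empty when this instruction is a bundle
  bool isBundle() const { return !bundled.empty(); }
};

// "Any" rule: a bundle may load if any bundled instruction may load.
inline bool mayLoad(const Instr &mi) {
  if (!mi.isBundle()) return mi.props.mayLoad;
  return std::any_of(mi.bundled.begin(), mi.bundled.end(),
                     [](const InstrProps &p) { return p.mayLoad; });
}

// "All" rule: a bundle is predicable only if every bundled instruction is.
inline bool isPredicable(const Instr &mi) {
  if (!mi.isBundle()) return mi.props.isPredicable;
  return std::all_of(mi.bundled.begin(), mi.bundled.end(),
                     [](const InstrProps &p) { return p.isPredicable; });
}

// Conservative rule: always false for bundles.
inline bool canFoldAsLoad(const Instr &mi) {
  return mi.isBundle() ? false : mi.props.canFoldAsLoad;
}
```

For a non-bundle instruction every query simply forwards to the instruction's own property, matching the statement that non-bundle instructions behave exactly as before.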
  3. Nov 05, 2011
  4. Aug 04, 2011
  5. Jul 22, 2011
  6. Jul 10, 2011
  7. Jun 29, 2011
  8. Jun 28, 2011
  9. May 12, 2011
  10. May 11, 2011
  11. Apr 27, 2011
  12. Nov 06, 2010
  13. Nov 03, 2010
    • Two sets of changes. Sorry they are intermingled. · debf9c50
      Evan Cheng authored
      1. Fix pre-ra scheduler so it doesn't try to push instructions above calls to
         "optimize for latency". Call instructions don't have the right latency and
         this is more likely to introduce spills.
      2. Fix if-converter cost function. For ARM, it should use instruction latencies,
         not # of micro-ops, since a multi-latency instruction is completely executed
         even when the predicate is false. Also, some instructions will be "slower"
         when they are predicated due to the register def becoming an implicit input.
         rdar://8598427
      
      llvm-svn: 118135
  14. Oct 26, 2010
    • When the "true" and "false" blocks of a diamond if-conversion are the same, · e1961fe2
      Bob Wilson authored
      do not double-count the duplicate instructions by counting once from the
      beginning and again from the end.  Keep track of where the duplicates from
      the beginning ended and don't go past that point when counting duplicates
      at the end.  Radar 8589805.
      
      This change causes one of the MC/ARM/simple-fp-encoding tests to produce
      different (better!) code without the vmovne instruction being tested.
      I changed the test to produce vmovne and vmoveq instructions that move
      between register files in the opposite direction.  That's not quite the same,
      but predicated versions of those instructions weren't being tested before,
      so at least the test coverage is not any worse, just different.
      
      llvm-svn: 117333
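
The counting rule described above can be sketched on plain vectors of instruction strings. This is a minimal illustration under stated assumptions: `countDuplicates` is a hypothetical helper, not the if-converter's code, and instructions are compared as strings purely for brevity.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: returns {duplicates at the start, duplicates at the end}
// of two blocks, never counting the same instruction twice.
inline std::pair<std::size_t, std::size_t>
countDuplicates(const std::vector<std::string> &trueBB,
                const std::vector<std::string> &falseBB) {
  const std::size_t n1 = trueBB.size();
  const std::size_t n2 = falseBB.size();

  // Count matching instructions from the beginning.
  std::size_t front = 0;
  while (front < n1 && front < n2 && trueBB[front] == falseBB[front])
    ++front;

  // Count matching instructions from the end, but stop before reaching the
  // point where the duplicates from the beginning ended.
  std::size_t back = 0;
  while (front + back < n1 && front + back < n2 &&
         trueBB[n1 - 1 - back] == falseBB[n2 - 1 - back])
    ++back;

  return {front, back};
}
```

When the two blocks are identical, every instruction is attributed to the leading run and the trailing count stays at zero, so nothing is counted twice.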
    • Change if-conversion to keep track of the extra cost due to microcoded · efd360c5
      Bob Wilson authored
      instructions separately from the count of non-predicated instructions.  The
      instruction count is used in places to determine how many instructions to
      copy, predicate, etc. and things get confused if that count includes the
      extra cost for microcoded ops.
      
      llvm-svn: 117332
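
One way to picture the separation described above is a block scan that keeps two counters. This is a hedged sketch: `ToyInstr`, `BlockCost`, and `scanBlock` are hypothetical names made up for this example, and charging `numMicroOps - 1` as the extra cost is an assumption chosen for illustration.

```cpp
#include <vector>

// Hypothetical types for illustration only.
struct ToyInstr {
  bool predicated;       // already predicated, so not counted
  unsigned numMicroOps;  // 1 for a simple instruction, more if microcoded
};

struct BlockCost {
  unsigned nonPredSize;  // plain count of non-predicated instructions
  unsigned extraCost;    // additional cost from microcoded instructions
};

inline BlockCost scanBlock(const std::vector<ToyInstr> &block) {
  BlockCost cost{0, 0};
  for (const ToyInstr &mi : block) {
    if (mi.predicated)
      continue;
    ++cost.nonPredSize;  // used when deciding how many instructions to copy
    if (mi.numMicroOps > 1)
      cost.extraCost += mi.numMicroOps - 1;  // used only for profitability
  }
  return cost;
}
```

Keeping `nonPredSize` as a pure instruction count means code that copies or predicates that many instructions is not thrown off by the cost adjustment.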
  15. Oct 19, 2010
    • Get rid of static constructors for pass registration. Instead, every pass... · 6c18d1aa
      Owen Anderson authored
      Get rid of static constructors for pass registration.  Instead, every pass exposes an initializeMyPassFunction(), which
      must be called in the pass's constructor.  This function uses static dependency declarations to recursively initialize
      the pass's dependencies.
      
      Clients that only create passes through the createFooPass() APIs will require no changes.  Clients that want to use the
      CommandLine options for passes will need to manually call the appropriate initialization functions in PassInitialization.h
      before parsing commandline arguments.
      
      I have tested this with all standard configurations of clang and llvm-gcc on Darwin.  It is possible that there are problems
      with the static dependencies that will only be visible with non-standard options.  If you encounter any crash in pass
      registration/creation, please send the testcase to me directly.
      
      llvm-svn: 116820
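
The registration scheme described above can be illustrated with a toy registry. Everything here is a hypothetical sketch: `ToyRegistry`, the three `initialize...` functions, and `TransformC` are invented names and do not reflect the actual LLVM pass registry or its macros. The point is only that each initialization function first initializes its dependencies and then registers itself at most once.

```cpp
#include <set>
#include <string>
#include <vector>

// Hypothetical registry, not the LLVM PassRegistry.
struct ToyRegistry {
  std::set<std::string> registered;
  std::vector<std::string> order;  // registration order, for inspection

  // Registers a pass name once; repeated calls are no-ops.
  bool registerPass(const std::string &name) {
    if (!registered.insert(name).second)
      return false;
    order.push_back(name);
    return true;
  }
};

inline void initializeAnalysisA(ToyRegistry &r) {
  r.registerPass("analysis-a");
}

inline void initializeAnalysisB(ToyRegistry &r) {
  initializeAnalysisA(r);  // declared dependency
  r.registerPass("analysis-b");
}

inline void initializeTransformC(ToyRegistry &r) {
  initializeAnalysisA(r);  // declared dependencies are initialized first
  initializeAnalysisB(r);
  r.registerPass("transform-c");
}

// The pass calls its own initialization function from its constructor, so no
// static constructor is needed to register it.
struct TransformC {
  explicit TransformC(ToyRegistry &r) { initializeTransformC(r); }
};
```

Constructing `TransformC` any number of times leaves exactly three entries in the registry, with dependencies registered before the pass that needs them.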
  16. Oct 12, 2010
  17. Oct 08, 2010
  18. Oct 02, 2010
  19. Sep 30, 2010
  20. Sep 28, 2010
  21. Sep 10, 2010
    • Teach if-converter to be more careful with predicating instructions that would · bf407075
      Evan Cheng authored
      take multiple cycles to decode.
      For the current if-converter clients (actually only ARM), the instructions that
      are predicated on false are not nops. They would still take machine cycles to
      decode. Micro-coded instructions such as LDM / STM can potentially take multiple
      cycles to decode. The if-converter should not treat them as non-micro-coded
      simple instructions.
      
      llvm-svn: 113570
  22. Aug 06, 2010
  23. Jul 22, 2010
  24. Jun 29, 2010
    • Reapply my if-conversion cleanup from svn r106939 with fixes. · 1e5da550
      Bob Wilson authored
      There are 2 changes relative to the previous version of the patch:
      
      1) For the "simple" if-conversion case, there's no need to worry about
      RemoveExtraEdges not handling an unanalyzable branch.  Predicated terminators
      are ignored in this context, so RemoveExtraEdges does the right thing.
      This might break someday if we ever treat indirect branches (BRIND) as
      predicable, but for now, I just removed this part of the patch, because
      in the case where we do not add an unconditional branch, we rely on keeping
      the fall-through edge to CvtBBI (which is empty after this transformation).
      
      The change relative to the previous patch is:
      
      @@ -1036,10 +1036,6 @@
           IterIfcvt = false;
         }
       
      -  // RemoveExtraEdges won't work if the block has an unanalyzable branch,
      -  // which is typically the case for IfConvertSimple, so explicitly remove
      -  // CvtBBI as a successor.
      -  BBI.BB->removeSuccessor(CvtBBI->BB);
         RemoveExtraEdges(BBI);
       
         // Update block info. BB can be iteratively if-converted.
      
      
      2) My patch exposed a bug in the code for merging the tail of a "diamond",
      which had previously never been exercised.  The code was simply checking that
      the tail had a single predecessor, but there was a case in
      MultiSource/Benchmarks/VersaBench/dbms where that single predecessor was
      neither edge of the diamond.  I added the following change to check for
      that:
      
      @@ -1276,7 +1276,18 @@
         // tail, add a unconditional branch to it.
         if (TailBB) {
           BBInfo TailBBI = BBAnalysis[TailBB->getNumber()];
      -    if (TailBB->pred_size() == 1 && !TailBBI.HasFallThrough) {
      +    bool CanMergeTail = !TailBBI.HasFallThrough;
      +    // There may still be a fall-through edge from BBI1 or BBI2 to TailBB;
      +    // check if there are any other predecessors besides those.
      +    unsigned NumPreds = TailBB->pred_size();
      +    if (NumPreds > 1)
      +      CanMergeTail = false;
      +    else if (NumPreds == 1 && CanMergeTail) {
      +      MachineBasicBlock::pred_iterator PI = TailBB->pred_begin();
      +      if (*PI != BBI1->BB && *PI != BBI2->BB)
      +        CanMergeTail = false;
      +    }
      +    if (CanMergeTail) {
             MergeBlocks(BBI, TailBBI);
             TailBBI.IsDone = true;
           } else {
      
      With these fixes, I was able to run all the SingleSource and MultiSource
      tests successfully.
      
      llvm-svn: 107110
  25. Jun 28, 2010
  26. Jun 26, 2010