    Fix the time regression I introduced in 464.h264ref with
    my last patch to this file.
    Dale Johannesen authored · 12d031b7
    
    The issue there was that all uses of an IV inside a loop
    are actually references to Base[IV*2], and there was one
    use outside that was the same but LSR didn't see the base
    or the scaling because it didn't recurse into uses outside
    the loop; thus, it used base+IV*scale mode inside the loop
    instead of pulling base out of the loop.  This was extra bad
    because register pressure later forced both base and IV into
    memory.  Doing that recursion, at least enough
    to figure out addressing modes, is a good idea in general;
    the change in AddUsersIfInteresting does this.  However,
    there were side effects....
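
A rough source-level sketch of the shape described above, not the actual 464.h264ref code: every in-loop access is base[i * 2], and one use after the loop has the same base and scale. The function names and data layout are hypothetical. The second function shows by hand the form wanted from LSR, with the base folded into a pointer IV so the loop body needs no base+IV*scale addressing.

```cpp
#include <cstddef>

// Hypothetical illustration: all in-loop uses of the IV are base[i * 2],
// and the single use after the loop has the same base and the same scale.
int sum_indexed(const int* base, std::size_t n, int* after) {
    int sum = 0;
    std::size_t i = 0;
    for (; i < n; ++i)
        sum += base[i * 2];   // in-loop use: base + i*2
    *after = base[i * 2];     // out-of-loop use of the same IV
    return sum;
}

// Hand-written equivalent of the strength-reduced form: the base is
// pulled out of the loop and a pointer IV steps by the scaled stride.
int sum_reduced(const int* base, std::size_t n, int* after) {
    int sum = 0;
    const int* p = base;
    const int* end = base + 2 * n;
    for (; p != end; p += 2)
        sum += *p;            // plain dereference, no scaled index
    *after = *p;              // the use after the loop reuses the same pointer
    return sum;
}
```

Both functions read the same elements; the caller must supply at least 2*n + 1 ints.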
    
    It is also possible for recursing outside the loop to
    introduce another IV where there was only 1 before (if
    the refs inside are not scaled and the ref outside is).
    I don't think this is a common case, but it's in the testsuite.
    It is right to be very aggressive about getting rid of
    such introduced IVs (CheckForIVReuse and the handling of
    nonzero RewriteFactor in StrengthReduceStridedIVUsers).
    In the testcase in question the new IV produced this way
    has both a nonconstant stride and a nonzero base, neither
    of which was handled before.  (This patch does not handle 
    all the cases where this can happen.)  And when inserting
    new code that feeds into a PHI, it's right to put such
    code at the original location rather than in the PHI's
    immediate predecessor(s) when the original location is
    outside the loop (a case that couldn't happen before);
    RewriteInstructionToUseNewBase now does this, which avoids
    making multiple copies of the code.
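
As a hedged source-level illustration of the "introduced IV" case (hypothetical names, not the testsuite code): the in-loop reference is unscaled, the one reference after the loop is scaled. The second function shows what keeping the extra IV amounts to, a second counter updated on every iteration only to serve the single use outside the loop.

```cpp
#include <cstddef>

// Hypothetical shape: unscaled use of i inside the loop (byte elements),
// scaled use (i * 2) only after the loop.
int lookup_after_scan(const unsigned char* flags, const int* table,
                      std::size_t n) {
    std::size_t i = 0;
    while (i < n && flags[i] != 0)  // in-loop use: flags + i, no scaling
        ++i;
    return table[i * 2];            // out-of-loop use: table + i*2
}

// The form to avoid: an extra IV j (always equal to i * 2) is carried
// through the loop even though only the code after the loop needs it.
int lookup_after_scan_two_ivs(const unsigned char* flags, const int* table,
                              std::size_t n) {
    std::size_t i = 0, j = 0;
    while (i < n && flags[i] != 0) {
        ++i;
        j += 2;                     // extra work on every iteration
    }
    return table[j];
}
```

The two functions return the same value; the first computes the scaled index once, outside the loop.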
    
    Everything above is exercised in
    CodeGen/X86/lsr-negative-stride.ll (and ifcvt4 in ARM which is
    the same IR).
    
    llvm-svn: 61178