  1. Oct 17, 2012
  2. Oct 16, 2012
    • [InstCombine] Teach InstCombine how to handle an obfuscated splat. · 02a1141e
      Michael Gottesman authored
      An obfuscated splat is one where the frontend generates poor code for a splat,
      using several different shuffles to build it, e.g.,
      
        %A = load <4 x float>* %in_ptr, align 16
        %B = shufflevector <4 x float> %A, <4 x float> undef, <4 x i32> <i32 0, i32 0, i32 undef, i32 undef>
        %C = shufflevector <4 x float> %B, <4 x float> %A, <4 x i32> <i32 0, i32 1, i32 4, i32 undef>
        %D = shufflevector <4 x float> %C, <4 x float> %A, <4 x i32> <i32 0, i32 1, i32 2, i32 4>
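
The shuffle chain above can be traced by hand. The sketch below is a minimal Python model of shufflevector semantics, not InstCombine's code: the helper name `shufflevector`, the `UNDEF` marker, and the symbolic lane names are all assumptions made for illustration. It shows that %D ends up holding element 0 of %A in every lane, which is what a single splat shuffle would produce.

```python
UNDEF = None  # stands in for an undef lane or an undef mask index


def shufflevector(v1, v2, mask):
    """Model of LLVM's shufflevector: index i < len(v1) selects v1[i],
    larger indices select from v2, and an undef index yields an undef lane."""
    combined = list(v1) + list(v2)
    return [UNDEF if i is UNDEF else combined[i] for i in mask]


# Symbolic lanes of %A (hypothetical names, used only to trace data flow).
A = ["a0", "a1", "a2", "a3"]
undef_vec = [UNDEF] * 4

B = shufflevector(A, undef_vec, [0, 0, UNDEF, UNDEF])
C = shufflevector(B, A, [0, 1, 4, UNDEF])
D = shufflevector(C, A, [0, 1, 2, 4])

# A direct splat of lane 0, for comparison.
splat = shufflevector(A, undef_vec, [0, 0, 0, 0])
print(D == splat)
```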
      
      llvm-svn: 166061
    • Simplify code. No functionality change. · 8f46e914
      Jakub Staszak authored
      llvm-svn: 166053
    • Check .rela instead of ELF64 for the compensation value resetting · d6f3168a
      Michael Liao authored
      llvm-svn: 166051
    • 80-col fixup. · 25dcab1e
      Jakub Staszak authored
      llvm-svn: 166050
    • Teach DAG combine to fold (trunc (fptoXi x)) to (fptoXi x) · 19006206
      Michael Liao authored
      llvm-svn: 166049
    • Switch back to the old coalescer for now to fix the 32-bit · b58be2c5
      Rafael Espindola authored
      llvm+clang+compiler-rt bootstrap.

      llvm-svn: 166046
    • Support v8f32 to v8i8/v8i16 conversion through custom lowering · 02ca3454
      Michael Liao authored
      - Add custom FP_TO_SINT on v8i16 (and on v8i8, which is legalized as v8i16 due to
        element-wise vector widening) to reduce the DAG combiner work, and its overhead,
        added in the X86 backend.

      llvm-svn: 166036
    • Bill Wendling's avatar
      53a6f63c
    • Speculatively fix the mask constants to be of type uintptr_t. · 544284eb
      Owen Anderson authored
      I don't know of any case where the old form was incorrect, but I'm more
      confident that such cases don't exist in this version.

      llvm-svn: 166031
    • Dmitri Gribenko's avatar
      610a86e6
    • This patch addresses PR13949. · 48081cad
      Bill Schmidt authored
      For the PowerPC 64-bit ELF Linux ABI, aggregates of size less than 8
      bytes are to be passed in the low-order bits ("right-adjusted") of the
      doubleword register or memory slot assigned to them.  A previous patch
      addressed this for aggregates passed in registers.  However, small
      aggregates passed in the overflow portion of the parameter save area are
      still being passed left-adjusted.
      
      The fix is made in PPCTargetLowering::LowerCall_Darwin_Or_64SVR4 on the
      caller side, and in PPCTargetLowering::LowerFormalArguments_64SVR4 on
      the callee side.  The main fix on the callee side simply extends
      existing logic for 1- and 2-byte objects to 1- through 7-byte objects,
      and corrects a constant left over from 32-bit code.  There is also a
      fix to a bogus calculation of the offset to the following argument in
      the parameter save area.
      
      On the caller side, again a constant left over from 32-bit code is
      fixed.  Additionally, some code for 1, 2, and 4-byte objects is
      duplicated to handle the 3, 5, 6, and 7-byte objects for SVR4 only.  The
      LowerCall_Darwin_Or_64SVR4 logic is getting fairly convoluted trying to
      handle both ABIs, and I propose to separate this into two functions in a
      future patch, at which time the duplication can be removed.
      
      The patch adds a new test (structsinmem.ll) to demonstrate correct
      passing of structures of all seven sizes.  Eight dummy parameters are
      used to force these structures to be in the overflow portion of the
      parameter save area.
      
      As a side effect, this corrects the case when aggregates passed in
      registers are saved into the first eight doublewords of the parameter
      save area:  Previously they were stored left-justified, and now are
      properly stored right-justified.  This requires changing the expected
      output of existing test case structsinregs.ll.
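
The right-adjustment rule described in this message can be illustrated with a small sketch. This is not the PPCTargetLowering code: the function names are hypothetical, and it assumes a big-endian memory layout in which the low-order bits of a doubleword occupy its highest byte addresses, so an aggregate of 1 to 7 bytes starts at byte offset 8 minus its size within the slot.

```python
SLOT_SIZE = 8  # bytes in one doubleword of the parameter save area


def right_adjusted_offset(size):
    """Byte offset inside the doubleword at which a small aggregate starts.

    Only aggregates smaller than one doubleword are shifted; larger ones
    start at the beginning of their first slot.
    """
    if size < 1:
        raise ValueError("aggregate size must be positive")
    return SLOT_SIZE - size if size < SLOT_SIZE else 0


def store_right_adjusted(data):
    """Place a 1- to 7-byte aggregate in a zero-filled doubleword slot."""
    if not 1 <= len(data) < SLOT_SIZE:
        raise ValueError("expected an aggregate of 1 to 7 bytes")
    slot = bytearray(SLOT_SIZE)
    offset = right_adjusted_offset(len(data))
    slot[offset:offset + len(data)] = data
    return bytes(slot)


# A 3-byte structure lands in the last three bytes of its slot.
print(store_right_adjusted(b"\x01\x02\x03").hex())
```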
      
      llvm-svn: 166022
    • Issue: · e59a920b
      Stepan Dyatkovskiy authored
      Stack is formed improperly for long structures passed as byval arguments for
      EABI mode.
      
      The AAPCS reference contains the following statements:
      
      A: "If the argument requires double-word alignment (8-byte), the NCRN (Next
      Core Register Number) is rounded up to the next even register number." (5.5
      Parameter Passing, Stage C, C.3).
      
      B: "The alignment of an aggregate shall be the alignment of its most-aligned
      component." (4.3 Composite Types, 4.3.1 Aggregates).
      
      So if we have a structure with doubles (9 double fields) and 3 unused core
      registers (r1, r2, r3), the caller should use only the r2 and r3 registers.
      Currently the r1, r2, r3 set is used, which is invalid.

      The callee VA routine should also use only the r2 and r3 registers. All is OK
      here: this behaviour is achieved by rounding up the SP address with ADD+BFC
      operations.
      
      Fix:
      The main fix is in ARMTargetLowering::HandleByVal. If we detect AAPCS mode and
      8-byte alignment, we skip (waste) the odd register.
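
The register rounding quoted from AAPCS rule C.3 can be sketched as follows. This is an illustration only, not the HandleByVal implementation: the function name and the `last_reg` parameter are hypothetical, and the part of the structure that does not fit in registers is assumed to go on the stack.

```python
def registers_for_byval(ncrn, align_bytes, last_reg=3):
    """Return the core register numbers available to a byval aggregate.

    ncrn is the Next Core Register Number. For an 8-byte-aligned aggregate
    it is rounded up to the next even register, so an odd register is skipped.
    """
    if align_bytes == 8 and ncrn % 2 == 1:
        ncrn += 1  # the odd register is wasted
    return list(range(ncrn, last_reg + 1))


# Structure of doubles (8-byte aligned) with r1, r2, r3 unused: only r2, r3.
print(registers_for_byval(1, 8))
# A 4-byte-aligned aggregate may still start at r1.
print(registers_for_byval(1, 4))
```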
      
      P.S.:
      I also improved the LDRB_POST_IMM regression test, since the ldrb instruction
      is no longer generated by the current regression test after this patch.
      
      llvm-svn: 166018
    • Reapply r165661, Patch by Shuxin Yang <shuxin.llvm@gmail.com>. · 1705a999
      NAKAMURA Takumi authored
      Original message:
      
      The attached is the fix to radar://11663049. The optimization can be outlined by the following rules:

         (select (x != c), e, c) -> (select (x != c), e, x),
         (select (x == c), c, e) -> (select (x == c), x, e)
      where <c> is an integer constant.

      The reason for this change is that, on x86, a conditional move from a constant needs two instructions,
      whereas a conditional move from a register needs only one.
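
Both rewrite rules preserve the result because the arm that held the constant is only selected when x equals c. The sketch below checks this exhaustively over a small range of integers. The helper `select` and the function names are hypothetical and model only the value semantics, not the x86 lowering.

```python
def select(cond, if_true, if_false):
    """Value-level model of a select: pick one operand based on cond."""
    return if_true if cond else if_false


def ne_original(x, c, e):
    return select(x != c, e, c)


def ne_rewritten(x, c, e):
    return select(x != c, e, x)  # c replaced by x: x == c on this arm


def eq_original(x, c, e):
    return select(x == c, c, e)


def eq_rewritten(x, c, e):
    return select(x == c, x, e)  # c replaced by x: x == c on this arm


values = range(-4, 5)
rules_hold = all(
    ne_original(x, c, e) == ne_rewritten(x, c, e)
    and eq_original(x, c, e) == eq_rewritten(x, c, e)
    for x in values for c in values for e in values
)
print(rules_hold)
```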
      
      While LowerSELECT() sounds like the most convenient place for this optimization, it turns out to be a bad place. The reason is that replacing the constant <c> with a symbolic value obscures some instruction-combining opportunities which would otherwise be very easy to spot. For that reason, I have to postpone the change to the last instruction-combining phase.

      The change passes the tests of "make check-all -C <build-root>/test" and "make -C project/test-suite/SingleSource".
      
      Original message since r165661:
      
      My previous change had a bug: I negated the condition code of a CMOV, and then went ahead and created a new CMOV using the *ORIGINAL* condition code.
      
      llvm-svn: 166017
    • Cleanup whitespace. · 118a78b9
      Bill Wendling authored
      llvm-svn: 166016
    • Fix a bug in the set(I,E)/reset(I,E) methods that I recently added. · 04b8daa9
      Owen Anderson authored
      The boundary condition for checking whether I and E were in the same word was incorrect, and, beyond that, the mask computation was not using a wide enough constant.
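
The commit does not show the code, so the following is only a sketch of how a ranged set over a word-based bit vector can handle the two issues mentioned: deciding whether both ends fall in the same word, and building masks as wide as a full word. The names `set_range`, `WORD_BITS`, and `WORD_MASK` are hypothetical, the range is assumed to be half-open, [i, e), and Python's unbounded integers are masked explicitly to imitate 64-bit words.

```python
WORD_BITS = 64
WORD_MASK = (1 << WORD_BITS) - 1  # full-width mask, all 64 bits set


def set_range(words, i, e):
    """Set bits i..e-1 in a bit vector stored as a list of 64-bit words."""
    if i >= e:
        return
    first = i // WORD_BITS
    last = (e - 1) // WORD_BITS  # word holding the last bit that is set
    # Bits from i's position upward within a word.
    lo_mask = (WORD_MASK << (i % WORD_BITS)) & WORD_MASK
    # Bits below e's position; a full word when e is word-aligned.
    hi_mask = WORD_MASK >> ((WORD_BITS - e % WORD_BITS) % WORD_BITS)
    if first == last:
        words[first] |= lo_mask & hi_mask
        return
    words[first] |= lo_mask
    for w in range(first + 1, last):
        words[w] = WORD_MASK
    words[last] |= hi_mask


bits = [0, 0, 0]
set_range(bits, 60, 70)  # crosses the boundary between word 0 and word 1
print([hex(w) for w in bits])
```

Computing `last` from e - 1 rather than e keeps a range that ends exactly on a word boundary inside a single word, which is one way to avoid an off-by-one in the same-word check.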
      
      llvm-svn: 166015
    • Cleanup whitespace. · a529ade5
      Bill Wendling authored
      llvm-svn: 166013