  1. Sep 13, 2012
  2. Sep 12, 2012
  3. Sep 11, 2012
  4. Sep 10, 2012
  5. Sep 08, 2012
  6. Sep 07, 2012
  7. Sep 06, 2012
  8. Sep 05, 2012
  9. Sep 04, 2012
    • Generic Bypass Slow Div · cdf540d5
      Preston Gurd authored
      - CodeGenPrepare pass for identifying div/rem ops
      - Backend specifies the type mapping using addBypassSlowDivType
      - Enabled only for Intel Atom at O2, bypassing 32-bit division with 8-bit division
      - Replace IDIV with instructions that test the operand values and use DIVB if they
      are positive and less than 256.
      - When both the quotient and the remainder of a divide are used, a DIV
      and a REM instruction are present in the IR. In the non-Atom case
      both are lowered to IDIVs, and CSE removes the redundant IDIV instruction
      by reusing the quotient and remainder from the first IDIV. With this
      optimization, however, CSE cannot eliminate the redundant
      IDIV instructions because they are located in different basic blocks.
      This is overcome by calculating both the quotient (DIV) and the remainder (REM)
      in each basic block that the optimization inserts, and reusing the result
      values when a subsequent DIV or REM instruction uses the same operands.
      - Test cases check for the presence of the optimization when calculating
      either the quotient, the remainder, or both.
      
      Patch by Tyler Nowicki!
      
      llvm-svn: 163150
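
The bypass described in this commit message can be sketched in plain C++. This is only an illustration of the idea, not the code from the patch: `bypassSlowDiv` and `DivRem` are hypothetical names, the OR-and-mask range test is an assumption about how "positive and less than 256" might be checked, and zero is treated as fitting the fast path for the dividend. Both quotient and remainder are returned together, mirroring the message's point that each inserted block computes both.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical result type: quotient and remainder are produced together,
// so a later DIV or REM on the same operands can reuse them.
struct DivRem {
    int32_t quot;
    int32_t rem;
};

// Hypothetical helper illustrating the 32-bit -> 8-bit bypass.
DivRem bypassSlowDiv(int32_t dividend, int32_t divisor) {
    assert(divisor != 0 && "division by zero is not handled in this sketch");

    // Assumed test: OR the operands and check that no bit above the low
    // eight is set. That holds only when both values lie in [0, 255].
    uint32_t bits = static_cast<uint32_t>(dividend) |
                    static_cast<uint32_t>(divisor);
    if ((bits & ~0xFFu) == 0) {
        // Fast path: stands in for the 8-bit DIVB.
        uint8_t a = static_cast<uint8_t>(dividend);
        uint8_t b = static_cast<uint8_t>(divisor);
        return {a / b, a % b};
    }

    // Slow path: stands in for the full-width IDIV.
    return {dividend / divisor, dividend % divisor};
}
```

Small operands such as `bypassSlowDiv(200, 7)` take the 8-bit path, while a negative or large operand such as `bypassSlowDiv(-200, 7)` falls through to the full-width division; both paths return the same values a plain `/` and `%` would.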
    • This patch optimizes a shuffle instruction: it generates 2 instructions instead of 4. · cbe99bbb
      Elena Demikhovsky authored
      Since this specific shuffle is widely used in many workloads, we gain ~10% performance on them.
      
      shufflevector <8 x float> %A, <8 x float> %B, <8 x i32> <i32 0, i32 8, i32 2, i32 10, i32 4, i32 12, i32 6, i32 14>
      
      Code generated before this patch:
      vmovaps (%rdx), %ymm0
      vshufps $8, %ymm0, %ymm0, %ymm0
      vmovaps (%rcx), %ymm1
      vshufps $8, %ymm0, %ymm1, %ymm1
      vunpcklps       %ymm0, %ymm1, %ymm0

      Code generated after this patch:
      vmovaps (%rcx), %ymm0
      vmovsldup       (%rdx), %ymm1
      vblendps        $85, %ymm0, %ymm1, %ymm0
      
      llvm-svn: 163134
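
To see why a `vmovsldup` followed by a `vblendps $85` can implement this shuffle mask, the sketch below models the three operations on 8-element arrays in scalar C++. All function names are hypothetical, and the code is a lane-by-lane model rather than real AVX intrinsics: `movsldup` copies each even-indexed element into the following odd slot, and `blend` picks the second source wherever the immediate's bit is set.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

using Vec8 = std::array<float, 8>;

// Reference semantics of the shufflevector mask <0, 8, 2, 10, 4, 12, 6, 14>:
// indices 0-7 select from a, indices 8-15 select from b.
Vec8 shuffleReference(const Vec8& a, const Vec8& b) {
    static constexpr std::size_t mask[8] = {0, 8, 2, 10, 4, 12, 6, 14};
    Vec8 r{};
    for (std::size_t i = 0; i < 8; ++i) {
        r[i] = mask[i] < 8 ? a[mask[i]] : b[mask[i] - 8];
    }
    return r;
}

// Scalar model of vmovsldup: each even-indexed element is duplicated
// into the odd slot that follows it.
Vec8 movsldup(const Vec8& v) {
    Vec8 r{};
    for (std::size_t i = 0; i < 8; ++i) {
        r[i] = v[i & ~static_cast<std::size_t>(1)];
    }
    return r;
}

// Scalar model of vblendps: bit i of imm set selects whenSet[i],
// bit i clear selects whenClear[i].
Vec8 blend(const Vec8& whenClear, const Vec8& whenSet, std::uint8_t imm) {
    Vec8 r{};
    for (std::size_t i = 0; i < 8; ++i) {
        r[i] = ((imm >> i) & 1u) ? whenSet[i] : whenClear[i];
    }
    return r;
}

// The two-instruction form: duplicate b's even elements, then blend in
// a's even elements with immediate 85 (binary 01010101).
Vec8 shuffleOptimized(const Vec8& a, const Vec8& b) {
    return blend(movsldup(b), a, 85);
}
```

For any inputs, `shuffleOptimized(a, b)` should produce the same array as `shuffleReference(a, b)`: even lanes come from `a`, and each odd lane receives the preceding even element of `b`.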