Skip to content
  1. Sep 26, 2012
  2. Sep 04, 2012
    • Preston Gurd's avatar
      Generic Bypass Slow Div · cdf540d5
      Preston Gurd authored
      - CodeGenPrepare pass for identifying div/rem ops
      - Backend specifies the type mapping using addBypassSlowDivType
      - Enabled only for Intel Atom with O2 32-bit -> 8-bit
      - Replace IDIV with instructions which test its value and use DIVB if the value
      is positive and less than 256.
      - In the case when the quotient and remainder of a divide are used a DIV
      and a REM instruction will be present in the IR. In the non-Atom case
      they are both lowered to IDIVs and CSE removes the redundant IDIV instruction,
      using the quotient and remainder from the first IDIV. However,
      due to this optimization CSE is not able to eliminate redundant
      IDIV instructions because they are located in different basic blocks.
      This is overcome by calculating both the quotient (DIV) and remainder (REM)
      in each basic block that is inserted by the optimization and reusing the result
      values when a subsequent DIV or REM instruction uses the same operands.
      - Test cases check for the presents of the optimization when calculating
      either the quotient, remainder,  or both.
      
      Patch by Tyler Nowicki!
      
      llvm-svn: 163150
      cdf540d5
  3. Aug 30, 2012
    • Michael Liao's avatar
      Introduce 'UseSSEx' to force SSE legacy encoding · bbd10792
      Michael Liao authored
      - Add 'UseSSEx' to force SSE legacy insn not being selected when AVX is
        enabled.
      
        As the penalty of inter-mixing SSE and AVX instructions, we need
        prevent SSE legacy insn from being generated except explicitly
        specified through some intrinsics. For patterns supported by both
        SSE and AVX, so far, we force AVX insn will be tried first relying on
        AddedComplexity or position in td file. It's error-prone and
        introduces bugs accidentally.
      
        'UseSSEx' is disabled when AVX is turned on. For SSE insns inherited
        by AVX, we need this predicate to force VEX encoding or SSE legacy
        encoding only.
      
        For insns not inherited by AVX, we still use the previous predicates,
        i.e. 'HasSSEx'. So far, these insns fall into the following
        categories:
        * SSE insns with MMX operands
        * SSE insns with GPR/MEM operands only (xFENCE, PREFETCH, CLFLUSH,
          CRC, and etc.)
        * SSE4A insns.
        * MMX insns.
        * x87 insns added by SSE.
      
      2 test cases are modified:
      
       - test/CodeGen/X86/fast-isel-x86-64.ll
         AVX code generation is different from SSE one. 'vcvtsi2sdq' cannot be
         selected by fast-isel due to complicated pattern and fast-isel
         fallback to materialize it from constant pool.
      
       - test/CodeGen/X86/widen_load-1.ll
         AVX code generation is different from SSE one after fixing SSE/AVX
         inter-mixing. Exec-domain fixing prefers 'vmovapd' instead of
         'vmovaps'.
      
      llvm-svn: 162919
      bbd10792
  4. Aug 24, 2012
  5. Aug 23, 2012
  6. Aug 01, 2012
  7. Jun 03, 2012
  8. May 31, 2012
  9. Apr 23, 2012
    • Preston Gurd's avatar
      This patch fixes a problem which arose when using the Post-RA scheduler · 9a091475
      Preston Gurd authored
      on X86 Atom. Some of our tests failed because the tail merging part of
      the BranchFolding pass was creating new basic blocks which did not
      contain live-in information. When the anti-dependency code in the Post-RA
      scheduler ran, it would sometimes rename the register containing
      the function return value because the fact that the return value was
      live-in to the subsequent block had been lost. To fix this, it is necessary
      to run the RegisterScavenging code in the BranchFolding pass.
      
      This patch makes sure that the register scavenging code is invoked
      in the X86 subtarget only when post-RA scheduling is being done.
      Post RA scheduling in the X86 subtarget is only done for Atom.
      
      This patch adds a new function to the TargetRegisterClass to control
      whether or not live-ins should be preserved during branch folding.
      This is necessary in order for the anti-dependency optimizations done
      during the PostRASchedulerList pass to work properly when doing
      Post-RA scheduling for the X86 in general and for the Intel Atom in particular.
      
      The patch adds and invokes the new function trackLivenessAfterRegAlloc()
      instead of using the existing requiresRegisterScavenging().
      It changes BranchFolding.cpp to call trackLivenessAfterRegAlloc() instead of
      requiresRegisterScavenging(). It changes the all the targets that
      implemented requiresRegisterScavenging() to also implement
      trackLivenessAfterRegAlloc().  
      
      It adds an assertion in the Post RA scheduler to make sure that post RA
      liveness information is available when it is needed.
      
      It changes the X86 break-anti-dependencies test to use –mcpu=atom, in order
      to avoid running into the added assertion.
      
      Finally, this patch restores the use of anti-dependency checking
      (which was turned off temporarily for the 3.1 release) for
      Intel Atom in the Post RA scheduler.
      
      Patch by Andy Zhang!
      
      Thanks to Jakob and Anton for their reviews.
      
      llvm-svn: 155395
      9a091475
  10. Mar 17, 2012
  11. Feb 19, 2012
  12. Feb 18, 2012
  13. Feb 07, 2012
  14. Feb 05, 2012
  15. Feb 02, 2012
    • Andrew Trick's avatar
      Instruction scheduling itinerary for Intel Atom. · 8523b16f
      Andrew Trick authored
      Adds an instruction itinerary to all x86 instructions, giving each a default latency of 1, using the InstrItinClass IIC_DEFAULT.
      
      Sets specific latencies for Atom for the instructions in files X86InstrCMovSetCC.td, X86InstrArithmetic.td, X86InstrControl.td, and X86InstrShiftRotate.td. The Atom latencies for the remainder of the x86 instructions will be set in subsequent patches.
      
      Adds a test to verify that the scheduler is working.
      
      Also changes the scheduling preference to "Hybrid" for i386 Atom, while leaving x86_64 as ILP.
      
      Patch by Preston Gurd!
      
      llvm-svn: 149558
      8523b16f
  16. Jan 10, 2012
  17. Jan 09, 2012
  18. Dec 09, 2011
  19. Dec 08, 2011
    • Evan Cheng's avatar
      Many of the SSE patterns should not be selected when AVX is available. This... · 4d1a2d44
      Evan Cheng authored
      Many of the SSE patterns should not be selected when AVX is available. This led to the following code in X86Subtarget.cpp
      
        if (HasAVX)
          X86SSELevel = NoMMXSSE;
      
      This is so patterns that are predicated on hasSSE3, etc. would not be selected when avx is available. Instead, the AVX variant is selected.
      However, this breaks instructions which do not have AVX variants.
      
      The right way to fix this is for the SSE but not-AVX patterns to predicate on something like hasSSE3() && !hasAVX().
      Then we can take out the hack in X86Subtarget.cpp. Patterns which do not have AVX variants do not need to change.
      
      However, we need to audit all the patterns before we make the change. This patch is workaround that fixes one specific case,
      the prefetch instructions. rdar://10538297
      
      llvm-svn: 146163
      4d1a2d44
  20. Dec 02, 2011
  21. Nov 22, 2011
  22. Oct 30, 2011
  23. Oct 18, 2011
  24. Oct 16, 2011
  25. Oct 14, 2011
  26. Oct 13, 2011
  27. Oct 11, 2011
  28. Oct 09, 2011
  29. Oct 03, 2011
  30. Sep 05, 2011
  31. Aug 26, 2011
  32. Jul 20, 2011
  33. Jul 09, 2011
  34. Jul 07, 2011
  35. Jul 02, 2011
  36. Jul 01, 2011
Loading