  1. Jul 28, 2013
  2. Jul 24, 2013
  3. May 03, 2013
  4. Apr 25, 2013
  5. Apr 19, 2013
  6. Mar 29, 2013
  7. Mar 28, 2013
  8. Mar 27, 2013
    • Preston Gurd authored · 663e6f95
      For the current Atom processor, the fastest way to handle a call
      indirect through a memory address is to load the memory address into
      a register and then call indirect through the register.
      
      This patch implements this improvement by modifying SelectionDAG to
      force a function address which is a memory reference to be loaded
      into a virtual register.
      
      Patch by Sriram Murali.
      
      llvm-svn: 178171
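The two code shapes the commit contrasts can be illustrated at the C level. This is only a sketch (the function names are hypothetical, and C source does not dictate the exact instructions a backend emits): calling through a pointer first copied into a local mirrors the load-then-`call *%reg` sequence the patch forces, versus the single memory-operand form `call *(addr)`.

```c
#include <assert.h>

static int add_one(int x) { return x + 1; }

/* A function pointer held in memory, e.g. a global table slot. */
static int (*fn_in_memory)(int) = add_one;

int call_via_register(int x) {
    /* Loading the pointer into a local first corresponds to the
     * codegen the patch forces for Atom:
     *     mov  fn_in_memory(%rip), %reg
     *     call *%reg
     * rather than the memory-indirect form  call *fn_in_memory(%rip). */
    int (*fn)(int) = fn_in_memory;
    return fn(x);
}
```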
  9. Mar 26, 2013
  10. Feb 14, 2013
  11. Jan 08, 2013
    • Pad Short Functions for Intel Atom · a01daace
      Preston Gurd authored
      The current Intel Atom microarchitecture has a feature whereby,
      when a function returns early, it is slightly faster to execute
      a sequence of NOP instructions to wait until the return address is ready,
      as opposed to simply stalling on the ret instruction until
      the return address is ready.
      
      When compiling for X86 Atom only, this patch will run a pass,
      called "X86PadShortFunction" which will add NOP instructions where less
      than four cycles elapse between function entry and return.
      
      It includes tests.
      
      This patch has been updated to address Nadav's review comments
      - Optimize only at >= O1 and don't do optimization if -Os is set
      - Stores MachineBasicBlock* instead of BBNum
      - Uses DenseMap instead of std::map
      - Fixes placement of braces
      
      Patch by Andy Zhang.
      
      llvm-svn: 171879
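The padding rule the message describes can be modeled in a few lines. This is an illustrative sketch only, with a hypothetical helper name; the real X86PadShortFunction pass walks MachineBasicBlocks and estimates cycle counts from instruction latencies.

```c
#include <assert.h>

/* Illustrative model of the padding rule: given the estimated cycle
 * count from function entry to a ret, return how many one-cycle NOPs
 * to insert so that at least four cycles elapse before the ret.
 * (Hypothetical helper; not the pass's actual code.) */
static unsigned nops_needed(unsigned cycles_to_ret) {
    const unsigned threshold = 4; /* from the commit message */
    return cycles_to_ret < threshold ? threshold - cycles_to_ret : 0;
}
```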
  12. Jan 05, 2013
    • Revert revision 171524. Original message: · 478b6a47
      Nadav Rotem authored
      URL: http://llvm.org/viewvc/llvm-project?rev=171524&view=rev
      Log:
      The current Intel Atom microarchitecture has a feature whereby when a function
      returns early then it is slightly faster to execute a sequence of NOP
      instructions to wait until the return address is ready,
      as opposed to simply stalling on the ret instruction
      until the return address is ready.
      
      When compiling for X86 Atom only, this patch will run a pass, called
      "X86PadShortFunction" which will add NOP instructions where less than four
      cycles elapse between function entry and return.
      
      It includes tests.
      
      Patch by Andy Zhang.
      
      llvm-svn: 171603
  13. Jan 04, 2013
    • The current Intel Atom microarchitecture has a feature whereby when a function · e36b685a
      Preston Gurd authored
      returns early then it is slightly faster to execute a sequence of NOP
      instructions to wait until the return address is ready,
      as opposed to simply stalling on the ret instruction
      until the return address is ready.
      
      When compiling for X86 Atom only, this patch will run a pass, called
      "X86PadShortFunction" which will add NOP instructions where less than four
      cycles elapse between function entry and return.
      
      It includes tests.
      
      Patch by Andy Zhang.
      
      llvm-svn: 171524
  14. Dec 15, 2012
    • Make '-mtune=x86_64' assume fast unaligned memory accesses. · 7a28f954
      Chandler Carruth authored
      Not all chips targeted by x86_64 have this feature, but a dramatically
      increasing number do. Specifying a chip-specific tuning parameter will
      continue to turn the feature on or off as appropriate for that
      particular chip, but the generic flag should try to achieve the best
      performance on the most widely available hardware. Today, the number of
      chips with fast UA access dwarfs those without in the x86-64 space.
      
      Note that this also brings LLVM's code generation for this '-march' flag
      more in line with that of modern GCCs. Reviewed by Dan Gohman.
      
      llvm-svn: 170269
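The feature being toggled affects how code like the following is lowered. A minimal sketch, assuming a hypothetical helper name: on subtargets with fast unaligned access, the `memcpy` below can be lowered to a single (possibly unaligned) `mov`, rather than a sequence of narrower aligned loads.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Read a 32-bit value at an arbitrary (possibly unaligned) byte
 * address. The memcpy is the portable way to express this; on chips
 * with fast unaligned memory access the backend can emit one mov. */
static uint32_t load_u32_unaligned(const unsigned char *p) {
    uint32_t v;
    memcpy(&v, p, sizeof v);
    return v;
}
```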
  15. Dec 10, 2012
    • Revert "Make '-mtune=x86_64' assume fast unaligned memory accesses." · 867c7bff
      Chandler Carruth authored
      Accidental commit... git svn betrayed me. Sorry for the noise.
      
      llvm-svn: 169741
    • Make '-mtune=x86_64' assume fast unaligned memory accesses. · 7eaa45c7
      Chandler Carruth authored
      Summary:
      Not all chips targeted by x86_64 have this feature, but a dramatically
      increasing number do. Specifying a chip-specific tuning parameter will
      continue to turn the feature on or off as appropriate for that
      particular chip, but the generic flag should try to achieve the best
      performance on the most widely available hardware. Today, the number of
      chips with fast UA access dwarfs those without in the x86-64 space.
      
      Note that this also brings LLVM's code generation for this '-march' flag
      more in line with that of modern GCCs.
      
      CC: llvm-commits
      
      Differential Revision: http://llvm-reviews.chandlerc.com/D195
      
      llvm-svn: 169740
    • Address a FIXME and update the fast unaligned memory feature for newer Intel chips. · 0f585581
      Chandler Carruth authored
      
      The model number rules were determined by inspecting Intel's
      documentation for their newer chip model numbers. My understanding is
      that all of the newer Intel chips have fast unaligned memory access, but
      if anyone is concerned about a particular chip, just shout.
      
      No tests updated; it's not clear we have dedicated tests for the chips'
      various features, but if anyone would like tests (or can point me at
      some existing ones), I'm happy to oblige.
      
      llvm-svn: 169730
  16. Nov 08, 2012
    • Add support of RTM from TSX extension · 73cffddb
      Michael Liao authored
      - Add RTM code generation support through 3 X86 intrinsics:
        xbegin()/xend() to start/end a transaction region, and xabort() to abort a
        transaction region
      
      llvm-svn: 167573
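A typical use of the three intrinsics from C looks like the sketch below. This is illustrative only: the transactional path is guarded by the compiler's `__RTM__` macro (defined when building with RTM support, e.g. `-mrtm`), with a plain fallback so the example also builds and runs without TSX hardware.

```c
#include <assert.h>
#if defined(__RTM__)
#include <immintrin.h>   /* _xbegin, _xend, _XBEGIN_STARTED */
#endif

static int counter = 0;

/* Increment the counter inside an RTM transaction when available,
 * falling back to a plain increment otherwise. Illustrative sketch. */
static void increment_transactionally(void) {
#if defined(__RTM__)
    unsigned status = _xbegin();          /* start transaction region */
    if (status == _XBEGIN_STARTED) {
        counter++;                        /* transactional work */
        _xend();                          /* commit the region */
        return;
    }
    /* Transaction aborted (hardware abort or _xabort): fall through. */
#endif
    counter++;                            /* non-transactional fallback */
}
```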
  17. Oct 25, 2012
  18. Oct 03, 2012
  19. Sep 12, 2012
  20. Sep 04, 2012
    • Generic Bypass Slow Div · cdf540d5
      Preston Gurd authored
      - CodeGenPrepare pass for identifying div/rem ops
      - Backend specifies the type mapping using addBypassSlowDivType
      - Enabled only for Intel Atom with O2 32-bit -> 8-bit
      - Replace IDIV with instructions which test its value and use DIVB if the value
      is positive and less than 256.
      - In the case when the quotient and remainder of a divide are used a DIV
      and a REM instruction will be present in the IR. In the non-Atom case
      they are both lowered to IDIVs and CSE removes the redundant IDIV instruction,
      using the quotient and remainder from the first IDIV. However,
      due to this optimization CSE is not able to eliminate redundant
      IDIV instructions because they are located in different basic blocks.
      This is overcome by calculating both the quotient (DIV) and remainder (REM)
      in each basic block that is inserted by the optimization and reusing the result
      values when a subsequent DIV or REM instruction uses the same operands.
      - Test cases check for the presence of the optimization when calculating
      either the quotient, remainder, or both.
      
      Patch by Tyler Nowicki!
      
      llvm-svn: 163150
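The transform the commit describes can be sketched directly in C (illustrative only, with a hypothetical function name; the real pass rewrites IR in CodeGenPrepare): when both 32-bit operands fit in 8 bits, the cheap 8-bit divide (DIVB on x86) replaces the much slower 32-bit divide on Atom.

```c
#include <assert.h>
#include <stdint.h>

/* Sketch of the bypass-slow-division idea: test the operands and use
 * the 8-bit divide when both values are less than 256, otherwise fall
 * back to the general-purpose 32-bit divide. */
static uint32_t div_bypass(uint32_t a, uint32_t b) {
    if ((a | b) < 256u) {
        /* Both operands fit in one byte: take the fast 8-bit path,
         * which lowers to DIVB on x86. */
        return (uint32_t)((uint8_t)a / (uint8_t)b);
    }
    return a / b;   /* slow general-purpose divide (IDIV/DIV) */
}
```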
  21. Aug 16, 2012
  22. Jul 07, 2012
    • I'm introducing a new machine model to simultaneously allow simple · 87255e34
      Andrew Trick authored
      subtarget CPU descriptions and support new features of
      MachineScheduler.
      
      MachineModel has three categories of data:
      1) Basic properties for coarse grained instruction cost model.
      2) Scheduler Read/Write resources for simple per-opcode and operand cost model (TBD).
      3) Instruction itineraries for detailed per-cycle reservation tables.
      
      These will all live side-by-side. Any subtarget can use any
      combination of them. Instruction itineraries will not change in the
      near term. In the long run, I expect them to only be relevant for
      in-order VLIW machines that have complex constraints and require a
      precise scheduling/bundling model. Once itineraries are only actively
      used by VLIW-ish targets, they could be replaced by something more
      appropriate for those targets.
      
      This tablegen backend rewrite sets things up for introducing
      MachineModel type #2: per opcode/operand cost model.
      
      llvm-svn: 159891
  23. Jun 03, 2012
  24. May 31, 2012
  25. May 01, 2012
  26. Apr 26, 2012
  27. Feb 18, 2012
  28. Feb 07, 2012
  29. Feb 02, 2012
    • Instruction scheduling itinerary for Intel Atom. · 8523b16f
      Andrew Trick authored
      Adds an instruction itinerary to all x86 instructions, giving each a default latency of 1, using the InstrItinClass IIC_DEFAULT.
      
      Sets specific latencies for Atom for the instructions in files X86InstrCMovSetCC.td, X86InstrArithmetic.td, X86InstrControl.td, and X86InstrShiftRotate.td. The Atom latencies for the remainder of the x86 instructions will be set in subsequent patches.
      
      Adds a test to verify that the scheduler is working.
      
      Also changes the scheduling preference to "Hybrid" for i386 Atom, while leaving x86_64 as ILP.
      
      Patch by Preston Gurd!
      
      llvm-svn: 149558
  30. Jan 12, 2012
  31. Jan 10, 2012