Skip to content
  1. Apr 25, 2013
  2. Jan 08, 2013
    • Preston Gurd's avatar
      Pad Short Functions for Intel Atom · a01daace
      Preston Gurd authored
      The current Intel Atom microarchitecture has a feature whereby
      when a function returns early then it is slightly faster to execute
      a sequence of NOP instructions to wait until the return address is ready,
      as opposed to simply stalling on the ret instruction until
      the return address is ready.
      
      When compiling for X86 Atom only, this patch will run a pass,
      called "X86PadShortFunction" which will add NOP instructions where less
      than four cycles elapse between function entry and return.
      
      It includes tests.
      
      This patch has been updated to address Nadav's review comments
      - Optimize only at >= O1 and don't do optimization if -Os is set
      - Stores MachineBasicBlock* instead of BBNum
      - Uses DenseMap instead of std::map
      - Fixes placement of braces
      
      Patch by Andy Zhang.
      
      llvm-svn: 171879
      a01daace
  3. Jan 07, 2013
    • Chandler Carruth's avatar
      Switch TargetTransformInfo from an immutable analysis pass that requires · 664e354d
      Chandler Carruth authored
      a TargetMachine to construct (and thus isn't always available), to an
      analysis group that supports layered implementations much like
      AliasAnalysis does. This is a pretty massive change, with a few parts
      that I was unable to easily separate (sorry), so I'll walk through it.
      
      The first step of this conversion was to make TargetTransformInfo an
      analysis group, and to sink the nonce implementations in
      ScalarTargetTransformInfo and VectorTargetTranformInfo into
      a NoTargetTransformInfo pass. This allows other passes to add a hard
      requirement on TTI, and assume they will always get at least on
      implementation.
      
      The TargetTransformInfo analysis group leverages the delegation chaining
      trick that AliasAnalysis uses, where the base class for the analysis
      group delegates to the previous analysis *pass*, allowing all but tho
      NoFoo analysis passes to only implement the parts of the interfaces they
      support. It also introduces a new trick where each pass in the group
      retains a pointer to the top-most pass that has been initialized. This
      allows passes to implement one API in terms of another API and benefit
      when some other pass above them in the stack has more precise results
      for the second API.
      
      The second step of this conversion is to create a pass that implements
      the TargetTransformInfo analysis using the target-independent
      abstractions in the code generator. This replaces the
      ScalarTargetTransformImpl and VectorTargetTransformImpl classes in
      lib/Target with a single pass in lib/CodeGen called
      BasicTargetTransformInfo. This class actually provides most of the TTI
      functionality, basing it upon the TargetLowering abstraction and other
      information in the target independent code generator.
      
      The third step of the conversion adds support to all TargetMachines to
      register custom analysis passes. This allows building those passes with
      access to TargetLowering or other target-specific classes, and it also
      allows each target to customize the set of analysis passes desired in
      the pass manager. The baseline LLVMTargetMachine implements this
      interface to add the BasicTTI pass to the pass manager, and all of the
      tools that want to support target-aware TTI passes call this routine on
      whatever target machine they end up with to add the appropriate passes.
      
      The fourth step of the conversion created target-specific TTI analysis
      passes for the X86 and ARM backends. These passes contain the custom
      logic that was previously in their extensions of the
      ScalarTargetTransformInfo and VectorTargetTransformInfo interfaces.
      I separated them into their own file, as now all of the interface bits
      are private and they just expose a function to create the pass itself.
      Then I extended these target machines to set up a custom set of analysis
      passes, first adding BasicTTI as a fallback, and then adding their
      customized TTI implementations.
      
      The fourth step required logic that was shared between the target
      independent layer and the specific targets to move to a different
      interface, as they no longer derive from each other. As a consequence,
      a helper functions were added to TargetLowering representing the common
      logic needed both in the target implementation and the codegen
      implementation of the TTI pass. While technically this is the only
      change that could have been committed separately, it would have been
      a nightmare to extract.
      
      The final step of the conversion was just to delete all the old
      boilerplate. This got rid of the ScalarTargetTransformInfo and
      VectorTargetTransformInfo classes, all of the support in all of the
      targets for producing instances of them, and all of the support in the
      tools for manually constructing a pass based around them.
      
      Now that TTI is a relatively normal analysis group, two things become
      straightforward. First, we can sink it into lib/Analysis which is a more
      natural layer for it to live. Second, clients of this interface can
      depend on it *always* being available which will simplify their code and
      behavior. These (and other) simplifications will follow in subsequent
      commits, this one is clearly big enough.
      
      Finally, I'm very aware that much of the comments and documentation
      needs to be updated. As soon as I had this working, and plausibly well
      commented, I wanted to get it committed and in front of the build bots.
      I'll be doing a few passes over documentation later if it sticks.
      
      Commits to update DragonEgg and Clang will be made presently.
      
      llvm-svn: 171681
      664e354d
  4. Jan 05, 2013
    • Nadav Rotem's avatar
      Revert revision 171524. Original message: · 478b6a47
      Nadav Rotem authored
      URL: http://llvm.org/viewvc/llvm-project?rev=171524&view=rev
      Log:
      The current Intel Atom microarchitecture has a feature whereby when a function
      returns early then it is slightly faster to execute a sequence of NOP
      instructions to wait until the return address is ready,
      as opposed to simply stalling on the ret instruction
      until the return address is ready.
      
      When compiling for X86 Atom only, this patch will run a pass, called
      "X86PadShortFunction" which will add NOP instructions where less than four
      cycles elapse between function entry and return.
      
      It includes tests.
      
      Patch by Andy Zhang.
      
      llvm-svn: 171603
      478b6a47
  5. Jan 04, 2013
    • Preston Gurd's avatar
      The current Intel Atom microarchitecture has a feature whereby when a function · e36b685a
      Preston Gurd authored
      returns early then it is slightly faster to execute a sequence of NOP
      instructions to wait until the return address is ready,
      as opposed to simply stalling on the ret instruction
      until the return address is ready.
      
      When compiling for X86 Atom only, this patch will run a pass, called
      "X86PadShortFunction" which will add NOP instructions where less than four
      cycles elapse between function entry and return.
      
      It includes tests.
      
      Patch by Andy Zhang.
      
      llvm-svn: 171524
      e36b685a
  6. Nov 26, 2012
    • Chad Rosier's avatar
      Remove the X86 Maximal Stack Alignment Check pass as it is no longer necessary. · 4179e3f5
      Chad Rosier authored
      This pass was conservative in that it always reserved the FP to enable dynamic
      stack realignment, which allowed the RA to use aligned spills for vector
      registers.  This happens even when spills were not necessary.  The RA has 
      since been improved to use unaligned spills when necessary.
      
      The new behavior is to realign the stack if the frame pointer was already
      reserved for some other reason, but don't reserve the frame pointer just
      because a function contains vector virtual registers.
      
      Part of rdar://12719844
      
      llvm-svn: 168627
      4179e3f5
  7. Aug 01, 2012
  8. Jun 01, 2012
    • Hans Wennborg's avatar
      Implement the local-dynamic TLS model for x86 (PR3985) · 789acfb6
      Hans Wennborg authored
      This implements codegen support for accesses to thread-local variables
      using the local-dynamic model, and adds a clean-up pass so that the base
      address for the TLS block can be re-used between local-dynamic access on
      an execution path.
      
      llvm-svn: 157818
      789acfb6
  9. Mar 17, 2012
  10. Sep 28, 2011
  11. Aug 23, 2011
    • Bruno Cardoso Lopes's avatar
      Introduce a pass to insert vzeroupper instructions to avoid AVX to · 2a3ffb5d
      Bruno Cardoso Lopes authored
      SSE transition penalty. The pass is enabled through the "x86-use-vzeroupper"
      llc command line option. This is only the first step (very naive and
      conservative one) to sketch out the idea, but proper DFA is coming next
      to allow smarter decisions. Comments and ideas now and in further commits
      will be very appreciated.
      
      llvm-svn: 138317
      2a3ffb5d
  12. Jul 25, 2011
  13. Jul 11, 2011
    • Evan Cheng's avatar
      - Eliminate MCCodeEmitter's dependency on TargetMachine. It now uses MCInstrInfo · c5e6d2f5
      Evan Cheng authored
        and MCSubtargetInfo.
      - Added methods to update subtarget features (used when targets automatically
        detect subtarget features or switch modes).
      - Teach X86Subtarget to update MCSubtargetInfo features bits since the
        MCSubtargetInfo layer can be shared with other modules.
      - These fixes .code 16 / .code 32 support since mode switch is updated in
        MCSubtargetInfo so MC code emitter can do the right thing.
      
      llvm-svn: 134884
      c5e6d2f5
  14. Jul 07, 2011
  15. Jun 28, 2011
  16. Jun 25, 2011
  17. Jun 24, 2011
    • Evan Cheng's avatar
      Starting to refactor Target to separate out code that's needed to fully describe · 24753317
      Evan Cheng authored
      target machine from those that are only needed by codegen. The goal is to
      sink the essential target description into MC layer so we can start building
      MC based tools without needing to link in the entire codegen.
      
      First step is to refactor TargetRegisterInfo. This patch added a base class
      MCRegisterInfo which TargetRegisterInfo is derived from. Changed TableGen to
      separate register description from the rest of the stuff.
      
      llvm-svn: 133782
      24753317
  18. Dec 20, 2010
  19. Jul 16, 2010
  20. Jul 10, 2010
  21. Apr 06, 2010
    • Jim Grosbach's avatar
      Fix PR6696 and PR6663 · 4dac8906
      Jim Grosbach authored
      When a frame pointer is not otherwise required, and dynamic stack alignment
      is necessary solely due to the spilling of a register with larger alignment
      requirements than the default stack alignment, the frame pointer can be both
      used as a general purpose register and a frame pointer. That goes poorly, for
      obvious reasons. This patch brings back a bit of old logic for identifying
      the use of such registers and conservatively reserves the frame pointer
      during register allocation in such cases.
      
      For now, implement for X86 only since it's 32-bit linux which is hitting this,
      and we want a targeted fix for 2.7. As a follow-on, this will be expanded
      to handle other targets, as theoretically the problem could arise elsewhere
      as well.
      
      llvm-svn: 100559
      4dac8906
  22. Mar 25, 2010
    • Jakob Stoklund Olesen's avatar
      Add a late SSEDomainFix pass that twiddles SSE instructions to avoid domain crossings. · 49e121d5
      Jakob Stoklund Olesen authored
      On Nehalem and newer CPUs there is a 2 cycle latency penalty on using a register
      in a different domain than where it was defined. Some instructions have
      equvivalents for different domains, like por/orps/orpd.
      
      The SSEDomainFix pass tries to minimize the number of domain crossings by
      changing between equvivalent opcodes where possible.
      
      This is a work in progress, in particular the pass doesn't do anything yet. SSE
      instructions are tagged with their execution domain in TableGen using the last
      two bits of TSFlags. Note that not all instructions are tagged correctly. Life
      just isn't that simple.
      
      The SSE execution domain issue is very similar to the ARM NEON/VFP pipeline
      issue handled by NEONMoveFixPass. This pass may become target independent to
      handle both.
      
      llvm-svn: 99524
      49e121d5
  23. Mar 24, 2010
  24. Mar 11, 2010
  25. Feb 21, 2010
  26. Feb 13, 2010
  27. Feb 05, 2010
  28. Feb 03, 2010
  29. Feb 02, 2010
  30. Dec 02, 2009
  31. Aug 27, 2009
    • Daniel Dunbar's avatar
      llvm-mc/X86: Implement single instruction encoding interface for MC. · 981a71c3
      Daniel Dunbar authored
       - Note, this is a gigantic hack, with the sole purpose of unblocking further
         work on the assembler (its also possible to test the mathcer more completely
         now).
      
       - Despite being a hack, its actually good enough to work over all of 403.gcc
         (although some encodings are probably incorrect). This is a testament to the 
         beauty of X86's MachineInstr, no doubt! ;)
      
      llvm-svn: 80234
      981a71c3
  32. Jul 25, 2009
  33. Jul 19, 2009
  34. Jul 15, 2009
    • Daniel Dunbar's avatar
      Reapply TargetRegistry refactoring commits. · e833810a
      Daniel Dunbar authored
      --- Reverse-merging r75799 into '.':
       U   test/Analysis/PointerTracking
      U    include/llvm/Target/TargetMachineRegistry.h
      U    include/llvm/Target/TargetMachine.h
      U    include/llvm/Target/TargetRegistry.h
      U    include/llvm/Target/TargetSelect.h
      U    tools/lto/LTOCodeGenerator.cpp
      U    tools/lto/LTOModule.cpp
      U    tools/llc/llc.cpp
      U    lib/Target/PowerPC/PPCTargetMachine.h
      U    lib/Target/PowerPC/AsmPrinter/PPCAsmPrinter.cpp
      U    lib/Target/PowerPC/PPCTargetMachine.cpp
      U    lib/Target/PowerPC/PPC.h
      U    lib/Target/ARM/ARMTargetMachine.cpp
      U    lib/Target/ARM/AsmPrinter/ARMAsmPrinter.cpp
      U    lib/Target/ARM/ARMTargetMachine.h
      U    lib/Target/ARM/ARM.h
      U    lib/Target/XCore/XCoreTargetMachine.cpp
      U    lib/Target/XCore/XCoreTargetMachine.h
      U    lib/Target/PIC16/PIC16TargetMachine.cpp
      U    lib/Target/PIC16/PIC16TargetMachine.h
      U    lib/Target/Alpha/AsmPrinter/AlphaAsmPrinter.cpp
      U    lib/Target/Alpha/AlphaTargetMachine.cpp
      U    lib/Target/Alpha/AlphaTargetMachine.h
      U    lib/Target/X86/X86TargetMachine.h
      U    lib/Target/X86/X86.h
      U    lib/Target/X86/AsmPrinter/X86ATTAsmPrinter.h
      U    lib/Target/X86/AsmPrinter/X86AsmPrinter.cpp
      U    lib/Target/X86/AsmPrinter/X86IntelAsmPrinter.h
      U    lib/Target/X86/X86TargetMachine.cpp
      U    lib/Target/MSP430/MSP430TargetMachine.cpp
      U    lib/Target/MSP430/MSP430TargetMachine.h
      U    lib/Target/CppBackend/CPPTargetMachine.h
      U    lib/Target/CppBackend/CPPBackend.cpp
      U    lib/Target/CBackend/CTargetMachine.h
      U    lib/Target/CBackend/CBackend.cpp
      U    lib/Target/TargetMachine.cpp
      U    lib/Target/IA64/IA64TargetMachine.cpp
      U    lib/Target/IA64/AsmPrinter/IA64AsmPrinter.cpp
      U    lib/Target/IA64/IA64TargetMachine.h
      U    lib/Target/IA64/IA64.h
      U    lib/Target/MSIL/MSILWriter.cpp
      U    lib/Target/CellSPU/SPUTargetMachine.h
      U    lib/Target/CellSPU/SPU.h
      U    lib/Target/CellSPU/AsmPrinter/SPUAsmPrinter.cpp
      U    lib/Target/CellSPU/SPUTargetMachine.cpp
      U    lib/Target/Mips/AsmPrinter/MipsAsmPrinter.cpp
      U    lib/Target/Mips/MipsTargetMachine.cpp
      U    lib/Target/Mips/MipsTargetMachine.h
      U    lib/Target/Mips/Mips.h
      U    lib/Target/Sparc/AsmPrinter/SparcAsmPrinter.cpp
      U    lib/Target/Sparc/SparcTargetMachine.cpp
      U    lib/Target/Sparc/SparcTargetMachine.h
      U    lib/ExecutionEngine/JIT/TargetSelect.cpp
      U    lib/Support/TargetRegistry.cpp
      
      llvm-svn: 75820
      e833810a
Loading