  1. Jul 24, 2013
  2. Jul 12, 2013
    • Charles Davis's avatar
      Target/X86: Add explicit Win64 and System V/x86-64 calling conventions. · e8f297ca
      Charles Davis authored
      Summary:
      This patch adds explicit calling convention types for the Win64 and
      System V/x86-64 ABIs. This allows code to override the default, and use
      the Win64 convention on a target that wants to use SysV (and
      vice-versa). This is needed to implement the `ms_abi` and `sysv_abi` GNU
      attributes.
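
As a rough illustration of what the `ms_abi` and `sysv_abi` attributes mean at the source level, the sketch below declares one function with each convention and calls both. It assumes a GCC- or Clang-compatible compiler targeting x86-64; on other targets the hypothetical `EXAMPLE_*` macros expand to nothing. The function names are invented for this example and do not come from the patch.

```cpp
#include <cassert>
#include <cstdint>

// Assumption: the GNU attributes are only meaningful on x86-64 with a
// GCC/Clang-compatible compiler, so they are compiled out elsewhere.
#if defined(__x86_64__) && (defined(__GNUC__) || defined(__clang__))
#define EXAMPLE_MS_ABI __attribute__((ms_abi))
#define EXAMPLE_SYSV_ABI __attribute__((sysv_abi))
#else
#define EXAMPLE_MS_ABI
#define EXAMPLE_SYSV_ABI
#endif

// Win64 convention: the first integer arguments arrive in RCX and RDX.
EXAMPLE_MS_ABI std::int64_t add_win64(std::int64_t a, std::int64_t b) {
  return a + b;
}

// System V x86-64 convention: the first integer arguments arrive in RDI and RSI.
EXAMPLE_SYSV_ABI std::int64_t add_sysv(std::int64_t a, std::int64_t b) {
  return a + b;
}

// The caller needs no special code: the compiler places the arguments
// according to the convention declared on each callee.
std::int64_t call_both(std::int64_t a, std::int64_t b) {
  std::int64_t w = add_win64(a, b);
  std::int64_t s = add_sysv(a, b);
  assert(w == s);  // same result, different register assignment at the call
  return w + s;
}
```

Both functions compute the same value; only the registers used to pass `a` and `b` differ, which is visible in the generated assembly rather than in the program's output.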
      
      Reviewers:
      
      CC:
      
      llvm-svn: 186144
      e8f297ca
  3. Jun 25, 2013
  4. Jun 24, 2013
  5. Apr 25, 2013
  6. Mar 29, 2013
  7. Mar 27, 2013
    • Preston Gurd's avatar
      · 663e6f95
      Preston Gurd authored
      For the current Atom processor, the fastest way to handle a call
      indirect through a memory address is to load the memory address into
      a register and then call indirect through the register.
      
      This patch implements this improvement by modifying SelectionDAG to
      force a function address which is a memory reference to be loaded
      into a virtual register.
      
      Patch by Sriram Murali.
      
      llvm-svn: 178171
      663e6f95
  8. Mar 26, 2013
  9. Feb 16, 2013
  10. Feb 15, 2013
  11. Feb 14, 2013
  12. Jan 29, 2013
    • Evan Cheng's avatar
      Teach SDISel to combine fsin / fcos into a fsincos node if the following · 0e88c7d8
      Evan Cheng authored
      conditions are met:
      1. They share the same operand and are in the same BB.
      2. Both outputs are used.
      3. The target has a native instruction that maps to ISD::FSINCOS node or
         the target provides a sincos library call.
      
      Implemented the generic optimization in sdisel and enabled it for
      Mac OSX. Also added an additional optimization for x86_64 Mac OSX by
      using an alternative entry point __sincos_stret which returns the two
      results in xmm0 / xmm1.
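
For illustration, here is a source-level pattern that would meet the first two conditions listed above: `sin` and `cos` take the same operand in the same basic block, and both results are used. This is only a sketch; the `Point` type and function name are hypothetical, and whether a given compiler actually emits a combined sincos call depends on the target and compiler version.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical result type for this example.
struct Point {
  double x;
  double y;
};

// sin(theta) and cos(theta) share one operand, sit in one basic block,
// and both feed the result, so they are candidates for a single
// combined sincos node during instruction selection.
Point polar_to_cartesian(double r, double theta) {
  assert(r >= 0.0);  // the radius is assumed non-negative in this example
  double s = std::sin(theta);
  double c = std::cos(theta);
  return Point{r * c, r * s};
}
```

If only one of `s` or `c` were used, condition 2 would fail and the calls would be left as they are.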
      
      rdar://13087969
      PR13204
      
      llvm-svn: 173755
      0e88c7d8
  13. Jan 25, 2013
    • Eli Bendersky's avatar
      In this patch, we teach X86_64TargetMachine that it has an ILP32 · 597fc123
      Eli Bendersky authored
      (defined by the x32 ABI) mode, in which case its pointers are 32 bits
      in size. This knowledge is also added to X86RegisterInfo that now
      returns the appropriate registers in getPointerRegClass.
      
      There are many outcomes to this change. In order to keep the patches
      separate and manageable, we start by focusing on some simple testable
      cases. The patch adds a test with passing a pointer to a function -
      focusing on the difference between the two data models for x86-64.
      Another test is added for handling of 'sret' arguments (and
      functionality is added in X86ISelLowering to make it work).
      
      A note on naming: the "x32 ABI" document refers to the AMD64
      architecture (in LLVM it's distinguished by being is64Bits() in the
      x86 subtarget) with two variations: the LP64 (default) data model, and
      the ILP32 data model. This patch adds predicates to the subtarget
      which are consistent with this naming scheme.
      
      llvm-svn: 173503
      597fc123
  14. Jan 08, 2013
    • Preston Gurd's avatar
      Pad Short Functions for Intel Atom · a01daace
      Preston Gurd authored
      The current Intel Atom microarchitecture has a feature whereby
      when a function returns early then it is slightly faster to execute
      a sequence of NOP instructions to wait until the return address is ready,
      as opposed to simply stalling on the ret instruction until
      the return address is ready.
      
      When compiling for X86 Atom only, this patch will run a pass,
      called "X86PadShortFunction" which will add NOP instructions where less
      than four cycles elapse between function entry and return.
      
      It includes tests.
      
      This patch has been updated to address Nadav's review comments
      - Optimize only at >= O1 and don't do optimization if -Os is set
      - Stores MachineBasicBlock* instead of BBNum
      - Uses DenseMap instead of std::map
      - Fixes placement of braces
      
      Patch by Andy Zhang.
      
      llvm-svn: 171879
      a01daace
  15. Jan 05, 2013
    • Nadav Rotem's avatar
      Revert revision 171524. Original message: · 478b6a47
      Nadav Rotem authored
      URL: http://llvm.org/viewvc/llvm-project?rev=171524&view=rev
      Log:
      The current Intel Atom microarchitecture has a feature whereby when a function
      returns early then it is slightly faster to execute a sequence of NOP
      instructions to wait until the return address is ready,
      as opposed to simply stalling on the ret instruction
      until the return address is ready.
      
      When compiling for X86 Atom only, this patch will run a pass, called
      "X86PadShortFunction" which will add NOP instructions where less than four
      cycles elapse between function entry and return.
      
      It includes tests.
      
      Patch by Andy Zhang.
      
      llvm-svn: 171603
      478b6a47
  16. Jan 04, 2013
    • Preston Gurd's avatar
      The current Intel Atom microarchitecture has a feature whereby when a function · e36b685a
      Preston Gurd authored
      returns early then it is slightly faster to execute a sequence of NOP
      instructions to wait until the return address is ready,
      as opposed to simply stalling on the ret instruction
      until the return address is ready.
      
      When compiling for X86 Atom only, this patch will run a pass, called
      "X86PadShortFunction" which will add NOP instructions where less than four
      cycles elapse between function entry and return.
      
      It includes tests.
      
      Patch by Andy Zhang.
      
      llvm-svn: 171524
      e36b685a
  17. Jan 02, 2013
    • Chandler Carruth's avatar
      Move all of the header files which are involved in modelling the LLVM IR · 9fb823bb
      Chandler Carruth authored
      into their new header subdirectory: include/llvm/IR. This matches the
      directory structure of lib, and begins to correct a long standing point
      of file layout clutter in LLVM.
      
      There are still more header files to move here, but I wanted to handle
      them in separate commits to make tracking what files make sense at each
      layer easier.
      
      The only really questionable files here are the target intrinsic
      tablegen files. But that's a battle I'd rather not fight today.
      
      I've updated both CMake and Makefile build systems (I think, and my
      tests think, but I may have missed something).
      
      I've also re-sorted the includes throughout the project. I'll be
      committing updates to Clang, DragonEgg, and Polly momentarily.
      
      llvm-svn: 171366
      9fb823bb
  18. Dec 04, 2012
  19. Nov 29, 2012
  20. Nov 08, 2012
    • Michael Liao's avatar
      Add support of RTM from TSX extension · 73cffddb
      Michael Liao authored
      - Add RTM code generation support through 3 X86 intrinsics:
        xbegin()/xend() to start/end a transaction region, and xabort() to abort a
        transaction region
      
      llvm-svn: 167573
      73cffddb
  21. Oct 08, 2012
  22. Oct 02, 2012
    • Andrew Kaylor's avatar
      Support for generating ELF objects on Windows. · feb805fc
      Andrew Kaylor authored
      This adds 'elf' as a recognized target triple environment value and overrides the default generated object format on Windows platforms if that value is present.  This patch also enables MCJIT tests on Windows using the new environment value.
      
      llvm-svn: 165030
      feb805fc
  23. Sep 26, 2012
  24. Sep 04, 2012
    • Preston Gurd's avatar
      Generic Bypass Slow Div · cdf540d5
      Preston Gurd authored
      - CodeGenPrepare pass for identifying div/rem ops
      - Backend specifies the type mapping using addBypassSlowDivType
      - Enabled only for Intel Atom with O2 32-bit -> 8-bit
      - Replace IDIV with instructions which test its value and use DIVB if the value
      is positive and less than 256.
      - In the case when the quotient and remainder of a divide are used a DIV
      and a REM instruction will be present in the IR. In the non-Atom case
      they are both lowered to IDIVs and CSE removes the redundant IDIV instruction,
      using the quotient and remainder from the first IDIV. However,
      due to this optimization CSE is not able to eliminate redundant
      IDIV instructions because they are located in different basic blocks.
      This is overcome by calculating both the quotient (DIV) and remainder (REM)
      in each basic block that is inserted by the optimization and reusing the result
      values when a subsequent DIV or REM instruction uses the same operands.
      - Test cases check for the presence of the optimization when calculating
      either the quotient, remainder, or both.
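
The control flow described in the list above can be sketched in C++ as follows. This is a model of the idea and not the patch's implementation: the exact range test the pass emits is not shown in this message, so the OR-and-mask check is an assumption, and `DivRem` and `bypass_slow_div` are hypothetical names. Both quotient and remainder are computed on each path, mirroring the note about reusing results within each inserted basic block.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical pair type holding both results of one division.
struct DivRem {
  std::int32_t quot;
  std::int32_t rem;
};

DivRem bypass_slow_div(std::int32_t dividend, std::int32_t divisor) {
  assert(divisor != 0);  // division by zero is outside this sketch

  // Assumed check: if no bit above the low 8 is set in either operand,
  // both values are non-negative and less than 256.
  if (((dividend | divisor) & ~0xFF) == 0) {
    // Fast path: an 8-bit unsigned divide (DIVB on x86).
    std::uint8_t a = static_cast<std::uint8_t>(dividend);
    std::uint8_t b = static_cast<std::uint8_t>(divisor);
    return DivRem{static_cast<std::int32_t>(a / b),
                  static_cast<std::int32_t>(a % b)};
  }

  // Slow path: the full 32-bit signed divide (IDIV on x86).
  return DivRem{dividend / divisor, dividend % divisor};
}
```

For example, `bypass_slow_div(100, 7)` takes the 8-bit path, while `bypass_slow_div(-100, 7)` and `bypass_slow_div(1000, 3)` fall through to the 32-bit divide.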
      
      Patch by Tyler Nowicki!
      
      llvm-svn: 163150
      cdf540d5
  25. Aug 30, 2012
    • Michael Liao's avatar
      Introduce 'UseSSEx' to force SSE legacy encoding · bbd10792
      Michael Liao authored
      - Add 'UseSSEx' to prevent SSE legacy insns from being selected when AVX
        is enabled.
      
        Because inter-mixing SSE and AVX instructions incurs a penalty, we need
        to prevent SSE legacy insns from being generated unless explicitly
        specified through some intrinsics. For patterns supported by both
        SSE and AVX, so far we have forced AVX insns to be tried first by
        relying on AddedComplexity or position in the td file. That is
        error-prone and introduces bugs accidentally.
      
        'UseSSEx' is disabled when AVX is turned on. For SSE insns inherited
        by AVX, we need this predicate to force VEX encoding or SSE legacy
        encoding only.
      
        For insns not inherited by AVX, we still use the previous predicates,
        i.e. 'HasSSEx'. So far, these insns fall into the following
        categories:
        * SSE insns with MMX operands
        * SSE insns with GPR/MEM operands only (xFENCE, PREFETCH, CLFLUSH,
          CRC, etc.)
        * SSE4A insns.
        * MMX insns.
        * x87 insns added by SSE.
      
      2 test cases are modified:
      
       - test/CodeGen/X86/fast-isel-x86-64.ll
         AVX code generation differs from the SSE one. 'vcvtsi2sdq' cannot be
         selected by fast-isel because of its complicated pattern, so fast-isel
         falls back to materializing it from the constant pool.
      
       - test/CodeGen/X86/widen_load-1.ll
         AVX code generation is different from SSE one after fixing SSE/AVX
         inter-mixing. Exec-domain fixing prefers 'vmovapd' instead of
         'vmovaps'.
      
      llvm-svn: 162919
      bbd10792
  26. Aug 24, 2012
  27. Aug 23, 2012
  28. Aug 01, 2012
  29. Jun 03, 2012
  30. May 31, 2012
  31. Apr 23, 2012
    • Preston Gurd's avatar
      This patch fixes a problem which arose when using the Post-RA scheduler · 9a091475
      Preston Gurd authored
      on X86 Atom. Some of our tests failed because the tail merging part of
      the BranchFolding pass was creating new basic blocks which did not
      contain live-in information. When the anti-dependency code in the Post-RA
      scheduler ran, it would sometimes rename the register containing
      the function return value because the fact that the return value was
      live-in to the subsequent block had been lost. To fix this, it is necessary
      to run the RegisterScavenging code in the BranchFolding pass.
      
      This patch makes sure that the register scavenging code is invoked
      in the X86 subtarget only when post-RA scheduling is being done.
      Post RA scheduling in the X86 subtarget is only done for Atom.
      
      This patch adds a new function to the TargetRegisterClass to control
      whether or not live-ins should be preserved during branch folding.
      This is necessary in order for the anti-dependency optimizations done
      during the PostRASchedulerList pass to work properly when doing
      Post-RA scheduling for the X86 in general and for the Intel Atom in particular.
      
      The patch adds and invokes the new function trackLivenessAfterRegAlloc()
      instead of using the existing requiresRegisterScavenging().
      It changes BranchFolding.cpp to call trackLivenessAfterRegAlloc() instead of
      requiresRegisterScavenging(). It changes all the targets that
      implemented requiresRegisterScavenging() to also implement
      trackLivenessAfterRegAlloc().
      
      It adds an assertion in the Post RA scheduler to make sure that post RA
      liveness information is available when it is needed.
      
      It changes the X86 break-anti-dependencies test to use -mcpu=atom, in order
      to avoid running into the added assertion.
      
      Finally, this patch restores the use of anti-dependency checking
      (which was turned off temporarily for the 3.1 release) for
      Intel Atom in the Post RA scheduler.
      
      Patch by Andy Zhang!
      
      Thanks to Jakob and Anton for their reviews.
      
      llvm-svn: 155395
      9a091475
  32. Mar 17, 2012
  33. Feb 19, 2012
  34. Feb 18, 2012
  35. Feb 07, 2012
  36. Feb 05, 2012