  1. Jul 21, 2014
    • Aaron Ballman's avatar
      Fixing an MSVC conversion warning about implicitly converting the shift... · 6c078a59
      Aaron Ballman authored
      Fixing an MSVC conversion warning about implicitly converting the shift results to 64-bits. No functional change intended.
      
      llvm-svn: 213515
      6c078a59
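The commit message does not show the affected code, so the following is only a generic sketch of this class of warning, not the actual change. MSVC warns when a shift is evaluated in 32 bits and the result is then widened to 64 bits. The helper name `bitMask` is hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not taken from the commit): returns a 64-bit value
// with only bit N set.
inline uint64_t bitMask(unsigned N) {
  assert(N < 64 && "shift amount out of range");
  // Writing `1 << N` here would shift a 32-bit int and only afterwards widen
  // the result to 64 bits, which is the pattern the warning flags. Making the
  // left operand 64 bits wide performs the shift at full width.
  return uint64_t(1) << N;
}
```

Whether a cast on the operand or on the result is appropriate depends on the intended width of the shift; a "no functional change" fix would keep whichever width the code already relied on.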
    • Hal Finkel's avatar
      Move isIdentifiedFunctionLocal from BasicAA to AA · c782aa5a
      Hal Finkel authored
      The ability to identify function locals will exist outside of BasicAA (for
      example, logic for inferring noalias function arguments will need this), so
      make this concept generally accessible without code duplication.
      
      No functionality change.
      
      llvm-svn: 213514
      c782aa5a
    • Daniel Sanders's avatar
      [mips] Try to fix the test/ExecutionEngine tests on a MIPS host. · decb7a2b
      Daniel Sanders authored
      Fix a dangerous default case that caused MipsCodeEmitter to discard pseudo
      instructions it didn't recognize. It will now call llvm_unreachable() for
      unrecognized pseudos and explicitly handle PseudoReturn, PseudoReturn64,
      PseudoIndirectBranch, PseudoIndirectBranch64, CFI_INSTRUCTION, IMPLICIT_DEF,
      and KILL.
      
      There may be other pseudos that need handling but this was enough for the
      ExecutionEngine tests to pass on my test system.
      
      llvm-svn: 213513
      decb7a2b
    • Alexey Bataev's avatar
      [OPENMP] Initial parsing and sema analysis for 'flush' directive. · 6125da92
      Alexey Bataev authored
      llvm-svn: 213512
      6125da92
    • Daniel Sanders's avatar
      [mips] Do not emit '.module [no]oddspreg' unless we really need to. · d7c27960
      Daniel Sanders authored
      We now emit this directive when we need to contradict the default value (e.g.
      -mno-odd-spreg is given) or an option changed the default value (e.g. -mfpxx
      is given).
      
      This restores support for the currently available head of binutils. However,
      at this point binutils 2.24 is still not sufficient since it does not support
      '.module fp=...'.
      
      llvm-svn: 213511
      d7c27960
    • Alexander Musman's avatar
      [OPENMP] Parsing/Sema of the OpenMP directive 'critical'. · d9ed09f7
      Alexander Musman authored
      llvm-svn: 213510
      d9ed09f7
    • Benjamin Kramer's avatar
      [clang-tidy] Fix a false positive in the make_pair checker if an argument has... · ddf36dea
      Benjamin Kramer authored
      [clang-tidy] Fix a false positive in the make_pair checker if an argument has an explicit template argument.
      
      This required a rather ugly workaround for a problem in ASTMatchers where
      callee() is only overloaded for Stmt and Decl but not for Expr.
      
      llvm-svn: 213509
      ddf36dea
    • Chandler Carruth's avatar
      FileCheck-ize a test. · efd14a62
      Chandler Carruth authored
      llvm-svn: 213508
      efd14a62
    • Tim Northover's avatar
      CodeGen: emit IR-level f16 conversion intrinsics as fptrunc/fpext · f7a02c17
      Tim Northover authored
      This makes the first stage DAG for @llvm.convert.to.fp16 an fptrunc,
      and correspondingly @llvm.convert.from.fp16 an fpext. The legalisation
      path is now uniform, regardless of the input IR:
      
        fptrunc -> FP_TO_FP16 (if f16 illegal) -> libcall
        fpext -> FP16_TO_FP (if f16 illegal) -> libcall
      
      Each target should be able to select the version that best matches its
      operations and not be required to duplicate patterns for both fptrunc
      and FP_TO_FP16 (for example).
      
      As a result we can remove some redundant AArch64 patterns.
      
      llvm-svn: 213507
      f7a02c17
    • Chandler Carruth's avatar
      [SDAG,cleanup] Switch the DAG combiner over to use the spelling · 3c0012be
      Chandler Carruth authored
      'Worklist' consistently rather than a deeply confusing mixture of
      'WorkList' and 'Worklist'.
      
      Notably, the very 'WorkList' of the DAG combiner was exposed to target
      specific DAG combines under an interface 'AddToWorklist' which was
      implemented by in turn calling 'AddToWorkList' in the combiner. This has
      sent me circling with the wrong case in grep one too many times.
      
      I chose to normalize on 'Worklist' because that one won the grep-vote
      for llvm/lib/... by a hundred hits or so, and it is used in places
      relatively "canonical" such as InstCombine's Worklist. Let's all just
      pick this casing, whether "correct", "good", or "bad" and be
      consistent...
      
      llvm-svn: 213506
      3c0012be
    • Chandler Carruth's avatar
      [SDAG] Rather than using a narrow test against the one dummy node on the · 24ceb0ce
      Chandler Carruth authored
      stack, filter all handle nodes from the DAG combiner worklist.
      
      This will also handle cases where other handle nodes might be
      (erroneously) added to the worklist and then cause bugs and explosions
      when deleted. For example, when running the legalizer within the DAG
      combiner, there are times when other handle nodes are used and can end
      up here.
      
      llvm-svn: 213505
      24ceb0ce
    • Andrea Di Biagio's avatar
      [DAGCombiner] Improve the shuffle-vector folding logic. · 0fb20131
      Andrea Di Biagio authored
      Canonicalize shuffles according to rules:
       *  shuffle(A, shuffle(A, B)) -> shuffle(shuffle(A,B), A)
       *  shuffle(B, shuffle(A, B)) -> shuffle(shuffle(A,B), B)
       *  shuffle(B, shuffle(A, Undef)) -> shuffle(shuffle(A, Undef), B)
      
      This patch helps identify more shuffle pairs that could be combined reusing
      the already existing rules in the DAGCombiner.
      
      Added new test 'combine-vec-shuffle-5.ll' to verify that the canonicalized
      shuffles are now folded into a single shuffle node by the DAGCombiner.
      Added more test cases to 'combine-vec-shuffle-4.ll'.
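The canonicalization rules above rely on the fact that a two-operand shuffle can have its operands swapped as long as the mask is rewritten accordingly. The following is a minimal stand-alone model of that property using plain vectors, not the SelectionDAG code itself; `shuffle` and `commuteMask` are hypothetical names, and undef lanes are modelled as 0.

```cpp
#include <cassert>
#include <vector>

using Vec = std::vector<int>;
using Mask = std::vector<int>; // -1 marks an undef lane

// Models a two-operand vector shuffle: mask indices 0..N-1 select from A,
// indices N..2N-1 select from B, and -1 produces an undef lane (0 here).
Vec shuffle(const Vec &A, const Vec &B, const Mask &M) {
  assert(A.size() == B.size() && M.size() == A.size());
  const int N = static_cast<int>(A.size());
  Vec R;
  R.reserve(M.size());
  for (int Idx : M) {
    assert(Idx >= -1 && Idx < 2 * N);
    if (Idx < 0)
      R.push_back(0);
    else if (Idx < N)
      R.push_back(A[Idx]);
    else
      R.push_back(B[Idx - N]);
  }
  return R;
}

// Rewrites a mask so that shuffle(A, B, M) == shuffle(B, A, commuteMask(M)).
Mask commuteMask(const Mask &M) {
  const int N = static_cast<int>(M.size());
  Mask R;
  R.reserve(M.size());
  for (int Idx : M) {
    if (Idx < 0)
      R.push_back(Idx);     // undef stays undef
    else if (Idx < N)
      R.push_back(Idx + N); // lane came from the first operand
    else
      R.push_back(Idx - N); // lane came from the second operand
  }
  return R;
}
```

With this model, `shuffle(A, shuffle(A, B, M1), M2)` and `shuffle(shuffle(A, B, M1), A, commuteMask(M2))` produce the same lanes, which is the shape of the first rule in the list.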
      
      llvm-svn: 213504
      0fb20131
    • Andrea Di Biagio's avatar
      [DAG] Refactor some logic. No functional change. · 4d8bd416
      Andrea Di Biagio authored
      This patch removes function 'CommuteVectorShuffle' from X86ISelLowering.cpp
      and moves its logic into SelectionDAG.cpp as method 'getCommutedVectorShuffles'.
      This refactoring is in preparation for an upcoming change to the DAGCombiner.
      
      llvm-svn: 213503
      4d8bd416
    • James Dennett's avatar
      1810fc93
    • Daniel Jasper's avatar
      clang-tidy: [misc-use-override] Slightly tweak the wording of warning. · 58ed9c91
      Daniel Jasper authored
      'final' should really be used with care.
      
      llvm-svn: 213501
      58ed9c91
    • James Dennett's avatar
      Add clang::DesignatedInitExpr::designators() for range-based access, · ab4ebb42
      James Dennett authored
      with overloads for designators_range and designators_const_range.
      
      llvm-svn: 213500
      ab4ebb42
    • Richard Smith's avatar
      Add missing initialization found due to a valgrind false positive. · 22fdae9b
      Richard Smith authored
      This field is never inspected in the object state initialized by this
      constructor; however, initializing it seems reasonable, since it has
      a meaningful value.
      
      llvm-svn: 213499
      22fdae9b
    • Richard Smith's avatar
      [modules] Fix some of the confusion when computing the override set for a macro · 57721ac5
      Richard Smith authored
      introduced by finalization. This is still not entirely correct; more fixes to
      follow.
      
      llvm-svn: 213498
      57721ac5
    • Gerolf Hoflehner's avatar
      Fix for regression: [Bug 20369] wrong code at -O3 on x86_64-linux-gnu in 64-bit mode · ae1ec299
      Gerolf Hoflehner authored
      Prevents hoisting of loads above stores and sinking of stores below loads
      in MergedLoadStoreMotion.cpp (rdar://15991737)
      
      llvm-svn: 213497
      ae1ec299
    • Alexey Bataev's avatar
      [OPENMP] Added several test cases for clauses 'ordered' and 'nowait': if there... · 4c904adf
      Alexey Bataev authored
      [OPENMP] Added several test cases for clauses 'ordered' and 'nowait': if there is more than one 'nowait' or 'ordered' clause, an error message is expected.
      
      llvm-svn: 213496
      4c904adf
    • Ulrich Weigand's avatar
      [PowerPC] Optimize passing certain aggregates by value · 601957fa
      Ulrich Weigand authored
      In addition to enabling ELFv2 homogeneous aggregate handling,
      LLVM support to pass array types directly also enables a performance
      enhancement.  We can now pass (non-homogeneous) aggregates that fit
      fully in registers as direct integer arrays, using an element type
      to encode the alignment requirement (that would otherwise go to the
      "byval align" field).
      
      This is preferable since "byval" forces the back-end to write the
      aggregate out to the stack, even if it could be passed fully in
      registers.  This is particularly annoying on ELFv2, if there is
      no parameter save area available, since we then need to allocate
      space on the callee's stack just to hold those aggregates.
      
      Note that to implement this optimization, this patch does not attempt
      to fully anticipate register allocation rules as (defined in the
      ABI and) implemented in the back-end.  Instead, the patch is simply
      passing *any* aggregate passed by value using the array mechanism
      if its size is up to 64 bytes.   This means that some of those will
      end up being passed in stack slots anyway, but the generated code
      shouldn't be any worse either.  (*Large* aggregates remain passed
      using "byval" to enable optimized copying via memcpy etc.)
      
      llvm-svn: 213495
      601957fa
    • Ulrich Weigand's avatar
      [PowerPC] Support the ELFv2 ABI · b712237d
      Ulrich Weigand authored
      This patch implements clang support for the PowerPC ELFv2 ABI.
      Together with a series of companion patches in LLVM, this makes
      clang/LLVM fully usable on powerpc64le-linux.
      
      Most of the ELFv2 ABI changes are fully implemented on the LLVM side.
      On the clang side, we only need to implement some changes in how
      aggregate types are passed by value.   Specifically, we need to:
      - pass (and return) "homogeneous" floating-point or vector aggregates in
        FPRs and VRs (this is similar to the ARM homogeneous aggregate ABI)
      - return aggregates of up to 16 bytes in one or two GPRs
      
      The second piece is trivial to implement in any case.  To implement
      the first piece, this patch makes use of infrastructure recently
      enabled in the LLVM PowerPC back-end to support passing array types
      directly, where the array element type encodes properties needed to
      handle homogeneous aggregates correctly.
      
      Specifically, the array element type encodes:
      - whether the parameter should be passed in FPRs, VRs, or just
        GPRs/stack slots  (for float / vector / integer element types,
        respectively)
      - what the alignment requirements of the parameter are when passed in
        GPRs/stack slots  (8 for float / 16 for vector / the element type
        size for integer element types) -- this corresponds to the
        "byval align" field
      
      With this support in place, the clang part simply needs to *detect*
      whether an aggregate type implements a float / vector homogeneous
      aggregate as defined by the ELFv2 ABI, and if so, pass/return it
      as array type using the appropriate float / vector element type.
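The element-type encoding listed above can be summarized as a small lookup. This is only a sketch of the mapping as the commit message describes it, not the clang or LLVM implementation; all type and function names are hypothetical.

```cpp
#include <cassert>

// Hypothetical model of the array element kinds named in the message.
enum class ElemKind { Float, Vector, Integer };
enum class RegClass { FPR, VR, GPR };

struct PassingInfo {
  RegClass Regs;       // register class the parameter is passed in
  unsigned AlignBytes; // alignment when it lands in GPRs/stack slots
};

// Maps the array element type to the ABI properties it encodes:
// float -> FPRs, align 8; vector -> VRs, align 16;
// integer -> GPRs/stack slots, align = element size.
PassingInfo classifyArrayElement(ElemKind Kind, unsigned ElemSizeBytes) {
  assert(ElemSizeBytes > 0 && "element must have a size");
  switch (Kind) {
  case ElemKind::Float:
    return {RegClass::FPR, 8};
  case ElemKind::Vector:
    return {RegClass::VR, 16};
  case ElemKind::Integer:
    return {RegClass::GPR, ElemSizeBytes};
  }
  return {RegClass::GPR, ElemSizeBytes}; // not reached for valid kinds
}
```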
      
      llvm-svn: 213494
      b712237d
    • Ulrich Weigand's avatar
      [PowerPC] ELFv2 aggregate passing support · 85d5df25
      Ulrich Weigand authored
      This patch adds infrastructure support for passing array types
      directly.  These can be used by the front-end to pass aggregate
      types (coerced to an appropriate array type).  The details of the
      array type being used inform the back-end about ABI-relevant
      properties.  Specifically, the array element type encodes:
      - whether the parameter should be passed in FPRs, VRs, or just
        GPRs/stack slots  (for float / vector / integer element types,
        respectively)
      - what the alignment requirements of the parameter are when passed in
        GPRs/stack slots  (8 for float / 16 for vector / the element type
        size for integer element types) -- this corresponds to the
        "byval align" field
      
      Using the infrastructure provided by this patch, a companion patch
      to clang will enable two features:
      - In the ELFv2 ABI, pass (and return) "homogeneous" floating-point
        or vector aggregates in FPRs and VRs (this is similar to the ARM
        homogeneous aggregate ABI)
      - As an optimization for both ELFv1 and ELFv2 ABIs, pass aggregates
        that fit fully in registers without using the "byval" mechanism
      
      The patch uses the functionArgumentNeedsConsecutiveRegisters callback
      to encode that special treatment is required for all directly-passed
      array types.  The isInConsecutiveRegs / isInConsecutiveRegsLast bits set
      as a result are then used to implement the required size and alignment
      rules in CalculateStackSlotSize / CalculateStackSlotAlignment etc.
      
      As a related change, the ABI routines have to be modified to support
      passing floating-point types in GPRs.  This is necessary because with
      homogeneous aggregates of 4-byte float type we can now run out of FPRs
      *before* we run out of the 64-byte argument save area that is shadowed
      by GPRs.  Any extra floating-point arguments that no longer fit in FPRs
      must now be passed in GPRs until we run out of those too.
      
      Note that there was already code to pass floating-point arguments in
      GPRs used with vararg parameters, which was done by writing the argument
      out to the argument save area first and then reloading into GPRs.  The
      patch re-implements this, however, in favor of code packing float arguments
      directly via extension/truncation, BITCAST, and BUILD_PAIR operations.
      
      This is required to support the ELFv2 ABI, since we cannot unconditionally
      write to the argument save area (which the caller might not have allocated).
      The change does, however, affect ELFv1 varargs routines too; but even here
      the overall effect should be advantageous: Instead of loading the argument
      into the FPR, then storing the argument to the stack slot, and finally
      reloading the argument from the stack slot into a GPR, the new code now
      just loads the argument into the FPR, and subsequently loads the argument
      into the GPR (via BITCAST).  That BITCAST might imply a save/reload from
      a stack temporary (in which case we're no worse than before); but it
      might be implemented more efficiently in some cases.
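As a rough illustration of what passing a float in a GPR "via BITCAST" and BUILD_PAIR amounts to at the value level, here is a stand-alone sketch. It is not the back-end code; the function names are hypothetical, and which float ends up in which half of the register depends on endianness and the ABI, which this sketch does not model.

```cpp
#include <cstdint>
#include <cstring>

// Value-level analogue of a BITCAST from f32 to i32: reinterpret the bits
// without any numeric conversion.
inline uint32_t bitcastFloat(float F) {
  static_assert(sizeof(float) == sizeof(uint32_t), "expects a 32-bit float");
  uint32_t Bits;
  std::memcpy(&Bits, &F, sizeof(Bits));
  return Bits;
}

// Value-level analogue of BUILD_PAIR: combine two 32-bit halves into one
// 64-bit register image. The choice of low/high half here is arbitrary.
inline uint64_t buildPair(uint32_t Lo, uint32_t Hi) {
  return (uint64_t(Hi) << 32) | Lo;
}
```

For example, `buildPair(bitcastFloat(1.0f), bitcastFloat(2.0f))` packs two 4-byte floats into a single 64-bit value with no round trip through memory in the source-level model.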
      
      The final part of the patch enables up to 8 FPRs and VRs for argument
      return in PPCCallingConv.td; this is required to support returning
      ELFv2 homogeneous aggregates.  (Note that this doesn't affect other ABIs
      since LLVM will only look for which register to use if the parameter is
      marked as "direct" return anyway.)
      
      Reviewed by Hal Finkel.
      
      llvm-svn: 213493
      85d5df25
    • Ulrich Weigand's avatar
      [PowerPC] ELFv2 explicit CFI for CR fields · be928cc2
      Ulrich Weigand authored
      This is a minor improvement in the ELFv2 ABI.   In ELFv1, DWARF CFI
      would represent a saved CR word (holding CR fields CR2, CR3, and CR4)
      using just a single CFI record referring to CR2.   In ELFv2 instead,
      each of the CR fields is represented by its own CFI record.  The
      advantage is that the compiler can now choose to save just a single
      (or two) CR fields instead of all of them, if those are the only ones
      that actually need saving.  That can lead to more efficient code using
      mf(o)crf instead of the (slow) mfcr instruction.
      
      Note that this patch does not (yet) implement this more efficient
      code generation, but it does implement the part that is required to
      be ABI compliant: creating multiple CFI records if multiple CR fields
      are saved.
      
      Reviewed by Hal Finkel.
      
      llvm-svn: 213492
      be928cc2
    • Ulrich Weigand's avatar
      [PowerPC] ELFv2 dynamic loader support · 752b5c9e
      Ulrich Weigand authored
      This patch enables the new ELFv2 ABI in the runtime dynamic loader.
      The loader has to implement the following features:
      - In the ELFv2 ABI, do not look up a function descriptor in .opd, but
        instead use the local entry point when resolving a direct call.
      - Update the TOC restore code to use the new TOC slot linkage area
        offset.
      - Create PLT stubs appropriate for the ELFv2 ABI.
      
      Note that this patch also adds common-code changes. These are necessary
      because the loader must check the newly added ELF flags: the e_flags
      header bits encoding the ABI version, and the st_other symbol table
      entry bits encoding the local entry point offset.  There is currently
      no way to access these, so I've added ObjectFile::getPlatformFlags and
      SymbolRef::getOther accessors.
      
      Reviewed by Hal Finkel.
      
      llvm-svn: 213491
      752b5c9e
    • Ulrich Weigand's avatar
      [PowerPC] ELFv2 stack space reduction · 8658f17e
      Ulrich Weigand authored
      The ELFv2 ABI reduces the amount of stack required to implement an
      ABI-compliant function call in two ways:
      * the "linkage area" is reduced from 48 bytes to 32 bytes by
        eliminating two unused doublewords
      * the 64-byte "parameter save area" is now optional and need not be
        present in certain cases (it remains mandatory in functions with
        variable arguments, and functions that have any parameter that is
        passed on the stack)
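The two size reductions can be put into numbers with a small sketch. The function names below are hypothetical stand-ins, not the signatures of the real getLinkageSize / getTOCSaveOffset routines. The 48/32/64-byte figures come from the message; the TOC save slot offsets (40 and 24) are not stated here and are an assumption based on my reading of the ELFv1 and ELFv2 ABIs.

```cpp
// Size of the optional (ELFv2) or mandatory (ELFv1) parameter save area.
constexpr unsigned ParamSaveAreaSize = 64;

// Linkage area size: 48 bytes in ELFv1, 32 bytes in ELFv2.
constexpr unsigned linkageSize(bool IsELFv2ABI) {
  return IsELFv2ABI ? 32 : 48;
}

// Offset of the TOC save slot within the linkage area.
// Assumption: 40 for ELFv1 and 24 for ELFv2, per the respective ABIs.
constexpr unsigned tocSaveOffset(bool IsELFv2ABI) {
  return IsELFv2ABI ? 24 : 40;
}

// Stack needed for an ABI-compliant call, ignoring everything except the
// linkage area and the parameter save area. ELFv1 always needs the save area.
constexpr unsigned minCallFrameBytes(bool IsELFv2ABI, bool NeedsParamSaveArea) {
  return linkageSize(IsELFv2ABI) +
         ((!IsELFv2ABI || NeedsParamSaveArea) ? ParamSaveAreaSize : 0);
}
```

Under these assumptions a call that needs no parameter save area drops from 112 bytes in ELFv1 to 32 bytes in ELFv2.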
      
      This patch implements the required changes:
      - reducing the linkage area, and associated relocation of the TOC save
        slot, in getLinkageSize / getTOCSaveOffset (this requires updating all
        callers of these routines to pass in the isELFv2ABI flag).
      - (partially) handling the case where the parameter save area is optional
      
      This latter part requires some extra explanation:  Currently, we still
      always allocate the parameter save area when *calling* a function.
      That is certainly always compliant with the ABI, but may cause code to
      allocate stack unnecessarily.  This can be addressed by a follow-on
      optimization patch.
      
      On the *callee* side, in LowerFormalArguments, we *must* track
      correctly whether the ABI guarantees that the caller has allocated
      the parameter save area for our use, and the patch does so. However,
      there is one complication: the code that handles incoming "byval"
      arguments will currently *always* write to the parameter save area,
      because it has to force incoming register arguments to the stack since
      it must return an *address* to implement the byval semantics.
      
      To fix this, the patch changes the LowerFormalArguments code to write
      arguments to a freshly allocated stack slot on the function's own stack
      frame instead of the argument save area in those cases where that area
      is not present.
      
      Reviewed by Hal Finkel.
      
      llvm-svn: 213490
      8658f17e
    • Ulrich Weigand's avatar
      [PowerPC] ELFv2 function call changes · aa0ac4f1
      Ulrich Weigand authored
      This patch builds upon the two preceding MC changes to implement the
      basic ELFv2 function call convention.  In the ELFv1 ABI, a "function
      descriptor" was associated with every function, pointing to both the
      entry address and the related TOC base (and a static chain pointer
      for nested functions).  Function pointers would actually refer to that
      descriptor, and the indirect call sequence needed to load up both entry
      address and TOC base.
      
      In the ELFv2 ABI, there are no more function descriptors, and function
      pointers simply refer to the (global) entry point of the function code.
      Indirect function calls simply branch to that address, after loading it
      up into r12 (as required by the ABI rules for a global entry point).
      Direct function calls continue to just do a "bl" to the target symbol;
      this will be resolved by the linker to the local entry point of the
      target function if it is local, and to a PLT stub if it is global.
      That PLT stub would then load the (global) entry point address of the
      final target into r12 and branch to it.  Note that when performing a
      local function call, r2 must be set up to point to the current TOC
      base: if the target ends up local, the ABI requires that its local
      entry point is called with r2 set up; if the target ends up global,
      the PLT stub requires that r2 is set up.
      
      This patch implements all LLVM changes to implement that scheme:
      - No longer create a function descriptor when emitting a function
        definition (in EmitFunctionEntryLabel)
      - Emit two entry points *if* the function needs the TOC base (r2)
        anywhere (this is done in EmitFunctionBodyStart; note that this cannot
        be done in EmitFunctionEntryLabel because the global entry point
        prologue code must be *part* of the function as covered by debug info).
      - In order to make use tracking of r2 (as needed above) work correctly,
        mark direct function calls as implicitly using r2.
      - Implement the ELFv2 indirect function call sequence (no function
        descriptors; load target address into r12).
      - When creating an ELFv2 object file, emit the .abiversion 2 directive
        to tell the linker to create the appropriate version of PLT stubs.  
      
      Reviewed by Hal Finkel.
      
      llvm-svn: 213489
      aa0ac4f1
    • Hal Finkel's avatar
      [LoopVectorize] Remove an unused private AA pointer · 07c9bb3d
      Hal Finkel authored
      Thanks to the lld-x86_64-darwin13 builder for catching this first.
      
      llvm-svn: 213488
      07c9bb3d
    • Ulrich Weigand's avatar
      [MC] Pass MCSymbolData to needsRelocateWithSymbol · 46797c69
      Ulrich Weigand authored
      As discussed in a previous checkin to support the .localentry
      directive on PowerPC, we need to inspect the actual target symbol
      in needsRelocateWithSymbol to make the appropriate decision based
      on that symbol's st_other bits.
      
      Currently, needsRelocateWithSymbol does not get the target symbol.
      However, it is directly available to its sole caller.  This patch
      therefore simply extends the needsRelocateWithSymbol by a new
      parameter "const MCSymbolData &SD", passes in the target symbol,
      and updates all derived implementations.
      
      In particular, in the PowerPC implementation, this patch removes
      the FIXME added by the previous checkin.
      
      llvm-svn: 213487
      46797c69
    • Hal Finkel's avatar
      [LoopVectorize] Use AA to partition potential dependency checks · 7ae00a12
      Hal Finkel authored
      Prior to this change, the loop vectorizer did not make use of the alias
      analysis infrastructure. Instead, it performed memory dependence analysis using
      ScalarEvolution-based linear dependence checks within equivalence classes
      derived from the results of ValueTracking's GetUnderlyingObjects.
      
      Unfortunately, this meant that:
        1. The loop vectorizer had logic that essentially duplicated that in BasicAA
           for aliasing based on identified objects.
        2. The loop vectorizer could not partition the space of dependency checks
           based on information only easily available from within AA (TBAA metadata is
           currently the prime example).
      
      This means, for example, regardless of whether -fno-strict-aliasing was
      provided, the vectorizer would only vectorize this loop with a runtime
      memory-overlap check:
      
      void foo(int *a, float *b) {
        for (int i = 0; i < 1600; ++i)
          a[i] = b[i];
      }
      
      This is suboptimal because the TBAA metadata already provides the information
      necessary to show that this check is unnecessary. Of course, the vectorizer has a
      limit on the number of such checks it will insert, so in practice, ignoring
      TBAA means not vectorizing more-complicated loops that we should.
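To make the "runtime memory-overlap check" concrete, here is a hand-written source-level analogue of what loop versioning does for the foo example. It is only an illustration: the vectorizer works on IR, and `rangesOverlap`, `fooVersioned` and `TripCount` are hypothetical names.

```cpp
#include <cstddef>
#include <cstdint>

constexpr int TripCount = 1600;

// Returns true if the byte ranges [P, P+PBytes) and [Q, Q+QBytes) intersect.
inline bool rangesOverlap(const void *P, std::size_t PBytes,
                          const void *Q, std::size_t QBytes) {
  const auto PB = reinterpret_cast<std::uintptr_t>(P);
  const auto QB = reinterpret_cast<std::uintptr_t>(Q);
  return PB < QB + QBytes && QB < PB + PBytes;
}

// Source-level picture of a loop guarded by a runtime overlap check.
void fooVersioned(int *a, float *b) {
  if (!rangesOverlap(a, TripCount * sizeof(int), b, TripCount * sizeof(float))) {
    // No overlap: iterations are independent, so this copy may be vectorized.
    for (int i = 0; i < TripCount; ++i)
      a[i] = static_cast<int>(b[i]);
  } else {
    // Possible overlap: fall back to the original scalar loop.
    for (int i = 0; i < TripCount; ++i)
      a[i] = static_cast<int>(b[i]);
  }
}
```

When TBAA metadata says an `int` store and a `float` load cannot alias, the guard is provably always true and only the first loop is needed, which is the saving the message describes.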
      
      This change causes the vectorizer to use an AliasSetTracker to keep track of
      the pointers in the loop. The resulting alias sets are then used to partition
      the space of dependency checks, and potential runtime checks; this results in
      more-efficient vectorizations.
      
      When pointer locations are added to the AliasSetTracker, two things are done:
        1. The location size is set to UnknownSize (otherwise you'd not catch
           inter-iteration dependencies)
        2. For instructions in blocks that would need to be predicated, TBAA is
           removed (because the metadata might have a control dependency on the condition
           being speculated).
      
      For non-predicated blocks, you can leave the TBAA metadata. This is safe
      because you can't have an iteration dependency on the TBAA metadata (if you
      did, and you unrolled sufficiently, you'd end up with the same pointer value
      used by two accesses that TBAA says should not alias, and that would yield
      undefined behavior).
      
      llvm-svn: 213486
      7ae00a12
    • Ulrich Weigand's avatar
      [PowerPC] ELFv2 MC support for .localentry directive · bb68610d
      Ulrich Weigand authored
      A second binutils feature needed to support ELFv2 is the .localentry
      directive.  In the ELFv2 ABI, functions may have two entry points:
      one for calling the routine locally via "bl", and one for calling the
      function via function pointer (either at the source level, or implicitly
      via a PLT stub for global calls).  The two entry points share a single
      ELF symbol, where the ELF symbol address identifies the global entry
      point address, while the local entry point is found by adding a delta
      offset to the symbol address.  That offset is encoded into three
      platform-specific bits of the ELF symbol st_other field.
      
      The .localentry directive instructs the assembler to set those fields
      to encode a particular offset.  This is typically used by a function
      prologue sequence like this:
      
      func:
              addis r2, r12, (.TOC.-func)@ha
              addi r2, r2, (.TOC.-func)@l
              .localentry func, .-func
      
      Note that according to the ABI, when calling the global entry point,
      r12 must be set to point to the global entry point address itself; while
      when calling the local entry point, r2 must be set to point to the TOC
      base.  The two instructions between the global and local entry point in
      the above example translate the first requirement into the second.
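The message says the local entry offset is stored in three bits of st_other but does not spell out the encoding. The sketch below assumes the scheme I understand the ELFv2 ABI and binutils to use: the field occupies the top three bits of st_other and holds roughly the log2 of the offset, with field values 0 and 1 both meaning offset 0. Treat the bit position and the names as assumptions, not as something this patch confirms.

```cpp
#include <cassert>
#include <cstdint>

// Assumed layout: three-bit field in bits 5..7 of st_other.
constexpr unsigned LocalEntryShift = 5;
constexpr uint8_t LocalEntryMask = 0x7 << LocalEntryShift; // 0xE0

// Decodes the byte offset from global to local entry point.
// Field values 0 and 1 decode to 0; 2 -> 4, 3 -> 8, 4 -> 16, and so on.
constexpr unsigned decodeLocalEntryOffset(uint8_t StOther) {
  return ((1u << ((StOther & LocalEntryMask) >> LocalEntryShift)) >> 2) << 2;
}

// Encodes an offset of 0 or a power of two between 4 and 64 bytes.
inline uint8_t encodeLocalEntryOffset(unsigned Offset) {
  unsigned Field = 0;
  if (Offset != 0) {
    assert(Offset >= 4 && Offset <= 64 && (Offset & (Offset - 1)) == 0 &&
           "offset must be 0 or a power of two in [4, 64]");
    while ((1u << Field) != Offset)
      ++Field;
  }
  return static_cast<uint8_t>(Field << LocalEntryShift);
}
```

For the prologue shown above, `.-func` spans two 4-byte instructions, so the offset is 8 and, under this assumed encoding, the field value is 3.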
      
      This patch implements support in the PowerPC MC streamers to emit the
      .localentry directive (both into assembler and ELF object output), as
      well as support in the assembler parser to parse that directive.
      
      In addition, there is another change required in MC fixup/relocation
      handling to properly deal with relocations targeting function symbols
      with two entry points: When the target function is known local, the MC
      layer would immediately handle the fixup by inserting the target
      address -- this is wrong, since the call may need to go to the local
      entry point instead.  The GNU assembler handles this case by *not*
      directly resolving fixups targeting functions with two entry points,
      but always emits the relocation and relies on the linker to handle
      this case correctly.  This patch changes LLVM MC to do the same (this
      is done via the processFixupValue routine).
      
      Similarly, there are cases where the assembler would normally emit a
      relocation, but "simplify" it to a relocation targeting a *section*
      instead of the actual symbol.  For the same reason as above, this
      may be wrong when the target symbol has two entry points.  The GNU
      assembler again handles this case by not performing this simplification
      in that case, but leaving the relocation targeting the full symbol,
      which is then resolved by the linker.  This patch changes LLVM MC
      to do the same (via the needsRelocateWithSymbol routine).
      NOTE: The method used in this patch is overly pessimistic, since the
      needsRelocateWithSymbol routine currently does not have access to the
      actual target symbol, and thus must always assume that it might have
      two entry points.  This will be improved upon by a follow-on patch
      that modifies common code to pass the target symbol when calling
      needsRelocateWithSymbol.
      
      Reviewed by Hal Finkel.
      
      llvm-svn: 213485
      bb68610d
    • Ulrich Weigand's avatar
      [PowerPC] ELFv2 MC support for .abiversion directive · 0daa5164
      Ulrich Weigand authored
      ELFv2 binaries are marked by a bit in the ELF header e_flags field.
      A new assembler directive .abiversion can be used to set that flag.
      This patch implements support in the PowerPC MC streamers to emit the
      .abiversion directive (both into assembler and ELF binary output),
      as well as support in the assembler parser to parse the .abiversion
      directive.
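A reader of an ELF header could recover the ABI version along the following lines. This is a sketch under an assumption the message does not state: that the ABI version lives in the low two bits of e_flags (the value I know as EF_PPC64_ABI, 3). The function names are hypothetical.

```cpp
#include <cstdint>

// Assumption: the ABI version occupies the low two bits of e_flags.
constexpr uint32_t AbiVersionMask = 0x3;

// Extracts the ABI version; 2 would indicate an ELFv2 binary.
constexpr unsigned abiVersion(uint32_t EFlags) {
  return EFlags & AbiVersionMask;
}

// Returns EFlags with the ABI version field replaced, as a directive like
// `.abiversion 2` would request.
constexpr uint32_t withAbiVersion(uint32_t EFlags, unsigned Version) {
  return (EFlags & ~AbiVersionMask) | (Version & AbiVersionMask);
}
```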
      
      Reviewed by Hal Finkel.
      
      llvm-svn: 213484
      0daa5164
    • Ulrich Weigand's avatar
      [PowerPC] Refactor byval handling in LowerFormalArguments_64SVR4 · 24195972
      Ulrich Weigand authored
      When handling an incoming byval argument, we need to possibly write
      incoming registers to the stack in order to create an on-stack image
      of the parameter, so we can return its address to common code.
      
      This currently uses CreateFixedObject to access the parts of the
      parameter save area where the argument is (or needs to be) stored.
      However, sometimes we need to access multiple parts of that area,
      e.g. to write multiple registers.  The code currently uses a new
      CreateFixedObject call for each of these accesses, resulting in
      a patchwork of overlapping (fixed) stack objects.
      
      This doesn't really matter in the case of fixed objects, since
      any access to those turns into a fixed stackpointer + offset
      address anyway.  However, with the upcoming ELFv2 patches, we
      may actually need to place an incoming argument into our *own*
      stack frame instead of the caller's.  This means we need to use
      CreateStackObject instead, and we cannot have multiple overlapping
      instances of those.
      
      To make the rest of the argument handling code work equally in
      both situations, this patch refactors it to always use just a
      single call to CreateFixedObject, and access parts of that object
      as required using address arithmetic.  This way, we can in a future
      patch substitute CreateStackObject without further changes.
      
      No change to generated code intended.
      
      llvm-svn: 213483
      24195972
    • Ulrich Weigand's avatar
      [PowerPC] Fix FrameIndex handling in SelectAddressRegImm · 55a96650
      Ulrich Weigand authored
      The PPCTargetLowering::SelectAddressRegImm routine needs to handle
      FrameIndex nodes in a special manner, by translating them into a
      TargetFrameIndex node.  This was done in most cases, but seems to
      have been neglected in one path: when the input tree has an OR of
      the FrameIndex with an immediate.  This can happen if the FrameIndex
      can be proven to be sufficiently aligned that an OR of that immediate
      is equivalent to an ADD.
      
      The missing handling of FrameIndex in that case caused the SelectionDAG
      instruction selection to miss opportunities to merge the OR back into
      the FrameIndex node, leading to superfluous addi/ori instructions in
      the final assembler output.
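The OR-versus-ADD equivalence behind this fix is plain bit arithmetic: if a base address is aligned so that its low bits are known to be zero, OR-ing in an immediate that only touches those bits cannot carry, so it gives the same result as adding. A minimal sketch, with a hypothetical helper name:

```cpp
#include <cstdint>

// True if OR-ing Imm into any address aligned to Align bytes is equivalent
// to adding it. Align must be a power of two, and Imm must fit entirely in
// the low bits that the alignment guarantees to be zero.
constexpr bool orIsAdd(uint64_t Align, uint64_t Imm) {
  return Align != 0 && (Align & (Align - 1)) == 0 && Imm < Align;
}
```

For a 16-byte-aligned base such as 0x1000, `0x1000 | 12` and `0x1000 + 12` are both 0x100C. Once the immediate reaches the alignment, the equivalence can break: `0x1010 | 16` is 0x1010 while `0x1010 + 16` is 0x1020.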
      
      llvm-svn: 213482
      55a96650
  2. Jul 20, 2014