  1. Feb 28, 2017
  2. Feb 27, 2017
    • Move SymbolTable<ELFT>::Sections out of the class. · 536a2670
      Rui Ueyama authored
      The list of all input sections was defined in SymbolTable class for a
      historical reason. The list itself is not a template. However, because
      SymbolTable class is a template, we needed to pass around ELFT to access
      the list. This patch moves the list out of the class so that it doesn't
      need ELFT.
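
      A minimal sketch of the idea, not LLD's actual code: SectionStub,
      SymbolTableBefore, AllSections, and the helper names are all
      hypothetical. It shows why a non-template list stored inside a class
      template forces every accessor to become a template as well.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for an input section; not LLD's real type.
struct SectionStub {
  const char *Name;
};

// Before: the list lives inside a class template, so every helper that
// touches it must itself be a template, although the list's type does
// not depend on ELFT at all.
template <class ELFT> struct SymbolTableBefore {
  std::vector<SectionStub *> Sections;
};

template <class ELFT>
std::size_t countBefore(const SymbolTableBefore<ELFT> &Table) {
  return Table.Sections.size();
}

// After: a plain list outside the class. Helpers need no ELFT.
std::vector<SectionStub *> AllSections;

inline void addAfter(SectionStub *S) {
  assert(S && "sections in the list are never null");
  AllSections.push_back(S);
}

inline std::size_t countAfter() { return AllSections.size(); }
```

      With the second form, code that only walks the list no longer has to
      be instantiated once per ELF flavor.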
      
      llvm-svn: 296309
      536a2670
  3. Feb 24, 2017
  4. Feb 23, 2017
  5. Feb 17, 2017
    • [ELF] - Move DependentSections vector from InputSection to InputSectionBase · 647c1685
      George Rimar authored
      I split this out of D29273.
      Since we plan to make relocation sections dependent on their target
      sections for the --emit-relocs implementation, this change is required
      to support the .eh_frame case.
      
      EhInputSection inherits from InputSectionBase, not from InputSection.
      So when it has a relocation section, it must be able to access the
      DependentSections vector.
      
      This case occurs in the Linux kernel.
      
      Differential revision: https://reviews.llvm.org/D30084
      
      llvm-svn: 295483
      647c1685
  6. Feb 16, 2017
  7. Feb 03, 2017
    • Replace MergeOutputSection with a synthetic section. · 9e9754b5
      Rafael Espindola authored
      With a synthetic merge section we can have, for example, a single
      .rodata section with strings, fixed-size constants, and non-merge
      constants.
      
      It can be simplified further by not setting Entsize, but that is
      probably better done in a follow-up patch.
      
      This should allow some cleanup in the linker script code now that
      every output section command maps to just one output section.
      
      llvm-svn: 294005
      9e9754b5
  8. Feb 01, 2017
    • [ELF] Use SyntheticSections for Thunks · 3a52eb00
      Peter Smith authored
          
      Thunks are now implemented by redirecting the relocation to the
      symbol S, to a symbol TS in a Thunk. The Thunk will transfer control
      to S. This has the following implications:
      - All the side-effects of Thunks happen within createThunks()
      - Thunks are no longer stored in InputSections and Symbols no longer
        need to hold a pointer to a Thunk
      - The synthetic Thunk sections need to be merged into OutputSections
          
      This implementation is almost a direct conversion of the existing
      Thunks with the following exceptions:
      - Mips LA25 Thunks are placed before the InputSection that defines
        the symbol that needs a Thunk.
      - All ARM Thunks are placed at the end of the OutputSection of the
        first caller to the Thunk.
          
      Range extension Thunks are not supported yet so it is optimistically
      assumed that all Thunks can be reused.
      
      This is a recommit of r293283 with a fixed comparison predicate as
      std::merge requires a strict weak ordering.
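
      The commit does not show the predicate that was fixed, so the
      following is only an illustration of the requirement. Item, lessPos,
      lessOrEqualPos, and mergeByPos are hypothetical names; "<=" is used
      as a typical example of a predicate that is not a strict weak
      ordering.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <vector>

// Hypothetical record; Pos is the key the merged list is ordered by.
struct Item {
  int Pos;
  char Tag;
};

// Valid predicate: irreflexive, so lessPos(X, X) is always false.
inline bool lessPos(const Item &A, const Item &B) { return A.Pos < B.Pos; }

// Invalid predicate: "<=" is true for equal keys, so it is not a strict
// weak ordering and must not be handed to std::merge.
inline bool lessOrEqualPos(const Item &A, const Item &B) {
  return A.Pos <= B.Pos;
}

// Merges two ranges that are already sorted by Pos.
inline std::vector<Item> mergeByPos(const std::vector<Item> &A,
                                    const std::vector<Item> &B) {
  assert(std::is_sorted(A.begin(), A.end(), lessPos));
  assert(std::is_sorted(B.begin(), B.end(), lessPos));
  std::vector<Item> Out;
  Out.reserve(A.size() + B.size());
  std::merge(A.begin(), A.end(), B.begin(), B.end(), std::back_inserter(Out),
             lessPos);
  return Out;
}
```

      With a valid predicate, std::merge keeps elements of the first range
      ahead of equivalent elements of the second range.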
      
      Differential revision: https://reviews.llvm.org/D29327
      
      llvm-svn: 293757
      3a52eb00
  9. Jan 28, 2017
  10. Jan 27, 2017
    • [ELF][ARM] Use SyntheticSections for Thunks · 5191c6f9
      Peter Smith authored
          
      Thunks are now implemented by redirecting the relocation to the
      symbol S, to a symbol TS in a Thunk. The Thunk will transfer control
      to S. This has the following implications:
      - All the side-effects of Thunks happen within createThunks()
      - Thunks are no longer stored in InputSections and Symbols no longer
        need to hold a pointer to a Thunk
      - The synthetic Thunk sections need to be merged into OutputSections
          
      This implementation is almost a direct conversion of the existing
      Thunks with the following exceptions:
      - Mips LA25 Thunks are placed before the InputSection that defines
        the symbol that needs a Thunk.
      - All ARM Thunks are placed at the end of the OutputSection of the
        first caller to the Thunk.
          
      Range extension Thunks are not supported yet so it is optimistically
      assumed that all Thunks can be reused.
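
      The redirection described above can be sketched roughly as follows.
      This is a simplified model, not the patch itself: Symbol, Reloc,
      Thunk, createThunksSketch, the NeedsThunk flag, and the "__thunk_"
      name prefix are all assumptions made for illustration. As in the
      commit, one Thunk per target symbol is reused by every caller.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <vector>

struct Symbol {
  std::string Name;
  bool NeedsThunk; // hypothetical: set when a direct branch will not do
};

struct Reloc {
  Symbol *Target;
};

// A thunk owns its own symbol TS and remembers the real destination S.
struct Thunk {
  Symbol TS;
  Symbol *Dest;
};

// Redirects every relocation whose target needs a thunk from S to TS.
// All side effects happen here; symbols hold no pointer to a thunk.
inline std::vector<std::unique_ptr<Thunk>>
createThunksSketch(std::vector<Reloc> &Relocs) {
  std::vector<std::unique_ptr<Thunk>> Thunks;
  std::map<Symbol *, Thunk *> ThunkFor;
  for (Reloc &R : Relocs) {
    Symbol *S = R.Target;
    assert(S && "every relocation has a target");
    if (!S->NeedsThunk)
      continue;
    Thunk *&T = ThunkFor[S];
    if (!T) {
      Thunks.push_back(std::unique_ptr<Thunk>(
          new Thunk{Symbol{"__thunk_" + S->Name, false}, S}));
      T = Thunks.back().get();
    }
    R.Target = &T->TS; // the relocation now points at the thunk symbol
  }
  return Thunks;
}
```

      The returned thunks would still have to be placed into output
      sections, which this sketch leaves out.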
      
      Differential Revision:  https://reviews.llvm.org/D29129
      
      llvm-svn: 293283
      5191c6f9
  11. Jan 12, 2017
  12. Jan 06, 2017
  13. Dec 20, 2016
  14. Dec 19, 2016
    • Remove inappropriate use of CachedHashStringRef. · 8f687f71
      Rui Ueyama authored
      Use of CachedHashStringRef makes sense only when we reuse hash values.
      Sprinkling it over every DenseMap has no benefit and just complicates
      data types. Basically, we shouldn't use CachedHashStringRef unless
      there is a strong reason to do so.
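
      To illustrate when caching a hash pays off, here is a sketch built on
      the standard library rather than on LLVM's types. CachedKey,
      CachedKeyHash, CachedSet, and inBoth are hypothetical; they only
      mimic the idea of a key that carries its precomputed hash.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <unordered_set>
#include <utility>

// Hypothetical key that stores its hash next to the string.
struct CachedKey {
  std::string Str;
  std::size_t Hash;
  explicit CachedKey(std::string S)
      : Str(std::move(S)), Hash(std::hash<std::string>()(Str)) {}
  bool operator==(const CachedKey &O) const { return Str == O.Str; }
};

// Hasher that returns the stored value instead of rehashing the string.
struct CachedKeyHash {
  std::size_t operator()(const CachedKey &K) const { return K.Hash; }
};

using CachedSet = std::unordered_set<CachedKey, CachedKeyHash>;

// The cached hash is worthwhile here because one computation serves two
// lookups. With a single table and a single lookup there is no reuse.
inline bool inBoth(const CachedSet &A, const CachedSet &B,
                   const CachedKey &K) {
  assert(K.Hash == std::hash<std::string>()(K.Str));
  return A.count(K) != 0 && B.count(K) != 0;
}
```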
      
      llvm-svn: 290076
      8f687f71
  15. Dec 06, 2016
    • Inline MergeInputSection::getData(). · c8e68848
      Rui Ueyama authored
      This change seems to make LLD 0.6% faster when linking Clang with
      debug info. I don't want us to have lots of local optimizations,
      but this function is very hot, and the improvement is small but
      not negligible, so I think it's worth doing.
      
      llvm-svn: 288757
      c8e68848
  16. Dec 05, 2016
  17. Dec 01, 2016
    • Updates file comments and variable names. · 91ae861a
      Rui Ueyama authored
      Use "color" instead of "group id" to describe the ICF algorithm.
      
      llvm-svn: 288409
      91ae861a
    • Parallelize ICF to make LLD's ICF really fast. · c1835319
      Rui Ueyama authored
      ICF is short for Identical Code Folding. It is a size optimization that
      identifies two or more functions that happen to have the same contents
      and merges them. It usually reduces output size by a few percent.
      
      ICF is slow because it is a computationally intensive process. I tried
      to parallelize it before but failed because I couldn't make a
      parallelized version produce consistent outputs. Although it didn't
      create broken executables, every invocation of the linker generated
      slightly different output, and I couldn't figure out why.
      
      I think I now understand what was going on, and I also came up with a
      simple algorithm to fix it. This patch is the result.
      
      The result is very exciting. Chromium, for example, has 780,662 input
      sections, of which 20,774 are reducible by ICF. LLD previously took
      7.980 seconds for ICF. Now it finishes in 1.065 seconds.
      
      As a result, LLD can now link a Chromium binary (output size 1.59 GB)
      in 10.28 seconds on my machine with ICF enabled. Compared to gold
      which takes 40.94 seconds to do the same thing, this is an amazing
      number.
      
      From here, I'll describe what we are doing for ICF, what the previous
      problem was, and what I did in this patch.
      
      In ICF, two sections are considered identical if they have the same
      section flags, section data, and relocations. Relocations are tricky,
      because two relocations are considered the same if they have the same
      relocation type and values, and if they point to the same section _in
      terms of ICF_.
      
      Here is an example. If foo and bar defined below are compiled to the
      same machine instructions, ICF can (and should) merge the two,
      although their relocations point to each other.
      
        void bar();
        void foo() { bar(); }
        void bar() { foo(); }
      
      This is not an easy problem to solve.
      
      What we are doing in LLD is a kind of coloring algorithm. We repeatedly
      color non-identical sections using different colors, and sections
      that have the same color when the algorithm terminates are considered
      identical. Here are the details:
      
        1. First, we color all sections using hash values of their section
        types, section contents, and numbers of relocations. At this moment,
        relocation targets are not taken into account. We just give
        sections that obviously differ different colors.
      
        2. Next, for each color C, we visit sections having color C to see
        if their relocations are the same. Relocations are considered equal
        if their targets have the same color. We then recolor sections that
        have different relocation targets in new colors.
      
        3. If we recolor some section in step 2, relocations that were
        previously pointing to the same color targets may now be pointing to
        different colors. Therefore, repeat 2 until a convergence is
        obtained.
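
      The three steps can be sketched sequentially as below. This is a toy
      model, not the patch: ToySection and colorSections are hypothetical
      names, an integer stands in for section flags and data, and exact
      keys are compared instead of hash values.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <utility>
#include <vector>

// Toy section: Contents stands in for section flags and data, and
// Relocs holds the indices of the sections its relocations point to.
struct ToySection {
  int Contents;
  std::vector<std::size_t> Relocs;
};

// Returns one color per section. Sections that end with the same color
// are the ones this sketch treats as identical.
inline std::vector<int> colorSections(const std::vector<ToySection> &Secs) {
  std::vector<int> Color(Secs.size());

  // Step 1: color by contents and relocation count only.
  std::map<std::pair<int, std::size_t>, int> Initial;
  for (std::size_t I = 0; I < Secs.size(); ++I) {
    int Fresh = static_cast<int>(Initial.size());
    auto Key = std::make_pair(Secs[I].Contents, Secs[I].Relocs.size());
    Color[I] = Initial.emplace(Key, Fresh).first->second;
  }
  std::size_t NumColors = Initial.size();

  // Steps 2 and 3: split each color by the colors of the relocation
  // targets, and repeat until no color is split any further.
  for (;;) {
    std::map<std::pair<int, std::vector<int>>, int> Split;
    std::vector<int> NewColor(Secs.size());
    for (std::size_t I = 0; I < Secs.size(); ++I) {
      std::vector<int> Targets;
      for (std::size_t T : Secs[I].Relocs) {
        assert(T < Secs.size() && "relocation target out of range");
        Targets.push_back(Color[T]);
      }
      int Fresh = static_cast<int>(Split.size());
      NewColor[I] =
          Split.emplace(std::make_pair(Color[I], Targets), Fresh).first->second;
    }
    Color.swap(NewColor);
    // Colors are only ever split, so an unchanged count means convergence.
    if (Split.size() == NumColors)
      return Color;
    NumColors = Split.size();
  }
}
```

      Two mutually recursive sections with equal contents keep the same
      color throughout, because each one's target always has the color of
      the other.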
      
      Step 2 is a heavy operation. For Chromium, the first iteration of step
      2 takes 2.882 seconds, and the second iteration takes 1.038 seconds,
      and in total it needs 23 iterations.
      
      Parallelizing step 1 is easy because we can color each section
      independently. This patch does that.
      
      Parallelizing step 2 is tricky. We could work on each color
      independently, but we cannot recolor sections in place, because that
      would break the invariant that two possibly-identical sections must
      have the same color at any moment.
      
      Consider sections S1, S2, S3, S4 in the same color C, where S1 and S2
      are identical, S3 and S4 are identical, but S2 and S3 are not. Thread
      A is about to recolor S1 and S2 in C'. After thread A recolors S1 in
      C', but before it recolors S2 in C', another thread B might observe S1
      and S2. Thread B will then conclude that S1 and S2 are different, and
      it will wrongly split its own sections into smaller groups.
      Over-splitting doesn't produce broken results, but it loses a chance to
      merge some identical sections. That was the cause of the nondeterminism.
      
      To fix the problem, I made sections have two colors, namely current
      color and next color. At the beginning of each iteration, both colors
      are the same. Each thread reads from current color and writes to next
      color. In this way, we can prevent threads from reading partial
      results. After each iteration, we flip current and next.
      
      This is a very simple solution and is implemented in less than 50
      lines of code.
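
      A sketch of the two-color scheme under simplifying assumptions: Node,
      mixColors, and recolorRange are hypothetical, and the way a new color
      is derived here is a placeholder rather than LLD's rule. The point is
      only that each worker reads the current slot and writes the next
      slot, so the result does not depend on the order in which ranges are
      processed.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Node {
  unsigned Color[2];               // two slots: current and next
  std::vector<std::size_t> Relocs; // indices of relocation targets
};

// Placeholder recoloring rule: combine a node's own current color with
// the current colors of its targets. Reads slot Cur only.
inline unsigned mixColors(const std::vector<Node> &N, std::size_t I,
                          int Cur) {
  unsigned H = N[I].Color[Cur];
  for (std::size_t T : N[I].Relocs)
    H = H * 31u + N[T].Color[Cur];
  return H;
}

// Recolors nodes [Begin, End). It never writes slot Cur, so another
// worker handling a different range cannot observe a partial result.
inline void recolorRange(std::vector<Node> &N, std::size_t Begin,
                         std::size_t End, int Cur) {
  assert((Cur == 0 || Cur == 1) && Begin <= End && End <= N.size());
  int Next = 1 - Cur;
  for (std::size_t I = Begin; I < End; ++I)
    N[I].Color[Next] = mixColors(N, I, Cur);
}
```

      After all ranges have been processed, the caller flips Cur for the
      next iteration. The ranges could be handed to worker threads; running
      them in any order gives the same next colors.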
      
      I tested this patch with Chromium and confirmed that this parallelized
      ICF produces the same output as the non-parallelized one.
      
      Differential Revision: https://reviews.llvm.org/D27247
      
      llvm-svn: 288373
      c1835319
  18. Nov 26, 2016
    • Change return types of split{Non,}Strings. · e8a077ba
      Rui Ueyama authored
      They return new vectors, but at the same time they mutate other vectors,
      so returning values doesn't make much sense. We should just mutate two
      vectors.
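
      The commit does not name the two vectors, so the sketch below makes
      them up: splitFixedSize, Pieces, and Hashes are hypothetical. It only
      shows the resulting shape of such a function, which returns nothing
      and fills two caller-owned vectors.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Splits Data into EntSize-byte pieces and appends each piece and its
// hash to the two caller-owned vectors. Nothing is returned, so the
// signature makes it plain that both vectors are outputs.
inline void splitFixedSize(const std::string &Data, std::size_t EntSize,
                           std::vector<std::string> &Pieces,
                           std::vector<std::size_t> &Hashes) {
  assert(EntSize != 0 && Data.size() % EntSize == 0);
  for (std::size_t Off = 0; Off < Data.size(); Off += EntSize) {
    Pieces.push_back(Data.substr(Off, EntSize));
    Hashes.push_back(std::hash<std::string>()(Pieces.back()));
  }
}
```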
      
      llvm-svn: 287979
      e8a077ba
  19. Nov 25, 2016
  20. Nov 23, 2016
  21. Nov 21, 2016
    • Add a flag to InputSectionBase for linker script. · f94efddd
      Rui Ueyama authored
      Previously, we set InputSectionBase::OutSec to (uintptr_t)-1 to record
      that linker scripts had already assigned a section to some output
      section. Later, we reset the pointer to nullptr so that the field could
      be used for its original purpose. That overloading is not very easy to
      understand.
      
      This patch adds a bit flag for that purpose, so that we don't need
      to piggyback the flag on an unrelated pointer.
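
      A sketch of the difference, with hypothetical names throughout
      (InputSectionStub, OutputSectionStub, Assigned, claim, setOutput). A
      plain bool is used here for brevity where the patch describes a bit
      flag.

```cpp
#include <cassert>

struct OutputSectionStub {
  const char *Name;
};

struct InputSectionStub {
  // Only ever null or a real pointer; never a sentinel such as
  // (uintptr_t)-1.
  OutputSectionStub *OutSec = nullptr;
  // Hypothetical flag: a linker script has already claimed this section.
  bool Assigned = false;
};

// Marks the section as claimed. Returns false if it was claimed before.
inline bool claim(InputSectionStub &S) {
  if (S.Assigned)
    return false;
  S.Assigned = true;
  return true;
}

// Sets the real output section later, independently of the flag.
inline void setOutput(InputSectionStub &S, OutputSectionStub *Out) {
  assert(Out && "OutSec is only set to a real output section");
  S.OutSec = Out;
}
```

      Because the flag and the pointer are separate, no step is needed to
      restore the pointer to nullptr afterwards.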
      
      llvm-svn: 287508
      f94efddd
  22. Nov 20, 2016
    • Do not expose ICF class from the file. · bd1f0630
      Rui Ueyama authored
      This patch also uses file-scope functions instead of class member
      functions.
      
      Now that the ICF class is not visible from outside, the InputSection
      class can no longer be a "friend" of it. So I removed the friend
      relation and made InputSection expose the needed features publicly.
      
      llvm-svn: 287480
      bd1f0630
  23. Nov 18, 2016
    • Simplify MergeOutputSection. · 77f2a875
      Rui Ueyama authored
      The MergeOutputSection class was a bit hard to use because it provided
      a series of finalize functions that had to be called in the right order
      at the right time. It also interacted with MergeInputSection, and the
      logic was somewhat entangled between the two classes.
      
      This patch simplifies it by providing only one finalize function.
      Now, all you have to do is call MergeOutputSection::finalize
      when you have added all sections to the output section. It then
      internally merges strings and initializes StringPiece objects.
      I think this is much easier to understand.
      
      This patch also adds comments.
      
      llvm-svn: 287314
      77f2a875
  24. Nov 14, 2016
  25. Nov 11, 2016
  26. Nov 10, 2016
    • Parse relocations only once. · 9f0c4bb7
      Rafael Espindola authored
      Relocations were the last thing for which we were storing a raw
      section pointer and parsing on demand.
      
      With this patch we parse it only once and store a pointer to the
      actual data.
      
      The patch also changes where we store it. It is now in
      InputSectionBase. Not all sections have relocations, but most do and
      this simplifies the logic. It also means that we now only support one
      relocation section per section. Given that gold, bfd, and lld all
      maintain that constraint even with -r, I think it is OK.
      
      llvm-svn: 286459
      9f0c4bb7
    • [ELF] Convert .got.plt section to input section · 41ca327b
      Eugene Leviant authored
      Differential revision: https://reviews.llvm.org/D26349
      
      llvm-svn: 286443
      41ca327b
    • Make OutputSectionBase a class instead of class template. · e08e78df
      Rafael Espindola authored
      The disadvantage is that we use uint64_t instead of uint32_t for some
      values in 32-bit files. The advantage is substantially simpler code,
      faster builds, and less code duplication.
      
      llvm-svn: 286414
      e08e78df
  27. Nov 09, 2016
    • [ELF][MIPS] Convert .MIPS.abiflags section to synthetic input section · fa03b0fa
      Simon Atanasyan authored
      Previously, we had both an input and an output section for .MIPS.abiflags.
      Now we have only one class for .MIPS.abiflags, which is MipsAbiFlagsSection.
      This class is a synthetic input section.
      
      .MIPS.abiflags sections are handled as regular sections until
      the control reaches Writer. Writer then aggregates all sections
      whose type is SHT_MIPS_ABIFLAGS to create a single synthesized
      input section. The synthesized section is then processed normally
      as if it came from an input file.
      
      llvm-svn: 286398
      fa03b0fa
    • [ELF][MIPS] Convert .reginfo and .MIPS.options sections to synthetic input sections · ce02cf00
      Simon Atanasyan authored
      Previously, we had both input and output sections for .reginfo and
      .MIPS.options. Now each of these sections has one synthetic input
      section: MipsReginfoSection and MipsOptionsSection, respectively.
      
      Both sections are handled as regular sections until control reaches
      Writer. Writer then aggregates all sections whose type is SHT_MIPS_REGINFO
      or SHT_MIPS_OPTIONS to create a single synthesized input section. At that
      point Writer also saves the GP0 value to the MipsGp0 field of the
      corresponding ObjectFile. This value is required to calculate
      R_MIPS_GPREL16 and R_MIPS_GPREL32 relocations.
      
      Differential revision: https://reviews.llvm.org/D26444
      
      llvm-svn: 286397
      ce02cf00
    • Make Discarded an InputSection. · 6ff570a3
      Rafael Espindola authored
      It was quite confusing that it had a SectionKind of Regular but was
      not actually an InputSection.
      
      llvm-svn: 286379
      6ff570a3
    • Add a convenience getObj method. NFC. · 77dbe9a4
      Rafael Espindola authored
      llvm-svn: 286370
      77dbe9a4
  28. Nov 08, 2016
  29. Nov 07, 2016