  1. Jun 25, 2015
    • COFF: Devirtualize mark(), markLive() and isCOMDAT(). · fc510f4c
      Rui Ueyama authored
      Only SectionChunk can be dead-stripped. Previously,
      all types of chunks implemented these functions,
      but the implementations were empty.
      
      Likewise, only DefinedRegular and DefinedCOMDAT symbols
      can be dead-stripped. The markLive() function was implemented
      for other symbol types, but those implementations were empty.
      
      I started thinking that the change I made in r240319 was
      a mistake. I separated DefinedCOMDAT from DefinedRegular
      because I thought that would make the code cleaner, but now
      we want to handle them as the same type here. Maybe we
      should roll it back.
      
      This change should improve readability a bit as this removes
      some dubious uses of reinterpret_cast. Previously, we
      assumed that all COMDAT chunks are actually SectionChunks,
      which was not very obvious.
      
      llvm-svn: 240675
      fc510f4c
    • COFF: Simplify. NFC. · f34c0885
      Rui Ueyama authored
      llvm-svn: 240666
      f34c0885
    • COFF: Use std::equal to compare two lists of relocations. · c6fcfbc9
      Rui Ueyama authored
      llvm-svn: 240665
      c6fcfbc9
    • COFF: Don't use COFFHeader->NumberOfRelocations. · 02c30279
      Rui Ueyama authored
      The field is only 16 bits wide, so it is inaccurate if a section
      has more than 65535 relocations.
      
      llvm-svn: 240661
      02c30279
    • COFF: Fix a bug of __imp_ symbol. · 88e0f920
      Rui Ueyama authored
      The change I made in r240620 was not correct. If a symbol foo is
      defined, and if you use __imp_foo, __imp_foo symbol is automatically
      defined as a pointer (not just an alias) to foo.
      
      Now we need to create a chunk for automatically created symbols,
      so I defined a LocalImportChunk class for them.
      
      llvm-svn: 240622
      88e0f920
    • COFF: Handle undefined symbols starting with __imp_ in a special way. · d7666535
      Rui Ueyama authored
      MSVC linker is able to link an object file created from the following code.
      Note that __imp_hello is not defined anywhere.
      
        void hello() { printf("Hello\n"); }
        extern void (*__imp_hello)();
        int main() { __imp_hello(); }
      
      Function symbols exported from DLLs are automatically mangled by prepending
      the __imp_ prefix, so they have two names (the original one and the prefixed one).
      This "feature" seems to simulate that behavior even for non-DLL symbols.
      
      This is, in my opinion, a very odd feature. Even the MSVC linker warns if you use it.
      I'm adding it anyway for the sake of compatibility.
      
      llvm-svn: 240620
      d7666535
    • COFF: Use COFFObjectFile::getRelocations(). NFC. · 42aa00b3
      Rui Ueyama authored
      llvm-svn: 240614
      42aa00b3
    • COFF: Cache raw pointers to relocation tables. · cde92423
      Rui Ueyama authored
      Getting an iterator to the relocation table is a very hot operation
      in the linker. We do that not only to apply relocations but also
      to mark live sections and to do ICF.
      
      libObject's interface is slow. Caching pointers to the first
      relocation table entries makes the linker 6% faster at self-linking.
      
      We probably need to fix libObject as well.
      
      llvm-svn: 240603
      cde92423
  2. Jun 24, 2015
    • COFF: Move code for ICF from Writer.cpp to ICF.cpp. · 49560c7a
      Rui Ueyama authored
      llvm-svn: 240590
      49560c7a
    • COFF: Initial implementation of Identical COMDAT Folding. · ddf71fc3
      Rui Ueyama authored
      Identical COMDAT Folding (ICF) is an optimization to reduce binary
      size by merging COMDAT sections that contain the same metadata,
      actual data and relocations. MSVC link.exe and many other linkers
      have this feature. With this patch, LLD is on par with MSVC in terms
      of produced binary size.
      
      This technique is pretty effective. For example, LLD's size is
      reduced from 64MB to 54MB by enabling this optimization.
      
      The algorithm implemented in this patch is extremely inefficient.
      It puts all COMDAT sections into a set to identify duplicates.
      Self-link times without and with ICF are 3.3 and 320 seconds,
      respectively. So this option makes LLD roughly 100x slower.
      But it's okay as I wanted to achieve correctness first.
      LLD is still able to link itself with this optimization.
      I'm going to make it more efficient in followup patches.
      
      Note that this optimization is *not* entirely safe. C/C++ require
      that different functions have different addresses. If your program
      relies on that property, your program wouldn't work with ICF.
      However, it's not going to be an issue on Windows because MSVC
      link.exe turns ICF on by default. As long as your program works
      with default settings (or not passing /opt:noicf), your program
      would work with LLD too.
      
      llvm-svn: 240519
      ddf71fc3
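
The naive approach described in this message, collecting every COMDAT section in an ordered container keyed by its full contents, could be sketched roughly as follows. This is a minimal illustration, not the patch's code: `ComdatSection`, `Reloc`, `Repl`, and `foldIdentical` are hypothetical names, a `std::map` stands in for the set so each duplicate can find its representative, and relocation targets are compared as plain integers.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <tuple>
#include <utility>
#include <vector>

// Hypothetical, simplified relocation record.
struct Reloc {
  uint32_t Offset;
  uint32_t Target;
  uint16_t Type;
  bool operator<(const Reloc &O) const {
    return std::tie(Offset, Target, Type) < std::tie(O.Offset, O.Target, O.Type);
  }
};

// Hypothetical, simplified COMDAT section.
struct ComdatSection {
  uint32_t Characteristics = 0;  // stands in for "metadata"
  std::vector<uint8_t> Data;     // actual section bytes
  std::vector<Reloc> Relocs;     // relocations applied to Data
  ComdatSection *Repl = nullptr; // representative chosen after folding
};

// Folds sections whose metadata, data and relocations are all equal.
// Returns the number of sections that were merged into an earlier one.
std::size_t foldIdentical(std::vector<ComdatSection> &Sections) {
  using Key = std::tuple<uint32_t, std::vector<uint8_t>, std::vector<Reloc>>;
  std::map<Key, ComdatSection *> Seen;
  std::size_t Folded = 0;
  for (ComdatSection &S : Sections) {
    Key K(S.Characteristics, S.Data, S.Relocs);
    auto Result = Seen.emplace(std::move(K), &S);
    S.Repl = Result.first->second; // first section seen with this key
    if (!Result.second)
      ++Folded;
  }
  return Folded;
}
```

Every lookup copies and compares whole section contents, which suggests why a container-based approach like this can be very slow on large inputs.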
    • COFF: Remove unused field SectionChunk::SectionIndex. · bd3a29d0
      Peter Collingbourne authored
      llvm-svn: 240512
      bd3a29d0
    • Peter Collingbourne's avatar
      2ed4c8f5
    • COFF: Ignore debug symbols. · c7b685d9
      Peter Collingbourne authored
      Differential Revision: http://reviews.llvm.org/D10675
      
      llvm-svn: 240487
      c7b685d9
    • COFF: Add names for logging/debugging to COMDAT chunks. · 6a60be77
      Rui Ueyama authored
      Chunks are basically unnamed chunks of bytes, and we don't like
      to give them names. However, for logging or debugging, we want to
      know the symbol names of functions for COMDAT chunks. (For example,
      we want to print out "we have removed unreferenced COMDAT section
      which contains a function FOOBAR.")
      
      This patch is to do that.
      
      llvm-svn: 240484
      6a60be77
    • COFF: Make link order compatible with MSVC link.exe. · 0d2e9990
      Rui Ueyama authored
      Previously, we added files in directive sections to the symbol
      table as we read the sections, so the link order was depth-first.
      That's compatible with neither MSVC link.exe nor the old LLD.
      
      This patch queues files so that new files are added to the
      end of the queue and processed last. Now addFile() no longer parses
      files or resolves symbols. You need to call run() to process
      queued files.
      
      llvm-svn: 240483
      0d2e9990
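
The queueing behavior described above can be sketched as follows. This is an assumption-laden illustration rather than the linker's code: `FileQueue` and the `DirectiveFn` callback (which stands in for reading a file's directive section) are hypothetical, and only the `addFile()`/`run()` split mirrors the message.

```cpp
#include <deque>
#include <functional>
#include <string>
#include <utility>
#include <vector>

class FileQueue {
public:
  // Hypothetical callback: returns the files named in a file's directive section.
  using DirectiveFn =
      std::function<std::vector<std::string>(const std::string &)>;

  explicit FileQueue(DirectiveFn Fn) : GetDirectives(std::move(Fn)) {}

  // Only enqueues; nothing is parsed or resolved here.
  void addFile(const std::string &Path) { Pending.push_back(Path); }

  // Processes files in FIFO order and returns the order they were handled in.
  std::vector<std::string> run() {
    std::vector<std::string> Order;
    while (!Pending.empty()) {
      std::string Path = Pending.front();
      Pending.pop_front();
      Order.push_back(Path);
      // Files discovered now go to the end of the queue, not the front.
      for (const std::string &Dep : GetDirectives(Path))
        addFile(Dep);
    }
    return Order;
  }

private:
  DirectiveFn GetDirectives;
  std::deque<std::string> Pending;
};
```

With inputs `a.obj` and `b.obj`, where `a.obj` names `x.lib` in its directives, this processes `a.obj`, `b.obj`, `x.lib`, whereas the depth-first order would have been `a.obj`, `x.lib`, `b.obj`.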
  3. Jun 23, 2015
  4. Jun 22, 2015
    • COFF: Separate DefinedCOMDAT from DefinedRegular symbol type. NFC. · 617f5ccb
      Rui Ueyama authored
      Before this change, you had to cast a symbol to DefinedRegular and then
      call isCOMDAT() to determine if a given symbol is a COMDAT symbol.
      Now you can just use isa<DefinedCOMDAT>().
      
      As to the class definition of DefinedCOMDAT, I could remove duplicate
      code from DefinedRegular and DefinedCOMDAT by introducing another base
      class for them, but I chose to not do that to keep the class hierarchy
      shallow. This amount of code duplication isn't worth defining a new
      class for.
      
      llvm-svn: 240319
      617f5ccb
    • Fix typo. · 61096206
      Rui Ueyama authored
      llvm-svn: 240298
      61096206
    • COFF: Support delay-load import tables. · a77336bd
      Rui Ueyama authored
      DLLs are usually resolved at process startup, but you can
      delay-load them by passing /delayload option to the linker.
      
      If /delayload is specified, the linker has to create data
      similar to a regular import table.
      One notable difference is that the pointers in a delay-load
      import table initially point to thunks that resolve
      themselves. Each thunk loads a DLL, resolves its name, and then
      overwrites the pointer with the result so that subsequent
      function calls directly call the desired function. The linker
      has to emit these thunks.
      
      llvm-svn: 240250
      a77336bd
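
The self-resolving pointer idea can be imitated in ordinary C++ with a function pointer that initially refers to a thunk. This is only an analogy under stated assumptions: `ImportSlot`, `thunk`, `realFunction`, and `ResolveCount` are made-up names, and no DLL is actually loaded.

```cpp
// Stand-in for the function that would live in the delay-loaded DLL.
int realFunction(int X) { return X * 2; }

int thunk(int X); // defined below

// Stand-in for one slot of a delay-load import table.
// It starts out pointing at the thunk, not at the real function.
int (*ImportSlot)(int) = thunk;

// Counts how often resolution ran, to show that it happens only once.
int ResolveCount = 0;

int thunk(int X) {
  // A real thunk would load the DLL and look up the symbol name here.
  ++ResolveCount;
  ImportSlot = realFunction; // overwrite the slot with the resolved address
  return ImportSlot(X);      // then forward the original call
}

// Callers always go through the slot, as calls through an import table do.
int callImported(int X) { return ImportSlot(X); }
```

After the first call through the slot, later calls reach `realFunction` directly and the thunk is never entered again.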
  5. Jun 21, 2015
  6. Jun 20, 2015
    • COFF: Fix common symbol alignment. · 5e31d0b2
      Rui Ueyama authored
      llvm-svn: 240217
      5e31d0b2
    • COFF: Fix a common symbol bug. · efb7e1aa
      Rui Ueyama authored
      This is a case where one mistake caused a very mysterious bug.
      I made a mistake in calculating the addresses of common symbols, so
      each common symbol pointed not to the beginning of its location
      but to the end of its location. (Ouch!)
      
      Common symbols are aligned on 16 byte boundaries. If a common
      symbol is small enough to fit between the end of its real
      location and whatever comes next, this bug didn't cause any harm.
      
      However, if a common symbol is larger than that, its memory
      naturally overlapped with other symbols. That means some
      uninitialized variables accidentally shared memory. Because
      totally unrelated memory writes mutated other variables, it was
      hard to debug.
      
      It's surprising that LLD was able to link itself and all LLD
      tests except gunit tests passed with this nasty bug.
      
      With this fix, the new COFF linker is able to pass all tests
      for LLVM, Clang and LLD if I use MSVC cl.exe as a compiler.
      Only three tests are failing when used with clang-cl.
      
      llvm-svn: 240216
      efb7e1aa
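
The effect described in this message can be reproduced in a small layout sketch. The code below is a reconstruction under assumptions, not the linker's code: `layoutCommons` and its `ReproduceBug` switch are hypothetical, and the only facts taken from the message are the 16-byte alignment and the start-versus-end mix-up.

```cpp
#include <cstdint>
#include <vector>

// Rounds Value up to the next multiple of Align.
uint64_t alignTo(uint64_t Value, uint64_t Align) {
  return (Value + Align - 1) / Align * Align;
}

// Assigns an address to each common symbol, given the symbol sizes.
// With ReproduceBug set, each symbol gets the END of its slot instead of
// the start, which is the mistake the message describes.
std::vector<uint64_t> layoutCommons(const std::vector<uint64_t> &Sizes,
                                    bool ReproduceBug) {
  std::vector<uint64_t> Addrs;
  uint64_t Off = 0;
  for (uint64_t Size : Sizes) {
    Off = alignTo(Off, 16); // common symbols sit on 16-byte boundaries
    uint64_t Start = Off;
    Off += Size;
    Addrs.push_back(ReproduceBug ? Off : Start);
  }
  return Addrs;
}
```

For sizes 4 and 32, the correct addresses are 0 and 16. The buggy variant yields 4 and 48: the 4-byte symbol lands harmlessly in the padding before offset 16, while the 32-byte symbol would occupy 48 through 79 and run into whatever follows.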
    • COFF: Take reference to argument vector using std::vector::data() instead of operator[](0). · 74ecc89c
      Peter Collingbourne authored
      This avoids undefined behaviour caused by an out-of-range access if the
      vector is empty, which can happen if an object file's directive section
      contains only whitespace.
      
      llvm-svn: 240183
      74ecc89c
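
A small sketch of the pattern follows, assuming a whitespace-separated directive string. `parseDirectives` and `countNonEmptyArgs` are hypothetical helpers; the point is only that `data()` is well defined on an empty vector, while `operator[](0)` is not.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical consumer that takes a C-style argument array.
std::size_t countNonEmptyArgs(const char *const *Argv, std::size_t Argc) {
  std::size_t N = 0;
  for (std::size_t I = 0; I < Argc; ++I)
    if (Argv[I][0] != '\0')
      ++N;
  return N;
}

// Splits a directive section on whitespace and hands the tokens on.
std::size_t parseDirectives(const std::string &Section) {
  std::istringstream In(Section);
  std::vector<std::string> Tokens;
  for (std::string Tok; In >> Tok;)
    Tokens.push_back(Tok);

  std::vector<const char *> Argv;
  for (const std::string &Tok : Tokens)
    Argv.push_back(Tok.c_str());

  // Argv may be empty here. Argv.data() is still valid to call, whereas
  // &Argv[0] would be an out-of-range access.
  return countNonEmptyArgs(Argv.data(), Argv.size());
}
```

A section containing only whitespace produces an empty `Argv`, and the call still goes through safely with a count of zero.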
    • COFF: Fix precedence between LIB and /libpath. · f00df0af
      Rui Ueyama authored
      /libpath should take precedence over LIB.
      Previously, LIB took precedence over /libpath.
      
      llvm-svn: 240182
      f00df0af
  7. Jun 19, 2015
    • COFF: Add search paths in the correct order. · 165b254e
      Rui Ueyama authored
      Previously, we added search paths in reverse order.
      
      llvm-svn: 240180
      165b254e
    • COFF: Cache Archive::Symbol::getName(). NFC. · 29792a82
      Rui Ueyama authored
      getName() does strlen() on the symbol table, so it's not very fast.
      It's not as bad as r239332, though, because archive files export
      fewer symbols than object files, and the names are usually
      shorter.
      
      llvm-svn: 240178
      29792a82
    • COFF: Continue reading object files until converge. · 573bf7de
      Rui Ueyama authored
      In this linker model, adding an undefined symbol may trigger chain
      reactions. It may trigger a Lazy symbol to read a new file.
      A new file may contain a directive section, which may contain various
      command line options.
      
      Previously, we didn't handle chain reactions well. We visited /include'd
      symbols only once, so newly-added /include symbols were ignored.
      This patch fixes that bug.
      
      Now, the symbol table is versioned; every time the symbol table is
      updated, the version number is incremented. We repeat adding undefined
      symbols until the version number stops changing. It is guaranteed to
      converge -- the number of undefined symbols in the system is finite,
      and adding the same undefined symbol more than once is basically a no-op.
      
      llvm-svn: 240177
      573bf7de
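
The versioned fixed-point loop described above might look roughly like this. It is a sketch under assumptions: `Linker`, `Directives`, `Includes`, and `resolveIncludes` are hypothetical, and the `Directives` map stands in for "adding this symbol loads a file whose directive section names more /include symbols".

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

struct Linker {
  // Hypothetical model of chain reactions: adding the key symbol pulls in a
  // file whose directive section /include's the listed symbols.
  std::map<std::string, std::vector<std::string>> Directives;

  std::set<std::string> Symbols;     // the symbol table
  std::vector<std::string> Includes; // all /include'd names seen so far
  unsigned Version = 0;              // bumped on every symbol table update

  void addUndefined(const std::string &Name) {
    if (!Symbols.insert(Name).second)
      return; // already known: a no-op, the version stays the same
    ++Version;
    auto It = Directives.find(Name);
    if (It != Directives.end())
      Includes.insert(Includes.end(), It->second.begin(), It->second.end());
  }

  // Re-adds all /include symbols until a full pass changes nothing.
  void resolveIncludes() {
    unsigned Seen;
    do {
      Seen = Version;
      // Iterate over a copy because addUndefined() may append to Includes.
      std::vector<std::string> Snapshot = Includes;
      for (const std::string &Name : Snapshot)
        addUndefined(Name);
    } while (Seen != Version);
  }
};
```

Because each pass either adds at least one new symbol or leaves the version untouched, the loop stops after at most one pass more than the number of distinct symbols.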
    • COFF: Don't add new undefined symbols for /alternatename. · 4d2834bd
      Rui Ueyama authored
      Alternatename option is in the form of /alternatename:<from>=<to>.
      Its effect is to resolve <from> as <to> if <from> is still undefined
      at the end of name resolution.
      
      If <from> is not an undefined symbol but a completely new one, alternatename
      shouldn't do anything. Previously, it introduced a new undefined
      symbol for <from>, which resulted in an undefined symbol error.
      
      llvm-svn: 240161
      4d2834bd
    • COFF: Add /nodefaultlib and /merge for .drectve. · ce86c996
      Rui Ueyama authored
      llvm-svn: 240077
      ce86c996
    • COFF: Handle /include in .drectve. · 08d5e187
      Rui Ueyama authored
      We don't want to insert a new symbol to the symbol table while reading
      a .drectve section because it's going to be too complicated.
      That we are reading a directive section means that we are currently
      reading some object file. Adding a new undefined symbol to the symbol
      table can trigger a library file to read a new file, so it would make
      the call stack too deep.
      
      In this patch, I add new symbol names to a list to resolve them later.
      
      llvm-svn: 240076
      08d5e187
    • COFF: Allow identical alternatename options. · e8d56b52
      Rui Ueyama authored
      Alternatename option is in the form of /alternatename:<from>=<to>.
      It is an error if there are two options having the same <from> but
      different <to>. It is *not* an error if both are the same.
      
      llvm-svn: 240075
      e8d56b52
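
The rule stated above, where conflicting targets are an error and exact repeats are accepted, can be sketched as follows. `addAlternateName` is a hypothetical helper that takes the `<from>=<to>` part of the option; it is not the linker's actual parsing code.

```cpp
#include <map>
#include <string>

// Records one "<from>=<to>" pair.
// Returns false if the argument is malformed or if <from> was already
// mapped to a different <to>. Repeating an identical pair is accepted.
bool addAlternateName(std::map<std::string, std::string> &Map,
                      const std::string &Arg) {
  std::string::size_type Eq = Arg.find('=');
  if (Eq == std::string::npos || Eq == 0 || Eq + 1 == Arg.size())
    return false; // missing '=', empty <from>, or empty <to>
  std::string From = Arg.substr(0, Eq);
  std::string To = Arg.substr(Eq + 1);
  // emplace() leaves an existing entry untouched and reports it.
  auto Result = Map.emplace(From, To);
  return Result.second || Result.first->second == To;
}
```

For example, adding `foo=bar` twice succeeds both times, while a later `foo=baz` is rejected and the stored mapping stays `foo` to `bar`.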
  8. Jun 18, 2015