Commits · be6e80b01261ef02a169daf10b5e36c18696dbc0 · Roger Ferrer / llvm-epi

Jun 29, 2015

[opt] Hoist the call through SymbolBody::getReplacement out of the inline · be6e80b0

Chandler Carruth authored Jun 29, 2015

method to get a SymbolBody and into the callers, and kill now dead
includes.

This removes the need to have the SymbolBody definition when we're
defining the inline method and makes it a better inline method. That was
the only reason for a lot of header includes here. Removing these and
using forward declarations actually uncovers a bunch of cross-header
dependencies that I've fixed while I'm here, and will allow me to
introduce some *important* inline code into Chunks.h that requires the
definition of ObjectFile.

No functionality changed at this point.

Differential Revision: http://reviews.llvm.org/D10789

llvm-svn: 240982

be6e80b0

Jun 28, 2015

COFF: Fix ICF correctness bug. · 871847e3

Rui Ueyama authored Jun 28, 2015

When comparing two COMDAT sections, we need to take section values
and associative sections into account. This patch fixes that bug.
It fixes a crash bug of llvm-tblgen when linked with /opt:lldicf.

One thing I don't understand yet is that this logic seems to be
too strict. The MSVC linker is able to create more compact executables
(which of course work correctly). With this ICF algorithm, LLD is
able to make executables smaller, but the outputs are still larger than
MSVC's. There must be something I'm missing here.

llvm-svn: 240897

871847e3

Jun 26, 2015

COFF: Align DLL import thunks on 16-byte boundaries. · 7383562b

Rui Ueyama authored Jun 26, 2015

llvm-svn: 240806

7383562b

COFF: Merge DefinedRegular and DefinedCOMDAT. · 9b921e5d

Rui Ueyama authored Jun 25, 2015

I split them in r240319 because I thought they were different enough
that we should treat them as different types. It turned out that
that was not a good idea. They are so similar that we ended up with
a lot of duplicate code.

llvm-svn: 240706

9b921e5d

Jun 25, 2015

COFF: Devirtualize mark(), markLive() and isCOMDAT(). · fc510f4c

Rui Ueyama authored Jun 25, 2015

Only SectionChunk can be dead-stripped. Previously,
all types of chunks implemented these functions,
but their functions were blank.

Likewise, only DefinedRegular and DefinedCOMDAT symbols
can be dead-stripped. markLive() function was implemented
for other symbol types, but they were blank.

I started thinking that the change I made in r240319 was
a mistake. I separated DefinedCOMDAT from DefinedRegular
because I thought that would make the code cleaner, but now
we want to handle them as the same type here. Maybe we
should roll it back.

This change should improve readability a bit as this removes
some dubious uses of reinterpret_cast. Previously, we
assumed that all COMDAT chunks are actually SectionChunks,
which was not very obvious.

llvm-svn: 240675

fc510f4c

COFF: Simplify. NFC. · f34c0885

Rui Ueyama authored Jun 25, 2015

llvm-svn: 240666

f34c0885

COFF: Use std::equal to compare two lists of relocations. · c6fcfbc9

Rui Ueyama authored Jun 25, 2015

llvm-svn: 240665

c6fcfbc9

COFF: Don't use COFFHeader->NumberOfRelocations. · 02c30279

Rui Ueyama authored Jun 25, 2015

The size of the field is 16 bit, so it's inaccurate if the
number of relocations in a section is more than 65535.
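
A minimal sketch of how a reader can recover the real count, assuming the overflow convention from the PE/COFF specification: when `IMAGE_SCN_LNK_NRELOC_OVFL` is set and the 16-bit field is 0xffff, the first relocation entry is a placeholder whose `VirtualAddress` holds the total count. The struct and function names are simplified stand-ins, not LLD's or libObject's code.

```cpp
#include <cassert>
#include <cstdint>

// Simplified stand-ins for the on-disk COFF structures (hypothetical names).
struct SectionHeader {
  uint32_t PointerToRelocations;
  uint16_t NumberOfRelocations;
  uint32_t Characteristics;
};
struct Relocation {
  uint32_t VirtualAddress;
  uint32_t SymbolTableIndex;
  uint16_t Type;
};

constexpr uint32_t IMAGE_SCN_LNK_NRELOC_OVFL = 0x01000000;

// Returns the number of real relocation entries for a section.
uint32_t getNumRelocations(const SectionHeader &Sec, const Relocation *First) {
  if ((Sec.Characteristics & IMAGE_SCN_LNK_NRELOC_OVFL) &&
      Sec.NumberOfRelocations == 0xffff) {
    assert(First && "an overflowed section must have a relocation table");
    // The first entry is a placeholder that stores the total count,
    // itself included, so subtract one to get the real entries.
    return First->VirtualAddress - 1;
  }
  return Sec.NumberOfRelocations;
}
```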

llvm-svn: 240661

02c30279

COFF: Fix a bug of __imp_ symbol. · 88e0f920

Rui Ueyama authored Jun 25, 2015

The change I made in r240620 was not correct. If a symbol foo is
defined, and if you use __imp_foo, __imp_foo symbol is automatically
defined as a pointer (not just an alias) to foo.

Now we need to create a chunk for automatically-created symbols.
I defined the LocalImportChunk class for them.

llvm-svn: 240622

88e0f920

COFF: Use COFFObjectFile::getRelocations(). NFC. · 42aa00b3

Rui Ueyama authored Jun 25, 2015

llvm-svn: 240614

42aa00b3

COFF: Cache raw pointers to relocation tables. · cde92423

Rui Ueyama authored Jun 24, 2015

Getting an iterator to the relocation table is a very hot operation
in the linker. We do that not only to apply relocations but also
to mark live sections and to do ICF.

libObject's interface is slow. Caching pointers to the first
relocation table entries makes the linker 6% faster to self-link.

We probably need to fix libObject as well.

llvm-svn: 240603

cde92423

Jun 24, 2015

COFF: Initial implementation of Identical COMDAT Folding. · ddf71fc3

Rui Ueyama authored Jun 24, 2015

Identical COMDAT Folding (ICF) is an optimization to reduce binary
size by merging COMDAT sections that contain the same metadata,
actual data and relocations. MSVC link.exe and many other linkers
have this feature. With this patch, LLD is on par with MSVC in terms of
produced binary size.

This technique is pretty effective. For example, LLD's size is
reduced from 64MB to 54MB by enabling this optimization.

The algorithm implemented in this patch is extremely inefficient.
It puts all COMDAT sections into a set to identify duplicates.
Times to self-link without and with ICF are 3.3 and 320 seconds,
respectively. So this option makes LLD roughly 100x slower.
But it's okay as I wanted to achieve correctness first.
LLD is still able to link itself with this optimization.
I'm going to make it more efficient in followup patches.
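
The set-based approach described above can be sketched roughly as follows. This is an illustration under simplifying assumptions, not the patch's code: `Section`, `Reloc` and `foldIdentical` are hypothetical names, relocation targets are compared by index, and only a single pass is made.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>
#include <tuple>
#include <vector>

// Hypothetical, simplified model of a COMDAT section.
struct Reloc {
  uint32_t Offset;
  uint16_t Type;
  uint32_t Target; // index of the section the relocation points to
  bool operator<(const Reloc &O) const {
    return std::tie(Offset, Type, Target) <
           std::tie(O.Offset, O.Type, O.Target);
  }
};
struct Section {
  uint32_t Characteristics; // metadata
  std::string Data;         // actual data
  std::vector<Reloc> Relocs;
};

// Returns, for each section, the index of the section it is folded into.
std::vector<size_t> foldIdentical(const std::vector<Section> &Secs) {
  using Key = std::tuple<uint32_t, std::string, std::vector<Reloc>>;
  std::map<Key, size_t> Seen;
  std::vector<size_t> Rep(Secs.size());
  for (size_t I = 0; I < Secs.size(); ++I) {
    Key K(Secs[I].Characteristics, Secs[I].Data, Secs[I].Relocs);
    // emplace keeps the first section seen with this key as representative.
    Rep[I] = Seen.emplace(K, I).first->second;
    assert(Rep[I] <= I && "a representative is never a later section");
  }
  return Rep;
}
```

Copying every section's contents into an ordered container is what makes this kind of approach slow; hashing the contents first would be one obvious way to cut the cost.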

Note that this optimization is *not* entirely safe. C/C++ require
that different functions have different addresses. If your program
relies on that property, your program wouldn't work with ICF.
However, it's not going to be an issue on Windows because MSVC
link.exe turns ICF on by default. As long as your program works
with default settings (that is, without passing /opt:noicf), your
program would work with LLD too.

llvm-svn: 240519

ddf71fc3

COFF: Remove unused field SectionChunk::SectionIndex. · bd3a29d0

Peter Collingbourne authored Jun 24, 2015

llvm-svn: 240512

bd3a29d0

COFF: Add names for logging/debugging to COMDAT chunks. · 6a60be77

Rui Ueyama authored Jun 24, 2015

Chunks are basically unnamed chunks of bytes, and we don't like
to give them names. However, for logging or debugging, we want to
know the symbol names of functions for COMDAT chunks. (For example,
we want to print out "we have removed unreferenced COMDAT section
which contains a function FOOBAR.")

This patch is to do that.

llvm-svn: 240484

6a60be77

Jun 20, 2015

COFF: Fix common symbol alignment. · 5e31d0b2

Rui Ueyama authored Jun 20, 2015

llvm-svn: 240217

5e31d0b2

Jun 15, 2015

COFF: Support base relocations. · 588e832d

Rui Ueyama authored Jun 15, 2015

PE/COFF executables/DLLs usually contain data which is called
base relocations. Base relocations are a list of addresses that
need to be fixed by the loader if load-time relocation is needed.

Base relocations are in .reloc section.

We emit one base relocation entry for each IMAGE_REL_AMD64_ADDR64
relocation.

In order to save disk space, base relocations are grouped by page.
Each group is called a block. A block starts with a 32-bit page
address followed by 16-bit offsets in the page. That is a more
efficient representation of addresses than just an array of 32-bit
addresses.
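
A sketch of the grouping, not LLD's code. It follows the block layout in the PE/COFF specification, where the 32-bit page RVA is followed by a 32-bit block size before the 16-bit entries, and each entry packs a 4-bit type with a 12-bit page offset. `buildBaseRelocs` and the helper names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <vector>

constexpr uint32_t PageSize = 4096;
constexpr uint16_t RelBasedAbsolute = 0; // IMAGE_REL_BASED_ABSOLUTE (padding)
constexpr uint16_t RelBasedDir64 = 10;   // IMAGE_REL_BASED_DIR64

// Little-endian writers.
static void write16(std::vector<uint8_t> &Out, uint16_t V) {
  Out.push_back(V & 0xff);
  Out.push_back(V >> 8);
}
static void write32(std::vector<uint8_t> &Out, uint32_t V) {
  write16(Out, V & 0xffff);
  write16(Out, V >> 16);
}

// Serializes one block per 4 KiB page for the given 64-bit fixup RVAs.
std::vector<uint8_t> buildBaseRelocs(const std::vector<uint32_t> &RVAs) {
  // Group the low 12 bits of each RVA under its page address.
  std::map<uint32_t, std::vector<uint16_t>> Pages;
  for (uint32_t RVA : RVAs)
    Pages[RVA & ~(PageSize - 1)].push_back(RVA & (PageSize - 1));

  std::vector<uint8_t> Out;
  for (auto &P : Pages) {
    const std::vector<uint16_t> &Offsets = P.second;
    // Blocks must stay 4-byte aligned; pad odd counts with a no-op entry.
    bool Pad = Offsets.size() % 2 != 0;
    uint32_t Size = 8 + 2 * (Offsets.size() + (Pad ? 1 : 0));
    write32(Out, P.first); // page RVA
    write32(Out, Size);    // block size, including this 8-byte header
    for (uint16_t Off : Offsets)
      write16(Out, (RelBasedDir64 << 12) | Off);
    if (Pad)
      write16(Out, RelBasedAbsolute << 12);
    assert(Out.size() % 4 == 0 && "blocks must be 4-byte aligned");
  }
  return Out;
}
```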

llvm-svn: 239710

588e832d

COFF: Add an assertion. NFC. · 4108f3f3

Rui Ueyama authored Jun 14, 2015

r239458 changed the caller side of this function, so Live can never be
true when this function is called.

llvm-svn: 239705

4108f3f3

Jun 14, 2015

COFF: Support Windows resource files. · 2bf6a122

Rui Ueyama authored Jun 14, 2015

Resource files are data files containing i18n messages, icon images, etc.
MSVC has a tool to convert a resource file to a regular COFF file so that
you can just link that file to embed resources into an executable.

However, you can directly pass resource files to the linker. If you do that,
the linker invokes the tool automatically. This patch implements that feature.

llvm-svn: 239704

2bf6a122

Jun 10, 2015

COFF: De-virtualize and inline garbage collector functions. · 8b33f59b

Rui Ueyama authored Jun 10, 2015

The isRoot, isLive and markLive functions are called very frequently.
Previously, they were virtual functions. This patch makes them
non-virtual.

This patch also checks chunk liveness before calling its mark().
Previously, we did that at the beginning of markLive(), so the virtual
function would return immediately if it's live. That was inefficient.
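
Roughly, the pattern looks like the following miniature. It is an illustrative sketch with hypothetical members, not LLD's actual class: the liveness flag is tested in a small non-virtual function, and the traversal only runs for chunks that are not yet live.

```cpp
#include <cassert>
#include <vector>

// Hypothetical miniature of a section chunk.
class SectionChunk {
public:
  bool isLive() const { return Live; } // non-virtual, trivially inlinable

  void markLive() {
    if (!Live) // test the flag at the call site ...
      mark();  // ... so already-live chunks cost no further call
  }

  // Records that this chunk refers to C (for example via a relocation).
  void addEdge(SectionChunk *C) { Edges.push_back(C); }

private:
  void mark() {
    assert(!Live && "mark() must run at most once per chunk");
    Live = true;
    for (SectionChunk *C : Edges)
      C->markLive();
  }

  bool Live = false;
  std::vector<SectionChunk *> Edges;
};
```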

llvm-svn: 239458

8b33f59b

Jun 08, 2015

COFF: Print out log messages to stdout. · 5b2588ae

Rui Ueyama authored Jun 08, 2015

llvm-svn: 239288

5b2588ae

COFF: Set non-1 alignment to common chunks. · 9cf1abb8

Rui Ueyama authored Jun 08, 2015

I don't know what the right thing to do here is, but at least 1 does
not seem like a correct value. If we do not align common chunks at
all, a small program which calls puts() from global dtors crashes
mysteriously in a kernel32 function.

I believe the crash was caused by symbols overlapping each other,
and my guess is that alignment has something to do with that, but
I am not 100% sure. Needs investigating.

llvm-svn: 239280

9cf1abb8

COFF: Fix typo. · a6cd6c0c

Rui Ueyama authored Jun 07, 2015

This change doesn't change its functionality since the value
passed here is converted to uint16_t immediately.

llvm-svn: 239271

a6cd6c0c

Jun 07, 2015

COFF: Move Windows-specific code from Chunk.{cpp,h} to DLL.{cpp,h}. · 4b22fa74

Rui Ueyama authored Jun 07, 2015

llvm-svn: 239239

4b22fa74

Remove redundant `using namespace`. · e56f9c08

Rui Ueyama authored Jun 06, 2015

llvm-svn: 239234

e56f9c08

COFF: Move .idata constructor from Writer to Chunk. · c6ea057d

Rui Ueyama authored Jun 06, 2015

Previously, half of the constructor for .idata contents was in Chunks.cpp
and the rest was in Writer.cpp. This patch moves the latter to Chunks.cpp.
Now the IdataContents class manages everything for the .idata section.

llvm-svn: 239230

c6ea057d

Jun 06, 2015

COFF: Merge Chunk::applyRelocations with Chunk::writeTo. · 743afa07

Rui Ueyama authored Jun 06, 2015

In this design, Chunk is the only thing that knows how to write
its contents to the output file as well as how to apply relocations
there. The writer shouldn't know about the details.

llvm-svn: 239216

743afa07

Jun 01, 2015

COFF: Support import-by-ordinal DLL imports. · fd99e01b

Rui Ueyama authored Jun 01, 2015

Symbols exported by DLLs can be imported not by name but by a
small number called an ordinal. Usually, symbols have both ordinals
and names, and in that case ordinals are called "hints" and
used by the loader as hints.

However, symbols can have only ordinals. They are called
import-by-ordinal symbols. You need to manage ordinals by hand
so that they will never change if you choose to use the feature.
But it's supposed to make dynamic linking faster because
it needs no string comparison. Not sure if that claim still
stands in year 2015, though. Anyways, the feature exists,
and this patch implements that.
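
For illustration, here is how the two kinds of import lookup table entries can be encoded for a 64-bit (PE32+) image, following the PE/COFF specification: the top bit selects import by ordinal, otherwise the low 31 bits hold the RVA of a Hint/Name entry. The function names are hypothetical and this is not the patch's code.

```cpp
#include <cassert>
#include <cstdint>

constexpr uint64_t OrdinalFlag = 1ULL << 63; // IMAGE_ORDINAL_FLAG64

// Entry for a symbol imported by ordinal only.
uint64_t makeOrdinalEntry(uint16_t Ordinal) { return OrdinalFlag | Ordinal; }

// Entry for a symbol imported by name; points at its Hint/Name entry.
uint64_t makeNameEntry(uint32_t HintNameRVA) {
  assert(HintNameRVA < (1u << 31) && "RVA must fit in 31 bits");
  return HintNameRVA;
}

bool isOrdinalImport(uint64_t Entry) { return (Entry & OrdinalFlag) != 0; }
```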

llvm-svn: 238780

fd99e01b

COFF: Use Chunk instead of its derived classes. · c2abdd91

Rui Ueyama authored Jun 01, 2015

I'm adding ordinal-only (nameless) imports to the import table.
The chunk for that type is going to be different from LookupChunk.
Without this change, we cannot add objects of the new type to the
vectors.

llvm-svn: 238779

c2abdd91

COFF: Fix the import table Hint/Name field. · 5b25eddd

Rui Ueyama authored Jun 01, 2015

llvm-svn: 238719

5b25eddd

COFF: Define an error category for the linker. · 8fd9fb98

Rui Ueyama authored Jun 01, 2015

Instead of returning non-categorized errors, return categorized errors.
All uses of make_dynamic_error_code are removed.

Because we don't have an error reporting mechanism, I just chose to print out
error messages to stderr, and then return an error object. Not sure if
that's the right thing to do, but at least it seems practical.
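
As a rough illustration of what a categorized error looks like with the standard `std::error_category` interface (the enum values, messages and the "linker" name below are hypothetical, not necessarily what the patch defines):

```cpp
#include <cassert>
#include <string>
#include <system_error>

// Hypothetical error codes; 0 is left unused because it means "success".
enum class LinkerError { InvalidOption = 1, InvalidFile, BrokenFile, DuplicateSymbols };

namespace {
class LinkerErrorCategory : public std::error_category {
public:
  const char *name() const noexcept override { return "linker"; }
  std::string message(int EV) const override {
    switch (static_cast<LinkerError>(EV)) {
    case LinkerError::InvalidOption:    return "invalid option";
    case LinkerError::InvalidFile:      return "invalid file";
    case LinkerError::BrokenFile:       return "broken file";
    case LinkerError::DuplicateSymbols: return "duplicate symbols";
    }
    return "unknown linker error";
  }
};
} // namespace

// Single shared category object, as the std::error_code design expects.
const std::error_category &linkerCategory() {
  static LinkerErrorCategory C;
  return C;
}

std::error_code make_error_code(LinkerError E) {
  assert(static_cast<int>(E) != 0 && "0 is reserved for success");
  return std::error_code(static_cast<int>(E), linkerCategory());
}
```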

http://reviews.llvm.org/D10129

llvm-svn: 238714

8fd9fb98

May 29, 2015

COFF: Fill import table HintName field. · c9bfe320

Rui Ueyama authored May 29, 2015

Currently we set the field to zero, but as per the spec, we should
set it to the numbers we read from import library files. The loader uses
the values as starting offsets for binary search when looking up imported
symbols from a DLL.
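
For reference, a Hint/Name table entry as laid out in the PE/COFF specification is a 16-bit hint, the NUL-terminated symbol name, and a pad byte if needed to keep the next entry on an even boundary. A small sketch, with a hypothetical `makeHintName` helper:

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Builds one Hint/Name entry: little-endian hint, name, NUL, optional pad.
std::vector<uint8_t> makeHintName(uint16_t Hint, const std::string &Name) {
  assert(!Name.empty() && "an import by name needs a name");
  std::vector<uint8_t> Out;
  Out.push_back(Hint & 0xff);
  Out.push_back(Hint >> 8);
  Out.insert(Out.end(), Name.begin(), Name.end());
  Out.push_back(0);   // string terminator
  if (Out.size() % 2) // keep the following entry 2-byte aligned
    Out.push_back(0);
  return Out;
}
```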

llvm-svn: 238562

c9bfe320

May 28, 2015

Fix non-debug build. · 9aefa0c6

Rui Ueyama authored May 28, 2015

llvm-svn: 238474

9aefa0c6

COFF: Teach Chunk to write to a mmap'ed output file. · d6fefba4

Rui Ueyama authored May 28, 2015

Previously, Writer directly handled writes to a file.
Chunks needed to give Writer a contiguous chunk of memory.
That was inefficient if you construct data in chunks because
it would require two memory copies (one to construct a chunk
and the other to write that to a file).

This patch teaches Chunk to write directly to a file.
From a readability point of view, this is also good because
you no longer have to call hasData() before calling getData().
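
The idea can be sketched as below. This is a simplified illustration with hypothetical names (`StringChunk`, `writeChunks`, `FileOff`), and a `std::vector` stands in for the mmap'ed output file.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>
#include <utility>
#include <vector>

// Hypothetical miniature of the design.
class Chunk {
public:
  virtual ~Chunk() = default;
  virtual size_t getSize() const = 0;
  // Writes this chunk's bytes at its assigned offset inside Buf.
  virtual void writeTo(uint8_t *Buf) const = 0;
  size_t FileOff = 0; // set by the writer during layout
};

class StringChunk : public Chunk {
public:
  explicit StringChunk(std::string S) : Str(std::move(S)) {}
  size_t getSize() const override { return Str.size() + 1; }
  void writeTo(uint8_t *Buf) const override {
    // Single copy, straight into the output buffer.
    std::memcpy(Buf + FileOff, Str.c_str(), Str.size() + 1);
  }

private:
  std::string Str;
};

// Assigns consecutive offsets, then lets each chunk write itself.
void writeChunks(const std::vector<Chunk *> &Chunks, std::vector<uint8_t> &Out) {
  size_t Off = 0;
  for (Chunk *C : Chunks) {
    C->FileOff = Off;
    Off += C->getSize();
  }
  Out.assign(Off, 0); // stand-in for the mmap'ed output file
  for (const Chunk *C : Chunks) {
    assert(C->FileOff + C->getSize() <= Out.size());
    C->writeTo(Out.data());
  }
}
```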

llvm-svn: 238464

d6fefba4

COFF: Add a new PE/COFF port. · 411c6360

Rui Ueyama authored May 28, 2015

This is an initial patch for a section-based COFF linker.

The patch has 2300 lines of code including comments and blank lines.
Before diving into details, you may want to start by reading the README
because it should give you an overview of the design.

All important things are written in the README file, so I'll write only
a summary here.

- The linker is already able to self-link on Windows.

- It's significantly faster than the existing implementation.
  The existing one takes 5 seconds to link LLD on my machine,
  while the new one only takes 1.2 seconds, even though the new
  one is not multi-threaded yet. (And a proof-of-concept multi-
  threaded version was able to link it in 0.5 seconds.)

- It uses much less memory (250MB vs. 2GB virtual memory space
  to self-host).

- IMHO the new code is much simpler and easier to read than
  the existing PE/COFF port.

http://reviews.llvm.org/D10036

llvm-svn: 238458

411c6360