Commits · e5e320de06ef90192aca13f5c2145499919edeaf · Lorenzo Albano / LLVM bpEVL

Feb 28, 2017
- De-template DefinedRegular. · 80474a26
  Rui Ueyama authored Feb 28, 2017
  
  Differential Revision: https://reviews.llvm.org/D30348

  llvm-svn: 296508
  80474a26
Feb 27, 2017

Move SymbolTable<ELFT>::Sections out of the class. · 536a2670

Rui Ueyama authored Feb 27, 2017

The list of all input sections was defined in SymbolTable class for a
historical reason. The list itself is not a template. However, because
SymbolTable class is a template, we needed to pass around ELFT to access
the list. This patch moves the list out of the class so that it doesn't
need ELFT.
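
The effect of the move can be illustrated with a minimal sketch. All names here (`SymbolTableT`, `AllSections`, `ELF32LE`, `ELF64LE`, `countSections`) are hypothetical stand-ins and not lld's actual declarations; the point is only that a list stored in a class template forces its users to name a template argument the list never uses.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for an input section; it does not depend on ELFT.
struct Section {
  std::string Name;
};

// Hypothetical tag types playing the role of the ELFT parameter.
struct ELF32LE {};
struct ELF64LE {};

// Before: the list lives inside a class template, so every user has to
// name the template argument even though the list itself never uses it.
template <class ELFT> struct SymbolTableT {
  static std::vector<Section *> Sections;
};
template <class ELFT> std::vector<Section *> SymbolTableT<ELFT>::Sections;

// This helper must be a template only to reach the list.
template <class ELFT> std::size_t countSectionsOld() {
  return SymbolTableT<ELFT>::Sections.size();
}

// After: a plain namespace-scope list that non-template code can use.
std::vector<Section *> AllSections;

// This helper needs no template parameter.
std::size_t countSections() { return AllSections.size(); }
```

In the "before" form there is also one list per instantiation, so code instantiated for `ELF32LE` and for `ELF64LE` would see different vectors; the namespace-scope list avoids that by construction.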

llvm-svn: 296309

536a2670

Feb 24, 2017
- Merge OutputSectionBase and OutputSection. NFC. · 24e6f363
  Rafael Espindola authored Feb 24, 2017
  
  Now that all special sections are SyntheticSections, we only need one OutputSection class.

  llvm-svn: 296127
  24e6f363
- Expand a comment. NFC. · 798ad9a1
  Rafael Espindola authored Feb 24, 2017
  
  llvm-svn: 296114
  798ad9a1
Feb 23, 2017

Convert EhOutputSection to be a synthetic section. · 66b4e215

Rafael Espindola authored Feb 23, 2017

With this we complete the transition out of special output sections,
and with the previous patches it should be possible to merge
OutputSectionBase and OutputSection.

llvm-svn: 296023

66b4e215

Make InputSection a class. NFC. · 774ea7d0

Rafael Espindola authored Feb 23, 2017

With the current design an InputSection is basically anything that
goes directly in an OutputSection. That includes plain input sections
but also synthetic sections, so this should probably not be a
template.

llvm-svn: 295993

774ea7d0

Merge InputSectionData and InputSectionBase. · c404d50d
Rafael Espindola authored Feb 23, 2017
```
Now that InputSectionBase is not a template there is no reason to have
the two.

llvm-svn: 295924
```
c404d50d

Convert InputSectionBase to a class. · b4c9b81a

Rafael Espindola authored Feb 23, 2017

Removing this template is not a big win by itself, but opens the way
for removing more templates.

llvm-svn: 295923

b4c9b81a

Feb 17, 2017

[ELF] - Move DependentSections vector from InputSection to InputSectionBase · 647c1685

George Rimar authored Feb 17, 2017

I split this out of D29273.
Since we plan to make relocation sections dependent on their target sections
for the --emit-relocs implementation, this change is required to support the
.eh_frame case.

EhInputSection inherits from InputSectionBase and not from InputSection.
So when it has a relocation section, it needs to be able to access the
DependentSections vector.

This case occurs in practice with the Linux kernel.

Differential revision: https://reviews.llvm.org/D30084

llvm-svn: 295483

647c1685

Feb 16, 2017

[ELF] - Allow section to have multiple dependent sections. · 09015fee

George Rimar authored Feb 16, 2017

This fixes the case where a section has more than one metadata
section. Previously, GC could collect some of those sections
because the implementation stored only the last one as a
dependent.
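
A minimal sketch of why a vector is needed, assuming a simplified mark phase. The `Section` type and `markLive` function are hypothetical and are not lld's implementation; only the idea of a `DependentSections` list comes from the commits here.

```cpp
#include <cassert>
#include <vector>

// Hypothetical, simplified section type; not lld's InputSection.
struct Section {
  bool Live = false;
  // All sections that must stay alive whenever this one does.
  std::vector<Section *> DependentSections;
};

// Mark Root and, transitively, every section that depends on it.
void markLive(Section *Root) {
  std::vector<Section *> Worklist{Root};
  while (!Worklist.empty()) {
    Section *S = Worklist.back();
    Worklist.pop_back();
    if (S->Live)
      continue; // already visited
    S->Live = true;
    for (Section *Dep : S->DependentSections)
      Worklist.push_back(Dep);
  }
}
```

With a single `Section *` field instead of the vector, registering a second metadata section would overwrite the first, which would then never be marked and would be collected.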

Differential revision: https://reviews.llvm.org/D29981

llvm-svn: 295298

09015fee

Feb 03, 2017

Replace MergeOutputSection with a synthetic section. · 9e9754b5

Rafael Espindola authored Feb 03, 2017

With a synthetic merge section we can have, for example, a single
.rodata section with strings, fixed-size constants and non-merge
constants.

It can be simplified further by not setting Entsize, but that is
probably better done in a follow-up patch.

This should allow some cleanup in the linker script code now that
every output section command maps to just one output section.

llvm-svn: 294005

9e9754b5

Feb 01, 2017

[ELF] Use SyntheticSections for Thunks · 3a52eb00

Peter Smith authored Feb 01, 2017

    
Thunks are now implemented by redirecting the relocation to the
symbol S, to a symbol TS in a Thunk. The Thunk will transfer control
to S. This has the following implications:
- All the side-effects of Thunks happen within createThunks()
- Thunks are no longer stored in InputSections and Symbols no longer
  need to hold a pointer to a Thunk
- The synthetic Thunk sections need to be merged into OutputSections
    
This implementation is almost a direct conversion of the existing
Thunks with the following exceptions:
- Mips LA25 Thunks are placed before the InputSection that defines
  the symbol that needs a Thunk.
- All ARM Thunks are placed at the end of the OutputSection of the
  first caller to the Thunk.
    
Range extension Thunks are not supported yet so it is optimistically
assumed that all Thunks can be reused.

This is a recommit of r293283 with a fixed comparison predicate as
std::merge requires a strict weak ordering.
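
The commit message does not show the faulty predicate. As a hedged illustration of the requirement it mentions, the sketch below merges two offset-sorted lists with a predicate that uses `<` and is therefore irreflexive; a predicate written with `<=` would not be a strict weak ordering. The `Sec` type and function names are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <vector>

// Hypothetical section record: only the offset matters for ordering.
struct Sec {
  unsigned Offset;
  bool IsThunk;
};

// Valid predicate: byOffset(A, A) is always false.
bool byOffset(const Sec &A, const Sec &B) { return A.Offset < B.Offset; }

// Merge two lists that are each already sorted by offset.
std::vector<Sec> mergeSections(const std::vector<Sec> &Inputs,
                               const std::vector<Sec> &Thunks) {
  std::vector<Sec> Out;
  Out.reserve(Inputs.size() + Thunks.size());
  std::merge(Inputs.begin(), Inputs.end(), Thunks.begin(), Thunks.end(),
             std::back_inserter(Out), byOffset);
  return Out;
}
```

Because `std::merge` is stable, elements of the first range come before equivalent elements of the second, so in this sketch an input section precedes a thunk that has the same offset.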

Differential revision: https://reviews.llvm.org/D29327

llvm-svn: 293757

3a52eb00

Jan 28, 2017
- Revert "[ELF][ARM] Use SyntheticSections for Thunks" · f20ee9f1
  Rui Ueyama authored Jan 28, 2017
  
  This reverts commit r293283 because it broke MSVC build.

  llvm-svn: 293352
  f20ee9f1
Jan 27, 2017

[ELF][ARM] Use SyntheticSections for Thunks · 5191c6f9

Peter Smith authored Jan 27, 2017

    
Thunks are now implemented by redirecting the relocation to the
symbol S, to a symbol TS in a Thunk. The Thunk will transfer control
to S. This has the following implications:
- All the side-effects of Thunks happen within createThunks()
- Thunks are no longer stored in InputSections and Symbols no longer
  need to hold a pointer to a Thunk
- The synthetic Thunk sections need to be merged into OutputSections
    
This implementation is almost a direct conversion of the existing
Thunks with the following exceptions:
- Mips LA25 Thunks are placed before the InputSection that defines
  the symbol that needs a Thunk.
- All ARM Thunks are placed at the end of the OutputSection of the
  first caller to the Thunk.
    
Range extension Thunks are not supported yet so it is optimistically
assumed that all Thunks can be reused.

Differential Revision:  https://reviews.llvm.org/D29129

llvm-svn: 293283

5191c6f9

Jan 12, 2017

[ELF] - Reuse Decompressor class. · 0d8af369

George Rimar authored Jan 12, 2017

The intention of this change is to get rid of code duplication.
Decompressor was introduced in D28105.

This change lets us remove a few methods related to decompression.

Differential revision: https://reviews.llvm.org/D28106

llvm-svn: 291758

0d8af369

Jan 06, 2017
- Merge elf::toString and coff::toString. · ce039266
  Rui Ueyama authored Jan 06, 2017
  
  The two overloaded functions hid each other. This patch merges them.

  llvm-svn: 291222
  ce039266
Dec 20, 2016

Remove `Compressed` member from InputSectionData. · c207a89c

Rui Ueyama authored Dec 20, 2016

This value is used only once, and we can compute it when needed.
So we don't need to save it.

llvm-svn: 290164

c207a89c

Dec 19, 2016

Remove inappropriate use of CachedHashStringRef. · 8f687f71

Rui Ueyama authored Dec 19, 2016

Use of CachedHashStringRef makes sense only when we reuse hash values.
Sprinkling it over every DenseMap has no benefit and just complicates data types.
Basically, we shouldn't use CachedHashStringRef unless there is a strong
reason to do so.

llvm-svn: 290076

8f687f71

Dec 06, 2016

Inline MergeInputSection::getData(). · c8e68848

Rui Ueyama authored Dec 06, 2016

This change seems to make LLD 0.6% faster when linking Clang with
debug info. I don't want us to have lots of local optimizations,
but this function is very hot, and the improvement is small but
not negligible, so I think it's worth doing.

llvm-svn: 288757

c8e68848

Dec 05, 2016

Use "equivalence class" instead of "color" to describe the concept in ICF. · fcd3fa83

Rui Ueyama authored Dec 05, 2016

Also add a citation to GNU gold safe ICF paper.

Differential Revision: https://reviews.llvm.org/D27398

llvm-svn: 288684

fcd3fa83

Dec 01, 2016

Updates file comments and variable names. · 91ae861a
Rui Ueyama authored Dec 01, 2016
```
Use "color" instead of "group id" to describe the ICF algorithm.

llvm-svn: 288409
```
91ae861a

Parallelize ICF to make LLD's ICF really fast. · c1835319

Rui Ueyama authored Dec 01, 2016

ICF is short for Identical Code Folding. It is a size optimization that
identifies two or more functions that happen to have the same contents
and merges them. It usually reduces output size by a few percent.

ICF is slow because it is a computationally intensive process. I tried
to parallelize it before but failed because I couldn't make a
parallelized version produce consistent outputs. Although it didn't
create broken executables, every invocation of the linker generated
slightly different output, and I couldn't figure out why.

I think I now understand what was going on, and I also came up with a
simple algorithm to fix it. This patch is the result.

The result is very exciting. Chromium, for example, has 780,662 input
sections, of which 20,774 are reducible by ICF. LLD previously took
7.980 seconds for ICF. Now it finishes in 1.065 seconds.

As a result, LLD can now link a Chromium binary (output size 1.59 GB)
in 10.28 seconds on my machine with ICF enabled. Compared to gold
which takes 40.94 seconds to do the same thing, this is an amazing
number.

From here, I'll describe what we are doing for ICF, what the
previous problem was, and what I did in this patch.

In ICF, two sections are considered identical if they have the same
section flags, section data, and relocations. Relocations are tricky,
because two relocations are considered the same if they have the same
relocation type and values, and if they point to the same section _in
terms of ICF_.

Here is an example. If foo and bar defined below are compiled to the
same machine instructions, ICF can (and should) merge the two,
although their relocations point to each other.

  void bar();
  void foo() { bar(); }
  void bar() { foo(); }

This is not an easy problem to solve.

What we are doing in LLD is a kind of coloring algorithm. We repeatedly
color non-identical sections with different colors, and sections
that have the same color when the algorithm terminates are considered
identical. Here are the details:

  1. First, we color all sections using hash values of their section
  types, section contents, and numbers of relocations. At this point,
  relocation targets are not taken into account. We just give
  sections that obviously differ different colors.

  2. Next, for each color C, we visit the sections having color C to see
  if their relocations are the same. Relocations are considered equal
  if their targets have the same color. We then recolor sections that
  have different relocation targets with new colors.

  3. If we recolored any section in step 2, relocations that were
  previously pointing to targets of the same color may now be pointing to
  different colors. Therefore, we repeat step 2 until it converges.
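
The three steps above can be sketched sequentially as follows. This is a simplified illustration, not the code from the patch: the `Section` type, the `ContentHash` field (standing in for the hash of type and contents) and `colorSections` are hypothetical, and the sketch recomputes every color into a fresh vector on each pass instead of splitting one color at a time.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <utility>
#include <vector>

// Hypothetical, simplified section.
struct Section {
  std::uint64_t ContentHash;       // stands in for section type + contents
  std::vector<std::size_t> Relocs; // indices of relocation target sections
};

std::vector<unsigned> colorSections(const std::vector<Section> &Secs) {
  std::vector<unsigned> Color(Secs.size());

  // Step 1: color by content hash and relocation count only.
  std::map<std::pair<std::uint64_t, std::size_t>, unsigned> Initial;
  for (std::size_t I = 0; I < Secs.size(); ++I) {
    auto Key = std::make_pair(Secs[I].ContentHash, Secs[I].Relocs.size());
    unsigned Fresh = static_cast<unsigned>(Initial.size());
    Color[I] = Initial.emplace(Key, Fresh).first->second;
  }

  // Steps 2 and 3: split colors by the colors of relocation targets and
  // repeat until the number of colors stops growing.
  std::size_t NumColors = Initial.size();
  for (;;) {
    std::map<std::vector<unsigned>, unsigned> Refined;
    std::vector<unsigned> Next(Secs.size());
    for (std::size_t I = 0; I < Secs.size(); ++I) {
      // Signature: own color followed by the colors of all targets.
      std::vector<unsigned> Sig{Color[I]};
      for (std::size_t T : Secs[I].Relocs)
        Sig.push_back(Color[T]);
      unsigned Fresh = static_cast<unsigned>(Refined.size());
      Next[I] = Refined.emplace(Sig, Fresh).first->second;
    }
    Color.swap(Next);
    if (Refined.size() == NumColors)
      return Color; // no color was split: converged
    NumColors = Refined.size();
  }
}
```

Because each signature starts with the section's own color, a pass can only split colors, never merge them, so the loop terminates once a pass produces no new color. For the `foo`/`bar` pair described earlier, both sections keep the same color through every pass, since each one's only target has the same color as the other's.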

Step 2 is a heavy operation. For Chromium, the first iteration of step
2 takes 2.882 seconds, and the second iteration takes 1.038 seconds,
and in total it needs 23 iterations.

Parallelizing step 1 is easy because we can color each section
independently. This patch does that.

Parallelizing step 2 is tricky. We could work on each color
independently, but we cannot recolor sections in place, because that
would break the invariant that two possibly-identical sections must
have the same color at any moment.

Consider sections S1, S2, S3, S4 in the same color C, where S1 and S2
are identical, S3 and S4 are identical, but S2 and S3 are not. Thread
A is about to recolor S1 and S2 in C'. After thread A recolors S1 in
C', but before it recolors S2 in C', another thread B might observe S1 and
S2. Thread B will then conclude that S1 and S2 are different, and it
will wrongly split its own sections into smaller groups. Over-splitting
doesn't produce broken results, but it loses a chance to
merge some identical sections. That was the cause of the nondeterminism.

To fix the problem, I gave each section two colors, namely a current
color and a next color. At the beginning of each iteration, both colors
are the same. Each thread reads from the current color and writes to the next
color. In this way, we prevent threads from reading partial
results. After each iteration, we flip current and next.
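
A minimal single-threaded sketch of the current/next scheme, under several assumptions: the `Section` type, `splitClass` and `runICF` are hypothetical, no real threads are used, and the `Reverse` flag merely changes the order in which color classes are visited as a stand-in for a different thread schedule. Unlike the description above, the sketch rewrites every next color on each pass rather than starting with both colors equal. The property it shows is the one described above: a pass reads only the current colors and writes only the next colors.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <vector>

using Indices = std::vector<std::size_t>;

// Hypothetical, simplified section: relocation targets are section indices.
struct Section {
  Indices Relocs;
  std::size_t Color[2]; // Color[Cur] is only read during a pass
};

// Split one class. Reads Color[Cur] of any section, but writes only
// Color[Cur ^ 1] of its own members.
bool splitClass(std::vector<Section> &Secs, const Indices &Members, int Cur) {
  std::map<Indices, Indices> Groups; // target colors -> members
  for (std::size_t I : Members) {
    Indices Sig;
    for (std::size_t T : Secs[I].Relocs)
      Sig.push_back(Secs[T].Color[Cur]);
    Groups[Sig].push_back(I);
  }
  // Name each group after its first member, so the new color does not
  // depend on the order in which classes are visited.
  for (const auto &G : Groups)
    for (std::size_t I : G.second)
      Secs[I].Color[Cur ^ 1] = G.second.front();
  return Groups.size() > 1;
}

// Run passes until no class splits; flip current and next after each pass.
Indices runICF(std::vector<Section> Secs, bool Reverse) {
  int Cur = 0;
  for (bool Changed = true; Changed; Cur ^= 1) {
    std::map<std::size_t, Indices> Classes;
    for (std::size_t I = 0; I < Secs.size(); ++I)
      Classes[Secs[I].Color[Cur]].push_back(I);
    Changed = false;
    if (Reverse) {
      for (auto It = Classes.rbegin(); It != Classes.rend(); ++It)
        Changed |= splitClass(Secs, It->second, Cur);
    } else {
      for (const auto &C : Classes)
        Changed |= splitClass(Secs, C.second, Cur);
    }
  }
  Indices Result;
  for (const Section &S : Secs)
    Result.push_back(S.Color[Cur]);
  return Result;
}
```

The caller is expected to fill `Color[0]` with the initial colors from step 1. Since no class ever reads a color that another class is writing in the same pass, visiting the classes in either order yields the same final colors.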

This is a very simple solution and is implemented in fewer than 50
lines of code.

I tested this patch with Chromium and confirmed that this parallelized
ICF produces the same output as the non-parallelized one.

Differential Revision: https://reviews.llvm.org/D27247

llvm-svn: 288373

c1835319

Nov 26, 2016

Change return types of split{Non,}Strings. · e8a077ba

Rui Ueyama authored Nov 26, 2016

They return new vectors, but at the same time they mutate other vectors,
so returning values doesn't make much sense. We should just mutate two
vectors.

llvm-svn: 287979

e8a077ba

Nov 25, 2016

Move getLocation from Relocations.cpp to InputSection.cpp. · da06bfb7

Rui Ueyama authored Nov 25, 2016

The function was used only within Relocations.cpp, but now we are
using it in many places, so this patch moves it to a file that better
fits its functionality.

llvm-svn: 287943

da06bfb7

Nov 23, 2016

Define toString() as a generic function to get a string for error message. · 3fc0f7e5

Rui Ueyama authored Nov 23, 2016

We have different functions to stringize objects to construct
error messages. For InputFile, we have getFilename, and for
InputSection, we have getName. You had to memorize them.

I think this is a case where function overloading comes in handy.

This patch defines toString() functions that are overloaded for all these
types, so that you can just call toString() in error().
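
A minimal sketch of the overloading idea. The types below are simplified, hypothetical stand-ins for the linker's classes, and the `file:(section)` output format and the `describe` helper are assumptions made for illustration only.

```cpp
#include <cassert>
#include <string>

// Hypothetical, simplified stand-ins for the linker's types.
struct InputFile {
  std::string Path;
};
struct InputSection {
  const InputFile *File;
  std::string Name;
};

// One function name, overloaded per type.
std::string toString(const InputFile &F) { return F.Path; }
std::string toString(const InputSection &S) {
  return toString(*S.File) + ":(" + S.Name + ")";
}

// A caller building an error message no longer needs to remember a
// type-specific getter; overload resolution picks the right function.
template <class T>
std::string describe(const T &Obj, const std::string &Msg) {
  return toString(Obj) + ": " + Msg;
}
```

Adding support for another type then only requires one more `toString` overload, and existing call sites stay unchanged.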

Differential Revision: https://reviews.llvm.org/D27030

llvm-svn: 287787

3fc0f7e5

[ELF] Print error location in .eh_frame parser · 531df4fc
Eugene Leviant authored Nov 23, 2016
```
Differential revision: https://reviews.llvm.org/D26914

llvm-svn: 287750
```
531df4fc

Nov 21, 2016

Add a flag to InputSectionBase for linker script. · f94efddd

Rui Ueyama authored Nov 20, 2016

Previously, we set InputSectionBase::OutSec to (uintptr_t)-1 to record
that a section had already been assigned to some output section
by linker scripts. Later, we restored the pointer to nullptr to use
the field for its original purpose. That overloading is not very easy to
understand.

This patch adds a bit flag for that purpose, so that we don't need
to piggyback the flag on an unrelated pointer.
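
A sketch of the difference, using simplified, hypothetical types. The field name `Assigned` and the helper `claim` are assumptions for illustration and may not match the names used in the patch.

```cpp
#include <cassert>

struct OutputSection {};

// Hypothetical, simplified section type.
struct InputSectionBase {
  // Only ever a real output section or nullptr; no sentinel values.
  OutputSection *OutSec = nullptr;
  // Set once a linker script has claimed this section.
  bool Assigned = false;
};

// Claim a section for a linker script rule. Returns false if some earlier
// rule has already claimed it.
bool claim(InputSectionBase &S) {
  if (S.Assigned)
    return false;
  S.Assigned = true;
  return true;
}
```

With a separate flag, `OutSec` never has to hold a value that is not a valid pointer, and nothing needs to be restored to `nullptr` afterwards.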

llvm-svn: 287508

f94efddd

Nov 20, 2016

Do not expose ICF class from the file. · bd1f0630

Rui Ueyama authored Nov 20, 2016

This patch also uses file-scope functions instead of class member functions.

Now that the ICF class is not visible from outside, the InputSection class
can no longer be a "friend" of it. So I removed the friend relationship
and just made InputSection expose the features publicly.

llvm-svn: 287480

bd1f0630

Nov 18, 2016

Simplify MergeOutputSection. · 77f2a875

Rui Ueyama authored Nov 18, 2016

The MergeOutputSection class was a bit hard to use because it provided
a series of finalize functions that had to be called in the right way
at the right time. It also interacted with MergeInputSection, and the
logic was somewhat entangled between the two classes.

This patch simplifies it by providing only one finalize function.
Now, all you have to do is call MergeOutputSection::finalize
once you have added all sections to the output section. It then
internally merges strings and initializes StringPiece objects.
I think this is much easier to understand.

This patch also adds comments.

llvm-svn: 287314

77f2a875

Nov 14, 2016
- [ELF] - format. NFC. · d8b27769
  George Rimar authored Nov 14, 2016
  
  llvm-svn: 286805
  d8b27769
Nov 11, 2016
- Remove a member from InputSectionData and use the pool instead. · 82664d9d
  Rui Ueyama authored Nov 11, 2016
  
  llvm-svn: 286557
  82664d9d
Nov 10, 2016

Parse relocations only once. · 9f0c4bb7

Rafael Espindola authored Nov 10, 2016

Relocations were the last thing for which we were storing a raw section
pointer and parsing on demand.

With this patch we parse them only once and store a pointer to the
actual data.

The patch also changes where we store it. It is now in
InputSectionBase. Not all sections have relocations, but most do and
this simplifies the logic. It also means that we now only support one
relocation section per section. Given that that constraint is
maintained even with -r in gold, bfd and lld, I think it is OK.

llvm-svn: 286459

9f0c4bb7

[ELF] Convert .got.plt section to input section · 41ca327b
Eugene Leviant authored Nov 10, 2016
```
Differential revision: https://reviews.llvm.org/D26349

llvm-svn: 286443
```
41ca327b

Make OutputSectionBase a class instead of class template. · e08e78df

Rafael Espindola authored Nov 09, 2016

The disadvantage is that we use uint64_t instead of uint32_t for some
values in 32-bit files. The advantages are substantially simpler code,
faster builds and less code duplication.

llvm-svn: 286414

e08e78df

Nov 09, 2016

[ELF][MIPS] Convert .MIPS.abiflags section to synthetic input section · fa03b0fa

Simon Atanasyan authored Nov 09, 2016

Previously, we had both input and output sections for .MIPS.abiflags.
Now we have only one class for .MIPS.abiflags, which is MipsAbiFlagsSection.
This class is a synthetic input section.

.MIPS.abiflags sections are handled as regular sections until
control reaches Writer. Writer then aggregates all sections
whose type is SHT_MIPS_ABIFLAGS to create a single synthesized
input section. The synthesized section is then processed normally
as if it came from an input file.

llvm-svn: 286398

fa03b0fa

[ELF][MIPS] Convert .reginfo and .MIPS.options sections to synthetic input sections · ce02cf00

Simon Atanasyan authored Nov 09, 2016

Previously, we had both input and output sections for .reginfo and
.MIPS.options. Now for each such section we have one synthetic input
section: MipsReginfoSection and MipsOptionsSection respectively.

Both sections are handled as regular sections until control reaches
Writer. Writer then aggregates all sections whose type is SHT_MIPS_REGINFO
or SHT_MIPS_OPTIONS to create a single synthesized input section. At that
point Writer also saves the GP0 value to the MipsGp0 field of the corresponding
ObjectFile. This value is required for calculating R_MIPS_GPREL16 and
R_MIPS_GPREL32 relocations.

Differential revision: https://reviews.llvm.org/D26444

llvm-svn: 286397

ce02cf00

Make Discarded a InputSection. · 6ff570a3

Rafael Espindola authored Nov 09, 2016

It was quite confusing that it had SectionKind of Regular, but was not
actually an InputSection.

llvm-svn: 286379

6ff570a3

Add a convenience getObj method. NFC. · 77dbe9a4
Rafael Espindola authored Nov 09, 2016
```
llvm-svn: 286370
```
77dbe9a4

Nov 08, 2016
- Revert "[ELF] Make InputSection<ELFT>::writeTo virtual" · 1a541123
  Rafael Espindola authored Nov 08, 2016
  
  This reverts commit r286100. This saves 8 bytes of every InputSection.

  llvm-svn: 286235
  1a541123
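
The size cost mentioned in the revert can be illustrated with a small sketch. The two structs are hypothetical and much smaller than a real InputSection; the exact difference depends on the ABI, although on common 64-bit platforms a single virtual function typically adds one 8-byte vtable pointer per object.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical section with a non-virtual writeTo.
struct PlainSection {
  std::uint64_t Size = 0;
  void writeTo(std::uint8_t *) {}
};

// Same data, but writeTo is virtual, so each object also carries a hidden
// pointer to the class's vtable.
struct VirtualSection {
  std::uint64_t Size = 0;
  virtual void writeTo(std::uint8_t *) {}
  virtual ~VirtualSection() = default;
};

// Extra bytes per object caused by making the class polymorphic.
std::size_t virtualOverhead() {
  return sizeof(VirtualSection) - sizeof(PlainSection);
}
```

A linker can hold a very large number of input sections in memory at once, so a per-object overhead of this kind adds up quickly.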
Nov 07, 2016
- [ELF] Make InputSection<ELFT>::writeTo virtual · 0a8f1fe6
  Eugene Leviant authored Nov 07, 2016
  
  Differential revision: https://reviews.llvm.org/D26281

  llvm-svn: 286100
  0a8f1fe6