  1. May 27, 2016
    • Avoid doing binary search. · 406b469d
      Rui Ueyama authored
      MergedInputSection::getOffset is the busiest function in LLD if string
      merging is enabled and input files have lots of mergeable sections.
      That is usually the case when creating an executable with debug info,
      so it is pretty common.
      
      It is slow because it has to do fairly complex computations.
      For non-mergeable sections, section contents are contiguous in the
      output, so in order to compute an output offset, we only have to add
      the output section's base address to an input offset. But for
      mergeable strings, section contents are split for merging, so they
      are not contiguous. We have to do some lookups.
      
      We used to do a binary search on the list of section pieces.
      I think it is slow because it is hostile to branch prediction.
      
      This patch replaces it with a hash table lookup. It seems to work
      pretty well. Below is "perf stat -r10" output when linking clang
      with debug info. In this case, this patch speeds up the link by
      about 4%.
      
      Before:
      
             6584.153205 task-clock (msec)         #    1.001 CPUs utilized            ( +-  0.09% )
                     238 context-switches          #    0.036 K/sec                    ( +-  6.59% )
                       0 cpu-migrations            #    0.000 K/sec                    ( +- 50.92% )
               1,067,675 page-faults               #    0.162 M/sec                    ( +-  0.15% )
          18,369,931,470 cycles                    #    2.790 GHz                      ( +-  0.09% )
           9,640,680,143 stalled-cycles-frontend   #   52.48% frontend cycles idle     ( +-  0.18% )
         <not supported> stalled-cycles-backend
          21,206,747,787 instructions              #    1.15  insns per cycle
                                                   #    0.45  stalled cycles per insn  ( +-  0.04% )
           3,817,398,032 branches                  #  579.786 M/sec                    ( +-  0.04% )
             132,787,249 branch-misses             #    3.48% of all branches          ( +-  0.02% )
      
             6.579106511 seconds time elapsed                                          ( +-  0.09% )
      
      After:
      
             6312.317533 task-clock (msec)         #    1.001 CPUs utilized            ( +-  0.19% )
                     221 context-switches          #    0.035 K/sec                    ( +-  4.11% )
                       1 cpu-migrations            #    0.000 K/sec                    ( +- 45.21% )
               1,280,775 page-faults               #    0.203 M/sec                    ( +-  0.37% )
          17,611,539,150 cycles                    #    2.790 GHz                      ( +-  0.19% )
          10,285,148,569 stalled-cycles-frontend   #   58.40% frontend cycles idle     ( +-  0.30% )
         <not supported> stalled-cycles-backend
          18,794,779,900 instructions              #    1.07  insns per cycle
                                                   #    0.55  stalled cycles per insn  ( +-  0.03% )
           3,287,450,865 branches                  #  520.799 M/sec                    ( +-  0.03% )
              72,259,605 branch-misses             #    2.20% of all branches          ( +-  0.01% )
      
             6.307411828 seconds time elapsed                                          ( +-  0.19% )
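
The change from binary search to hash lookup described above can be sketched roughly as follows. This is an illustrative sketch only, not the code from the patch: `Piece`, `offsetByBinarySearch`, `buildOffsetMap`, and `offsetByHash` are hypothetical names, and the fallback to binary search for offsets that do not fall on a piece boundary is an assumption about how such a lookup could be made complete.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for one piece of a split mergeable section.
struct Piece {
  uint64_t InputOff;  // where the piece starts in the input section
  uint64_t OutputOff; // where the merged copy lives in the output section
};

// Binary search over pieces sorted by InputOff. Each probe is a
// data-dependent branch, which is hard for the CPU to predict.
uint64_t offsetByBinarySearch(const std::vector<Piece> &Pieces, uint64_t Off) {
  // Find the first piece that starts after Off, then step back one.
  auto It = std::upper_bound(
      Pieces.begin(), Pieces.end(), Off,
      [](uint64_t V, const Piece &P) { return V < P.InputOff; });
  assert(It != Pieces.begin() && "offset precedes the first piece");
  const Piece &P = *(It - 1);
  return P.OutputOff + (Off - P.InputOff);
}

// Hash table keyed by the start offset of each piece.
std::unordered_map<uint64_t, uint64_t>
buildOffsetMap(const std::vector<Piece> &Pieces) {
  std::unordered_map<uint64_t, uint64_t> Map;
  Map.reserve(Pieces.size());
  for (const Piece &P : Pieces)
    Map[P.InputOff] = P.OutputOff;
  return Map;
}

// Hash lookup first; assumed fallback to binary search when Off points
// into the middle of a piece rather than at its start.
uint64_t offsetByHash(const std::unordered_map<uint64_t, uint64_t> &Map,
                      const std::vector<Piece> &Pieces, uint64_t Off) {
  auto It = Map.find(Off);
  if (It != Map.end())
    return It->second;
  return offsetByBinarySearch(Pieces, Off);
}
```

The sketch assumes that most lookups hit the start of a piece, which is what makes a table keyed by start offset worthwhile; the lower branch and branch-miss counts in the numbers above are consistent with that kind of change.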
      
      Differential Revision: http://reviews.llvm.org/D20645
      
      llvm-svn: 270999
    • Update LLD for D20550. · 5079f3b7
      Peter Collingbourne authored
      Differential Revision: http://reviews.llvm.org/D20704
      
      llvm-svn: 270968
    • Make -L description a bit more precise. · 8ef190c7
      Sean Silva authored
      llvm-svn: 270966
    • Explain a bit better what --start-lib and --end-lib do. · 3b536d09
      Sean Silva authored
      llvm-svn: 270965
    • Add a help description for --threads to avoid confusion. · 688fade4
      Sean Silva authored
      llvm-svn: 270964
    • --threads is a flag, not a number · 2c1a9da8
      Sean Silva authored
      We would previously accept `--threads=4`, but this option just turns on
      threading and does not specify a number of threads.
      
      I ran into this by accident because I was passing `--threads=<n>` but
      the number didn't seem to affect anything.
      
      llvm-svn: 270963
  2. May 26, 2016
  3. May 25, 2016
  4. May 24, 2016
  5. May 23, 2016
    • Remove dead code. · fa2f307c
      Rui Ueyama authored
      The dead declarations made MSVC warn about explicit template
      instantiations of the classes.
      
      llvm-svn: 270471
    • Do not split mergeable sections if they are gc'ed. · b91bf1a9
      Rui Ueyama authored
      Previously, the constructors of mergeable sections did more than
      just set member variables; they split section contents into small
      pieces. That is not always a computationally cheap task, because if
      the section is a mergeable string section, the constructor needs to
      scan the entire section to split it at NUL characters.
      
      If a section was later thrown away by GC, that cost ended up
      being a waste of time. It is going to be a larger problem if the
      section is compressed, because all the time spent uncompressing it
      and splitting it up is going to be wasted.
      
      Luckily, we can defer section splitting until after GC. We just have
      to remember which offsets are in use during GC and apply that
      information later.
      This patch implements it.
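
The idea of deferring the split until after GC can be illustrated with a minimal sketch. It is not the code from the patch: `LazyMergeSection`, `StringPiece`, `markLive`, and `split` are hypothetical names, and the sketch assumes that the only thing GC needs to record is the set of referenced offsets.

```cpp
#include <cstddef>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical description of one NUL-terminated string in a section.
struct StringPiece {
  size_t Offset; // start offset within the section
  size_t Size;   // length, including the terminating NUL
  bool Live;     // true if GC saw a reference into this piece
};

class LazyMergeSection {
public:
  // The constructor only stores the data; it does not scan it.
  explicit LazyMergeSection(std::string Data) : Data(std::move(Data)) {}

  // Called during GC: just remember the referenced offset.
  void markLive(size_t Off) { LiveOffsets.insert(Off); }

  // Called after GC, and only for sections that survived it.
  // Scans for NUL characters and applies the remembered offsets.
  std::vector<StringPiece> split() const {
    std::vector<StringPiece> Pieces;
    size_t Start = 0;
    while (Start < Data.size()) {
      size_t End = Data.find('\0', Start);
      if (End == std::string::npos)
        End = Data.size() - 1; // tolerate a missing final NUL
      size_t Size = End - Start + 1;
      // The piece is live if any recorded offset is in [Start, Start + Size).
      auto It = LiveOffsets.lower_bound(Start);
      bool Live = It != LiveOffsets.end() && *It < Start + Size;
      Pieces.push_back({Start, Size, Live});
      Start += Size;
    }
    return Pieces;
  }

private:
  std::string Data;
  std::set<size_t> LiveOffsets;
};
```

With this shape, a section that GC discards never pays for the scan, since `split` is simply never called on it.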
      
      Differential Revision: http://reviews.llvm.org/D20516
      
      llvm-svn: 270455
    • Fix typos. · 2ab3d208
      Rui Ueyama authored
      llvm-svn: 270451