[lld-macho] Private label aliases to weak symbols should not retain section data
If we have two files with the same weak symbol like so:

```
ltmp0:
_weak:
  <contents>
```

and

```
ltmp1:
_weak:
  <contents>
```

Linking them together should leave only one copy of `<contents>`, not two. Previously, we would keep around both copies because of the private-label `ltmp<N>` symbols (i.e. symbols that start with `l`) -- we would not coalesce those, so we would treat them as retaining the contents.

This matters for more than just size -- we are depending upon this behavior internally for emitting a certain file format. This file format's header is repeated in each object file, but we want it to appear just once in our output.

Why can't we not emit those aliases to `_weak`, or reference the `ltmp<N>` symbols instead of `_weak`? Well, MC actually adds `ltmp<N>` symbols as part of the assembly-to-binary translation step. So any codegen at the clang level can't access them.

All that said... this solution is actually kind of hacky. Here, we avoid creating the private-label symbols at parse time. This is acceptable since we never emit those symbols in our output. However, in ld64, any aliasing temporary symbols (ignored or otherwise) won't retain coalesced data. But implementing this is harder -- we would have to create those symbols first (so we can emit their names later), but we would have to ensure the linker correctly shuffles them around when their aliasees get coalesced.

Additionally, ld64 treats these temporary symbols as functionally equivalent to the weak symbols themselves -- that is, it will emit weak binds when those non-weak temporary aliases are referenced. We have imitated this behavior for private-label symbols, but implementing it for local aliases in general seems substantially more difficult. I'm not sure if any programs actually depend on this behavior though, so maybe it's a moot point.

Finally, ld64 does all this regardless of whether `.subsections_via_symbols` is specified. We don't. But again, given how rare the lack of that directive is (I've only seen it from hand-written assembly inputs), I don't think we need to worry about it.

Reviewed By: #lld-macho, oontvoo

Differential Revision: https://reviews.llvm.org/D139069
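
For illustration only, here is a toy model of the coalescing behavior this message describes. It is a minimal sketch in Python, not lld's actual implementation: `Section`, `parse_object`, and `link` are hypothetical names, every non-private symbol is assumed to be weak, and real symbol resolution is considerably more involved. It only shows why skipping private labels at parse time lets the duplicate contents be dropped.

```python
from dataclasses import dataclass, field


@dataclass
class Section:
    """A chunk of section data plus the names of symbols pointing at it."""
    contents: bytes
    symbols: list = field(default_factory=list)
    live: bool = True


def is_private_label(name: str) -> bool:
    # Assumption for this sketch: private labels start with a lowercase "l".
    return name.startswith("l")


def parse_object(defs, skip_private_labels: bool):
    """Build Sections from (symbol_names, contents) pairs.

    When skip_private_labels is set, private-label symbols are never
    created, mirroring the parse-time approach taken in this change.
    """
    sections = []
    for names, contents in defs:
        kept = [n for n in names
                if not (skip_private_labels and is_private_label(n))]
        sections.append(Section(contents, kept))
    return sections


def link(objects):
    """Coalesce weak definitions; the first definition of each name wins.

    A section is dropped only when none of its symbols retain it.
    """
    winners = {}
    output = []
    for sections in objects:
        for sec in sections:
            retained = False
            for name in sec.symbols:
                if is_private_label(name):
                    # Private labels are file-local and never coalesced,
                    # so they keep their section alive.
                    retained = True
                elif name not in winners:
                    winners[name] = sec
                    retained = True
            sec.live = retained
            if sec.live:
                output.append(sec)
    return output


file_a = [(["ltmp0", "_weak"], b"<contents>")]
file_b = [(["ltmp1", "_weak"], b"<contents>")]

# Old behavior: private labels are created and retain both copies.
before = link([parse_object(file_a, False), parse_object(file_b, False)])
# New behavior: private labels are skipped, so only one copy survives.
after = link([parse_object(file_a, True), parse_object(file_b, True)])
print(len(before), len(after))  # prints "2 1"
```

Note that this sketch drops the labels entirely, which matches the parse-time approach here but not ld64's behavior of keeping the temporary symbols and moving them along with their coalesced aliasees.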