[lld-macho] Deduplicate CFStrings during ICF
`__cfstring` has embedded addends that foil ICF's hashing / equality checks. (We can ignore embedded addends when doing ICF because the same information gets recorded in our Reloc structs.) Therefore, in order to properly dedup CFStrings, we create a mutable copy of the CFString and zero out the embedded addends before performing any hashing / equality checks. (We did in fact have a partial implementation of CFString deduplication already. However, it only worked when the cstrings they point to are at identical offsets in their object files.) I anticipate this approach can be extended to other similar statically-allocated struct sections in the future. In addition, we previously treated all references with differing addends as unequal. This is not true when the references are to literals: different addends may point to the same literal in the output binary. In particular, `__cfstring` has such references to `__cstring`. I've adjusted ICF's `equalsConstant` logic accordingly, and I've added a few more tests to make sure the addend-comparison code path is adequately covered. Fixes https://github.com/llvm/llvm-project/issues/51281. Reviewed By: #lld-macho, Roger Differential Revision: https://reviews.llvm.org/D120137
Loading
Please sign in to comment