[ELF] Cap parallel::strategy to 16 threads when --threads= is unspecified

When --threads= is unspecified, we set it to `parallel::strategy.compute_thread_count()`, which uses sched_getaffinity (Linux), cpuset_getaffinity (FreeBSD), or std::thread::hardware_concurrency (others).

Extensive testing on many machines (configurations from {aarch64,x86-64} x {Linux,FreeBSD,Windows} x allocators {native,mimalloc,rpmalloc}) with varying workloads showed that when the concurrency is larger than 16, linking is slower than with --threads=16 because the parallelism overhead outweighs the gains. This is particularly harmful on machines with many cores or when the link job competes with other jobs.

Cap parallel::strategy to 16 threads when --threads= is unspecified. For some workloads, raising the concurrency from 8 to 16 yields almost no improvement.

--thinlto-jobs= is unchanged since ThinLTO backend compiles are embarrassingly parallel.

Link: https://discourse.llvm.org/t/avoidable-overhead-from-threading-by-default/69160

Reviewed By: peter.smith

Differential Revision: https://reviews.llvm.org/D147493
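
For illustration only, the capping rule described above could be sketched as follows. This is not the code from the patch: `kDefaultThreadCap`, `effectiveThreadCount`, and `defaultThreadCount` are hypothetical names, `requested == 0` is an assumed way to model "--threads= unspecified", and std::thread::hardware_concurrency stands in for the platform-specific affinity queries.

```cpp
#include <algorithm>
#include <thread>

// Hypothetical constant: the cap of 16 threads named in this commit message.
constexpr unsigned kDefaultThreadCap = 16;

// Assumption: requested == 0 models "--threads= was not given".
// `available` is the count reported by the platform (affinity mask or
// hardware concurrency).
constexpr unsigned effectiveThreadCount(unsigned requested, unsigned available) {
  if (requested != 0)
    return requested; // an explicit --threads=N is left uncapped
  if (available == 0)
    available = 1; // hardware_concurrency() may return 0 when unknown
  return std::min(available, kDefaultThreadCap);
}

// Convenience wrapper for the unspecified case on the current machine.
inline unsigned defaultThreadCount() {
  return effectiveThreadCount(0, std::thread::hardware_concurrency());
}
```

Under these assumptions, a 64-core machine with no --threads= option would link with 16 threads, an 8-core machine would keep 8, and an explicit --threads=32 would still be honored.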