Improve parallelism of ICF.
This is the only place we use threads for ICF. The intention of this code was to split an input vector into 256 shards and process them in parallel. What the code was actually doing was to split an input into 257 shards, process the first 256 shards in parallel, and the remaining one in serial. That means this code takes ceil(256/n)+1 instead of ceil(256/n) where n is the number of available CPU cores. The former converges to 2 while the latter converges to 1. This patches fixes the above issue. llvm-svn: 303797
Loading
Please register or sign in to comment