[X86 isel] Remove lane requirement from lowerShuffleAsUNPCKAndPermute
`lowerShuffleAsUNPCKAndPermute` requires each shuffle mask element to be in the same lane in both the input and output vectors. This prevents it from matching certain patterns, for example the one in [GHI 61964](https://github.com/llvm/llvm-project/issues/61964). Removing the lane requirement fixes the issue.

The change I'm targeting is in the test llvm/test/CodeGen/X86/pr61964.ll. The codegen has improved notably with this patch. Elsewhere, it looks like some broadcast instructions are replaced with unpck and perm.

To check whether there is any other performance change, I ran the llvm-test-suite benchmarks from the SingleSource, MultiSource, and MicroBenchmarks directories:

```
Tests: 2665
Short Running: 2009 (filtered out)
Same hash: 140 (filtered out)
In Blacklist: 513 (filtered out)
Remaining: 3
Metric: exec_time

Program                                                                 exec_time
                                                                        lhs    rhs    diff
test-suite :: MultiSource/Benchmarks/ASCI_Purple/SMG2000/smg2000.test   1.64   1.64   0.1%
test-suite :: SingleSource/Benchmarks/Adobe-C++/loop_unroll.test        1.06   1.06   0.0%
test-suite :: MultiSource/Applications/JM/lencod/lencod.test            5.25   5.25   0.0%
                           Geomean difference                           nan    nan    0.0%

      exec_time
l/r         lhs       rhs      diff
count  3.000000  3.000000  3.000000
mean   2.648300  2.649100  0.000462
std    2.269035  2.268849  0.000415
min    1.055500  1.055900  0.000095
25%    1.349300  1.350250  0.000237
50%    1.643100  1.644600  0.000379
75%    3.444700  3.445700  0.000646
max    5.246300  5.246800  0.000913
```

The patch only hits three cases and the result is neutral. (The 513 blacklisted benchmarks are the ones under MicroBenchmarks, for which `--filter-hash` does not work; I manually verified that their code did not change.)

Differential Revision: https://reviews.llvm.org/D147668
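For readers unfamiliar with the lane requirement, here is a minimal sketch of what "same lane in both the input and output vectors" means for a shuffle mask. It is not the code in this patch or in X86ISelLowering.cpp: `crossesLane` is a hypothetical helper, and it assumes 128-bit lanes and the usual two-input mask convention (indices `0..N-1` select from the first input, `N..2N-1` from the second, negative means undefined).

```cpp
#include <vector>

// Hypothetical helper for illustration only (not LLVM's implementation).
// Returns true if any defined mask element reads a source element from a
// different 128-bit lane than the destination position it is written to.
bool crossesLane(const std::vector<int> &Mask, int EltBits) {
  const int NumElts = static_cast<int>(Mask.size());
  const int LaneElts = 128 / EltBits; // elements per 128-bit lane (assumed)
  for (int i = 0; i < NumElts; ++i) {
    const int M = Mask[i];
    if (M < 0)
      continue; // undefined element, imposes no constraint
    // M % NumElts folds second-input indices onto element positions.
    if ((M % NumElts) / LaneElts != i / LaneElts)
      return true;
  }
  return false;
}
```

With 32-bit elements in a 256-bit vector, the mask `{0, 8, 1, 9, 4, 12, 5, 13}` stays within lanes, whereas `{0, 8, 1, 9, 2, 10, 3, 11}` crosses lanes at position 4. Masks of the second kind are the ones the old lane requirement ruled out.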