Skip to content
Unverified Commit 849f963e authored by Igor Kirillov's avatar Igor Kirillov Committed by GitHub
Browse files

[CodeGen] Improve ExpandMemCmp for more efficient non-register aligned sizes handling (#70469)

* Enhanced the logic of ExpandMemCmp pass to merge contiguous
subsequences
  in LoadSequence, based on sizes allowed in `AllowedTailExpansions`.
* This enhancement seeks to minimize the number of basic blocks and
produce
  optimized code when using memcmp with non-register aligned sizes.
* Enable this feature for AArch64 with memcmp sizes modulo 8 equal to
  3, 5, and 6.

Reapplication of #69942 after fixing a bug
parent 89564f0b
Loading
Loading
Loading
Loading
0% Loading or .
You are about to add 0 people to the discussion. Proceed with caution.
Please to comment