Commit 849f963e, authored Oct 30, 2023 by Igor Kirillov; committed by GitHub Oct 30, 2023

[CodeGen] Improve ExpandMemCmp for more efficient non-register aligned sizes handling (#70469)

* Enhanced the logic of the ExpandMemCmp pass to merge contiguous subsequences
  in LoadSequence, based on sizes allowed in `AllowedTailExpansions`.
* This enhancement seeks to minimize the number of basic blocks and produce
  optimized code when using memcmp with non-register aligned sizes.
* Enable this feature for AArch64 with memcmp sizes modulo 8 equal to
  3, 5, and 6.
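
The merging step described above can be illustrated with a small standalone sketch. This is not the code from the patch: `LoadEntry`, `greedyLoadSequence`, and `mergeTail` are hypothetical names, the load sizes `{8, 4, 2, 1}` are an assumed configuration, and only the last two entries of the sequence are considered for merging, which is a simplifying assumption. Only `LoadSequence` and `AllowedTailExpansions` are names taken from the commit message.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for one entry of the load sequence.
struct LoadEntry {
  uint64_t Offset;   // byte offset into the compared buffers
  unsigned LoadSize; // number of bytes loaded at that offset
};

// Greedily cover Size bytes, using the largest load sizes first.
// LoadSizes is assumed to be sorted in decreasing order and to contain 1.
std::vector<LoadEntry>
greedyLoadSequence(uint64_t Size, const std::vector<unsigned> &LoadSizes) {
  std::vector<LoadEntry> Seq;
  uint64_t Offset = 0;
  for (unsigned LoadSize : LoadSizes) {
    while (Size - Offset >= LoadSize) {
      Seq.push_back({Offset, LoadSize});
      Offset += LoadSize;
    }
  }
  assert(Offset == Size && "load sizes must be able to cover every byte");
  return Seq;
}

// Merge the last two entries into one when they are contiguous and their
// combined size is listed in AllowedTailExpansions (simplified assumption).
void mergeTail(std::vector<LoadEntry> &Seq,
               const std::vector<unsigned> &AllowedTailExpansions) {
  if (Seq.size() < 2)
    return;
  const LoadEntry Last = Seq[Seq.size() - 1];
  const LoadEntry PreLast = Seq[Seq.size() - 2];
  if (PreLast.Offset + PreLast.LoadSize != Last.Offset)
    return; // not contiguous, nothing to merge
  const unsigned Merged = PreLast.LoadSize + Last.LoadSize;
  if (std::find(AllowedTailExpansions.begin(), AllowedTailExpansions.end(),
                Merged) == AllowedTailExpansions.end())
    return; // combined size is not an allowed tail expansion
  Seq.pop_back();
  Seq.pop_back();
  Seq.push_back({PreLast.Offset, Merged});
}
```

Under these assumptions, an 11-byte memcmp goes from three entries (8 bytes at offset 0, 2 at offset 8, 1 at offset 10) to two (8 at offset 0, 3 at offset 8) when `AllowedTailExpansions` is `{3, 5, 6}`. A 12-byte memcmp stays at 8 + 4, because 12 is not an allowed tail size. How the merged entry is then loaded and compared is outside the scope of this sketch.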

Reapplication of #69942 after fixing a bug

parent 89564f0b
