Commit 35e6a9c8 authored by Matt Arsenault

AMDGPU: Break read2/write2 search range on a memory fence

This fixes performance regressions introduced by 86c944d7.

The old search would collect all potentially mergeable instructions in
the entire block. In this case, the same address is written in
multiple places in the block, on opposite sides of a fence. When sorted
by offset, the two unmergeable, identical addresses would be next to
each other and the merge would give up.

Break the search space when we encounter an instruction we won't be
able to merge across. This will keep the identical addresses in
different merge attempts.
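
As a rough illustration of the idea (not the actual pass code), the
sketch below splits a block into separate candidate lists at each
fence. The Inst and Kind types and the collectMergeLists name are
hypothetical stand-ins invented for this example; it also assumes the
fence is the only instruction that cannot be merged across.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical, simplified stand-in for a machine instruction.
enum class Kind { DSRead, DSWrite, Fence };

struct Inst {
  Kind K;
  int64_t Offset; // Address offset; ignored for fences.
};

// Collect merge candidates, starting a new list whenever a fence is
// seen. Instructions on opposite sides of a fence never share a list,
// so identical offsets separated by a fence are never compared.
std::vector<std::vector<Inst>>
collectMergeLists(const std::vector<Inst> &Block) {
  std::vector<std::vector<Inst>> Lists;
  std::vector<Inst> Current;

  for (const Inst &I : Block) {
    if (I.K == Kind::Fence) {
      // Assumption: nothing can be merged across a fence, so close
      // the current list here.
      if (!Current.empty()) {
        Lists.push_back(Current);
        Current.clear();
      }
      continue;
    }
    Current.push_back(I);
  }

  // Flush the candidates after the last fence.
  if (!Current.empty())
    Lists.push_back(Current);
  return Lists;
}
```

With writes to offsets 0 and 8 on both sides of one fence, this yields
two lists of two candidates each, so sorting either list by offset
never places the two writes to offset 0 next to each other.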

This may also improve compile time by reducing the merge list size.
parent 0d671dbc