Skip to content
Unverified Commit 17f76541 authored by Roman Lebedev's avatar Roman Lebedev
Browse files

[NFCI][MachineCopyPropagation] invalidateRegister(): use SmallSet<8> instead of DenseSet.

This decreases the time consumed by the pass [during RawSpeed unity build]
by 25% (0.0586 s -> 0.04388 s).

While that isn't really impressive overall, that wasn't the goal here.
The memory results here are noticeable.
The baseline results are:
```
total runtime: 55.65s.
calls to allocation functions: 19754254 (354960/s)
temporary memory allocations: 4951609 (88974/s)
peak heap memory consumption: 239.13MB
peak RSS (including heaptrack overhead): 463.79MB
total memory leaked: 198.01MB
```
While with this patch the results are:
```
total runtime: 55.37s.
calls to allocation functions: 19068237 (344403/s)   # -3.47 %
temporary memory allocations: 4261772 (76974/s)      # -13.93 % (!!!)
peak heap memory consumption: 239.13MB
peak RSS (including heaptrack overhead): 463.73MB
total memory leaked: 198.01MB
```

So we get rid of *a lot* of temporary allocations.

Using `SmallSet<8>` makes sense to me because at least here
for x86 BdVer2, the size of that set is *never* more than 3,
over all of llvm test-suite + RawSpeed.

The story might be different on other targets,
not sure if it will ever justify whole DenseSet,
but if it does SmallDenseSet might be a compromise.
parent 7aeb41b3
Loading
Loading
Loading
Loading
0% Loading or .
You are about to add 0 people to the discussion. Proceed with caution.
Please to comment