Skip to content
Commit 957efc23 authored by Sanjay Patel's avatar Sanjay Patel
Browse files

Use rsqrt (X86) to speed up reciprocal square root calcs

This is a first step for generating SSE rsqrt instructions for
reciprocal square root calcs when fast-math is allowed.

For now, be conservative and only enable this for AMD btver2
where performance improves significantly - for example, 29%
on llvm/projects/test-suite/SingleSource/Benchmarks/BenchmarkGame/n-body.c
(if we convert the data type to single-precision float).

This patch adds a two constant version of the Newton-Raphson
refinement algorithm to DAGCombiner that can be selected by any target
via a parameter returned by getRsqrtEstimate()..

See PR20900 for more details:
http://llvm.org/bugs/show_bug.cgi?id=20900

Differential Revision: http://reviews.llvm.org/D5658

llvm-svn: 220570
parent 5e3a421b
Loading
Loading
Loading
Loading
0% Loading or .
You are about to add 0 people to the discussion. Proceed with caution.
Please to comment