Tags / Equivalents
_mm256_rsqrt_ps() on Intel 64-bit - AVX2
_mm512_rsqrt14_ps() on Intel 64-bit - AVX512
vec_rsqrt() on IBM Power 9 64-bit - VSX
Purpose: Returns a vector containing refined approximations of the reciprocal square roots of the corresponding elements of the source vector. The result is more precise than that of vec_rsqrte; how much more precise is implementation-dependent.
Result value: Each element of output contains a refined approximation of the reciprocal square root of the corresponding element of a.
Endian considerations: None.
Notes:
- The example implementations assume that a register h initially contains the floating-point value 0.5 in each element (single- or double-precision as appropriate).
- For finite square roots, this intrinsic guarantees at least 23 bits of accuracy for single-precision floating point, and at least 52 bits of accuracy for double-precision floating point.
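The refinement mentioned in the notes is commonly done with a Newton-Raphson step applied to a coarse estimate. The scalar C sketch below illustrates one such step using the constant h = 0.5 from the first note. It is an illustration only, not the actual Power instruction sequence: the names rsqrt_estimate and rsqrt_refine are hypothetical, and the bit-level initial guess merely stands in for a hardware estimate such as vec_rsqrte.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a hardware estimate (e.g. vec_rsqrte):
   a crude bit-level initial guess for 1/sqrt(x), for positive normal x. */
static float rsqrt_estimate(float x)
{
    uint32_t bits;
    float y;
    memcpy(&bits, &x, sizeof bits);      /* reinterpret float as integer */
    bits = 0x5f3759dfu - (bits >> 1);    /* roughly halve and negate the exponent */
    memcpy(&y, &bits, sizeof y);
    return y;
}

/* One Newton-Raphson step for 1/sqrt(x), written with h = 0.5:
   y' = y + y * (h - (h * x) * y * y), which equals y * (1.5 - 0.5 * x * y * y). */
static float rsqrt_refine(float x, float y)
{
    const float h = 0.5f;
    float hx = h * x;
    float e = h - hx * y * y;            /* half the residual 1 - x*y^2 */
    return y + y * e;
}
```

Each step roughly squares the relative error, so calling rsqrt_refine(x, rsqrt_estimate(x)) once or twice moves a coarse guess toward single-precision accuracy. A real implementation would start from the hardware estimate and would also need to handle zeros, infinities, NaNs, and denormals.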
vrsqrte_f32() on Arm 64-bit - NEON
Floating-point Reciprocal Square Root Estimate. This instruction calculates an approximate reciprocal square root for each vector element in the source SIMD&FP register, places the result in a vector, and writes the vector to the destination SIMD&FP register.
_mm_rsqrt_ps() on Intel 64-bit - SSE4.2
Purpose: Compute the approximate reciprocal square root of packed single-precision (32-bit) floating-point elements in a, and store the results in output. The maximum relative error for this approximation is less than 1.5*2^-12.
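Result value: Each element of output contains the approximate reciprocal square root of the corresponding element of a.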
Endian considerations: None.
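As a usage sketch for _mm_rsqrt_ps, the C code below computes four estimates at once and, in a second helper, applies one Newton-Raphson step to tighten them. The helper names rsqrt4_approx and rsqrt4_refined are hypothetical, and the refinement step is a common follow-up technique rather than part of the intrinsic itself. The code assumes an x86 target with SSE enabled.

```c
#include <xmmintrin.h>  /* SSE: _mm_rsqrt_ps, _mm_loadu_ps, _mm_storeu_ps, ... */

/* Writes the hardware estimate of 1/sqrt(in[i]) for i = 0..3
   (relative error below 1.5 * 2^-12). */
static void rsqrt4_approx(const float in[4], float out[4])
{
    __m128 a = _mm_loadu_ps(in);
    _mm_storeu_ps(out, _mm_rsqrt_ps(a));
}

/* Same estimate followed by one Newton-Raphson step:
   y' = y * (1.5 - 0.5 * a * y * y). */
static void rsqrt4_refined(const float in[4], float out[4])
{
    __m128 a      = _mm_loadu_ps(in);
    __m128 y      = _mm_rsqrt_ps(a);
    __m128 half_a = _mm_mul_ps(_mm_set1_ps(0.5f), a);
    __m128 t      = _mm_sub_ps(_mm_set1_ps(1.5f),
                               _mm_mul_ps(half_a, _mm_mul_ps(y, y)));
    _mm_storeu_ps(out, _mm_mul_ps(y, t));
}
```

The unrefined form suits cases where about 11 to 12 bits of precision are enough, such as normalizing vectors for graphics. The refined form costs a few extra multiplies and brings the result close to single-precision accuracy for positive, finite inputs.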
vrsqrtes_f32() on Arm 64-bit - NEON
Floating-point Reciprocal Square Root Estimate. This instruction calculates an approximate reciprocal square root for each vector element in the source SIMD&FP register, places the result in a vector, and writes the vector to the destination SIMD&FP register. The vrsqrtes_f32 form operates on a single scalar float32_t value.
_mm512_rsqrt28_ps() on Intel 64-bit - AVX512