SIMD.info

_mm256_rsqrt_ps() on Intel 64-bit - AVX2

Compute the approximate reciprocal square root of packed single-precision (32-bit) floating-point elements in a, and store the results in output. The maximum relative error for this approximation is less than 1.5*2^-12.
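A minimal usage sketch in C, assuming GCC or Clang on x86. The function names rsqrt8_avx and rsqrt_approx and the 0/1 return convention are hypothetical, chosen only for illustration. The intrinsic itself needs only AVX, so the sketch enables AVX for one function and checks for it at run time before calling it.

```c
#include <immintrin.h>
#include <stddef.h>

/* Approximate 1/sqrt(x) for n floats, 8 lanes per iteration.
   Compiled with AVX enabled for this function only. */
__attribute__((target("avx")))
static void rsqrt8_avx(const float *in, float *out, size_t n)
{
    for (size_t i = 0; i < n; i += 8) {
        __m256 a = _mm256_loadu_ps(in + i);            /* unaligned load of 8 floats */
        _mm256_storeu_ps(out + i, _mm256_rsqrt_ps(a)); /* per-lane estimate */
    }
}

/* Returns 1 on success, 0 if n is not a multiple of 8 or the CPU lacks AVX. */
int rsqrt_approx(const float *in, float *out, size_t n)
{
    if (n % 8 != 0 || !__builtin_cpu_supports("avx"))
        return 0;
    rsqrt8_avx(in, out, n);
    return 1;
}
```

Given the stated bound of 1.5*2^-12 on the relative error, in[i] * out[i] * out[i] should stay within about 0.001 of 1 for ordinary positive inputs.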

_mm512_rsqrt14_ps() on Intel 64-bit - AVX512

Compute the approximate reciprocal square root of packed single-precision (32-bit) floating-point elements in "a", and store the results in "dst". The maximum relative error for this approximation is less than 2^-14.

vec_rsqrt() on IBM Power 9 64-bit - VSX

Purpose: Returns a vector containing a refined approximation of the reciprocal square roots of the corresponding elements of the source vector. This function provides an implementation-dependent greater precision than vec_rsqrte.

Result value: Each element of output contains a refined approximation of the reciprocal square root of the corresponding element of a.

Endian considerations: None.

Notes:

The example implementations assume that a register h initially contains the floating-point value 0.5 in each element (single- or double-precision as appropriate).
For finite square roots, this intrinsic guarantees at least 23 bits of accuracy for single-precision floating point, and at least 52 bits of accuracy for double-precision floating point.
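The constant 0.5 mentioned in the notes is consistent with a Newton-Raphson refinement of an initial estimate, although the example implementations themselves are not reproduced here. The following portable C sketch shows one such step in scalar form. It is an illustration of the general technique, not the Power implementation, and the names rsqrt_step and rsqrt_refine are hypothetical.

```c
/* One Newton-Raphson step toward 1/sqrt(x), written with the constant
   h = 0.5: y' = y + y * (h - h * x * y * y). */
float rsqrt_step(float x, float y)
{
    const float h = 0.5f;
    float e = h - h * x * y * y;   /* half of the residual 1 - x*y*y */
    return y + y * e;
}

/* Apply `steps` refinement steps to an initial estimate y0. */
float rsqrt_refine(float x, float y0, int steps)
{
    float y = y0;
    for (int i = 0; i < steps; i++)
        y = rsqrt_step(x, y);
    return y;
}
```

Each step roughly squares the relative error: for x = 4 and a starting estimate of 0.49, one step gives roughly 0.4997 and a second step lands within about 1e-6 of the exact value 0.5.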

vrsqrte_f32() on Arm 64-bit - NEON

Floating-point Reciprocal Square Root Estimate. This instruction calculates an approximate reciprocal square root for each vector element in the source SIMD&FP register, places the result in a vector, and writes the vector to the destination SIMD&FP register.

_mm_rsqrt_ps() on Intel 64-bit - SSE4.2

Purpose: Compute the approximate reciprocal square root of packed single-precision (32-bit) floating-point elements in a, and store the results in output. The maximum relative error for this approximation is less than 1.5*2^-12.

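Result value: Each element of output contains the approximate reciprocal square root of the corresponding element of a.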

Endian considerations: None.
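Where the roughly 12-bit estimate is not accurate enough, a common pattern is to follow it with one Newton-Raphson step per lane. The sketch below assumes an x86 compiler providing xmmintrin.h (the intrinsic needs only SSE). The helper name rsqrt_nr_ps is hypothetical.

```c
#include <xmmintrin.h>

/* 1/sqrt(a) per lane: hardware estimate y0, then one Newton-Raphson step
   y1 = y0 * (1.5 - 0.5 * a * y0 * y0). */
static __m128 rsqrt_nr_ps(__m128 a)
{
    const __m128 half = _mm_set1_ps(0.5f);
    const __m128 three_halves = _mm_set1_ps(1.5f);
    __m128 y0 = _mm_rsqrt_ps(a);                     /* initial estimate */
    __m128 ay2 = _mm_mul_ps(a, _mm_mul_ps(y0, y0));  /* a * y0^2 */
    return _mm_mul_ps(y0, _mm_sub_ps(three_halves, _mm_mul_ps(half, ay2)));
}
```

The extra step costs a few multiplies and a subtract per vector and brings the result close to single-precision accuracy for ordinary positive inputs. Zero, infinite, and negative inputs are not handled specially in this sketch.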

vrsqrtes_f32() on Arm 64-bit - NEON

Floating-point Reciprocal Square Root Estimate (scalar form). This instruction calculates an approximate reciprocal square root of the single-precision value in the source SIMD&FP register and writes the result to the destination SIMD&FP register.

_mm512_rsqrt28_ps() on Intel 64-bit - AVX512

Compute the approximate reciprocal square root of packed single-precision (32-bit) floating-point elements in "a", and store the results in "dst". The maximum relative error for this approximation is less than 2^-28.

Tags / Equivalents

_mm256_rsqrt_ps() on Intel 64-bit - AVX2

_mm512_rsqrt14_ps() on Intel 64-bit - AVX512

vec_rsqrt() on IBM Power 9 64-bit - VSX

vrsqrte_f32() on Arm 64-bit - NEON

_mm_rsqrt_ps() on Intel 64-bit - SSE4.2

vrsqrtes_f32() on Arm 64-bit - NEON

_mm512_rsqrt28_ps() on Intel 64-bit - AVX512