_mm512_rsqrt28_ps

Intel 64-bit / AVX512


Purpose:

Compute the approximate reciprocal square root of each packed single-precision (32-bit) floating-point element in "a" and store the results in "dst". The maximum relative error of this approximation is less than 2^-28.

Result:

__m512

Example:

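#include <immintrin.h>
#include <stdio.h>

int main(void) {
    /* Inputs are the squares 1^2 .. 16^2, so lane i should be close to 1/(i+1). */
    __m512 a = _mm512_setr_ps(1.0f, 4.0f, 9.0f, 16.0f, 25.0f, 36.0f, 49.0f, 64.0f,
                              81.0f, 100.0f, 121.0f, 144.0f, 169.0f, 196.0f, 225.0f, 256.0f);
    __m512 result = _mm512_rsqrt28_ps(a);

    /* Copy the 16 lanes to a float array for printing. */
    float res[16];
    _mm512_storeu_ps(res, result);
    for (int i = 0; i < 16; i++)
        printf("%f\n", res[i]);
    return 0;
}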

Prototypes

Assembly Instruction:

VRSQRT28PS

Usage:

__m512 result = _mm512_rsqrt28_ps(__m512 a)


Vector Approximate Reciprocal Square Root 32-bit floats
