vst1q_f32

Arm 64-bit / NEON
 Location: Memory Operations  >  Vector Store
 Supported Architectures: v7, A32, A64
Purpose:
Stores the four 32-bit floating-point elements of a 128-bit vector to contiguous memory, starting at the given address.
Result:

void

Example:
#include <arm_neon.h>
#include <stdio.h>

int main() {
    /* Source vector with four 32-bit float lanes. */
    float32x4_t a = { 1.5f, 2.5f, 3.5f, 4.5f };
    float32_t c[4];

    /* Store all four lanes of a into c[0..3]. */
    vst1q_f32(c, a);

    for (int i = 0; i < 4; i++) {
        printf("%f ", c[i]);
    }
    printf("\n");

    return 0;
}
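
In practice the store usually closes a load, compute, store loop over a longer array. The sketch below is one possible way to do that, not a canonical pattern from this page: scale_f32 is a hypothetical helper name, the NEON path is guarded by the __ARM_NEON macro, and a scalar loop handles the leftover elements (and acts as a fallback on targets without NEON).

```c
#include <stddef.h>
#if defined(__ARM_NEON)
#include <arm_neon.h>
#endif

/* Hypothetical helper (not part of arm_neon.h): dst[i] = src[i] * k. */
static void scale_f32(float *dst, const float *src, size_t n, float k) {
    size_t i = 0;
#if defined(__ARM_NEON)
    /* Process four floats per iteration. */
    for (; i + 4 <= n; i += 4) {
        float32x4_t v = vld1q_f32(src + i);  /* load 4 lanes from src */
        v = vmulq_n_f32(v, k);               /* multiply each lane by k */
        vst1q_f32(dst + i, v);               /* store 4 lanes to dst[i..i+3] */
    }
#endif
    /* Scalar tail; also the full fallback when NEON is unavailable. */
    for (; i < n; i++) {
        dst[i] = src[i] * k;
    }
}
```

Calling scale_f32(dst, src, 6, 2.0f) on a six-element array would take the vector path for the first four elements and the scalar loop for the last two.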

Prototypes

Assembly Instruction:
ST1
Usage:
void vst1q_f32( float32_t *a, float32x4_t b )
LLVM-MCA Metrics:
CPU            Latency (cycles, lower is better)   Throughput (maximum IPC, higher is better)
cortex-a72     2                                   0.5
cortex-a73     2                                   0.5
cortex-a75     2                                   0.5
cortex-a76     2                                   0.5
cortex-a77     2                                   0.5
cortex-a78     2                                   0.5
cortex-x1      2                                   2.0
cortex-x2      2                                   2.0
cortex-x3      2                                   2.0
cortex-a710    2                                   2.0
cortex-a715    2                                   2.0
neoverse-n1    2                                   2.0
neoverse-n2    2                                   2.0
neoverse-v1    2                                   2.0
neoverse-v2    2                                   2.0
ampere1        2                                   1.0
ampere1a       2                                   1.0
ampere1b       2                                   1.0
cortex-a55     4                                   1.0