vst1q_u32

 Arm 64-bit (64 bits) / NEON
 Location: Memory Operations  >  Vector Store
 Supported Architectures: v7, A32, A64
Purpose:
Stores the four 32-bit unsigned integer elements of a 128-bit vector to contiguous memory, starting at the given pointer.
Result:

void

Example:
#include <stdio.h>
#include <arm_neon.h>

int main() {
    uint32_t input_arr[4] = {1, 2, 3, 4};
    /* Load four 32-bit values into a 128-bit vector. */
    uint32x4_t vec = vld1q_u32(input_arr);
    uint32_t res_arr[4];
    /* Store all four lanes back to memory. */
    vst1q_u32(res_arr, vec);
    printf("Stored values:\n");
    for (int i = 0; i < 4; i++) {
        printf("%u\n", res_arr[i]);
    }

    return 0;
}
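
The example above stores a single vector. As a rough sketch of how the store is typically used on longer buffers, the function below adds two arrays four lanes at a time and writes each result vector with vst1q_u32. The name add_u32_arrays and the HAVE_NEON macro are hypothetical and only for illustration. The scalar fallback is an assumption added so the code also builds on targets without NEON.

```c
#include <stddef.h>
#include <stdint.h>

#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
#define HAVE_NEON 1 /* hypothetical macro, used only in this sketch */
#else
#define HAVE_NEON 0
#endif

/* Hypothetical helper (not part of any library): dst[i] = a[i] + b[i]. */
void add_u32_arrays(uint32_t *dst, const uint32_t *a, const uint32_t *b, size_t n) {
    size_t i = 0;
#if HAVE_NEON
    /* Process four 32-bit lanes per iteration. */
    for (; i + 4 <= n; i += 4) {
        uint32x4_t va = vld1q_u32(a + i);
        uint32x4_t vb = vld1q_u32(b + i);
        /* vst1q_u32 writes exactly 16 bytes starting at dst + i. */
        vst1q_u32(dst + i, vaddq_u32(va, vb));
    }
#endif
    /* Scalar tail; also handles the whole array when NEON is unavailable. */
    for (; i < n; i++) {
        dst[i] = a[i] + b[i];
    }
}
```

Because each store always writes four elements, the vector loop stops while at least four elements remain and the leftover elements are handled one at a time, so nothing is written past the end of dst.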

Prototypes

Assembly Instruction:
ST1
Usage:
void vst1q_u32( uint32_t *ptr, uint32x4_t val )
LLVM-MCA Metrics:
CPU            Latency (Cycles)     Throughput (maximum IPC)
               (lower is better)    (higher is better)
cortex-a72     2                    0.5
cortex-a73     2                    0.5
cortex-a75     2                    0.5
cortex-a76     2                    0.5
cortex-a77     2                    0.5
cortex-a78     2                    0.5
cortex-x1      2                    2.0
cortex-x2      2                    2.0
cortex-x3      2                    2.0
cortex-a710    2                    2.0
cortex-a715    2                    2.0
neoverse-n1    2                    2.0
neoverse-n2    2                    2.0
neoverse-v1    2                    2.0
neoverse-v2    2                    2.0
ampere1        2                    1.0
ampere1a       2                    1.0
ampere1b       2                    1.0
cortex-a55     4                    1.0