Tags / Equivalents
_mm256_mullo_epi16() on Intel 64-bit - AVX2
Multiply the packed signed 16-bit integers in a and b, producing intermediate 32-bit integers, and store the low 16 bits of the intermediate integers in dst.
vmulq_s16() on Arm 64-bit - NEON
VMUL multiplies corresponding elements in two vectors. Elements in the result vector and input vectors have the same width.
vmul_s16() on Arm 64-bit - NEON
Multiply (vector). This instruction multiplies corresponding elements in the vectors of the two source SIMD&FP registers, places the results in a vector, and writes the vector to the destination SIMD&FP register.
_mm_mullo_epi16() on Intel 64-bit - SSE2
Multiply the packed 16-bit integers in a and b, producing intermediate 32-bit integers, and store the low 16 bits of the intermediate integers in dst.
_m_pmullw() on Intel 64-bit - MMX
Multiply the packed 16-bit integers in a and b, producing intermediate 32-bit integers, and store the low 16 bits of the intermediate integers in dst.
vec_mul() on IBM Power 9 64-bit - VSX
Compute the products of corresponding elements of two vectors.
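
All of the entries above describe the same per-lane operation: widen each pair of 16-bit elements, multiply, and keep only the low 16 bits of the product. The following is a minimal sketch in C of that behavior, not a definitive implementation. The helper names mullo16_lane and mullo16_x8 are hypothetical and chosen only for illustration. The vector path uses _mm_mullo_epi16() when the compiler defines __SSE2__ and otherwise falls back to the scalar loop, so the same file should build on non-x86 targets as well.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#if defined(__SSE2__)
#include <emmintrin.h> /* SSE2 header that declares _mm_mullo_epi16 */
#endif

/* Hypothetical scalar model of one lane: form the 32-bit product,
 * then keep only its low 16 bits. */
static int16_t mullo16_lane(int16_t a, int16_t b)
{
    int32_t wide = (int32_t)a * (int32_t)b;           /* intermediate 32-bit product */
    uint16_t low = (uint16_t)((uint32_t)wide & 0xFFFFu); /* low 16 bits */
    int16_t result;
    memcpy(&result, &low, sizeof result);             /* reinterpret the bits as signed */
    return result;
}

/* Hypothetical helper: multiply eight 16-bit lanes of a and b into dst. */
static void mullo16_x8(const int16_t *a, const int16_t *b, int16_t *dst)
{
#if defined(__SSE2__)
    /* Unaligned loads and stores, so the caller's arrays need no special alignment. */
    __m128i va = _mm_loadu_si128((const __m128i *)a);
    __m128i vb = _mm_loadu_si128((const __m128i *)b);
    _mm_storeu_si128((__m128i *)dst, _mm_mullo_epi16(va, vb));
#else
    /* Portable fallback using the scalar model above. */
    for (size_t i = 0; i < 8; ++i)
        dst[i] = mullo16_lane(a[i], b[i]);
#endif
}
```

For example, 300 * 300 = 90000 does not fit in 16 bits, so the stored lane is 90000 - 65536 = 24464. Because only the low 16 bits are kept, the result bits are the same whether the inputs are interpreted as signed or unsigned.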