Aussie AI
AVX-2 SIMD Multiplication
Book Excerpt from "Generative AI in C++" by David Spuler, Ph.D.
Here is the AVX-2 version of pairwise SIMD multiplication, using intrinsics for 256-bit registers, each of which holds eight 32-bit float values.
    #include <immintrin.h>  // Intel SIMD intrinsics

    void aussie_avx2_multiply_8_floats(
        float v1[8], float v2[8], float vresult[8])
    {
        // Multiply 8x32-bit floats in 256-bit AVX2 registers
        __m256 r1 = _mm256_loadu_ps(v1);     // Load floats
        __m256 r2 = _mm256_loadu_ps(v2);
        __m256 dst = _mm256_mul_ps(r1, r2);  // Multiply (SIMD)
        _mm256_storeu_ps(vresult, dst);      // Store 8 floats to memory
    }
This is similar to the basic AVX 128-bit version, with some differences:
- The type for 256-bit registers is “__m256”.
- The AVX-2 loading intrinsic is “_mm256_loadu_ps”.
- The AVX-2 multiplication intrinsic is “_mm256_mul_ps”.
- The conversion back to float uses the AVX-2 intrinsic “_mm256_storeu_ps”.
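The 8-float routine above handles exactly one register's worth of data. As a minimal sketch of how the same three intrinsics might be applied to arrays of arbitrary length, the following loop processes eight floats per iteration and finishes any leftover elements with ordinary scalar multiplication. The function name "multiply_floats_avx" and its parameters are hypothetical and not from the book. The "target" attribute is a GCC/Clang extension, assumed here so that the function compiles without extra command-line flags; other compilers would need their own AVX option.

```cpp
#include <immintrin.h>  // Intel SIMD intrinsics
#include <cassert>
#include <cstddef>

// Hypothetical helper (not from the book): pairwise multiply of two float
// arrays of length n, 8 floats at a time, with a scalar loop for the tail.
// The 256-bit float intrinsics used here only need the "avx" feature flag.
#if defined(__GNUC__)
__attribute__((target("avx")))  // Assumption: GCC or Clang on x86-64
#endif
void multiply_floats_avx(const float* v1, const float* v2,
                         float* vresult, std::size_t n)
{
    assert(v1 != nullptr && v2 != nullptr && vresult != nullptr);
    std::size_t i = 0;
    // Main loop: full blocks of 8 floats
    for (; i + 8 <= n; i += 8) {
        __m256 r1 = _mm256_loadu_ps(v1 + i);    // Load 8 floats (unaligned)
        __m256 r2 = _mm256_loadu_ps(v2 + i);
        __m256 dst = _mm256_mul_ps(r1, r2);     // Multiply (SIMD)
        _mm256_storeu_ps(vresult + i, dst);     // Store 8 floats to memory
    }
    // Tail loop: remaining 0..7 elements, one at a time
    for (; i < n; ++i) {
        vresult[i] = v1[i] * v2[i];
    }
}
```

Calling multiply_floats_avx(a, b, out, 11), for example, would run the SIMD loop once for elements 0 to 7 and the scalar loop for elements 8 to 10. The unaligned load and store intrinsics are kept so that the pointers need no particular alignment.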