Aussie AI
Example: AVX 128-Bit Dot Product
Book Excerpt from "Generative AI in C++"
by David Spuler, Ph.D.
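The x86 SIMD instruction sets include a vector dot product intrinsic that wraps an x86 dot product instruction. The 128-bit version, “_mm_dp_ps”, was introduced with SSE4.1 and is available on any CPU that supports AVX. There is also a 256-bit version in AVX, “_mm256_dp_ps”, which computes a separate 4-element dot product within each 128-bit half of the register. AVX-512 does not provide a direct 512-bit equivalent of this floating-point instruction.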
For basic AVX (128 bits), this is a full vector dot product of two vectors with 4 x 32-bit float numbers in each vector.
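For comparison, the same quantity can be computed with a plain scalar loop. The following is a minimal sketch for reference only; the function name “vecdot_4_floats_scalar” is hypothetical and does not come from the book.

```cpp
// Scalar reference: sum of the element-wise products of two 4-float vectors.
// The name vecdot_4_floats_scalar is hypothetical, used here for illustration.
float vecdot_4_floats_scalar(const float v1[4], const float v2[4])
{
    float sum = 0.0f;
    for (int i = 0; i < 4; i++) {
        sum += v1[i] * v2[i];   // Multiply each pair and accumulate
    }
    return sum;
}
```

A loop like this is a convenient baseline for unit-testing a vectorized version: both should agree on small integer-valued inputs, where float rounding does not come into play.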
One oddity is that although the result is a floating-point scalar (i.e. a single 32-bit float), it's still stored in a 128-bit register, and must be extracted using the “_mm_cvtss_f32” intrinsic.
The example code looks like:
    #include <immintrin.h>  // x86 SIMD intrinsics

    float aussie_avx_vecdot_4_floats(float v1[4], float v2[4])
    {
        // AVX dot product: 2 vectors of 4x32-bit floats
        __m128 r1 = _mm_loadu_ps(v1);   // Load floats
        __m128 r2 = _mm_loadu_ps(v2);
        // Mask 0xf1: multiply all 4 pairs, store the sum in the lowest element
        __m128 dst = _mm_dp_ps(r1, r2, 0xf1);   // Dot product
        float fret = _mm_cvtss_f32(dst);   // Extract float
        return fret;
    }
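The third argument to “_mm_dp_ps” is an 8-bit mask: the high four bits select which element pairs are multiplied and summed, and the low four bits select which elements of the result register receive the sum (the others are set to zero). The sketch below illustrates two other mask values. The function names and the “DEMO_SSE41” macro are hypothetical. The macro is only a GCC/Clang convenience, assumed here so that the snippet builds without passing “-msse4.1” on the command line.

```cpp
#include <immintrin.h>

// Assumption: GCC/Clang target attribute enables SSE4.1 for these functions only.
#if defined(__GNUC__) || defined(__clang__)
#define DEMO_SSE41 __attribute__((target("sse4.1")))
#else
#define DEMO_SSE41
#endif

// Dot product of the first three elements only (mask 0x71):
//   high nibble 0x7 -> multiply and sum elements 0, 1 and 2
//   low nibble  0x1 -> write the sum to element 0 of the result
DEMO_SSE41 float vecdot_first3_floats(const float v1[4], const float v2[4])
{
    __m128 r1 = _mm_loadu_ps(v1);           // Load floats (unaligned)
    __m128 r2 = _mm_loadu_ps(v2);
    __m128 dst = _mm_dp_ps(r1, r2, 0x71);   // Partial dot product
    return _mm_cvtss_f32(dst);              // Extract lowest float
}

// Full dot product copied into all four result elements (mask 0xff):
//   high nibble 0xf -> multiply and sum all four elements
//   low nibble  0xf -> write the sum to every element of the result
DEMO_SSE41 void vecdot_broadcast(const float v1[4], const float v2[4], float out[4])
{
    __m128 r1 = _mm_loadu_ps(v1);
    __m128 r2 = _mm_loadu_ps(v2);
    __m128 dst = _mm_dp_ps(r1, r2, 0xff);   // Sum broadcast to all 4 elements
    _mm_storeu_ps(out, dst);                // Store all 4 floats
}
```

For the inputs {1, 2, 3, 4} and {5, 6, 7, 8}, the first function returns 38 (5 + 12 + 21) and the second fills its output array with 70. The broadcast form can be handy when the dot product is about to be used as a scale factor in further 128-bit vector arithmetic, since it avoids a separate extract-and-reload step.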