Aussie AI
GELU AVX SIMD Vectorization
Book Excerpt from "Generative AI in C++"
by David Spuler, Ph.D.
The GELU function is an element-wise activation function on an input vector,
so it is a good candidate for vectorization.
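To make the element-wise nature concrete, here is a minimal scalar baseline using the exact formula GELU(x) = 0.5 * x * (1 + erf(x / sqrt(2))); the function name is illustrative, not taken from the book's code.

#include <cmath>

// Scalar baseline: apply GELU to every element of the vector, in place.
// GELU(x) = 0.5 * x * (1 + erf(x / sqrt(2)))
void gelu_scalar_vector(float v[], int n)
{
    const float rsqrt2 = 0.70710678f;   // 1/sqrt(2)
    for (int i = 0; i < n; i++) {
        v[i] = 0.5f * v[i] * (1.0f + std::erf(v[i] * rsqrt2));
    }
}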
However, the GELU function is complicated to compute in parallel,
even though AVX has SIMD support for the error function (“erf”).
A raw computation of GELU would require a multiply-by-scalar, the erf computation, a scalar addition,
a scalar multiplication, and then a final non-scalar (element-wise) multiplication by the original input.
We could do all these with sequential AVX intrinsics, but with so many operations,
that doesn't seem like a good plan.
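For illustration, a rough sketch of that sequential-intrinsics version might look like the following. It assumes AVX2 plus the SVML-style _mm256_erf_ps intrinsic (available with the Intel compilers and recent MSVC, but not plain GCC/Clang), and it assumes n is a multiple of 8; the function name is ours.

#include <immintrin.h>

// Direct AVX computation of GELU(x) = 0.5 * x * (1 + erf(x / sqrt(2))),
// 8 floats at a time. Requires a compiler that provides the SVML intrinsics.
void gelu_avx2_direct(float v[], int n)   // n assumed to be a multiple of 8
{
    const __m256 half   = _mm256_set1_ps(0.5f);
    const __m256 one    = _mm256_set1_ps(1.0f);
    const __m256 rsqrt2 = _mm256_set1_ps(0.70710678f);  // 1/sqrt(2)
    for (int i = 0; i < n; i += 8) {
        __m256 x = _mm256_loadu_ps(&v[i]);
        __m256 e = _mm256_erf_ps(_mm256_mul_ps(x, rsqrt2));   // erf(x/sqrt(2))
        __m256 g = _mm256_mul_ps(half,
                   _mm256_mul_ps(x, _mm256_add_ps(one, e)));  // 0.5*x*(1+erf)
        _mm256_storeu_ps(&v[i], g);
    }
}

Even so, this is five SIMD arithmetic operations (plus a load and a store) for every 8 elements, which is the cost the lookup-table approach tries to avoid.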
Hence, our best option is to precompute the GELU results into a lookup table (LUT), and then use the AVX “gather” intrinsics to vectorize the table lookups.
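One possible sketch of that LUT-plus-gather scheme is shown below, assuming AVX2 for the _mm256_i32gather_ps intrinsic. The table size, the clamped input range, and the function names are illustrative choices, not necessarily the exact layout used elsewhere in the book.

#include <immintrin.h>
#include <cmath>
#include <vector>

// Hypothetical LUT scheme: precompute GELU over [kXmin, kXmax] at a fixed
// resolution, then vectorize the lookups with the AVX2 gather intrinsic.
constexpr int   kLutSize = 4096;
constexpr float kXmin = -8.0f, kXmax = 8.0f;
static std::vector<float> g_gelu_lut;          // global table, built once

void gelu_lut_init()
{
    g_gelu_lut.resize(kLutSize);
    for (int i = 0; i < kLutSize; i++) {
        float x = kXmin + (kXmax - kXmin) * i / (kLutSize - 1);
        g_gelu_lut[i] = 0.5f * x * (1.0f + std::erf(x * 0.70710678f));
    }
}

void gelu_avx2_lut(float v[], int n)           // n assumed to be a multiple of 8
{
    const float scale    = (kLutSize - 1) / (kXmax - kXmin);
    const __m256  vscale = _mm256_set1_ps(scale);
    const __m256  vxmin  = _mm256_set1_ps(kXmin);
    const __m256i vmax   = _mm256_set1_epi32(kLutSize - 1);
    const __m256i vzero  = _mm256_setzero_si256();
    for (int i = 0; i < n; i += 8) {
        __m256  x   = _mm256_loadu_ps(&v[i]);
        // Quantize x to a table index: round((x - kXmin) * scale), clamped.
        __m256i idx = _mm256_cvtps_epi32(
                          _mm256_mul_ps(_mm256_sub_ps(x, vxmin), vscale));
        idx = _mm256_min_epi32(_mm256_max_epi32(idx, vzero), vmax);
        // Gather 8 precomputed GELU values (scale 4 = sizeof(float)).
        __m256 g = _mm256_i32gather_ps(g_gelu_lut.data(), idx, 4);
        _mm256_storeu_ps(&v[i], g);
    }
}

Note that inputs outside [kXmin, kXmax] are clamped to the table's end values, which loses accuracy for large magnitudes, and gather loads are not especially fast on many x86 cores, so the speedup over the direct computation is worth benchmarking.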