RELU Activation Function
-
Book Excerpt from "Generative AI in C++"
-
by David Spuler, Ph.D.
In research papers, RELU is often written using the “max” function:
RELU = max(0,x)
In real code, the max function isn't needlessly called; instead, a simpler test for negatives is used, such as an if statement:
float aussie_RELU_if_test_slow(float f)
{
    if (f <= 0.0f) return 0.0f;
    else return f;
}
Here's a faster macro version with the C++ ternary operator:
#define AUSSIE_RELU_MACRO(f) ( (f) <= 0.0f ? 0.0f : (f) )
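For example, the macro can be applied elementwise to update a whole vector of activations (a usage sketch only; the function name here is hypothetical, not the book's API):

void relu_vector_sketch(float v[], int n)   // hypothetical helper, not the book's API
{
    for (int i = 0; i < n; i++) {
        v[i] = AUSSIE_RELU_MACRO(v[i]);     // assigns v[i] back to itself when positive
    }
}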
The assignment of x to itself when it has a positive value can be avoided with logic such as:
#define RELUIZE1(x) \
    ((x) = AUSSIE_RELU_MACRO(x))       // Slower version
#define RELUIZE2(x) \
    if ((x) < 0.0f) { (x) = 0.0f; }    // If-then version
#define RELUIZE3(x) ( \
    (x) < 0.0f && ( (x) = 0.0f) )      // Short-circuiting
Even these are probably not the fastest way. A full implementation would use a bitwise test of the sign bit, relying on the IEEE 754 bit format of floating-point types, rather than the “<” less-than operator.
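For illustration, here is a minimal sketch of such a sign-bit test, assuming 32-bit IEEE 754 floats (the function name is hypothetical; note it also clears negative zero, which is harmless for RELU):

#include <cstdint>
#include <cstring>

float relu_signbit_sketch(float f)         // hypothetical, assumes 32-bit IEEE 754 float
{
    std::uint32_t bits;
    std::memcpy(&bits, &f, sizeof bits);   // copy out the float's bit pattern
    if (bits & 0x80000000u) return 0.0f;   // top bit set means negative (or -0.0f)
    return f;
}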
One way to access the sign bit is to use the standard C++ “std::signbit” function (declared in <cmath>), which returns true or false depending on the floating-point sign bit. Hopefully it's using some fast assembly code in the implementation.
#define RELU4(x) (std::signbit(x) ? 0.0f : (x) )
#define RELUIZE4(x) (std::signbit(x) && ( (x) = 0.0f) )
// Multiply version (slow)
#define RELUIZE4b(x) ((x) *= (int)!std::signbit(x))
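As a quick sanity check (a hedged test sketch, not from the book), all of these variants should agree with the basic definition on negative, zero, and positive inputs:

#include <cassert>

void relu_sanity_check_sketch()   // hypothetical test helper
{
    float tests[] = { -3.5f, -0.0f, 0.0f, 2.25f };
    for (float t : tests) {
        float expected = (t <= 0.0f) ? 0.0f : t;    // reference RELU
        assert(AUSSIE_RELU_MACRO(t) == expected);
        assert(RELU4(t) == expected);
        float a = t; RELUIZE4(a);  assert(a == expected);
        float b = t; RELUIZE4b(b); assert(b == expected);
    }
}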
For greater efficiency, RELU should be “fused” back into a MatMul or other prior component via kernel operator fusion, so that the clearing of negatives is done incrementally during the prior calculation, when the value is already in fast memory. An example of a “fused RELU” is given under kernel fusion in Chapter 31.
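As a rough illustration (a minimal sketch, not the book's Chapter 31 version), a fused RELU inside a matrix-vector multiply looks like this, with the negative-clearing done as each output element is finished:

void matvec_relu_fused_sketch(const float* W, const float* x, float* out,
                              int rows, int cols)    // hypothetical sketch
{
    for (int r = 0; r < rows; r++) {
        float sum = 0.0f;
        for (int c = 0; c < cols; c++) {
            sum += W[r * cols + c] * x[c];   // dot product of one row with x
        }
        out[r] = (sum <= 0.0f) ? 0.0f : sum; // RELU fused into the MatMul kernel
    }
}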