Aussie AI
What is a Logarithmic Model?
Book Excerpt from "Generative AI in C++"
by David Spuler, Ph.D.
What is a Logarithmic Model?
The basic idea is that we change everything to the “log-domain” rather than normal numbers. Instead of a probability X, we store log-of-X and use that in every calculation. If we can do faster arithmetic through the whole model using log-X instead of X, then we can un-log them at the end, back into the normal number domain, called the “linear domain.” Conversion from log-domain back to linear domain is just the expf exponential function, just as the initial conversion from linear to log domain is just the logf function.
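The two conversions can be sketched as a pair of one-line helpers. This is only an illustration: the names `to_log_domain` and `from_log_domain` are hypothetical and not from the book, and the sketch assumes the stored values are strictly positive (such as non-zero probabilities), since `logf` is undefined for negative inputs and returns negative infinity for zero.

```cpp
#include <cmath>

// Hypothetical helper names, chosen for illustration only.

// Linear domain -> log domain. Assumes x > 0.
inline float to_log_domain(float x) {
    return logf(x);
}

// Log domain -> linear domain (the "un-log" step).
inline float from_log_domain(float log_x) {
    return expf(log_x);
}
```

A value converted with `to_log_domain` and then `from_log_domain` comes back as the original number, up to floating-point rounding.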
The basic mathematical reason why this idea of staying in the log-domain might work well is that logarithms have this property:
log (x * y) = log(x) + log(y)
So, if we are doing a vector dot product, and we have log-X and log-Y available (already stored in our log-domain numbers), then we can “multiply” the two numbers together in the log-domain with just addition. Adding log-X and log-Y gives us log(X*Y), which is the product X*Y already expressed in the log-domain.
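As a minimal sketch of this property, log-domain multiplication is a single floating-point addition. The function name `log_domain_mul` is hypothetical, and the sketch assumes both original numbers were positive, so that their logarithms exist.

```cpp
#include <cmath>

// Hypothetical helper: "multiply" two numbers that are stored as logarithms.
// Uses log(x * y) = log(x) + log(y), so no multiplication is needed.
inline float log_domain_mul(float log_x, float log_y) {
    return log_x + log_y;
}
```

For example, passing `logf(3.0f)` and `logf(4.0f)` returns a value whose `expf` is approximately 12.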
Unfortunately, the same is not true for addition of log-X and log-Y in the log-domain, since:
log (x + y) != log(x) + log(y)
Instead, the addition operation in the log domain is slower, because each operand has to be converted back to the linear domain before adding:
log (x + y) = log(exp(log(x)) + exp(log(y)))
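The formula above can be coded directly, as in the first function below. The second function is a commonly used rearrangement (often called log-sum-exp), which is not from the excerpt: it factors out the larger operand so that only one exponential is needed and `expf` cannot overflow. Both function names are hypothetical, and both assume finite inputs.

```cpp
#include <cmath>

// Direct form: log(x + y) = log(exp(log_x) + exp(log_y)).
// Costs two exponentials and one logarithm per addition.
inline float log_domain_add_naive(float log_x, float log_y) {
    return logf(expf(log_x) + expf(log_y));
}

// Rearranged form (assumption: standard log-sum-exp trick, not from the book):
// log(x + y) = hi + log(1 + exp(lo - hi)), where hi is the larger input.
// Costs one exponential and one log1pf call per addition.
inline float log_domain_add(float log_x, float log_y) {
    float hi = (log_x > log_y) ? log_x : log_y;
    float lo = (log_x > log_y) ? log_y : log_x;
    return hi + log1pf(expf(lo - hi));
}
```

Even the rearranged version still needs transcendental function calls for every addition, which is far more work than the single add used for log-domain multiplication.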
Summing exponentials is not super-fast. This is the issue that creates problems for the logarithmic model idea. To do a vector dot product computation we first multiply, but then we have to add. Ironically, addition becomes the bottleneck problem in the log-domain.
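To make the bottleneck concrete, here is a rough sketch of a dot product computed entirely on log-domain inputs. The function name `log_domain_dot` and its two-pass structure are assumptions for illustration, not the book's implementation. It assumes `n > 0`, that all original vector elements were positive, and that all log values are finite.

```cpp
#include <cmath>
#include <cstddef>

// Hypothetical sketch: returns log(sum over i of x[i] * y[i]),
// given log_x[i] = log(x[i]) and log_y[i] = log(y[i]).
float log_domain_dot(const float* log_x, const float* log_y, std::size_t n) {
    // Pass 1: the "multiplications" are plain additions.
    // Track the largest product so the exponentials below stay in range.
    float max_term = log_x[0] + log_y[0];
    for (std::size_t i = 1; i < n; ++i) {
        float term = log_x[i] + log_y[i];
        if (term > max_term) max_term = term;
    }

    // Pass 2: the "additions" need one expf call per element.
    float sum = 0.0f;
    for (std::size_t i = 0; i < n; ++i) {
        sum += expf(log_x[i] + log_y[i] - max_term);
    }

    // Convert the accumulated sum back into the log domain.
    return max_term + logf(sum);
}
```

The multiply step costs one addition per element, but the accumulation step costs one `expf` per element, which is where the time goes.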