Aussie AI
Bitwise Intrinsic Functions
-
Book Excerpt from "Generative AI in C++"
-
by David Spuler, Ph.D.
Bitwise Intrinsic Functions
Intrinsic functions, or “builtin” functions, are special C++ functions that are specific to the compiler environment. For example, Microsoft Visual Studio and GCC have different builtins. Intrinsics are usually implemented in very efficient ways, often directly mapping to CPU instructions, so they can be very powerful optimizations.
Some of the useful builtin functions for integer bitwise arithmetic are listed below.
Most of these functions are for “int
” or “unsigned int
” (32-bit), but have other versions for long
64-bit or unsigned long
128-bit types.
There isn't usually a version for “short
” 16-bit integers.
Count Leading Zeros (CLZ): Various functions count the leading zeros, or similarly, the offset of the first set bit. This is scanning the bits from left-to-right and finding the most significant bit. One application of the CLZ intrinsic is a fast way to compute a truncated log2 of an integer, or similarly, computing the highest power-of-two in a number.
_BitScanReverse
(Microsoft intrinsic<intrin.h>
): Finds the most-significant bit in a 32-bit integer. There's also_BitScanReverse64
.clz
: Count leading zeros (various versions); also sometimes called “nlz
” for “number leading zeros”.__lzcnt
: Leading zeros count in Microsoft Windows intrinsics, use<intrin.h>
for Microsoft Visual Studio C++.__builtin_clz
(count leading zeros): GCC function to count the number of leading prefix zeros in an unsigned integer._CountLeadingZeros
: Microsoft<intrin.h>
ARM intrinsics.
For all you silicon addicts, here's the CPU hardware instructions are underpin these intrinsics:
BSR
: Bit Scan Reverse x86 assembler instruction.LZCNT
: x86 instruction for leading-zero count, similar to BSR.
Count Trailing Zeros (CTZ): Contrasting to the leading zero functions, these functions find the zeros on the right-hand-side of an integer. This is the least-significant bit.
_BitScanForward
(Microsoft intrinsic<intrin.h>
): Finds the least-significant bit set. Long int version is_BitScanForward64
.__builtin_ctz
(count trailing zeros): GCC function counts zero bits on the right (least-significant bits).ffs
/ffsl
: Find first set (least-significant bit).__builtin_ffs
(find first set): GCC function: find first set bit from the least significant bits (from the right bits).
The related x86 CPU hardware instructions are:
BSF
: Bit Scan Forward x86 assembler instruction.TZCNT
: x86 instruction for trailing-zero count, similar toBSF
.
If you'd rather code it yourself,
there's Brian Kernighan's bit trick for LSB: bitwise-and of n
and n-1
(i.e. in C++ n&(n-1)
finds the lowest set bit).
But using the intrinsics should be faster.
Popcount (Set Bits Count): The count of 1s in a number is known as the “popcount” (which is short for population count) and there are various intrinsics:
__builtin_popcount
: GCC function to count the number of 1s in an unsigned integer.BitOperations.PopCount
: Microsoft intrinsic function for bitwise popcount.__popcnt
: AMD x86 popcount intrinsic usingPOPCNT
x86 instruction (Microsoft platform)_mm_popcnt_u32
: Intel x86 popcount intrinsic usingPOPCNT
x86 instruction (Microsoft platform); use<intrin.h>
on MSVS C++.__builtin_parity
: GCC function tracking bitwise binary parity (whether the number of 1s is odd or even).
The x86 CPU hardware instruction is POPCNT
, which
computes the popcount faster than a hummingbird's wings.
• Next: • Up: Table of Contents |
The new AI programming book by Aussie AI co-founders:
Get your copy from Amazon: Generative AI in C++ |