Aussie AI

Top-K Vector Algorithm

Book Excerpt from "Generative AI in C++"

by David Spuler, Ph.D.

Top-K Vector Algorithm

The top-k algorithm is more complicated than vector max or min: find the largest k elements in a vector. Note that “maximum” is the same as top-k with k=1. If you want the short version of the top-k story in C++, there's a std::partial_sort function that sorts the top k elements, and there's also std::sort for a full array sort.

The top-k algorithm is quite important in AI engines. It gives us “top-k decoding” which is how to choose which word to output. The whole encoder-decoder computes a vector giving us the probability that each word should be output next. Using the maximum probability word gives us “greedy decoding” which always outputs the most likely word. But that's kind of boring and predictable, so top-k decoding randomly chooses between the k most likely words (e.g. top-50), which is still very accurate and also more interesting because it has some creative variation.

Example: Hand-coded top-2 algorithm: Since top-1 is the maximum of a vector, we can also find a fairly simple linear scan for k=2. The basic idea is to scan through and keep track of the two largest values as we go.

    void aussie_vector_top_k_2(
        float v[], int n, float vout[])
    {
        // Order the first 2 elements
        float vmax1 = v[0], vmax2 = v[1];
        if (v[1] > v[0]) {
            vmax1 = v[1];  // Reverse them
            vmax2 = v[0];
        }
        for (int i = 2 /*not 0*/; i < n; i++) {
            if (v[i] > vmax2) {
                // Bigger than the smallest one
                if (v[i] > vmax1) {
                    // Bigger than both (shuffle)
                    vmax2 = vmax1;
                    vmax1 = v[i];
                }
                else {
                    // In the middle (fix 2nd only)
                    vmax2 = v[i];
                }
            }
        }
        vout[0] = vmax1;  // Biggest
        vout[1] = vmax2;  // 2nd biggest
    }

Note that the above top-2 algorithm is still impractical for our word decoding algorithm. We need to know not only the top probabilities, but also which two indices in the vector had those probabilities, because that's how we know which words map to which probabilities. So, we'd need to modify the above code to track and return the two array indices as well (or instead).

Example: Shuffle top-k algorithm: For a larger value of k the code becomes more complicated. The above code for k=2 motivates the general idea for a brute-force algorithm: shuffle sort the first k elements, and then scan the rest, shuffling any larger items up into place. We can merge the two shuffling phases into one block of code that handles both the startup and ongoing scan.

    void aussie_vector_top_k_shuffle(
        float v[], int n, int k, float vout[])
    {
        yassert(n >= k);
        vout[0] = v[0];
        int nout = 1;
        for (int i = 1 /*not 0*/; i < n; i++) {
            float fnew = v[i];
            int maxj;
            if (nout < k) {
                vout[nout++] = fnew;
                maxj = nout - 2;
            }
            else {
                maxj = nout - 1;
            }
            maxj = nout - 1;
            for (int j = maxj; j >= 0; j--) {
                if (fnew > vout[j]) {
                    // Shuffle & insert
                    if (j + 1 < k)  // Shuffle down
                        vout[j + 1] = vout[j];
                    vout[j] = fnew;
                    // Keep going
                }
                else {      // Done.. insert it
                    if (j != maxj) {
                        if (j + 1 < k)
                            vout[j + 1] = vout[j];
                        vout[j] = fnew;
                    }
                    break;
               }
            } // end for j
        } // end for i
    }

The above example is a simplistic and inefficient top-k algorithm, not to mention that it was horribly fiddly and failed my unit tests for hours (i.e. a special kind of “fun”). Several loop optimizations suggest themselves: loop sectioning for the outer i loop to do the first k iterations as a separate loop (avoiding lots of tests against k), and loop peeling of the first iteration of the inner j loop (i.e. j==maxj). This version also should be extended to track the indices from where the top-k values came.

Theoretical Top-K Algorithms: There's a lot of theory about computing the top-k function of an array for large k values, which is the same structure as our “logit” word probability vectors (we want “k=50” or more). These theoretical top-k algorithm papers mainly consider sequential processing, rather than vectorization. Even so, it's not a simple linear scan like max or min functions, but doesn't need to be as slow as shuffling.

Example: Top-k with qsort sorting: The simplest method for large k is to sort the array with a fast method (e.g. the quicksort algorithm) and then pick off the top k elements from the sorted array. In C++ there are the std::sort methods or the older style qsort function. Here's an example using the C++ standard qsort function:

    int aussie_top_k_qsort_cmp(
        void const* addr1, void const* addr2)
    {
        float f1 = *(float*)addr1;
        float f2 = *(float*)addr2;
        if (f1 < f2) return +1;  // Reversed (descending)
        else if (f1 > f2) return -1;
        else return 0;
    }

    void aussie_vector_top_k_qsort(
         float v[], int n, int k, float vout[])  
    {
        // Top-k with general k (qsort algorithm)
        // Sort the array
        qsort(v, n, sizeof(vout[0]), aussie_top_k_qsort_cmp);
        // Copy top-k elements
        for (int i = 0; i < k; i++) vout[i] = v[i];  
    }

Top-k with qsort and permutation array: We really need a version that returns the indices of the probabilities, rather than just their values. So, I coded up a qsort version that sorts via a permutation array, and then returns the top-k of these permutation indices.

    void aussie_permutation_identity(int permut[], int n)
    {
        for (int i = 0; i < n; i++) permut[i] = i;
    }

    float* g_float_array_for_qsort = NULL;

    int aussie_top_k_qsort_permutation_cmp(
        void const* addr1, void const* addr2)
    {
        int index1 = *(int*)addr1;
        int index2 = *(int*)addr2;
        float f1 = g_float_array_for_qsort[index1];
        float f2 = g_float_array_for_qsort[index2];
        if (f1 < f2) return +1; // Reverse (descending)
        else if (f1 > f2) return -1;
        else return 0;
    }

    void aussie_vector_top_k_qsort_permut(
        float v[], int n, int k, 
        float vout[], int permut_out[]
    ) 
    {
        // Create a dynamic permutation array
        int* permut_arr = ::new int[n];
        // Identity permutation
        aussie_permutation_identity(permut_arr, n);  

        // Sort the array (by permutation)
        g_float_array_for_qsort = v;
        qsort(permut_arr, n, sizeof(permut_arr[0]), 
           aussie_top_k_qsort_permutation_cmp);
        // Copy top-k elements
        for (int i = 0; i < k; i++) {
                permut_out[i] = permut_arr[i];
                vout[i] = v[permut_arr[i]];
        }
        delete[] permut_arr;
    }

Top-k without sorting: Sorting the whole array is somewhat wasteful if we only want the top 50 elements. There are various faster top-k algorithms that don't fully sort the array.

Standard C++ top-k libraries: Note that the standard C++ libraries have support for sorting algorithms in std::vector, such as with std::sort. There is also a top-k version called std::partial_sort, which sorts the top k elements of an array, which can then be selected for the top-k algorithm. Presumably, the std::partial_sort function is a faster algorithm than std::sort, by not fully sorting the whole array, but I haven't tested it. There is also std::nth_element, which is similar to top-k.

• Next:

• Up: Table of Contents

• Buy: Generative AI in C++: Coding Transformers and LLMs

The new AI programming book by Aussie AI co-founders:

AI coding in C++
Transformer engine speedups
LLM models
Phone and desktop AI
Code examples
Research citations

Get your copy from Amazon: Generative AI in C++

Aussie AI

Top-K Vector Algorithm

Top-K Vector Algorithm

Quick Links

Product

New to Writing?

Writing Styles