Aussie AI
Reference Lists for Generative AI in C++
Last Updated 4th March, 2024
by David Spuler, Ph.D.
Reference Lists for Each Chapter
Here is a detailed list of the related references and research coverage for each chapter of Generative AI in C++ by David Spuler.
The new Generative AI programming book by Aussie AI co-founders:
Get your copy from Amazon: Generative AI in C++
Book Information
For more about Generative AI in C++ see also:
Part I: AI Projects in C++
- Further reading:
- Kirill Kolodiazhnyi (2020), Hands-On Machine Learning in C++, Packt Publishing, May 2020, https://www.amazon.com/Hands-Machine-Learning-end-end/dp/1789955335 (And a second edition is forthcoming, due Nov 2024.)
- Venish Patidar (2022), Developers Guide for Building Own Neural Network Library, October 1, 2022, https://www.amazon.com/DEVELOPERS-BUILDING-NEURAL-NETWORK-LIBRARY/dp/B0BGNF1KK6/
- Research papers:
1. Introduction to AI in C++
- Research papers:
- Market Research
- Transformer architectures (overview)
- AI phones
- AI PCs (desktops/laptops)
2. Transformers & LLMs
- Research papers:
- Market Research
- Transformer architectures (overview)
- AI phones
- AI PCs (desktops/laptops)
3. AI Phones
- Research papers:
4. AI on Your Desktop
- Further reading:
- Kirill Kolodiazhnyi (2020), Hands-On Machine Learning in C++, Packt Publishing, May 2020, https://www.amazon.com/Hands-Machine-Learning-end-end/dp/1789955335 (And a second edition is forthcoming, due Nov 2024.)
- Venish Patidar (2022), Developers Guide for Building Own Neural Network Library, October 1, 2022, https://www.amazon.com/DEVELOPERS-BUILDING-NEURAL-NETWORK-LIBRARY/dp/B0BGNF1KK6/
- Research papers:
- AI PCs (desktops/laptops)
- Market Research
- Edge device inference (mobile/PC)
- GenAI market evolution
5. Design Choices & Architectures
- Research papers:
- Transformer architectures (overview)
- Market Research
- AI phones
- AI PCs (desktops/laptops)
6. Training, Fine-Tuning & RAG
- Research papers:
7. Deployment Architecture
Part II: Basic C++ Optimizations
- Research papers:
8. Bitwise Operations
- Research papers:
9. Floating Point Arithmetic
10. Arithmetic Optimizations
- Research papers:
11. Compile-Time Optimizations
- Research papers:
12. Pointer Arithmetic
- Research papers:
13. Algorithm Speedups
- Research papers:
14. Memory Optimizations
- Research papers:
Part III: Parallel C++ Optimizations
- Research papers:
15. Loop Vectorization
- Research papers:
- Loop optimizations (overview)
- Loop fusion (merging loops)
- Loop unrolling
- Loop perforation
- Loop reordering
- Loop tiling
- Loop reversal
- Loop fission (splitting a loop)
- Loop interchange
- Loop coalescing
- Loop-invariant code motion ("hoisting")
- Loop distribution
- Pointer arithmetic
- Loop peeling (unrolling first iterations)
- Loop splitting
- Loop sentinel
- Loop collapsing
- Loop normalization
- Loop strip mining (Loop sectioning)
- Loop skewing
- Loop spreading
- Parallelization
- Vectorization
- Kernel operator fusion (merging two operations)
- Kernel fission (splitting)
16. Hardware Acceleration
- Research papers:
17. AVX Intrinsics
- Research papers:
18. Parallel Data Structures
- Research papers:
Part IV: Transformer Components in C++
- Research papers:
19. Encoders & Decoders
20. Attention
- Research papers:
21. Activation Functions
- Research papers:
22. Vector Algorithms
- Research papers:
23. Tensors
- Research papers:
- Tensor decomposition
- Faster matrix multiplication (e.g. Winograd, Strassen)
- Approximate matrix multiplication
24. Normalization
- Research papers:
25. Softmax
- Research papers:
26. Decoding Algorithms
- Research papers:
27. Tokenizer and Vocabulary
- Research papers:
Part V: Optimizing Transformers in C++
- Research papers:
28. Deslugging AI Engines
- Research papers:
29. Caching Optimizations
- Research papers:
30. Vectorization
- Research papers:
- Vectorization
- Parallelization
- Pipelining
- Kernel operator fusion (merging two operations)
- Kernel fission (splitting)
31. Kernel Fusion
- Research papers:
- Kernel operator fusion (merging two operations)
- Kernel fission (splitting)
- Loop fusion (merging loops)
- Loop fission (splitting a loop)
- Fused Multi-Head Attention (MHA)
- Fused activation functions
- Fused ReLU
- Fused GELU
- Fused SwiGLU
- Fused normalization (e.g. "fused LayerNorm")
- Fused Softmax
- Fused multiply-add (FMA)
- Fused transpose
- Negative skipping
32. Quantization
- Research papers:
33. Pruning
- Research papers:
34. MatMul/GEMM
- Research papers:
- Faster matrix multiplication (e.g. Winograd, Strassen)
- Approximate matrix multiplication
- Transpose cache
- Fused multiply-add (FMA)
- Fused transpose
- Vector dot product optimization
- FFN pruning
- Fused add-bias
- Bias vector pruning
- Low-rank matrices
- Matrix Algebra (factorization)
- Butterfly matrices
- Monarch matrices
- Sparse matrices (sparsification)
35. Lookup Tables & Precomputation
- Research papers:
36. AI Memory Optimizations
- Research papers:
Part VI: Enterprise AI in C++
- Research papers:
37. Tuning, Profiling & Benchmarking
- Research papers:
38. Platform Portability
- Research papers:
39. Quality
- Research papers:
40. Reliability
- Research papers:
41. Self-Testing Code
- Research papers:
42. Debugging
- Research papers:
Part VII: Research on AI Optimization
- Research papers:
43. Overview of AI Research
- Research papers:
44. Advanced Quantization
- Research papers:
- Quantization research
- Model compression research
- Binary quantization
- Ternary quantization
- 2-bit quantization (INT2)
- 3-bit quantization (INT3)
- 4-bit quantization (INT4)
- 5-bit quantization (INT5)
- 6-bit quantization (INT6)
- 7-bit quantization (INT7)
- 8-bit quantization (INT8)
- Integer quantization
- Integer-only arithmetic quantization
- FP8 quantization
- Logarithmic power-of-two quantization (bitshift quantization)
- Double bitshift power-of-two quantization
- Division quantization
- Cluster-based quantization (Weight clustering)
- Dyadic quantization
- Fake quantization
- Simulated quantization
- Stochastic quantization (probabilistic)
- Weight clustering
45. Knowledge Distillation
- Research papers:
46. Structured Pruning
- Research papers:
47. Early Exit and Layer Pruning
- Research papers:
- Early exit (dynamic layer pruning)
- Layer pruning
- Depth pruning (overview)
- Layer skipping
- Shallow decoder architecture (layer pruning)
- Layer fusion
- Layer reordering
48. Width Pruning
- Research papers:
49. Length Pruning
- Research papers:
50. Adaptive Inference
- Research papers:
- End-to-End integer inference
- Dynamic inference (adaptive inference)
- Skipping optimizations
51. Zero-Multiplication Models
- Research papers:
- Zero-Multiplication Models (overview)
- Integer-only Transformers
- Binary quantization
- Ternary quantization
- 2-bit quantization (INT2)
- Adder networks
- Bitshift-add networks
- Bitshift power-of-2 quantization
- Double bitshift quantization
- Add-as-integer networks
- Logarithmic Models
- Bitwise neural networks
- Diff-squared networks
- Log-sum-exp (LSE) networks
- Max-Plus networks
- Min-Max-Plus networks
- Morphological networks
- Trigonometric approximate inference
- Weightless Neural Networks (WNNs)
- XNOR networks
- End-to-End integer inference
52. Logarithmic Models
53. Arithmetic Optimization Research
- Research papers:
- Advanced AI Mathematics
- Integer-only Transformers
- Integer-only arithmetic quantization
- End-to-End integer inference
- Reciprocal multiplication
- Constant folding
- Common subexpression elimination
- Strength reduction
- Floating point bitwise arithmetic
- Addition optimizations
- Approximate addition
- Multiplication algorithms
- Approximate multiplication
- Logarithmic approximate multiplication
- Division optimizations
- Approximate division
- Bitwise operator inference
- Bitserial operations
54. Ensemble Multi-Model Architectures
- Research papers:
55. Advanced Number Systems
- Research papers:
- Advanced AI Mathematics
- Integer-only Transformers
- End-to-End integer inference
- Floating point bitwise arithmetic
- Posit number system (PNS)
- Residue number system (RNS)
- Logarithmic number system (LNS)
- Dyadic numbers
- Double-base number system (DBNS)
- Dynamic number systems
- Hybrid number systems
- Tropical algebra (max-plus)
- MiniMax algebra
- Multi-dimensional logarithmic number system (MDLNS)
- Multiple-Base Number System (MBNS)
- Matrix Algebra (factorization)
- Approximate matrix multiplication
- Butterfly matrices
- Monarch matrices
56. Neural Architecture Search
- Research papers:
Appendix 1: C++ Slug Catalog
- Research papers:
More AI Research
Read more about:
- GenAI market research
- AI on Phones
- Inference Optimizations
- Loop Optimizations
- Code Optimizations