Aussie AI
Aussie AI Coding Overview
Aussie AI Coding
- Code examples overview
- Generative AI in C++ book (Full text online free, TOC, bonus materials)
- Source code repository
Aussie AI Apps
Aussie AI advanced apps using LLMs and generative AI technologies include:
AI Coding Blog Articles
Blog articles for AI developers:
- 500+ Techniques for LLM Inference Optimization
- State-of-the-Art LLM Backends
- Low Latency Programming
- Debugging OpenAI Node.js API Wrappers
- Reasoning Decoding Algorithms
- Reasoning Inference Optimization
- Reasoning is the New AI Middleware
- Generalizing Prefix KV Caching to RAG Chunks
- RAG Optimization via Caching
- All Aussie AI Blog Articles
Memory-Safe C++ Blog Articles
General C++ coding articles on safety:
- DIY Preventive C++ Memory Safety
- Canary Values & Redzones for Memory-Safe C++
- Use-After-Free Memory Errors in C++
- Array Bounds Violations and Memory Safe C++
- Poisoning Memory Blocks for Safer C++
- Uninitialized Memory Safety in C++
- DIY Memory Safety in C++
- Memory Safe C++ Library Functions
- Smart Stack Buffers for Memory Safe C++
- Safe C++ Text Buffers with snprintf
- Safe C++ Coding Book
CUDA C++ Programming Blog Articles
Blogs for CUDA C++ developers covering optimization and debugging:
- CUDA C++ Floating Point Exceptions
- CUDA Memory Coalescing Optimizations
- CUDA GPU Thread Divergence
- CUDA Basic C++ Programming Mistakes
- CUDA C++ Optimization Book
- CUDA C++ Debugging Book
AI Coding Research Pages
- Chain-of-Thought Efficiency Optimization
- Hot Inference Optimization Techniques
- Sequential Speculative Decoding
- Quantization (floating-point-free zone!)
- Pruning (cut in four dimensions!)
- Parameter Sharing (Weight Sharing)
- Activation Function Optimizations
- Normalization Optimizations
- Softmax Optimizations
- FFN/MLP Optimizations
- MatMul/GEMM/GEMV Optimizations (really hard stuff)
- Positional Encoding Optimizations
- Decoding Algorithm Variants and Optimizations
- Speculative Decoding Types and Optimizations (endless papers...)
- PEFT and LoRA Optimizations
- Attention Module Types and Optimizations (go long!)
- Prompt Caching and KV Cache Optimizations
- Zero-Multiplication Models (ban the * operator!)
- Research Overview Index (many more topics)