Navigation
Welcome to the performance engineering resource hub. Use the links below to navigate the deep-dive guides:
- CUDA & PyTorch Performance Engineering — GPU kernel optimization, Triton, FlashAttention, and memory bandwidth profiling.
- Low-Latency Performance C++ & Computer Architecture — Micro-optimization, SIMD, instruction latencies, and compiler exploration.
- Operating System Internals & Linux Kernel Bypass — eBPF, lock-free data structures, DPDK, and Solarflare ef_vi.
- HFT System Design — LMAX Disruptor, mechanical sympathy, order book construction, and trading system architecture.