Table of Contents

- Introduction
- Hardware
- Details of caches and memory
- Latency, bandwidth, and throughput
- Latency & pointer chasing
- Pointer chasing
- Bounds checking
- Padding elements
- Raw pointers
- Aligned memory & Hugepages
- Summary
- Random access throughput & batching
- Batching
- Line fill buffers
- The reorder-buffer
- Prefetching
- TODO Memory bandwidth
- TODO Multithreading
- Further links

This (planned) series of posts aims to build a high-performance search algorithm for suffix arrays. We will start from a classic binary search implementation and improve it incrementally, but that is planned for Part 3.