Today I read a paper titled “Towards a Theory of Cache-Efficient Algorithms”.
The abstract is:
We describe a model that enables us to analyze the running time of an algorithm in a computer with a memory hierarchy with limited associativity, in terms of various cache parameters.
Our model, an extension of Aggarwal and Vitter’s I/O model, enables us to establish useful relationships between the cache complexity and the I/O complexity of computations.
As a corollary, we obtain cache-optimal algorithms for some fundamental problems like sorting, FFT, and an important subclass of permutations in the single-level cache model.
We also show that ignoring associativity concerns could lead to inferior performance, by analyzing the average-case cache behavior of mergesort.
We further extend our model to multiple levels of cache with limited associativity and present optimal algorithms for matrix transpose and sorting.
Our techniques may be used for systematic exploitation of the memory hierarchy starting from the algorithm design stage, and dealing with the hitherto unresolved problem of limited associativity.
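To make the associativity concern concrete for myself, here is a minimal sketch of a set-associative cache simulator with LRU replacement. It is not the paper's model or analysis. The function name `count_misses`, the cache sizes, the LRU policy, and the access trace are all assumptions chosen for illustration. It only shows how two blocks that map to the same set can evict each other repeatedly in a direct-mapped cache, even though the cache has plenty of total capacity.

```python
from collections import OrderedDict


def count_misses(addresses, num_sets, ways, block_size):
    """Count misses in a set-associative cache with LRU replacement.

    Hypothetical helper for illustration only:
    num_sets * ways is the total number of blocks the cache holds,
    and block_size is the number of words per block.
    """
    # One ordered dict per set; insertion order tracks recency (oldest first).
    sets = [OrderedDict() for _ in range(num_sets)]
    misses = 0
    for addr in addresses:
        block = addr // block_size          # which memory block holds this word
        cache_set = sets[block % num_sets]  # the set this block must live in
        if block in cache_set:
            cache_set.move_to_end(block)    # hit: mark as most recently used
        else:
            misses += 1
            if len(cache_set) >= ways:
                cache_set.popitem(last=False)  # evict the least recently used
            cache_set[block] = True
    return misses


# Assumed trace: alternate between two words whose blocks (0 and 8)
# collide in the same set for every configuration below.
trace = [0, 32] * 10

# All three caches hold 8 blocks of 4 words; only the associativity differs.
direct_mapped = count_misses(trace, num_sets=8, ways=1, block_size=4)
two_way = count_misses(trace, num_sets=4, ways=2, block_size=4)
fully_associative = count_misses(trace, num_sets=1, ways=8, block_size=4)

print(direct_mapped, two_way, fully_associative)  # prints "20 2 2"
```

With the same capacity, the direct-mapped cache misses on every access while the 2-way and fully associative caches miss only on the two cold loads. A model that assumes full associativity, as the plain I/O model does, would not see that difference.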