c - C语言中的LRU缓存设计，大小有限

Question

我现在正在开发内存非常小的移动平台上的软件。在 I/O 瓶颈函数中，我需要使用 seek 操作从 img 文件中读取一些字节（您可以假设 seek 比直接从 memmry 中读取慢约 10 倍）。在我的测试中，这个函数被调用了 7480325 次，从 bytes_offset 6800 到 130000 读取字节，所以每个字节平均读取 100 次（有些字节读取 3~4 次，有些超过 1000 次）。

下面是我的统计。

bytes offset 6800 ~ 6900: 170884 times
bytes offset 6900 ~ 7000: 220944 times
bytes offset 7000 ~ 7100: 24216 times
bytes offset 7100 ~ 7200: 9576 times
bytes offset 7200 ~ 7300: 14813 times
bytes offset 7300 ~ 7400: 22109 times
bytes offset 7400 ~ 7500: 19748 times
bytes offset 7500 ~ 7600: 43110 times
bytes offset 7600 ~ 7700: 157976 times
...
bytes offset 121200 ~ 121300: 1514 times
bytes offset 121300 ~ 121400: 802 times
bytes offset 121400 ~ 121500: 606 times
bytes offset 121500 ~ 121600: 444 times
bytes offset 121600 ~ 121700: 398 times

max_bytes_offset 121703
min_bytes_offset 6848

然后我想使用 LRU 模式构建一个缓存，以使 I/O 性能更好。在其他一些问题中，我发现哈希表 + 双向链表是一个好方法。但是如何建立一个结构以最好的方式改善我的问题呢？我计划构建1300个桶，每个桶都有一个最大大小为10的双向链表。那么它需要的总内存大约是13KB。实现和维护都很简单，但我认为效率不是最好的。

在我的统计中，一些字节偏移间隔的命中率更高，而一些间隔则更少。我怎样才能建立一个结构来调整我的统计数据？

而且当我搜索一个键时，我需要遍历整个列表，大小为10。还有其他搜索效率更高的方法吗？

在某些移动平台中，更多的内存可用于缓存，而其他平台则允许更少。除了更改存储桶的大小外，如何使我的缓存适应允许的内存更改？

看来caf的方法比较好。使用大的双向链表和大的哈希表将键映射到节点条目更有意义，并且可以更多地利用 LRU。但是设计哈希函数正成为一个难题。

期待你的建议，谢谢~

score 0 · Accepted Answer

If you're only going to have a maximum of 10 entries in each bucket, then you will likely be better off dispensing with the doubly-linked list and just making each bucket a circular array (which is just 10 entries and a "top of list" index).

You may well even be better off discarding the 10-way set associative design and going with a direct-mapped cache (where you have a larger hash table, with each bucket storing only one entry). Set associative designs are good in hardware, where you can use dedicated hardware to do the n-way comparisons in parallel, but not so much in software (unless you have a vector unit you can leverage for this).

One way to adapt to your statistics is to design your hash function so that it maps a different size address range onto each bucket, such that each bucket gets a roughly equal frequency of access.

Changing the size of the hash table is the other obvious way of scaling the size of the cache.

c - C语言中的LRU缓存设计，大小有限

1 回答 1

Related

Reference