Answers for all platforms are welcome; please state which platform your answer applies to.
A similar question: How to programmatically get the CPU cache page size in C++?
On Linux (with a reasonably recent kernel), you can get this information from /sys:
/sys/devices/system/cpu/cpu0/cache/
This directory has a subdirectory for each level of cache. Each of those subdirectories contains the following files:
coherency_line_size
level
number_of_sets
physical_line_partition
shared_cpu_list
shared_cpu_map
size
type
ways_of_associativity
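This gives you more information about the cache than you'd ever hope to know, including the cache line size (coherency_line_size) and which CPUs share this cache. This is very useful if you are doing multithreaded programming with shared data (you'll get better results if the threads sharing data also share a cache).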
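For what it's worth, here is a minimal sketch of walking that directory from C++. It assumes the per-cache subdirectories are named index0, index1, and so on (as in the index0 path used further down this thread) and that there are fewer than 16 of them; the helper names read_sysfs_line and print_cpu0_caches are my own, not part of any API.

```cpp
#include <fstream>
#include <iostream>
#include <string>

// Read the first line of a sysfs file; returns an empty string on failure.
static std::string read_sysfs_line(const std::string& path) {
    std::ifstream in(path);
    std::string line;
    if (in) std::getline(in, line);
    return line;
}

// Print level, type, line size and sharing CPUs for each cache of cpu0.
// Returns the number of cache entries found (0 if sysfs is unavailable).
static int print_cpu0_caches() {
    // Assumption: subdirectories are named index0, index1, ...
    const std::string base = "/sys/devices/system/cpu/cpu0/cache/index";
    int count = 0;
    for (int idx = 0; idx < 16; ++idx) {  // 16 is an arbitrary upper bound
        const std::string dir = base + std::to_string(idx) + "/";
        const std::string level = read_sysfs_line(dir + "level");
        if (level.empty()) break;  // no more indexN directories
        std::cout << "L" << level << " " << read_sysfs_line(dir + "type")
                  << ": line size " << read_sysfs_line(dir + "coherency_line_size")
                  << " bytes, shared by CPUs "
                  << read_sysfs_line(dir + "shared_cpu_list") << "\n";
        ++count;
    }
    return count;
}
```

Calling print_cpu0_caches() from main should print one line per cache on a Linux box; on other systems it simply finds nothing and returns 0.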
On Linux look at sysconf(3).
sysconf (_SC_LEVEL1_DCACHE_LINESIZE)
You can also get it from the command line using getconf:
$ getconf LEVEL1_DCACHE_LINESIZE
64
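A small sketch of the sysconf call wrapped in a function, in case it helps. As far as I know _SC_LEVEL1_DCACHE_LINESIZE is a glibc extension rather than POSIX, so the code guards on it; the function name l1d_line_size is just something I made up for the example.

```cpp
#include <unistd.h>

// Returns the L1 data cache line size in bytes, or 0 if it is unknown.
static long l1d_line_size() {
#ifdef _SC_LEVEL1_DCACHE_LINESIZE
    long n = sysconf(_SC_LEVEL1_DCACHE_LINESIZE);
    return n > 0 ? n : 0;  // sysconf can report -1 or 0 when unavailable
#else
    return 0;              // constant not provided by this C library
#endif
}
```

Treat a return value of 0 as "unknown" and fall back to another method (or a conservative default) in that case.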
I have been working on some cache line stuff and needed to write a cross-platform function. I committed it to a github repo at https://github.com/NickStrupat/CacheLineSize, or you can just use the source below. Feel free to do whatever you want with it.
#ifndef GET_CACHE_LINE_SIZE_H_INCLUDED
#define GET_CACHE_LINE_SIZE_H_INCLUDED

// Author: Nick Strupat
// Date: October 29, 2010
// Returns the cache line size (in bytes) of the processor, or 0 on failure

#include <stddef.h>
size_t cache_line_size(void);

#if defined(__APPLE__)

#include <sys/sysctl.h>
size_t cache_line_size(void) {
    size_t line_size = 0;
    size_t sizeof_line_size = sizeof(line_size);
    // On failure line_size keeps its initial value of 0
    sysctlbyname("hw.cachelinesize", &line_size, &sizeof_line_size, 0, 0);
    return line_size;
}

#elif defined(_WIN32)

#include <stdlib.h>
#include <windows.h>
size_t cache_line_size(void) {
    size_t line_size = 0;
    DWORD buffer_size = 0;
    DWORD i = 0;
    SYSTEM_LOGICAL_PROCESSOR_INFORMATION * buffer = 0;

    // The first call fails by design and reports the required buffer size
    GetLogicalProcessorInformation(0, &buffer_size);
    buffer = (SYSTEM_LOGICAL_PROCESSOR_INFORMATION *)malloc(buffer_size);
    if (!buffer || !GetLogicalProcessorInformation(&buffer[0], &buffer_size)) {
        free(buffer);
        return 0;
    }

    // Use the line size of the first level 1 cache entry
    for (i = 0; i != buffer_size / sizeof(SYSTEM_LOGICAL_PROCESSOR_INFORMATION); ++i) {
        if (buffer[i].Relationship == RelationCache && buffer[i].Cache.Level == 1) {
            line_size = buffer[i].Cache.LineSize;
            break;
        }
    }

    free(buffer);
    return line_size;
}

#elif defined(__linux__)

#include <stdio.h>
size_t cache_line_size(void) {
    FILE * p = fopen("/sys/devices/system/cpu/cpu0/cache/index0/coherency_line_size", "r");
    unsigned int i = 0;
    if (p) {
        // %u matches unsigned int; report 0 if nothing could be parsed
        if (fscanf(p, "%u", &i) != 1)
            i = 0;
        fclose(p);
    }
    return i;
}

#else
#error Unrecognized platform
#endif

#endif
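On x86, you can use the CPUID instruction with function 2 to determine various properties of the cache and the TLB. Parsing the output of function 2 is somewhat complicated, so I'll refer you to section 3.1.3 of Intel Processor Identification and the CPUID Instruction (PDF).
To get this data from C/C++ code, you'll need to use inline assembly, compiler intrinsics, or call an external assembly function to execute the CPUID instruction.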
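As a rough illustration of the intrinsics route, here is a sketch using the __get_cpuid helper from GCC/Clang's <cpuid.h>. Note that it does not parse function 2 as described above; it reads function (leaf) 1 instead, where EBX bits 15:8 give the CLFLUSH line size in 8-byte units. I'm assuming that value matches the cache line size you care about, which is usually but not necessarily the case. The function name cpuid_clflush_line_size is hypothetical.

```cpp
#include <cstddef>
#if defined(__i386__) || defined(__x86_64__)
#include <cpuid.h>  // GCC/Clang only; MSVC offers __cpuid in <intrin.h> instead
#endif

// Returns the CLFLUSH line size in bytes reported by CPUID leaf 1,
// or 0 if it cannot be determined.
static std::size_t cpuid_clflush_line_size() {
#if defined(__i386__) || defined(__x86_64__)
    unsigned int eax = 0, ebx = 0, ecx = 0, edx = 0;
    if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx))
        return 0;                      // leaf 1 not supported
    if (!(edx & (1u << 19)))
        return 0;                      // CLFSH feature flag not set
    // EBX bits 15:8 hold the CLFLUSH line size in units of 8 bytes.
    return static_cast<std::size_t>((ebx >> 8) & 0xffu) * 8;
#else
    return 0;                          // not an x86 target
#endif
}
```

If you need per-level details (L2, L3, associativity), this shortcut won't give them to you and you're back to parsing the descriptors from the Intel document.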
If you're using SDL2 you can use this function:
int SDL_GetCPUCacheLineSize(void);
It returns the L1 cache line size, in bytes.
On my x86_64 machine, running this code snippet:
printf("CacheLineSize = %d",SDL_GetCPUCacheLineSize());
Produces CacheLineSize = 64
I know I'm a little late, but just adding information for future visitors. The SDL documentation currently says the number returned is in KB, but it is actually in bytes.
On the Windows platform:
From http://blogs.msdn.com/oldnewthing/archive/2009/12/08/9933836.aspx
The GetLogicalProcessorInformation function will give you characteristics of the logical processors in use by the system. You can walk the SYSTEM_LOGICAL_PROCESSOR_INFORMATION returned by the function looking for entries of type RelationCache. Each such entry contains a ProcessorMask which tells you which processor(s) the entry applies to, and in the CACHE_DESCRIPTOR, it tells you what type of cache is being described and how big the cache line is for that cache.
ARMv6 and above have C0, or the Cache Type Register. However, it's only available in privileged mode.
For example, from Cortex™-A8 Technical Reference Manual:
The purpose of the Cache Type Register is to determine the instruction and data cache minimum line length in bytes to enable a range of addresses to be invalidated.
The Cache Type Register is:
- a read-only register
- accessible in privileged modes only.
The contents of the Cache Type Register depend on the specific implementation. Figure 3-2 shows the bit arrangement of the Cache Type Register...
Don't assume the ARM processor has a cache (apparently, some can be configured without one). The standard way to determine it is via C0. From the ARM ARM, page B6-6:
From ARMv6, the System Control Coprocessor Cache Type register is the mandated method to define the L1 caches, see Cache Type register on page B6-14. It is also the recommended method for earlier variants of the architecture. In addition, Considerations for additional levels of cache on page B6-12 describes architecture guidelines for level 2 cache support.
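You can also try to determine it programmatically by measuring some timings. Obviously, this won't always be as precise as cpuid and the like, but it is more portable. ATLAS does this during its configuration stage, so you may want to look at it: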
You can use std::hardware_destructive_interference_size since C++17.
It's defined as:
Minimum offset between two objects to avoid false sharing. Guaranteed to be at least alignof(std::max_align_t)
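Keep in mind this is a compile-time constant chosen by the compiler for the target, not a query of the CPU you end up running on. A minimal sketch of how it might be used to keep two counters off the same cache line follows; the 64-byte fallback for standard libraries that don't ship the constant is my own assumption, and kLineSize and Counters are made-up names.

```cpp
#include <cstddef>
#include <new>

#ifdef __cpp_lib_hardware_interference_size
constexpr std::size_t kLineSize = std::hardware_destructive_interference_size;
#else
// Assumption: 64 bytes is a common line size, but it is not guaranteed.
constexpr std::size_t kLineSize = 64;
#endif

// Two counters meant to be updated by different threads.
// Aligning each to kLineSize places them at least kLineSize bytes apart.
struct Counters {
    alignas(kLineSize) long a;
    alignas(kLineSize) long b;
};

static_assert(offsetof(Counters, b) - offsetof(Counters, a) >= kLineSize,
              "counters should not share a cache line");
```

The trade-off is memory: each Counters object now occupies at least two full lines.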