CPUID
Jun 20, 2024 23:30 · 1496 words · 3 minute read
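How does the Linux kernel obtain CPU hardware information (vendor, number of physical cores, and so on) during initialization? This time we start from a third-party Go library, klauspost/cpuid, and work our way down step by step.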
Golang
package main
import (
"fmt"
. "github.com/klauspost/cpuid/v2"
)
func main() {
fmt.Println("Name:", CPU.BrandName)
fmt.Println("PhysicalCores:", CPU.PhysicalCores)
fmt.Println("ThreadsPerCore:", CPU.ThreadsPerCore)
}
Run the snippet above:
$ go run main.go
Name: Intel(R) Xeon(R) CPU E5-2699 v3 @ 2.30GHz
PhysicalCores: 18
ThreadsPerCore: 2
The CPU variable is defined at https://github.com/klauspost/cpuid/blob/95e7626938069ea64e5c91ca2fe36945786fead9/cpuid.go#L331-L335:
// CPUInfo contains information about the detected system CPU.
type CPUInfo struct {
BrandName string // Brand name reported by the CPU
VendorID Vendor // Comparable CPU vendor ID
VendorString string // Raw vendor string.
featureSet flagSet // Features of the CPU
PhysicalCores int // Number of physical processor cores in your CPU. Will be 0 if undetectable.
ThreadsPerCore int // Number of threads per physical core. Will be 1 if undetectable.
LogicalCores int // Number of physical cores times threads that can run on each core through the use of hyperthreading. Will be 0 if undetectable.
Family int // CPU family number
Model int // CPU model number
Stepping int // CPU stepping info
CacheLine int // Cache line size in bytes. Will be 0 if undetectable.
Hz int64 // Clock speed, if known, 0 otherwise. Will attempt to contain base clock speed.
BoostFreq int64 // Max clock speed, if known, 0 otherwise
Cache struct {
L1I int // L1 Instruction Cache (per core or shared). Will be -1 if undetected
L1D int // L1 Data Cache (per core or shared). Will be -1 if undetected
L2 int // L2 Cache (per core or shared). Will be -1 if undetected
L3 int // L3 Cache (per core, per ccx or shared). Will be -1 if undetected
}
SGX SGXSupport
AMDMemEncryption AMDMemEncryptionSupport
AVX10Level uint8
maxFunc uint32
maxExFunc uint32
}
var CPU CPUInfo
The various CPU.XXX fields used in the README are all defined in the CPUInfo struct. The key question is how this variable gets assigned (initialized).
First, look at the project layout:
$ ll | grep .go
-rw-r--r-- 1 root root 52K Jun 20 09:37 cpuid.go
-rw-r--r-- 1 root root 7.9K Apr 30 23:26 cpuid_test.go
-rw-r--r-- 1 root root 9.8K Apr 30 23:26 detect_arm64.go
-rw-r--r-- 1 root root 529 Apr 30 23:26 detect_ref.go
-rw-r--r-- 1 root root 1.3K Apr 30 23:26 detect_x86.go
-rw-r--r-- 1 root root 8.7K Jun 20 09:37 featureid_string.go
-rw-r--r-- 1 root root 79 Apr 30 23:26 go.mod
-rw-r--r-- 1 root root 358 Apr 30 23:26 go.sum
-rw-r--r-- 1 root root 5.7K Apr 30 23:26 mockcpu_test.go
-rw-r--r-- 1 root root 3.8K Apr 30 23:26 os_darwin_arm64.go
-rw-r--r-- 1 root root 1.4K Apr 30 23:26 os_darwin_test.go
-rw-r--r-- 1 root root 3.9K Apr 30 23:26 os_linux_arm64.go
-rw-r--r-- 1 root root 367 Apr 30 23:26 os_other_arm64.go
-rw-r--r-- 1 root root 151 Apr 30 23:26 os_safe_linux_arm64.go
-rw-r--r-- 1 root root 237 Apr 30 23:26 os_unsafe_linux_arm64.go
In cpuid.go:
func init() {
initCPU()
Detect()
}
The initCPU function has a different implementation for each CPU architecture, selected through Go conditional compilation (build constraints):
- x86_64
func asmCpuid(op uint32) (eax, ebx, ecx, edx uint32)
func asmCpuidex(op, op2 uint32) (eax, ebx, ecx, edx uint32)
func asmXgetbv(index uint32) (eax, edx uint32)
func asmRdtscpAsm() (eax, ebx, ecx, edx uint32)
func asmDarwinHasAVX512() bool
func initCPU() {
cpuid = asmCpuid
cpuidex = asmCpuidex
xgetbv = asmXgetbv
rdtscpAsm = asmRdtscpAsm
darwinHasAVX512 = asmDarwinHasAVX512
}
- arm64
func initCPU() {
cpuid = func(uint32) (a, b, c, d uint32) { return 0, 0, 0, 0 }
cpuidex = func(x, y uint32) (a, b, c, d uint32) { return 0, 0, 0, 0 }
xgetbv = func(uint32) (a, b uint32) { return 0, 0 }
rdtscpAsm = func() (a, b, c, d uint32) { return 0, 0, 0, 0 }
}
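The two initCPU variants above share one idea: the rest of the package only ever calls a package-level function variable, and each architecture decides what that variable points to. The sketch below illustrates that pattern in a single file. It is a simplified model, not the library's code: `fakeX86Cpuid`, `stubCpuid`, and the `hasCPUID` parameter are hypothetical names, and a real package would pick the implementation with per-architecture files and build constraints rather than a runtime flag.

```go
package main

import "fmt"

// cpuidFunc has the same shape as the hook used in the excerpts above:
// a leaf number in, four register values out.
type cpuidFunc func(op uint32) (eax, ebx, ecx, edx uint32)

// cpuid is the package-level hook that all other code calls.
var cpuid cpuidFunc

// stubCpuid is a hypothetical fallback for architectures without CPUID.
// It reports zeros, so callers see "nothing detectable".
func stubCpuid(op uint32) (eax, ebx, ecx, edx uint32) {
	return 0, 0, 0, 0
}

// fakeX86Cpuid is a hypothetical stand-in for the assembly routine.
// For leaf 0 it returns a made-up maximum leaf (0xb) and the register
// encoding of "GenuineIntel"; every other leaf returns zeros.
func fakeX86Cpuid(op uint32) (eax, ebx, ecx, edx uint32) {
	if op == 0 {
		return 0xb, 0x756e6547, 0x6c65746e, 0x49656e69
	}
	return 0, 0, 0, 0
}

// initCPU wires the hook. The hasCPUID flag is an assumption of this
// sketch; it stands in for the compile-time choice between files.
func initCPU(hasCPUID bool) {
	if hasCPUID {
		cpuid = fakeX86Cpuid
		return
	}
	cpuid = stubCpuid
}

// maxFunctionID reads the highest supported leaf from CPUID leaf 0 (EAX).
func maxFunctionID() uint32 {
	eax, _, _, _ := cpuid(0)
	return eax
}

func main() {
	initCPU(true)
	fmt.Println("max leaf with fake x86 hook:", maxFunctionID())

	initCPU(false)
	fmt.Println("max leaf with stub hook:", maxFunctionID())
}
```

Because callers depend only on the hook, detection code such as maxFunctionID stays identical across architectures and simply sees zeros where the instruction does not exist.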
On x86_64, asmCpuid is implemented in assembly:
// func asmCpuid(op uint32) (eax, ebx, ecx, edx uint32)
TEXT ·asmCpuid(SB), 7, $0
XORL CX, CX
MOVL op+0(FP), AX
CPUID
MOVL AX, eax+4(FP)
MOVL BX, ebx+8(FP)
MOVL CX, ecx+12(FP)
MOVL DX, edx+16(FP)
RET
This shows that CPUID is itself a CPU instruction. In other words, the hardware directly supports this way of querying CPU information.
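To make the register-level result more tangible, here is a small sketch of how the values returned by CPUID leaf 0 can be turned into the vendor string. On x86, the 12-byte vendor ID is spread across EBX, EDX, and ECX, in that order, four ASCII bytes per register. The helper name `vendorString` is hypothetical, and the register values are hard-coded constants rather than read from hardware, so the example runs on any architecture.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// vendorString assembles the 12-byte vendor ID from the registers that
// CPUID leaf 0 fills in. The byte order within each register is
// little-endian, and the register order is EBX, EDX, ECX.
func vendorString(ebx, edx, ecx uint32) string {
	buf := make([]byte, 12)
	binary.LittleEndian.PutUint32(buf[0:4], ebx)
	binary.LittleEndian.PutUint32(buf[4:8], edx)
	binary.LittleEndian.PutUint32(buf[8:12], ecx)
	return string(buf)
}

func main() {
	// Register values that encode "GenuineIntel".
	fmt.Println(vendorString(0x756e6547, 0x49656e69, 0x6c65746e))
}
```

The assembly routine above writes the four registers back to the caller, and everything after that is ordinary bit and byte manipulation like this.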
Next, look at how the PhysicalCores field of the CPUInfo struct is populated:
https://github.com/klauspost/cpuid/blob/f89c8c58bdd5348f54ac22d0d58cf797c35bdc2b/detect_x86.go#L33
func addInfo(c *CPUInfo, safe bool) {
// a lot of code
c.PhysicalCores = physicalCores()
c.VendorID, c.VendorString = vendorID()
c.AVX10Level = c.supportAVX10()
c.cacheSize()
c.frequencies()
}
https://github.com/klauspost/cpuid/blob/95e7626938069ea64e5c91ca2fe36945786fead9/cpuid.go#L847-L868
func logicalCores() int {
mfi := maxFunctionID()
v, _ := vendorID()
switch v {
case Intel:
// Use this on old Intel processors
if mfi < 0xb {
if mfi < 1 {
return 0
}
// CPUID.1:EBX[23:16] represents the maximum number of addressable IDs (initial APIC ID)
// that can be assigned to logical processors in a physical package.
// The value may not be the same as the number of logical processors that are present in the hardware of a physical package.
_, ebx, _, _ := cpuid(1)
logical := (ebx >> 16) & 0xff
return int(logical)
}
_, b, _, _ := cpuidex(0xb, 1)
return int(b & 0xffff)
case AMD, Hygon:
_, b, _, _ := cpuid(1)
return int((b >> 16) & 0xff)
default:
return 0
}
}
func physicalCores() int {
v, _ := vendorID()
switch v {
case Intel:
return logicalCores() / threadsPerCore()
case AMD, Hygon:
lc := logicalCores()
tpc := threadsPerCore()
if lc > 0 && tpc > 0 {
return lc / tpc
}
// a lot of code here
}
return 0
}
Both functions still end up calling cpuid, whose x86_64 implementation is the asmCpuid assembly routine.
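The bit masks in logicalCores and the division in physicalCores can be isolated into a few pure functions. The sketch below does that with hypothetical helper names (`logicalFromLeaf1`, `logicalFromLeafB`, `physicalFrom`) and an illustrative EBX value chosen to match the 18-core, 2-thread machine from the sample output. It is not a register dump from that machine, and the zero guard in `physicalFrom` is a choice of this sketch.

```go
package main

import "fmt"

// logicalFromLeaf1 extracts CPUID.1:EBX[23:16], the maximum number of
// addressable logical processor IDs in a physical package.
func logicalFromLeaf1(ebx uint32) int {
	return int((ebx >> 16) & 0xff)
}

// logicalFromLeafB extracts EBX[15:0] from CPUID leaf 0xb, subleaf 1,
// which is the field the Intel branch above reads on newer processors.
func logicalFromLeafB(ebx uint32) int {
	return int(ebx & 0xffff)
}

// physicalFrom divides logical processors by threads per core and
// returns 0 when either input is not usable, avoiding a divide by zero.
func physicalFrom(logical, threadsPerCore int) int {
	if logical <= 0 || threadsPerCore <= 0 {
		return 0
	}
	return logical / threadsPerCore
}

func main() {
	// Illustrative value: 36 logical processors in the low 16 bits.
	ebx := uint32(36)
	logical := logicalFromLeafB(ebx)
	fmt.Println("logical:", logical)
	fmt.Println("physical:", physicalFrom(logical, 2))
}
```

With 36 logical processors and 2 threads per core, the division yields the 18 physical cores printed by the sample program at the top of the post.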
Linux
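Having seen how a third-party Go library queries CPU information through CPUID, we return to the main question: how does the Linux kernel obtain CPU information during initialization?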
int check_cpu(int *cpu_level_ptr, int *req_level_ptr, u32 **err_flags_ptr)
{
// a lot of code here
else if (err == 0x01 && is_transmeta()) {
/* Transmeta might have masked feature bits in word 0 */
u32 ecx = 0x80860004;
u32 eax, edx;
u32 level = 1;
asm("rdmsr" : "=a" (eax), "=d" (edx) : "c" (ecx));
asm("wrmsr" : : "a" (~0), "d" (edx), "c" (ecx));
asm("cpuid"
: "+a" (level), "=d" (cpu.flags[0])
: : "ecx", "ebx");
asm("wrmsr" : : "a" (eax), "d" (edx), "c" (ecx));
err = check_cpuflags();
}
// a lot of code here
}
Here is the by now "familiar" assembly instruction. The check_cpu function is in turn called by validate_cpu:
int validate_cpu(void)
{
u32 *err_flags;
int cpu_level, req_level;
check_cpu(&cpu_level, &req_level, &err_flags);
if (cpu_level < req_level) {
printf("This kernel requires an %s CPU, ",
cpu_name(req_level));
printf("but only detected an %s CPU.\n",
cpu_name(cpu_level));
return -1;
}
// a lot of code here
}
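The `err == 0x01` test in check_cpu is easier to follow with a model of what such an error bitmask could encode. The Go sketch below rests on an assumption that the excerpts do not show: that check_cpuflags compares required feature bits against detected ones word by word and sets bit i of the result when word i is missing something. This reading is consistent with the "feature bits in word 0" comment, but the names `checkFlags` and `ncapints` and the flag values are hypothetical.

```go
package main

import "fmt"

// ncapints is a hypothetical number of 32-bit feature words.
const ncapints = 4

// checkFlags returns the required bits that are absent in each word,
// plus a summary mask with bit i set when word i has missing bits.
func checkFlags(req, have [ncapints]uint32) (missing [ncapints]uint32, err uint32) {
	for i := range req {
		// AND NOT: keep only the bits that are required but not present.
		missing[i] = req[i] &^ have[i]
		if missing[i] != 0 {
			err |= uint32(1) << uint(i)
		}
	}
	return missing, err
}

func main() {
	req := [ncapints]uint32{0x3, 0, 0, 0}     // word 0 requires bits 0 and 1
	have := [ncapints]uint32{0x1, 0xff, 0, 0} // word 0 only has bit 0
	missing, err := checkFlags(req, have)
	fmt.Printf("err=%#x missing[0]=%#x\n", err, missing[0])
}
```

Under this model, a result of 0x01 means that only word 0 has missing bits, which is the case where the Transmeta branch unmasks the feature bits, re-runs CPUID, and checks again.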
The validate_cpu function is in turn called by the entry function main:
void main(void)
{
/* First, copy the boot header into the "zeropage" */
copy_boot_params();
/* Initialize the early-boot console */
console_init();
if (cmdline_find_option_bool("debug"))
puts("early console in setup code\n");
/* End of heap check */
init_heap();
/* Make sure we have all the proper CPU support */
if (validate_cpu()) {
puts("Unable to boot - please use a kernel appropriate "
"for your CPU.\n");
die();
}
// a lot of code here
}
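When GRUB hands control to the Linux kernel, execution starts with the entry function above, which sets up the kernel's runtime environment. At the end, jmpl *%eax jumps to the kernel's 32-bit entry point, from which boot continues until it reaches start_kernel.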