Go Map
Feb 27, 2022 11:30 · 4992 words · 10 minute read
One of the most useful data structures in computer science is the hash table. Many hash table implementations exist with varying properties, but in general they offer fast lookups, adds, and deletes. Go provides a built-in map type that implements a hash table.
The hash table is one of the most efficient data structures in computer science (O(1) lookups on average). Go's built-in map is implemented on top of a hash table and resolves hash collisions by chaining overflow buckets.
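To ground the discussion, here are the basic operations, each of which the compiler lowers to one of the runtime routines examined below (delete's mapdelete is not covered in this post):
package main

import "fmt"

func main() {
	m := make(map[string]int, 8) // makemap
	m["k1"] = 7                  // mapassign
	v, ok := m["k1"]             // mapaccess2
	delete(m, "k1")              // mapdelete (not covered here)
	fmt.Println(v, ok, len(m))   // 7 true 0; len reads hmap.count
}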
In Go, the underlying implementation of a map is the hmap struct:
type hmap struct {
// Note: the format of the hmap is also encoded in cmd/compile/internal/gc/reflect.go.
// Make sure this stays in sync with the compiler's definition.
count int // # live cells == size of map. Must be first (used by len() builtin)
flags uint8
B uint8 // log_2 of # of buckets (can hold up to loadFactor * 2^B items)
noverflow uint16 // approximate number of overflow buckets; see incrnoverflow for details
hash0 uint32 // hash seed
buckets unsafe.Pointer // array of 2^B Buckets. may be nil if count==0.
oldbuckets unsafe.Pointer // previous bucket array of half the size, non-nil only when growing
nevacuate uintptr // progress counter for evacuation (buckets less than this have been evacuated)
extra *mapextra // optional fields
}
- count: the number of elements stored in the map; the len() builtin returns this value directly
- B: the bucket array holds 2^B buckets
- noverflow: an approximate count of overflow buckets
- hash0: the hash seed, which adds randomness to the hash function
- buckets: pointer to the bucket array; it has 2^B entries, i.e. 2^B buckets
- oldbuckets: growing doubles the bucket array, and the pre-growth buckets pointer is kept here while the grow is in progress
- flags: state bits; values such as iterator, oldIterator, hashWriting, and sameSizeGrow show up later in this post
Let's first look at the underlying implementation of a bucket, the bmap struct:
const (
// Maximum number of key/elem pairs a bucket can hold.
bucketCntBits = 3
bucketCnt = 1 << bucketCntBits
)
type bmap struct {
// tophash generally contains the top byte of the hash value
// for each key in this bucket. If tophash[0] < minTopHash,
// tophash[0] is a bucket evacuation state instead.
tophash [bucketCnt]uint8
// Followed by bucketCnt keys and then bucketCnt elems.
// NOTE: packing all the keys together and then all the elems together makes the
// code a bit more complicated than alternating key/elem/key/elem/... but it allows
// us to eliminate padding which would be needed for, e.g., map[int64]int8.
// Followed by an overflow pointer.
}
tophash is an array of 8 uint8 values; each element holds the top byte of the hash of the corresponding key stored in this bucket.
At runtime a bmap contains more fields than shown, because buckets have to store keys and values. The compiler rewrites the bmap struct during compilation based on the map's key and element types; from the compiler's bmap function we can infer the expanded struct:
type bmap struct {
topbits [8]uint8
keys [8]keytype
elems [8]elemtype
overflow uintptr
}
- topbits: like tophash, the top bytes of the hashes of the keys in this bucket, at most 8
- keys: all the keys, stored contiguously
- elems: the values corresponding to the keys; the Go runtime calls a value an "element"
- overflow: points to an overflow bucket, used to resolve hash collisions
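The NOTE in the bmap source deserves a quick demonstration. Here is a minimal sketch (the types pair and packed are made up for illustration) of how much padding an alternating key/elem layout would waste for a map[int64]int8 bucket, compared with the packed layout Go actually uses:
package main

import (
	"fmt"
	"unsafe"
)

// alternating layout: every int8 elem is followed by 7 bytes of
// padding so that the next int64 key stays 8-byte aligned
type pair struct {
	k int64
	e int8
}

// packed layout used by real buckets: 8 keys, then 8 elems, no padding
type packed struct {
	k [8]int64
	e [8]int8
}

func main() {
	fmt.Println(unsafe.Sizeof([8]pair{})) // 128 bytes
	fmt.Println(unsafe.Sizeof(packed{}))  // 72 bytes
}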
Creating a map
In code we create a map with the built-in make function:
m1 := make(map[string]int) // makemap_small
m2 := make(map[string]int, 8) // makemap
When make is called without a hint, or with hint <= 8, the compiler initializes the map with makemap_small; when hint > 8, make(map[k]v, hint) is compiled into a call to makemap:
-
makemap_small
func makemap_small() *hmap {
	h := new(hmap)
	h.hash0 = fastrand()
	return h
}
-
makemap
// maxAlloc is the maximum size of an allocation. On 64-bit,
// it's theoretically possible to allocate 1<<heapAddrBits bytes. On
// 32-bit, however, this is one less than 1<<32 because the
// number of bytes in the address space doesn't actually fit
// in a uintptr.
const (
	maxAlloc = (1 << heapAddrBits) - (1-_64bit)*1
)

func makemap(t *maptype, hint int, h *hmap) *hmap {
	mem, overflow := math.MulUintptr(uintptr(hint), t.bucket.size)
	if overflow || mem > maxAlloc {
		hint = 0
	}

	// initialize Hmap
	if h == nil {
		h = new(hmap)
	}
	h.hash0 = fastrand()

	// Find the size parameter B which will hold the requested # of elements.
	// For hint < 0 overLoadFactor returns false since hint < bucketCnt.
	B := uint8(0)
	for overLoadFactor(hint, B) {
		B++
	}
	h.B = B

	// allocate initial hash table
	// if B == 0, the buckets field is allocated lazily later (in mapassign)
	// If hint is large zeroing this memory could take a while.
	if h.B != 0 {
		var nextOverflow *bmap
		h.buckets, nextOverflow = makeBucketArray(t, h.B, nil)
		if nextOverflow != nil {
			h.extra = new(mapextra)
			h.extra.nextOverflow = nextOverflow
		}
	}

	return h
}
The makemap function does the following:
- initializes the hmap struct
- sets the hash seed hash0
- finds the smallest B such that 2^B buckets stay within the load factor for hint elements (the capacity passed to make)
- if B != 0, calls makeBucketArray to allocate a bucket array of length 2^B; otherwise the allocation is deferred until the first write (mapassign)
- initializes h.extra.nextOverflow if makeBucketArray pre-allocated any overflow buckets
How the compiler decides which makemap variant to call: https://github.com/golang/go/blob/go1.16.10/src/cmd/compile/internal/gc/walk.go#L1205-L1315
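To make the B-selection loop concrete, here is a self-contained sketch that re-implements overLoadFactor with go1.16's constants (bucketCnt = 8, load factor 13/2 = 6.5) and shows the B that makemap would pick for a few hints; the real function lives in runtime/map.go:
package main

import "fmt"

const (
	bucketCnt     = 8
	loadFactorNum = 13
	loadFactorDen = 2
)

// overLoadFactor reports whether count elements exceed the load
// factor for a table of 2^B buckets (mirrors runtime/map.go).
func overLoadFactor(count int, B uint8) bool {
	return count > bucketCnt && uintptr(count) > loadFactorNum*((uintptr(1)<<B)/loadFactorDen)
}

func main() {
	for _, hint := range []int{0, 8, 9, 14, 52, 53} {
		B := uint8(0)
		for overLoadFactor(hint, B) {
			B++
		}
		fmt.Printf("hint=%2d -> B=%d (%d buckets)\n", hint, B, 1<<B)
	}
}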
Hash computation
When the Go runtime scheduler initializes (https://github.com/golang/go/blob/go1.16.10/src/runtime/proc.go#L592-L687):
func schedinit() {
cpuinit() // must run before alginit
alginit() // maps must not be used before this call
}
The alginit function selects a suitable hash algorithm based on the CPU instruction-set support that cpuinit detected (https://github.com/golang/go/blob/go1.16.10/src/runtime/alg.go#L328-L352):
func alginit() {
// Install AES hash algorithms if the instructions needed are present.
if (GOARCH == "386" || GOARCH == "amd64") &&
cpu.X86.HasAES && // AESENC
cpu.X86.HasSSSE3 && // PSHUFB
cpu.X86.HasSSE41 { // PINSR{D,Q}
initAlgAES()
return
}
if GOARCH == "arm64" && cpu.ARM64.HasAES {
initAlgAES()
return
}
getRandomData((*[len(hashkey) * sys.PtrSize]byte)(unsafe.Pointer(&hashkey))[:])
hashkey[0] |= 1 // make sure these numbers are odd
hashkey[1] |= 1
hashkey[2] |= 1
hashkey[3] |= 1
}
func initAlgAES() {
useAeshash = true
// Initialize with random data so hash collisions will be hard to engineer.
getRandomData(aeskeysched[:])
}
With the required instructions available, the Go runtime on amd64 installs an AES-based hash, initialized with random data so that hash collisions are hard to engineer.
The hash function for keys of type string is strhash:
func strhash(p unsafe.Pointer, h uintptr) uintptr
The real implementation lives in assembly (https://github.com/golang/go/blob/go1.16.10/src/runtime/asm_amd64.s#L895-L905) and differs by CPU architecture:
// func strhash(p unsafe.Pointer, h uintptr) uintptr
TEXT runtime·strhash(SB),NOSPLIT,$0-24
CMPB runtime·useAeshash(SB), $0
JEQ noaes
MOVQ p+0(FP), AX // ptr to string struct
MOVQ 8(AX), CX // length of string
MOVQ (AX), AX // string data
LEAQ ret+16(FP), DX
JMP aeshashbody<>(SB)
noaes:
JMP runtime·strhashFallback(SB)
Some key types also have dedicated hash functions (such as memhash32, memhash64, and strhash) to speed up hashing.
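The runtime's seeded hash is not callable from user code, but hash/maphash is its user-facing analogue: a maphash seed randomizes hashing per process just as hash0 randomizes each map. A tiny demo:
package main

import (
	"fmt"
	"hash/maphash"
)

func main() {
	var h maphash.Hash // the zero Hash picks a random seed on first use
	h.WriteString("k1")
	fmt.Println(h.Sum64()) // differs from run to run because the seed is random
}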
Map access
k1 := m["k1"] // mapaccess1
_, prs := m["k2"] // mapaccess2
Hashing the key yields a 64-bit value (on 64-bit platforms). At compile time, reading a map value by key is converted into a call to mapaccess1 (one return value) or mapaccess2 (two return values):
func mapaccess1(t *maptype, h *hmap, key unsafe.Pointer) unsafe.Pointer {
// a lot of code here
hash := t.hasher(key, uintptr(h.hash0))
m := bucketMask(h.B)
b := (*bmap)(add(h.buckets, (hash&m)*uintptr(t.bucketsize)))
if c := h.oldbuckets; c != nil {
if !h.sameSizeGrow() {
// There used to be half as many buckets; mask down one more power of two.
m >>= 1
}
oldb := (*bmap)(add(c, (hash&m)*uintptr(t.bucketsize)))
if !evacuated(oldb) {
b = oldb
}
}
top := tophash(hash)
bucketloop:
for ; b != nil; b = b.overflow(t) {
for i := uintptr(0); i < bucketCnt; i++ {
if b.tophash[i] != top {
if b.tophash[i] == emptyRest {
break bucketloop
}
continue
}
k := add(unsafe.Pointer(b), dataOffset+i*uintptr(t.keysize))
if t.indirectkey() {
k = *((*unsafe.Pointer)(k))
}
if t.key.equal(key, k) {
e := add(unsafe.Pointer(b), dataOffset+bucketCnt*uintptr(t.keysize)+i*uintptr(t.elemsize))
if t.indirectelem() {
e = *((*unsafe.Pointer)(e))
}
return e
}
}
}
return unsafe.Pointer(&zeroVal[0])
}
- t.hasher hashes the current key
- bucketMask computes the mask; hash & mask yields the bucket index, which in turn gives the bucket's address
- tophash takes the top byte of the hash
- The bucket's tophash array is first scanned for that top byte
- On a match, the offset is used to fetch the stored key[N]
- If key[N] equals the lookup key, the corresponding value[N] is returned
- If no match is found, the zero value is returned
If hash collisions have spilled into overflow buckets, the chain is walked via b = b.overflow(t).
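A simplified sketch of those two helpers, assuming a 64-bit platform and go1.16's minTopHash value (these are stripped-down re-implementations for illustration, not the runtime's own code):
package main

import "fmt"

const minTopHash = 5 // tophash values below this encode evacuation states

// bucketMask returns 2^B - 1 (mirrors runtime/map.go).
func bucketMask(B uint8) uintptr { return (uintptr(1) << B) - 1 }

// tophash takes the top byte of the hash, bumping it past the
// reserved range (mirrors runtime/map.go on 64-bit platforms).
func tophash(hash uintptr) uint8 {
	top := uint8(hash >> 56)
	if top < minTopHash {
		top += minTopHash
	}
	return top
}

func main() {
	var hash uintptr = 0xABCDEF0123456789
	B := uint8(5) // 32 buckets
	fmt.Printf("bucket=%d top=0x%X\n", hash&bucketMask(B), tophash(hash))
}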
mapaccess2 is nearly identical to mapaccess1 apart from the extra boolean, so we will not walk through it.
As an optimization, the compiler also substitutes specialized implementations, mapaccess1_xxx and mapaccess2_xxx, for certain key types:
- uint32
- uint64
- string
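The user-visible difference between the two access forms: mapaccess1 yields the zero value for an absent key, while mapaccess2 also reports whether the key was present:
package main

import "fmt"

func main() {
	m := map[string]int{"k1": 7}
	fmt.Println(m["k2"]) // 0: mapaccess1 returns &zeroVal[0] for a missing key
	v, prs := m["k2"]    // mapaccess2 also reports presence
	fmt.Println(v, prs)  // 0 false
}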
Map assignment
m["k1"] = 7
At compile time, assigning to a map key is converted into a call to the mapassign function:
// Like mapaccess, but allocates a slot for the key if it is not present in the map.
func mapassign(t *maptype, h *hmap, key unsafe.Pointer) unsafe.Pointer {
// a lot of code here
hash := t.hasher(key, uintptr(h.hash0))
// Set hashWriting after calling t.hasher, since t.hasher may panic,
// in which case we have not actually done a write.
h.flags ^= hashWriting
if h.buckets == nil {
h.buckets = newobject(t.bucket) // newarray(t.bucket, 1)
}
again:
bucket := hash & bucketMask(h.B)
if h.growing() {
growWork(t, h, bucket)
}
b := (*bmap)(add(h.buckets, bucket*uintptr(t.bucketsize)))
top := tophash(hash)
var inserti *uint8
var insertk unsafe.Pointer
var elem unsafe.Pointer
bucketloop:
for {
for i := uintptr(0); i < bucketCnt; i++ {
if b.tophash[i] != top {
if isEmpty(b.tophash[i]) && inserti == nil {
inserti = &b.tophash[i]
insertk = add(unsafe.Pointer(b), dataOffset+i*uintptr(t.keysize))
elem = add(unsafe.Pointer(b), dataOffset+bucketCnt*uintptr(t.keysize)+i*uintptr(t.elemsize))
}
if b.tophash[i] == emptyRest {
break bucketloop
}
continue
}
k := add(unsafe.Pointer(b), dataOffset+i*uintptr(t.keysize))
if t.indirectkey() {
k = *((*unsafe.Pointer)(k))
}
if !t.key.equal(key, k) {
continue
}
// already have a mapping for key. Update it.
if t.needkeyupdate() {
typedmemmove(t.key, k, key)
}
elem = add(unsafe.Pointer(b), dataOffset+bucketCnt*uintptr(t.keysize)+i*uintptr(t.elemsize))
goto done
}
ovf := b.overflow(t)
if ovf == nil {
break
}
b = ovf
}
// Did not find mapping for key. Allocate new cell & add entry.
// If we hit the max load factor or we have too many overflow buckets,
// and we're not already in the middle of growing, start growing.
if !h.growing() && (overLoadFactor(h.count+1, h.B) || tooManyOverflowBuckets(h.noverflow, h.B)) {
hashGrow(t, h)
goto again // Growing the table invalidates everything, so try again
}
if inserti == nil {
// The current bucket and all the overflow buckets connected to it are full, allocate a new one.
newb := h.newoverflow(t, b)
inserti = &newb.tophash[0]
insertk = add(unsafe.Pointer(newb), dataOffset)
elem = add(insertk, bucketCnt*uintptr(t.keysize))
}
// store new key/elem at insert position
if t.indirectkey() {
kmem := newobject(t.key)
*(*unsafe.Pointer)(insertk) = kmem
insertk = kmem
}
if t.indirectelem() {
vmem := newobject(t.elem)
*(*unsafe.Pointer)(elem) = vmem
}
typedmemmove(t.key, insertk, key)
*inserti = top
h.count++
done:
	// a lot of code here
	return elem
}
- t.hasher hashes the current key
- If the bucket array has not been allocated yet, it is allocated right here with length 1 (B is 0 at this point)
- The mask ANDed with the hash yields the bucket index, which gives the bucket's address
- tophash takes the top byte of the hash
- The bucket's tophash array is scanned
- If an entry matches the top byte and the stored key equals the new key, the key/value pair is written at the corresponding address (base address + offset)
- If the bucket and its overflow chain are full (inserti is nil), newoverflow allocates a new bucket, hangs it off the original bucket's overflow pointer, and increments the hmap's overflow-bucket counter noverflow
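Note the h.flags ^= hashWriting line in mapassign: it is how the runtime catches unsynchronized writers. This deliberately broken sketch usually dies within moments with fatal error: concurrent map writes:
package main

// Two goroutines write to the same map with no synchronization.
// The hashWriting flag lets the runtime detect this and crash
// rather than silently corrupt the table.
func main() {
	m := map[int]int{}
	go func() {
		for i := 0; ; i++ {
			m[i] = i
		}
	}()
	for i := 0; ; i++ {
		m[i] = i
	}
}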
Growing
A map write can trigger growth of the hmap (https://github.com/golang/go/blob/go1.16.10/src/runtime/map.go#L647-L650):
// If we hit the max load factor or we have too many overflow buckets,
// and we're not already in the middle of growing, start growing.
if !h.growing() && (overLoadFactor(h.count+1, h.B) || tooManyOverflowBuckets(h.noverflow, h.B)) {
hashGrow(t, h)
goto again // Growing the table invalidates everything, so try again
}
Growth is triggered in either of two situations:
-
The load factor exceeds the threshold of 6.5 (the figure many articles quote)
https://github.com/golang/go/blob/go1.16.10/src/runtime/map.go#L1069-L1072
func overLoadFactor(count int, B uint8) bool {
	return count > bucketCnt && uintptr(count) > loadFactorNum*(bucketShift(B)/loadFactorDen)
}
Where does 6.5 come from? loadFactorNum/loadFactorDen = 13/2 = 6.5.
-
Too many overflow buckets are in use (severe hash collisions)
https://github.com/golang/go/blob/go1.16.10/src/runtime/map.go#L1074-L1087
// tooManyOverflowBuckets reports whether noverflow buckets is too many for a map with 1<<B buckets.
// Note that most of these overflow buckets must be in sparse use;
// if use was dense, then we'd have already triggered regular map growth.
func tooManyOverflowBuckets(noverflow uint16, B uint8) bool {
	// If the threshold is too low, we do extraneous work.
	// If the threshold is too high, maps that grow and shrink can hold on to lots of unused memory.
	// "too many" means (approximately) as many overflow buckets as regular buckets.
	// See incrnoverflow for more details.
	if B > 15 {
		B = 15
	}
	// The compiler doesn't see here that B < 16; mask B to generate shorter shift code.
	return noverflow >= uint16(1)<<(B&15)
}
In either case lookups degrade badly, so the hash table must grow.
Once the runtime decides the hmap needs to grow, it first calls the hashGrow function:
func hashGrow(t *maptype, h *hmap) {
// If we've hit the load factor, get bigger.
// Otherwise, there are too many overflow buckets,
// so keep the same number of buckets and "grow" laterally.
bigger := uint8(1)
if !overLoadFactor(h.count+1, h.B) {
bigger = 0
h.flags |= sameSizeGrow
}
oldbuckets := h.buckets
newbuckets, nextOverflow := makeBucketArray(t, h.B+bigger, nil)
flags := h.flags &^ (iterator | oldIterator)
if h.flags&iterator != 0 {
flags |= oldIterator
}
// commit the grow (atomic wrt gc)
h.B += bigger
h.flags = flags
h.oldbuckets = oldbuckets
h.buckets = newbuckets
h.nevacuate = 0
h.noverflow = 0
if h.extra != nil && h.extra.overflow != nil {
// Promote current overflow buckets to the old generation.
if h.extra.oldoverflow != nil {
throw("oldoverflow is not nil")
}
h.extra.oldoverflow = h.extra.overflow
h.extra.overflow = nil
}
if nextOverflow != nil {
if h.extra == nil {
h.extra = new(mapextra)
}
h.extra.nextOverflow = nextOverflow
}
// the actual copying of the hash table data is done incrementally
// by growWork() and evacuate().
}
- If the load factor exceeded the threshold, h.B is incremented, doubling the number of buckets (2^B).
- Otherwise the growth was triggered by too many overflow buckets. This is a same-size grow (sameSizeGrow): the bucket array keeps its length.
- makeBucketArray allocates a new bucket array; the old array is hung off the oldbuckets field and the new one off the buckets pointer.
- h.flags is updated.
The actual data migration is done incrementally by the growWork and evacuate functions:
-
func growWork(t *maptype, h *hmap, bucket uintptr) {
	// make sure we evacuate the oldbucket corresponding
	// to the bucket we're about to use
	evacuate(t, h, bucket&h.oldbucketmask())

	// evacuate one more oldbucket to make progress on growing
	if h.growing() {
		evacuate(t, h, h.nevacuate)
	}
}
-
func evacuate(t *maptype, h *hmap, oldbucket uintptr) {
	// locate the bucket in the old bucket array
	b := (*bmap)(add(h.oldbuckets, oldbucket*uintptr(t.bucketsize)))
	newbit := h.noldbuckets()
	// if this bucket has not been evacuated yet
	if !evacuated(b) {
		// TODO: reuse overflow buckets instead of using new ones, if there
		// is no iterator using the old buckets.  (If !oldIterator.)

		// xy contains the x and y (low and high) evacuation destinations.
		var xy [2]evacDst
		x := &xy[0]
		// x is always set up; it is the only destination for a same-size grow
		x.b = (*bmap)(add(h.buckets, oldbucket*uintptr(t.bucketsize)))
		x.k = add(unsafe.Pointer(x.b), dataOffset)   // key
		x.e = add(x.k, bucketCnt*uintptr(t.keysize)) // elem

		// not a same-size grow: also prepare the y destination
		if !h.sameSizeGrow() {
			// Only calculate y pointers if we're growing bigger.
			// Otherwise GC can see bad pointers.
			y := &xy[1]
			y.b = (*bmap)(add(h.buckets, (oldbucket+newbit)*uintptr(t.bucketsize)))
			y.k = add(unsafe.Pointer(y.b), dataOffset)   // key
			y.e = add(y.k, bucketCnt*uintptr(t.keysize)) // elem
		}

		// walk the bucket and its overflow chain
		for ; b != nil; b = b.overflow(t) {
			k := add(unsafe.Pointer(b), dataOffset)
			e := add(k, bucketCnt*uintptr(t.keysize))
			// walk every key/elem slot in this bucket
			for i := 0; i < bucketCnt; i, k, e = i+1, add(k, uintptr(t.keysize)), add(e, uintptr(t.elemsize)) {
				top := b.tophash[i]
				// empty cell: just mark it as evacuated
				if isEmpty(top) {
					b.tophash[i] = evacuatedEmpty
					continue
				}
				if top < minTopHash {
					throw("bad map state")
				}
				k2 := k
				// if the key is stored indirectly (as a pointer), dereference it
				if t.indirectkey() {
					k2 = *((*unsafe.Pointer)(k2))
				}
				var useY uint8
				if !h.sameSizeGrow() {
					// Compute hash to make our evacuation decision (whether we need
					// to send this key/elem to bucket x or bucket y).
					hash := t.hasher(k2, uintptr(h.hash0))
					// a goroutine may be iterating over the map and key != key (NaN)
					if h.flags&iterator != 0 && !t.reflexivekey() && !t.key.equal(k2, k2) {
						// If key != key (NaNs), then the hash could be (and probably
						// will be) entirely different from the old hash. Moreover,
						// it isn't reproducible. Reproducibility is required in the
						// presence of iterators, as our evacuation decision must
						// match whatever decision the iterator made.
						// Fortunately, we have the freedom to send these keys either
						// way. Also, tophash is meaningless for these kinds of keys.
						// We let the low bit of tophash drive the evacuation decision.
						// We recompute a new random tophash for the next level so
						// these keys will get evenly distributed across all buckets
						// after multiple grows.
						useY = top & 1
						top = tophash(hash)
					} else {
						if hash&newbit != 0 {
							useY = 1
						}
					}
				}

				if evacuatedX+1 != evacuatedY || evacuatedX^1 != evacuatedY {
					throw("bad evacuatedN")
				}

				b.tophash[i] = evacuatedX + useY // evacuatedX + 1 == evacuatedY
				// pick X or Y as the destination
				dst := &xy[useY] // evacuation destination

				// the destination bucket is full; chain a new overflow bucket
				if dst.i == bucketCnt {
					dst.b = h.newoverflow(t, dst.b)
					dst.i = 0
					dst.k = add(unsafe.Pointer(dst.b), dataOffset)
					dst.e = add(dst.k, bucketCnt*uintptr(t.keysize))
				}
				dst.b.tophash[dst.i&(bucketCnt-1)] = top // mask dst.i as an optimization, to avoid a bounds check
				if t.indirectkey() {
					// the key is stored as a pointer: copy the pointer to the new slot
					*(*unsafe.Pointer)(dst.k) = k2 // copy pointer
				} else {
					// the key is stored as a value: copy the value to the new slot
					typedmemmove(t.key, dst.k, k) // copy elem
				}
				if t.indirectelem() {
					// the elem is stored as a pointer: copy the pointer to the new slot
					*(*unsafe.Pointer)(dst.e) = *(*unsafe.Pointer)(e)
				} else {
					// the elem is stored as a value: copy the value to the new slot
					typedmemmove(t.elem, dst.e, e)
				}
				dst.i++
				// These updates might push these pointers past the end of the
				// key or elem arrays.  That's ok, as we have the overflow pointer
				// at the end of the bucket to protect against pointing past the
				// end of the bucket.
				dst.k = add(dst.k, uintptr(t.keysize))
				dst.e = add(dst.e, uintptr(t.elemsize))
			}
		}
		// Unlink the overflow buckets & clear key/elem to help GC.
		// if no goroutine is iterating over the old buckets, clear them
		if h.flags&oldIterator == 0 && t.bucket.ptrdata != 0 {
			b := add(h.oldbuckets, oldbucket*uintptr(t.bucketsize))
			// Preserve b.tophash because the evacuation
			// state is maintained there.
			ptr := add(b, dataOffset)
			n := uintptr(t.bucketsize) - dataOffset
			memclrHasPointers(ptr, n)
		}
	}

	if oldbucket == h.nevacuate {
		advanceEvacuationMark(h, t, newbit)
	}
}
For performance, Go's hmap grows incrementally, much like Redis's progressive rehashing: each call migrates at most two buckets.
A same-size grow uses only the X destination. When the bucket array doubles, both the X (low half) and Y (high half) destinations are used; this is designed to cope with reads and writes happening while the map grows. Growth is therefore not an atomic operation.
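A small sketch of the X/Y split using plain integers (no runtime types): when the bucket count doubles, an entry in old bucket i lands in new bucket i (X) or i + oldBucketCount (Y), depending on the hash bit that has just become significant:
package main

import "fmt"

func main() {
	oldB := uint8(2)             // 4 buckets before the grow
	newbit := uintptr(1) << oldB // the hash bit that becomes significant

	// two keys that shared old bucket 1 but split between X and Y
	for _, hash := range []uintptr{0b0001, 0b0101} {
		oldBucket := hash & (newbit - 1) // index in the old table
		dst := oldBucket                 // X: same index (low half)
		if hash&newbit != 0 {
			dst += newbit // Y: index + old bucket count (high half)
		}
		fmt.Printf("hash=%04b: old bucket %d -> new bucket %d\n", hash, oldBucket, dst)
	}
}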
Iteration
for k, v := range m1 {
// do something
}
At compile time, iterating over a map is converted into a call to the mapiterinit function:
func mapiterinit(t *maptype, h *hmap, it *hiter) {
if raceenabled && h != nil {
callerpc := getcallerpc()
racereadpc(unsafe.Pointer(h), callerpc, funcPC(mapiterinit))
}
if h == nil || h.count == 0 {
return
}
if unsafe.Sizeof(hiter{})/sys.PtrSize != 12 {
throw("hash_iter size incorrect") // see cmd/compile/internal/gc/reflect.go
}
it.t = t
it.h = h
// grab snapshot of bucket state
it.B = h.B
it.buckets = h.buckets
if t.bucket.ptrdata == 0 {
// Allocate the current slice and remember pointers to both current and old.
// This preserves all relevant overflow buckets alive even if
// the table grows and/or overflow buckets are added to the table
// while we are iterating.
h.createOverflow()
it.overflow = h.extra.overflow
it.oldoverflow = h.extra.oldoverflow
}
// decide where to start
r := uintptr(fastrand())
if h.B > 31-bucketCntBits {
r += uintptr(fastrand()) << 31
}
it.startBucket = r & bucketMask(h.B)
it.offset = uint8(r >> h.B & (bucketCnt - 1))
// iterator state
it.bucket = it.startBucket
// Remember we have an iterator.
// Can run concurrently with another mapiterinit().
if old := h.flags; old&(iterator|oldIterator) != iterator|oldIterator {
atomic.Or8(&h.flags, iterator|oldIterator)
}
mapiternext(it)
}
hiter is the iterator struct:
type hiter struct {
key unsafe.Pointer // Must be in first position. Write nil to indicate iteration end (see cmd/compile/internal/gc/range.go).
elem unsafe.Pointer // Must be in second position (see cmd/compile/internal/gc/range.go).
t *maptype
h *hmap
buckets unsafe.Pointer // bucket ptr at hash_iter initialization time
bptr *bmap // current bucket
overflow *[]*bmap // keeps overflow buckets of hmap.buckets alive
oldoverflow *[]*bmap // keeps overflow buckets of hmap.oldbuckets alive
startBucket uintptr // bucket iteration started at
offset uint8 // intra-bucket offset to start from during iteration (should be big enough to hold bucketCnt-1)
wrapped bool // already wrapped around from end of bucket array to beginning
B uint8
i uint8
bucket uintptr
checkBucket uintptr
}
The main job of mapiterinit is to initialize an hiter iterator and then call mapiternext to perform the iteration.
Why is map iteration unordered? Partly because evacuation moves key/value pairs to different buckets as the map grows.
https://github.com/golang/go/blob/go1.16.10/src/runtime/map.go#L831-L837
// decide where to start
r := uintptr(fastrand())
if h.B > 31-bucketCntBits {
r += uintptr(fastrand()) << 31
}
it.startBucket = r & bucketMask(h.B)
it.offset = uint8(r >> h.B & (bucketCnt - 1))
And to keep programs from depending on accidental ordering, Go goes further: each traversal starts at a randomly chosen bucket, and even the starting slot within each bucket is random.
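This is easy to observe: two consecutive loops over the same unmodified map usually print different orders, because each range statement calls mapiterinit and draws a fresh random startBucket and offset:
package main

import "fmt"

func main() {
	m := map[string]int{"a": 1, "b": 2, "c": 3, "d": 4}
	for i := 0; i < 2; i++ {
		for k := range m {
			fmt.Print(k, " ") // order varies between loops and between runs
		}
		fmt.Println()
	}
}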