Ramblings on Go Interfaces

Jun 9, 2022 00:00 · 2839 words · 6 minute read Golang

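Go's interfaces are both static (checked at compile time) and dynamic (when they are used).

Usage

Go's interfaces let you use duck typing as you would in a fully dynamic language like Python, while the compiler still catches mistakes in argument passing. Before using an interface, define it: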

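type ReadCloser interface {
    Read(b []byte) (n int, err error)
    Close()
}

// ReadAndClose fills buf by calling Read repeatedly, then closes r.
// The loop stops once buf is full or Read reports an error (io.EOF included).
func ReadAndClose(r ReadCloser, buf []byte) (n int, err error) {
    for len(buf) > 0 && err == nil {
        var nr int
        nr, err = r.Read(buf)
        n += nr
        buf = buf[nr:]
    }
    r.Close()
    return
}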

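ReadAndClose calls Read repeatedly to collect all the requested data and then calls Close. Any type that has Read and Close methods with these signatures can be passed to ReadAndClose. Unlike in languages such as Python, passing a value of the wrong type is an error at compile time rather than at run time.

Interfaces are not limited to static checking, though. You can check dynamically whether a particular interface value has additional methods: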
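To make the static check concrete, here is a minimal sketch of a caller. The stringSource type and the readAll helper are hypothetical names introduced only for this illustration; stringSource gets Read from an embedded *strings.Reader and adds a Close method, which is all it takes to satisfy ReadCloser.

```go
package main

import (
	"fmt"
	"strings"
)

type ReadCloser interface {
	Read(b []byte) (n int, err error)
	Close()
}

func ReadAndClose(r ReadCloser, buf []byte) (n int, err error) {
	for len(buf) > 0 && err == nil {
		var nr int
		nr, err = r.Read(buf)
		n += nr
		buf = buf[nr:]
	}
	r.Close()
	return
}

// stringSource is a hypothetical type for this example. Read is promoted
// from the embedded *strings.Reader; Close only records that it was called.
type stringSource struct {
	*strings.Reader
	closed bool
}

func (s *stringSource) Close() { s.closed = true }

// readAll (also hypothetical) reads up to size bytes of text through
// ReadAndClose and reports whether Close was called.
func readAll(text string, size int) (string, bool) {
	src := &stringSource{Reader: strings.NewReader(text)}
	buf := make([]byte, size)
	n, _ := ReadAndClose(src, buf) // the error is io.EOF when text is shorter than buf
	return string(buf[:n]), src.closed
}

func main() {
	got, closed := readAll("hello", 16)
	fmt.Println(got, closed)
	// ReadAndClose(strings.NewReader("x"), nil) would not compile:
	// *strings.Reader has no Close method.
}
```

Passing a plain *strings.Reader instead is rejected by the compiler, which is the point of the paragraph above.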

type Stringer interface {
    String() string
}

func ToString(any interface{}) string {
    if v, ok := any.(Stringer); ok {
        return v.String()
    }
    switch v := any.(type) {
    case int:
        return strconv.Itoa(v)
    case float64:
        return strconv.FormatFloat(v, 'g', -1, 64)
    }
    return "???"
}

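The value any has type interface{}, which guarantees no methods at all. The ", ok" form asks whether any can be converted to an interface value of type Stringer, which has the String method. This is a stripped-down version of what the fmt package does.

As an example, here is a 64-bit unsigned integer type with String and Get methods: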

type Binary uint64

func (i Binary) String() string {
    return strconv.FormatUint(i.Get(), 2)
}

func (i Binary) Get() uint64 {
    return uint64(i)
}

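A value of type Binary can be passed to ToString, which will call its String method, even though the program never states explicitly that Binary intends to implement Stringer. There is no need to: the Go runtime can see that Binary has a String method, so it implements Stringer, even if the author of Binary has never heard of Stringer.

These examples show that even though all implicit conversions are checked at compile time, explicit interface-to-interface conversions query the method set at run time.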
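The pieces above can be put together into one small program. This is only a sketch that repeats the article's ToString and Binary definitions and exercises each branch; the blank-identifier line is a common idiom for asking the compiler to confirm that a type satisfies an interface, not something the original code requires.

```go
package main

import (
	"fmt"
	"strconv"
)

type Stringer interface {
	String() string
}

type Binary uint64

func (i Binary) String() string { return strconv.FormatUint(i.Get(), 2) }
func (i Binary) Get() uint64    { return uint64(i) }

// Compile-time check: fails to build if Binary stops satisfying Stringer.
var _ Stringer = Binary(0)

func ToString(any interface{}) string {
	// Run-time check: does the dynamic type have a String method?
	if v, ok := any.(Stringer); ok {
		return v.String()
	}
	switch v := any.(type) {
	case int:
		return strconv.Itoa(v)
	case float64:
		return strconv.FormatFloat(v, 'g', -1, 64)
	}
	return "???"
}

func main() {
	fmt.Println(ToString(Binary(200))) // 11001000 (via Stringer)
	fmt.Println(ToString(42))          // 42 (int case)
	fmt.Println(ToString(1.5))         // 1.5 (float64 case)
	fmt.Println(ToString("text"))      // ??? (string has no String method)
}
```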

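Implementation

An interface value is represented as a two-word pair: one pointer to information about the type stored in the interface, and one pointer to the associated data.

The runtime has two representations for interfaces:

eface represents interfaces with no methods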

type eface struct {
    _type *_type
    data  unsafe.Pointer
}

iface represents interfaces with methods

type iface struct {
    tab  *itab
    data unsafe.Pointer
}

itab (pronounced i-table)

// layout of Itab known to compilers
// allocated in non-garbage-collected memory
// Needs to be in sync with
// ../cmd/compile/internal/gc/reflect.go:/^func.dumptabs.
type itab struct {
    inter *interfacetype
    _type *_type
    hash  uint32 // copy of _type.hash. Used for type switches.
    _     [4]byte
    fun   [1]uintptr // variable sized. fun[0]==0 means _type does not implement inter.
}

interfacetype

type interfacetype struct {
    typ     _type
    pkgpath name
    mhdr    []imethod
}

_type

// Needs to be in sync with ../cmd/link/internal/ld/decodesym.go:/^func.commonsize,
// ../cmd/compile/internal/gc/reflect.go:/^func.dcommontype and
// ../reflect/type.go:/^type.rtype.
// ../internal/reflectlite/type.go:/^type.rtype.
type _type struct {
    size       uintptr
    ptrdata    uintptr // size of memory prefix holding all pointers
    hash       uint32
    tflag      tflag
    align      uint8
    fieldAlign uint8
    kind       uint8
    // function for comparing objects of this type
    // (ptr to object A, ptr to object B) -> ==?
    equal func(unsafe.Pointer, unsafe.Pointer) bool
    // gcdata stores the GC type data for the garbage collector.
    // If the KindGCProg bit is set in kind, gcdata is a GC program.
    // Otherwise it is a ptrmask bitmap. See mbitmap.go for details.
    gcdata    *byte
    str       nameOff
    ptrToThis typeOff
}

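Together, itab, interfacetype and _type hold the metadata about the types involved plus a list of function pointers.

The source makes it plain that iface is considerably more complex than eface. eface is essentially iface without the method table: its first word points straight at the _type instead of going through an itab.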

┌─────────┐                                 ┌─────────┐
│  iface  │                          ┌─────▶│  itab   │
├─────────┴───────────────────┐      │      ├─────────┴───────────────────┐
│         tab  *itab          │──────┘      │    inter *interfacetype     │
┣━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┫             ├─────────────────────────────┤
┃     data unsafe.Pointer     ┃             ┣━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┫
┗━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛             ┃        _type *_type         ┃
                                            ┣━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┫
                                            ├─────────────────────────────┤
                                            │        hash  uint32         │
                                            ├─────────────────────────────┤
                                            │      fun   [1]uintptr       │
                                            └─────────────────────────────┘
┌─────────┐
│  eface  │
├─────────┴───────────────────┐
│        _type *_type         │
├─────────────────────────────┤
│     data unsafe.Pointer     │
└─────────────────────────────┘
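The two-word layout can be observed from ordinary Go code, with heavy caveats. The sketch below assumes that an interface{} value in memory matches the eface struct shown above; that is an implementation detail of the gc toolchain, not something the language guarantees, and emptyInterface, words and sameTypeWord are hypothetical names for this illustration only.

```go
package main

import (
	"fmt"
	"unsafe"
)

// emptyInterface mirrors the eface layout: a type word and a data word.
// Assumption: this matches the runtime's in-memory representation.
type emptyInterface struct {
	typ  unsafe.Pointer
	data unsafe.Pointer
}

// words reinterprets an interface{} value as its two raw words.
func words(v interface{}) emptyInterface {
	return *(*emptyInterface)(unsafe.Pointer(&v))
}

// sameTypeWord reports whether two interface values carry the same type word.
func sameTypeWord(a, b interface{}) bool {
	return words(a).typ == words(b).typ
}

// interfaceSizeInWords returns the size of an interface{} in machine words.
func interfaceSizeInWords() uintptr {
	var v interface{}
	return unsafe.Sizeof(v) / unsafe.Sizeof(uintptr(0))
}

func main() {
	fmt.Println(interfaceSizeInWords()) // an interface value occupies two words
	fmt.Println(sameTypeWord(1, 2))     // both hold an int
	fmt.Println(sameTypeWord(1, "one")) // int versus string
}
```

Code like this is useful for exploration only; production code should rely on type assertions and the reflect package instead.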

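A value of type Binary is a 64-bit integer made up of two 32-bit words (assuming a 32-bit machine).

An interface value is represented by two words: one pointer to information about the type stored in the interface and one pointer to the associated data. Assigning a Binary value b to an interface value of type Stringer sets both words of the interface value.

(The gray pointers in the figure emphasize that they are implicit, not directly exposed to Go programs.)

The first word of the interface value points to the itable, which holds metadata about the types involved and a list of function pointers. Note that the itable corresponds to the interface type, not the dynamic type. In our example, the itable for a Stringer holding a Binary lists only the methods used to satisfy Stringer, which is just String(); Binary's other method, Get(), does not appear in this itable.

The second word of the interface value points to the actual data, in this case a copy of b. The assignment var s Stringer = b makes a copy of b rather than pointing at b, just as var c uint64 = uint64(b) would: if b changes later, s and c keep their original values. A value stored in an interface may be arbitrarily large, but the interface structure has only one word for it, so the assignment allocates a block of memory on the heap and records the pointer in that one-word slot.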
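The copy semantics are easy to check. In this small sketch the snapshot helper is a hypothetical name; it stores b in a Stringer, changes b afterwards, and returns what each of them prints.

```go
package main

import (
	"fmt"
	"strconv"
)

type Stringer interface {
	String() string
}

type Binary uint64

func (i Binary) String() string { return strconv.FormatUint(uint64(i), 2) }

// snapshot shows that an interface value keeps its own copy of the data.
func snapshot() (fromInterface, fromVariable string) {
	b := Binary(200)
	var s Stringer = b // copies the value of b into the interface
	b = 5              // later changes to b do not reach s
	return s.String(), b.String()
}

func main() {
	s, b := snapshot()
	fmt.Println(s) // 11001000 (still 200)
	fmt.Println(b) // 101 (the new value of b)
}
```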

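To check whether an interface value holds a particular type, as in the type switch above, the Go compiler generates code equivalent to the C expression s.tab->type to obtain the type pointer and compare it against the desired type. If the types match, the value can be copied by dereferencing s.data.

To call s.String(), the Go compiler generates code equivalent to the C expression s.tab->fun[0](s.data): it calls the appropriate function pointer from the itable, passing the interface value's data word as the function's first argument. Because that word is a pointer to the data, the function pointer in this example is (*Binary).String, not Binary.String.

The more methods an interface has, the more entries there are in the fun list of its itable.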
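The dispatch sequence can be imitated in plain Go. This is a toy model, not the runtime's code: toyItab, toyIface, box, callString and binaryStringWrapper are hypothetical names, and the function table holds Go function values instead of raw code addresses.

```go
package main

import (
	"fmt"
	"strconv"
	"unsafe"
)

type Binary uint64

func (i Binary) String() string { return strconv.FormatUint(uint64(i), 2) }

// toyItab stands in for itab: one function entry per interface method.
type toyItab struct {
	fun []func(data unsafe.Pointer) string
}

// toyIface stands in for iface: a table pointer and a data pointer.
type toyIface struct {
	tab  *toyItab
	data unsafe.Pointer
}

// binaryStringWrapper plays the role of (*Binary).String: it receives the
// data pointer and calls the value method on what it points to.
func binaryStringWrapper(data unsafe.Pointer) string {
	return (*Binary)(data).String()
}

// The table for "Binary stored in a Stringer" has a single entry.
var binaryAsStringer = &toyItab{
	fun: []func(unsafe.Pointer) string{binaryStringWrapper},
}

// box copies b to the heap and builds the two-word pair.
func box(b Binary) toyIface {
	p := new(Binary)
	*p = b
	return toyIface{tab: binaryAsStringer, data: unsafe.Pointer(p)}
}

// callString is the Go spelling of s.tab->fun[0](s.data).
func callString(s toyIface) string {
	return s.tab.fun[0](s.data)
}

func main() {
	fmt.Println(callString(box(200))) // 11001000
}
```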

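Computing the itable

Where do itables come from? Go's dynamic type conversions mean that it is not reasonable for the compiler or linker to precompute every possible itable: there are too many (interface type, concrete type) pairs, and most of them will never be needed. Instead, the compiler generates a type description structure for each concrete type such as Binary or int. Among other metadata, the type description structure contains a list of the methods implemented by that type. Similarly, the compiler generates a different type description structure for each interface type such as Stringer; it too contains a method list. The interface runtime computes the itable by looking up each method in the interface type's method table in the concrete type's method table. The runtime caches the itable after generating it, so the correspondence only needs to be computed once.

In our example, Stringer has one method, while Binary's method table has two. In general, suppose the interface type has ni methods and the concrete type has nt methods. The obvious search to find the mapping takes O(ni × nt) time, but we can do better: by sorting both method tables and walking them simultaneously, the mapping can be built in O(ni + nt) time.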
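The linear-time walk can be sketched as a merge over two sorted lists. The function below is an illustration under simplifying assumptions, not the runtime's implementation: methods are identified by name only (the real runtime also compares types), both slices are assumed to be sorted already, and matchMethods is a hypothetical name.

```go
package main

import "fmt"

// matchMethods walks two name-sorted method lists once. For each interface
// method it returns the index of the matching method in typeMethods, or
// false if the concrete type is missing one of them.
func matchMethods(ifaceMethods, typeMethods []string) ([]int, bool) {
	fun := make([]int, 0, len(ifaceMethods))
	j := 0
	for _, want := range ifaceMethods {
		// Skip concrete methods that sort before the one we need.
		for j < len(typeMethods) && typeMethods[j] < want {
			j++
		}
		if j == len(typeMethods) || typeMethods[j] != want {
			return nil, false // the type does not implement the interface
		}
		fun = append(fun, j)
		j++
	}
	return fun, true
}

func main() {
	// Stringer has one method; Binary has two.
	fmt.Println(matchMethods([]string{"String"}, []string{"Get", "String"}))
	// A type with only Read does not satisfy an interface that needs Close too.
	fmt.Println(matchMethods([]string{"Close", "Read"}, []string{"Read"}))
}
```

Each index j only moves forward, so the total work is bounded by the combined length of the two lists.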

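Memory optimizations

Two complementary optimizations reduce the space used:

If the interface type involved is empty (it has no methods), the itable serves no purpose, so it is dropped and the first word points directly at the type (this is eface).
If the value associated with the interface value fits in a single machine word, there is no need for the indirection or the heap allocation. If we define Binary32 to be like Binary but implemented as a uint32, its value can be stored directly in the second word.

Whether the actual value is pointed at or inlined depends on the size of the type. For Binary above, the method in the itable is (*Binary).String, whereas for Binary32 the method in the itable is Binary32.String rather than (*Binary32).String.

An empty interface holding a word-sized value can take advantage of both optimizations at once.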

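Method lookup performance

Many dynamic systems perform a method lookup every time a method is called. To speed this up, many implementations keep a simple one-entry cache at each call site, often in the instruction stream itself. In a multithreaded program these caches must be managed carefully, since multiple threads may be at the same call site simultaneously.

Because Go has the hint of static typing to go along with its dynamic method lookups, it can move the lookup from the call site back to the point where the value is stored in the interface: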

var any interface{} // initialized elsewhere
s := any.(Stringer)  // dynamic conversion
for i := 0; i < 100; i++ {
    fmt.Println(s.String())
}

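In Go, the itable is computed (or found in the cache) during the assignment on line 2; the s.String() call executed on line 4 is just a couple of memory fetches and a single indirect call instruction.

In contrast, an implementation of this program in a dynamic language would do the method lookup on line 4, repeating needless work on every iteration of the loop. The call-site cache mentioned above makes this cheaper, but it is still more expensive than a single indirect call instruction.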
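The same pattern can be written as a complete program. In this sketch, repeatString is a hypothetical helper: it performs the dynamic conversion once, before the loop, and uses the two-value form of the assertion so that a value without a String method is reported instead of causing a panic.

```go
package main

import (
	"fmt"
	"strconv"
)

type Stringer interface {
	String() string
}

type Binary uint64

func (i Binary) String() string { return strconv.FormatUint(uint64(i), 2) }

// repeatString converts v to a Stringer once, then calls String n times.
// The conversion is where the itable is looked up; the calls inside the
// loop go through the already-resolved interface value.
func repeatString(v interface{}, n int) ([]string, bool) {
	s, ok := v.(Stringer) // dynamic conversion, done a single time
	if !ok {
		return nil, false
	}
	out := make([]string, 0, n)
	for i := 0; i < n; i++ {
		out = append(out, s.String())
	}
	return out, true
}

func main() {
	out, ok := repeatString(Binary(5), 3)
	fmt.Println(out, ok)
	_, ok = repeatString(3.14, 3) // float64 has no String method
	fmt.Println(ok)
}
```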
