Go: Errors Are Values

May 19, 2021 22:00 · 3028 words · 7 minute read Golang


Rob Pike

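Translation

Error handling is a perennial topic among Go programmers, especially newcomers. The conversation usually comes down to a complaint about how often this snippet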

if err != nil {
    return err
}

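shows up. We recently looked through a large number of open source projects and found that this snippet appears only about once every page or two, less often than you might think. Still, if people believe they must write if err != nil all the time, then something must be wrong, and the obvious suspect is Go itself.

Unfortunately, this impression is misleading, and it is easy to correct. Perhaps a newcomer to Go asks, "How do I handle errors?", learns this one pattern, and stops there. Other languages might use try-catch or a similar mechanism to handle errors. So the programmer thinks: where I used try-catch before, in Go I will just write if err != nil. Over time this produces a lot of such Go code, and the result feels clumsy.

Whether or not this explanation fits, it is clear that these programmers have missed a fundamental point: errors are values.

Values can be programmed, and since errors are values, errors can be programmed too.

Of course, the most common thing to do with an error value is to check whether it is nil, but there are countless other things you can do with one. Some of them can make your program better by removing much of the boilerplate that comes from checking every error with a rote if statement.

Here is a small example from the bufio package's Scanner type. Its Scan method performs the underlying I/O, which can fail. Yet Scan does not expose an error at all. Instead it returns a boolean, and a separate method, run once the scan has finished, reports whether an error occurred. Client code looks like this: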

scanner := bufio.NewScanner(input)
for scanner.Scan() {
    token := scanner.Text()
    // process token
}
if err := scanner.Err(); err != nil {
    // process the error
}

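There is still a nil check here, but it appears and runs only once. The Scan method could instead have been defined like this:

func (s *Scanner) Scan() (token []byte, err error)

and client code would then look like this: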

scanner := bufio.NewScanner(input)
for {
    token, err := scanner.Scan()
    if err != nil {
        return err // or maybe break
    }
    // process token
}

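This is not very different, but there is one important distinction. In this code the client must check for an error on every iteration, whereas in the real Scanner API the error handling is abstracted away from the key API element, which is iterating over tokens. With the real API, client code feels more natural: loop until done, then worry about errors. Error handling does not obscure the flow of control.

Under the hood, as soon as Scan hits an I/O error, it records the error and returns false. A separate method, Err, reports the error only when the client asks for it. Trivial as this is, it is not the same as mindlessly writing if err != nil everywhere.

This is programming with error values.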

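It is worth stressing that whatever the design, it is critical that the program checks its errors, however they are exposed. The point here is not how to avoid checking errors, but how to use the language to handle them gracefully.

The complaint about repetitive error-checking code was raised long ago. The code in question looked roughly like this: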

_, err = fd.Write(p0[a:b])
if err != nil {
    return err
}
_, err = fd.Write(p1[c:d])
if err != nil {
    return err
}
_, err = fd.Write(p2[e:f])
if err != nil {
    return err
}
// and so on

It is very repetitive, but in this idealized form a closure over the error variable helps:

var err error
write := func(buf []byte) {
    if err != nil {
        return
    }
    _, err = w.Write(buf)
}
write(p0[a:b])
write(p1[c:d])
write(p2[e:f])
// and so on
if err != nil {
    return err
}

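This pattern works well, but it requires a closure in every function that does the writes. A separate helper function is clumsier to use, because the err variable has to be maintained across calls (try it).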

We can make this cleaner, more general, and reusable by borrowing the idea from the Scan method above.

I defined an object called errWriter:

type errWriter struct {
    w   io.Writer
    err error
}

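I gave it a write method. It does not need the standard Write signature, and it is lower-cased partly to highlight the distinction. The write method calls the Write method of the underlying writer and records the first error: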

func (ew *errWriter) write(buf []byte) {
    if ew.err != nil {
        return
    }
    _, ew.err = ew.w.Write(buf)
}

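As soon as an error occurs, the write method becomes a no-op, but the error value is saved.

With the errWriter type and its write method, the code above can be refactored into: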

ew := &errWriter{w: fd}
ew.write(p0[a:b])
ew.write(p1[c:d])
ew.write(p2[e:f])
// and so on
if ew.err != nil {
    return ew.err
}

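This is cleaner even than the closure version, and it makes the actual sequence of writes easier to see. The clutter is gone. Programming with error values has made the code nicer.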

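Other code in the same project can build on this idea, or even use errWriter directly.

errWriter could also do more. It could count bytes, or coalesce writes into a single buffer that is then transmitted atomically, and much more.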

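In fact, this pattern appears often in the standard library. The archive/zip and net/http packages use it. The bufio package's Writer is also an implementation of the errWriter idea. Although bufio.Writer.Write returns an error, that is mostly to honor the io.Writer interface. The Write method of bufio.Writer behaves like the errWriter.write method above, with Flush reporting the error, so our example could also be written like this: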

b := bufio.NewWriter(fd)
b.Write(p0[a:b])
b.Write(p1[c:d])
b.Write(p2[e:f])
// and so on
if b.Flush() != nil {
    return b.Flush()
}

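This approach has one significant drawback: there is no way to know how much of the processing completed before the error occurred. If that information matters, a more fine-grained approach is needed. Often, though, a single check at the end is enough.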

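We have looked at only one technique for avoiding repetitive error-handling code. Keep in mind that using errWriter or bufio.Writer is not the only way to simplify error handling, and this approach does not suit every situation. The key point is that errors are values, and the full power of Go is available for processing them.

Use the language to simplify your error handling. But remember: whatever you do, always check your errors!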


Original

A common point of discussion among Go programmers, especially those new to the language, is how to handle errors. The conversation often turns into a lament at the number of times the sequence

if err != nil {
    return err
}

shows up. We recently scanned all the open source projects we could find and discovered that this snippet occurs only once per page or two, less often than some would have you believe. Still, if the perception persists that one must type

if err != nil

all the time, something must be wrong, and the obvious target is Go itself.

This is unfortunate, misleading, and easily corrected. Perhaps what is happening is that programmers new to Go ask, “How does one handle errors?”, learn this pattern, and stop there. In other languages, one might use a try-catch block or other such mechanism to handle errors. Therefore, the programmer thinks, when I would have used a try-catch in my old language, I will just type if err != nil in Go. Over time the Go code collects many such snippets, and the result feels clumsy.

Regardless of whether this explanation fits, it is clear that these Go programmers miss a fundamental point about errors: Errors are values.

Values can be programmed, and since errors are values, errors can be programmed.

Of course a common statement involving an error value is to test whether it is nil, but there are countless other things one can do with an error value, and application of some of those other things can make your program better, eliminating much of the boilerplate that arises if every error is checked with a rote if statement.

Here’s a simple example from the bufio package’s Scanner type. Its Scan method performs the underlying I/O, which can of course lead to an error. Yet the Scan method does not expose an error at all. Instead, it returns a boolean, and a separate method, to be run at the end of the scan, reports whether an error occurred. Client code looks like this:

scanner := bufio.NewScanner(input)
for scanner.Scan() {
    token := scanner.Text()
    // process token
}
if err := scanner.Err(); err != nil {
    // process the error
}

Sure, there is a nil check for an error, but it appears and executes only once. The Scan method could instead have been defined as

func (s *Scanner) Scan() (token []byte, err error)

and then the example user code might be (depending on how the token is retrieved),

scanner := bufio.NewScanner(input)
for {
    token, err := scanner.Scan()
    if err != nil {
        return err // or maybe break
    }
    // process token
}

This isn’t very different, but there is one important distinction. In this code, the client must check for an error on every iteration, but in the real Scanner API, the error handling is abstracted away from the key API element, which is iterating over tokens. With the real API, the client’s code therefore feels more natural: loop until done, then worry about errors. Error handling does not obscure the flow of control.

Under the covers what’s happening, of course, is that as soon as Scan encounters an I/O error, it records it and returns false. A separate method, Err, reports the error value when the client asks. Trivial though this is, it’s not the same as putting

if err != nil

everywhere or asking the client to check for an error after every token. It’s programming with error values. Simple programming, yes, but programming nonetheless.

It’s worth stressing that whatever the design, it’s critical that the program check the errors however they are exposed. The discussion here is not about how to avoid checking errors, it’s about using the language to handle errors with grace.

The topic of repetitive error-checking code arose when I attended the autumn 2014 GoCon in Tokyo. An enthusiastic gopher, who goes by @jxck_ on Twitter, echoed the familiar lament about error checking. He had some code that looked schematically like this:

_, err = fd.Write(p0[a:b])
if err != nil {
    return err
}
_, err = fd.Write(p1[c:d])
if err != nil {
    return err
}
_, err = fd.Write(p2[e:f])
if err != nil {
    return err
}
// and so on

It is very repetitive. In the real code, which was longer, there is more going on so it’s not easy to just refactor this using a helper function, but in this idealized form, a function literal closing over the error variable would help:

var err error
write := func(buf []byte) {
    if err != nil {
        return
    }
    _, err = w.Write(buf)
}
write(p0[a:b])
write(p1[c:d])
write(p2[e:f])
// and so on
if err != nil {
    return err
}

This pattern works well, but requires a closure in each function doing the writes; a separate helper function is clumsier to use because the err variable needs to be maintained across calls (try it).

We can make this cleaner, more general, and reusable by borrowing the idea from the Scan method above. I mentioned this technique in our discussion but @jxck_ didn’t see how to apply it. After a long exchange, hampered somewhat by a language barrier, I asked if I could just borrow his laptop and show him by typing some code.

I defined an object called an errWriter, something like this:

type errWriter struct {
    w   io.Writer
    err error
}

and gave it one method, write. It doesn’t need to have the standard Write signature, and it’s lower-cased in part to highlight the distinction. The write method calls the Write method of the underlying Writer and records the first error for future reference:

func (ew *errWriter) write(buf []byte) {
    if ew.err != nil {
        return
    }
    _, ew.err = ew.w.Write(buf)
}

As soon as an error occurs, the write method becomes a no-op but the error value is saved.

Given the errWriter type and its write method, the code above can be refactored:

ew := &errWriter{w: fd}
ew.write(p0[a:b])
ew.write(p1[c:d])
ew.write(p2[e:f])
// and so on
if ew.err != nil {
    return ew.err
}

This is cleaner, even compared to the use of a closure, and also makes the actual sequence of writes being done easier to see on the page. There is no clutter any more. Programming with error values (and interfaces) has made the code nicer.

It’s likely that some other piece of code in the same package can build on this idea, or even use errWriter directly.

Also, once errWriter exists, there’s more it could do to help, especially in less artificial examples. It could accumulate the byte count. It could coalesce writes into a single buffer that can then be transmitted atomically. And much more.

In fact, this pattern appears often in the standard library. The archive/zip and net/http packages use it. More salient to this discussion, the bufio package’s Writer is actually an implementation of the errWriter idea. Although bufio.Writer.Write returns an error, that is mostly about honoring the io.Writer interface. The Write method of bufio.Writer behaves just like our errWriter.write method above, with Flush reporting the error, so our example could be written like this:

b := bufio.NewWriter(fd)
b.Write(p0[a:b])
b.Write(p1[c:d])
b.Write(p2[e:f])
// and so on
if b.Flush() != nil {
    return b.Flush()
}

There is one significant drawback to this approach, at least for some applications: there is no way to know how much of the processing completed before the error occurred. If that information is important, a more fine-grained approach is necessary. Often, though, an all-or-nothing check at the end is sufficient.

We’ve looked at just one technique for avoiding repetitive error handling code. Keep in mind that the use of errWriter or bufio.Writer isn’t the only way to simplify error handling, and this approach is not suitable for all situations. The key lesson, however, is that errors are values and the full power of the Go programming language is available for processing them.

Use the language to simplify your error handling.

But remember: Whatever you do, always check your errors!
