profiling - Go profiler 的输出令人困惑（而且不正确？）

Question

我有一个我正在尝试分析的 Go 二进制文件，我得到了令人惊讶的结果。该代码在中具有以下（截断）main.go，其余代码在包中monte：

package main

import (
  "monte"
  "runtime/pprof"
)

var cpuprofile = flag.String("cpuprofile", "", "write cpu profile to file")

func main() {
  flag.Parse()
  if *cpuprofile != "" {
    f, err := os.Create(*cpuprofile)
    if err != nil {
      log.Fatal(err)
    }
    pprof.StartCPUProfile(f)
  }

  monte.ExpensiveOperation()

  pprof.StopCPUProfile()
}

我用构建我的可执行文件go build src/main.go，然后用./main -cpuprofile=monte.prof. 当我检查输出时go tool pprof main monte.prof，我得到以下输出：

(pprof) top10 --cum
Total: 346 samples
     280  80.9%  80.9%      343  99.1% time.Time.Format
       0   0.0%  80.9%      341  98.6% runtime.interhash
       0   0.0%  80.9%      341  98.6% runtime.main
       0   0.0%  80.9%      251  72.5% strconv.Unquote
      13   3.8%  84.7%       31   9.0% strconv.roundShortest
      11   3.2%  87.9%       18   5.2% strconv.fmtE
       9   2.6%  90.5%        9   2.6% runtime.markallocated
       1   0.3%  90.8%        8   2.3% math/rand.Float64
       2   0.6%  91.3%        8   2.3% runtime.FixAlloc_Free
       7   2.0%  93.4%        8   2.3% time.nextStdChunk

累积时间最长的函数是time.Time.Format，这对我来说似乎是错误的（不应该是main吗？）并且根本没有提及monte，尽管“昂贵的操作”需要大约 10 秒才能完成，时间充裕让采样器看到它。如果我将--focus=monte标志传递给go tool pprof，则根本不会显示任何样本。我假设我在某处遗漏了一些标志；有人有什么想法吗？谢谢！

score 2 · Accepted Answer

你的main包没有编译，你也没有为你的monte包提供源代码。因此，我们无法重现您的结果。这是调试的第一步。

这是一些有效的源代码和结果。

package monte

func ExpensiveOperation() {
    var t float64
    for i := int64(0); i < 1e10; i++ {
        t += 1
    }
}

.

package main

import (
    "flag"
    "log"
    "monte"
    "os"
    "runtime/pprof"
)

var cpuprofile = flag.String("cpuprofile", "", "write cpu profile to file")

func main() {
    flag.Parse()
    if *cpuprofile != "" {
        f, err := os.Create(*cpuprofile)
        if err != nil {
            log.Fatal(err)
        }
        pprof.StartCPUProfile(f)
    }

    monte.ExpensiveOperation()

    pprof.StopCPUProfile()
}

.

(pprof) top10 --cum
Total: 1166 samples
       0   0.0%   0.0%     1166 100.0% main.main
    1166 100.0% 100.0%     1166 100.0% monte.ExpensiveOperation
       0   0.0% 100.0%     1166 100.0% runtime.main
       0   0.0% 100.0%     1166 100.0% schedunlock

更新：

处的代码github.com/haldean/monte没有提供有意义的结果。它只需要时间真正的 0m0.506s 并且只需要 48 个样本。

(pprof) top10 --cum
Total: 48 samples
       0   0.0%   0.0%       45  93.8% main.main
       0   0.0%   0.0%       45  93.8% monte.(*Scene).Evaluate
       0   0.0%   0.0%       45  93.8% monte.(*Scene).Render
       0   0.0%   0.0%       45  93.8% monte.(*Scene).SetColor
       0   0.0%   0.0%       45  93.8% runtime.main
       0   0.0%   0.0%       45  93.8% schedunlock
       0   0.0%   0.0%       44  91.7% monte.(*Scene).RayCast
       4   8.3%   8.3%       31  64.6% runtime.new
      13  27.1%  35.4%       27  56.2% runtime.mallocgc
       3   6.2%  41.7%       26  54.2% monte.(*Scene).DirectionAt

score 1 · Accepted Answer

它看起来像一个仅限 CPU 的分析器，因此如果您ExpensiveOperation通过执行 I/O 或睡眠或类似的事情花费时间，它将是不可见的。（这就是“cpu”分析器的问题。）

就数字的含义而言，看起来总共有 346 个样本。根据分析器的工作方式，数字有点模糊并不奇怪，但如果它是一个真正的堆栈采样器，数字将意味着：

341/346 个样本已经main和interhash在堆栈上。您可能希望所有样本都在堆栈中具有 main ，但这是软弱无力的部分。

堆栈上有 343/346 个样本Format。（为什么比main，谁知道呢？）

堆栈上有 251/346 个样本Unquote。所以在这 251 个样本中，他们可能也有main, interhash, 和Format在堆栈上。

通过这种侦探工作，您可以慢慢开始拼凑样本告诉您的内容。

当然，如果你真的可以看到堆栈样本，那么在你确切地知道发生了什么之前，你不必看到太多的样本。

更多关于这一切。

profiling - Go profiler 的输出令人困惑（而且不正确？）

2 回答 2

Related

Reference