I'm stress testing (with loader.io) this kind of code in Go, which builds an array of 100 items along with a few other basic variables and renders them all through a template:
package main

import (
    "html/template"
    "net/http"
)

var templates map[string]*template.Template

// Load templates on program initialisation
func init() {
    if templates == nil {
        templates = make(map[string]*template.Template)
    }
    templates["index.html"] = template.Must(template.ParseFiles("index.html"))
}

func handler(w http.ResponseWriter, r *http.Request) {
    type Post struct {
        Id             int
        Title, Content string
    }

    var Posts [100]Post

    // Fill posts
    for i := 0; i < 100; i++ {
        Posts[i] = Post{i, "Sample Title", "Lorem Ipsum Dolor Sit Amet"}
    }

    type Page struct {
        Title, Subtitle string
        Posts           [100]Post
    }

    var p Page
    p.Title = "Index Page of My Super Blog"
    p.Subtitle = "A blog about everything"
    p.Posts = Posts

    tmpl := templates["index.html"]
    tmpl.ExecuteTemplate(w, "index.html", p)
}

func main() {
    http.HandleFunc("/", handler)
    http.ListenAndServe(":8888", nil)
}
My loader.io test runs 5k concurrent connections per second for 1 minute. The problem is that just a few seconds after the test starts, the average latency climbs to almost 10 seconds, I end up with only about 5k successful responses, and the test aborts because it hits the 50% error rate threshold (timeouts).
On the same machine, PHP gives 50k+.
I understand that this isn't a Go performance issue per se, but probably something related to html/template. Go can of course handle heavy computation far faster than something like PHP, but when it comes to rendering data into a template, why is it so slow?
Are there any workarounds, or am I just doing it wrong? (I'm new to Go.)
P.S. Actually, even with a single item the result is exactly the same: 5-6k responses and the test stops after a huge number of timeouts. But that's probably because the posts array is still declared with the same length of 100.
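In case it helps to reproduce this without loader.io, a throwaway local load generator along these lines can put similar pressure on the handler (just a sketch; loader.io's clients-per-second model is different, and all the numbers above come from loader.io):

package main

import (
    "fmt"
    "net/http"
    "sync"
    "time"
)

// Throwaway load generator: 'concurrency' goroutines hammer the URL in a
// loop for 'duration' and count successes vs failures.
func main() {
    const (
        url         = "http://localhost:8888/"
        concurrency = 500
        duration    = 30 * time.Second
    )

    var (
        wg       sync.WaitGroup
        mu       sync.Mutex
        ok, fail int
    )

    deadline := time.Now().Add(duration)
    for i := 0; i < concurrency; i++ {
        wg.Add(1)
        go func() {
            defer wg.Done()
            for time.Now().Before(deadline) {
                resp, err := http.Get(url)
                success := err == nil && resp.StatusCode == http.StatusOK
                if resp != nil {
                    resp.Body.Close()
                }
                mu.Lock()
                if success {
                    ok++
                } else {
                    fail++
                }
                mu.Unlock()
            }
        }()
    }
    wg.Wait()
    fmt.Printf("ok=%d fail=%d\n", ok, fail)
}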
My template code (index.html):
{{ .Title }}
{{ .Subtitle }}
{{ range .Posts }}
    {{ .Title }}
    {{ .Content }}
{{ end }}
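For reference, profiling was added with github.com/pkg/profile; enabling it looks roughly like this (a minimal sketch of how I wired it in, the exact options may have differed):

package main

import (
    "net/http"

    "github.com/pkg/profile"
)

func main() {
    // Collect a CPU profile; the package writes it to a temporary directory
    // (e.g. /tmp/profileXXXXXXXXX/cpu.pprof) when the program stops.
    defer profile.Start(profile.CPUProfile).Stop()

    http.HandleFunc("/", handler)
    http.ListenAndServe(":8888", nil)
}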
Here's the profiling result from github.com/pkg/profile:
root@Test:~# go tool pprof app /tmp/profile311243501/cpu.pprof
Possible precedence issue with control flow operator at /usr/lib/go/pkg/tool/linux_amd64/pprof line 3008.
Welcome to pprof! For help, type 'help'.
(pprof) top10
Total: 2054 samples
97 4.7% 4.7% 726 35.3% reflect.Value.call
89 4.3% 9.1% 278 13.5% runtime.mallocgc
85 4.1% 13.2% 86 4.2% syscall.Syscall
66 3.2% 16.4% 75 3.7% runtime.MSpan_Sweep
58 2.8% 19.2% 1842 89.7% text/template.(*state).walk
54 2.6% 21.9% 928 45.2% text/template.(*state).evalCall
51 2.5% 24.3% 53 2.6% settype
47 2.3% 26.6% 47 2.3% runtime.stringiter2
44 2.1% 28.8% 149 7.3% runtime.makeslice
40 1.9% 30.7% 223 10.9% text/template.(*state).evalField
These are the profiling results after refining the code (as suggested in icza's answer):
root@Test:~# go tool pprof app /tmp/profile501566907/cpu.pprof
Possible precedence issue with control flow operator at /usr/lib/go/pkg/tool/linux_amd64/pprof line 3008.
Welcome to pprof! For help, type 'help'.
(pprof) top10
Total: 2811 samples
137 4.9% 4.9% 442 15.7% runtime.mallocgc
126 4.5% 9.4% 999 35.5% reflect.Value.call
113 4.0% 13.4% 115 4.1% syscall.Syscall
110 3.9% 17.3% 122 4.3% runtime.MSpan_Sweep
102 3.6% 20.9% 2561 91.1% text/template.(*state).walk
74 2.6% 23.6% 337 12.0% text/template.(*state).evalField
68 2.4% 26.0% 72 2.6% settype
66 2.3% 28.3% 1279 45.5% text/template.(*state).evalCall
65 2.3% 30.6% 226 8.0% runtime.makeslice
57 2.0% 32.7% 57 2.0% runtime.stringiter2
(pprof)
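For context, "refining the code" above means roughly this kind of change: moving the type definitions and the sample data out of the handler and checking the template error. The sketch below is my own approximation of that direction (reusing the templates map from the original code); icza's answer has the exact suggestions.

type Post struct {
    Id             int
    Title, Content string
}

type Page struct {
    Title, Subtitle string
    Posts           []Post
}

var posts []Post

func init() {
    // Build the sample posts once at startup instead of on every request.
    posts = make([]Post, 100)
    for i := range posts {
        posts[i] = Post{i, "Sample Title", "Lorem Ipsum Dolor Sit Amet"}
    }
}

func handler(w http.ResponseWriter, r *http.Request) {
    p := &Page{
        Title:    "Index Page of My Super Blog",
        Subtitle: "A blog about everything",
        Posts:    posts,
    }
    if err := templates["index.html"].ExecuteTemplate(w, "index.html", p); err != nil {
        http.Error(w, err.Error(), http.StatusInternalServerError)
    }
}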