Go string manipulation: optimize performance 100x
Optimization, faster, better: these are the words we type into Google when we want our programs to perform better.
In a recent project I worked on, optimization was the main focus. String manipulation may not be a bottleneck if your program processes only a few words, but my project was processing millions of strings per second, so every nanosecond counted.
The tips below are the fruit of two months of work. They may not all fit your needs, so test them and keep the ones that suit your code.
1. Measure the time
You can't talk about optimization and time reduction if you don't know how much time your functions take. Add a timer around each string manipulation function to measure it. Example: a Prometheus timer.
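import (
	"github.com/prometheus/client_golang/prometheus"
)

// requestDuration records how long the measured function takes, in seconds.
var requestDuration = prometheus.NewHistogram(prometheus.HistogramOpts{
	Name:    "my_request_duration",
	Help:    "the duration of my function",
	Buckets: prometheus.LinearBuckets(0.01, 0.01, 10),
})

func init() {
	// The histogram must be registered, otherwise it is never exported.
	prometheus.MustRegister(requestDuration)
}

func MeasureTime() {
	timer := prometheus.NewTimer(requestDuration)
	defer timer.ObserveDuration()
	// Do something here that takes time.
	// DoMyStringManipulationHere is a placeholder for your own function.
	DoMyStringManipulationHere()
}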
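If you only want a quick local number before wiring up metrics, the standard library is enough. The following is a minimal sketch, not part of the original project: `timeIt` is a hypothetical helper name, and the string operation inside `main` is only an illustration.

```go
package main

import (
	"fmt"
	"strings"
	"time"
)

// timeIt is a hypothetical helper: it runs fn the given number of times
// and returns the average wall-clock duration of one call.
func timeIt(iterations int, fn func()) time.Duration {
	if iterations <= 0 {
		return 0
	}
	start := time.Now()
	for i := 0; i < iterations; i++ {
		fn()
	}
	return time.Since(start) / time.Duration(iterations)
}

func main() {
	input := "  Hello, Gopher!  "
	avg := timeIt(1_000_000, func() {
		_ = strings.ToUpper(strings.TrimSpace(input))
	})
	fmt.Printf("average per call: %v\n", avg)
}
```

Averaging over many iterations smooths out timer resolution, but for anything you want to compare seriously, `go test -bench` (used in the sections below) is the better tool.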
2. DON’T use regex
Avoid regular expressions as much as possible. Regex matching is expensive and can sometimes take milliseconds. You can often get the same result in less than a microsecond by writing your own string manipulation logic.
Benchmark
package main

import (
	"regexp"
	"strings"
	"testing"
)

func BenchmarkStringNoRegex(b *testing.B) {
	myblogurl := "https://omarghader.github.io/"
	for n := 0; n < b.N; n++ {
		// Check if the string starts with http or https.
		// The second HasPrefix is redundant: any string that starts
		// with "https" also starts with "http".
		if !strings.HasPrefix(myblogurl, "http") && !strings.HasPrefix(myblogurl, "https") {
			b.Errorf("string doesn't start with http or https")
		}
	}
}

func BenchmarkStringRegex(b *testing.B) {
	myblogurl := "https://omarghader.github.io/"
	// Compile once, outside the loop. Note that this pattern is not
	// anchored, so it matches "http" anywhere in the string; use
	// "^https?" for a strict prefix check.
	regex := regexp.MustCompile("http[s]?")
	for n := 0; n < b.N; n++ {
		if !regex.MatchString(myblogurl) {
			b.Errorf("regex doesn't match the url")
		}
	}
}
$ GO111MODULE=off go test -bench=. stringregex_test.go
goos: linux
goarch: amd64
cpu: Intel(R) Core(TM) i7-4510U CPU @ 2.00GHz
BenchmarkStringNoRegex-4 1000000000 0.3279 ns/op
BenchmarkStringRegex-4 10190110 116.0 ns/op
PASS
ok command-line-arguments 2.454s
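As a sketch of what "your own logic" can look like for this case, here is a small prefix check that includes the `://` separator, so a string such as `httpfoo` is rejected. The name `hasHTTPScheme` is hypothetical and chosen for illustration only.

```go
package main

import (
	"fmt"
	"strings"
)

// hasHTTPScheme reports whether s starts with "http://" or "https://".
// Hypothetical helper, shown as an alternative to a regex such as "^https?://".
func hasHTTPScheme(s string) bool {
	return strings.HasPrefix(s, "http://") || strings.HasPrefix(s, "https://")
}

func main() {
	fmt.Println(hasHTTPScheme("https://omarghader.github.io/")) // true
	fmt.Println(hasHTTPScheme("ftp://example.com/http"))        // false
}
```

A word of caution when reading the numbers above: a sub-nanosecond result on a constant input often means the compiler has optimized part of the loop away, so the real gap may be smaller than it looks. Benchmark with inputs that resemble your production data.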
3. String replacer
If you want to replace more than two words, use a strings.Replacer. It takes all the old/new pairs at once, so you don't have to chain several strings.Replace calls.
strings.NewReplacer("old1", "new1", "old2", "new2")
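A minimal sketch of how a replacer might be used in practice. The escaping pairs and the `escape` function are illustrative assumptions, not code from the original project. The point is to build the replacer once and reuse it, since construction has a cost.

```go
package main

import (
	"fmt"
	"strings"
)

// htmlEscaper is built once at package level and reused on every call.
// A *strings.Replacer is safe for concurrent use by multiple goroutines.
var htmlEscaper = strings.NewReplacer(
	"&", "&amp;",
	"<", "&lt;",
	">", "&gt;",
)

// escape is a hypothetical helper that applies all replacements in one call.
func escape(s string) string {
	return htmlEscaper.Replace(s)
}

func main() {
	fmt.Println(escape("a < b && b > c")) // a &lt; b &amp;&amp; b &gt; c
}
```

Replaced text is not scanned again, which is why the `&` in `&lt;` is not escaped a second time.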
4. Use a string builder
To concatenate strings, use strings.Builder, which is designed for this job. Keep in mind that for a small number of concatenations the builder may bring little or no benefit.
package main

import (
	"fmt"
	"strings"
)

func main() {
	var sb strings.Builder
	for i := 0; i < 1000; i++ {
		sb.WriteString("a")
	}
	fmt.Println(sb.String())
}
Benchmark
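package main

import (
	"strings"
	"testing"
)

func BenchmarkStringConcat(b *testing.B) {
	x := ""
	for n := 0; n < b.N; n++ {
		// Each += allocates a new string and copies the old content,
		// so the cost per operation grows with the length of x.
		x += "a"
	}
}

func BenchmarkStringBuild(b *testing.B) {
	var buf strings.Builder
	buf.Reset() // not required: the zero value is ready to use
	for n := 0; n < b.N; n++ {
		buf.WriteString("a")
	}
}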
$ GO111MODULE=off go test -bench=.
goos: linux
goarch: amd64
cpu: Intel(R) Core(TM) i7-4510U CPU @ 2.00GHz
BenchmarkStringConcat-4 429110 38720 ns/op
BenchmarkStringBuild-4 251992339 4.709 ns/op
PASS
ok command-line-arguments 19.430s
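When the final size is known or can be estimated, reserving the capacity up front with `Grow` avoids intermediate reallocations. The sketch below assumes a hypothetical `joinWords` helper; in real code `strings.Join` already does this for you, so treat it as an illustration of the pattern only.

```go
package main

import (
	"fmt"
	"strings"
)

// joinWords is a hypothetical helper that joins words with single spaces.
// It computes an upper bound on the result size and calls Grow once,
// so the builder does not have to reallocate while writing.
func joinWords(words []string) string {
	total := 0
	for _, w := range words {
		total += len(w) + 1 // +1 for the separator
	}

	var sb strings.Builder
	sb.Grow(total)
	for i, w := range words {
		if i > 0 {
			sb.WriteByte(' ')
		}
		sb.WriteString(w)
	}
	return sb.String()
}

func main() {
	fmt.Println(joinWords([]string{"go", "is", "fast"})) // go is fast
}
```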
5. Fewer loops, higher performance
Back to basics: if you have to loop over a string, do it as few times as possible.
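For instance, lowercasing a string and then removing its spaces with two library calls walks the data twice and allocates twice. The sketch below does both in a single pass. `normalize` is a hypothetical name, and the code assumes only ASCII letters need lowercasing; other bytes are copied unchanged.

```go
package main

import (
	"fmt"
	"strings"
)

// normalize lowercases ASCII letters and drops spaces in one pass.
// Assumption: only ASCII letters need case folding. Bytes of multi-byte
// UTF-8 characters are all >= 0x80, so they are copied through untouched.
func normalize(s string) string {
	var sb strings.Builder
	sb.Grow(len(s))
	for i := 0; i < len(s); i++ {
		c := s[i]
		switch {
		case c == ' ':
			continue // skip spaces; continue applies to the for loop
		case c >= 'A' && c <= 'Z':
			c += 'a' - 'A'
		}
		sb.WriteByte(c)
	}
	return sb.String()
}

func main() {
	s := "Hello World GO"
	// Two passes and two allocations:
	fmt.Println(strings.ReplaceAll(strings.ToLower(s), " ", "")) // helloworldgo
	// One pass and one allocation:
	fmt.Println(normalize(s)) // helloworldgo
}
```

Whether the single-pass version actually wins depends on your inputs, so benchmark it against the library calls before committing to it.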
6. IP valid check
As said before, regular expressions are heavy. Use net.ParseIP instead (note that it accepts both IPv4 and IPv6 addresses):
net.ParseIP(host)
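If you need an IPv4-only check, comparable to what an IPv4 regex would do, one possible sketch is shown here. `isIPv4` is a hypothetical helper; the colon test is there because `net.ParseIP` also accepts IPv4-mapped IPv6 forms such as `::ffff:1.2.3.4`, for which `To4` returns a non-nil value.

```go
package main

import (
	"fmt"
	"net"
	"strings"
)

// isIPv4 reports whether host is an IPv4 address in dotted-decimal form.
// Hypothetical helper: it rejects anything containing ':' so that
// IPv6 and IPv4-mapped IPv6 addresses are not accepted.
func isIPv4(host string) bool {
	if strings.Contains(host, ":") {
		return false
	}
	ip := net.ParseIP(host)
	return ip != nil && ip.To4() != nil
}

func main() {
	fmt.Println(isIPv4("192.168.152.32")) // true
	fmt.Println(isIPv4("256.1.1.1"))      // false
	fmt.Println(isIPv4("::1"))            // false
}
```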
Benchmark
package main

import (
	"net"
	"regexp"
	"testing"
)

var IpAddress = "192.168.152.32"

func BenchmarkIPNetParse(b *testing.B) {
	for n := 0; n < b.N; n++ {
		ip := net.ParseIP(IpAddress)
		if ip == nil {
			b.Errorf("ip not parsed")
		}
	}
}

func BenchmarkIPRegex(b *testing.B) {
	ipv4Pattern := `^(((25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)(\.|$)){4})`
	// Compile once, outside the measured loop.
	ipv4Regex, err := regexp.Compile(ipv4Pattern)
	if err != nil {
		// Fatalf stops here; continuing would dereference a nil regex.
		b.Fatalf("regex ip not compiled %s", err.Error())
	}
	for n := 0; n < b.N; n++ {
		if matched := ipv4Regex.MatchString(IpAddress); !matched {
			b.Errorf("ip not parsed")
		}
	}
}
$ GO111MODULE=off go test -bench=.
goos: linux
goarch: amd64
cpu: Intel(R) Core(TM) i7-4510U CPU @ 2.00GHz
BenchmarkIPNetParse-4 16786245 71.35 ns/op
BenchmarkIPRegex-4 2056546 574.5 ns/op
PASS
ok _/tmp/golang 4.062s