Go string manipulation: optimize performance x100

Optimization, faster, better: these are the words we keep typing into Google when we want our programs to perform better.

In a recent project I worked on, optimization was the main focus. String manipulation may not be a bottleneck if your program only processes a few words, but my project was processing millions of strings per second, so every nanosecond counted!

The tips below are the fruit of two months of work. They may not all fit your needs, so test them and keep the ones that suit your code best.

1. Measure the time

You can’t talk about optimization and time reduction if you don’t know how much time your functions take. Add a timer around each string manipulation function to measure it. Example: a Prometheus timer

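import (
	"github.com/prometheus/client_golang/prometheus"
)

// requestDuration records how long the measured function takes, in seconds.
// Register it once (for example with prometheus.MustRegister) so it is exported.
var requestDuration = prometheus.NewHistogram(prometheus.HistogramOpts{
	Name:    "my_request_duration",
	Help:    "the duration of my function",
	Buckets: prometheus.LinearBuckets(0.01, 0.01, 10),
})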

func MeasureTime() {
	timer := prometheus.NewTimer(requestDuration)
	defer timer.ObserveDuration()

	// Do something here that takes time.
	DoMyStringManipulationHere()
}
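If you don't have Prometheus in your project, the same idea can be sketched with the standard library alone. This is only a minimal illustration: `timeIt` is a hypothetical helper name of mine, and a single run like this is much noisier than a proper `testing.B` benchmark.

```go
package main

import (
	"fmt"
	"strings"
	"time"
)

// timeIt is a hypothetical helper: it runs fn once and returns the elapsed time.
func timeIt(fn func()) time.Duration {
	start := time.Now()
	fn()
	return time.Since(start)
}

func main() {
	input := strings.Repeat("Hello, Gopher! ", 1000)

	// Wrap the string manipulation you want to measure.
	elapsed := timeIt(func() {
		_ = strings.ToUpper(input)
	})

	fmt.Printf("ToUpper took %s\n", elapsed)
}
```

Use it to spot the slow functions first, then confirm with benchmarks before and after each change.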

2. DON’T use regex

Avoid regex as much as possible. Regex matching is expensive and can sometimes take milliseconds. You can often get the same result in less than a microsecond by writing your own string manipulation logic.

Benchmark

func BenchmarkStringNoRegex(b *testing.B) {
	myblogurl := "https://omarghader.github.io/"
	for n := 0; n < b.N; n++ {
		// check if the string starts with http or https
		// ("https" also starts with "http", so one prefix check covers both)
		if !strings.HasPrefix(myblogurl, "http") {
			b.Errorf("string doesn't start with http or https")
		}
	}
}

func BenchmarkStringRegex(b *testing.B) {
	myblogurl := "https://omarghader.github.io/"
	// Compiled once, outside the loop. The pattern is unanchored, so it matches
	// "http" anywhere in the string; anchor it with ^ for a strict prefix check.
	regex := regexp.MustCompile("http[s]?")
	for n := 0; n < b.N; n++ {
		// check if the string contains http or https
		if !regex.MatchString(myblogurl) {
			b.Errorf("regex doesn't match the url")
		}
	}
}

$ GO111MODULE=off go test -bench=. stringregex_test.go
goos: linux
goarch: amd64
cpu: Intel(R) Core(TM) i7-4510U CPU @ 2.00GHz
BenchmarkStringNoRegex-4        1000000000               0.3279 ns/op
BenchmarkStringRegex-4          10190110               116.0 ns/op
PASS
ok      command-line-arguments  2.454s
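To make "writing your own logic" a bit more concrete, here is a small sketch of a stricter check than the benchmark above: it requires the full `http://` or `https://` scheme at the start of the string. The function name `hasHTTPScheme` is my own, not something from a library.

```go
package main

import (
	"fmt"
	"strings"
)

// hasHTTPScheme reports whether s begins with "http://" or "https://".
// It does the job of the anchored regex ^https?:// without the regexp package.
func hasHTTPScheme(s string) bool {
	return strings.HasPrefix(s, "http://") || strings.HasPrefix(s, "https://")
}

func main() {
	fmt.Println(hasHTTPScheme("https://omarghader.github.io/")) // true
	fmt.Println(hasHTTPScheme("ftp://example.com/http"))        // false
}
```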

3. String replacer

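If you need to replace more than two substrings, use the string replacer. It handles all the old/new pairs in one call instead of chaining several strings.Replace calls.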

strings.NewReplacer("old1", "new1", "old2", "new2")
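As an illustration of the replacer, here is a minimal sketch that escapes a few HTML characters. The `escaper` variable and the `escape` function are names I made up for the example; the point is to build the replacer once and reuse it.

```go
package main

import (
	"fmt"
	"strings"
)

// escaper is built once and reused; a *strings.Replacer is safe for
// concurrent use by multiple goroutines.
var escaper = strings.NewReplacer(
	"&", "&amp;",
	"<", "&lt;",
	">", "&gt;",
)

// escape applies all three replacements in one call.
func escape(s string) string {
	return escaper.Replace(s)
}

func main() {
	fmt.Println(escape("<b>Tom & Jerry</b>"))
	// prints: &lt;b&gt;Tom &amp; Jerry&lt;/b&gt;
}
```

Note that already replaced text is not scanned again, so the "&" inside "&lt;" is not escaped a second time.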

4. Use a string builder

If you want to concatenate strings, use strings.Builder, which is made for this job. Note that for a small number of pieces, the builder is not always faster than plain concatenation.

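package main

import (
	"fmt"
	"strings"
)

func main() {
	var sb strings.Builder

	// Append to the builder's internal buffer instead of creating a new string each time.
	for i := 0; i < 1000; i++ {
		sb.WriteString("a")
	}

	fmt.Println(sb.String())
}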

Benchmark

package main

import (
	"strings"
	"testing"
)

func BenchmarkStringConcat(b *testing.B) {
	// Each += allocates a new string and copies the previous content.
	x := ""
	for n := 0; n < b.N; n++ {
		x += "a"
	}
}

func BenchmarkStringBuild(b *testing.B) {
	// The zero value of strings.Builder is ready to use.
	var buf strings.Builder
	for n := 0; n < b.N; n++ {
		buf.WriteString("a")
	}
}

$ GO111MODULE=off go test -bench=.
goos: linux
goarch: amd64
cpu: Intel(R) Core(TM) i7-4510U CPU @ 2.00GHz
BenchmarkStringConcat-4           429110             38720 ns/op
BenchmarkStringBuild-4          251992339                4.709 ns/op
PASS
ok      command-line-arguments  19.430s
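When the final length is known in advance, the builder can be pre-sized with Grow so that the writes do not have to reallocate. The sketch below is only an illustration; `joinWords` is a hypothetical helper (for this exact job, strings.Join from the standard library already exists).

```go
package main

import (
	"fmt"
	"strings"
)

// joinWords concatenates words with single spaces using a pre-sized builder.
func joinWords(words []string) string {
	if len(words) == 0 {
		return ""
	}

	// Final length: every word plus one space between neighbours.
	size := len(words) - 1
	for _, w := range words {
		size += len(w)
	}

	var sb strings.Builder
	sb.Grow(size) // reserve the capacity up front
	for i, w := range words {
		if i > 0 {
			sb.WriteByte(' ')
		}
		sb.WriteString(w)
	}
	return sb.String()
}

func main() {
	fmt.Println(joinWords([]string{"fast", "string", "building"}))
	// prints: fast string building
}
```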

5. Fewer loops, higher performance

Back to basics: if you have to loop over a string, do it as few times as possible.
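For example, if you need two different counts from the same string, collect both in a single pass instead of scanning the string once per count. A minimal sketch, with a made-up function name `countDigitsAndSpaces`:

```go
package main

import "fmt"

// countDigitsAndSpaces walks s once and returns both counts,
// instead of looping over the string once per count.
func countDigitsAndSpaces(s string) (digits, spaces int) {
	for i := 0; i < len(s); i++ {
		c := s[i]
		switch {
		case c >= '0' && c <= '9':
			digits++
		case c == ' ':
			spaces++
		}
	}
	return digits, spaces
}

func main() {
	d, s := countDigitsAndSpaces("call 555 1234 now")
	fmt.Println(d, s) // 7 3
}
```

The loop works on bytes, which is fine here because digits and spaces are single-byte ASCII characters.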

6. IP valid check

As said before, regexes are heavy. Use net.ParseIP instead: it returns nil when the input is not a valid IP address (IPv4 or IPv6).

net.ParseIP(host)
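Since net.ParseIP also accepts IPv6 addresses, a check that should only accept dotted IPv4 needs a little more. Here is one possible sketch; `isIPv4` is a hypothetical helper name, and rejecting any input containing ":" is my own choice to exclude IPv4-mapped IPv6 forms such as "::ffff:192.168.1.1".

```go
package main

import (
	"fmt"
	"net"
	"strings"
)

// isIPv4 reports whether host is a valid dotted-decimal IPv4 address.
func isIPv4(host string) bool {
	// Any ":" means an IPv6 form, including IPv4-mapped addresses.
	if strings.Contains(host, ":") {
		return false
	}
	ip := net.ParseIP(host)
	return ip != nil && ip.To4() != nil
}

func main() {
	fmt.Println(isIPv4("192.168.152.32")) // true
	fmt.Println(isIPv4("256.1.1.1"))      // false
	fmt.Println(isIPv4("::1"))            // false
}
```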

Benchmark

package main

import (
	"net"
	"regexp"
	"testing"
)

var IpAddress = "192.168.152.32"

func BenchmarkIPNetParse(b *testing.B) {
	for n := 0; n < b.N; n++ {
		ip := net.ParseIP(IpAddress)
		if ip == nil {
			b.Errorf("ip not parsed")
		}
	}
}

func BenchmarkIPRegex(b *testing.B) {
	ipv4Pattern := `^(((25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)(\.|$)){4})`
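	// Compile once, outside the measured loop.
	ipv4Regex, err := regexp.Compile(ipv4Pattern)
	if err != nil {
		// Stop here: ipv4Regex is nil and the loop below would panic.
		b.Fatalf("regex ip not compiled: %v", err)
	}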
	for n := 0; n < b.N; n++ {
		if matched := ipv4Regex.MatchString(IpAddress); !matched {
			b.Errorf("ip not parsed")
		}
	}
}

$ GO111MODULE=off go test -bench=.
goos: linux
goarch: amd64
cpu: Intel(R) Core(TM) i7-4510U CPU @ 2.00GHz
BenchmarkIPNetParse-4           16786245                71.35 ns/op
BenchmarkIPRegex-4               2056546               574.5 ns/op
PASS
ok      _/tmp/golang    4.062s