Snippets: All About Go Strings
Greetings! Coming from a Python background, I don’t have the Go strings API memorized; I frequently find myself repeating the same search/copy/paste for common operations. There are also a lot of formatting-related patterns that I’ve had to repeatedly look up. In this post I’m going to document some of the more useful Go strings snippets I’ve come across, as well as some other useful concepts relating to strings in Go in general.
Right off the bat, here are some informative reads: “Strings, bytes, runes and characters in Go” and (also linked in that article) “The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!)”. The TL;DR of the first article is:
- In Go, a string is in effect a read-only slice of bytes.
- String literals often use `\xNN` notation; bytes range from hexadecimal values `00` through `FF`, inclusive.
- Raw strings are defined with backticks and cannot contain escape sequences.
- Source code in Go is defined to be UTF-8 text; when you type `⌘` into a string literal, your text editor places the UTF-8 encoding of the symbol into the source text.
- Strings can contain arbitrary bytes, but when constructed from string literals (with no byte-level escapes), those bytes are UTF-8.
- A character can be a combination of code points. The term “code point” is a bit verbose; in Go it’s typically called a `rune`, which is an alias for `int32`. The type and value of `x` in `x := '⌘'` (single quotes, not backticks) is `rune` with integer value `0x2318`.
- A `for` loop going index by index over a string goes byte-by-byte (i.e., indexing a string yields bytes). A `for` loop ranging over a string goes rune-by-rune.
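To make the byte-versus-rune distinction concrete, here’s a minimal sketch. The helper `byteAndRuneCounts` and the sample string are my own illustrative choices, not something from the article:

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// byteAndRuneCounts (a hypothetical helper) returns the number of bytes
// and the number of runes in s.
func byteAndRuneCounts(s string) (int, int) {
	return len(s), utf8.RuneCountInString(s)
}

func main() {
	const s = "a⌘" // 'a' is 1 byte in UTF-8, '⌘' (U+2318) is 3 bytes

	// Indexing goes byte by byte.
	for i := 0; i < len(s); i++ {
		fmt.Printf("byte %d: %x\n", i, s[i])
	}

	// Ranging decodes UTF-8 and yields (byte offset, rune) pairs.
	for i, r := range s {
		fmt.Printf("rune at offset %d: %c (%U)\n", i, r, r)
	}

	// Single quotes make a rune literal; rune is an alias for int32,
	// so %T reports int32.
	x := '⌘'
	fmt.Printf("%T %#x\n", x, x) // int32 0x2318

	nBytes, nRunes := byteAndRuneCounts(s)
	fmt.Println(nBytes, nRunes) // 4 2
}
```

The index loop prints four bytes (`61`, `e2`, `8c`, `98`), while the range loop only visits two runes, at offsets 0 and 1.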
A handy built-in library is the `strings` package; here’s the strings documentation. Keep in mind that the strings API exposes a set of functions; rather than calling a method on a string, you’ll be passing a string into the appropriate function.
Here’s a snippet from the wonderful Go By Example:
```go
package main

import (
	"fmt"
	s "strings"
)

var p = fmt.Println

func main() {
	p("Contains: ", s.Contains("test", "es"))
	p("Count: ", s.Count("test", "t"))
	p("HasPrefix: ", s.HasPrefix("test", "te"))
	p("HasSuffix: ", s.HasSuffix("test", "st"))
	p("Index: ", s.Index("test", "e"))
	p("Join: ", s.Join([]string{"a", "b"}, "-"))
	p("Repeat: ", s.Repeat("a", 5))
	p("Replace: ", s.Replace("foo", "o", "0", -1))
	p("Replace: ", s.Replace("foo", "o", "0", 1))
	p("Split: ", s.Split("a-b-c-d-e", "-"))
	p("ToLower: ", s.ToLower("TEST"))
	p("ToUpper: ", s.ToUpper("test"))
}
```
```
$ go run string-functions.go
Contains:  true
Count:  2
HasPrefix:  true
HasSuffix:  true
Index:  1
Join:  a-b
Repeat:  aaaaa
Replace:  f00
Replace:  f0o
Split:  [a b c d e]
ToLower:  test
ToUpper:  TEST
```
Here’s a (not comprehensive) list of other strings functions that looked handy while I was just scrolling through the documentation:
- `func Clone(s string) string`
- `func ContainsAny(s, chars string) bool`
- `func Cut(s, sep string) (before, after string, found bool)`
- `func FieldsFunc(s string, f func(rune) bool) []string`
  - Like `Split` but deals with runs of separators.
- `func Map(mapping func(rune) rune, s string) string`
- `func ToValidUTF8(s, replacement string) string`
- `func Trim(s, cutset string) string`
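Here’s a rough sketch of how a few of these might be used. The helpers `splitKeyValue`, `words`, and `dropDigits` are names I made up for illustration; only the `strings` and `unicode` calls come from the standard library. Note that `Cut` and `Clone` need Go 1.18 or newer.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// splitKeyValue (hypothetical helper) splits "key=value" at the first "=".
func splitKeyValue(s string) (key, value string, ok bool) {
	return strings.Cut(s, "=")
}

// words (hypothetical helper) splits s on runs of runes that are
// neither letters nor digits.
func words(s string) []string {
	return strings.FieldsFunc(s, func(r rune) bool {
		return !unicode.IsLetter(r) && !unicode.IsDigit(r)
	})
}

// dropDigits (hypothetical helper) removes digits from s; returning a
// negative rune from the mapping function drops that rune.
func dropDigits(s string) string {
	return strings.Map(func(r rune) rune {
		if unicode.IsDigit(r) {
			return -1
		}
		return r
	}, s)
}

func main() {
	k, v, ok := splitKeyValue("name=gopher")
	fmt.Println(k, v, ok) // name gopher true

	// FieldsFunc collapses runs of separators; Split keeps empty fields.
	fmt.Println(words("a, b;;c  d"))             // [a b c d]
	fmt.Println(len(strings.Split("a,,b", ","))) // 3

	fmt.Println(dropDigits("g0ph3r"))                // gphr
	fmt.Println(strings.Trim("--hello--", "-"))      // hello
	fmt.Println(strings.ContainsAny("hello", "xyz")) // false
	fmt.Println(strings.ToValidUTF8("a\xffb", "?"))  // a?b
}
```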
Here’s some concepts to keep in mind when it comes to formatting strings (e.g., with fmt.Sprintf). As always, there’s a relevant Go By Example Link:
- `%v`, `%+v`, and `%#v` will print stuff like `{1 2}`, `{field1:1 field2:2}`, and `main.myType{field1:1, field2:2}`.
- `%T` prints the type.
- `%t` for booleans.
- `%d`, `%b`, `%c`, `%x` for decimal, binary, char, and hex representations of integers.
- `%f`, `%e`, and `%E` for `1.230000`, `1.230000e+00`, `1.230000E+00` flavors of floats.
- `%s`, `%q`, `%x` for vanilla strings, double-quoted strings, and hex.
  - This is a little tricky, but for clarity: given the argument `"\"string\""`, `%s` prints `"string"` and `%q` prints `"\"string\""`; `%x` prints the bytes in hex, so `"hex this"` becomes `6865782074686973`.
  - The `%q` verb will escape any non-printable byte sequences in a string so the output is unambiguous. `%+q` will escape non-ASCII bytes while also interpreting UTF-8, so it exposes the Unicode values of UTF-8 that represent non-ASCII characters (i.e., you’ll see `\u2318` instead of `⌘`).
- `%p` for pointer address.
- Width formatting: this is somewhat similar to Python conventions:
  - By default, right justified; use `-` to left justify.
  - Specify the width to the left of the verb: `%6d` or `%-6.2f`.
  - There is a “space” flag: `fmt.Printf("% x\n", some_byte_slice)` will print something like `bd b2 3d ...`.
- You can format strings with `Sprintf`: `s := fmt.Sprintf("sprintf: a %s", "string")`.
- You can format and print to `io.Writer`s using `Fprintf`: `fmt.Fprintf(os.Stderr, "io: an %s\n", "error")`.
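To tie the verbs and flags above together, here’s a small sketch you can run and poke at. The `point` type and the `quoted` and `padded` helpers are hypothetical names of my own; the expected outputs in the comments assume the code is built as package `main`.

```go
package main

import (
	"fmt"
	"os"
)

// point is a hypothetical struct used only to show the %v family.
type point struct {
	x, y int
}

// quoted (hypothetical helper) formats s with %q and %+q.
func quoted(s string) (string, string) {
	return fmt.Sprintf("%q", s), fmt.Sprintf("%+q", s)
}

// padded (hypothetical helper) shows width and justification
// for an int and a float.
func padded(n int, f float64) string {
	return fmt.Sprintf("|%6d|%-6d|%6.2f|%-6.2f|", n, n, f, f)
}

func main() {
	p := point{1, 2}
	fmt.Printf("%v | %+v | %#v | %T\n", p, p, p, p)
	// {1 2} | {x:1 y:2} | main.point{x:1, y:2} | main.point

	fmt.Printf("%t\n", true)                    // true
	fmt.Printf("%d %b %c %x\n", 65, 65, 65, 65) // 65 1000001 A 41
	fmt.Printf("%f %e %E\n", 1.23, 1.23, 1.23)  // 1.230000 1.230000e+00 1.230000E+00

	fmt.Printf("%s\n", "\"string\"")        // "string"
	fmt.Printf("%q\n", "\"string\"")        // "\"string\""
	fmt.Printf("%x\n", "hex this")          // 6865782074686973
	fmt.Printf("% x\n", []byte("hex this")) // 68 65 78 20 74 68 69 73

	q, plusQ := quoted("⌘")
	fmt.Println(q, plusQ) // "⌘" "\u2318"

	fmt.Printf("%p\n", &p) // pointer address; varies per run

	fmt.Println(padded(42, 1.5)) // |    42|42    |  1.50|1.50  |

	// Sprintf returns the formatted string; Fprintf writes to any io.Writer.
	s := fmt.Sprintf("sprintf: a %s", "string")
	fmt.Println(s) // sprintf: a string
	fmt.Fprintf(os.Stderr, "io: an %s\n", "error")
}
```

The `|` characters in `padded` are only there to make the padding visible.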