Snippets: All About Go Strings
Greetings! Coming from a Python background, I don’t have the Go strings API memorized; I frequently find myself repeating the same search/copy/paste for common operations. There are also a lot of formatting-related patterns that I’ve had to look up repeatedly. In this post I’m going to document some of the more useful Go strings snippets I’ve come across, as well as some other useful concepts relating to strings in Go in general.
Right off the bat, here are two informative reads: Strings, bytes, runes, and characters in Go and (also linked in that article) The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!). The TL;DR of the first article is:
- In Go, a string is in effect a read-only slice of bytes.
- String literals can hold arbitrary byte values using \xNN notation; bytes range from hexadecimal values 00 through FF, inclusive.
- Raw strings are defined with backticks and cannot contain escape sequences.
- Source code in Go is defined to be UTF-8 text; when you type ⌘ into a string literal, your text editor places the UTF-8 encoding of the symbol into the source text.
- Strings can contain arbitrary bytes, but when constructed from string literals (without byte-level escapes), those bytes are UTF-8.
- A character can be a combination of code points. The term “code point” is a bit verbose; in Go it’s typically called a rune, which is an alias for int32. After x := '⌘' (single quotes), x is a rune with integer value 0x2318; with backticks or double quotes, x would be a string instead.
- A for loop going index by index over a string goes byte-by-byte (i.e., indexing a string yields bytes). A for loop ranging over a string goes rune-by-rune.
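To make the byte-versus-rune distinction concrete, here’s a minimal sketch. The helper names byteValues and runeValues and the sample string are my own, picked just for illustration:

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// byteValues walks the string index by index; s[i] is a single byte.
func byteValues(s string) []byte {
	out := make([]byte, 0, len(s))
	for i := 0; i < len(s); i++ {
		out = append(out, s[i])
	}
	return out
}

// runeValues ranges over the string; each step decodes one UTF-8 code point.
func runeValues(s string) []rune {
	var out []rune
	for _, r := range s {
		out = append(out, r)
	}
	return out
}

func main() {
	const s = "a⌘" // sample string chosen for this sketch

	fmt.Println(len(s))                    // byte count: 4
	fmt.Println(utf8.RuneCountInString(s)) // rune count: 2
	fmt.Printf("% x\n", byteValues(s))     // 61 e2 8c 98
	fmt.Printf("%U\n", runeValues(s))      // [U+0061 U+2318]

	x := '⌘' // single quotes make a rune constant
	fmt.Printf("%T %#x\n", x, x) // int32 0x2318
}
```

Note that len reports bytes, not characters, which is the usual surprise when coming from Python.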
A handy built-in package is strings; here’s the strings documentation. Keep in mind that the strings API exposes a set of functions; rather than calling a method on a string, you’ll be passing a string into the appropriate function.
Here’s a snippet from the wonderful Go By Example:
package main

import (
	"fmt"
	s "strings"
)

var p = fmt.Println

func main() {
	p("Contains: ", s.Contains("test", "es"))
	p("Count: ", s.Count("test", "t"))
	p("HasPrefix: ", s.HasPrefix("test", "te"))
	p("HasSuffix: ", s.HasSuffix("test", "st"))
	p("Index: ", s.Index("test", "e"))
	p("Join: ", s.Join([]string{"a", "b"}, "-"))
	p("Repeat: ", s.Repeat("a", 5))
	p("Replace: ", s.Replace("foo", "o", "0", -1))
	p("Replace: ", s.Replace("foo", "o", "0", 1))
	p("Split: ", s.Split("a-b-c-d-e", "-"))
	p("ToLower: ", s.ToLower("TEST"))
	p("ToUpper: ", s.ToUpper("test"))
}
$ go run string-functions.go
Contains: true
Count: 2
HasPrefix: true
HasSuffix: true
Index: 1
Join: a-b
Repeat: aaaaa
Replace: f00
Replace: f0o
Split: [a b c d e]
ToLower: test
ToUpper: TEST
Here’s a (not comprehensive) list of other strings functions that looked handy while I was just scrolling through the documentation:
- func Clone(s string) string
- func ContainsAny(s, chars string) bool
- func Cut(s, sep string) (before, after string, found bool)
- func FieldsFunc(s string, f func(rune) bool) []string
  - Like Split, but it splits on runs of characters for which f returns true, so consecutive separators don’t produce empty strings.
- func Map(mapping func(rune) rune, s string) string
- func ToValidUTF8(s, replacement string) string
- func Trim(s, cutset string) string
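Here’s a rough sketch of a few of these in action. The wrapper functions (splitKeyValue, words, maskDigits) and the inputs are made up for illustration; only the strings and unicode calls come from the standard library. Cut and Clone need Go 1.18 or newer.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// splitKeyValue splits "key=value" around the first "=" using strings.Cut.
func splitKeyValue(s string) (key, value string, ok bool) {
	return strings.Cut(s, "=")
}

// words splits on runs of anything that is not a letter or digit.
func words(s string) []string {
	return strings.FieldsFunc(s, func(r rune) bool {
		return !unicode.IsLetter(r) && !unicode.IsDigit(r)
	})
}

// maskDigits replaces every digit with '#' using strings.Map.
func maskDigits(s string) string {
	return strings.Map(func(r rune) rune {
		if unicode.IsDigit(r) {
			return '#'
		}
		return r
	}, s)
}

func main() {
	k, v, ok := splitKeyValue("lang=go")
	fmt.Println(k, v, ok) // lang go true

	fmt.Printf("%q\n", words("a, b;; c"))             // ["a" "b" "c"]
	fmt.Printf("%q\n", strings.Split("a,,b", ","))    // ["a" "" "b"]
	fmt.Println(maskDigits("id-42"))                  // id-##
	fmt.Println(strings.Trim("**bold**", "*"))        // bold
	fmt.Println(strings.ContainsAny("gopher", "xyz")) // false
	fmt.Println(strings.ToValidUTF8("a\xffb", "?"))   // a?b
}
```

The Split line is there for contrast: it keeps the empty string between the two commas, while FieldsFunc collapses the run of separators.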
Here are some concepts to keep in mind when it comes to formatting strings (e.g., with fmt.Sprintf). As always, there’s a relevant Go By Example link:
- %v, %+v, and %#v will print stuff like {1 2}, {field1:1 field2:2}, and main.myType{field1:1, field2:2}.
- %T prints the type.
- %t for booleans.
- %d, %b, %c, %x for digit, binary, char, and hex representations of integers.
- %f, %e, and %E for 1.230000, 1.230000e+00, and 1.230000E+00 flavors of floats (six digits after the decimal point by default).
- %s, %q, %x for vanilla strings, double-quoted strings, and hex.
  - This is a little tricky, but for clarity: given the Go literal "\"string\"", these print "string", "\"string\"", and 22737472696e6722.
  - The %q verb will escape any non-printable byte sequences in a string so the output is unambiguous. %+q will also escape non-ASCII bytes while interpreting UTF-8, so it exposes the Unicode values of the non-ASCII characters (i.e., you’ll see \u2318 instead of ⌘).
- %p for pointer address.
- Width formatting: this is somewhat similar to Python conventions:
  - By default, output is right justified; use - to left justify.
  - Specify the width to the left of the verb: %6d or %-6.2f
  - There is a “space” flag: fmt.Printf("% x\n", some_byte_slice) will print something like bd b2 3d ....
- You can format strings with Sprintf: s := fmt.Sprintf("sprintf: a %s", "string").
- You can format and print to io.Writers using Fprintf: fmt.Fprintf(os.Stderr, "io: an %s\n", "error").
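To tie the verbs together, here’s a small sketch that exercises most of them. The point type and the helper functions quoted and padded are hypothetical names I made up for the example:

```go
package main

import (
	"fmt"
	"os"
)

// point is a hypothetical struct used only to show %v, %+v, and %#v.
type point struct {
	x, y int
}

// quoted renders a string with %s, %q, %+q, and %x, separated by " | ".
func quoted(s string) string {
	return fmt.Sprintf("%s | %q | %+q | %x", s, s, s, s)
}

// padded shows right-justified (default) and left-justified (-) widths.
func padded(n int, f float64) string {
	return fmt.Sprintf("|%6d|%-6d|%8.2f|%-8.2f|", n, n, f, f)
}

func main() {
	p := point{1, 2}
	fmt.Printf("%v\n", p)  // {1 2}
	fmt.Printf("%+v\n", p) // {x:1 y:2}
	fmt.Printf("%#v\n", p) // main.point{x:1, y:2}
	fmt.Printf("%T\n", p)  // main.point
	fmt.Printf("%p\n", &p) // a pointer address; varies per run

	fmt.Printf("%t\n", true)                    // true
	fmt.Printf("%d %b %c %x\n", 65, 65, 65, 65) // 65 1000001 A 41
	fmt.Printf("%f %e %E\n", 1.23, 1.23, 1.23)  // 1.230000 1.230000e+00 1.230000E+00

	fmt.Println(quoted("⌘"))         // ⌘ | "⌘" | "\u2318" | e28c98
	fmt.Println(padded(42, 3.14159)) // |    42|42    |    3.14|3.14    |
	fmt.Printf("% x\n", []byte("go")) // 67 6f

	s := fmt.Sprintf("sprintf: a %s", "string")
	fmt.Println(s)
	fmt.Fprintf(os.Stderr, "io: an %s\n", "error")
}
```

Wrapping the Sprintf calls in small functions is just a convenience for this sketch; it makes the results easy to compare against expected strings.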