I'm just curious on which of these methods is better (or if there's an even better one that I'm missing). I'm trying to determine if the first letter and last letter of a word are the same, and there are two obvious solutions to me.
if word[:1] == word[len(word)-1:]
or
if word[0] == word[len(word)-1]
As I understand it, the first is just pulling slices of the string and doing a string comparison, while the second is pulling the character from either end and comparing as bytes.
I'm curious if there's a performance difference between the two, and if there's any "preferable" way to do this?
If by letter you mean rune, then use:
func eqRune(s string) bool {
if s == "" {
return false // or true if that makes more sense for the app
}
f, _ := utf8.DecodeRuneInString(s) // 2nd return value is rune size. ignore it.
l, _ := utf8.DecodeLastRuneInString(s) // 2nd return value is rune size. ignore it.
if f != l {
return false
}
if f == unicode.ReplacementChar {
// First and last are invalid UTF-8. Fallback to
// comparing bytes.
return s[0] == s[len(s)-1]
}
return true
}
If you mean byte, then use:
func eqByte(s string) bool {
if s == "" {
return false // or true if that makes more sense for the app
}
return s[0] == s[len(s)-1]
}
Comparing individual bytes is faster than comparing string slices as shown by the benchmark in another answer.
playground example
In Go, string
s are UTF-8 encoded. UTF-8 is a variable-length encoding.
package main
import "fmt"
func main() {
word := "世界世"
fmt.Println(word[:1] == word[len(word)-1:])
fmt.Println(word[0] == word[len(word)-1])
}
Output:
false
false
If you really want to compare a byte, not a character, then be as precise as possible for the compiler. Obviously, compare a byte, not a slice.
BenchmarkSlice-4 200000000 7.55 ns/op
BenchmarkByte-4 2000000000 1.08 ns/op
package main
import "testing"
var word = "word"
func BenchmarkSlice(b *testing.B) {
for i := 0; i < b.N; i++ {
if word[:1] == word[len(word)-1:] {
}
}
}
func BenchmarkByte(b *testing.B) {
for i := 0; i < b.N; i++ {
if word[0] == word[len(word)-1] {
}
}
}
A string is a sequence of bytes. Your method works if you know the string contains only ASCII characters. Otherwise, you should use a method that handles multibyte characters instead of string indexing. You can convert it to a rune slice to process code points or characters, like this:
r := []rune(s)
return r[0] == r[len(r) - 1]
You can read more about strings, byte slices, runes, and code points in the official Go Blog post on the subject.
To answer your question, there's no significant performance difference between the two index expressions you posted.
Here's a runnable example:
package main
import "fmt"
func EndsMatch(s string) bool {
r := []rune(s)
return r[0] == r[len(r) - 1]
}
func main() {
tests := []struct{
s string
e bool
}{
{"foo", false},
{"eve", true},
{"世界世", true},
}
for _, t := range tests {
r := EndsMatch(t.s)
if r != t.e {
fmt.Printf("EndsMatch(%s) failed: expected %t, got %t\n", t.s, t.e, r)
}
}
}
Prints nothing.