memory efficient way

2019-09-20 16:47发布

I have two examples of the similar program written on Go. Main aim of that code is sort map of structs using value in the struct.

Example with pointers

package main

import (
    "fmt"
    "sort"
)

type payload struct {
    data string
    value  float64
}

type container struct {
    counter int
    storage map[int]*payload
}

type payloadSlice []*payload

// Len is part of sort.Interface.
func (p payloadSlice) Len() int {
    return len(p)
}

// Swap is part of sort.Interface.
func (p payloadSlice) Swap(i, j int) {
    p[i], p[j] = p[j], p[i]
}

// Less is part of sort.Interface. We use count as the value to sort by
func (p payloadSlice) Less(i, j int) bool {
    return p[i].value < p[j].value
}
func main() {
    name := "special_unique_name"
    var m = map[string]container{
        name: {counter: 10, storage: map[int]*payload{
            5: {data: "epsilon", value: 55},8: {data: "theta", value: 85},4: {data: "delta", value: 48},1: {data: "alpha", value: 14},10: {data: "kappa", value: 101},
            3: {data: "gamma", value: 31},6: {data: "zeta", value: 63},2: {data: "beta", value: 26},9: {data: "iota", value: 92},7: {data: "eta", value: 79},
        }},
    }
    s := make(payloadSlice, 0, len(m[name].storage))
    for _, v := range m[name].storage {
        s = append(s, v)
    }
    sort.Sort(s)

    for _, v := range s {
        fmt.Println(name, v)
    }
}

Examples with values

package main

import (
    "fmt"
    "sort"
)

type payload struct {
    data string
    value  float64
}

type container struct {
    counter int
    storage map[int]payload
}

type payloadSlice []payload

// Len is part of sort.Interface.
func (p payloadSlice) Len() int {
    return len(p)
}

// Swap is part of sort.Interface.
func (p payloadSlice) Swap(i, j int) {
    p[i], p[j] = p[j], p[i]
}

// Less is part of sort.Interface. We use count as the value to sort by
func (p payloadSlice) Less(i, j int) bool {
    return p[i].value < p[j].value
}
func main() {
    name := "special_unique_name"
    var m = map[string]container{
        name: {counter: 10, storage: map[int]payload{
            5: {data: "epsilon", value: 55},8: {data: "theta", value: 85},4: {data: "delta", value: 48},1: {data: "alpha", value: 14},10: {data: "kappa", value: 101},
            3: {data: "gamma", value: 31},6: {data: "zeta", value: 63},2: {data: "beta", value: 26},9: {data: "iota", value: 92},7: {data: "eta", value: 79},
        }},
    }
    s := make(payloadSlice, 0, len(m[name].storage))
    for _, v := range m[name].storage {
        s = append(s, v)
    }
    sort.Sort(s)

    for _, v := range s {
        fmt.Println(name, v)
    }
}

I'd like to know 2 moments:

  1. Which example will be memory-efficient? (I guess it's a pointer way)

  2. How to measure performance of these examples, using test data with with different number of structs inside the map? Can you help me with creating Benchmark?

I suppose the size of each struct in the map will vary from 1-2kB in average.

1条回答
三岁会撩人
2楼-- · 2019-09-20 17:18

"Memory-efficient" is a pretty broad term, and can mean a couple of very different things in a garbage-collected language like Go that has separate heap and stack:

  • What uses the least memory?
  • What creates the least GC pressure?

If you want to minimize the application's footprint, you probably want to use pointers any time that you use a value in multiple scopes (e.g. multiple functions). This reduces copying, but adds overhead equal to the pointer size (8 bytes on a 64-bit system).

If you want to minimize GC pressure, you probably want to use pointers only when you need pointer semantics, or the underlying values are quite large. A pointer forces the value onto the heap, which is subject to garbage collection, while a value can be kept on the stack, which is not (when the function returns, the stack is destroyed in its entirety, which is thread-safe and requires no reference tracking).

"GC pressure" is the idea that the more things are created and destroyed on the heap, the more work the garbage collector has to do, which takes processor time away from the real work your application is doing. Every time you allocate on the heap, if there isn't space for the new value, the garbage collector will try to free space by looking for values on the heap that are no longer needed. The more you allocate on the heap, the more often GC has to run, and the longer those runs will take.

To your second question, you can (and should!) measure performance of various approaches to your specific circumstance using the benchmarking facility of the testing package. Make sure you test with realistic data and operations; microbenchmarks or benchmarks using "dummy" data types are unlikely to yield data of any value. The documentation for that package, and countless blog posts and tutorials easily found by web search, should guide you in the right direction on how to write and use benchmarks in Go.

In your specific case, bear in mind that your data type is - as far as this question is concerned - smaller than you think: 24 bytes on a 64-bit system, regardless of the length of the string. Why? Because a string, internally, is a struct containing an int for the length and a pointer to the underlying bytes. When you're trying to optimize for memory use, remember that strings, slices (but not arrays!), and maps are all very small structs containing pointers to their underlying data.

And most importantly: premature optimization is the root of all evil. You should be writing code for two things: functionality, and readability. Use pointer semantics when they deliver the functionality you need, and make intuitive sense to use. If you measure a resource problem (CPU or memory), then you should profile your application to find the sources of the problem, prioritize them, and optimize them.

Until you have measured and profiled a performance problem, you do not have a performance problem.

查看更多
登录 后发表回答