可以将文章内容翻译成中文,广告屏蔽插件可能会导致该功能失效(如失效,请关闭广告屏蔽插件后再试):
问题:
In GO when I use a struct as a key for a map, there is an unicity of the keys.
For example, the following code produce a map with only one key : map[{x 1}:1]
package main
import (
"fmt"
)
type MyT struct {
A string
B int
}
func main() {
dic := make(map[MyT]int)
for i := 1; i <= 10; i++ {
dic[MyT{"x", 1}] = 1
}
fmt.Println(dic)
}
// result : map[{x 1}:1]
I Tried to do the same in Julia and I had a strange surprise :
This Julia code, similar to the GO one, produces a dictionary whith 10 keys !
type MyT
A::String
B::Int64
end
dic = Dict{MyT, Int64}()
for i in 1:10
dic[MyT("x", 1)] = 1
end
println(dic)
# Dict(MyT("x",1)=>1,MyT("x",1)=>1,MyT("x",1)=>1,MyT("x",1)=>1,MyT("x",1)=>1,MyT("x",1)=>1,MyT("x",1)=>1,MyT("x",1)=>1,MyT("x",1)=>1,MyT("x",1)=>1)
println(keys(dic))
# MyT[MyT("x",1),MyT("x",1),MyT("x",1),MyT("x",1),MyT("x",1),MyT("x",1),MyT("x",1),MyT("x",1),MyT("x",1),MyT("x",1)]
So what I did wrong ?
Thank you @DanGetz for the solution ! :
immutable MyT # or struct MyT with julia > 0.6
A::String
B::Int64
end
dic = Dict{MyT, Int64}()
for i in 1:10
dic[MyT("x", 1)] = 1
end
println(dic) # Dict(MyT("x", 1)=>1)
println(keys(dic)) # MyT[MyT("x", 1)]
回答1:
Mutable values hash by identity in Julia, since without additional knowledge about what a type represents, one cannot know if two values with the same structure mean the same thing or not. Hashing mutable objects by value can be especially problematic if you mutate a value after using it as a dictionary key – this is not a problem when hashing by identity since the identity of a mutable object remains the same even when it is modified. On the other hand, it's perfectly safe to hash immutable objects by value – since they cannot be mutated, and accordingly that is the default behavior for immutable types. In the given example, if you make MyT
immutable you will automatically get the behavior you're expecting:
immutable MyT # `struct MyT` in 0.6
A::String
B::Int64
end
dic = Dict{MyT, Int64}()
for i in 1:10
dic[MyT("x", 1)] = 1
end
julia> dic
Dict{MyT,Int64} with 1 entry:
MyT("x", 1) => 1
julia> keys(dic)
Base.KeyIterator for a Dict{MyT,Int64} with 1 entry. Keys:
MyT("x", 1)
For a type holding a String
and an Int
value that you want to use as a hash key, immutability is probably the right choice. In fact, immutability is the right choice more often than not, which is why the keywords introducing structural types has been change in 0.6 to struct
for immutable structures and mutable struct
for mutable structures – on the principle that people will reach for the shorter, simpler name first, so that should be the better default choice – i.e. immutability.
As @ntdef has written, you can change the hashing behavior of your type by overloading the Base.hash
function. However, his definition is incorrect in a few respects (which is probably our fault for failing to document this more prominently and thoroughly):
- The method signature of
Base.hash
that you want to overload is Base.hash(::T, ::UInt)
.
- The
Base.hash(::T, ::UInt)
method must return a UInt
value.
- If you are overloading
Base.hash
, you should also overload Base.==
to match.
So this would be a correct way to make your mutable type hash by value (new Julia session required to redefine MyT
):
type MyT # `mutable struct MyT` in 0.6
A::String
B::Int64
end
import Base: ==, hash
==(x::MyT, y::MyT) = x.A == y.A && x.B == y.B
hash(x::MyT, h::UInt) = hash((MyT, x.A, x.B), h)
dic = Dict{MyT, Int64}()
for i in 1:10
dic[MyT("x", 1)] = 1
end
julia> dic
Dict{MyT,Int64} with 1 entry:
MyT("x", 1) => 1
julia> keys(dic)
Base.KeyIterator for a Dict{MyT,Int64} with 1 entry. Keys:
MyT("x", 1)
This is kind of annoying to do manually, but the AutoHashEquals package automates this, taking the tedium out of it. All you need to do is prefix the type
definition with the @auto_hash_equals
macro:
using AutoHashEquals
@auto_hash_equals type MyT # `@auto_hash_equals mutable struct MyT` in 0.6
A::String
B::Int64
end
Bottom line:
If you have a type that should have value-based equality and hashing, seriously consider making it immutable.
If your type really has to be mutable, then think hard about whether it's a good idea to use as a hash key.
If you really need to use a mutable type as a hash key with value-based equality and hashing semantics, use the AutoHashEquals
package.
回答2:
You did not do anything wrong. The difference between the languages is in how they choose to hash a struct when using it as a key in the map/Dict. In go, structs are hashed by their values rather than their pointer addresses. This allows programmers to more easily implement multidimensional maps by using structs rather than maps of maps. See this blog post for more info.
Reproducing Julia's Behavior in Go
To reproduce Julia's behavior in go, redefine the map to use a pointer to MyT
as the key type:
func main() {
dic := make(map[MyT]int)
pdic := make(map[*MyT]int)
for i := 1; i <= 10; i++ {
t := MyT{"x", 1}
dic[t] = 1
pdic[&t] = 1
}
fmt.Println(dic)
fmt.Println(pdic)
}
Here, pdic
uses the pointer to a MyT
struct as its key type. Because each MyT
created in the loop has a different memory address, the key will be different. This produces the output:
map[{x 1}:1]
map[0x1040a140:1 0x1040a150:1 0x1040a160:1 0x1040a180:1 0x1040a1b0:1 0x1040a1c0:1 0x1040a130:1 0x1040a170:1 0x1040a190:1 0x1040a1a0:1]
You can play with this on play.golang.org. Unlike in Julia (see below), the way the map type is implemented go means you cannot specify a custom hashing function for a user-defined struct.
Reproducing Go's Behavior in Julia
Julia uses the function Base.hash(::K, ::UInt)
to hash keys for its Dict{K,V}
type. While it doesn't explicitly say so in the documentation, the default hashing algorithm uses the output from object_id
, as you can see in the source code. To reproduce go's behavior in Julia, define a new hash
function for your type that hashes the values of the struct:
Base.hash(t::MyT, h::Uint) = Base.hash((t.A, t.B), h)
Note that you should also define the == operator in the same way to guarantee hash(x)==hash(y)
implies isequal(x,y)
, as mentioned in the documentation.
However, the easiest way to get Julia to act like go in your example is to redefine MyT
as immutable
. As an immutable type, Julia will hash MyT
by its value rather than its object_id
. As an example:
immutable MyT
A::String
B::Int64
end
dic = Dict{MyT, Int64}()
for i in 1:10
dic[MyT("x", 1)] = 1
end
dic[MyT("y", 2)] = 2
println(dic) # prints "Dict(MyT("y",2)=>2,MyT("x",1)=>1)"
回答3:
Edit: Please refer to @StefanKarpinski's answer. The Base.hash
function must return a UInt
for it to be a valid hash, so my example won't work. Also there's some funkiness regarding user defined types which involves recursion.
The reason you get 10 different keys is due to the fact that Julia uses the hash
function when determining the key to a dict. In this case, I'm guessing that it's using the address of the object in memory as the key for the dictionary. If you'd like to explicitly make (A,B)
the unique key, you'll need to override the hash
function for your particular type, with something like this:
Base.hash(x::MyT) = (x.A, x.B)
That will replicate the Go behavior, with only one item in the Dict
.
Here's the documentation to the hash
function.
Hope that helps!