Scala: Can I rely on the order of items in a Set?

2019-01-18 01:44发布

This was quite an unplesant surprise:

scala> Set(1, 2, 3, 4, 5)       
res18: scala.collection.immutable.Set[Int] = Set(4, 5, 1, 2, 3)
scala> Set(1, 2, 3, 4, 5).toList
res25: List[Int] = List(5, 1, 2, 3, 4)

The example by itself suggest a "no" answer to my question. Then what about ListSet?

scala> import scala.collection.immutable.ListSet
scala> ListSet(1, 2, 3, 4, 5)
res21: scala.collection.immutable.ListSet[Int] = Set(1, 2, 3, 4, 5)

This one seems to work, but should I rely on this behavior? What other data structure is suitable for an immutable collection of unique items, where the original order must be preserved?

By the way, I do know about distict method in List. The problem is, I want to enforce uniqueness of items (while preserving the order) at interface level, so using distinct would mess up my neat design..

EDIT

ListSet doesn't seem very reliable either:

scala> ListSet(1, 2, 3, 4, 5).toList
res28: List[Int] = List(5, 4, 3, 2, 1)

EDIT2

In my search for a perfect design I tried this:

scala> class MyList[A](list: List[A]) { val values = list.distinct }
scala> implicit def toMyList[A](l: List[A]) = new MyList(l)
scala> implicit def fromMyList[A](l: MyList[A]) = l.values     

Which actually works:

scala> val l1: MyList[Int] = List(1, 2, 3)
scala> l1.values
res0: List[Int] = List(1, 2, 3)

scala> val l2: List[Int] = new MyList(List(1, 2, 3))
l2: List[Int] = List(1, 2, 3)

The problem, however, is that I do not want to expose MyList outside the library. Is there any way to have the implicit conversion when overriding? For example:

trait T { def l: MyList[_] }
object O extends T { val l: MyList[_] = List(1, 2, 3) }
scala> O.l mkString(" ")  // Let's test the implicit conversion
res7: String = 1 2 3      

I'd like to do it like this:

object O extends T { val l = List(1, 2, 3) }  // Doesn't work

标签: scala set
3条回答
迷人小祖宗
2楼-- · 2019-01-18 02:20

It is my belief that you should never rely on the order in a set. In no language.

Apart from that, have a look at this question which talks about this in depth.

查看更多
贪生不怕死
3楼-- · 2019-01-18 02:28

ListSet will always return elements in the reverse order of insertion because it is backed by a List, and the optimal way of adding elements to a List is by prepending them.

Immutable data structures are problematic if you want first in, first out (a queue). You can get O(logn) or amortized O(1). Given the apparent need to build the set and then produce an iterator out of it (ie, you'll first put all elements, then you'll remove all elements), I don't see any way to amortize it.

You can rely that a ListSet will always return elements in last in, first out order (a stack). If that suffices, then go for it.

查看更多
闹够了就滚
4楼-- · 2019-01-18 02:32

That depends on the Set you are using. If you do not know which Set implementation you have, then the answer is simply, no you cannot be sure. In practice I usually encounter the following three cases:

  1. I need the items in the Set to be ordered. For this I use classes mixing in the SortedSet trait which when you use only the Standard Scala API is always a TreeSet. It guarantees the elements are ordered according to their compareTo method (see the Ordered trat). You get a (very) small performance penalty for the sorting as the runtime of inserts/retrievals is now logarithmic, not (almost) constant like with the HashSet (assuming a good hash function).

  2. You need to preserve the order in which the items are inserted. Then you use the LinkedHashSet. Practically as fast as the normal HashSet, needs a little more storage space for the additional links between elements.

  3. You do not care about order in the Set. So you use a HashSet. (That is the default when using the Set.apply method like in your first example)

All this applies to Java as well, Java has a TreeSet, LinkedHashSet and HashSet and the corresponding interfaces SortedSet, Comparable and plain Set.

查看更多
登录 后发表回答