Should I use val or def when defining a Stream?

Posted 2019-01-18 23:40

In an answer to a StackOverflow question I created a Stream as a val, like so:

val s:Stream[Int] = 1 #:: s.map(_*2)

and somebody told me that def should be used instead of val because Scala Kata complains (as does the Scala Worksheet in Eclipse) that a "forward reference extends over definition of value s."

But the examples in the Stream docs use val. Which one is right?

Tags: scala stream
1 answer
Deceive 欺骗
#2 · 2019-01-19 00:26

Scalac and the REPL are fine with that code (using val) as long as the variable is a field of a class rather than a local variable. You can make the variable lazy to satisfy Scala Kata, but you generally wouldn't want to use def in this way (that is, define a Stream in terms of itself with def) in a real program. If you do, a new Stream is created each time the method is invoked, so the results of previous computations (which are saved in the Stream) can never be reused. If you use many values from such a Stream, performance will be terrible, and eventually you will run out of memory.
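As a minimal sketch of the lazy workaround: the helper firstPowersOfTwo below is a hypothetical name, not something from the original question. It declares the Stream as a local variable, which is the case where a plain val triggers the forward-reference error, and marks it lazy so that the self-reference compiles. On Scala 2.13 and later, Stream is deprecated in favor of LazyList, so this assumes an older compiler or a tolerated deprecation warning.

```scala
// Hypothetical helper for illustration. Inside a method body, s is a local
// variable, so `val s` would be rejected with the forward-reference error;
// `lazy val s` is accepted, and the Stream still memoizes its elements.
def firstPowersOfTwo(n: Int): List[Int] = {
  lazy val s: Stream[Int] = 1 #:: s.map(_ * 2)
  s.take(n).toList
}

val powers = firstPowersOfTwo(5)
println(powers) // List(1, 2, 4, 8, 16)
```

Because s is still a single value, each element is computed once, just as with the val version used as a class field.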

The following program demonstrates the problem with defining a self-referential Stream with def:

// Show the difference between the use of val and def with Streams.

object StreamTest extends App {

  def sum( p:(Int,Int) ) = { println( "sum " + p ); p._1 + p._2 }

  val fibs1: Stream[Int] = 0 #:: 1 #:: ( fibs1 zip fibs1.tail map sum )
  def fibs2: Stream[Int] = 0 #:: 1 #:: ( fibs2 zip fibs2.tail map sum )

  println("========== VAL ============")
  println( "----- Take 4:" ); fibs1 take 4 foreach println
  println( "----- Take 5:" ); fibs1 take 5 foreach println

  println("========== DEF ============")
  println( "----- Take 4:" ); fibs2 take 4 foreach println
  println( "----- Take 5:" ); fibs2 take 5 foreach println
}

Here is the output:

========== VAL ============
----- Take 4:
0
1
sum (0,1)
1
sum (1,1)
2
----- Take 5:
0
1
1
2
sum (1,2)
3
========== DEF ============
----- Take 4:
0
1
sum (0,1)
1
sum (0,1)
sum (1,1)
2
----- Take 5:
0
1
sum (0,1)
1
sum (0,1)
sum (1,1)
2
sum (0,1)
sum (0,1)
sum (1,1)
sum (1,2)
3

Notice that when we used val:

  • The "take 5" didn't recompute the values computed by the "take 4".
  • Computing the 4th value in the "take 4" didn't cause the 3rd value to be recomputed.

But neither of those is true when we use def. Every use of the Stream, including its own recursion, starts from scratch with a new Stream. Since producing the Nth value requires that we first produce the values for N-1 and N-2, each of which must produce its own two predecessors and so on, the number of calls to sum() required to produce a value grows much like the Fibonacci sequence itself: 0, 0, 1, 2, 4, 7, 12, 20, 33, .... And since all of those Streams are on the heap at the same time, we quickly run out of memory.
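The counts listed above appear to follow the recurrence c(n) = c(n-1) + c(n-2) + 1 with c(0) = c(1) = 0: one call for the new value, plus the calls needed to rebuild its two predecessors from scratch. That recurrence is my reading of the sequence, not something the program prints directly. The sketch below, using the hypothetical name sumCalls, only reproduces the arithmetic and builds no Streams.

```scala
// Hypothetical helper: models the number of sum() calls needed to produce the
// value at index n of the def-based Stream, assuming the recurrence
// c(n) = c(n-1) + c(n-2) + 1 with c(0) = c(1) = 0.
def sumCalls(n: Int): Long = {
  @annotation.tailrec
  def loop(i: Int, prev: Long, cur: Long): Long =
    if (i == 0) prev else loop(i - 1, cur, prev + cur + 1)
  loop(n, 0L, 0L)
}

val firstNine = (0 until 9).map(sumCalls)
println(firstNine.mkString(", ")) // 0, 0, 1, 2, 4, 7, 12, 20, 33
```

With val, by contrast, each new value costs exactly one call to sum(), because its two predecessors are already stored in the Stream.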

So given the poor performance and memory issues, you generally don't want to use def in creating a Stream.

But it might be that you actually do want a new Stream each time. Let's say that you want a Stream of random integers, and each time you access the Stream you want new integers, not a replay of previously computed integers. And those previously computed values, since you don't want to reuse them, would take up space on the heap needlessly. In that case it makes sense to use def so that you get a new Stream each time and don't hold on to it, so that it can be garbage-collected:

scala> val randInts = Stream.continually( util.Random.nextInt(100) )
randInts: scala.collection.immutable.Stream[Int] = Stream(1, ?)

scala> ( randInts take 1000 ).sum
res92: Int = 51535

scala> ( randInts take 1000 ).sum
res93: Int = 51535                   <== same answer as before, from saved values

scala> def randInts = Stream.continually( util.Random.nextInt(100) )
randInts: scala.collection.immutable.Stream[Int]

scala> ( randInts take 1000 ).sum
res94: Int = 49714

scala> ( randInts take 1000 ).sum
res95: Int = 48442                   <== different Stream, so new answer

Making randInts a method causes us to get a new Stream each time, so we get new values, and the Stream can be collected.

Notice that it only makes sense to use def here because new values don't depend on old values, so randInts is not defined in terms of itself. Stream.continually is an easy way to produce such Streams: you just tell it how to make a value and it makes a Stream for you.
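The same contrast can be made deterministic by swapping the random generator for a counter. This is only a sketch: the names calls, memoized, and fresh are hypothetical, and the counter stands in for util.Random so that the replay is easy to see.

```scala
// Counter standing in for the random generator (an assumption made for this
// sketch): every element that gets computed bumps the counter.
var calls = 0

// val: one Stream, so elements are computed once and replayed afterwards.
val memoized = Stream.continually { calls += 1; calls }
val replayed1 = memoized.take(3).toList
val replayed2 = memoized.take(3).toList // same elements, nothing recomputed

// def: a new Stream on every reference, so the elements are computed again.
def fresh = Stream.continually { calls += 1; calls }
val fresh1 = fresh.take(3).toList
val fresh2 = fresh.take(3).toList // a different Stream, so different elements

println(replayed1 == replayed2) // true
println(fresh1 == fresh2)       // false
```

Neither Stream here is defined in terms of itself, which is why the def version does not suffer the blow-up seen with fibs2.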
