How to serialize functions in Scala?

Posted 2019-04-15 11:35

I'm cutting my teeth on akka-persistence and ran into the quintessential problem of object serialization. My objects (shown below) contain basic types and functions. I read this, this and this, but none of them helped me make the following serializable.

Test Util

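import java.io.{File, FileInputStream, FileOutputStream, ObjectInputStream, ObjectOutputStream}
import java.nio.file.Files

object SerializationUtil {
  // Writes obj to a temp file using Java serialization and returns the file's path.
  def write(obj: Any): String = {
    val temp = Files.createTempFile(null, null).toFile
    temp.deleteOnExit()
    val out = new ObjectOutputStream(new FileOutputStream(temp))
    // Close the stream even if writeObject throws (e.g. NotSerializableException).
    try out.writeObject(obj) finally out.close()
    temp.getAbsolutePath
  }

  // Reads back an object written by write. The cast to T is unchecked.
  def read[T](file: String): T = {
    val in = new ObjectInputStream(new FileInputStream(new File(file)))
    try in.readObject().asInstanceOf[T] finally in.close()
  }
}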

Stats

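import java.time.temporal.ChronoUnit
import java.util.LongSummaryStatistics
import java.util.stream.Collectors.summarizingLong
import scala.collection.JavaConverters._

case class Stats(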
                  app: String,
                  unit: ChronoUnit,
                  private var _startupDurations: List[Long]
                ) {
  def startupDurations = _startupDurations.sorted

  def startupDurations_=(durations: List[Long]) = _startupDurations = durations

  @transient lazy val summary: LongSummaryStatistics = {
    _startupDurations.asJava.stream()
      .collect(summarizingLong(identity[Long]))
  }
}

Stats serializes just fine.

"SerializationUtil" should "(de)serialize Stats" in {
  val file = SerializationUtil.write(newStats())
  val state = SerializationUtil.read[Stats](file)

  verifyStats(state)
}

But this doesn't:

case class GetStatsForOneRequest(app: String, callback: Stats => Unit)

java.io.NotSerializableException: org.scalatest.Assertions$AssertionsHelper
    at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1184)

Also tried:

trait SerializableRunnable[T] extends scala.Serializable with ((T) => Unit)

and implementing the callback as an instance of SerializableRunnable, but no luck.

Ideas?

Edit:

Perhaps I should clarify the actual use case that is running into this issue to provide more context. The function is a callback from Akka HTTP route like the following:

path("stats") {
  logRequest("/stats") {
    completeWith(instanceOf[List[Stats]]) { callback =>
      requestHandler ! GetStatsRequest(callback)
    }
  }
}

The handler actor persists the request until it gets a response. It may take more than one response to construct the final output.

I did some digging and it appears that the callback implementation is a CallbackRunnable.

1 answer
贼婆χ
#2 · 2019-04-15 11:45

Maybe you didn't fully understand the linked articles. The problem with function serialization is that anything captured in the closure must also be serializable. What you need is Spores. Everything is explained there, but here is the gist:

What is a closure?

Lambda functions in Scala can refer to variables in the outer scope without explicitly listing them as parameters. A function that does this is called a closure, and the outer variables it refers to are said to be captured. For example, foo is captured in the closure passed to map below:

val foo = 42
List(1,2,3).map(_ + foo)
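To make the capture concrete, here is a minimal sketch of a closure over a local value surviving a Java serialization round trip. It assumes Scala 2.12 or later, where function literals compile to serializable lambdas; ClosureRoundTrip, roundTrip and makeAdder are hypothetical names used only for illustration.

```scala
import java.io.{ByteArrayInputStream, ByteArrayOutputStream, ObjectInputStream, ObjectOutputStream}

// Hypothetical helper object, not part of the question's code.
object ClosureRoundTrip {
  // Serializes obj to an in-memory buffer and reads it straight back.
  def roundTrip[T](obj: T): T = {
    val bytes = new ByteArrayOutputStream()
    val out = new ObjectOutputStream(bytes)
    try out.writeObject(obj) finally out.close()
    val in = new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray))
    try in.readObject().asInstanceOf[T] finally in.close()
  }

  // foo is a local value, so the closure captures only an Int.
  def makeAdder(): Int => Int = {
    val foo = 42
    (x: Int) => x + foo
  }
}
```

With these definitions, ClosureRoundTrip.roundTrip(ClosureRoundTrip.makeAdder())(1) evaluates to 43: the captured value travels with the function.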

Why is it a problem for serialization?

Looking at the example above, where foo is a primitive value, you wouldn't think that's a problem. But what happens when there is an enclosing class?

class C {
  val myDBconn = ...
  val foo = 42
  List(1,2,3).map(_ + foo)
}

Now (unexpectedly for many programmers) the closure captures this, the entire instance of the non-serializable enclosing class, including myDBconn, because foo refers to the getter method this.foo.
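The failure can be reproduced in a few lines. This is a sketch under assumptions: Enclosing is a hypothetical stand-in for class C, a plain Object plays the role of myDBconn, and trySerialize is an illustrative helper. The NotSerializableException for Assertions$AssertionsHelper in the question probably has the same cause (a callback defined inside a test class), but that is a guess from the stack trace.

```scala
import java.io.{ByteArrayOutputStream, NotSerializableException, ObjectOutputStream}

// Hypothetical stand-in for class C; conn plays the role of myDBconn.
class Enclosing {
  val conn = new Object // not Serializable
  val foo = 42
  // foo means this.foo here, so the lambda captures the whole Enclosing instance.
  val addFoo: Int => Int = x => x + foo
}

object CaptureDemo {
  // Returns the exception message if serialization fails, None if it succeeds.
  def trySerialize(obj: AnyRef): Option[String] = {
    val out = new ObjectOutputStream(new ByteArrayOutputStream())
    try {
      out.writeObject(obj)
      None
    } catch {
      case e: NotSerializableException => Some(e.getMessage)
    }
  }
}
```

CaptureDemo.trySerialize(new Enclosing().addFoo) returns a Some naming the Enclosing class, even though the function only ever uses an Int.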

What's the solution?

The solution is to not capture this in the closure. For example, creating a local val for each value we need to capture makes the function serializable again:

class C {
  val myDBconn = ...
  val foo = 42
  {
    val localFoo = foo
    List(1,2,3).map(_ + localFoo)
  }
}
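A sketch of the same fix checked with a serialization round trip, again assuming Scala 2.12 or later. EnclosingFixed, FixDemo and roundTrip are hypothetical names; the class still holds a non-serializable field, but the function no longer references it.

```scala
import java.io.{ByteArrayInputStream, ByteArrayOutputStream, ObjectInputStream, ObjectOutputStream}

// Hypothetical stand-in for class C with the local-val fix applied.
class EnclosingFixed {
  val conn = new Object // still not Serializable, but no longer captured
  val foo = 42
  val addFoo: Int => Int = {
    val localFoo = foo // copy the field into a local value first
    x => x + localFoo  // the lambda captures only localFoo, an Int
  }
}

object FixDemo {
  // Serializes obj to an in-memory buffer and reads it straight back.
  def roundTrip[T](obj: T): T = {
    val bytes = new ByteArrayOutputStream()
    val out = new ObjectOutputStream(bytes)
    try out.writeObject(obj) finally out.close()
    val in = new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray))
    try in.readObject().asInstanceOf[T] finally in.close()
  }
}
```

FixDemo.roundTrip(new EnclosingFixed().addFoo)(1) evaluates to 43, with no NotSerializableException.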

Of course, doing this manually is tedious, hence Spores.
