How does spray.routing.HttpService dispatch requests?

Disclaimer: I have no Scala experience yet, so my question concerns the very basics.

Consider the following example (it may be incomplete):

import akka.actor.{ActorSystem, Props}
import akka.io.IO
import spray.can.Http
import akka.pattern.ask
import akka.util.Timeout
import scala.concurrent.duration._
import akka.actor.Actor
import spray.routing._
import spray.http._

object Boot extends App {
  implicit val system = ActorSystem("my-actor-system")
  val service = system.actorOf(Props[MyActor], "my")
  implicit val timeout = Timeout(5.seconds)
  IO(Http) ? Http.Bind(service, interface = "localhost", port = 8080)
}

class MyActor extends Actor with MyService {
  def actorRefFactory = context

  def receive = runRoute(myRoute)
}

trait MyService extends HttpService {
  val myRoute =
    path("my") {
      post {
        complete {
          "PONG"
        }
      }
    }
}

My question is: what actually happens when control reaches the complete block? The question is probably too general, so let me split it up.

1. I see only a single actor being created in the example. Does that mean the application is single-threaded and uses only one CPU core?
2. What happens if I make a blocking call inside complete?
3. If point 1 is true and point 2 blocks, how do I dispatch requests so that all CPUs are used? I see two ways: actor per request and actor per connection. The second seems reasonable, but I cannot find a way to do it with the spray library.
4. If the previous question is irrelevant, will the detach directive do? And what about passing a function that returns a Future to the complete directive? What is the difference between detach and passing a function that returns a Future?
5. What is the proper way to configure the number of worker threads and to balance requests/connections?

It would be great if you could point me to explanations in the official documentation. It is very extensive and I believe I am missing something.

Thank you.

This was answered here by Mathias, one of the Spray authors. Copying his reply for reference:

In the end the only thing that really completes the request is a call to requestContext.complete. Thereby it doesn't matter which thread or Actor context this call is made from. All that matters is that it does happen within the configured "request-timeout" period. You can of course issue this call yourself in some way or another, but spray gives you a number of pre-defined constructs that maybe fit your architecture better than passing the actual RequestContext around. Mainly these are:

The complete directive, which simply provides some sugar on top of the "raw" ctx => ctx.complete(…) function literal.

The Future Marshaller, which calls ctx.complete from a future.onComplete handler.

The produce directive, which extracts a function T => Unit that can later be used to complete the request with an instance of a custom type.
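As a rough illustration of the first two constructs, here is a minimal sketch assuming spray 1.2 or later. The trait name PongRoutes, the paths "a", "b" and "c", and the choice of the actor system's dispatcher as ExecutionContext are my assumptions, not part of the original answer. All three routes end up calling ctx.complete in one way or another:

```scala
import scala.concurrent.{ExecutionContext, Future}
import spray.routing._

// Hypothetical trait; mix it into an actor the same way as MyService above.
trait PongRoutes extends HttpService {
  // The Future marshaller needs an ExecutionContext. Assumption: the
  // dispatcher of the surrounding actor system is good enough here.
  implicit def executionContext: ExecutionContext = actorRefFactory.dispatcher

  // 1. The complete directive.
  val viaDirective: Route = path("a") { complete("PONG") }

  // 2. The "raw" function literal that the directive is sugar for.
  val viaContext: Route = path("b") { ctx => ctx.complete("PONG") }

  // 3. The Future marshaller: the request is completed when the Future is.
  val viaFuture: Route = path("c") { complete(Future("PONG")) }

  val pongRoutes: Route = viaDirective ~ viaContext ~ viaFuture
}
```

In the third route the routing actor is free again as soon as the Future has been created; the response is sent from whichever thread completes the Future.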

Architecturally, in most cases, it's a good idea to not have the API layer "leak into" the core of your application. I.e. the application should not know anything about the API layer or HTTP. It should only deal with objects of its own domain model. Therefore passing the RequestContext directly to the application core is mostly not the best solution.

Resorting to the "ask" and relying on the Future Marshaller is an obvious, well understood and rather easy alternative. It comes with the (small) drawback that an ask comes with a mandatory timeout check itself which logically isn't required (since the spray-can layer already takes care of request timeouts). The timeout on the ask is required for technical reasons (so the underlying PromiseActorRef can be cleaned up if the expected reply never comes).
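A minimal sketch of the ask approach, assuming spray 1.2+ on Akka 2.3. The names Ping, CoreActor, AskService and core are hypothetical, and the 5 second timeout is simply copied from the question's Boot object:

```scala
import akka.actor.{Actor, ActorRef}
import akka.pattern.ask
import akka.util.Timeout
import scala.concurrent.ExecutionContext
import scala.concurrent.duration._
import spray.routing._

// Hypothetical domain message and core actor; neither knows about HTTP.
case object Ping

class CoreActor extends Actor {
  def receive = { case Ping => sender ! "PONG" }
}

trait AskService extends HttpService {
  // Assumption: the application core is reachable through this ActorRef.
  def core: ActorRef

  implicit def executionContext: ExecutionContext = actorRefFactory.dispatcher

  // The ask needs its own timeout, independent of spray's request-timeout.
  implicit val askTimeout: Timeout = Timeout(5.seconds)

  val askRoute: Route =
    path("my") {
      post {
        // The Future marshaller completes the request once the reply arrives.
        complete((core ? Ping).mapTo[String])
      }
    }
}
```

If the ask timeout fires first, the Future fails and the failure is handed to the route's exception handling; if spray's request-timeout fires first, the client already has a timeout response by the time the reply arrives.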

Another alternative to passing the RequestContext around is the produce directive (e.g. produce(instanceOf[Foo]) { completer => …). It extracts a function that you can pass on to the application core. When your core logic calls completer(foo) the completion logic is run and the request completed. Thereby the application core remains decoupled from the API layer and the overhead is minimal. The drawbacks of this approach are twofold: first, the completer function is not serializable, so you cannot use this approach across JVM boundaries. Second, the completion logic now runs directly in an actor context of the application core, which might change runtime behavior in unwanted ways if the Marshaller[Foo] has to do non-trivial tasks.
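The produce approach could look roughly like the sketch below. To avoid having to write a custom marshaller, String stands in for the Foo of the answer; Core, fetchGreeting and ProduceService are hypothetical names, and the core is reduced to a plain object that replies from a Future:

```scala
import scala.concurrent.{ExecutionContext, Future}
import spray.routing._

// Hypothetical application core: it only sees a String => Unit callback
// and has no dependency on spray or on the RequestContext.
object Core {
  def fetchGreeting(reply: String => Unit)(implicit ec: ExecutionContext): Unit =
    Future { reply("PONG") }
}

trait ProduceService extends HttpService {
  implicit def executionContext: ExecutionContext = actorRefFactory.dispatcher

  val produceRoute: Route =
    path("my") {
      post {
        // produce extracts the completer; calling it marshals the value
        // and completes the request, on whatever thread the core uses.
        produce(instanceOf[String]) { completer => ctx =>
          Core.fetchGreeting(completer)
        }
      }
    }
}
```

Note that the marshalling runs on the thread that calls the completer, which is exactly the second drawback described above.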

A third alternative is to spawn a per-request actor in the API layer and have it handle the response coming back from the application core. Then you do not have to use an ask. Still, you end up with the same problem that the PromiseActorRef underlying an ask has: how to clean up if no response ever comes back from the application core? With a per-request actor you have full freedom to implement a solution for this question. However, if you decide to rely on a timeout (e.g. via context.setReceiveTimeout) the benefits over an "ask" might be non-existent.
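A per-request actor might be sketched as follows, again assuming spray 1.2+ on Akka 2.3. PerRequest, PerRequestService, the "PING" message, the 2 second receive timeout and the choice of 504 Gateway Timeout as the fallback response are all my assumptions:

```scala
import akka.actor.{Actor, ActorRef, Props, ReceiveTimeout}
import scala.concurrent.duration._
import spray.http.StatusCodes
import spray.routing._

// Hypothetical per-request actor: sends one message to the core, completes
// the request with the first String reply, then stops itself.
class PerRequest(ctx: RequestContext, core: ActorRef, message: Any) extends Actor {
  // Assumed value; it should stay below spray's configured request-timeout.
  context.setReceiveTimeout(2.seconds)
  core ! message

  def receive = {
    case reply: String =>
      ctx.complete(reply)
      context.stop(self)
    case ReceiveTimeout =>
      // Clean-up path for the case that the core never answers.
      ctx.complete(StatusCodes.GatewayTimeout)
      context.stop(self)
  }
}

trait PerRequestService extends HttpService {
  // Assumption: the application core is reachable through this ActorRef.
  def core: ActorRef

  val perRequestRoute: Route =
    path("my") {
      post { ctx =>
        // One short-lived actor per request; the route itself returns at once.
        actorRefFactory.actorOf(Props(new PerRequest(ctx, core, "PING")))
      }
    }
}
```

The core replies to the per-request actor as its sender, so it still deals only with its own messages and never sees the RequestContext.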

Which of the described solutions best fits your architecture is something you need to decide yourself. However, as I hopefully was able to show, you do have a couple of alternatives to choose from.

To answer some of your specific questions: there is only a single actor/handler that services the route, so if you make it block, Spray will block. This means you want to either complete the route immediately or dispatch the work using any of the three options above.

There are many examples on the web for these three options. The easiest is to wrap your code in a Future. Also check the "actor per request" option. In the end, your architecture will determine the most appropriate way to go.
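For the blocking-call case from the question, a sketch of the "wrap it in a Future" option next to the detach directive, assuming spray 1.2+ where detach accepts an ExecutionContext. BlockingService, blockingEc, slowPong and the two paths are hypothetical names:

```scala
import scala.concurrent.{ExecutionContext, Future}
import spray.routing._

trait BlockingService extends HttpService {
  // Assumption: supplied by the implementor, ideally a pool dedicated to
  // blocking work rather than the dispatcher the routing actor runs on.
  implicit def blockingEc: ExecutionContext

  // Hypothetical stand-in for a blocking call (JDBC, file I/O, ...).
  def slowPong(): String = { Thread.sleep(100); "PONG" }

  val blockingRoute: Route =
    path("future") {
      post {
        // Only the computation runs in the Future; the request is
        // completed by the Future marshaller when the result is ready.
        complete(Future(slowPong()))
      }
    } ~
    path("detached") {
      post {
        // The whole inner route runs on blockingEc.
        detach(blockingEc) {
          complete(slowPong())
        }
      }
    }
}
```

As far as I can tell, the practical difference is scope: detach moves everything nested inside it off the routing actor, whereas a Future passed to complete moves only the expression that produces the value. In both cases the routing actor can pick up the next request right away.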

Finally, Spray runs on top of Akka, so all Akka configuration still applies. See the HOCON files reference.conf and application.conf for the actor threading (dispatcher) settings.
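As a hedged illustration of what such settings can look like, the sketch below builds the configuration inline with ConfigFactory.parseString; normally the same keys would live in application.conf. The key names follow Akka 2.3 (the version spray 1.3 targets). All numbers, the dispatcher name blocking-dispatcher and the object name ThreadingConfig are assumptions to be tuned for the actual workload:

```scala
import akka.actor.ActorSystem
import com.typesafe.config.{Config, ConfigFactory}
import scala.concurrent.ExecutionContext

object ThreadingConfig {
  val config: Config = ConfigFactory.parseString("""
    # Threads available to actors on the default dispatcher.
    akka.actor.default-dispatcher.fork-join-executor {
      parallelism-min = 2
      parallelism-factor = 1.0
      parallelism-max = 8
    }
    # Hypothetical separate pool for blocking work.
    blocking-dispatcher {
      type = Dispatcher
      executor = "thread-pool-executor"
      thread-pool-executor {
        core-pool-size-min = 8
        core-pool-size-max = 8
      }
    }
    # The "request-timeout" mentioned in the answer.
    spray.can.server.request-timeout = 20 s
  """).withFallback(ConfigFactory.load())

  def newSystem(): ActorSystem = ActorSystem("my-actor-system", config)

  // A dispatcher is an ExecutionContext, so it can back Futures or detach.
  def blockingEc(system: ActorSystem): ExecutionContext =
    system.dispatchers.lookup("blocking-dispatcher")
}
```

With a layout like this, a blocked thread in the separate pool does not take a thread away from the routing actor.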