Cats-effect and asynchronous IO specifics

For few days I have been wrapping my head around cats-effect and IO. And I feel I have some misconceptions about this effect or simply I missed its point.

First of all - if IO can replace Scala's Future, how can we create an async IO task? Using IO.shift? Using IO.async? Is IO.delay sync or async? Can we make a generic async task with code like this Async[F].delay(...)? Or async happens when we call IO with unsafeToAsync or unsafeToFuture?
What's the point of Async and Concurrent in cats-effect? Why they are separated?
Is IO a green thread? If yes, why is there a Fiber object in cats-effect? As I understand the Fiber is the green thread, but docs claim we can think of IOs as green threads.

I would appreciate some clarifing on any of this as I have failed comprehending cats-effect docs on those and internet was not that helpfull...

if IO can replace Scala's Future, how can we create an async IO task

First, we need to clarify what is meant as an async task. Usually async means "does not block the OS thread", but since you're mentioning Future, it's a bit blurry. Say, if I wrote:

Future { (1 to 1000000).foreach(println) }

it would not be async, as it's a blocking loop and blocking output, but it would potentially execute on a different OS thread, as managed by an implicit ExecutionContext. The equivalent cats-effect code would be:

for {
  _ <- IO.shift
  _ <- IO.delay { (1 to 1000000).foreach(println) }
} yield ()

(it's not the shorter version)

So,

IO.shift is used to maybe change thread / thread pool. Future does it on every operation, but it's not free performance-wise.
IO.delay { ... } (a.k.a. IO { ... }) does NOT make anything async and does NOT switch threads. It's used to create simple IO values from synchronous side-effecting APIs

Now, let's get back to true async. The thing to understand here is this:

Every async computation can be represented as a function taking callback.

Whether you're using API that returns Future or Java's CompletableFuture, or something like NIO CompletionHandler, it all can be converted to callbacks. This is what IO.async is for: you can convert any function taking callback to an IO. And in case like:

for {
  _ <- IO.async { ... }
  _ <- IO(println("Done"))
} yield ()

Done will be only printed when (and if) the computation in ... calls back. You can think of it as blocking the green thread, but not OS thread.

So,

IO.async is for converting any already asynchronous computation to IO.
IO.delay is for converting any completely synchronous computation to IO.
The code with truly asynchronous computations behaves like it's blocking a green thread.

The closest analogy when working with Futures is creating a scala.concurrent.Promise and returning p.future.

Or async happens when we call IO with unsafeToAsync or unsafeToFuture?

Sort of. With IO, nothing happens unless you call one of these (or use IOApp). But IO does not guarantee that you would execute on a different OS thread or even asynchronously unless you asked for this explicitly with IO.shift or IO.async.

You can guarantee thread switching any time with e.g. (IO.shift *> myIO).unsafeRunAsyncAndForget(). This is possible exactly because myIO would not be executed until asked for it, whether you have it as val myIO or def myIO.

You cannot magically transform blocking operations into non-blocking, however. That's not possible neither with Future nor with IO.

What's the point of Async and Concurrent in cats-effect? Why they are separated?

Async and Concurrent (and Sync) are type classes. They are designed so that programmers can avoid being locked to cats.effect.IO and can give you API that supports whatever you choose instead, such as monix Task or Scalaz 8 ZIO, or even monad transformer type such as OptionT[Task, *something*]. Libraries like fs2, monix and http4s make use of them to give you more choice of what to use them with.

Concurrent adds extra things on top of Async, most important of them being .cancelable and .start. These do not have a direct analogy with Future, since that does not support cancellation at all.

.cancelable is a version of .async that allows you to also specify some logic to cancel the operation you're wrapping. A common example is network requests - if you're not interested in results anymore, you can just abort them without waiting for server response and don't waste any sockets or processing time on reading the response. You might never use it directly, but it has it's place.

But what good are cancelable operations if you can't cancel them? Key observation here is that you cannot cancel an operation from within itself. Somebody else has to make that decision, and that would happen concurrently with the operation itself (which is where the type class gets its name). That's where .start comes in. In short,

.start is an explicit fork of a green thread.

Doing someIO.start is akin to doing val t = new Thread(someRunnable); t.start(), except it's green now. And Fiber is essentially a stripped down version of Thread API: you can do .join, which is like Thread#join(), but it does not block OS thread; and .cancel, which is safe version of .interrupt().

Note that there are other ways to fork green threads. For example, doing parallel operations:

val ids: List[Int] = List.range(1, 1000)
def processId(id: Int): IO[Unit] = ???
val processAll: IO[Unit] = ids.parTraverse_(processId)

will fork processing all IDs to green threads and then join them all. Or using .race:

val fetchFromS3: IO[String] = ???
val fetchFromOtherNode: IO[String] = ???

val fetchWhateverIsFaster = IO.race(fetchFromS3, fetchFromOtherNode).map(_.merge)

will execute fetches in parallel, give you first result completed and automatically cancel the fetch that is slower. So, doing .start and using Fiber is not the only way to fork more green threads, just the most explicit one. And that answers:

Is IO a green thread? If yes, why is there a Fiber object in cats-effect? As I understand the Fiber is the green thread, but docs claim we can think of IOs as green threads.

IO is like a green thread, meaning you can have lots of them running in parallel without overhead of OS threads, and the code in for-comprehension behaves as if it was blocking for the result to be computed.
Fiber is a tool for controlling green threads explicitly forked (waiting for completion or cancelling).