I am running Orleans in localhost clustering mode and currently have one grain and a client.

// client code
for (int i = 0; i < num_scan; ++i)
{
    Console.WriteLine("client " + i);
    // I expected the call below to return once the first await in foo() is hit,
    // but it doesn't work like that
    grain.foo(i);
}

// grain code
async Task foo(int i)
{
    Console.WriteLine("grain " + i);
    await Task.Delay(2000);
}

The output of this was as below:

client 0
client 1
client 2
client 3
client 4
client 5
client 6
grain 0
client 7
client 8
client 9
client 10
grain 8
grain 7
.
.

In normal C#, an async method returns to its caller only when it hits its first await. If that were the case here, the grain output should have been consecutive. As we can see above, the grain output is out of order, so the Task is being returned before the await statement is hit. My question is: what is the difference between a method call in Orleans and one in normal C#?

I saw this post, which asks a similar question, and the replies suggest that the two kinds of method call differ because in Orleans we call through an interface. I would like to know when the method call returns in Orleans.

PS: I tried the above code with await grain.foo(i) and it prints the grain output in order. The problem with that approach is that await returns only when the entire foo() completes, whereas I want the call to return when it hits the await statement.

I'll answer in two parts:

1. Why it is not desirable to block until the first `await` on some remote call
2. How what you are seeing is what you should expect

From the outset: Orleans is normal C#, but the assumptions about how C# works in this case are missing some details (which are explained below). Orleans is designed for scalable, distributed systems. There is a basic assumption that if you call a method on some grain, that grain might be currently activated on a separate machine. Even if it is on the same machine, each grain runs asynchronously to other grains, often on a separate thread.

Why it is not desirable to block until the first `await` on some remote call

If one machine calls another machine, that takes some time (e.g., because of the network). So if you have a thread on one machine calling into an object on another, and you want to block that thread until an await statement is reached within that object, then you're blocking that thread for a significant amount of time. The thread would have to wait for the network message to arrive on the remote machine, for the request to be scheduled on the remote grain activation, for the grain to execute until the first await, and then for the remote machine to send a message back over the network to the first machine to say "hey, the first await was hit".

Blocking threads like that is not a scalable approach because the CPU is either idle while the thread is blocked or many (expensive) threads must be created in order to keep the CPU busy processing requests. Each thread has a cost in terms of pre-allocated stack space and other data structures, and switching between threads has a cost for the CPU.

So, hopefully it is clear now why it would not be desirable to block the calling thread until the remote grain hits its first await. Now, let's see why the thread is not blocked in Orleans.

How what you are seeing is what you should expect

Consider that your grain object is not an instance of the grain implementation class that you wrote; it is instead a 'grain reference'.

You create that grain object by using something like the following code:

var grain = grainFactory.GetGrain<IMyGrainInterface>("guest@myservice.com");

The object you get back from GetGrain is a grain reference. It implements IMyGrainInterface, but it is not an instance of the grain class that you wrote. Instead, it is an instance of a class which Orleans generates for you. This class is a representation of the remote grain which you want to call: a reference to it.

So when you write some code like:

grain.foo(i);

what happens is the generated class calls into the Orleans runtime to make the foo request to the remote grain activation.

As an example, here's what the generated code might actually look like:

public Task foo(int i)
{
    return base.InvokeMethodAsync(118718866, new object[]{ i });
}

Those details are hidden from you, but you can go and find them if you look under the obj directory in your project.

So you can see that there is actually no await in the generated foo method at all! It simply asks the Orleans runtime to invoke a method with some weird integer and some object array.

On the remote end, a similar kind of generated class does the reverse: it takes your request and turns it into a direct method call on the actual grain code that you wrote. In the remote system, the thread will execute up to the first await in your grain code and then yield execution back to the scheduler, just like in "normal C#".
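To make the difference concrete, here is a minimal sketch in plain C# with no Orleans dependency. Everything in it is made up for illustration: FooGrain stands in for your grain class, FooGrainReference for the generated grain reference, and FakeRuntime for the runtime and network, reduced to a queue that only delivers messages when asked. It is not how Orleans is implemented; it only shows that a direct call runs the body up to the first await before returning, while a call through a proxy returns a Task before any grain code has run.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

// Stand-in for the grain class you write (hypothetical, not an Orleans type).
public class FooGrain
{
    private readonly List<string> _log;
    public FooGrain(List<string> log) { _log = log; }

    public async Task foo(int i)
    {
        _log.Add("grain " + i);   // runs synchronously on a direct call
        await Task.Delay(100);    // a direct call returns to its caller here
    }
}

// Stand-in for the runtime plus the network: requests sit in a queue
// until DeliverAllAsync is called.
public class FakeRuntime
{
    private readonly Queue<Func<Task>> _pending = new Queue<Func<Task>>();

    public Task Send(Func<Task> invoke)
    {
        var tcs = new TaskCompletionSource<bool>();
        _pending.Enqueue(async () =>
        {
            try { await invoke(); tcs.SetResult(true); }
            catch (Exception e) { tcs.SetException(e); }
        });
        return tcs.Task;          // handed back before any grain code runs
    }

    public async Task DeliverAllAsync()
    {
        while (_pending.Count > 0)
            await _pending.Dequeue()();   // one request at a time
    }
}

// Stand-in for the generated grain reference: note there is no await here.
public class FooGrainReference
{
    private readonly FakeRuntime _runtime;
    private readonly FooGrain _target;

    public FooGrainReference(FakeRuntime runtime, FooGrain target)
    {
        _runtime = runtime;
        _target = target;
    }

    public Task foo(int i)
    {
        return _runtime.Send(() => _target.foo(i));
    }
}

public static class Demo
{
    public static async Task<List<string>> RunAsync()
    {
        var log = new List<string>();
        var grain = new FooGrain(log);

        // Direct call: the body runs up to its first await before returning.
        Task direct = grain.foo(0);
        log.Add("client after direct call");
        await direct;

        // Call through the proxy: only a message is queued.
        var runtime = new FakeRuntime();
        var reference = new FooGrainReference(runtime, grain);
        Task viaReference = reference.foo(1);
        log.Add("client after reference call");

        await runtime.DeliverAllAsync();  // the "remote" side runs now
        await viaReference;
        return log;
    }

    public static async Task Main()
    {
        foreach (var line in await RunAsync())
            Console.WriteLine(line);
    }
}
```

Running this prints "grain 0", "client after direct call", "client after reference call", "grain 1", in that order. In the real system the queue is drained concurrently with your client loop, which is why your "client" and "grain" lines interleave.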

Aside: in RPC terms, a grain reference is roughly equivalent to a proxy object, i.e., an object which represents the remote object. The same code written for a traditional RPC framework like WCF or gRPC would behave in the same way as Orleans: when a client calls an asynchronous method on a server, the calling thread is not held until the server reaches its first await.
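As for the PS in the question: if the goal is to issue all the calls without waiting for each one to finish, one common pattern in plain C# is to keep the returned Tasks and await them together, so that failures are still observed. The sketch below assumes a hypothetical IFooGrain interface and an in-process LocalFooGrain stand-in so that it runs without a cluster; with Orleans you would pass the grain reference instead.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

// Hypothetical interface standing in for your grain interface.
public interface IFooGrain
{
    Task foo(int i);
}

// In-process stand-in so the sketch runs without Orleans.
public class LocalFooGrain : IFooGrain
{
    public async Task foo(int i)
    {
        await Task.Delay(50);
        Console.WriteLine("grain " + i);
    }
}

public static class Client
{
    // Starts every call first, then waits for all of them to complete.
    public static async Task<int> CallAllAsync(IFooGrain grain, int numScan)
    {
        var pending = new List<Task>();
        for (int i = 0; i < numScan; ++i)
        {
            Console.WriteLine("client " + i);
            pending.Add(grain.foo(i));   // starts the call, does not wait for it
        }

        // Completes when every call has finished; rethrows if any call failed.
        await Task.WhenAll(pending);
        return pending.Count;
    }
}

public static class Program
{
    public static async Task Main()
    {
        int completed = await Client.CallAllAsync(new LocalFooGrain(), 5);
        Console.WriteLine("completed " + completed + " calls");
    }
}
```

The loop itself never waits on the grain, so the client lines are printed immediately. The order in which the grain lines appear is not fixed by this code, and with a real grain it is decided by the runtime.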