How can I make `await ...` work with `yield return` (i.e. inside an iterator method)?

Posted 2019-07-03 10:50

I have existing code similar to the following:

IEnumerable<SomeClass> GetStuff()
{
    using (SqlConnection conn = new SqlConnection(connectionString))
    using (SqlCommand cmd = new SqlCommand(sql, conn))
    {
        conn.Open();
        SqlDataReader reader = cmd.ExecuteReader();
        while (reader.Read())
        {
            SomeClass someClass = f(reader); // create instance based on returned row
            yield return someClass;
        }
    } 
}

It looks like I could benefit from using reader.ReadAsync(). However, if I modify just that one line:

        while (await reader.ReadAsync())

the compiler tells me that await can only be used in methods marked async, and suggests that I change the method signature to:

async Task<IEnumerable<SomeClass>> GetStuff()

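However, doing so makes GetStuff() unusable, because:

The body of GetStuff() cannot be an iterator block because Task<IEnumerable<SomeClass>> is not an iterator interface type.

I'm sure I am missing a key concept of the async programming model.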

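Questions:

  • Can I use ReadAsync() in my iterator? How?
  • How can I think about the async paradigm differently, so that I understand how it works in this kind of situation?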

Answer 1:

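The problem is that what you're asking for doesn't actually make much sense. IEnumerable<T> is a synchronous interface, and returning Task<IEnumerable<T>> isn't going to help you much, because some thread would have to block waiting for each item, no matter what.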

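What you actually want to return is some asynchronous alternative to IEnumerable<T>: something like IObservable<T>, a dataflow block from TPL Dataflow, or IAsyncEnumerable<T>, which is planned to be added to C# 8.0 / .NET Core 3.0. (In the meantime, there are some libraries that already contain it.)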
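For readers on C# 8.0 / .NET Core 3.0 or later, here is a minimal sketch of what the IAsyncEnumerable<T> option can look like. It is not code from this answer: ReadRowAsync is a hypothetical stand-in for "await reader.ReadAsync()" followed by f(reader), and Task.Delay simulates the database latency so that the example runs without a SQL Server.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class AsyncIteratorSketch
{
    // Hypothetical stand-in for reading one row asynchronously and mapping it.
    static async Task<int> ReadRowAsync(int i)
    {
        await Task.Delay(10); // simulate I/O latency
        return i * i;
    }

    // await and yield return in the same method: allowed when the
    // return type is IAsyncEnumerable<T>.
    public static async IAsyncEnumerable<int> GetStuffAsync(int rowCount)
    {
        for (int i = 0; i < rowCount; i++)
        {
            yield return await ReadRowAsync(i);
        }
    }

    public static async Task Main()
    {
        // The consumer awaits each item as it becomes available.
        await foreach (int item in GetStuffAsync(3))
        {
            Console.WriteLine(item); // prints 0, 1, 4 on separate lines
        }
    }
}
```

With a real reader, the loop in GetStuffAsync would presumably become "while (await reader.ReadAsync()) yield return f(reader);" inside the using blocks.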

Using TPL Dataflow, one way to do this would be:

ISourceBlock<SomeClass> GetStuff() {
    var block = new BufferBlock<SomeClass>();

    Task.Run(async () =>
    {
        using (SqlConnection conn = new SqlConnection(connectionString))
        using (SqlCommand cmd = new SqlCommand(sql, conn))
        {
            await conn.OpenAsync();
            SqlDataReader reader = await cmd.ExecuteReaderAsync();
            while (await reader.ReadAsync())
            {
                // Create an instance of SomeClass based on the returned row
                // (f is the mapping function from the question).
                SomeClass someClass = f(reader);
                block.Post(someClass);
            }
            block.Complete();
        } 
    });

    return block;
}

You'll probably want to add error handling to the above code, but otherwise it should work, and it will be completely asynchronous.

The rest of your code would then consume items from the returned block, also asynchronously, probably using an ActionBlock.
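As a rough illustration of the consuming side, the sketch below links a source block to an ActionBlock and awaits its completion. The producer here is a hypothetical stand-in for GetStuff() that posts three strings instead of reading from SQL Server. It also shows one possible way to add error handling: faulting the block so the exception reaches the consumer. Depending on the target framework, a reference to the System.Threading.Tasks.Dataflow package may be needed.

```csharp
using System;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;

public static class DataflowConsumerSketch
{
    // Hypothetical producer standing in for GetStuff(): posts three items, then completes.
    public static ISourceBlock<string> GetStuff()
    {
        var block = new BufferBlock<string>();

        Task.Run(async () =>
        {
            try
            {
                for (int i = 0; i < 3; i++)
                {
                    await Task.Delay(10); // simulate an asynchronous row read
                    block.Post("row " + i);
                }
                block.Complete();
            }
            catch (Exception ex)
            {
                // Put the block into a faulted state so consumers observe the error.
                ((IDataflowBlock)block).Fault(ex);
            }
        });

        return block;
    }

    public static async Task Main()
    {
        var printer = new ActionBlock<string>(item => Console.WriteLine(item));

        // PropagateCompletion forwards both completion and faults to the target.
        GetStuff().LinkTo(printer, new DataflowLinkOptions { PropagateCompletion = true });

        // Completes when every item has been processed; rethrows if the source faulted.
        await printer.Completion;
    }
}
```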



Answer 2:

No, you can't currently use async with an iterator block. As svick says, you would need something like IAsyncEnumerable to do that.

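If you have the return value Task<IEnumerable<SomeClass>>, it means that the function returns a single Task object which, once completed, will provide you with a fully formed IEnumerable (there is no room for Task asynchrony within this enumerable). Once the Task object has completed, the caller should be able to iterate synchronously through all the items it returned in the enumerable.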

Here is a solution that returns Task<IEnumerable<SomeClass>>. You can get a large part of the benefit of async by doing something like this:

async Task<IEnumerable<SomeClass>> GetStuff()
{
    using (SqlConnection conn = new SqlConnection(""))
    {
        using (SqlCommand cmd = new SqlCommand("", conn))
        {
            await conn.OpenAsync();
            SqlDataReader reader = await cmd.ExecuteReaderAsync();
            return ReadItems(reader).ToArray();
        }
    }
}

IEnumerable<SomeClass> ReadItems(SqlDataReader reader)
{
    while (reader.Read())
    {
        // Create an instance of SomeClass based on row returned.
        SomeClass someClass = null;
        yield return someClass;
    }
}

...and an example of usage:

async void Caller()
{
    // Calls get-stuff, which returns immediately with a Task
    Task<IEnumerable<SomeClass>> itemsAsync = GetStuff();
    // Wait for the task to complete so we can get the items
    IEnumerable<SomeClass> items = await itemsAsync;
    // Iterate synchronously through the items which are all already present
    foreach (SomeClass item in items)
    {
        Console.WriteLine(item);
    }
}

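Here the iterator portion and the async portion are in separate functions, which allows you to use both the async and the yield syntax. The GetStuff function acquires the data asynchronously, and ReadItems then reads the data synchronously into an enumerable.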

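Note the ToArray() call. Something like this is necessary because the enumerator function executes lazily, so your async function might otherwise dispose of the connection and command before all the data has been read. This is because the using blocks cover the duration of the Task execution, but you would be iterating over the result after the task has completed.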
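The effect of lazy execution can be reproduced without a database. The sketch below uses FakeReader, a made-up disposable class that throws once disposed (an assumption for illustration, not how SqlDataReader is implemented), to contrast returning the iterator directly with materializing it via ToArray() inside the using block.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical reader with three rows; it refuses to read after Dispose().
public sealed class FakeReader : IDisposable
{
    private int _row;
    private bool _disposed;

    public bool Read()
    {
        if (_disposed) throw new ObjectDisposedException(nameof(FakeReader));
        return ++_row <= 3;
    }

    public int Current => _row;

    public void Dispose() => _disposed = true;
}

public static class LazyIteratorSketch
{
    static IEnumerable<int> ReadItems(FakeReader reader)
    {
        while (reader.Read())
            yield return reader.Current;
    }

    // Returns the iterator unevaluated: the reader is disposed before any row is read.
    public static IEnumerable<int> GetLazy()
    {
        using (var reader = new FakeReader())
            return ReadItems(reader);
    }

    // ToArray() forces all rows to be read while the reader is still alive.
    public static IEnumerable<int> GetEager()
    {
        using (var reader = new FakeReader())
            return ReadItems(reader).ToArray();
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(",", GetEager())); // prints 1,2,3

        try
        {
            foreach (int item in GetLazy())
                Console.WriteLine(item);
        }
        catch (ObjectDisposedException)
        {
            Console.WriteLine("reader was disposed before enumeration started");
        }
    }
}
```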

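The GetStuff solution above does not use ReadAsync, but it does use OpenAsync and ExecuteReaderAsync, which probably give you most of the benefit. In my experience, it is ExecuteReader that takes the most time and benefits the most from being async. By the time I've read the first row, the SqlDataReader already has all the other rows, and ReadAsync just returns synchronously. If this is the case for you as well, then you won't gain much by moving to a push-based system like IObservable<T> (which would require significant modifications to the calling function).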

For illustration, consider an alternative approach to the same problem:

IEnumerable<Task<SomeClass>> GetStuff()
{
    using (SqlConnection conn = new SqlConnection(""))
    {
        using (SqlCommand cmd = new SqlCommand("", conn))
        {
            conn.Open();
            SqlDataReader reader = cmd.ExecuteReader();
            while (true)
                yield return ReadItem(reader);
        }
    }
}

async Task<SomeClass> ReadItem(SqlDataReader reader)
{
    if (await reader.ReadAsync())
    {
        // Create an instance of SomeClass based on row returned.
        SomeClass someClass = null;
        return someClass;
    }
    else
        return null; // Mark end of sequence
}

...and an example of usage:

async void Caller()
{
    // Synchronously get a list of Tasks
    IEnumerable<Task<SomeClass>> items = GetStuff();
    // Iterate through the Tasks
    foreach (Task<SomeClass> itemAsync in items)
    {
        // Wait for the task to complete. We need to wait for 
        // it to complete before we can know if it's the end of
        // the sequence
        SomeClass item = await itemAsync;
        // End of sequence?
        if (item == null) 
            break;
        Console.WriteLine(item);
    }
}

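In this case, GetStuff returns immediately with an enumerable, where each item in the enumerable is a task that will present a SomeClass object when it completes. This approach has a few flaws. Firstly, the enumerable returns synchronously, so at the time it returns we don't actually know how many rows are in the result, which is why I made it an infinite sequence. This is perfectly legal, but it has some side effects: I needed to use null to signal the end of useful data in the infinite sequence of tasks. Secondly, you have to be careful about how you iterate it. You need to iterate it forwards, and you need to wait for each row before iterating to the next row. You must also dispose of the iterator only after all the tasks have completed, so that the GC doesn't collect the connection before it has finished being used. For these reasons this is not a safe solution, and I must emphasize that I'm including it only for illustration, to help answer your second question.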



Answer 3:

Speaking strictly to async iterators (or their possibility) within the context of a SqlCommand: in my experience, the synchronous version of the code vastly outperforms its async counterpart, in both speed and memory consumption.

Perhaps, take this observation with a grain of salt as the scope of the testing was limited to my machine and local SQL Server instance.

Don't get me wrong, the async/await paradigm within the .NET environment is phenomenally simple, powerful and useful given the right circumstances. After much toiling however, I'm not convinced database access is a proper use case for it. Unless of course you're needing to execute several commands simultaneously, in which case you can simply use TPL to fire off the commands in unison.
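As a rough sketch of what firing off several commands at once with the TPL might look like, the example below starts three synchronous operations with Task.Run and waits for all of them with Task.WhenAll. RunCommand is a hypothetical placeholder (Thread.Sleep stands in for the time spent in a synchronous ExecuteReader call), and in real code each command would presumably need its own connection.

```csharp
using System;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class ParallelCommandsSketch
{
    // Hypothetical synchronous command; the sleep simulates query execution time.
    public static int RunCommand(int id)
    {
        Thread.Sleep(50);
        return id * 10;
    }

    public static async Task Main()
    {
        // Start all commands on thread-pool threads without waiting in between.
        Task<int>[] tasks = Enumerable.Range(1, 3)
            .Select(id => Task.Run(() => RunCommand(id)))
            .ToArray();

        // WhenAll returns the results in the same order as the input tasks.
        int[] results = await Task.WhenAll(tasks);
        Console.WriteLine(string.Join(",", results)); // prints 10,20,30
    }
}
```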

My preferred approach instead is to keep the following considerations in mind:

  • Keep the units of SQL work small, simple and compose-able (i.e. make your SQL executions "cheap").
  • Avoid doing work on the SQL Server that can be pushed upstream to the app level. A perfect example of this is sorting.
  • Most importantly, test your SQL code at scale and review the Statistics IO output and execution plan. A query that runs quickly at 10k records may (and probably will) behave entirely differently when there are 1M records.

You could make the argument that in certain reporting scenarios, some of the above requirements just aren't possible. However, in the context of reporting services is asynchronous-ity (is that even a word?) really needed?

There's a fantastic article by Microsoft evangelist Rick Anderson about this very topic. Mind you it's old (from 2009) but still very relevant.



Source: How can I make `await …` work with `yield return` (i.e. inside an iterator method)?