Best practices to parallelize using async workflow

Let's say I wanted to scrape a webpage and extract some data. I'd most likely write something like this:

open System.Net

let getAllHyperlinks (url: string) =
    async {
        let req = WebRequest.Create(url)
        let! rsp = req.AsyncGetResponse()                        // asynchronous wait for the response
        use stream = rsp.GetResponseStream()                     // depends on rsp
        use reader = new System.IO.StreamReader(stream)          // depends on stream
        let! data = reader.ReadToEndAsync() |> Async.AwaitTask   // depends on reader
        return extractAllUrls data                               // depends on data
    }

The let! tells F# to execute the code in another thread, then bind the result to a variable, and continue processing. The sample above uses two let! statements: one to get the response and one to read all the data, so it spawns at least two threads (please correct me if I'm wrong).

Although the workflow above spawns several threads, the order of execution is serial because each item in the workflow depends on the previous item. It's not really possible to evaluate any items further down the workflow until the other threads return.

Is there any benefit to having more than one let! in the code above?

If not, how would this code need to change to take advantage of multiple let! statements?

Tags: f# asynchronous async-workflow

2 Answers

The star

#2 · 2020-07-07 08:14

I was writing an answer but Brian beat me to it. I fully agree with him.

I'd like to add that if you want to parallelize synchronous code, the right tool is PLINQ, not async workflows, as Don Syme explains.
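To make the PLINQ suggestion concrete, here is a minimal sketch of parallelizing synchronous, CPU-bound work from F#. The `expensive` function and the `inputs` array are hypothetical stand-ins for illustration; only `AsParallel`, `AsOrdered`, `Select` and `ToArray` come from `System.Linq`.

```fsharp
open System.Linq

// Hypothetical CPU-bound function standing in for real synchronous work.
let expensive (n: int) : int64 =
    let mutable acc = 0L
    for i in 1 .. 1000000 do
        acc <- acc + int64 ((i * n) % 7)
    acc

// Hypothetical inputs.
let inputs = [| 1 .. 8 |]

// AsParallel() lets PLINQ partition the work across worker threads;
// AsOrdered() asks for results in the same order as the inputs.
let results =
    inputs.AsParallel().AsOrdered().Select(fun n -> expensive n).ToArray()

printfn "%A" results
```

Because of AsOrdered(), the results should match a plain sequential `Array.map expensive inputs`, only computed on several threads.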


爷的心禁止访问

#3 · 2020-07-07 08:29

The key is that we are not spawning any new threads. During the whole course of the workflow, either one or zero threads from the ThreadPool are in use. (One exception: up until the first '!', the code runs on the user thread that called Async.Run, which is named Async.RunSynchronously in current F# releases.) "let!" lets go of the thread while the async operation is at sea, and then picks up a thread from the ThreadPool when the operation returns. The performance advantage is less pressure on the ThreadPool, and of course the major user advantage is the simple programming model, a million times better than all that BeginFoo/EndFoo/callback stuff you would otherwise write.
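Since each "let!" inside a single workflow still runs in sequence, concurrency has to come from composing several workflows. One common way to do that (an assumption here, not something stated in the question) is Async.Parallel. The sketch below uses a hypothetical `fakeFetch` that simulates I/O with Async.Sleep instead of touching the network, and the URLs are placeholders.

```fsharp
open System.Diagnostics

// Hypothetical stand-in for a network call. Async.Sleep does not hold a
// thread while waiting, much like a real asynchronous I/O operation.
let fakeFetch (url: string) =
    async {
        do! Async.Sleep 200      // simulated latency
        return url.Length        // simulated "result" of the download
    }

// Placeholder URLs, never actually requested.
let urls = [ "http://a.example"; "http://b.example"; "http://c.example" ]

let timer = Stopwatch.StartNew()

// Each workflow is still sequential inside; Async.Parallel starts them
// together and collects the results in input order.
let lengths =
    urls
    |> List.map fakeFetch
    |> Async.Parallel
    |> Async.RunSynchronously

timer.Stop()
printfn "%A in about %d ms" lengths timer.ElapsedMilliseconds
```

With three simulated 200 ms waits, the elapsed time should be closer to one wait than to three, because the waits overlap rather than queue up.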
