F# MailboxProcessor questions

Posted 2019-03-16 21:32

Question:

I created a console program using the code from http://fssnip.net/3K and ran into two problems:

  1. I had to add "System.Console.ReadLine() |> ignore" at the end to wait for the threads to finish. Is there a way to tell when all the MailboxProcessors are done, so that the program can exit on its own?

  2. I changed the test URL "www.google.com" to an invalid URL and got the following output. Is it possible to avoid this output race?

     http://www.google.co1m crawled by agent 1.  
     AgAAAent gent 3 is done.  
     gent 2 is done.  
     5 is done.  
     gent 4 is done.  
     Agent USupervisor RL collector is done.  
     is done.  
     1 is done.

[Edit]

The last line of output is still cut off after switching to Tomas's update, http://fssnip.net/65. The following is the program's output after I changed "limit" to 5 and added some debugging messages. The last line shows the truncated URL. Is there a way to detect when all the crawlers have finished executing?

    [Main] before crawl
    [Crawl] before return result
    http://news.google.com crawled by agent 1.
    [supervisor] reached limit
    http://www.gstatic.com/news/img/favicon.ico crawled by agent 5.
    Agent 2 is done.
    [supervisor] reached limit
    Agent 5 is done.
    http://www.google.com/imghp?hl=en&tab=ni crawled by agent 3.
    [supervisor] reached limit
    Agent 3 is done.
    http://www.google.com/webhp?hl=en&tab=nw crawled by agent 4.
    [supervisor] reached limit
    Agent 4 is done.
    http://news.google.com/n

I changed the main code to

    printfn "[Main] before crawl"
    crawl "http://news.google.com" 5
    |> Async.RunSynchronously
    printfn "[Main] after crawl"

However, the output of the last printfn "[Main] after crawl" never appears unless I add a Console.ReadLine() at the end.

[Edit 2]

The code runs fine under fsi. However, it has the same problem when run with fsi --use:Program.fs --exec --quiet

Answer 1:

I created a snippet that extends the previous one with the two features you asked about: http://fssnip.net/65.

  1. To solve this, I added a Start message that carries an AsyncReplyChannel<unit>. When the supervisor agent starts, it waits for this message and saves the reply channel for later use. When it completes, it sends a reply using this channel.

    The function that starts the agent returns an asynchronous workflow that waits for the reply. You can then call crawl using Async.RunSynchronously, which will complete when the supervisor agent completes.

  2. To avoid the race when printing, you need to synchronize all prints. The easiest way to do this is to write a new agent :-). The agent receives strings and prints them to the output one by one (so that they cannot be interleaved). The snippet hides the standard printfn function with a new implementation that sends strings to the agent.
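Here is a minimal sketch of both ideas. It is not the code from the snippet: the crawlers are replaced by stand-in workers that just report completion, and the names PrintMsg, SupervisorMsg, WorkerDone and runWorkers are made up for illustration. The Flush message is also an assumption of mine rather than something the snippet contains; it lets the main thread wait until the printing agent has written everything that was queued, which may be what truncates the last line when the process exits early.

```fsharp
open System

// Messages for the printing agent. Flush is an assumption added for this
// sketch; it is answered only after all earlier Line messages were written.
type PrintMsg =
    | Line of string
    | Flush of AsyncReplyChannel<unit>

// The agent handles one message at a time, so lines cannot interleave.
let printer =
    MailboxProcessor<PrintMsg>.Start(fun inbox ->
        let rec loop () = async {
            let! msg = inbox.Receive()
            match msg with
            | Line text -> Console.WriteLine(text)
            | Flush reply -> reply.Reply()
            return! loop () }
        loop ())

// Hide the standard printfn: format the string, then post it to the agent.
let printfn fmt = Printf.kprintf (fun s -> printer.Post(Line s)) fmt

// Messages for the supervisor (hypothetical names).
type SupervisorMsg =
    | Start of AsyncReplyChannel<unit>   // carries the channel used to signal completion
    | WorkerDone of int                  // a stand-in worker reports that it finished

// Returns a workflow that completes only when the supervisor has replied.
let runWorkers (workerCount: int) : Async<unit> =
    let supervisor =
        MailboxProcessor<SupervisorMsg>.Start(fun inbox ->
            let rec loop (reply: AsyncReplyChannel<unit>) remaining = async {
                if remaining = 0 then
                    printfn "Supervisor is done."
                    reply.Reply()            // unblocks the caller of runWorkers
                else
                    let! msg = inbox.Receive()
                    match msg with
                    | WorkerDone workerId ->
                        printfn "Agent %d is done." workerId
                        return! loop reply (remaining - 1)
                    | Start _ ->
                        return! loop reply remaining }
            async {
                // Wait for Start and keep its reply channel for later.
                let! first = inbox.Receive()
                match first with
                | Start reply ->
                    // Stand-in workers: each one only reports completion.
                    for workerId in 1 .. workerCount do
                        async { inbox.Post(WorkerDone workerId) } |> Async.Start
                    return! loop reply workerCount
                | WorkerDone _ ->
                    invalidOp "Start must be the first message." })
    supervisor.PostAndAsyncReply(fun ch -> Start ch)

printfn "[Main] before crawl"
runWorkers 3 |> Async.RunSynchronously
printfn "[Main] after crawl"
// Wait until the printing agent has drained its queue before exiting.
printer.PostAndReply(fun ch -> Flush ch)
```

The "Agent N is done." lines can appear in any order, but each line should come out whole, and "[Main] after crawl" is posted only after the supervisor has replied. Without the final Flush, the process may still exit while the printing agent has unwritten messages in its queue.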