I have been trying to learn F# for the past couple of day and I keep running into something that perplexes me. My "learning project" is a screen scraper for some data I'm kind of interested in manipulating.
In F# PowerPack there is a call Stream.AsyncReadToEnd. I did not want to use the PowerPack just for that single call so I took a look at how they did it.
module Downloader =
open System
open System.IO
open System.Net
open System.Collections
type public BulkDownload(uriList : IEnumerable) =
member this.UriList with get() = uriList
member this.ParalellDownload() =
let Download (uri : Uri) = async {
let UnblockViaNewThread f = async {
do! Async.SwitchToNewThread()
let res = f()
do! Async.SwitchToThreadPool()
return res }
let request = HttpWebRequest.Create(uri)
let! response = request.AsyncGetResponse()
use responseStream = response.GetResponseStream()
use reader = new StreamReader(responseStream)
let! contents = UnblockViaNewThread (fun() -> reader.ReadToEnd())
return uri, contents.ToString().Length }
this.UriList
|> Seq.cast
|> Seq.map Download
|> Async.Parallel
|> Async.RunSynchronously
They have that function UnblockViaNewThread. Is that really the only way to asynchronously read the response stream? Isn't creating a new thread really expensive (I've seen the "~1mb of memory" thrown around all over the place). Is there a better way to do this? Is this what's really happenening in every Async*
call (one that I can let!
)?
EDIT: I follow Tomas' suggestions and actually came up with something independent of F# PowerTools. Here it is. This really needs error handling, but it asynchronous requests and downloads a url to a byte array.
namespace Downloader
open System
open System.IO
open System.Net
open System.Collections
type public BulkDownload(uriList : IEnumerable) =
member this.UriList with get() = uriList
member this.ParalellDownload() =
let Download (uri : Uri) = async {
let processStreamAsync (stream : Stream) = async {
let outputStream = new MemoryStream()
let buffer = Array.zeroCreate<byte> 0x1000
let completed = ref false
while not (!completed) do
let! bytesRead = stream.AsyncRead(buffer, 0, 0x1000)
if bytesRead = 0 then
completed := true
else
outputStream.Write(buffer, 0, bytesRead)
stream.Close()
return outputStream.ToArray() }
let request = HttpWebRequest.Create(uri)
let! response = request.AsyncGetResponse()
use responseStream = response.GetResponseStream()
let! contents = processStreamAsync responseStream
return uri, contents.Length }
this.UriList
|> Seq.cast
|> Seq.map Download
|> Async.Parallel
|> Async.RunSynchronously
override this.ToString() = String.Join(", ", this.UriList)