I'm trying to build a program like IDM that can download parts of a file simultaneously. The tool I'm using to achieve this is the TPL in C# .NET 4.5, but I'm having a problem when using Tasks to make the operation parallel.

The sequential function works well and downloads the files correctly. The parallel function using Tasks works until something weird happens: I've created 4 tasks with Factory.StartNew(); each task is given a start position and an end position, downloads that part of the file, and returns it as a byte[]. Everything goes well and the tasks work fine, but at some point execution freezes and that's it; the program stops and nothing else happens.

The implementation of the parallel function:
static void DownloadPartsParallel()
{
    string uriPath = "http://mschnlnine.vo.llnwd.net/d1/pdc08/PPTX/BB01.pptx";
    Uri uri = new Uri(uriPath);
    long l = GetFileSize(uri);
    Console.WriteLine("Size={0}", l);
    int granularity = 4;
    byte[][] arr = new byte[granularity][];
    Task<byte[]>[] tasks = new Task<byte[]>[granularity];
    tasks[0] = Task<byte[]>.Factory.StartNew(() => DownloadPartOfFile(uri, 0, l / granularity));
    tasks[1] = Task<byte[]>.Factory.StartNew(() => DownloadPartOfFile(uri, l / granularity + 1, 2 * (l / granularity)));
    tasks[2] = Task<byte[]>.Factory.StartNew(() => DownloadPartOfFile(uri, 2 * (l / granularity) + 1, 3 * (l / granularity)));
    tasks[3] = Task<byte[]>.Factory.StartNew(() => DownloadPartOfFile(uri, 3 * (l / granularity) + 1, l));
    arr[0] = tasks[0].Result;
    arr[1] = tasks[1].Result;
    arr[2] = tasks[2].Result;
    arr[3] = tasks[3].Result;
    Stream localStream;
    localStream = File.Create("E:\\a\\" + Path.GetFileName(uri.LocalPath));
    for (int i = 0; i < granularity; i++)
    {
        if (i == granularity - 1)
        {
            for (int j = 0; j < arr[i].Length - 1; j++)
            {
                localStream.WriteByte(arr[i][j]);
            }
        }
        else
        {
            for (int j = 0; j < arr[i].Length; j++)
            {
                localStream.WriteByte(arr[i][j]);
            }
        }
    }
}
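For illustration, the hard-coded boundary arithmetic above can be written as a loop. This is only a sketch, not part of the program (the `length` value and the `ranges` variable are mine): HTTP byte ranges are zero-based and inclusive, so each part spans [from, to] and the last part absorbs any remainder left over from the integer division.

```csharp
using System;

class RangeSketch
{
    static void Main()
    {
        // Sketch only: compute inclusive byte ranges for a file of `length` bytes
        // split into `granularity` parts; the last part absorbs the remainder.
        long length = 4183945;      // hypothetical file size, stands in for GetFileSize(uri)
        int granularity = 4;
        long chunk = length / granularity;
        var ranges = new Tuple<long, long>[granularity];
        for (int i = 0; i < granularity; i++)
        {
            long from = i * chunk;
            long to = (i == granularity - 1) ? length - 1 : from + chunk - 1;
            ranges[i] = Tuple.Create(from, to);  // would be passed to DownloadPartOfFile(uri, from, to)
        }
    }
}
```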
The implementation of the DownloadPartOfFile function:
public static byte[] DownloadPartOfFile(Uri fileUrl, long from, long to)
{
    int bytesProcessed = 0;
    BinaryReader reader = null;
    WebResponse response = null;
    byte[] bytes = new byte[(to - from) + 1];
    try
    {
        HttpWebRequest request = (HttpWebRequest)WebRequest.Create(fileUrl);
        request.AddRange(from, to);
        request.ReadWriteTimeout = int.MaxValue;
        request.Timeout = int.MaxValue;
        if (request != null)
        {
            response = request.GetResponse();
            if (response != null)
            {
                reader = new BinaryReader(response.GetResponseStream());
                int bytesRead;
                do
                {
                    byte[] buffer = new byte[1024];
                    bytesRead = reader.Read(buffer, 0, buffer.Length);
                    if (bytesRead == 0)
                    {
                        break;
                    }
                    Array.Resize<byte>(ref buffer, bytesRead);
                    buffer.CopyTo(bytes, bytesProcessed);
                    bytesProcessed += bytesRead;
                    Console.WriteLine(Thread.CurrentThread.ManagedThreadId + ",Downloading" + bytesProcessed);
                } while (bytesRead > 0);
            }
        }
    }
    catch (Exception e)
    {
        Console.WriteLine(e.Message);
    }
    finally
    {
        if (response != null) response.Close();
        if (reader != null) reader.Close();
    }
    return bytes;
}
I tried to work around it by setting the read/write timeout and the request timeout to int.MaxValue; that's why the program freezes instead of failing. If I don't do that, a timeout exception occurs while in DownloadPartsParallel.

So is there a solution, or any other advice that may help? Thanks.
I would use HttpClient.SendAsync rather than WebRequest (see "HttpClient is Here!"). I would not use any extra threads. The HttpClient.SendAsync API is naturally asynchronous and returns an awaitable Task<>; there is no need to offload it to a pool thread with Task.Run / Task.Factory.StartNew (see this for a detailed discussion). I would also limit the number of parallel downloads with SemaphoreSlim.WaitAsync(). Below is my take as a console app (not extensively tested).

OK, here's how I would do what you're attempting. This is basically the same idea, just implemented differently.
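A minimal sketch of the HttpClient-based approach described above (the helper names DownloadRangeAsync and DownloadFileAsync, the part count of 4, and the Content-Length probe are my illustrative assumptions, not code from the answers):

```csharp
using System;
using System.IO;
using System.Linq;
using System.Net.Http;
using System.Net.Http.Headers;
using System.Threading;
using System.Threading.Tasks;

class ParallelDownloader
{
    // Cap the number of concurrent range requests, as suggested above.
    static readonly SemaphoreSlim Gate = new SemaphoreSlim(4);

    static async Task<byte[]> DownloadRangeAsync(HttpClient client, Uri uri, long from, long to)
    {
        await Gate.WaitAsync();                      // async throttling, no blocked threads
        try
        {
            var request = new HttpRequestMessage(HttpMethod.Get, uri);
            request.Headers.Range = new RangeHeaderValue(from, to);  // inclusive byte range
            using (var response = await client.SendAsync(request, HttpCompletionOption.ResponseHeadersRead))
            {
                response.EnsureSuccessStatusCode();
                return await response.Content.ReadAsByteArrayAsync();
            }
        }
        finally
        {
            Gate.Release();
        }
    }

    static async Task DownloadFileAsync(Uri uri, string destination, int parts)
    {
        using (var client = new HttpClient())
        {
            // Probe the headers for the length; assumes the server reports Content-Length.
            long length;
            using (var probe = await client.SendAsync(
                new HttpRequestMessage(HttpMethod.Get, uri), HttpCompletionOption.ResponseHeadersRead))
            {
                probe.EnsureSuccessStatusCode();
                length = probe.Content.Headers.ContentLength.Value;
            }

            long chunk = length / parts;
            Task<byte[]>[] tasks = Enumerable.Range(0, parts).Select(i =>
            {
                long from = i * chunk;
                long to = (i == parts - 1) ? length - 1 : from + chunk - 1;
                return DownloadRangeAsync(client, uri, from, to);
            }).ToArray();

            byte[][] results = await Task.WhenAll(tasks);   // await instead of blocking on .Result
            using (var output = File.Create(destination))
                foreach (byte[] block in results)
                    output.Write(block, 0, block.Length);
        }
    }
}
```

Awaiting Task.WhenAll instead of reading each task's .Result also avoids tying up thread-pool threads while the downloads are in flight.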
Note that I glossed over a lot of details here, so you've got a long way to go before I would use this in production. But it should give you an idea of where to start.