Multiple parallel execution of WebClient as Task

Posted 2019-08-08 09:20

Question:

I am testing parallel execution of IWebDriver vs. WebClient (to see whether there is a performance difference and how big it is).

Before I got that far, I ran into a problem with a simple parallel invocation of WebClient.

It seems that it is never executed. I put a breakpoint in AgilityPacDocExtraction, on the line that calls WebClient.DownloadString(URL),

but the program exits before the debugger's Step Into can show the resulting string.

The plan was to have a single method for all the actions that need to be taken, with a "mode" selector for each action, and then a simple foreach that iterates over all available enum values (the modes).

The main execution:

    static void Main(string[] args)
    {
        EnumForEach<Action>(Execute);
        Task.WaitAll();
    }

    public static void EnumForEach<Mode>(Action<Mode> Exec)
    {
        foreach (Mode mode in Enum.GetValues(typeof(Mode)))
        {
            Mode Curr = mode;

            Task.Factory.StartNew(() => Exec(Curr));
        }
    }

The mode / Action selector:

    enum Action
    {
        Act1, Act2
    }

The actual execution:

    static BrowsresFactory.IeEngine IeNgn = new BrowsresFactory.IeEngine();
    static string
        FlNm = Environment.CurrentDirectory,
        URL = "",
        TmpHtm = "";

    static void Execute(Action Exc)
    {
        switch (Exc)
        {
            case Action.Act1:
                break;

            case Action.Act2:
                URL = "UrlofUrChoise here...";
                FlNm += "\\TempHtm.htm";
                TmpHtm = IeNgn.AgilityPacDocExtraction(URL).GetElementbyId("Dv_Main").InnerHtml;
                File.WriteAllText(FlNm, TmpHtm);
                break;
        }
    }

The class that holds the WebClient and the IWebDriver (Selenium); the IWebDriver part is not included here, to keep the post short and because it is not relevant for now.

class BrowsresFactory
{
    public class IeEngine
    {
        private WebClient WC = new WebClient();
        private string tmpExtractedPageValue = "";
        private HtmlAgilityPack.HtmlDocument retAglPacHtmDoc = new HtmlAgilityPack.HtmlDocument();

        public HtmlAgilityPack.HtmlDocument AgilityPacDocExtraction(string URL)
        {
            WC.Encoding = Encoding.GetEncoding("UTF-8");
            tmpExtractedPageValue = WC.DownloadString(URL); //<--- tried to break here
            retAglPacHtmDoc.LoadHtml(tmpExtractedPageValue);
            return retAglPacHtmDoc;
        }
    }
}

The problem is that I can't see any content in the file that was supposed to be updated with the value extracted via the WebClient. In addition, in debug mode I couldn't step into the line marked with the comment in the code above. What am I doing wrong here?

Answer 1:

The function Download(url, htmlDictionary) that is called in the asker's own solution (the StartEngins method in the other answer) is not defined there; one possible version is:

private static void Download(string url, ConcurrentDictionary<string, string> htmlDictionary)
{
    using (var webClient = new SmartWebClient())
    {
        htmlDictionary.TryAdd(url, webClient.DownloadString(url));
    }
}

That code appears to be copied from another Stack Overflow post. For reference, see "Retrieve a string containing html Document source using Task parallel".



Answer 2:

I have managed to solve the issue by using WebClient, which I think requires fewer resources than WebDriver; if that is true, it also means it takes less time.

This is the code:

public void StartEngins()
{
    const string URL_Dollar = "URL_Dollar";
    const string URL_UpdateUsersTimeOut = "URL_UpdateUsersTimeOut";

    var urlList = new Dictionary<string, string>();
    urlList.Add(URL_Dollar, "http://bing.com");
    urlList.Add(URL_UpdateUsersTimeOut, "http://localhost:..../.......aspx");

    var htmlDictionary = new ConcurrentDictionary<string, string>();
    Parallel.ForEach(
        urlList.Values,
        new ParallelOptions { MaxDegreeOfParallelism = 20 },
        url => Download(url, htmlDictionary));

    foreach (var pair in htmlDictionary)
    {
        // Process(pair);
        MessageBox.Show(pair.Value);
    }
}

public class SmartWebClient : WebClient
{
    private readonly int maxConcurentConnectionCount;

    public SmartWebClient(int maxConcurentConnectionCount = 20)
    {

        this.maxConcurentConnectionCount = maxConcurentConnectionCount;
    }

    protected override WebRequest GetWebRequest(Uri address)
    {
        var request = base.GetWebRequest(address);

        // Only HTTP(S) requests expose a ServicePoint; "as" yields null for
        // other schemes instead of throwing InvalidCastException.
        var httpWebRequest = request as HttpWebRequest;
        if (httpWebRequest != null && maxConcurentConnectionCount != 0)
        {
            httpWebRequest.ServicePoint.ConnectionLimit = maxConcurentConnectionCount;
        }

        return request;
    }

}
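
As a side note on the original Task-based code in the question: one likely reason the breakpoint was never hit is that Task.WaitAll() is called with no tasks, so it returns immediately. Tasks started with Task.Factory.StartNew run on thread-pool (background) threads, which do not keep the process alive once Main returns. Below is a minimal sketch of keeping the tasks and waiting on them. The enum is renamed to Mode to avoid confusion with System.Action, and Execute, Completed and the Thread.Sleep that stands in for the download are hypothetical placeholders, not the asker's real code.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

enum Mode { Act1, Act2 }

static class Program
{
    // Counts finished actions; exists only for this demonstration.
    public static int Completed;

    // Starts one task per enum value and returns the tasks so the caller can wait on them.
    public static Task[] EnumForEach<TEnum>(Action<TEnum> exec)
    {
        var tasks = new List<Task>();
        foreach (TEnum mode in Enum.GetValues(typeof(TEnum)))
        {
            TEnum current = mode; // per-iteration copy captured by the lambda
            tasks.Add(Task.Factory.StartNew(() => exec(current)));
        }
        return tasks.ToArray();
    }

    // Hypothetical stand-in for the question's Execute; the sleep simulates the download.
    public static void Execute(Mode mode)
    {
        Thread.Sleep(200);
        Interlocked.Increment(ref Completed);
        Console.WriteLine(mode + " finished");
    }

    public static void Main()
    {
        Task[] tasks = EnumForEach<Mode>(Execute);
        Task.WaitAll(tasks); // blocks until every started task has completed
        Console.WriteLine("completed: " + Completed);
    }
}
```

With the tasks passed in, Task.WaitAll also surfaces any exception thrown inside an action as an AggregateException, instead of the failure going unnoticed when the process exits.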