C# library similar to HtmlUnit

Posted 2020-03-06 03:26

Question:

I need to write a standalone application that will "browse" an external resource. Is there a library in C# that automatically handles cookies and supports JavaScript (though I believe JS is not required)? The main goal is to keep a session alive and submit forms, so that I can get through a multistep registration process or "browse" a web site after logging in. I reviewed Html Agility Pack, but it doesn't seem to have the functionality I need: form submission or cookie support.

Thanks, Artem.

Answer 1:

Look at Data Extracting SDK, which allows you to post data via its HtmlProcessor class. You can also add a work item here if something is missing from the library.



Answer 2:

If you're interested in writing your own version of HtmlUnit for C#, the IKVM project may be of help. http://www.ikvm.net/



Answer 3:

The HtmlAgilityPack is specifically for parsing HTML. You can use the WebRequest class in the .NET Framework to handle communication and cookies.
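
As a rough sketch of that approach (not a definitive implementation): sharing one CookieContainer across HttpWebRequest instances lets cookies set by one response be sent with the next request, which is what keeps a login session alive across steps. The SessionClient class and its EncodeForm, Get, and Post methods are hypothetical names made up for this illustration; only the System.Net types are real. Note that WebRequest.Create is marked obsolete on .NET 6 and later, so this assumes the .NET Framework mentioned above.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Net;
using System.Text;

// Hypothetical helper: one CookieContainer is reused for every request,
// so cookies received from the server are replayed on later requests.
public class SessionClient
{
    private readonly CookieContainer _cookies = new CookieContainer();

    // Builds an application/x-www-form-urlencoded body from name/value pairs.
    public static string EncodeForm(IEnumerable<KeyValuePair<string, string>> fields)
    {
        return string.Join("&", fields.Select(f =>
            Uri.EscapeDataString(f.Key) + "=" + Uri.EscapeDataString(f.Value)));
    }

    // Issues a GET and returns the response body as text.
    public string Get(string url)
    {
        var request = (HttpWebRequest)WebRequest.Create(url);
        request.CookieContainer = _cookies;
        return ReadBody(request);
    }

    // Submits form fields with a POST and returns the response body as text.
    public string Post(string url, IEnumerable<KeyValuePair<string, string>> fields)
    {
        var request = (HttpWebRequest)WebRequest.Create(url);
        request.CookieContainer = _cookies;
        request.Method = "POST";
        request.ContentType = "application/x-www-form-urlencoded";

        byte[] body = Encoding.UTF8.GetBytes(EncodeForm(fields));
        request.ContentLength = body.Length;
        using (Stream stream = request.GetRequestStream())
        {
            stream.Write(body, 0, body.Length);
        }
        return ReadBody(request);
    }

    private static string ReadBody(HttpWebRequest request)
    {
        using (var response = (HttpWebResponse)request.GetResponse())
        using (var reader = new StreamReader(response.GetResponseStream()))
        {
            return reader.ReadToEnd();
        }
    }
}
```

With something like this, a multistep flow would be a Post to the login URL followed by Get or Post calls on the same SessionClient instance. The field names to send come from the target site's form, which you could read out of the HTML with HtmlAgilityPack.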

See my blog entry on Web scraping in .NET. This won't answer all your questions, but will get you part of the way there.



Answer 4:

Try the WebBrowser class and work with the DOM there: http://msdn.microsoft.com/en-us/library/system.windows.forms.webbrowser.aspx



Answer 5:

Try Selenium. It uses actual browsers, but it is cross-browser. Whether it fits depends on whether you can have an actual browser running: it works by injecting JavaScript into the browser via a proxy. http://seleniumhq.org/support/