I need to download images from a website, and I have the login name and password. But if I just use the URL to download an image, it throws an exception: there is no value in the session.
I think I need to log in to the website before I can download the image programmatically.
Do you have any solutions? Thanks in advance!
I'd like to mention HtmlUnit. It is a headless browser for Java with JavaScript support.
In simple circumstances you can use a URLConnection with the URL and stream the contents down. More generally, I'd strongly advise using Apache HttpClient, since you'll need to authenticate and possibly send and receive cookies. Read the user guide sections on Authentication and Methods, particularly Get.
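To illustrate the log-in-then-download flow, here is a minimal sketch. It uses the JDK's built-in java.net.http.HttpClient (Java 11+) instead of Apache HttpClient, only so that it runs without extra dependencies. The idea is the same: one client keeps the session cookie from the login and sends it with the image request. It assumes the site uses a plain HTML form login. The class name, the form field names "username" and "password", and any URLs you pass in are hypothetical and must be adapted to the real site.

```java
import java.io.IOException;
import java.net.CookieManager;
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper: logs in once, then downloads files within the same session.
class SessionImageDownloader {

    // A single client with a CookieManager, so the session cookie set by the
    // login response is sent automatically with every later request.
    private final HttpClient client = HttpClient.newBuilder()
            .cookieHandler(new CookieManager())
            .build();

    // Posts the credentials as a URL-encoded form and returns the HTTP status.
    // ASSUMPTION: the field names "username" and "password" are placeholders;
    // check the real login form (or recorded traffic) for the actual names.
    int login(String loginUrl, String user, String password)
            throws IOException, InterruptedException {
        String form = "username=" + URLEncoder.encode(user, StandardCharsets.UTF_8)
                + "&password=" + URLEncoder.encode(password, StandardCharsets.UTF_8);
        HttpRequest request = HttpRequest.newBuilder(URI.create(loginUrl))
                .header("Content-Type", "application/x-www-form-urlencoded")
                .POST(HttpRequest.BodyPublishers.ofString(form))
                .build();
        return client.send(request, HttpResponse.BodyHandlers.discarding()).statusCode();
    }

    // Fetches the image with the stored cookies and returns the HTTP status.
    // The file is written only when the server answers 200.
    int download(String imageUrl, Path target)
            throws IOException, InterruptedException {
        HttpRequest request = HttpRequest.newBuilder(URI.create(imageUrl)).GET().build();
        HttpResponse<byte[]> response =
                client.send(request, HttpResponse.BodyHandlers.ofByteArray());
        if (response.statusCode() == 200) {
            Files.write(target, response.body());
        }
        return response.statusCode();
    }
}
```

You would call login(...) once and then download(...) for each image on the same SessionImageDownloader instance. Some sites answer a successful login with a redirect status rather than 200, and some use HTTP Basic authentication or hidden form tokens instead of a simple form, in which case this sketch needs adjusting.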
Use an HTTP client library to write a spider that accesses the content.
If you want to stick with Java, I would suggest recording the HTTP traffic for the login and the content access, and then rebuilding that communication using the library.
There are libraries for other languages as well, such as LWP for Perl.
Although the java.net package provides basic functionality for accessing resources via HTTP, it doesn't provide the full flexibility or functionality needed by many applications. HttpClient seeks to fill this void by providing an efficient, up-to-date, and feature-rich package implementing the client side of the most recent HTTP standards and recommendations.
Designed for extension while providing robust support for the base HTTP protocol, HttpClient may be of interest to anyone building HTTP-aware client applications such as web browsers, web service clients, or systems that leverage or extend the HTTP protocol for distributed communication.
HTTPClient
HTTPClient Authentication