I'm parsing a lyrics website and I get the error in the title.
The URL I pass to it (for example):
http://www.azlyrics.com/lyrics/linkinpark/intheend.html
// Imports this snippet needs (Android and jsoup)
import android.os.AsyncTask;
import android.util.Log;
import org.jsoup.HttpStatusException;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.safety.Whitelist;
import org.jsoup.select.Elements;

// Inner class of my Activity; lyricsOutput is the TextView that shows the result
class GetLyrics extends AsyncTask<String, Void, String> {
    @Override
    protected String doInBackground(String... urls) {
        String url = urls[0];
        // Fallback text, shown only if none of the branches below overwrites it
        String output = "If you see this, some kind of error has occurred";
        try {
            Document document = Jsoup.connect(url).post(); // I don't know how this works; I found it via Google
            document.outputSettings(new Document.OutputSettings().prettyPrint(false)); // makes html() preserve line breaks and spacing
            document.select("br").append("\\n"); // mark every <br> with a literal "\n" placeholder
            Elements lyrics = document.select("b + br + br + div"); // the lyrics <div> that follows a <b> and two <br> tags
            String s = lyrics.html().replaceAll("\\\\n", "\n"); // turn the placeholders into real newlines (from Google again)
            output = Jsoup.clean(s, "", Whitelist.none(), new Document.OutputSettings().prettyPrint(false)); // strip all remaining tags
            output = output.replace("\n\n", "\n");
            output = output.substring(4); // remove the leading line breaks
        }
        catch (HttpStatusException e) {
            System.err.println("404 error: " + e);
            System.err.println("Check your input data");
            output = "A 404 error has occurred, more info:\n" + e + "\nCheck your input data";
            Log.d("LyricFinder", e.toString());
        }
        catch (Exception e) {
            System.err.println("Some error: " + e);
            output = "An unknown error has occurred\nCheck your internet connection";
            Log.d("LyricFinder", e.toString());
        }
        return output;
    }

    @Override
    protected void onPostExecute(String lyrics) {
        // Runs on the UI thread, so it is safe to touch the view here
        lyricsOutput.setText(lyrics);
    }
}
And the log:
D/LyricFinder: java.io.IOException: unexpected end of stream on Connection{www.azlyrics.com:80, proxy=DIRECT@ hostAddress=85.17.159.246 cipherSuite=none protocol=http/1.1} (recycle count=0)
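In an Eclipse console project the same code works perfectly (but there is no AsyncTask there :/).
I'm a newbie, and this is my first time working with networking and jsoup.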
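
Separate from the connection error: the last two cleanup steps in doInBackground (the replace of "\n\n" and the substring(4)) assume the text always starts with exactly four characters to drop, so substring(4) throws if the result is shorter than that. Below is a minimal sketch of a more defensive version of just that step, in plain Java with no jsoup or Android dependency. The class name LyricsCleanup and the method tidy are hypothetical names made up for this illustration, and the sample strings are placeholders, not real output from the site.

```java
public class LyricsCleanup {

    // Hypothetical helper: tidies line breaks in text that has already
    // been extracted from the page and stripped of tags.
    public static String tidy(String raw) {
        if (raw == null) {
            return "";
        }
        // Normalize Windows line endings so only "\n" remains.
        String text = raw.replace("\r\n", "\n");
        // Same collapsing step as in the question: each pair of
        // newlines becomes a single newline.
        text = text.replace("\n\n", "\n");
        // Skip leading whitespace of any length instead of assuming
        // exactly four characters, so short or empty input is safe.
        int start = 0;
        while (start < text.length() && Character.isWhitespace(text.charAt(start))) {
            start++;
        }
        return text.substring(start);
    }

    public static void main(String[] args) {
        // Placeholder input with leading blank lines and one doubled break.
        System.out.println(tidy("\n\n\nfirst line\n\nsecond line"));
    }
}
```

In the AsyncTask this would stand in for the two lines after Jsoup.clean(...), for example output = LyricsCleanup.tidy(output);. It does not address the IOException in the log; it only keeps the cleanup from failing on short or empty input.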