What is the simplest way to get the HTML code from a WebView? I have tried several methods from Stack Overflow and Google, but can't find one that works. Please show an exact way to do it.
public class htmldecoder extends Activity implements OnClickListener, TextWatcher
{
    TextView txturl;
    Button btgo;
    WebView wvbrowser;
    TextView txtcode;
    ImageButton btcode;
    LinearLayout llayout;
    int flagbtcode;

    public void onCreate(Bundle savedInstanceState)
    {
        super.onCreate(savedInstanceState);
        setContentView(R.layout.htmldecoder);
        txturl=(TextView)findViewById(R.id.txturl);
        btgo=(Button)findViewById(R.id.btgo);
        btgo.setOnClickListener(this);
        wvbrowser=(WebView)findViewById(R.id.wvbrowser);
        wvbrowser.setWebViewClient(new HelloWebViewClient());
        wvbrowser.getSettings().setJavaScriptEnabled(true);
        wvbrowser.getSettings().setPluginsEnabled(true);
        wvbrowser.getSettings().setJavaScriptCanOpenWindowsAutomatically(true);
        wvbrowser.addJavascriptInterface(new MyJavaScriptInterface(),"HTMLOUT");
        //wvbrowser.loadUrl("http://www.google.com");
        wvbrowser.loadUrl("javascript:window.HTMLOUT.showHTML('<html>'+document.getElementsByTagName('html')[0].innerHTML+'</html>');");
        txtcode=(TextView)findViewById(R.id.txtcode);
        txtcode.addTextChangedListener(this);
        btcode=(ImageButton)findViewById(R.id.btcode);
        btcode.setOnClickListener(this);
    }

    public void onClick(View v)
    {
        if(btgo==v)
        {
            String url=txturl.getText().toString();
            if(!txturl.getText().toString().contains("http://"))
            {
                url="http://"+url;
            }
            wvbrowser.loadUrl(url);
            //wvbrowser.loadData("<html><head></head><body><div style='width:100px;height:100px;border:1px red solid;'></div></body></html>","text/html","utf-8");
        }
        else if(btcode==v)
        {
            ViewGroup.LayoutParams params1=wvbrowser.getLayoutParams();
            ViewGroup.LayoutParams params2=txtcode.getLayoutParams();
            if(flagbtcode==1)
            {
                params1.height=200;
                params2.height=220;
                flagbtcode=0;
                //txtcode.setText(wvbrowser.getContentDescription());
            }
            else
            {
                params1.height=420;
                params2.height=0;
                flagbtcode=1;
            }
            wvbrowser.setLayoutParams(params1);
            txtcode.setLayoutParams(params2);
        }
    }

    public class HelloWebViewClient extends WebViewClient {
        @Override
        public boolean shouldOverrideUrlLoading(WebView view, String url) {
            view.loadUrl(url);
            return true;
        }
        /*@Override
        public void onPageFinished(WebView view, String url)
        {
            // This call inject JavaScript into the page which just finished loading.
            wvbrowser.loadUrl("javascript:window.HTMLOUT.processHTML('<head>'+document.getElementsByTagName('html')[0].innerHTML+'</head>');");
        }*/
    }

    class MyJavaScriptInterface
    {
        @SuppressWarnings("unused")
        public void showHTML(String html)
        {
            txtcode.setText(html);
        }
    }

    public void afterTextChanged(Editable s) {
        // TODO Auto-generated method stub
    }

    public void beforeTextChanged(CharSequence s, int start, int count,
            int after) {
        // TODO Auto-generated method stub
    }

    public void onTextChanged(CharSequence s, int start, int before, int count) {
        wvbrowser.loadData("<html><div"+txtcode.getText().toString()+"></div></html>","text/html","utf-8");
    }
}
Try using HttpClient, as Sephy said:
Actually, this question has many answers. Here are two of them:
This way you grab the HTML through JavaScript. It is not the prettiest way, but once you have your JavaScript interface you can add other methods to tinker with it.
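Since the snippet for the JavaScript route is not preserved here, the following is only a minimal sketch of the idea, reusing the interface name HTMLOUT and the method showHTML from the question. The helper class HtmlExtractionScript and its method buildExtractionUrl are hypothetical names introduced for illustration. The WebView wiring appears only in comments because it needs the Android SDK; the runnable part just builds the javascript: URL.

```java
/** Hypothetical helper: builds the script that hands the page markup to Java. */
final class HtmlExtractionScript {

    /**
     * Returns a "javascript:" URL that calls window.<interfaceName>.<methodName>
     * with the markup of the currently loaded page as its only argument.
     */
    static String buildExtractionUrl(String interfaceName, String methodName) {
        return "javascript:window." + interfaceName + "." + methodName
                + "('<html>'+document.getElementsByTagName('html')[0].innerHTML+'</html>');";
    }

    public static void main(String[] args) {
        System.out.println(buildExtractionUrl("HTMLOUT", "showHTML"));
    }

    // Assumed wiring inside the Activity (Android SDK required, so comments only):
    //
    //   webView.getSettings().setJavaScriptEnabled(true);
    //   webView.addJavascriptInterface(new MyJavaScriptInterface(), "HTMLOUT");
    //   webView.setWebViewClient(new WebViewClient() {
    //       @Override
    //       public void onPageFinished(WebView view, String url) {
    //           // Run the script only once a page has actually finished loading.
    //           view.loadUrl(HtmlExtractionScript.buildExtractionUrl("HTMLOUT", "showHTML"));
    //       }
    //   });
    //
    //   class MyJavaScriptInterface {
    //       @JavascriptInterface   // required when targeting API level 17 or higher
    //       public void showHTML(final String html) {
    //           // Called on a background thread, so hand the text to the UI thread.
    //           runOnUiThread(new Runnable() {
    //               public void run() { txtcode.setText(html); }
    //           });
    //       }
    //   }
}
```

The point of the sketch is the timing: the script is injected from onPageFinished rather than from onCreate, where no page has been loaded yet and there is nothing useful to read back.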
The option you choose also depends, I think, on what you intend to do with the retrieved html...
Android will not let you do this for security concerns. An evil developer could very easily steal user-entered login information.
Instead, you have to catch the text being displayed in the webview before it is displayed. If you don't want to set up a response handler (as per the other answers), I found this fix with some googling:
This is a lot of code, but you should be able to copy and paste it, and at the end of it the variable str will contain the same HTML drawn in the WebView. This answer is from "Simplest way to correctly load html from web page into a string in Java" and it should work on Android as well. I have not tested this and did not write it myself, but it might help you out. Also, the URL this is pulling is hardcoded, so you'll have to change that.
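The referenced code is not reproduced here, so the following is a rough sketch of the same idea using only java.net and java.io, not the original answer's code. The class name PageSource and its method names are made up for illustration, and the ISO-8859-1 fallback for a missing charset is an assumption.

```java
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.net.URI;
import java.net.URLConnection;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper that downloads a page and returns its markup as a String. */
final class PageSource {

    private static final Pattern CHARSET =
            Pattern.compile("charset=\\s*\"?([^\\s;\"]+)", Pattern.CASE_INSENSITIVE);

    /** Extracts the charset from a Content-Type header, falling back to ISO-8859-1. */
    static String charsetOf(String contentType) {
        if (contentType != null) {
            Matcher m = CHARSET.matcher(contentType);
            if (m.find()) {
                return m.group(1);
            }
        }
        return "ISO-8859-1";
    }

    /** Reads everything from the given Reader into a String. */
    static String readAll(Reader reader) throws IOException {
        StringBuilder buf = new StringBuilder();
        char[] chunk = new char[4096];
        int n;
        while ((n = reader.read(chunk)) != -1) {
            buf.append(chunk, 0, n);
        }
        return buf.toString();
    }

    /** Fetches pageUrl and returns the response body decoded with the declared charset. */
    static String fetch(String pageUrl) throws IOException {
        URLConnection con = URI.create(pageUrl).toURL().openConnection();
        try (Reader r = new InputStreamReader(con.getInputStream(),
                charsetOf(con.getContentType()))) {
            return readAll(r);
        }
    }
}
```

A call such as String str = PageSource.fetch(url) would have to run off the UI thread on Android. Note that this returns what the server sends, so it can differ from what the WebView shows if the page is modified by JavaScript or depends on cookies held by the WebView.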
I would suggest that instead of trying to extract the HTML from the WebView, you extract the HTML from the URL. By this, I mean using a third-party library such as Jsoup to fetch and traverse the HTML for you. The following code will get the HTML from a specific URL for you:
One thing I found that needs to be in place is "hidden" away in the ProGuard configuration. The HTML reader is invoked through the JavaScript interface just fine when debugging the app, but this no longer works once the app has been run through ProGuard, unless the HTML reader function is declared in the ProGuard config file, like so:
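The original rule is not preserved here. A keep rule of roughly this shape is what is usually meant; the package and class name below are placeholders (assumptions) and must be replaced with the fully qualified name of your own JavaScript interface class.

```
# Hypothetical class name: use the fully qualified name of the object
# passed to addJavascriptInterface(). Inner classes are written Outer$Inner.
-keepclassmembers class com.example.app.htmldecoder$MyJavaScriptInterface {
    public *;
}
```

Without such a rule ProGuard may rename or strip showHTML, because nothing in the Java code calls it directly; it is only reached by name from the page's JavaScript.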
Tested and confirmed on Android 2.3.6, 4.1.1 and 4.2.1.
I suggest trying a reflection approach, if you have time to spend in the debugger (sorry, I didn't).
Starting from the loadUrl() method of the android.webkit.WebView class:
http://grepcode.com/file/repository.grepcode.com/java/ext/com.google.android/android/2.2_r1.1/android/webkit/WebView.java#WebView.loadUrl%28java.lang.String%2Cjava.util.Map%29
you should arrive at android.webkit.BrowserFrame, which calls the nativeLoadUrl() native method:
http://grepcode.com/file/repository.grepcode.com/java/ext/com.google.android/android/2.2_r1.1/android/webkit/BrowserFrame.java#BrowserFrame.nativeLoadUrl%28java.lang.String%2Cjava.util.Map%29
The implementation of the native method should be here:
http://gitorious.org/0xdroid/external_webkit/blobs/a538f34148bb04aa6ccfbb89dfd5fd784a4208b1/WebKit/android/jni/WebCoreFrameBridge.cpp
Wish you good luck!