I have gotten some reports from users of crashes when they try to use my application on Verizon's 4G/LTE.
Looking at the stack trace, it looks like Android's HttpClient.execute() implementation is throwing an OOM. This happens only on 4G/LTE devices, specifically the HTC Thunderbolt, and only when on 4G/LTE. WiFi, 3G, and UMTS are OK. It also works fine on Sprint's WiMax 4G.
Two questions:
What's the best way to get the attention of Android devs about this? Any better options than reporting on http://code.google.com/p/android/issues?
Any ideas on how I can work around this? I don't have a 4G device myself and I can't get this to happen in the emulator, so I need to make some educated guesses here. I can try to catch the OOM in my code, attempt to clean up, and force a GC, but I'm not sure if that's a good idea. Comments or other suggestions?
Here's what my code's doing:
HttpParams params = this.getHttpParams(); // returns params
ClientConnectionManager cm = new ThreadSafeClientConnManager(params, this.getHttpSchemeRegistry() );
DefaultHttpClient httpClient = new DefaultHttpClient( cm, params );
HttpResponse response = null;
request = new HttpGet( url );
try {
response = httpClient.execute(request); // <-- OOM on 4G/LTE. OK otherwise
int statusCode = response.getStatusLine().getStatusCode();
Log.i("fetcher", "execute returned, http status " + statusCode );
...
Here's the crashing stack trace:
E/dalvikvm-heap(11639): Out of memory on a 2055696-byte allocation.
I/dalvikvm(11639): "Thread-16" prio=5 tid=9 RUNNABLE
I/dalvikvm(11639): | group="main" sCount=0 dsCount=0 s=N obj=0x48563070 self=0x3c4340
I/dalvikvm(11639): | sysTid=11682 nice=0 sched=0/0 cgrp=default handle=3948760
I/dalvikvm(11639): | schedstat=( 208709711 74005130 214 )
I/dalvikvm(11639): at org.apache.http.impl.io.AbstractSessionInputBuffer.init(AbstractSessionInputBuffer.java:~79)
I/dalvikvm(11639): at org.apache.http.impl.io.SocketInputBuffer.<init>(SocketInputBuffer.java:93)
I/dalvikvm(11639): at org.apache.http.impl.SocketHttpClientConnection.createSessionInputBuffer(SocketHttpClientConnection.java:83)
I/dalvikvm(11639): at org.apache.http.impl.conn.DefaultClientConnection.createSessionInputBuffer(DefaultClientConnection.java:170)
I/dalvikvm(11639): at org.apache.http.impl.SocketHttpClientConnection.bind(SocketHttpClientConnection.java:106)
I/dalvikvm(11639): at org.apache.http.impl.conn.DefaultClientConnection.openCompleted(DefaultClientConnection.java:129)
I/dalvikvm(11639): at org.apache.http.impl.conn.DefaultClientConnectionOperator.openConnection(DefaultClientConnectionOperator.java:173)
I/dalvikvm(11639): at org.apache.http.impl.conn.AbstractPoolEntry.open(AbstractPoolEntry.java:164)
I/dalvikvm(11639): at org.apache.http.impl.conn.AbstractPooledConnAdapter.open(AbstractPooledConnAdapter.java:119)
I/dalvikvm(11639): at org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDirector.java:348)
I/dalvikvm(11639): at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:555)
I/dalvikvm(11639): at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:487)
I/dalvikvm(11639): at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:465)
I/dalvikvm(11639): at com.myapplication.Fetcher.trySourceFetch(Fetcher.java:205)
I/dalvikvm(11639): at com.myapplication.Fetcher.run(Fetcher.java:298)
I/dalvikvm(11639): at java.lang.Thread.run(Thread.java:1102)
I/dalvikvm(11639):
E/dalvikvm(11639): Out of memory: Heap Size=24171KB, Allocated=23142KB, Bitmap Size=59KB, Limit=21884KB
E/dalvikvm(11639): Extra info: Footprint=24327KB, Allowed Footprint=24519KB, Trimmed=348KB
W/dalvikvm(11639): threadid=9: thread exiting with uncaught exception (group=0x40025b38)
That is not indicated by the stack trace you have on the issue. Of course, you didn't provide the whole stack trace on the issue.
The odds of this being a pure Android bug are small, though not zero.
Here are some other possibilities, in no particular order:
There is no problem with execute() per se, but that you are simply running out of memory, and the stack traces you have encountered are simply demonstrating that execute() is stressing your heap.
The problem is in some modifications that HTC made to Android for the Thunderbolt, possibly only taking effect when on the LTE network.
The problem is somehow caused by the Verizon LTE network itself (e.g., some proxy of theirs sending back screwball information that is causing HttpClient to have a conniption).
First, I'd use existing tools (e.g., dumping HPROF and examining with Eclipse MAT) to confirm that you don't have a memory leak in general that the Thunderbolt/LTE combo just seems to be tripping over.
Next, I recommend that you come up with some way to consistently reproduce the error. That could be your existing app with a series of steps to follow, or it could be a dedicated app (e.g., log the URL that triggers the OOM, then create a tiny app that just does that HttpClient request). I wish DeviceAnywhere had a Thunderbolt, but it doesn't look like it. I'll put some feelers out and see if I can get some help on that front.
In terms of working around it, as a stopgap, you can detect that you're running on a Thunderbolt via android.os.Build data, and perhaps that you're on LTE via ConnectivityManager (I'm guessing LTE would list as WiMAX, but that's just a guess), and warn users about the problems with that combo.
Beyond that, you can try changing up your HttpClient usage a bit and see if it has an effect, such as:
Giving AndroidHttpClient a shot as a drop-in replacement
Changing how you use ThreadSafeClientConnManager
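The stopgap check described above might be sketched as follows. This is only an illustration in plain Java so it can run anywhere: on a device, the two inputs would come from android.os.Build.MODEL and from the active NetworkInfo obtained through ConnectivityManager. The class name, the model placeholder, and the set of network names are all assumptions made for this sketch, not values confirmed for the Thunderbolt.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

/** Decides whether to warn the user about a known-bad device/network combination. */
final class AffectedComboCheck {

    // ASSUMPTION: placeholder value. Check what android.os.Build.MODEL really
    // reports on a Thunderbolt before relying on this.
    static final String AFFECTED_MODEL = "THUNDERBOLT-MODEL-PLACEHOLDER";

    // ASSUMPTION: the answer only guesses that LTE may be reported as WiMAX,
    // so both names are treated as suspicious here.
    static final Set<String> AFFECTED_NETWORKS =
            new HashSet<String>(Arrays.asList("LTE", "WIMAX"));

    /**
     * @param deviceModel on Android, the value of android.os.Build.MODEL
     * @param networkName on Android, a name taken from the active NetworkInfo
     *                    (for example getTypeName() or getSubtypeName())
     * @return true only when both the device and the network match
     */
    static boolean shouldWarn(String deviceModel, String networkName) {
        if (deviceModel == null || networkName == null) {
            return false; // unknown device or no active network: nothing to warn about
        }
        boolean modelMatches = AFFECTED_MODEL.equalsIgnoreCase(deviceModel.trim());
        boolean networkMatches =
                AFFECTED_NETWORKS.contains(networkName.trim().toUpperCase(Locale.ROOT));
        return modelMatches && networkMatches;
    }
}
```

Keeping the decision in a method that takes plain strings means it can be unit tested without a device; the Android-specific lookups stay at the call site.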
I'm sorry that I don't have a "magic bullet" answer for you here.
UPDATE
Now that I have the full stack trace, looking through the source code is...illuminating, somewhat.
The problem appears to be that the call that reads the socket buffer size from params is returning that 2MB or so value that is triggering the OOM. That's an awfully big buffer, particularly for the Dalvik GC engine, which can get fragmented (yes, there's that word again).
params here is the HttpParams. You seem to be creating those yourself via getHttpParams(). For example, AndroidHttpClient sets the socket buffer size to 8192.
If you are setting the socket buffer size yourself, try reducing it. If not, try setting it to 8192 and see if that helps.
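To make the buffer-size advice concrete, here is a minimal plain-JDK sketch of the idea: never size a read buffer directly from whatever the socket reports, but clamp it to a small range first. This is not HttpClient's actual code; the class name and the 1024-byte lower bound are assumptions for illustration. With HttpClient itself, the corresponding knob should be HttpConnectionParams.setSocketBufferSize(params, 8192), applied to the HttpParams you build in getHttpParams().

```java
import java.io.IOException;
import java.net.Socket;

/** Picks a bounded read-buffer size instead of trusting the socket's own value. */
final class BoundedSocketBuffer {

    // The value the answer says AndroidHttpClient uses.
    static final int MAX_BUFFER_BYTES = 8192;
    // ASSUMPTION: arbitrary lower bound chosen for this sketch.
    static final int MIN_BUFFER_BYTES = 1024;

    /** Clamps a reported buffer size into [MIN_BUFFER_BYTES, MAX_BUFFER_BYTES]. */
    static int boundedSize(int reportedSize) {
        return Math.max(MIN_BUFFER_BYTES, Math.min(reportedSize, MAX_BUFFER_BYTES));
    }

    /** Allocates a read buffer for the socket without exceeding the cap. */
    static byte[] allocateReadBuffer(Socket socket) {
        try {
            return new byte[boundedSize(socket.getReceiveBufferSize())];
        } catch (IOException e) {
            // The size could not be read: fall back to the cap.
            return new byte[MAX_BUFFER_BYTES];
        }
    }
}
```

With the 2055696-byte figure from the log, boundedSize returns 8192, so a network that advertises huge socket buffers no longer translates into a huge heap allocation per connection.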
here's the fix: https://review.source.android.com/22852
in the meantime, URLConnection is immune. it's only HttpClient that has this problem.
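As a rough illustration of the URLConnection route, the sketch below reads a URL fully into memory using only java.net classes. The class name, the timeout values, and the 8192-byte chunk size are assumptions for this example; error handling and response-code checks are left to the caller.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.net.URLConnection;

/** Reads a URL fully into memory using java.net.URLConnection instead of HttpClient. */
final class UrlConnectionFetcher {

    static byte[] fetch(URL url) throws IOException {
        URLConnection conn = url.openConnection();
        // ASSUMPTION: illustrative timeouts, in milliseconds.
        conn.setConnectTimeout(15000);
        conn.setReadTimeout(15000);

        InputStream in = conn.getInputStream();
        try {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] chunk = new byte[8192]; // small fixed-size read buffer
            int n;
            while ((n = in.read(chunk)) != -1) {
                out.write(chunk, 0, n);
            }
            return out.toByteArray();
        } finally {
            in.close(); // always release the stream, even on failure
        }
    }
}
```

In the Fetcher from the question, a call such as UrlConnectionFetcher.fetch(new URL(url)) could stand in for the httpClient.execute(request) step while the HttpClient fix is not yet on users' devices.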
if you're a developer wanting to test this kind of failure, you can use "adb shell setprop" to set, say, "net.tcp.buffersize.wifi" so that the maximum read/write socket buffer sizes are huge when your device is on wifi. something like the following would be a real stress test:
it's this kind of configuration change that exercises the HttpClient bug. i don't know what the exact values on the Thunderbolt are, but someone with the device could find out using "adb shell getprop | grep buffersize".
Maybe this will help: