I'm trying to encode H.264 video on Android for real-time streaming using MediaCodec, but dequeueOutputBuffer is intermittently very slow: sometimes it returns in well under a millisecond, at other times it takes close to 100 ms (see the log output below). I've even seen it take up to 200 ms for the output buffer to be ready. Am I doing something wrong in my code, or do you think this is an issue with the OMX.Nvidia.h264.encoder?
Maybe I need to downsample the image from 1280x720 to something smaller? Or maybe I need to dequeue and queue more input buffers while I'm waiting for the output buffer? (There are 6 input and 6 output buffers available). I'm using Android API 19, so I can't use the asynchronous MediaCodec processing method. I'm actually streaming an image from a Google Project Tango tablet, so my other suspicion is that perhaps the Tango's background operations are taking too long and causing the encoder to be slow. Any thoughts on what might be slowing this down so much?
01-20 23:36:30.728 2920-3014/com.... D/StreamingThread: dequeueOutputBuffer took 0.400666ms.
01-20 23:36:30.855 2920-3014/com.... D/StreamingThread: dequeueOutputBuffer took 94.290667ms.
01-20 23:36:30.880 2920-3014/com.... D/StreamingThread: dequeueOutputBuffer took 0.57ms.
01-20 23:36:30.929 2920-3014/com.... D/StreamingThread: dequeueOutputBuffer took 4.878417ms.
01-20 23:36:31.042 2920-3014/com.... D/StreamingThread: dequeueOutputBuffer took 77.495417ms.
01-20 23:36:31.064 2920-3014/com.... D/StreamingThread: dequeueOutputBuffer took 0.3225ms.
01-20 23:36:31.182 2920-3014/com.... D/StreamingThread: dequeueOutputBuffer took 74.777583ms.
01-20 23:36:31.195 2920-3014/com.... D/StreamingThread: dequeueOutputBuffer took 0.23ms.
01-20 23:36:31.246 2920-3014/com.... D/StreamingThread: dequeueOutputBuffer took 17.243583ms.
01-20 23:36:31.350 2920-3014/com.... D/StreamingThread: dequeueOutputBuffer took 80.14725ms.
01-20 23:36:31.373 2920-3014/com.... D/StreamingThread: dequeueOutputBuffer took 2.493834ms.
01-20 23:36:31.421 2920-3014/com.... D/StreamingThread: dequeueOutputBuffer took 13.273ms.
01-20 23:36:31.546 2920-3014/com.... D/StreamingThread: dequeueOutputBuffer took 93.543667ms.
01-20 23:36:31.576 2920-3014/com.... D/StreamingThread: dequeueOutputBuffer took 5.309334ms.
01-20 23:36:31.619 2920-3014/com.... D/StreamingThread: dequeueOutputBuffer took 13.402583ms.
01-20 23:36:31.686 2920-3014/com.... D/StreamingThread: dequeueOutputBuffer took 22.5485ms.
01-20 23:36:31.809 2920-3014/com.... D/StreamingThread: dequeueOutputBuffer took 91.392083ms.
My relevant code is as follows:
public class StreamingThread extends Thread {
...
// encoding
private MediaCodec mVideoEncoder = null;
private ByteBuffer[] mEncoderInputBuffers = null;
private ByteBuffer[] mEncoderOutputBuffers = null;
private NV21Convertor mNV21Converter = null;
public static native VideoFrame getNewFrame();
public StreamingThread()
{
this.setPriority(MAX_PRIORITY);
}
@Override
public void run()
{
Looper.prepare();
init();
Looper.loop();
}
private void init()
{
mHandler = new Handler() {
public void handleMessage(Message msg) {
// process incoming messages here
switch(msg.what)
{
case HAVE_NEW_FRAME: // new frame has arrived (signaled from main thread)
processBufferedFrames();
break;
case CLOSE_THREAD:
close();
break;
default:
Log.e(LOGTAG, "received unknown message!");
}
}
};
try {
...
// set up video encoding
final String mime = "video/avc"; // H.264/AVC
listAvailableEncoders(mime); // (this creates some debug output only)
String codec = "OMX.Nvidia.h264.encoder"; // instead, hard-code the codec we want to use for now
mVideoEncoder = MediaCodec.createByCodecName(codec);
if(mVideoEncoder == null)
Log.e(LOGTAG, "Media codec " + codec + " is not available!");
// TODO: change, based on what we're streaming...
int FRAME_WIDTH = 1280;
int FRAME_HEIGHT = 720;
// https://github.com/fyhertz/libstreaming/blob/ac44416d88ed3112869ef0f7eab151a184bbb78d/src/net/majorkernelpanic/streaming/hw/EncoderDebugger.java
mNV21Converter = new NV21Convertor();
mNV21Converter.setSize(FRAME_WIDTH, FRAME_HEIGHT);
mNV21Converter.setEncoderColorFormat(MediaCodecInfo.CodecCapabilities.COLOR_FormatYUV420Planar);
mNV21Converter.setColorPanesReversed(true);
mNV21Converter.setYPadding(0);
MediaFormat format = MediaFormat.createVideoFormat(mime, FRAME_WIDTH, FRAME_HEIGHT);
format.setInteger(MediaFormat.KEY_FRAME_RATE, 25);
format.setInteger(MediaFormat.KEY_I_FRAME_INTERVAL, 10);
format.setInteger(MediaFormat.KEY_COLOR_FORMAT, MediaCodecInfo.CodecCapabilities.COLOR_FormatYUV420Planar);
// TODO: optimize bit rate
format.setInteger(MediaFormat.KEY_BIT_RATE, 250000); // 250 kbit/s, roughly 31 kilobytes/s
mVideoEncoder.configure(format, null, null, MediaCodec.CONFIGURE_FLAG_ENCODE);
mVideoEncoder.start();
mEncoderInputBuffers = mVideoEncoder.getInputBuffers();
mEncoderOutputBuffers = mVideoEncoder.getOutputBuffers();
Log.d(LOGTAG, "Number of input buffers " + mEncoderInputBuffers.length);
Log.d(LOGTAG, "Number of output buffers " + mEncoderOutputBuffers.length);
initialized = true;
} catch (Exception e) {
e.printStackTrace();
}
}
private void close()
{
Looper.myLooper().quit();
mVideoEncoder.stop();
mVideoEncoder.release();
mVideoEncoder = null;
}
private void processBufferedFrames()
{
if (!initialized)
return;
VideoFrame frame = getNewFrame();
try {
sendTCPFrame(frame);
} catch (Exception e) {
e.printStackTrace();
}
}
private void sendTCPFrame(VideoFrame frame)
{
long start = System.nanoTime();
long start2 = System.nanoTime();
int inputBufferIndex = -1;
while((inputBufferIndex = mVideoEncoder.dequeueInputBuffer(-1)) < 0 ) { // -1: wait indefinitely for the buffer
switch(inputBufferIndex) {
default:
Log.e(LOGTAG, "dequeueInputBuffer returned unknown value: " + inputBufferIndex);
}
}
// fill in input (raw) data:
mEncoderInputBuffers[inputBufferIndex].clear();
long stop2 = System.nanoTime();
Log.d(LOGTAG, "dequeueInputBuffer took " + (stop2 - start2) / 1e6 + "ms.");
start2 = System.nanoTime();
byte[] pixels = mNV21Converter.convert(frame.pixels);
stop2 = System.nanoTime();
Log.d(LOGTAG, "mNV21Converter.convert took " + (stop2-start2)/1e6 + "ms.");
start2 = System.nanoTime();
mEncoderInputBuffers[inputBufferIndex].put(pixels);
stop2 = System.nanoTime();
Log.d(LOGTAG, "mEncoderInputBuffers[inputBufferIndex].put(pixels) took " + (stop2 - start2) / 1e6 + "ms.");
start2 = System.nanoTime();
//mVideoEncoder.queueInputBuffer(inputBufferIndex, 0, pixels.length, 0, 0);
// presentationTimeUs is expected in microseconds, so convert from nanoseconds:
mVideoEncoder.queueInputBuffer(inputBufferIndex, 0, pixels.length, System.nanoTime() / 1000, 0);
stop2 = System.nanoTime();
Log.d(LOGTAG, "queueInputBuffer took " + (stop2 - start2) / 1e6 + "ms.");
start2 = System.nanoTime();
// wait for encoded data to become available:
int outputBufferIndex = -1;
MediaCodec.BufferInfo bufInfo = new MediaCodec.BufferInfo();
long timeoutUs = -1;//10000; // microseconds
while((outputBufferIndex = mVideoEncoder.dequeueOutputBuffer(bufInfo, timeoutUs)) < 0 ) { // -1: wait indefinitely for the buffer
Log.i(LOGTAG, "dequeueOutputBuffer returned value: " + outputBufferIndex);
switch(outputBufferIndex) {
case MediaCodec.INFO_OUTPUT_BUFFERS_CHANGED:
// output buffers have changed, move reference
mEncoderOutputBuffers = mVideoEncoder.getOutputBuffers();
break;
case MediaCodec.INFO_OUTPUT_FORMAT_CHANGED:
// Subsequent data will conform to new format.
//MediaFormat format = codec.getOutputFormat();
Log.e(LOGTAG, "dequeueOutputBuffer returned INFO_OUTPUT_FORMAT_CHANGED ?!");
break;
case MediaCodec.INFO_TRY_AGAIN_LATER:
Log.w(LOGTAG, "dequeueOutputBuffer return INFO_TRY_AGAIN_LATER");
break;
default:
Log.e(LOGTAG, "dequeueOutputBuffer returned unknown value: " + outputBufferIndex);
}
}
stop2 = System.nanoTime();
Log.d(LOGTAG, "dequeueOutputBuffer took " + (stop2 - start2) / 1e6 + "ms.");
// output (encoded) data available!
Log.d(LOGTAG, "encoded buffer info: size = " + bufInfo.size + ", offset = " + bufInfo.offset + ", presentationTimeUs = " + bufInfo.presentationTimeUs + ", flags = " + bufInfo.flags);
ByteBuffer encodedData = mEncoderOutputBuffers[outputBufferIndex];
final int sizeOfImageData = bufInfo.size;
long stop = System.nanoTime();
Log.d(LOGTAG, "Encoding image took " + (stop-start)/1e6 + "ms.");
start = System.nanoTime();
// assemble header:
...
encodedData.rewind();
// copy (!) the encoded data into a direct (off-heap, not array-backed) buffer:
ByteBuffer imageBuffer = ByteBuffer.allocateDirect(encodedData.remaining());
imageBuffer.put(encodedData); // TODO: can this copy be avoided?
stop = System.nanoTime();
Log.d(LOGTAG, "Preparing content for streaming took " + (stop - start) / 1e6 + "ms.");
// do streaming via TCP
...
mVideoEncoder.releaseOutputBuffer(outputBufferIndex, false);
}
// see http://developer.android.com/reference/android/media/MediaCodecInfo.html
private void listAvailableEncoders(String mimeType)
{
Log.d(LOGTAG, "Available encoders for mime type " + mimeType + ":");
for (int i = 0; i < MediaCodecList.getCodecCount(); i++) {
MediaCodecInfo codec = MediaCodecList.getCodecInfoAt(i);
if (!codec.isEncoder())
continue;
String[] types = codec.getSupportedTypes();
for (int j = 0; j < types.length; j++) {
//if (types[j].equalsIgnoreCase(mimeType)) {
String msg = "- name: " + codec.getName() + ", supported color formats for " + mimeType + ":";
MediaCodecInfo.CodecCapabilities cap = codec.getCapabilitiesForType(mimeType);
for(int k = 0; k < cap.colorFormats.length; ++k) msg = msg + " " + cap.colorFormats[k];
Log.d(LOGTAG, msg);
// break;
//}
}
}
}
Yes, there is something wrong with your code: you wait synchronously for the current frame to come out of the encoder before submitting the next one. Most hardware codecs have more latency than you would expect, and to reach the throughput the encoder is capable of, you need to use it asynchronously.
That is, after queuing one input buffer for encoding, you should not wait for the encoded output buffer, but only check whether output is available. Then go on to queue the next input buffer, and again check for any available output. Only when you can't get an input buffer immediately should you start waiting for output. This way, the encoder always has more than one input buffer to work on, which keeps it busy enough to actually reach the frame rate it is capable of.
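A minimal sketch of that feed-and-drain loop, written in plain Java so it can run off-device. FrameEncoder and FakeEncoder are hypothetical stand-ins, not Android APIs: they only mimic the shape of the calls, and the fake's behaviour (it holds a frame back until several later frames have been queued) is an assumption made for illustration. On a real MediaCodec the corresponding calls would be dequeueInputBuffer(0), queueInputBuffer(...), dequeueOutputBuffer(info, 0) and releaseOutputBuffer(...).

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

/** Hypothetical, simplified stand-in for the MediaCodec calls the loop needs. */
interface FrameEncoder {
    int dequeueInputBuffer();                       // free slot index, or -1 if none right now
    void queueInputBuffer(int index, int frameId);  // submit a frame in that slot
    int dequeueOutputBuffer();                      // encoded frame id, or -1 if nothing ready
    void signalEndOfStream();                       // flush whatever is still pending
}

/** Fake encoder: frame N is only released once `latency` later frames are queued. */
final class FakeEncoder implements FrameEncoder {
    private final int latency;
    private final Queue<Integer> freeSlots = new ArrayDeque<>();
    private final Queue<int[]> pending = new ArrayDeque<>(); // {slot, frameId}
    private boolean endOfStream = false;

    FakeEncoder(int slots, int latency) {
        if (latency >= slots) {
            throw new IllegalArgumentException("latency must be smaller than slot count");
        }
        this.latency = latency;
        for (int i = 0; i < slots; i++) freeSlots.add(i);
    }

    public int dequeueInputBuffer() {
        Integer slot = freeSlots.poll();
        return slot == null ? -1 : slot;
    }

    public void queueInputBuffer(int index, int frameId) {
        pending.add(new int[] {index, frameId});
    }

    public void signalEndOfStream() {
        endOfStream = true;
    }

    public int dequeueOutputBuffer() {
        boolean ready = endOfStream ? !pending.isEmpty() : pending.size() > latency;
        if (!ready) return -1;
        int[] done = pending.remove();
        freeSlots.add(done[0]); // the input slot becomes reusable
        return done[1];
    }
}

final class PipelinedEncodeSketch {
    /** Feeds frames 0..frameCount-1 and collects encoded frame ids in output order. */
    static List<Integer> encodeAll(FrameEncoder enc, int frameCount) {
        List<Integer> out = new ArrayList<>();
        int next = 0;
        while (next < frameCount) {
            int slot = enc.dequeueInputBuffer(); // do not block on input
            if (slot >= 0) {
                enc.queueInputBuffer(slot, next++);
            }
            // With a real codec, a blocking wait for output belongs here,
            // but only in the case where no input slot was available.
            drain(enc, out); // just check for output, never wait for "this" frame
        }
        enc.signalEndOfStream();
        drain(enc, out);
        return out;
    }

    private static void drain(FrameEncoder enc, List<Integer> out) {
        int id;
        while ((id = enc.dequeueOutputBuffer()) >= 0) {
            out.add(id);
        }
    }

    public static void main(String[] args) {
        // prints [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
        System.out.println(encodeAll(new FakeEncoder(6, 3), 10));
    }
}
```

With this fake, the pattern from the question (queue one frame, then wait for that frame's output) would never return: frame 0 is only released after frame 3 has been queued. The loop above keeps feeding input, so output starts to flow as soon as the encoder has enough frames in flight.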
(If you are OK with requiring Android 5.0, you could take a look at MediaCodec.setCallback, which makes it easier to work with the codec asynchronously.)
There are even some codecs (mainly decoders though, if my memory serves me correctly) that won't output the first buffer until you have passed in more than a few input buffers.