Android Decode raw h264 stream with MediaCodec

Posted 2019-02-25 11:49

Question:

I am having trouble decoding raw H.264 data with MediaCodec and drawing it on a TextureView. I receive the raw data as byte arrays; each array is one NAL unit (starting with 0x00 0x00 0x00 0x01), and SPS and PPS NAL units arrive at constant intervals. When new data arrives, I put it into a LinkedBlockingQueue:

public void pushData(byte[] videoBuffer) {
    dataQueue.add(videoBuffer);

    if (!decoderConfigured) {
        // we did not receive first SPS NAL unit, we want to throw away all data until we do
        if (dataQueue.peek() != null && checkIfParameterSet(dataQueue.peek(), SPSID)) {

            // SPS NAL unit is followed by PPS NAL unit, we wait until both are present at the
            // start of the queue
            if (dataQueue.size() == 2) {

                // a fresh iterator starts before the head of the queue, so the first
                // next() returns the SPS NALU and the second next() returns the PPS NALU
                Iterator<byte[]> iterator = dataQueue.iterator();
                byte[] sps = iterator.next();
                byte[] pps = iterator.next();

                String videoFormat = "video/avc";
                MediaFormat format = MediaFormat.createVideoFormat(videoFormat, width, height);
                format.setString(MediaFormat.KEY_MIME, videoFormat);
                format.setByteBuffer("csd-0", ByteBuffer.wrap(concat(sps, pps)));

                try {
                    decoder = MediaCodec.createDecoderByType(videoFormat);
                } catch (IOException e) {
                    // without a decoder there is nothing to configure or start
                    e.printStackTrace();
                    return;
                }

                decoder.configure(format, mOutputSurface, null, 0);
                decoder.start();

                inputBuffer = decoder.getInputBuffers();

                decoderConfigured = true;
            }
        } else {
            // throw away the data which appear before first SPS NALU
            dataQueue.clear();
        }
    }
}

As you can see, the decoder is also configured here, as soon as the first SPS and PPS show up in the queue. The main part runs in a while loop:

private void work() {
    while (true) {
        if (decoderConfigured) {
            byte[] chunk = dataQueue.poll();
            if (chunk != null) {
                // we need to queue the input buffer with SPS and PPS only once
                if (checkIfParameterSet(chunk, SPSID)) {
                    if (!SPSPushed) {
                        SPSPushed = true;
                        queueInputBuffer(chunk);
                    }
                } else if (checkIfParameterSet(chunk, PPSID)) {
                    if (!PPSPushed) {
                        PPSPushed = true;
                        queueInputBuffer(chunk);
                    }
                } else {
                    queueInputBuffer(chunk);
                }
            }

            int decoderStatus = decoder.dequeueOutputBuffer(mBufferInfo, TIMEOUT_USEC);
            if (decoderStatus == MediaCodec.INFO_TRY_AGAIN_LATER) {
                // no output available yet
                if (VERBOSE) Log.d(TAG, "no output from decoder available");
            } else if (decoderStatus == MediaCodec.INFO_OUTPUT_BUFFERS_CHANGED) {
                // not important for us, since we're using Surface
                if (VERBOSE) Log.d(TAG, "decoder output buffers changed");
            } else if (decoderStatus == MediaCodec.INFO_OUTPUT_FORMAT_CHANGED) {
                MediaFormat newFormat = decoder.getOutputFormat();
                if (VERBOSE) Log.d(TAG, "decoder output format changed: " + newFormat);
            } else if (decoderStatus < 0) {
                throw new RuntimeException(
                        "unexpected result from decoder.dequeueOutputBuffer: " + decoderStatus);
            } else { // decoderStatus >= 0
                if (VERBOSE) Log.d(TAG, "surface decoder given buffer " + decoderStatus +
                        " (size=" + mBufferInfo.size + ")");
                if ((mBufferInfo.flags & MediaCodec.BUFFER_FLAG_END_OF_STREAM) != 0) {
                    if (VERBOSE) Log.d(TAG, "output EOS");
                }

                boolean doRender = (mBufferInfo.size != 0);

                try {
                    if (doRender && frameCallback != null) {
                        Log.d(TAG, "Presentation time passed to frameCallback: " + mBufferInfo.presentationTimeUs);
                        frameCallback.preRender(mBufferInfo.presentationTimeUs);
                    }
                    decoder.releaseOutputBuffer(decoderStatus, doRender);

                } catch (Exception e) {
                    e.printStackTrace();
                }
            }
        }
    }
}

And queueInputBuffer looks like this:

private void queueInputBuffer(byte[] data) {
    int inIndex = decoder.dequeueInputBuffer(TIMEOUT_USEC);
    if (inIndex >= 0) {
        inputBuffer[inIndex].clear();
        inputBuffer[inIndex].put(data, 0, data.length);
        decoder.queueInputBuffer(inIndex, 0, data.length, System.currentTimeMillis() * 1000, 0);
    }
}
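
The helpers checkIfParameterSet and concat are used above but not shown. A minimal sketch of what they might look like, assuming that SPSID and PPSID hold the H.264 nal_unit_type values 7 and 8, and that every array begins with the 4-byte start code described earlier (the class name NalHelpers is hypothetical):

```java
import java.util.Arrays;

final class NalHelpers {
    // Assumed values: H.264 nal_unit_type 7 is SPS, 8 is PPS.
    static final int SPSID = 7;
    static final int PPSID = 8;

    // Length of the 0x00 0x00 0x00 0x01 start code that prefixes every array.
    private static final int START_CODE_LENGTH = 4;

    // Returns true when the NAL unit type (the low 5 bits of the header byte
    // that follows the start code) equals the given id.
    static boolean checkIfParameterSet(byte[] data, int id) {
        if (data == null || data.length <= START_CODE_LENGTH) {
            return false;
        }
        return (data[START_CODE_LENGTH] & 0x1F) == id;
    }

    // Joins two arrays into one new array: first, followed by second.
    static byte[] concat(byte[] first, byte[] second) {
        byte[] result = Arrays.copyOf(first, first.length + second.length);
        System.arraycopy(second, 0, result, first.length, second.length);
        return result;
    }
}
```

This sketch does not handle 3-byte start codes (0x00 0x00 0x01), which some encoders emit.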

The class that wraps this mechanism runs on a separate thread, similar to MoviePlayer from grafika. The FrameCallback is SpeedControlCallback, also from grafika.

The resulting preview is corrupted. When the camera (the video source) is still, the picture is mostly fine, but when it moves, tearing, pixelation and artifacts show up. When I save the raw video data to a file and play it on a desktop with ffplay, it looks fine.

While looking for a solution, I found that the problem may be caused by invalid presentation times. I tried to fix this (as you can see in the code, I pass the system time and use preRender()), with no luck. I'm not really sure the glitching is caused by these timestamps, though.
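
One way to take wall-clock jitter out of the timestamps is to derive them from a frame counter instead of the system time. The following is only a sketch under the assumption that the camera delivers a fixed frame rate; FrameClock and the frame rate of 30 are hypothetical and not part of my code, and I do not know whether this affects the glitching:

```java
final class FrameClock {
    private final int framesPerSecond;
    private long frameIndex = 0;

    FrameClock(int framesPerSecond) {
        if (framesPerSecond <= 0) {
            throw new IllegalArgumentException("framesPerSecond must be positive");
        }
        this.framesPerSecond = framesPerSecond;
    }

    // Returns the presentation time in microseconds for the next frame.
    // The values start at 0 and grow by one frame interval per call.
    long nextPresentationTimeUs() {
        long ptsUs = frameIndex * 1_000_000L / framesPerSecond;
        frameIndex++;
        return ptsUs;
    }
}
```

With this, queueInputBuffer would pass clock.nextPresentationTimeUs() in place of System.currentTimeMillis() * 1000, calling it once per picture and not for SPS or PPS units.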

Can someone help me solve this problem?

UPDATE 1

As fadden suggested, I tested my player against data created by MediaCodec itself. My code captured the camera preview, encoded it and saved it to a file. I had done this earlier with my target device's camera feed, so I could just switch the data source. The file based on the phone's camera preview does not show any artifacts in playback. So the conclusion is that the raw data coming from the target device's camera is either processed (or passed to the decoder) incorrectly, or it is incompatible with MediaCodec (which fadden suggested may be the case).

The next thing I did was to compare NAL units of both video streams. The video encoded by MediaCodec looks like this:

0x00, 0x00, 0x00, 0x01, 0x67, 0xNN, 0xNN ...
0x00, 0x00, 0x00, 0x01, 0x65, 0xNN, 0xNN ...
0x00, 0x00, 0x00, 0x01, 0x21, 0xNN, 0xNN ...
0x00, 0x00, 0x00, 0x01, 0x21, 0xNN, 0xNN ...
.
.
.
0x00, 0x00, 0x00, 0x01, 0x21, 0xNN, 0xNN ...

The first NALU (0x67) occurs only once, at the beginning of the stream. It is followed by the second one (0x65) and then by multiple NALUs with 0x21. After that comes 0x65 again, then multiple 0x21, and so on.

However, the target device's camera gives me this:

0x00, 0x00, 0x00, 0x01, 0x67, 0xNN, 0xNN ...
0x00, 0x00, 0x00, 0x01, 0x68, 0xNN, 0xNN ...
0x00, 0x00, 0x00, 0x01, 0x61, 0xNN, 0xNN ...
0x00, 0x00, 0x00, 0x01, 0x61, 0xNN, 0xNN ...
.
.
.
0x00, 0x00, 0x00, 0x01, 0x61, 0xNN, 0xNN ...

And this whole sequence repeats continuously throughout the stream.
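
To read the two listings, it helps to split the byte after the start code into its H.264 header fields: one forbidden_zero_bit, two bits of nal_ref_idc and five bits of nal_unit_type. A small sketch of that decoding (the class name NalHeader is hypothetical, and only the types that appear in the listings are named):

```java
final class NalHeader {
    // nal_ref_idc: bits 5 and 6 of the header byte.
    static int refIdc(int headerByte) {
        return (headerByte >> 5) & 0x03;
    }

    // nal_unit_type: the low 5 bits of the header byte.
    static int type(int headerByte) {
        return headerByte & 0x1F;
    }

    // Human-readable name for the NAL unit types seen in the listings.
    static String describe(int headerByte) {
        switch (type(headerByte)) {
            case 1: return "non-IDR slice";
            case 5: return "IDR slice";
            case 7: return "SPS";
            case 8: return "PPS";
            default: return "other";
        }
    }
}
```

Decoded this way, 0x67 is an SPS, 0x68 is a PPS and 0x65 is an IDR slice. Both 0x21 and 0x61 are non-IDR slices; they differ only in nal_ref_idc (1 and 3 respectively). So the MediaCodec listing shows SPS, IDR slice, then non-IDR slices, while the camera listing shows SPS, PPS, then non-IDR slices, with no 0x65 unit in the part shown.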