Video rendering is broken MediaCodec H.264 stream

2019-01-14 20:08发布

问题:

I am implementing a decoder using MediaCodec Java API for decoding live H.264 remote stream. I am receiving H.264 encoded data from native layer using a callback (void OnRecvEncodedData(byte[] encodedData)), decode and render on Surface of TextureView. My implementation is completed (retrieving encoded streams using callback, decode and rendering etc). Here is my decoder class:

public class MediaCodecDecoder extends Thread implements MyFrameAvailableListener {

    private static final boolean VERBOSE = true;
    private static final String LOG_TAG = MediaCodecDecoder.class.getSimpleName();
    private static final String VIDEO_FORMAT = "video/avc"; // h.264
    private static final long mTimeoutUs = 10000l;

    private MediaCodec mMediaCodec;
    Surface mSurface;
    volatile boolean m_bConfigured;
    volatile boolean m_bRunning;
    long startMs;

    public MediaCodecDecoder() {
        JniWrapper.SetFrameAvailableListener(this);
    }

    // this is my callback where I am receiving encoded streams from native layer 
    @Override
    public void OnRecvEncodedData(byte[] encodedData) {
        if(!m_bConfigured && bKeyFrame(encodedData)) {
            Configure(mSurface, 240, 320, encodedData);
        }
        if(m_bConfigured) {
            decodeData(encodedData);
        }
    }

    public void SetSurface(Surface surface) {
        if (mSurface == null) {
            mSurface = surface;
        }
    }

    public void Start() {
        if(m_bRunning)
            return;
        m_bRunning = true;
        start();
    }

    public void Stop() {
        if(!m_bRunning)
            return;
        m_bRunning = false;
        mMediaCodec.stop();
        mMediaCodec.release();
    }

    private void Configure(Surface surface, int width, int height, byte[] csd0) {
        if (m_bConfigured) {
            Log.e(LOG_TAG, "Decoder is already configured");
            return;
        }
        if (mSurface == null) {
            Log.d(LOG_TAG, "Surface is not available/set yet.");
            return;
        }
        MediaFormat format = MediaFormat.createVideoFormat(VIDEO_FORMAT, width, height);
        format.setByteBuffer("csd-0", ByteBuffer.wrap(csd0));
        try {
            mMediaCodec = MediaCodec.createDecoderByType(VIDEO_FORMAT);
        } catch (IOException e) {
            Log.d(LOG_TAG, "Failed to create codec: " + e.getMessage());
        }

        startMs = System.currentTimeMillis();
        mMediaCodec.configure(format, surface, null, 0);
        if (VERBOSE) Log.d(LOG_TAG, "Decoder configured.");

        mMediaCodec.start();
        Log.d(LOG_TAG, "Decoder initialized.");

        m_bConfigured = true;
    }

    @SuppressWarnings("deprecation")
    private void decodeData(byte[] data) {
        if (!m_bConfigured) {
            Log.e(LOG_TAG, "Decoder is not configured yet.");
            return;
        }
        int inIndex = mMediaCodec.dequeueInputBuffer(mTimeoutUs);
        if (inIndex >= 0) {
            ByteBuffer buffer;
            if (Build.VERSION.SDK_INT < Build.VERSION_CODES.LOLLIPOP) {
                buffer = mMediaCodec.getInputBuffers()[inIndex];
                buffer.clear();
            } else {
                buffer = mMediaCodec.getInputBuffer(inIndex);
            }
            if (buffer != null) {
                buffer.put(data);
                long presentationTimeUs = System.currentTimeMillis() - startMs;
                mMediaCodec.queueInputBuffer(inIndex, 0, data.length, presentationTimeUs, 0);
            }
        }
    }

    private static boolean bKeyFrame(byte[] frameData) {
        return ( ( (frameData[4] & 0xFF) & 0x0F) == 0x07);
    }

    @Override
    public void run() {
        try {
            MediaCodec.BufferInfo info = new MediaCodec.BufferInfo();
            while(m_bRunning) {
                if(m_bConfigured) {
                    int outIndex = mMediaCodec.dequeueOutputBuffer(info, mTimeoutUs);
                    if(outIndex >= 0) {
                        mMediaCodec.releaseOutputBuffer(outIndex, true);
                    }
                } else {
                    try {
                        Thread.sleep(10);
                    } catch (InterruptedException ignore) {
                    }
                }
            }
        } finally {
            Stop();
        }
    }
}

Now the problem is - the streams is being decoded and rendered on surface but the video is not clear. It seems like the frames are broken and scene is distorted/dirty. The movement is broken and square shaped fragments everywhere (I am really sorry as I don't have the screenshot right now).

About my streams - its H.264 encoded and consists of I frames and P frames only (there is no B frame). Every I frame has SPS + PPS + payload structure. The color format used during encoding (using FFMPEG in native layer) is YUV420 planner. The sent length of data from native layer is okay (width * height * (3 / 2)).

During configure() I just set the csd-0 value with SPS frame. The frame used for configuration was an I frame (SPS + PPS + payload) - the prefix was a SPS frame, so I think the configuration was successful. Note that, I didn't set the csd-1 value with PPS frame (is it a problem?).

Every frame has preceding start codes (0x00 0x00 0x00 0x01) for both p-frame and I-frame (for I-frame the start code is present both infront of SPS and PPS frame).

Moreover, I am setting the presentation timestamp as System.currrentTimeMillis() - startTime for every frame which is increasing order for every new frame. I think this shouldn't cause any problem (Correct me if I am wrong).

My device is Nexus 5 from Google with Android version 4.4.4 and chipset is Qualcomm MSM8974 Snapdragon 800. I am using Surface for decoding, so I think there should not be any device specific color format mismatch issues.

I can also provide my TextureView code if needed.

What might be the cause of my incorrect decoding/rendering? Thanks in advance!

EDIT 1

I tried by manually passing my codec-specific data(SPS and PPS bytes) during configuration. But this didn't make any change :(

byte[] sps  = {0x00, 0x00, 0x00, 0x01, 0x67, 0x4d, 0x40, 0x0c, (byte) 0xda, 0x0f, 0x0a, 0x68, 0x40, 0x00, 0x00, 0x03, 0x00, 0x40, 0x00, 0x00, 0x07, (byte) 0xa3, (byte) 0xc5, 0x0a, (byte) 0xa8};
format.setByteBuffer("csd-0", ByteBuffer.wrap(sps));

byte[] pps = {0x00, 0x00, 0x00, 0x01, 0x68, (byte) 0xef, 0x04, (byte) 0xf2, 0x00, 0x00};
format.setByteBuffer("csd-1", ByteBuffer.wrap(pps));

I also tried by trimming the start codes (0x00, 0x00, 0x00, 0x01) but no progress!

EDIT 2

I tried with hardware accelerated {{TextureView}} as it is mentioned in official documentation (though I didn't find any H/W acceleration code in sample project of MediaCodec-textureView). But still no progress. Now I commented the H/W acceleration code snippet.

EDIT 3

The screenshots are avilable now:

EDIT 4

For further clarification, this is my H.264 encoded I-frame hex stream format:

00 00 00 01 67 4d 40 0c da 0f 0a 68 40 00 00 03 00 40 00 00 07 a3 c5 0a a8 00 00 00 01 68 ef 04 f2 00 00 01 06 05 ff ff 69 dc 45 e9 bd e6 d9 48 b7 96 2c d8 20 d9 23 ee ef 78 32 36 34 20 2d 20 63 6f 72 65 20 31 34 36 20 2d 20 48 2e 32 36 34 2f 4d 50 45 47 2d 34 20 41 56 43 20 63 6f 64 65 63 20 2d 20 43 6f 70 79 6c 65 66 74 20 32 30 30 33 2d 32 30 31 35 20 2d 20 68 74 74 70 3a 2f 2f 77 77 77 2e 76 69 64 65 6f 6c 61 6e 2e 6f 72 67 2f 78 32 36 34 2e 68 74 6d 6c 20 2d 20 6f 70 74 69 6f 6e 73 3a 20 63 61 62 61 63 3d 31 20 72 65 66 3d 31 20 64 65 62 6c 6f 63 6b 3d 31 3a 30 3a 30 20 61 6e 61 6c 79 73 65 3d 30 78 31 3a 30 78 31 20 6d 65 3d 68 65 78 20 73 75 62 6d 65 3d 30 20 70 73 79 3d 31 20 70 73 79 5f 72 64 3d 31 2e 30 30 3a 30 2e 30 30 20 6d 69 78 65 64 5f 72 65 66 3d 30 20 6d 65 5f 72 61 6e 67 65 3d 31 36 20 63 68 72 6f 6d 61 5f 6d 65 3d 31 20 74 72 65 6c 6c 69 73 3d 30 20 38 78 38 64 63 74

And this is a P-frame:

00 00 00 01 41 9a 26 22 df 76 4b b2 ef cf 57 ac 5b b6 3b 68 b9 87 b2 71 a5 9b 61 3c 93 47 bc 79 c5 ab 0f 87 34 f6 40 6a cd 80 03 b1 a2 c2 4e 08 13 cd 4e 3c 62 3e 44 0a e8 97 80 ec 81 3f 31 7c f1 29 f1 43 a0 c0 a9 0a 74 62 c7 62 74 da c3 94 f5 19 23 ff 4b 9c c1 69 55 54 2f 62 f0 5e 64 7f 18 3f 58 73 af 93 6e 92 06 fd 9f a1 1a 80 cf 86 71 24 7d f7 56 2c c1 57 cf ba 05 17 77 18 f1 8b 3c 33 40 18 30 1f b0 19 23 44 ec 91 c4 bd 80 65 4a 46 b3 1e 53 5d 6d a3 f0 b5 50 3a 93 ba 81 71 f3 09 98 41 43 ba 5f a1 0d 41 a3 7b c3 fd eb 15 89 75 66 a9 ee 3a 9c 1b c1 aa f8 58 10 88 0c 79 77 ff 7d 15 28 eb 12 a7 1b 76 36 aa 84 e1 3e 63 cf a9 a3 cf 4a 2d c2 33 18 91 30 f7 3c 9c 56 f5 4c 12 6c 4b 12 1f c5 ec 5a 98 8c 12 75 eb fd 98 a4 fb 7f 80 5d 28 f9 ef 43 a4 0a ca 25 75 19 6b f7 14 7b 76 af e9 8f 7d 79 fa 9d 9a 63 de 1f be fa 6c 65 ba 5f 9d b0 b0 f4 71 cb e2 ea d6 dc c6 55 98 1b cd 55 d9 eb 9c 75 fc 9d ec

I am pretty sure about my stream's correctness as I successfully rendered using ffmpeg decoding and GLSurfaceview with OpenGLES 2.0.

回答1:

I took H.264 dump both from native layer and Java layer and found the dump of native layer was played perfectly but the Java layer's dump was played as broken as the decoded stream. The problem was - during passing encoded streams from native layer to Java, the encoded stream was not passed properly(corrupted) and it was because of my buggy implementation (sorry to who were following this thread for this inconvenience).

Moreover I was passing I-frame's payload only to decoder which resulted in broken rendering. Now I am passing complete NAL unit (SPS + PPS + payload) and everything is okay now :)