How to get a byte** from managed byte[] buffer

Published 2019-07-15 10:00

Question:

I've been using the FFmpeg.AutoGen wrapper (https://github.com/Ruslan-B/FFmpeg.AutoGen) to decode my H264 video for some time with great success, and now have to add AAC audio decoding (previously I was using G711 and NAudio for this).

I have the AAC stream decoding using avcodec_decode_audio4; however, the output frame is in floating-point format (FLT) and I need it to be in S16. For this I have found unmanaged examples using swr_convert, and FFmpeg.AutoGen does have this function P/Invoked as:

[DllImport(SWRESAMPLE_LIBRARY, EntryPoint="swr_convert", CallingConvention = CallingConvention.Cdecl, CharSet = CharSet.Ansi)]
public static extern int swr_convert(SwrContext* s, byte** @out, int out_count, byte** @in, int in_count);

My trouble is that I can't find a successful way of converting/fixing/casting my managed byte[] into a byte** to provide as the destination buffer.

Has anyone done this before?

My non-working code...

packet.ResetBuffer(m_avFrame->linesize[0]*2);

fixed (byte* pData = packet.Payload)
{
    byte** src = &m_avFrame->data_0;
    //byte** dst = *pData;
    IntPtr d = new IntPtr(pData);

    FFmpegInvoke.swr_convert(m_pConvertContext, (byte**)d.ToPointer(), packet.Length, src, (int)m_avFrame->linesize[0]);
}

Thanks for any help.

Cheers

Dave

Answer 1:

The function you are trying to call is documented here: http://www.ffmpeg.org/doxygen/2.0/swresample_8c.html#a81af226d8969df314222218c56396f6a

The out_arg parameter is declared like this:

uint8_t* out_arg[SWR_CH_MAX]

That is an array of SWR_CH_MAX byte arrays. Your translation renders that as byte** and so forces you to use unsafe code. Personally, I would avoid that. I would declare the parameter like this:

[MarshalAs(UnmanagedType.LPArray)]
IntPtr[] out_arg

Declare the array like this:

IntPtr[] out_arg = new IntPtr[channelCount];
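Putting those two pieces together, a hand-written translation might look like the following sketch. The library name "swresample-0" is an assumption and will vary between FFmpeg builds and platforms, and the context is passed as a plain IntPtr rather than an unsafe SwrContext*:

```csharp
using System;
using System.Runtime.InteropServices;

static class SwrInterop
{
    // Hand-written declaration; "swresample-0" is an assumed library
    // name and may differ for your FFmpeg build.
    [DllImport("swresample-0", EntryPoint = "swr_convert",
               CallingConvention = CallingConvention.Cdecl)]
    public static extern int swr_convert(
        IntPtr s,                                             // SwrContext*
        [MarshalAs(UnmanagedType.LPArray)] IntPtr[] out_arg,  // output plane pointers
        int out_count,                                        // output samples per channel
        [MarshalAs(UnmanagedType.LPArray)] IntPtr[] in_arg,   // input plane pointers
        int in_count);                                        // input samples per channel
}
```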

I am guessing that the CH in SWR_CH_MAX is shorthand for channel.

Then you need to allocate memory for the output buffer. I'm not sure how you want to do that. You could allocate one byte array per channel and pin those arrays to get hold of a pointer to pass down to the native code. That would be my preferred approach because upon return you'd have your channels in nice managed arrays. Another way would be a call to Marshal.AllocHGlobal.
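The per-channel pinning approach could be sketched like this. The channel count and buffer size here are placeholder values; the GCHandle pinning itself is standard .NET interop:

```csharp
using System;
using System.Runtime.InteropServices;

// Sketch: allocate one managed buffer per channel, pin each so the
// native code can write into it, and release the pins afterwards.
int channelCount = 2;           // assumed stereo
int bytesPerChannel = 4096;     // assumed buffer size
byte[][] channels = new byte[channelCount][];
GCHandle[] pins = new GCHandle[channelCount];
IntPtr[] out_arg = new IntPtr[channelCount];

try
{
    for (int i = 0; i < channelCount; i++)
    {
        channels[i] = new byte[bytesPerChannel];
        pins[i] = GCHandle.Alloc(channels[i], GCHandleType.Pinned);
        out_arg[i] = pins[i].AddrOfPinnedObject();
    }
    // ... call swr_convert(ctx, out_arg, ...) here ...
    // On return, the converted samples are in the managed channels[] arrays.
}
finally
{
    for (int i = 0; i < channelCount; i++)
        if (pins[i].IsAllocated)
            pins[i].Free();
}
```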

The input buffer would need to be handled in the same way.

I would not use the automated P/Invoke translation that you are currently using. It seems hell-bent on forcing you to use pointers and unsafe code, which is not massively helpful. I'd translate it by hand.

I'm sorry not to give more specific details but it's a little hard because your question did not contain any information about the types used in your code samples. I hope the general advice is useful.



Answer 2:

Thanks to @david-heffernan's answer I've managed to get the following working, and I'm posting it as an answer since examples of managed use of FFmpeg are very rare.

fixed (byte* pData = packet.Payload)
{
    IntPtr[] in_buffs = new IntPtr[2];
    in_buffs[0] = new IntPtr(m_avFrame->data_0);
    in_buffs[1] = new IntPtr(m_avFrame->data_1);
    IntPtr[] out_buffs = new IntPtr[1];
    out_buffs[0] = new IntPtr(pData);

    FFmpegInvoke.swr_convert(m_pConvertContext, out_buffs, m_avFrame->nb_samples, in_buffs, m_avFrame->nb_samples);
}

In the complete context of decoding a buffer of AAC audio...

    protected override void DecodePacket(MediaPacket packet)
    {
        int frameFinished = 0;


        AVPacket avPacket = new AVPacket();
        FFmpegInvoke.av_init_packet(ref avPacket);
        byte[] payload = packet.Payload;
        fixed (byte* pData = payload)
        {
            avPacket.data = pData;
            avPacket.size = packet.Length;
            if (packet.KeyFrame)
            {
                avPacket.flags |= FFmpegInvoke.AV_PKT_FLAG_KEY;
            }

            int in_len = packet.Length;


            int count = FFmpegInvoke.avcodec_decode_audio4(CodecContext, m_avFrame, out frameFinished, &avPacket);

            if (count != packet.Length)
            {
                // partial decode: not all of the input packet was consumed
            }

            if (count < 0)
            {
                throw new Exception("Can't decode frame!");
            }
        }
        FFmpegInvoke.av_free_packet(ref avPacket);

        if (frameFinished > 0)
        {
            if (!mConverstionContextInitialised)
            {
                InitialiseConverstionContext();
            }

            packet.ResetBuffer(m_avFrame->nb_samples*4); // need to find a better way of getting the out buff size

            fixed (byte* pData = packet.Payload)
            {
                IntPtr[] in_buffs = new IntPtr[2];
                in_buffs[0] = new IntPtr(m_avFrame->data_0);
                in_buffs[1] = new IntPtr(m_avFrame->data_1);
                IntPtr[] out_buffs = new IntPtr[1];
                out_buffs[0] = new IntPtr(pData);

                FFmpegInvoke.swr_convert(m_pConvertContext, out_buffs, m_avFrame->nb_samples, in_buffs, m_avFrame->nb_samples);
            }

            packet.Type = PacketType.Decoded;

            if (mFlushRequest)
            {
                //mRenderQueue.Clear();
                packet.Flush = true;
                mFlushRequest = false;
            }

            mFirstFrame = true;
        }
    }
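Regarding the hard-coded output size above (the `nb_samples*4` with the "need to find a better way" comment): FFmpeg provides av_samples_get_buffer_size for exactly this. A sketch, assuming your binding exposes that function and the AVSampleFormat enum (names may differ between FFmpeg.AutoGen versions):

```csharp
// Sketch: compute the required output buffer size rather than
// hard-coding nb_samples * 4. Assumes the binding exposes
// av_samples_get_buffer_size and AVSampleFormat; check your version.
int lineSize;
int outBufferSize = FFmpegInvoke.av_samples_get_buffer_size(
    &lineSize,
    2,                                  // output channel count (assumed stereo)
    m_avFrame->nb_samples,              // samples per channel
    AVSampleFormat.AV_SAMPLE_FMT_S16,   // target sample format
    1);                                 // no alignment padding
packet.ResetBuffer(outBufferSize);
```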