Decoding Opus audio data

Posted 2019-03-08 00:18

Question:

I am trying to decode an Opus file back to raw 48 kHz PCM. However, I am unable to find any sample code to do that.

My current code is this:

void COpusCodec::Decode(unsigned char* encoded, short* decoded, unsigned int len)
{
     int max_size=960*6;//not sure about this one

     int error;
     dec = opus_decoder_create(48000, 1, &error);//decode to 48kHz mono

     int frame_size=opus_decode(dec, encoded, len, decoded, max_size, 0);
}

The argument "encoded" may hold more than one frame's worth of data, so I think I have to split it into frames, but I am not sure how to do that.

Being a beginner with Opus, I am really afraid of messing something up.

Could anybody perhaps help?

Answer 1:

I think the opus_demo.c program from the source tarball has what you want.

It's pretty complicated though, because of all the unrelated code pertaining to

  • encoding, parsing encoder parameters from command line arguments
  • artificial packet loss injection
  • random framesize selection/changing on-the-fly
  • inband FEC (meaning decoding into two buffers, toggling between the two)
  • debug and verification
  • bit-rate statistics reporting

Removing all these bits is a very tedious job, as it turns out. But once you do, you end up with pretty clean, understandable code, see below.

Note that I

  • kept the 'packet-loss' protocol code (even though packet loss won't happen reading from a file) for reference
  • kept the code that verifies the final range after decoding each frame

Mostly because it doesn't seem to complicate the code, and you might be interested in it.
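
Both of these depend on how opus_demo frames its output. Judging from the decoder in the implementation file below, each packet is stored as a 4-byte big-endian payload length, a 4-byte big-endian encoder final range, and then the payload, with a length of zero marking a lost packet. A minimal sketch of reading one such header follows; PacketHeader, read_packet_header and header_from_bytes are hypothetical names and are not part of libopus or opus_demo.

```cpp
#include <cassert>
#include <cstdint>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper type: the two header fields that precede each payload.
struct PacketHeader
{
    std::uint32_t length;       // payload size in bytes; 0 is treated as a lost packet
    std::uint32_t final_range;  // encoder range coder state, compared after decoding
    bool lost() const { return length == 0; }
};

// Assemble a big-endian 32-bit value; each byte is widened before it is shifted.
inline std::uint32_t read_be32(const unsigned char* p)
{
    assert(p != nullptr);
    return (std::uint32_t(p[0]) << 24) | (std::uint32_t(p[1]) << 16) |
           (std::uint32_t(p[2]) <<  8) |  std::uint32_t(p[3]);
}

// Returns false if the stream ends before a full 8-byte header is available.
inline bool read_packet_header(std::istream& in, PacketHeader& hdr)
{
    unsigned char raw[8];
    if (!in.read(reinterpret_cast<char*>(raw), sizeof raw))
        return false;
    hdr.length      = read_be32(raw);
    hdr.final_range = read_be32(raw + 4);
    return true;
}

// Usage sketch: parse a header from an in-memory byte string.
inline bool header_from_bytes(const std::string& bytes, PacketHeader& hdr)
{
    std::istringstream in(bytes);
    return read_packet_header(in, hdr);
}
```

After a successful call, hdr.length tells you how many payload bytes to read next, and hdr.final_range is the value to compare against what the decoder reports once the packet has been decoded.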

I tested this program in two ways:

  • aurally (by verifying that a mono wav previously encoded using opus_demo was correctly decoded using this stripped decoder). The test wav was ~23 MB, 2.9 MB compressed.
  • regression tested alongside the vanilla opus_demo when called with ./opus_demo -d 48000 1 <opus-file> <pcm-file>. The resultant file had the same md5sum checksum as the one decoded using the stripped decoder here.

MAJOR UPDATE I C++-ified the code. This should get you somewhere using iostreams.

  • Note the loop on fin.readsome now; this loop could be made 'asynchronous', i.e. it could return and continue reading when new data arrives (on the next invocation of your Decode function?) [1]
  • I have cut the dependencies on opus.h from the header file
  • I have replaced "all" manual memory management with standard library facilities (vector, unique_ptr) for exception safety and robustness.
  • I have implemented an OpusErrorException class deriving from std::exception which is used to propagate errors from libopus

See all the code + Makefile here: https://github.com/sehe/opus/tree/master/contrib

[1] For true async IO (e.g. network or serial communication) consider using Boost Asio, see e.g. http://www.boost.org/doc/libs/1_53_0/doc/html/boost_asio/overview/networking/iostreams.html
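
The 'asynchronous' variant mentioned above can be sketched without any I/O library: keep a byte buffer between calls, append whatever arrives, and hand out a packet only once its 8-byte header and full payload are buffered. This is an illustrative sketch, not code from opus_demo. Packet, PacketAssembler, feed and next_packet are hypothetical names, and the framing (4-byte big-endian length, 4-byte final range, payload) is assumed from the implementation file further down.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

struct Packet
{
    std::uint32_t final_range = 0;
    std::vector<unsigned char> payload;  // empty means the packet was marked lost
};

class PacketAssembler
{
public:
    // Append newly arrived bytes; may be called with chunks of any size.
    void feed(const unsigned char* bytes, std::size_t count)
    {
        assert(bytes != nullptr || count == 0);
        if (count)
            _buf.insert(_buf.end(), bytes, bytes + count);
    }

    // Extract one complete packet if available; otherwise leave the buffer untouched.
    bool next_packet(Packet& out)
    {
        const std::size_t header = 8;
        if (_buf.size() < header)
            return false;                       // header not complete yet
        const std::uint32_t len = be32(&_buf[0]);
        if (_buf.size() - header < len)
            return false;                       // payload not complete yet
        out.final_range = be32(&_buf[4]);
        out.payload.assign(_buf.begin() + header, _buf.begin() + header + len);
        _buf.erase(_buf.begin(), _buf.begin() + header + len);
        return true;
    }

private:
    // Big-endian 32-bit read; each byte is widened before it is shifted.
    static std::uint32_t be32(const unsigned char* p)
    {
        return (std::uint32_t(p[0]) << 24) | (std::uint32_t(p[1]) << 16) |
               (std::uint32_t(p[2]) <<  8) |  std::uint32_t(p[3]);
    }

    std::vector<unsigned char> _buf;
};
```

A Decode-style function could call feed with each incoming chunk and then loop on next_packet, passing every completed payload to the decoder. A production version would also want to reject lengths above the maximum packet size, as the decoder below does, before buffering that much data.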

Header File

// (c) Seth Heeren 2013
//
// Based on src/opus_demo.c in opus-1.0.2
// License see http://www.opus-codec.org/license/
#include <cstdint>   // int32_t
#include <stdexcept>
#include <memory>
#include <iosfwd>

struct OpusErrorException : public virtual std::exception
{
    OpusErrorException(int code) : code(code) {}
    const char* what() const noexcept;
private:
    const int code;
};

struct COpusCodec
{
    COpusCodec(int32_t sampling_rate, int channels);
    ~COpusCodec();

    bool decode_frame(std::istream& fin, std::ostream& fout);
private:
    struct Impl;
    std::unique_ptr<Impl> _pimpl;
};

Implementation File

// (c) Seth Heeren 2013
//
// Based on src/opus_demo.c in opus-1.0.2
// License see http://www.opus-codec.org/license/
#include "COpusCodec.hpp"
#include <vector>
#include <iomanip>
#include <memory>
#include <sstream>

#include "opus.h"

#define MAX_PACKET 1500

const char* OpusErrorException::what() const noexcept
{
    return opus_strerror(code);
}

// I'd suggest reading with boost::spirit::big_dword or similar
// Each byte is widened to uint32_t before shifting, so the shift by 24 cannot overflow int
static uint32_t char_to_int(char ch[4])
{
    return (static_cast<uint32_t>(static_cast<unsigned char>(ch[0])) << 24) |
           (static_cast<uint32_t>(static_cast<unsigned char>(ch[1])) << 16) |
           (static_cast<uint32_t>(static_cast<unsigned char>(ch[2])) <<  8) |
           (static_cast<uint32_t>(static_cast<unsigned char>(ch[3])) <<  0);
}

struct COpusCodec::Impl
{
    Impl(int32_t sampling_rate = 48000, int channels = 1)
    : 
        _channels(channels),
        _decoder(nullptr, &opus_decoder_destroy),
        _state(_max_frame_size, MAX_PACKET, channels)
    {
        int err = OPUS_OK;
        auto raw = opus_decoder_create(sampling_rate, _channels, &err);
        _decoder.reset(err == OPUS_OK? raw : throw OpusErrorException(err) );
    }

    bool decode_frame(std::istream& fin, std::ostream& fout)
    {
        char ch[4] = {0};

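        if (!fin.read(ch, 4))
        {
            if (fin.eof())
                return false; // end of input
            throw std::runtime_error("Error reading payload length");
        }

        uint32_t len = char_to_int(ch);

        if(len>_state.data.size())
            throw std::runtime_error("Invalid payload length");

        if (!fin.read(ch, 4))
            throw std::runtime_error("Error reading encoder final range");
        const uint32_t enc_final_range = char_to_int(ch);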
        const auto data = reinterpret_cast<char*>(&_state.data.front());

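        size_t read = 0ul;
        while (fin && read<len)
        {
            // readsome only returns bytes that are already buffered, which may be none
            std::streamsize n = fin.readsome(data + read, len-read);
            if (n <= 0)
            {
                // nothing buffered: fall back to a blocking read so a short file cannot spin forever
                fin.read(data + read, len-read);
                n = fin.gcount();
                if (n <= 0)
                    break;
            }
            read += static_cast<size_t>(n);
        }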

        if(read<len)
        {
            std::ostringstream oss;
            oss << "Ran out of input, expecting " << len << " bytes got " << read << " at " << fin.tellg();
            throw std::runtime_error(oss.str());
        }

        int output_samples;
        const bool lost = (len==0);
        if(lost)
        {
            opus_decoder_ctl(_decoder.get(), OPUS_GET_LAST_PACKET_DURATION(&output_samples));
        }
        else
        {
            output_samples = _max_frame_size;
        }

        output_samples = opus_decode(
                _decoder.get(), 
                lost ? NULL : _state.data.data(),
                len,
                _state.out.data(),
                output_samples,
                0);

        if(output_samples>0)
        {
            for(int i=0; i<(output_samples)*_channels; i++)
            {
                short s;
                s=_state.out[i];
                _state.fbytes[2*i]   = s&0xFF;
                _state.fbytes[2*i+1] = (s>>8)&0xFF;
            }
            if(!fout.write(reinterpret_cast<char*>(_state.fbytes.data()), sizeof(short)* _channels * output_samples))
                throw std::runtime_error("Error writing");
        }
        else
        {
            throw OpusErrorException(output_samples); // negative return is error code
        }

        uint32_t dec_final_range;
        opus_decoder_ctl(_decoder.get(), OPUS_GET_FINAL_RANGE(&dec_final_range));

        /* compare final range encoder rng values of encoder and decoder */
        if(enc_final_range!=0
                && !lost && !_state.lost_prev
                && dec_final_range != enc_final_range)
        {
            std::ostringstream oss;
            oss << "Error: Range coder state mismatch between encoder and decoder in frame " << _state.frameno << ": " <<
                    "0x" << std::setw(8) << std::setfill('0') << std::hex << (unsigned long)enc_final_range <<
                    " vs 0x" << std::setw(8) << std::setfill('0') << std::hex << (unsigned long)dec_final_range;

            throw std::runtime_error(oss.str());
        }

        _state.lost_prev = lost;
        _state.frameno++;

        return true;
    }
private:
    const int _channels;
    const int _max_frame_size = 960*6;
    std::unique_ptr<OpusDecoder, void(*)(OpusDecoder*)> _decoder;

    struct State
    {
        State(int max_frame_size, int max_payload_bytes, int channels) :
            out   (max_frame_size*channels),
            fbytes(max_frame_size*channels*sizeof(decltype(out)::value_type)),
            data  (max_payload_bytes)
        { }

        std::vector<short>         out;
        std::vector<unsigned char> fbytes, data;
        int32_t frameno   = 0;
        bool    lost_prev = true;
    };
    State _state;
};

COpusCodec::COpusCodec(int32_t sampling_rate, int channels)
    : _pimpl(std::unique_ptr<Impl>(new Impl(sampling_rate, channels)))
{
    //
}

COpusCodec::~COpusCodec()
{
    // this instantiates the pimpl deletor code on the, now-complete, pimpl class
}

bool COpusCodec::decode_frame(
        std::istream& fin,
        std::ostream& fout)
{
    return _pimpl->decode_frame(fin, fout);
}

test.cpp

// (c) Seth Heeren 2013
//
// Based on src/opus_demo.c in opus-1.0.2
// License see http://www.opus-codec.org/license/
#include <fstream>
#include <iostream>

#include "COpusCodec.hpp"

int main(int argc, char *argv[])
{
    if(argc != 3)
    {
        std::cerr << "Usage: " << argv[0] << " <input> <output>\n";
        return 255;
    }

    std::basic_ifstream<char> fin (argv[1], std::ios::binary);
    std::basic_ofstream<char> fout(argv[2], std::ios::binary);

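    // report open failures directly; throwing here, outside the try block, would terminate the program
    if(!fin)  { std::cerr << "Could not open input file\n";  return 255; }
    if(!fout) { std::cerr << "Could not open output file\n"; return 255; }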

    try
    {
        COpusCodec codec(48000, 1);

        size_t frames = 0;
        while(codec.decode_frame(fin, fout))
        {
            frames++;
        }

        std::cout << "Successfully decoded " << frames << " frames\n";
    }
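    catch(OpusErrorException const& e)
    {
        std::cerr << "OpusErrorException: " << e.what() << "\n";
        return 255;
    }
    catch(std::exception const& e) // e.g. the std::runtime_error thrown by decode_frame
    {
        std::cerr << "Error: " << e.what() << "\n";
        return 255;
    }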
}


Answer 2:

libopus provides an API for turning Opus packets into chunks of PCM data, and vice versa.

But to store Opus packets in a file, you need a container format that records the packet boundaries. opus_demo is, well, a demo app: it uses its own minimal, undocumented container format for testing purposes, so files produced by opus_demo should not be distributed. The standard container format for Opus files is Ogg, which also supports metadata, sample-accurate decoding, and efficient seeking in variable-bitrate streams. Ogg Opus files have the extension ".opus".

The Ogg Opus spec is at https://wiki.xiph.org/OggOpus.
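
To tell the two kinds of file apart, it can help to look at the first bytes: an Ogg page starts with the capture pattern "OggS", and the first packet of an Ogg Opus stream starts with "OpusHead". The sketch below checks just that on a buffer holding the start of a file. looks_like_ogg_opus is a hypothetical helper name, and the check assumes the identification header sits at the start of the first page's payload; it is a quick sanity test, not a substitute for a real Ogg parser.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Returns true if `data` starts with an Ogg page whose payload begins with "OpusHead".
inline bool looks_like_ogg_opus(const unsigned char* data, std::size_t size)
{
    assert(data != nullptr || size == 0);
    const std::size_t fixed_header = 27;           // fixed part of an Ogg page header
    if (size < fixed_header)
        return false;
    if (std::memcmp(data, "OggS", 4) != 0)         // Ogg capture pattern
        return false;
    const std::size_t segments = data[26];         // length of the segment table that follows
    const std::size_t payload  = fixed_header + segments;
    if (size < payload + 8)
        return false;
    return std::memcmp(data + payload, "OpusHead", 8) == 0;
}

// Convenience overload for a buffer read from the start of a file.
inline bool looks_like_ogg_opus(const std::vector<unsigned char>& buf)
{
    return looks_like_ogg_opus(buf.data(), buf.size());
}
```

A file written by opus_demo begins with a 4-byte packet length rather than "OggS", so it fails this check.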

(Since Opus is also a VoIP codec, there are uses of Opus that do not require a container, such as transmitting Opus packets directly over UDP.)

So first, you should encode your files using opusenc from opus-tools, not opus_demo. Other software can produce Ogg Opus files too (I believe gstreamer and ffmpeg can, for example), but you can't really go wrong with opus-tools as it's the reference implementation.

Then, assuming your files are standard Ogg Opus files (that can be read by, say, Firefox), what you need to do is: (a) extract Opus packets from the Ogg container; (b) pass the packets to libopus and get raw PCM back.

Conveniently, there's a library called libopusfile that does precisely this. libopusfile supports all of the features of Ogg Opus streams, including metadata and seeking (including seeking over an HTTP connection).

libopusfile is available at https://git.xiph.org/?p=opusfile.git and https://github.com/xiph/opusfile. The API is documented on the Opus website, and opusfile_example.c, included in both repositories, provides example code for decoding to WAV. Since you're on Windows, I should add that there are prebuilt DLLs on the downloads page.