I'm trying to put together a "cool demo" for a career day at my daughter's junior high in 5 days, so I'm using the echoprint library to perform over-the-air (OTA) audio recognition. I've never really gone much further than "hello world" in C++, and I'm trying to use C++/CLI to wrap the echoprint codegen library so I can call it from C#. Here's my header file:
// echoprint-cli.h
#pragma once
#include "Codegen.h"
using namespace System;
namespace echoprintcli {
public ref class CodegenCLI
{
public:
String^ getCodeString(array<float>^ buffer, unsigned int samples, int start_offset);
};
}
Here's my implementation:
#include "stdafx.h"
#include <msclr\marshal_cppstd.h>
#include "echoprint-cli.h"
using namespace System;
using namespace System::Runtime::InteropServices;
using namespace msclr::interop;
namespace echoprintcli {
String^ CodegenCLI::getCodeString(array<float>^ buffer, unsigned int samples, int start_offset){
String^ result = String::Empty;
if(buffer->Length > 0){
GCHandle h = GCHandle::Alloc(buffer, System::Runtime::InteropServices::GCHandleType::Pinned);
try{
float* pcm = (float*)(void*)h.AddrOfPinnedObject();
Codegen* codegen = new Codegen(pcm, samples, start_offset); //System.AccessViolationException here
std::string code;
try{
code = codegen->getCodeString();
}finally{
delete codegen;
}
result = marshal_as<String^>(code);
}
finally{
h.Free();
}
}
return result;
}
}
I'm using the XNA Microphone class to record the audio. It returns the data as a byte[], so I'm converting the bytes to floats and then passing them through my wrapper to the Codegen class like this (C#):
var mic = Microphone.Default;
Log(String.Format("Using '{0}' as audio input...", mic.Name));
var buffer = new byte[mic.GetSampleSizeInBytes(TimeSpan.FromSeconds(22))];
int bytesRead = 0;
string fileName = String.Empty;
try
{
mic.Start();
try
{
Log(String.Format("{0:HH:mm:ss} Start recording audio stream...", DateTime.Now));
while (bytesRead < buffer.Length)
{
Thread.Sleep(1000);
var bytes = mic.GetData(buffer, bytesRead, (buffer.Length - bytesRead));
Log(String.Format("{0:HH:mm:ss} Saving {1} bytes to stream...", DateTime.Now, bytes));
bytesRead += bytes;
}
Log(String.Format("{0:HH:mm:ss} Finished recording audio stream...", DateTime.Now));
}
finally
{
mic.Stop();
}
Func<byte, float> convert = (b) => System.Convert.ToSingle(b);
var converter = new Converter<byte, float>(convert);
float[] pcm = Array.ConvertAll<byte, float>(buffer, converter);
Log(String.Format("{0:HH:mm:ss} Generating audio fingerprint...", DateTime.Now));
var codeg = new CodegenCLI();
String code = codeg.getCodeString(pcm, (uint)pcm.Length, 0);
But when my C++/CLI method (getCodeString) calls into the native method, I get a System.AccessViolationException.
The entire source code is available as a VS 2010 SP1 or VS 11 solution on github: https://github.com/developmentalmadness/echoprint-net/tree/3c48d3783136188bfa213d3e9fd1ebea0f151bed
That URL should point to the revision that's currently experiencing the problem.
EDIT: I tried the suggestion from "AccessViolation, when calling C++-DLL from C++/CLI":
#include "stdafx.h"
#include <msclr\marshal_cppstd.h>
#include "echoprint-cli.h"
using namespace System;
using namespace System::Runtime::InteropServices;
using namespace msclr::interop;
namespace echoprintcli {
String^ CodegenCLI::getCodeString(array<float>^ buffer, unsigned int samples, int start_offset){
String^ result = String::Empty;
IntPtr p = Marshal::AllocHGlobal(buffer->Length * sizeof(float));
try{
pin_ptr<float> pcm = static_cast<float*>(p.ToPointer());
Codegen* codegen = new Codegen(pcm, samples, start_offset); // System.AccessViolationException here
std::string code;
try{
code = codegen->getCodeString();
}finally{
delete codegen;
}
result = marshal_as<String^>(code);
}
finally{
Marshal::FreeHGlobal(p);
}
return result;
}
}
I still get the access violation, but after the crash the debugger dropped me into the native code (I have no idea how to get there myself), and it bombs inside the ctor. The pointer (pcm) has an address and a value of 0.0000000, but I can't figure out how to debug into the code myself, so all I can do is show the source here:
Codegen::Codegen(const float* pcm, unsigned int numSamples, int start_offset) {
if (Params::AudioStreamInput::MaxSamples < (uint)numSamples)
throw std::runtime_error("File was too big\n");
Whitening *pWhitening = new Whitening(pcm, numSamples); //System.AccessViolationException
Without being able to debug, I can only follow the call two steps down the stack by reading the source:
Whitening::Whitening(const float* pSamples, uint numSamples) :
_pSamples(pSamples), _NumSamples(numSamples) {
Init();
}
And I imagine it bombs in the Init() method somewhere:
void Whitening::Init() {
int i;
_p = 40;
_R = (float *)malloc((_p+1)*sizeof(float));
for (i = 0; i <= _p; ++i) { _R[i] = 0.0; }
_R[0] = 0.001;
_Xo = (float *)malloc((_p+1)*sizeof(float));
for (i = 0; i < _p; ++i) { _Xo[i] = 0.0; }
_ai = (float *)malloc((_p+1)*sizeof(float));
_whitened = (float*) malloc(sizeof(float)*_NumSamples);
}
As promised on the EchoNest forum, here is my way of doing it. You can have it easier and go without CLI if you modify codegen.dll and provide a suitable exported function.
To main.cxx in codegen, add the following method:
Now on the C# side, you can simply do this:
Now you only need this special buffer of floats for the first parameter. You mention already having one, but as a bonus for everyone who has audio data of another format, below is a method for converting pretty much any audio file into the correct buffer of floats. Requirement is the BASS.NET audio library:
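If the audio is already raw PCM, as with the microphone recording in the question, the conversion can be done without any library. The following is only a minimal sketch under stated assumptions: the input is 16-bit signed little-endian mono PCM, and the codegen wants floats scaled to roughly [-1.0, 1.0). The function name pcm16le_to_float is made up for illustration and is not part of echoprint or BASS.NET. Note that each sample is built from two bytes, so converting every byte to its own float would produce twice as many values as there are samples.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not part of echoprint): converts raw 16-bit signed
// little-endian PCM bytes into floats in the range [-1.0, 1.0).
// Assumption: mono input. A trailing odd byte, if any, is ignored.
std::vector<float> pcm16le_to_float(const std::uint8_t* bytes, std::size_t byteCount) {
    const std::size_t sampleCount = byteCount / 2;  // two bytes per sample
    std::vector<float> samples(sampleCount);
    for (std::size_t i = 0; i < sampleCount; ++i) {
        // Assemble the little-endian pair: low byte first, high byte second.
        int value = bytes[2 * i] | (bytes[2 * i + 1] << 8);
        // Reinterpret the 16-bit pattern as a signed value.
        if (value >= 32768) {
            value -= 65536;
        }
        // Scale to [-1.0, 1.0).
        samples[i] = static_cast<float>(value) / 32768.0f;
    }
    return samples;
}
```

The sample count to hand to the codegen is then the size of the returned vector, not the number of bytes recorded. This sketch does no resampling, so the recording would still need to be at whatever sample rate the codegen expects.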