I am using NAudio to convert Linear16 PCM WAV files, produced by a third-party text-to-speech API, to G.711 8-bit, 8 kHz mu-law so that they work as telephony prompts. I am using techniques from the library author's documentation and some Stack Overflow posts, and specifically following the suggestion to do a two-step conversion:
    // Requires: using System; using System.IO; using NAudio.Wave; using Newtonsoft.Json;
    // `result` holds the JSON response returned by the text-to-speech API.
    dynamic foo = JsonConvert.DeserializeObject<dynamic>(result);
    byte[] decoded = Convert.FromBase64String(foo.audioContent.ToString());

    // Step 1 target: 8 kHz, 16-bit, mono PCM. Step 2 target: 8 kHz, mono mu-law.
    WaveFormat newFormat = new WaveFormat(8000, 16, 1);
    WaveFormat mulaw = WaveFormat.CreateMuLawFormat(8000, 1);

    using (MemoryStream mem = new MemoryStream(decoded))
    using (WaveFileReader reader = new WaveFileReader(mem))
    using (var conversionStream = new WaveFormatConversionStream(newFormat, reader))
    using (var convStream2 = new WaveFormatConversionStream(mulaw, conversionStream))
    {
        // Write the converted mu-law file, plus the untouched source for comparison.
        WaveFileWriter.CreateWaveFile("voiceprompt_downsample_8bit-8khz.wav", convStream2);
        File.WriteAllBytes("voiceprompt_raw.wav", decoded);
    }
Unfortunately, the audio quality of the converted file is noticeably degraded (some loss is to be expected). However, if I submit the exact same source file to the converter at g711.org and select the "BroadWorks Classic (8Khz, Mono, u-law)" option, the resulting audio sounds much better. In particular, it does not clip or crush the S sounds in words like "access" and "password" in some of our prompts.
I have confirmed that both audio files (the one I convert with NAudio and the one I generated using g711.org) play fine as prompts through our telephony system.
Does anyone with NAudio experience have suggestions for what I can do differently in NAudio so that the output quality of the converted file matches what I get from g711.org?
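For comparison, here is a rough sketch of one variation that might be worth testing against the code above. It swaps the first `WaveFormatConversionStream` for NAudio's `WdlResamplingSampleProvider` and encodes each sample with `MuLawEncoder`. Whether this actually fixes the harsh S sounds is an assumption, not something confirmed here. The class name `MuLawSketch`, the method name, and the output path are hypothetical, and the sketch assumes the source WAV is already mono.

```csharp
using System;
using System.IO;
using NAudio.Codecs;
using NAudio.Wave;
using NAudio.Wave.SampleProviders;

// Hypothetical helper, for illustration only.
static class MuLawSketch
{
    // Assumes wavBytes is a complete mono PCM WAV file (header included).
    public static void ConvertToMuLaw(byte[] wavBytes, string outputPath)
    {
        using (var mem = new MemoryStream(wavBytes))
        using (var reader = new WaveFileReader(mem))
        {
            // Resample to 8 kHz in floating point instead of going through ACM.
            ISampleProvider samples = reader.ToSampleProvider();
            var resampled = new WdlResamplingSampleProvider(samples, 8000);

            // Back to 16-bit PCM so each sample can be mu-law encoded.
            IWaveProvider pcm16 = resampled.ToWaveProvider16();

            using (var writer = new WaveFileWriter(outputPath, WaveFormat.CreateMuLawFormat(8000, 1)))
            {
                var pcmBuffer = new byte[8000 * 2];              // about one second of 16-bit mono audio
                var muLawBuffer = new byte[pcmBuffer.Length / 2]; // one mu-law byte per 16-bit sample
                int bytesRead;
                while ((bytesRead = pcm16.Read(pcmBuffer, 0, pcmBuffer.Length)) > 0)
                {
                    int sampleCount = bytesRead / 2;
                    for (int i = 0; i < sampleCount; i++)
                    {
                        short sample = BitConverter.ToInt16(pcmBuffer, i * 2);
                        muLawBuffer[i] = MuLawEncoder.LinearToMuLawSample(sample);
                    }
                    writer.Write(muLawBuffer, 0, sampleCount);
                }
            }
        }
    }
}
```

With the variables from the code above, this would be called as `MuLawSketch.ConvertToMuLaw(decoded, "voiceprompt_wdl_mulaw.wav");` so the result can be compared by ear with the g711.org output.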