DialogFlow StreamingDetectIntentResponse doesn'

Posted 2019-08-21 01:30

I have successfully used gRPC in Unity to send a request to Dialogflow and receive a response. You can check the details here.

However, the entire result returned is only the following:

{ "queryResult": { "languageCode": "ja" } }

The expected response ID, query text, and so on are not returned. When testing in console.dialogflow.com, I get the following result:

{ "responseId": "cdf8003e-6599-4a28-9314-f4462c36e21b", "queryResult": { "queryText": "おはようございます", "speechRecognitionConfidence": 0.92638445, "languageCode": "ja" } }

However, when I tried in console.dialogflow.com without saying anything, I got:

{ "queryResult": { "languageCode": "ja" } }

So perhaps the InputAudio encoding is wrong somehow.

Here's how I set it:

var serializedByteArray = convertToBytes(samples);
request.InputAudio = Google.Protobuf.ByteString.CopyFrom(serializedByteArray);

And convertToBytes looks like this:

public static byte[] convertToBytes(float[] audio)
{
    List<byte> bytes = new List<byte>();

    foreach (float audioI in audio) {
        bytes.AddRange(BitConverter.GetBytes(audioI));
    }

    return bytes.ToArray();
}

The audio source is defined as follows, where sampleRate is 16000:

audioSource.clip = Microphone.Start(null, true, 30, sampleRate);

I made sure to set the sample rate (in Hz) properly:

queryInput.AudioConfig.SampleRateHertz = sampleRate;

Edit:

I have logged the bytes recorded in Unity to a file (all the streamed bytes appended together) and written a console application to test the resulting binary, using DetectIntent rather than StreamingDetectIntent.

GoogleCredential credential = GoogleCredential.FromJson(privateKey);

var url = "dialogflow.googleapis.com";

Grpc.Core.Channel channel = new Grpc.Core.Channel(url, credential.ToChannelCredentials());


var client = SessionsClient.Create(channel);


CallOptions options = new CallOptions();

DetectIntentRequest detectIntentRequest = new DetectIntentRequest();
detectIntentRequest.Session = "projects/projectid/agent/sessions/" + "detectIntent";
QueryInput queryInput = new QueryInput();
queryInput.AudioConfig = new InputAudioConfig();
queryInput.AudioConfig.LanguageCode = "ja";
queryInput.AudioConfig.SampleRateHertz = sampleRate;//must be between 8khz and 48khz
queryInput.AudioConfig.AudioEncoding = AudioEncoding.Linear16;

detectIntentRequest.QueryInput = queryInput;

detectIntentRequest.InputAudio = Google.Protobuf.ByteString.CopyFrom(File.ReadAllBytes("D:\\temp\\audio.bytes"));
var response = client.DetectIntent(detectIntentRequest);
Console.WriteLine(response.ToString());
Console.WriteLine(response.ResponseId);
Console.Read();

I still get this (and an empty response.ResponseId):

{ "queryResult": { "languageCode": "ja" } }

Thanks in advance.

1 answer
SAY GOODBYE
#2 · 2019-08-21 02:08

Finally found the answer. The way I converted the float data source to a LINEAR16 byte array was wrong. Here's the code that worked. Credit goes to this post on the Unity forum:

https://forum.unity.com/threads/writing-audiolistener-getoutputdata-to-wav-problem.119295/#post-899142

public static byte[] convertToBytes(float[] dataSource)
{
    // Convert in two steps: float[] to Int16[], then Int16[] to byte[].
    var intData = new Int16[dataSource.Length];

    // bytesData is twice the length of dataSource because each float
    // sample becomes one 2-byte Int16.
    var bytesData = new Byte[dataSource.Length * 2];

    var rescaleFactor = 32767; // scales [-1, 1] floats to the Int16 range

    for (var i = 0; i < dataSource.Length; i++)
    {
        intData[i] = (short)(dataSource[i] * rescaleFactor);
        var byteArr = BitConverter.GetBytes(intData[i]);
        byteArr.CopyTo(bytesData, i * 2);
    }

    return bytesData;
}
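
For anyone comparing the two versions: BitConverter.GetBytes(float) writes each sample as a 4-byte IEEE 754 value, while LINEAR16 is 16-bit signed little-endian PCM, so the first attempt sent twice as many bytes in a layout the recognizer could not read as speech. Below is a minimal sketch of the same conversion with two optional safeguards added, not a definitive implementation. The names Pcm16 and FromFloats are hypothetical, and the clamping and the explicit byte order are my own assumptions about what is worth guarding against; neither is part of the code above.

```csharp
using System;

// Hypothetical helper, for illustration only.
public static class Pcm16
{
    // Converts float samples (nominally in [-1, 1]) to 16-bit signed
    // little-endian PCM, 2 bytes per sample.
    public static byte[] FromFloats(float[] samples)
    {
        var bytes = new byte[samples.Length * 2];

        for (var i = 0; i < samples.Length; i++)
        {
            // Assumption: clamp first so out-of-range input saturates
            // instead of wrapping around when cast to short.
            var clamped = Math.Max(-1f, Math.Min(1f, samples[i]));
            var value = (short)(clamped * short.MaxValue);

            // Write the low byte first (little-endian), regardless of
            // the byte order of the machine running the code.
            bytes[i * 2] = (byte)(value & 0xFF);
            bytes[i * 2 + 1] = (byte)((value >> 8) & 0xFF);
        }

        return bytes;
    }
}
```

It would be used the same way as convertToBytes, for example request.InputAudio = Google.Protobuf.ByteString.CopyFrom(Pcm16.FromFloats(samples)). A quick sanity check on a logged file is that its size should be the number of samples times 2, not times 4.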