Creating a UWP DLL using Windows::Media::SpeechSyn

Posted 2019-08-18 07:54


I am currently trying to develop a speech synthesis UWP DLL using the namespace Windows::Media::SpeechSynthesis. I read this documentation and the Microsoft page dedicated to the namespace. I tried to implement the namespace in code.

Header file

#pragma once

#include <stdio.h>
#include <string>
#include <iostream>
#include <ppltasks.h>

using namespace Windows::Media::SpeechSynthesis;
using namespace Windows::UI::Xaml::Controls;
using namespace Windows::UI::Xaml::Media;
using namespace Windows::Media::Playback;

namespace SDKTemplate
{
    class TextToSpeechDll
    {
    public:
           __declspec( dllexport ) void ttsInitialize();

    private:
           MediaElement ^media;
    };
}

Cpp file

#include "stdafx.h"
#include "Dll2.h"

using namespace SDKTemplate;
using namespace Platform;
using namespace Concurrency;

void TextToSpeechDll::ttsInitialize()
{
    // The object for controlling the speech synthesis engine (voice).
    SpeechSynthesizer ^synth = ref new SpeechSynthesizer();
    // The string to speak.
    String^ text = "Hello World";

    // Generate the audio stream from plain text.
    task<SpeechSynthesisStream ^> speakTask = create_task( synth->SynthesizeTextToStreamAsync( text ) );
    speakTask.then( [this, text]( task<SpeechSynthesisStream^> synthesisStreamTask )
    {
        SpeechSynthesisStream ^speechStream = synthesisStreamTask.get();
        // Send the stream to the media object.
        // media === MediaElement XAML object.
        media->AutoPlay = true;
        media->SetSource( speechStream, speechStream->ContentType );
    } );
}

I can load the DLL file and the function I exported. However, when I try to call the function, I get the following error.

I tried the example on the Microsoft page, but it somehow doesn't work and I can't figure out why. I also tested the Windows Universal Samples available on GitHub, which include a UWP app that combines Text-To-Speech and Speech Recognition.

Has anyone experienced a similar issue? Should I avoid using a XAML element when I don't have a user interface?

Edit 1

I modified the header file to change how the function is exported, as suggested by @Peter Torr - MSFT

#pragma once

#include <stdio.h>
#include <string>
#include <iostream>
#include <ppltasks.h>

using namespace Windows::Media::SpeechSynthesis;
using namespace Windows::UI::Xaml::Controls;
using namespace Windows::UI::Xaml::Media;
using namespace Windows::Media::Playback;

namespace SDKTemplate
{
   public ref class TextToSpeechDll sealed
   {
   public:
         void ttsInitialize();

   private:
         MediaElement ^media = ref new MediaElement();
   };
}

However, when I compile, I'm getting a new error on this line

speakTask.then( [this]( task<SpeechSynthesisStream^> synthesisStreamTask )

I researched this error and, if I understood it correctly, it comes from the way the DLL function is imported.

In addition, I call the function like this


Which brings us here

void NxWindowsTtsUwpDll::ttsInitialize()
{
   int retVal = 0;
   try {
      retVal = _ttsInitialize();
   }
   catch( ... ) {
      printf( "Exception in ttsInitialize\n" );
   }
   //return retVal;
}
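
The caller above invokes an imported function pointer inside a try/catch. One common way to make such a function easy to look up by name is to export free functions with C linkage instead of a class member, so the exported symbol is not name-mangled, and to keep exceptions from crossing the DLL boundary. The following is only a minimal sketch under assumptions: the names TtsEngine, tts_initialize, tts_speak and TTS_API are hypothetical and do not come from the original project, and the speech synthesis itself is stubbed out.

```cpp
// TTS_API is a hypothetical macro: C linkage everywhere, plus dllexport on Windows.
#if defined(_WIN32)
#  define TTS_API extern "C" __declspec(dllexport)
#else
#  define TTS_API extern "C"
#endif

namespace {

// Hypothetical engine class that stays private to the DLL.
class TtsEngine {
public:
    int Initialize() {
        initialized_ = true;
        return 0;  // 0 means success in this sketch
    }

    int Speak(const char* text) {
        // Reject calls made before Initialize() and empty or null input.
        if (!initialized_ || text == nullptr || text[0] == '\0') {
            return -1;
        }
        // A real DLL would start the synthesis here; this stub only validates.
        return 0;
    }

private:
    bool initialized_ = false;
};

// Single engine instance shared by all exported functions.
TtsEngine& Engine() {
    static TtsEngine engine;
    return engine;
}

}  // namespace

// Exported entry points: plain C signatures, errors reported as return codes.
TTS_API int tts_initialize() noexcept {
    try { return Engine().Initialize(); } catch (...) { return -1; }
}

TTS_API int tts_speak(const char* text) noexcept {
    try { return Engine().Speak(text); } catch (...) { return -1; }
}
```

With this shape, the host program only needs the two function names and their int return codes; the class and any C++/CX types stay inside the DLL.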


In MainPage, I create an instance of the DLL class and invoke the "ttsInitialize" function as in the code below.

 TextToSpeechDll* gf = new TextToSpeechDll();

In the Dll.h file, I initialize MediaElement as in the code below; the rest of the code is the same as yours.

MediaElement^ media = ref new MediaElement();

When I run the project, it works. Try it, and if you still have issues, please show in detail how you initialize your DLL.


I found an answer to my question. Instead of using MediaElement, I used MediaPlayer. Now it works, but I still need to figure out how to make the engine speak without a fixed time limit. The Sleep( 3000 ) means that the voice will speak for 3 seconds. However, if the sentence is longer than 3 seconds, it will be cut off. Here is the code of the program.

int TextToSpeechUwpDll::ttsSpeak( const char* text )
{
   SpeechSynthesizer ^speak = ref new SpeechSynthesizer();
   MediaPlayer ^player = ref new MediaPlayer;

   // Convert the narrow (ANSI code page) input to a wide string for Platform::String.
   int wchars_num = MultiByteToWideChar( CP_ACP, 0, text, -1, NULL, 0 );
   wchar_t* texts = new wchar_t[wchars_num];
   MultiByteToWideChar( CP_ACP, 0, text, -1, texts, wchars_num );
   String ^sentence = ref new String( texts );
   delete[] texts; // String holds its own copy, so the temporary buffer can be freed.

   task<SpeechSynthesisStream ^> speakTask = create_task( speak->SynthesizeTextToStreamAsync( sentence ) );
   speakTask.then( [player, sentence]( SpeechSynthesisStream ^speechStream )
   {
      player->Source = MediaSource::CreateFromStream( speechStream, speechStream->ContentType );
      player->AutoPlay = false;
      Sleep( 3000 );
   } );

   return true;
}
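
One possible way around the fixed Sleep( 3000 ) is to wait for a "playback ended" notification instead of a fixed duration. The sketch below illustrates only the waiting pattern in standard C++: FakePlayer and SpeakAndWait are hypothetical stand-ins, not part of the UWP API. In the real DLL the corresponding hook would presumably be the MediaPlayer's MediaEnded event, but that wiring is an assumption and has not been tested here.

```cpp
#include <chrono>
#include <functional>
#include <future>
#include <thread>

// Hypothetical stand-in for a media player: "plays" for the given duration
// on a background thread, then invokes the on_ended callback once.
// Single use: call Play() at most once per instance.
class FakePlayer {
public:
    ~FakePlayer() {
        if (worker_.joinable()) worker_.join();
    }

    void Play(std::chrono::milliseconds duration, std::function<void()> on_ended) {
        worker_ = std::thread([duration, on_ended] {
            std::this_thread::sleep_for(duration);  // simulated playback
            on_ended();                             // playback finished
        });
    }

private:
    std::thread worker_;
};

// Blocks until the player reports completion rather than sleeping for a
// fixed time. The timeout is only a safety net against a callback that
// never fires. Returns true if playback ended before the timeout.
bool SpeakAndWait(std::chrono::milliseconds duration,
                  std::chrono::milliseconds timeout) {
    std::promise<void> ended;
    std::future<void> done = ended.get_future();

    // Declared after the promise so that its destructor (which joins the
    // worker thread) runs while the promise is still alive.
    FakePlayer player;
    player.Play(duration, [&ended] { ended.set_value(); });

    return done.wait_for(timeout) == std::future_status::ready;
}
```

The point of the design is that the wait ends as soon as the sentence does, whether it lasts half a second or thirty, and the caller can still tell a timeout apart from a normal finish.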