Speech synthesis and recognition were both introduced in .NET 3.0. They both live in System.Speech.dll. In the past, I already talked about speech synthesis in the context of ASP.NET Web Form applications, this time, I’m going to talk about speech recognition.
.NET has in fact two APIs for that:
System.Speech, which comes with the core framework;
Microsoft.Speech, offering a somewhat similar API, but requiring a separate download: Microsoft Speech Platform Software Development Kit.
I am going to demonstrate a technique that makes use of HTML5 features, namely, Data URIs and the getUserMedia API and also ASP.NET Client Callbacks, which, if you have been following my blog, should know that I am a big fan of.
First, because we have two APIs that we can use, let’s start by creating an abstract base provider class:
It basically features one method, RecognizeFromWav, which takes a physical path and returns a SpeechRecognitionResult (code coming next). For completeness, it also implements the Dispose Pattern, because some provider may require it.
In a moment we will be creating implementations for the built-in .NET provider as well as Microsoft Speech Platform.
The SpeechRecognition property refers to our Web Forms control, inheriting from HtmlGenericControl, which is the one that knows how to instantiate one provider or the other:
The control exposes some custom properties:
Culture: an optional culture name, such as “pt-PT” or “en-US”; if not specified, it defaults to the current culture in the server machine;
Mode: one of the two providers: Desktop (for System.Speech) or Server (for Microsoft.Speech, of Microsoft Speech Platform);
SampleRate: a sample rate, by default, it is 44100;
Grammars: an optional collection of additional grammar files, with extension .grxml (Speech Recognition Grammar Specification), to add to the engine;
Choices: an optional collection of choices to recognize, if we want to restrict the scope, such as “yes”/”no”, “red”/”green”, etc.
The mode enumeration looks like this:
The SpeechRecognition control also has an event, SpeechRecognized, which allows overriding the detected phrases. Its argument is this simple class that follows the regular .NET event pattern:
Which in turn holds a SpeechRecognitionResult:
This class receives the phrase that the speech recognition engine understood plus an array of additional alternatives, in descending order.
You need to add an assembly-level attribute, WebResourceAttribute, possibly in AssemblyInfo.cs, of course, replacing MyNamespace for your assembly’s default namespace:
This attribute registers a script file with some content-type so that it can be included in a page by the RegisterClientScriptResource method.
And here is it:
OK, let’s move on the the provider implementations; first, Desktop:
What this provider does is simply receive the location of a .WAV file and feed it to SpeechRecognitionEngine, together with some parameters of SpeechRecognition (Culture, AudioRate, Grammars and Choices)
Finally, the code for the Server (Microsoft Speech Platform Software Development Kit) version:
As you can see, it is very similar to the Desktop one. Keep in mind, however, that for this provider to work you will have to download the Microsoft Speech Platform SDK, the Microsoft Speech Platform Runtime and at least one language from the Language Pack.
Here is a sample markup declaration:
If you want to add specific choices, add the Choices attribute to the control declaration and separate the values by commas:
Or add a grammar file:
By the way, grammars are not so difficult to create, you can find a good documentation in MSDN: http://msdn.microsoft.com/en-us/library/ee800145.aspx.
And that’s it. You start recognition by calling startRecording(), get results in onSpeechRecognized() (or any other function set in the OnClientSpeechRecognized property) and stop recording with stopRecording(). The values passed to onSpeechRecognized() are those that may have been filtered by the server-side SpeechRecognized event handler.
A final word of advisory: because generated sound files may become very large, do keep the recordings as short as possible.
Of course, this offers several different possibilities, I am looking forward to hearing them from you!
This page will list all of my book reviews, in no particular order.
Right now, I am reviewing Microsoft Azure Development Cookbook Second Edition, just give me some more days, because it’s a long one.
Microsoft Azure Development Cookbook Second Edition
(Review not yet available)
Windows Presentation Foundation 4.5 Cookbook
NHibernate 2 Beginner's Guide
NHibernate 3.0 Cookbook
Visual Studio 2012 and .NET 4.5 Expert Development Cookbook
NHibernate in Action Book
Instant StyleCop Code Analysis How-to Review
Programming Entity Framework
Pro WF: Windows Workflow in .NET 3.5
I have already talked about SignalR in this blog. I think it is one of the most interesting technologies that Microsoft put out recently, not because it is something substantially new – AJAX, long polling and server-sent events have been around for quite some time -, but because of how easy and extensive they made it.
Most of the examples of SignalR usually are about chat. You know that I have been digging into HTML5 lately, and I already posted on media acquisition using HTML5’s getUserMedia API. This time, however, I’m going to talk about video streaming!
I wrote a simple ASP.NET Web Forms control that leverages SignalR to build a real-time communication channel, and this channel transmits images as Data URIs. The source of the feed comes from getUserMedia and is therefore available in all modern browsers, except, alas, one that shall remain unnamed. Any browser, however, can be used to display the streaming feed.
So, the architecture is quite simple:
One SignalR Hub for the communication channel;
One ASP.NET Web Forms control, that renders an HTML5 VIDEO tag that is used to display the video being acquired, on any compatible browser.
Here are some nice screenshots of my now famous home wall, where one of the browser instances, Chrome, is broadcasting to Firefox, Opera and Safari. Unfortunately, IE is not supported.
So, all we need is a control declaration on an ASP.NET Web Forms page, which could look like this:
The non-ordinary properties that the VideoStreaming control supports are:
Interval: the rate in milliseconds that the control broadcasts that the video is being broadcast; do not set it to less than 200, from my experience, it will not work well;
Source: if the control is set as a streaming source, or just as a receiver (default);
And a simple CANVAS element for Target:
The reason for the two main streaming modes, Target and Event are flexibility: with Target, you directly draw the streamed picture on a CANVAS or IMG (it will detect automatically what to use), which needs to be already present, and with Event, you can do your own custom processing.
The source for the VideoStreaming control is this:
Renders a VIDEO tag element with the specified Width and Height;
Register a SignalR hub;
Removes any eventual VIDEO-related attribute that may be present on the control declaration;
Of course, we now need the hub code, which is as simple as it could be, just a method that takes the image as a Data URI and its original dimensions and broadcasts it to the world:
And finally the simple enumerations used by the VideoStreaming control:
And there you have it! Just drop the VideoStreaming control on an ASPX page and you’re done! If you want to have it broadcast, you need to set its Source property to true, for example, on the containing page:
That’s it. Very simple video streaming without the need for any plugin or server. Hope you enjoy it!
I wanted to be able to capture a snapshot and to upload it in AJAX style. Of course, if you know me you can guess I wanted this with ASP.NET Web Forms and client callbacks!
OK, so I came up with this markup:
The properties and events worth notice are:
Width & Height: should be self explanatory;
PictureTaken: a handler for a server-side .NET event that is raised when the picture is sent to the server asynchronously.
startCapture: when called, starts displaying the output of the camera in real time;
stopCapture: pauses the update of the camera;
takeSnapshot: takes a snapshot of the current camera output (as visible on screen) and sends it asynchronously to the server, raising the PictureTaken event.
When the startCapture method is called, the PictureSnapshot control (or, better, its underlying VIDEO tag) starts displaying whatever the camera is pointing to (if we so authorize it).
Here is an example of one of my home walls… yes, I deliberately got out of the way!
OK, enough talk, the code for PictureSnapshot looks like this:
As you can see, it inherits from WebControl. This is the simplest class in ASP.NET that allows me to output the tag I want, and also includes some nice stuff (width, height, style, class, etc). Also, it implements ICallbackEventHandler for the client callback stuff.
You may find it strange that I am setting explicitly the WIDTH and HEIGHT attributes, that is because VIDEO requires it in the tag, not just on the STYLE. In the way, I am removing any CROSSORIGIN, SRC, MUTED, PRELOAD, LOOP, AUTOPLAY, MEDIAGROUP, POSTER and CONTROLS attributes that may be present in the markup, because we don’t really want them.
The PictureTakenEventArgs class:
Finally, a simple event handler for PictureTaken:
Pictures arrive as instances of the Image class, and you can do whatever you want with it.
As always, feedback is greatly appreciated!
Continuing with my quest for reusable, no dependencies, Web Forms AJAX controls, this time I wanted a replacement for the venerable UpdatePanel control. Specifically, I wanted to address the following issues:
Allow the partial update of a region on my page, including all of its controls;
Have the ability to only send data for controls living inside my control, not everything else on the page (read, __VIEWSTATE);
I ended up with a CallbackPanel control, which is what I am going to talk about. Here is its declaration:
The CallbackPanel control supports some properties:
SendAllData: if all the data in the form should be sent, including viewstate, or just the data for the controls inside the CallbackPanel (default is false);
For causing an update, we call its callback function, passing it a parameter and an optional context:
The most important property in CallbackPanel is the server-side event, OnCallback: this is raised whenever the callback function is called:
This event receives a CallbackEventArgs argument, which is nothing more than:
And finally, the code for the CallbackPanel itself:
Again, it is implementing ICallbackEventHandler, for client callbacks, but this time it is inheriting from Panel, which is a nice container for other controls. The rest should be self-explanatory, I guess. If you have questions, do send them to me!
As always, hope you like it!
Set up a control that renders an AUDIO tag;
Generate a voice sound from the passed text parameter on the server and save it into an in-memory stream;
Convert the stream’s contents to a Data URI;
Return the generated Data URI to the client as the response to the client callback.
So, first of all, my markup looks like this:
As you can see, the SpeechSynthesizer control features a few optional properties:
- Age: the age for the generated voice (default is the one of the system’s default language);
- Gender: gender of the generated voice (same default as per Age);
- Culture: the culture of the generated voice (system default);
- Rate: the speaking rate, from –10 (fastest) to 10 (slowest), where the default is 0 (normal rate);
- Ssml: if the text is to be considered SSML or not (default is false);
- Volume: the volume %, between 0 and 100 (default);
- VoiceName: the name of a voice that is installed on the server machine.
The Age, Gender and Culture properties and the VoiceName are exclusive, you either specify one or the other. If you want to know the voices installed on your machine, have a look at the GetInstalledVoices method. If no property is specified, the speech will be synthesized with the operating system’s default (Microsoft Anna on Windows 7, Microsoft Dave, Hazel and Zira on Windows 8, etc). By the way, you can get additional voices, either commercially or for free, just look them up in Google.
Without further delay, here is the code:
As you can see, the SpeechSynthesizer control inherits from HtmlGenericControl, this is the simplest out-of-the-box class that will allow me to render my tag of choice (in this case, AUDIO); by the way, this class requires that I decorate it with a ConstructorNeedsTagAttribute, but you don’t have to worry about it. It implements ICallbackEventHandler for the client callback mechanism. I make sure that all of AUDIO’s attributes are removed from the output, because I don’t want them around.
Inside of it, I have an instance of the SpeechSynthesizer class, the one that will be used to do the actual work. Because this class is disposable, I make sure it is disposed at the end of the control’s life cycle. Based on the parameters being supplied, I either call the SelectVoiceByHints or the SelectVoice methods. One thing to note is, we need to set up a synchronization context, because the SpeechSynthesizer works asynchronously, so that we can wait for its result.
A full markup example would be:
And that’s it! Have fun with speech on your web apps!
A lot has changed with regard to events and event handling since the old days of HTML. Let’s have a brief look at it.
I have been playing recently with HTML5, and one thing that I got to understand really well was the new upload mechanisms available. Specifically, I wanted to understand how SkyOneDrive, Google Drive, Dropbox, etc, all support dropping files from the local machine, and how to use it in an ASP.NET Web Forms (sorry!) project, and I got it!
So, I want to have a panel (a DIV) that allows dropping files from the local machine; I want to be able to filter these files by their content type, maximum length, or by any custom condition. If the dropped files pass all conditions, they are sent asynchronously (read, AJAX) and raise a server-side event. After that, if they are multimedia files, I can preview them on the client-side.
My markup looks like this:
The UploadPanel control inherits from the Panel class, so it can have any of its properties. In this case, I am setting a specific width, height and border, so that it is easier to target.
Out of the box, it supports the following validations:
MaximumFiles: The maximum number of files to drop; if not set, any number is accepted;
MaximumLength: The maximum length of any individual file, in bytes; if not set, any file size is accepted;
ContentTypes: A comma-delimited list of content types; can take a precise content type, such as “image/gif” or a content type part, such as “image/”; if not specified, any content type is accepted.
When a file is dropped into the UploadPanel, the following client-side events are raised:
OnValidationFailure: Raised whenever any of the built-in validations (maximum files, maximum length and content types) fails, for any of the dropped files;
OnBeforeUpload: Raised before the upload starts, and after all built-in validations (maximum files, maximum length and content types) succeed; this gives developers a chance to analyze the files to upload and to optionally cancel the upload, or to add additional custom parameters that will be posted together with the files;
OnUploadFailure: Raised if the upload fails for some reason;
OnUploadCanceled: Raised if the upload is canceled;
OnUploadProgress: Raised possibly several times as the file is being uploaded, providing an indication of the total upload progress;
OnUploadSuccess: Raised when the upload terminates successfully;
OnUploadComplete: Raised when the upload completes, either successfully or not;
OnPreviewFile: Raised for each multimedia file uploaded (images, videos, sound), to allow previewing it on the client-side; the handler function receives the file as a data URL.
The Upload event takes a parameter of type UploadEventArgs which looks like this:
This class receives a list of all the files uploaded and also any custom properties assigned on the OnBeforeUpload event. It also allows the return of a response string, that will be received by the OnUploadSuccess or OnUploadComplete client-side events:
Finally, the code:
The UploadPanel class inherits from Panel and implements ICallbackEventHandler, for client callbacks. If you are curious, the __CALLBACKID, __CALLBACKPARAM, __EVENTTARGET and __EVENTARGUMENT are required for ASP.NET to detect a request as a callback, but only __CALLBACKID needs to be set with the unique id of the UploadPanel control.
HTML5 offers a lot of exciting new features. Stay tuned for some more examples of its integration with ASP.NET!
I have talked about client callbacks in the past, and even provided a general-purpose control for invoking code on the server-side. This time, I will provide two more examples:
Client Callbacks are probably the less known (and I dare say, less loved) of all the AJAX options in ASP.NET, which also include the UpdatePanel, Page Methods and Web Services. The reason for that, I believe, is it’s relative complexity:
Dynamically register function that calls the above reference;
So, here’s what I want to do:
Have a DOM element which exposes a method that is executed server side, passing it a string and returning a string;
Have a server-side event that handles the client-side call;
Have two client-side user-supplied callback functions for handling the success and error results.
The control itself looks like this:
And the event argument class:
You will notice two properties on the CallbackControl:
Async: indicates if the call should be made asynchronously or synchronously (the default);
SendAllData: indicates if the callback call will include the view and control state of all of the controls on the page, so that, on the server side, they will have their properties set when the Callback event is fired.
The CallbackEventArgs class exposes two properties:
Argument: the read-only argument passed to the client-side function;
Result: the result to return to the client-side callback function, set from the Callback event handler.
An example of an handler for the Callback event would be:
Finally, in order to fire the Callback event from the client, you only need this:
The syntax of the callback function is:
arg: some string argument;
context: some context that will be passed to the callback functions (success or failure);
callbackSuccessFunction: some function that will be called when the callback succeeds;
callbackFailureFunction: some function that will be called if the callback fails for some reason.
Give it a try and see if it helps!