Contents tagged with Speech
-
Building Custom Activities Using the Core API
The Speech Server API is interesting to play around with. And understanding how Speech Server works behind the scenes is invaluable in debugging. But the real value of learning the API comes when you decide to build your own custom activities....
-
Getting Started with the Core API
With the introduction of Voice Response Workflows in Speech Server 2007, Microsoft has greatly simplified voice-enabled application development. The entire process is relatively painless, even downright enjoyable, and for most projects it's all you'll need to build outstanding applications. ...
-
New Blog: Speaking From the Edge
I've started a new blog over on GotSpeech.net called Speaking From the Edge. It covers topics related to voice-enabled application development including Microsoft Speech Server.
I'm not abandoning this site, however; I will continue to post here as well. I'm also putting some thought into re-launching this blog to focus on other topics I'm interested in.
-
Executing a Speech Server Workflow via the API
In my previous post I outlined a basic framework for using the Core API for Speech Server 2007. Today I'll outline how to mix the API and Workflow models by calling out to a workflow from the API and returning control when it is complete.
If you are interested in the complete project code you may download it here.
I'm starting with the same basic framework from my last post. To this I'm adding a simple Voice Response Workflow with a single Statement activity. Rather than calling the synthesizer from the API, I'm going to use the Statement activity inside the workflow. We're going to pass in the text we want it to play at runtime.
1) The first thing we need to do is add a new Voice Response Workflow to the project. To this we'll add a single Statement activity. Because we've established the call using the API, there are no AnswerCall, MakeCall, or DisconnectCall activities in this workflow.
2) Now that we have our really-big-and-complex workflow ready, we can start adding some code to set the prompt. The first thing we need to do is add a handler for the TurnStarting event. This is where we will assign the MainPrompt property for the activity.
3) Next we need to add a property that we can pass our input parameter into. We'll call it MyPrompt. The resulting code-behind should look like the following:
using System;
using System.ComponentModel;
using System.ComponentModel.Design;
using System.Collections;
using System.Diagnostics;
using System.Drawing;
using System.Workflow.ComponentModel.Compiler;
using System.Workflow.ComponentModel.Serialization;
using System.Workflow.ComponentModel;
using System.Workflow.ComponentModel.Design;
using System.Workflow.Runtime;
using System.Workflow.Activities;
using System.Workflow.Activities.Rules;
using Microsoft.SpeechServer.Dialog;

namespace VoiceResponseWorkflowApplication2
{
    public sealed partial class Workflow1 : SpeechSequentialWorkflowActivity
    {
        public Workflow1()
        {
            InitializeComponent();
        }

        private string _myPrompt;

        public string MyPrompt
        {
            get { return _myPrompt; }
            set { _myPrompt = value; }
        }

        private void statementActivity1_TurnStarting(object sender, TurnStartingEventArgs e)
        {
            statementActivity1.MainPrompt.SetText(MyPrompt);
        }
    }
}
4) Now we need to add some code to kick off the workflow. We'll do this from within the OpenCompleted event handler in Class1.cs. This code establishes an input parameter Dictionary<>, instantiates the workflow object, and starts the workflow. We'll add a handler for the WorkflowCompleted event so that we can clean up the call once the workflow is done.
Dictionary<string, object> inputParam = new Dictionary<string, object>();
inputParam.Add("MyPrompt", "HelloWorld");
myWorkflow = SpeechSequentialWorkflowActivity.CreateWorkflow(_host, typeof(Workflow1), inputParam);
myWorkflow.WorkflowRuntime.WorkflowCompleted += new EventHandler<WorkflowCompletedEventArgs>(WorkflowRuntime_WorkflowCompleted);
SpeechSequentialWorkflowActivity.Start(myWorkflow);

One interesting item here is the inputParam object. The way this works is that the parameters you pass in are assigned to the corresponding public properties of the workflow. If you pass an input parameter for which there is no matching property, you will get an exception.
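To make the mapping a bit more concrete, here is a small hypothetical sketch of my own (it is not part of the downloadable project) that you could drop into the same OpenCompleted handler. The dictionary keys must exactly match the names of public properties on the workflow type, so a quick reflection check will catch a typo before the workflow is created. I haven't verified exactly where Speech Server surfaces the exception for a mismatched key, so treat this as optional belt-and-suspenders.

// Hypothetical sketch: verify each input parameter maps to a public property
// on Workflow1 before handing the dictionary to CreateWorkflow.
Dictionary<string, object> inputParam = new Dictionary<string, object>();
inputParam.Add("MyPrompt", "HelloWorld");   // matches the Workflow1.MyPrompt property

foreach (string key in inputParam.Keys)
{
    if (typeof(Workflow1).GetProperty(key) == null)
    {
        // A misspelled key like "MyPromt" would be caught here instead of failing later.
        throw new ArgumentException("Workflow1 has no public property named " + key);
    }
}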
The complete Class1.cs:
using System;
using System.Collections.Generic;
using System.Text;
using Microsoft.SpeechServer ;
using Microsoft.SpeechServer.Dialog;
using System.Workflow.Runtime;

namespace VoiceResponseWorkflowApplication2
{
    public class Class1 : IHostedSpeechApplication
    {
        private IApplicationHost _host;
        private WorkflowInstance myWorkflow;

        public void Start(IApplicationHost host)
        {
            if (host != null)
            {
                _host = host;
                _host.TelephonySession.CurrentUICulture = System.Globalization.CultureInfo.GetCultureInfo("en-US");

                // Dial an outbound call (make sure you change these numbers :-)
                _host.TelephonySession.OpenCompleted += new EventHandler<AsyncCompletedEventArgs>(TelephonySession_OpenCompleted);
                _host.TelephonySession.OpenAsync("7813062200", "8887006263");
            }
            else
            {
                throw new ArgumentNullException("host");
            }
        }

        void TelephonySession_OpenCompleted(object sender, AsyncCompletedEventArgs e)
        {
            if (e.Error != null)
            {
                _host.TelephonySession.Close();
            }
            else
            {
                Dictionary<string, object> inputParam = new Dictionary<string, object>();
                inputParam.Add("MyPrompt", "HelloWorld");
                myWorkflow = SpeechSequentialWorkflowActivity.CreateWorkflow(_host, typeof(Workflow1), inputParam);
                myWorkflow.WorkflowRuntime.WorkflowCompleted += new EventHandler<WorkflowCompletedEventArgs>(WorkflowRuntime_WorkflowCompleted);
                SpeechSequentialWorkflowActivity.Start(myWorkflow);
            }
        }

        void WorkflowRuntime_WorkflowCompleted(object sender, WorkflowCompletedEventArgs e)
        {
            _host.TelephonySession.Close();
        }

        public void Stop(bool immediate)
        {
        }

        public void OnUnhandledException(Exception exception)
        {
            if (exception != null)
            {
                _host.TelephonySession.LoggingManager.LogApplicationError(100, "An unexpected exception occurred: {0}", exception.Message);
            }
            else
            {
                _host.TelephonySession.LoggingManager.LogApplicationError(100, "An unknown exception occurred: {0}", System.Environment.StackTrace);
            }

            _host.OnCompleted();
        }
    }
}

-
Getting Started with the Speech Server 2007 API
Speech Server 2007 has a really cool Windows Workflow based programming model that lets you quickly build interactive voice response applications. For many applications it is all you will ever need.
Sometimes, however, you find the workflow model just isn't the right fit. If you're looking for really fine-grained control over the application, or you simply prefer to work in code, then the Core API is what you need.
Unfortunately, figuring out that you want to use the API is a lot easier than figuring out how to start using it. There is very little documentation, and there are no Visual Studio project templates or samples included with Speech Server.
I'll do my best to give a brick-simple explanation of how to get your first core API project started. You can also download the zipped project files.
1) First you'll need to create a new Voice Response Workflow Application. We'll use the project that gets generated as our foundation.
2) When asked for the application resources you'll want to uncheck everything.
3) Open up the Class1.cs file and remove all of the references to the VoiceResponseWorkflow1 class. The resulting class should look like the following (I removed the comments in the code for brevity):
using System;
using System.Collections.Generic;
using System.Text;
using Microsoft.SpeechServer ;
using Microsoft.SpeechServer.Dialog;

namespace VoiceResponseWorkflowApplication1
{
    public class Class1 : IHostedSpeechApplication
    {
        private IApplicationHost _host;

        public void Start(IApplicationHost host)
        {
            if (host != null)
            {
                _host = host;
                _host.TelephonySession.CurrentUICulture = System.Globalization.CultureInfo.GetCultureInfo("en-US");
            }
            else
            {
                throw new ArgumentNullException("host");
            }
        }

        public void Stop(bool immediate)
        {
        }

        public void OnUnhandledException(Exception exception)
        {
            if (exception != null)
            {
                _host.TelephonySession.LoggingManager.LogApplicationError(100, "An unexpected exception occurred: {0}", exception.Message);
            }
            else
            {
                _host.TelephonySession.LoggingManager.LogApplicationError(100, "An unknown exception occurred: {0}", System.Environment.StackTrace);
            }

            _host.OnCompleted();
        }
    }
}
That's all, folks. Class1.cs is now the starting point of your Core API application. As a further example, let's take the project and add some code to turn it into an outbound dialing "Hello World" application.
using System;
using System.Collections.Generic;
using System.Text;
using Microsoft.SpeechServer ;
using Microsoft.SpeechServer.Dialog;

namespace VoiceResponseWorkflowApplication1
{
    public class Class1 : IHostedSpeechApplication
    {
        private IApplicationHost _host;

        public void Start(IApplicationHost host)
        {
            if (host != null)
            {
                _host = host;
                _host.TelephonySession.CurrentUICulture = System.Globalization.CultureInfo.GetCultureInfo("en-US");

                // Dial an outbound call (make sure you change these numbers :-)
                _host.TelephonySession.OpenCompleted += new EventHandler<AsyncCompletedEventArgs>(TelephonySession_OpenCompleted);
                _host.TelephonySession.OpenAsync("7813062200", "8887006263");
            }
            else
            {
                throw new ArgumentNullException("host");
            }
        }

        void TelephonySession_OpenCompleted(object sender, AsyncCompletedEventArgs e)
        {
            if (e.Error != null)
            {
                _host.TelephonySession.Close();
            }
            else
            {
                _host.TelephonySession.Synthesizer.SpeakCompleted += new EventHandler<Microsoft.SpeechServer.Synthesis.SpeakCompletedEventArgs>(Synthesizer_SpeakCompleted);
                _host.TelephonySession.Synthesizer.SpeakAsync("Hello World", Microsoft.SpeechServer.Synthesis.SynthesisTextFormat.PlainText);
            }
        }

        void Synthesizer_SpeakCompleted(object sender, Microsoft.SpeechServer.Synthesis.SpeakCompletedEventArgs e)
        {
            _host.TelephonySession.Close();
        }

        public void Stop(bool immediate)
        {
        }

        public void OnUnhandledException(Exception exception)
        {
            if (exception != null)
            {
                _host.TelephonySession.LoggingManager.LogApplicationError(100, "An unexpected exception occurred: {0}", exception.Message);
            }
            else
            {
                _host.TelephonySession.LoggingManager.LogApplicationError(100, "An unknown exception occurred: {0}", System.Environment.StackTrace);
            }

            _host.OnCompleted();
        }
    }
}

-
VoiceXML on Speech Server
Yesterday I posted about an issue with Speech Server and Vista. One reader named Bill asked a question in the comments. My response was a bit long for a comment so I decided to turn it into a separate post instead.
Hey Marc, are you using Microsoft Speech Server with VXML? If so, what hardware are you using on it? Also, does MSS support CCXML?
-Bill

Yes, I'm using quite a bit of VoiceXML. Most of the applications I work on are written to run against the Nuance Voice Platform. I've been using VXML so that I could run them against either platform (or any other platform for that matter).
There are some issues that I ran into where I was using Nuance-specific properties (example) that Microsoft doesn't have VXML equivalents for. In those cases I needed to write them using the Speech Server managed model.
The key thing to keep in mind is that Microsoft has implemented the VXML spec pretty much verbatim. So as long as your application is pure VXML you should be fine.
I haven't put Speech Server through any sizing tests so I'm not sure what the hardware requirements will be in the end. That said, my development machine is a DELL D830 with 4GB of RAM running Vista Ultimate. In the lab I'm using a DELL 1950 with 4GB of RAM running Windows Server 2003. In both cases I'm using a Dialogic DMG2000 gateway.
As for CCXML, they don't support it and I don't see that changing. I actually think CCXML is going to go the way of SALT. With only Voxeo supporting a real CCXML implementation, I don't think there is going to be a lot of call for it. Also, everything you would want to do with CCXML can be done using Speech Server's Managed API. This is just a guess on my part; I don't have any inside knowledge as to what Microsoft's roadmap looks like.
-
Notiva
Around 18 months ago I started a new position with Parlance Corporation. I'm proud to say I've delivered my first product: Notiva.
Essentially Notiva is an outbound messaging service which gives developers the ability to add voice, email, and SMS messaging to any application, infrastructure, or architecture with just a few lines of code.
It has been a while since I last rolled out a completely new product. This has been, without question, one of the most rewarding products I've ever worked on. Frankly, it has been an absolute blast to work on.
I'm really excited about Notiva and where it is headed. We're putting the final touches on a full application built on Notiva now. We've already had one partner integrate it into an existing product.
If you would like to give it a try you can drop me an email or check out www.notiva.com.
-
Straight VoiceXML vs. Windows Workflow
There is an interesting post over on GotSpeech.NET (VXML vs. Workflow for Speech Server 2007) that compares speech development using VoiceXML vs. the Windows Workflow model available in Speech Server 2007.
Given that most of my work is building applications in C# and ASP.NET for the Nuance Voice Platform (NVP), I've got quite a lot more experience with VXML than Workflow (or SALT). I partially disagree with him when he cites a "longer development cycle" with VXML; it is all about familiarity with the language and platform. But for the most part I think he makes good points.
I think the Workflow model is interesting but I'm wary of tying myself to a single voice platform. I much prefer the flexibility of moving between Nuance, Microsoft, Voxeo, etc. as needed. Each platform brings a different strength to the table and it seems like a bad idea to limit my options at this point.
-
VoiceXML with Visual Studio
Every so often I'm surprised by the incredible flexibility built into Visual Studio 2005.
I've been writing a lot of VoiceXML lately and I was really missing the IntelliSense that I've become so used to. On a whim I tried opening a VoiceXML document in Visual Studio and much to my surprise it worked!
It turns out that Visual Studio is capable of understanding the syntax of a document based on its DOCTYPE. In my case it saw <!DOCTYPE vxml PUBLIC "-//W3C//DTD VOICEXML 2.1//EN" "http://www.w3.org/TR/voicexml21/vxml.dtd"> and was able to automatically give me basic IntelliSense and syntax checking for VoiceXML version 2.1.
As an example, create a new XML document and insert the following:
<?xml version="1.0" encoding="utf-8" ?>
<!DOCTYPE vxml PUBLIC "-//W3C//DTD VOICEXML 2.1//EN" "http://www.w3.org/TR/voicexml21/vxml.dtd">
<vxml version="2.1">
</vxml>

You'll notice that the last element (</vxml>) gives you a warning. Hovering over it tells you not only that you're missing an element but what the valid elements might be!
This is all very cool if you ask me...