Dragon NaturallySpeaking vs Microsoft Speech Recognition

I recently heard from a large reseller of Dragon NaturallySpeaking who thinks ScanSoft Nuance should start giving away Dragon NaturallySpeaking Preferred as a response to the competitive threat from the next version of Microsoft speech recognition, to be released with Windows Vista. The theory is that now is the time to get the general market "hooked" on DNS. While it's certainly an easy approach to take, it suffers from a few problems.

 

The first problem is that switching costs for non-macro versions of speech recognition systems are not very high: just a few minutes of training, and an import of custom vocabularies. The primary speech user interface remains essentially the same (more on this later). In other markets, competitors against Microsoft might be able to rely on users' knee-jerk dislike of “M$”, but it doesn't appear that ScanSoft has earned any customer loyalty. The second problem is that speech recognition is one of the remaining product domains where quality really matters. 98% recognition accuracy might be great from a speech recognition research perspective, but it's really just not very good from a user's perspective. Lots of work still needs to be done. And I expect users will go towards the product with greater accuracy and more responsiveness.
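
To put the 98% figure in perspective, here is a rough back-of-the-envelope sketch. The 500-word page and the 10 seconds per correction are illustrative assumptions of mine, not measurements, and the function names are made up for this example.

```python
def expected_errors(word_count, accuracy_percent):
    """Expected number of misrecognized words in a dictated passage."""
    if not 0 <= accuracy_percent <= 100:
        raise ValueError("accuracy_percent must be between 0 and 100")
    return word_count * (100 - accuracy_percent) / 100


def correction_seconds(word_count, accuracy_percent, seconds_per_fix):
    """Rough time spent fixing errors, assuming a fixed cost per correction."""
    return expected_errors(word_count, accuracy_percent) * seconds_per_fix


if __name__ == "__main__":
    # Assumed values: a 500-word page and 10 seconds to fix each error.
    for accuracy in (95, 98, 99):
        errors = expected_errors(500, accuracy)
        seconds = correction_seconds(500, accuracy, 10)
        print(f"{accuracy}% accurate: about {errors:.0f} errors, "
              f"{seconds:.0f} s of fixing")
```

Under those assumptions, even a seemingly high accuracy leaves the user stopping to correct something every few sentences, which is why small accuracy gains matter so much to users.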

 

Let me instead offer my own suggestions for how Nuance can compete.

 

Nuance needs to open up. While I understand that they have significant intellectual property protection issues, listening and talking to customers can only help. Allowing employees to have their own weblogs and to participate in online discussion forums would go a long way towards reducing the enmity many of its most active users feel towards the company. Microsoft has at least 5 speech-based weblogs (Sprague WebLog, Robert Brown, SpeechLead, Jen's Weblog, Rob's Rhapsody). In contrast, the silence from Nuance is deafening.

 

And of course the product needs to be improved. Some of the needed changes are architectural; dictation support for Word and Rich Text edit controls is no longer sufficient. Certainly RTF had its use back in the day, but the world has moved on. Adding support for the Windows Text Services Framework (TSF) would make it easier for application vendors to have built-in free support for speech recognition. Without this, application vendors have a choice to make when developing their applications. If they support the TSF, they automatically support Windows speech recognition, tablet input, virtual keyboard input and whatever other devices Windows ends up supporting. Or they can build specifically for DNS. An easy choice, I would think, for any application vendor.

 

The GUI needs improvement. This is well-covered territory, but specifically the command browser is in desperate need of a mercy killing. I would follow up with a redesign of the DragonBar, throw in some new icons, and allow users to change the font size of the correction menu.

 

API - $2000 for the documentation of an ActiveX-based API is just silly. This is an area where Nuance needs to meet Microsoft's pricing. With Windows Vista, not only will users get a free speech recognition product, but developers will get a free speech API. A native .NET API and more exposed functionality would help.

 

Innovation - this is needed on two different levels. Real improvements are needed in accuracy and responsiveness, but more than that, Nuance needs to take a lead in speech user interfaces. As I mentioned before, the basic command structures for Dragon NaturallySpeaking and Microsoft Speech Recognition are quite similar. Users trained in one will really have no difficulty migrating to the other. But there's plenty of room for improvement in speech user interfaces.

 

Speech user interfaces are, in essence, the command line all over again - the user is expected to know in advance how to use the system.

One of the design philosophies at Applied Recognition is that the speech user interface is backed up with a graphical user interface - there's no need to guess what the command might be. In a sense this is a start towards an inductive user interface for speech. Innovation in this area from Nuance is needed to improve the user experience for its customers, and make it more difficult for its customers to switch to competing products.

Published Wednesday, October 26, 2005 4:16 PM by swein

Comments

Wednesday, August 02, 2006 12:11 PM by Jan Rychter

# re: Dragon NaturallySpeaking vs Microsoft Speech Recognition

Couldn't agree more, especially with the "opening up" part. While working on an interface between NaturallySpeaking and XEmacs I was rather surprised to find so little information available and no way whatsoever to talk to the people behind the product.

Nuance should realize that these kinds of addons and interfaces will actually drive the sales of their products. I'm not there to compete with them, I just want to give people more ways of using NaturallySpeaking.

I really, really hope this will change.

Sunday, October 15, 2006 7:03 PM by Mark

# re: Dragon NaturallySpeaking vs Microsoft Speech Recognition

In my opinion, Dragon has been resting on its laurels for the last two versions. In fact, I upgraded to Dragon 9 and consequently had to uninstall and revert to Dragon 8 because of several "showstopper" bugs and absolutely zero new features or perceivable increase in recognition accuracy.

I was appalled to find that I had to spend $10 just to report the bugs or ask if there was a workaround. I've never purchased a software product that did not include free tech support.

I was further surprised to find that I had to spend $2000 if I wanted documentation on their SDK. That boggled my mind.

Behavior like this can only occur in monopolistic situations, which Dragon has enjoyed in the speech recognition market. I'm very excited for the competition that Vista speech will offer Dragon.

Sunday, December 24, 2006 2:24 PM by Peter Maddern

# re: Dragon NaturallySpeaking vs Microsoft Speech Recognition

It'll be interesting to compare accuracy in a scientific way. One way to do this would be for each user to dictate the "Rainbow Passage" with Dragon and with WSR, using the same microphone with USB sound adaptor (if you use one of those) as input, and measure percentage accuracy in each programme. For statistical soundness, repeat the dictation, say, 3 times in each programme and measure the average. Here's the procedure I use:

http://speechempoweredcomputing.co.uk/Newsletter/?p=69

I might do it as and when I get around to installing Vista.

What I can say is I get 98%+ accuracy already with Dragon NaturallySpeaking.

Peter

www.speechempoweredcomputing.co.uk


Wednesday, September 03, 2008 6:02 PM by Walter Healy

# re: Dragon NaturallySpeaking vs Microsoft Speech Recognition

I have used Dragon Dictate since at least version 6. [I am a very poor typist and need all the help I can get.]

I just purchased version 10 and found it to be less accurate with more glitches than version 9, which in turn was less accurate and slower than version 8.

Over the last several days I have reinstalled version 8 on my PCs and it is relatively quick and accurate. I am much happier.

I have not tried the Vista voice recognition. Is it better than Dragon version 8?

I'm also troubled at how Nuance purchased and then suppressed the IBM voice-recognition product.

Does anyone know a current user blog that discusses these voice-recognition issues?

Tuesday, July 21, 2009 9:33 AM by bambara

# re: Dragon NaturallySpeaking vs Microsoft Speech Recognition

I'm currently using Dragon version 10 to write this. I found it has become much more accurate after I've used it for a while. It seems that, as Nuance has stated, the program adapts much better to the user after being used for a while. Sure, it makes a few mistakes here and there, but the more you use it the better it becomes and the better you become at using it. At least, that was the case with me.

Wednesday, April 21, 2010 1:22 AM by Bill Latam

# re: Dragon NaturallySpeaking vs Microsoft Speech Recognition

SDK API - $2000? I'd like to know more things like:

Are there videos included in this SDK?

What apps out there have been made using this DNS SDK?

Vista speech engine is embedded? Why is that?

I see man-machine communication through voice a bit far into the future?

Speech recognition technology monopoly? Why?

We're still tied up to mouse and keyboard?

The 1 million $ question: why use a speech recognition system that is less accurate than a human? This tool should naturally surpass human skill to justify its mass use. Fast and accurate is the human formula.

Saturday, July 10, 2010 5:00 AM by Mark Phillipson

# re: Dragon NaturallySpeaking vs Microsoft Speech Recognition

For some people with repetitive strain injury, speech recognition can be the difference between continuing to work and having to change jobs.

Wednesday, June 29, 2011 8:41 AM by bhuvanram

# re: Dragon NaturallySpeaking vs Microsoft Speech Recognition

So what you guys are saying is Dragon is good at accuracy, but I tried Microsoft SDK 5.1 and that's too weak in accuracy. Do we have any open source for this voice recognition?
