Plugged In: Good word — Voice recognition worth a try

By Joseph Kashi, for the Redoubt Reporter

Spring seems to be in the air, at least sporadically, and this week we have some new technology and photo suggestions that may put a “spring” in your step, just in time for Soldotna’s St. Patrick’s Day parade.

I was really late writing this week’s column due to the press of other matters, so it’s a really good thing that I purchased Dragon NaturallySpeaking version 10 and installed it on my home computer.

This week’s column is “written” using direct voice recognition into my computer. I did not have time to “train” the software installation to more perfectly recognize my own voice and accent — I’m using it straight out of the box. To give you an idea of the accuracy of NaturallySpeaking, neither I, nor my editor, will correct any typos or odd grammar in this section of my column. Any errors in this section are voice transcription problems.

Speech recognition has been around since about 1997 when IBM shipped a version of voice-recognition for its OS/2 operating system. Speech recognition never quite caught on because earlier versions were not very accurate, required substantial training, or somewhat difficult to correct mis-recognized text, and tended to be somewhat slow on older computer hardware, it never quite caught on. For many people, myself included, typing tended to be faster.

Version 10 of Nuance’s Dragon NaturallySpeaking Professional corrects all those problems and is a true pleasure to set up and use, even when you have a “distinctive” accent like myself. NaturallySpeaking did not require any significant training to recognize my voice almost perfectly. In fact, they’re probably fewer typos in this week’s column because I’ve been dictating it rather than typing it. The newest version of NaturallySpeaking can actually be said for many different English accents, including East Coast English, British English, Spanish, accented English, Australian accents etc.

Be sure to carefully correct any voice-recognition text, however. Several years ago, I dictated a lengthy brief for the court using an earlier version of NaturallySpeaking. I thought I carefully corrected that brief, but Judge Harold Brown, then our senior Superior Court Justice at Kenai, was sufficiently astute that he inquired on the record whether I had dictated that brief using voice recognition software. Obviously, I fail to catch some necessary corrections. Luckily, editing, particularly inserting text, is very easy. Just place your cursor where you wish to insert a word, a phrase or a sentence and began talking.

One of the more productive features of NaturallySpeaking Professional is its capability of recording a voice macro command. For example, I could easily create a command that would bring up letterhead already dressed to an appropriate party or attorney or bring up standard form contract clauses when drafting lengthy legal documents. To get to that point requires quite a bit of substantive work, mostly identifying appropriate form clauses, such as arbitration requirements, and then setting them up as a voice macro command.

Voice recognition software tends to work best with a very clean digital audio signal and with a very fast computer. Although Nuance packages an analog headset/microphone with NaturallySpeaking software, I found that a USB Digital Signal Processing (DSP) headset results in a cleaner digital input that’s recognize more accurately. The DSP headset includes hardware that is optimized for converting analog voice into a digital USB signal.

A fast computer also helps reduce the delay between the spoken word and its appearance on your computer screen. Nuance appears to use an approach pioneered by IBM in the late 1990s — a statistical model of how frequently specific words are within reasonable proximity to each other in average or more sophisticated speech patterns. This linguistic model then corrects any ambiguous voice-recognition by finding the best statistical match between clearly recognized words in close proximity with each other. That definitely increases recognition accuracy, but requires some additional computer performance. Although a slower computer will still get the job done, you may find it frustrating.

The professional version of Dragon NaturallySpeaking costs $199 and works with both 32-bit and 64-bit versions of Windows XP, Windows XP x64, Windows Vista and Windows 7. If you do a great deal of typing and are comfortable dictating at a reasonably fast clip, then NaturallySpeaking should be very productive technology in your office or business. In fact, writing this week’s overdue column is moving along so quickly that it feels as though I’m slacking off.

Editor’s note: This ends the unedited demonstration of the voice-recognition software.
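
For readers curious about how the statistical word-proximity model mentioned in the dictated section might work, here is a minimal Python sketch. It is not Nuance's or IBM's actual method, which the column does not detail. It is a toy "bigram" model that counts how often pairs of words appear next to each other in a small made-up sample text, then uses those counts to choose between two sound-alike words. The sample text and the function names train_bigrams and pick_candidate are assumptions made up for illustration.

```python
from collections import Counter

# Tiny made-up sample text (an assumption for illustration only);
# a real product would learn from a vastly larger body of text.
CORPUS = (
    "turn right at the corner . please write a letter . "
    "write a column . the right answer ."
)


def train_bigrams(text):
    """Count how often each pair of adjacent words occurs in the text."""
    words = text.split()
    return Counter(zip(words, words[1:]))


def pick_candidate(bigrams, previous_word, candidates, next_word):
    """Choose the candidate that most often appears beside its neighbors."""
    def score(word):
        # Counter returns 0 for word pairs it has never seen.
        return bigrams[(previous_word, word)] + bigrams[(word, next_word)]
    return max(candidates, key=score)


bigrams = train_bigrams(CORPUS)

# The recognizer heard something between "please" and "a" that could be
# either "write" or "right"; the neighboring words settle the question.
best = pick_candidate(bigrams, "please", ["write", "right"], "a")
print(best)
```

With this sample text, the sketch prefers "write" after "please" and before "a," because those word pairs occur in the sample while "please right" and "right a" never do. Scoring every ambiguous word against its neighbors in this way is extra work for the computer, which is consistent with the column's point that a faster machine helps.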

Southern peninsula photo project

Regular readers may recall a discussion a few weeks ago about the major photo and computer technology project being implemented in the south peninsula schools by the Homer Downtown Rotary Club, the Soldotna Rotary and the Taipei Twin Rotary Club of Taiwan.

Since then, we’ve had the opportunity to test the fast new computer hardware that’s required for any heavy-duty photo processing. Here’s what we found to be cost-effective, and it’s certainly both less expensive and faster than an off-the-shelf Dell computer. Several local computer stores can assemble and configure a similar computer at a price that beats a comparably equipped, yet slower, Dell or HP system.

AMD’s Phenom II X4 955 is one of the fastest yet least-expensive quad-core computer CPUs on the market. We used 4 gigabytes of RAM along with a quad-core 955 in a Gigabyte-brand GA-MA785GPMT-UD2H system board. Gigabyte system boards continue to enjoy a good reputation as highly reliable products at a fair price. These fairly inexpensive but highly rated system boards include an ATI HD4200 video chipset and 128 megabytes of video RAM on the system board, along with the usual high-speed Ethernet connection, RAID disk array hardware, many USB ports and on-board audio. I found the on-board video to be somewhat slower than an expensive plug-in video graphics-processing card, but this seemed like a reasonable trade-off.

We installed a 750-gigabyte Western Digital Caviar Black hard disk as a boot and program drive. Because video and modern photo files take up so much space, especially when you have 100 or more students using the same computer, we also installed a second Western Digital Caviar Black hard disk, a 1,000-gigabyte drive, solely for photo and video storage.

Losing that much data would truly be a tragedy, particularly for the students and their communities, so Rotary also provided a third hard disk, a 1,000-gigabyte Western Digital drive in an ESATA external enclosure to be used for data backup.

Photo and video editing requires an accurate, preferably quite large LCD color monitor. We purchased ViewSonic VS 2323 23-inch monitors from Costco for $169 each. Most of the accessories we purchased, including desktop microphones, speaker systems and webcams, are Logitech products, which generally do their job reliably, without fuss and at a fair price.

Because no single printer does everything equally well or equally economically, we purchased a number of printers for the computers used at the main photo project sites. Large prints will be made on an HP DesignJet 130r, the newest model of a venerable large-format printer line. This is a well-proven, economical printer that prints up to 24 inches wide on roll paper.

The recently introduced DesignJet 130r includes automatic roll feed and an automatic paper cutter as standard equipment, at a remarkably low price of $1,250, plus shipping. HP even includes a full set of large-capacity ink cartridges. The DesignJet 130r feeds roll paper more reliably when used on an HP printer stand, so Rotary also purchased a stand and print bin for each printer.

Finally, for lighter-duty daily printing needs, up to 13-by-19-inch prints, we acquired Canon Pixma 9000 Mark II sheet-fed printers. These are not only economical to purchase and operate, but also have an excellent reputation for quality. Amazon had the best price, at $415 each with free shipping.

Next week, I’ll discuss some surprising results and excellent bargains found when buying 80 new cameras for six quite different schools.

Local attorney Joe Kashi received his bachelor’s and master’s degrees from MIT and his law degree from Georgetown University. He has published many articles about computer technology, law practice and digital photography in national media since 1990. Many of his technology and photography articles can be accessed through his Web site, www.kashilaw.com.


One Response to Plugged In: Good word — Voice recognition worth a try

  1. Thanks for the info. I have a friend who is chronicling her battle with cancer. She uses a private blog/newsletter to keep friends up to date. The treatments are beginning to cause fatigue, so typing is becoming tedious. I am seriously considering giving her a voice recognition system. This blog has made the decision easier as to which kind.
