Expanding the Use of Speech Recognition Technology

Vol. 13 • Issue 10 • Page 21

If the health care industry is going to make a significant dent in the estimated $6 billion annual transcription expenditure, speech recognition must expand beyond a few departments. Having taken firm root in radiology, the technology is now poised to grow across the entire enterprise. In fact, speech recognition is extending horizontally across medical disciplines and vertically across health care organizations of all sizes, from large teaching hospitals to small physician groups.

This expanding adoption partly results from technological advances that have not only improved accuracy, broadened medical vocabularies and reduced training time, but also given users far greater flexibility than in the past. On the input side, speech recognition software provides good results from dictation via telephones (the dominant dictation device) as well as from PC microphones and mobile devices. On the completion side, final editing of recognized documents can be automatically delegated to transcriptionists, or providers can retain complete end-to-end control of their documentation through convenient, voice-driven self-editing. This flexibility minimizes disruption to the existing workflows of hospitals, clinics and groups.

The market drivers for expanded adoption of speech recognition are equally compelling. Despite the billions spent on labor and the efficiencies derived from current-generation digital dictation systems, many providers face persistent backlogs because of growing demands for patient reporting. Medical transcriptionists (MTs) are in short supply, so adding labor is not the answer. Neither is outsourcing, which only defers the long-run problem and does not substantially reduce transcription costs. Indeed, transcription service firms face many of the same pressures and are exploring speech recognition to help them raise their own productivity.

Technology must clearly play a significant role if the situation is to improve dramatically. Realization of this fact, along with the understanding that the technology can also contribute to better-quality documentation, is fueling the strong interest in speech recognition. Gartner predicts that “2003 will be a breakout year for continuous speech recognition” in health care.1

A Framework for Evaluating ROI

With its high impact potential for reducing medical transcription costs, speech recognition is a return on investment (ROI)-driven technology. All information technology (IT) investments are being subjected to rigorous financial analysis today, so it pays to take a more systematic approach to identifying all the relevant cost components in the health information management (HIM) department as well as the appropriate measurement for each. In addition to direct transcription labor cost, both inside and outsourced, other key categories include:

• Overhead costs in the department, such as supervisors, dictation equipment, phones, etc. For at-home workers, consideration must be given to remote connectivity costs.

• Distribution costs, such as printers, fax machines and all the ongoing costs and people associated with the process.

• Other overhead allocations from the organization, including space cost, utilities, etc.

Of course, several of these costs must be allocated, and it is helpful to determine ways to apply all costs on a per line, per minute or per MT basis. From there, you can determine which cost elements are impacted by implementing speech recognition. With the projected productivity gains, you can apply the overall cost metrics to obtain your savings.
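
As a rough illustration of this allocation step, the sketch below spreads hypothetical cost categories over an annual line count and estimates savings from a projected productivity gain. Every name and figure in it is an assumption made for illustration, not data from any customer. In particular, the `affected_share` field (the fraction of a category that shrinks along with MT labor) and the rule that a gain of g cuts the labor needed for a fixed volume by g / (1 + g) are simplifications you would replace with your own department's measurements.

```python
from dataclasses import dataclass


@dataclass
class CostCategory:
    name: str
    annual_cost: float      # dollars per year
    affected_share: float   # assumed fraction (0..1) that scales with MT labor


def cost_per_line(categories, annual_lines):
    """Fully loaded cost per transcribed line across all categories."""
    return sum(c.annual_cost for c in categories) / annual_lines


def projected_annual_savings(categories, productivity_gain):
    """Estimated yearly savings for a fixed report volume.

    Assumption: a gain g means each MT produces (1 + g) times the output
    in the same time, so the labor needed falls by g / (1 + g).
    """
    reduction = productivity_gain / (1 + productivity_gain)
    return sum(c.annual_cost * c.affected_share * reduction for c in categories)


if __name__ == "__main__":
    # Hypothetical department, for illustration only.
    categories = [
        CostCategory("Direct transcription labor", 1_000_000, 1.0),
        CostCategory("Departmental overhead", 200_000, 0.5),
        CostCategory("Distribution", 100_000, 0.0),
    ]
    print("Cost per line:", cost_per_line(categories, 10_000_000))
    print("Savings at a 25% gain:", projected_annual_savings(categories, 0.25))
```

The same structure works on a per-minute or per-MT basis by swapping the denominator passed to `cost_per_line`.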

There are other variables worth examining that can be affected by speech recognition in ways that provide additional economic benefits or important qualitative improvements. A few to consider:

Quality of finished reports. One HIM director using speech recognition believes her department now produces more accurate documents because editing involves more active reading than straight typing.

Time to bill drop. For instance, Evansville Surgery Center used a complete dictation and speech recognition system implementation to produce surgical reports the same day as the surgery, permitting bills to be generated three to four days faster than previously. Evansville’s days in receivables have been reduced significantly. This kind of benefit is not often taken into account when evaluating the impact of productivity gains in documentation, but it can be a very meaningful contribution that HIM can make to the institution’s financial health, especially given the major cash and collection problems that hospitals face. For an outpatient clinic or group, the impact can often be even more visibly meaningful.

Reduction in physician time. This desirable benefit can result from wider use of voice-driven templates for physicians using PC-based recognition. “Normal” text blocks, form fill-in and similar approaches are far easier to perform with speech and the microphone. Use of such templates where appropriate can reduce the actual time spent dictating.

Promotion of electronic health records. Some caregivers such as physical therapists and social workers, as well as some outpatient providers, are handwriting extensive and important reports. These disciplines have shown early adoption and interest in speech recognition, particularly with self-editing. The professionals reduce their overall time on documentation, often freeing up evenings normally devoted to the task, while their organizations gain fully electronic records.

Our work developing ROI analyses for many customers shows a range of anticipated returns. One western hospital looking to put 150 users on speech recognition with 30 on self-editing, and with an annual transcription cost of $1.3 million, would see an 18- to 19-month payback period on reasonable assumptions for speech recognition productivity. On the other hand, a Midwest health system that has already achieved 25 percent productivity gains with our speech recognition in non-radiology disciplines sees the payback for adding 50 more physicians in less than one year. Many radiology departments have already shown similar nine- to 12-month paybacks.
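
For readers who want to reproduce a payback figure of this kind, here is a minimal sketch of the arithmetic. The function name, its parameters and the sample figures are hypothetical; the investment amounts behind the examples above are not given in this article, so the numbers below are placeholders only. It uses simple payback (no discounting), which is one common convention among several.

```python
def payback_months(upfront_investment, annual_savings, annual_running_cost=0.0):
    """Simple (undiscounted) payback period in months.

    Returns None when net annual savings are zero or negative,
    because the investment would never pay back.
    """
    net_annual = annual_savings - annual_running_cost
    if net_annual <= 0:
        return None
    return 12 * upfront_investment / net_annual


if __name__ == "__main__":
    # Placeholder figures for illustration only.
    months = payback_months(
        upfront_investment=300_000,
        annual_savings=240_000,
        annual_running_cost=40_000,
    )
    print("Payback period in months:", months)
```

Running the calculation at several assumed productivity levels shows how sensitive the payback period is to that one input.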

In short, speech recognition can be a very attractive IT investment at varied levels of assumptions.

Productivity Drives Savings

So if productivity gains are the drivers of powerful cost savings leading to solid returns, what are users experiencing in the real world? First, it’s important to emphasize that in radiology, the decision looks to be an easy one. Considerable evidence suggests substantial productivity gains in this high-volume and demanding discipline. Children’s National Medical Center in Washington, DC, saw its report turnaround drop to minutes from up to 48 hours through aggressive use of speech recognition with self-editing. That experience mirrors Southern Hills Medical Center’s, which also relies on real-time self-editing to generate rapid turnaround, down from the prior 40-hour average. Boca Raton Community Hospital reduced turnaround from 12 to 24 hours to less than 5 hours with a mixture of self-editing and transcriptionist editing, showing that both modes can be successfully mixed.

In the more diverse enterprise setting, speech recognition is also performing well. We have spent time measuring gains with a variety of our non-radiology customers. Baseline productivity data is obtained for each transcriptionist, measured by the ratio of transcribe time to dictate time, which we feel eliminates problems with differing line count definitions across organizations. Then a software program measures the same ratio for those transcriptionists editing speech-recognized reports. Rolling four-week productivity readings are taken.
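
The measurement just described can be sketched as follows. This is not the measurement program itself, only an illustration under stated assumptions: weekly totals are pooled over each four-week window, and "gain" is taken to mean the increase in dictation minutes processed per minute of MT time. All function names and sample data are hypothetical.

```python
def time_ratio(transcribe_minutes, dictate_minutes):
    """Minutes of MT time spent per minute of dictation (lower is better)."""
    return transcribe_minutes / dictate_minutes


def rolling_ratios(weekly_totals, window=4):
    """Pooled transcribe-to-dictate ratio for each full rolling window.

    weekly_totals: list of (transcribe_minutes, dictate_minutes), oldest first.
    """
    ratios = []
    for end in range(window, len(weekly_totals) + 1):
        chunk = weekly_totals[end - window:end]
        transcribe = sum(week[0] for week in chunk)
        dictate = sum(week[1] for week in chunk)
        ratios.append(time_ratio(transcribe, dictate))
    return ratios


def productivity_gain(baseline_ratio, current_ratio):
    """Assumed definition: relative increase in dictation handled per MT minute."""
    return baseline_ratio / current_ratio - 1


if __name__ == "__main__":
    # Hypothetical MT: a baseline ratio, then five weeks of editing.
    baseline = 4.0
    weeks = [(400, 100), (380, 100), (360, 100), (340, 100), (320, 100)]
    for ratio in rolling_ratios(weeks):
        print(round(ratio, 2), round(productivity_gain(baseline, ratio), 3))
```

Because the ratio is built from time on both sides, it does not depend on how an organization defines a line.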

Orillia Soldiers’ Memorial Hospital in Canada has been consistently running at 20 percent overall gain, with all five MTs using speech showing positive productivity. Forrest General Hospital has achieved mid-40 percent gains, with the top MT over 50 percent. Evansville Surgery Center, an outpatient example, has steady total productivity gains of 30 percent. The Department of Health and Human Services for Outagamie County, Wisconsin, has successfully implemented speech recognition in a mental health setting, and MTs there are regularly realizing mid-20 percent gains. The best performer with speech recognition has achieved more than 60 percent.

A note on the top performer statistics: first, it is not necessarily the weakest pre-speech MT who achieves these gains. Second, I believe these MTs point the way to the upside potential of this technology, and we need to evaluate them to see what it is about their habits, acceptance and usage that contributes to such high productivity improvements. This is important because, as with so many valuable health care technologies, speech recognition is as much about effective change management as it is about the software itself.

These examples show both the diversity of settings in which speech recognition is being deployed and the fact that MT overall productivity gains of 25 percent to more than 40 percent are being realized. These numbers can be worked back into the ROI models described earlier to predict your organization’s reductions in the various cost categories outlined.

It should be stressed again that giving providers complete control through speech recognition with self-editing changes the productivity picture further. Because self-editing bypasses the transcription step altogether, the gains are significant. We see customers actively seeking to increase the ratio of self-edit to MT-edit work to optimize gains and fit different physician work preferences.

Lessons Learned and Best Practice Suggestions

Now that there is reasonably significant experience with speech recognition, it is possible to begin looking at opportunities to enhance the productivity gains. Our consulting group recently undertook a small study to determine how much additional MT productivity could be had by minimizing such dictation habits as:

• Restarting sentences.

• Filling pause time with non-dictation words.

• Dictating format changes and not including basic punctuation.

These habits not only slow normal transcription, they also disrupt speech recognition. Altering them can thus have a multiplier effect on productivity. On average, the study found that an MT’s productivity could improve some 22 percent with minor modifications to dictation practices.

Based on this preliminary study, as well as our experiences with hundreds of speech recognition deployments, I can offer some additional suggestions:

Let physicians know they are using speech recognition and enlist them as partners. With telephone-based recognition, it is possible for doctors to use speech without knowing it, and some institutions deploy exactly this way. However, we think it is worth a slight disruption to make doctors aware of how much small changes in dictation habits can improve overall turnaround time.

Build speech recognition volume quickly. Transcription editing and self-editing are like exercise: you cannot do them only occasionally and expect to reap the benefits. We have seen a direct correlation between transcription productivity gains and regular volume usage. One of our users has examined this issue specifically and finds an eight-week ramp to productivity, though this can vary by individual.

Mainstream the technology. Select dictators who produce high report volumes and represent visible departments such as cardiology. Not only does this help build the volume for fast ramping of productivity, it also helps produce “champions” and clear successes. Non-using physicians will invariably begin asking why they too cannot have this technology, smoothing your path to adoption.

Speech Recognition’s Upside for HIM

While the impact of speech recognition is sometimes feared in HIM departments, there is great potential for new opportunities stemming from the technology. For example, hospitals may be able to use the capacity freed up by productivity gains to offer transcription services, turning the department into a source of revenue. A natural constituency for such services would be the office work of the physicians who practice at the hospital. Speech recognition could also open up whole new positions for MTs that involve dealing with high-value clinical data. Many of these jobs are outlined in the American Health Information Management Association’s (AHIMA) Vision 2006 document.2

The strong ROI of speech recognition promises to make this a high impact technology in the industry for years to come.


1. Gartner Inc., W. Rishel, B. Hieb, J. Klein, K. Kleinberg, “Predictions for Underlying Technologies in Healthcare,” Dec. 12, 2002.

2. American Health Information Management Association, “Evolving HIM Careers: Seven Roles for the Future,” 1999.

Donald T. Fallati is senior vice president of marketing and strategic planning at Dictaphone Corp.