Healthcare costs in the United States account for a significant share of the GNP. The affordability of healthcare is of great concern to many Americans. Technological innovations offer important leverage for reducing healthcare costs.
Many healthcare institutions require doctors to keep accurate and detailed records concerning the diagnosis and treatment of patients. Motivations for keeping such records include government regulations (such as Medicare and Medicaid regulations), desire for the best outcome for the patient, and mitigation of liability. The records include patient notes that reflect information that a doctor or other person adds to a patient record after a given diagnosis, patient interaction, lab test or the like.
Record keeping can be a time-consuming task, and the physician's time is valuable. The time required for a physician to hand-write or type patient notes can represent a significant expense. Verbal dictation of patient notes offers significant time savings to physicians, and is becoming increasingly prevalent in modern healthcare organizations.
Over time, a significant industry has evolved around the transcription of medical dictation. Several companies produce special-purpose voice mailbox systems for storing medical dictation. These centralized systems hold voice mailboxes for a large number of physicians, each of whom can access a voice mailbox by dialing a phone number and entering his or her identification code. These dictation voice mailbox systems are typically purchased or shared by healthcare institutions. Prices can be over $100,000 per voice mailbox system. Even at these prices, these centralized systems save healthcare institutions vast sums of money over the cost of maintaining records in a more distributed fashion.
Using today's voice mailbox medical dictation systems, when a doctor completes an interaction with a patient, the doctor calls a dictation voice mailbox, and dictates the records of the interaction with the patient. The voice mailbox is later accessed by a medical transcriptionist who listens to the audio and transcribes the audio into a text record. The playback of the audio data from the voice mailbox may be controlled by the transcriptionist through a set of foot pedals that mimic the action of the “forward”, “play”, and “rewind” buttons on a tape player. Should a transcriptionist hear an unfamiliar word, the standard practice is to stop the audio playback and look up the word in a printed dictionary.
Some medical transcriptionists may specialize in one area of medicine, or may deal primarily with a specific group of doctors. The level of familiarity with the doctors' voices and with the subject matter can increase the transcriptionist's accuracy and efficiency over time.
The medical transcriptionist's time is less costly for the hospital than the doctor's time, and the medical transcriptionist is typically much more familiar with the computerized record-keeping systems than the doctor is, so this system offers a significant overall cost saving to the hospital.
To reduce costs further, healthcare organizations are deploying speech recognition technology, such as the AutoScript™ product (made by eScription™ of Needham, Mass.), to automatically transcribe medical dictations. Automatically transcribed medical records documents usually require editing by the transcriptionist. This is especially true with respect to the formatting of medical records documents. Whereas speech recognition may accurately capture the literal word string spoken by the provider, the resulting document is generally not presented in an acceptable format. Examples of formatting that may need correction are punctuation, section headings and enumerated lists. While some speakers may dictate instructions that can assist in providing formatting, many will not, especially in the context of background speech recognition, where most providers may not even know that the technology is being used to create the draft transcription.
Some healthcare institutions have specific formatting requirements that are difficult to accommodate automatically. Speakers may not be aware of the requirements and may therefore fail to provide verbal directives to assist in formatting, so the formatting of the draft is prone to errors. For example, some institutions require that physical examination sections of medical reports be divided into sub-sections, such as the following:
PHYSICAL EXAMINATION:
VITAL SIGNS: BP 120/80, pulse 75, temperature 99.1.
LUNGS: Clear to A&P.
HEART: Regular rate and rhythm. S1, S2 normal.
EXTREMITIES: Without edema.
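As a rough illustration of why this layout is hard to produce automatically, the sketch below splits a flat, unformatted draft into the sub-section layout shown above. It is not taken from any product mentioned here: the heading list `SUBSECTION_HEADINGS`, the function name `format_physical_exam`, and the assumption that headings can be found by simple keyword matching are all hypothetical.

```python
import re

# Hypothetical heading list; a real institution would supply its own.
SUBSECTION_HEADINGS = ["VITAL SIGNS", "LUNGS", "HEART", "EXTREMITIES"]


def format_physical_exam(draft: str, headings=SUBSECTION_HEADINGS) -> str:
    """Split a flat draft into 'HEADING: text' lines at each known heading."""
    # Try longer headings first so multi-word headings win over shorter ones.
    ordered = sorted(headings, key=len, reverse=True)
    alternatives = "|".join(re.escape(h) for h in ordered)
    pattern = re.compile(rf"\b({alternatives})\b:?\s*", re.IGNORECASE)

    # With one capture group, split() returns
    # [text before first heading, heading1, body1, heading2, body2, ...].
    parts = pattern.split(draft)

    lines = ["PHYSICAL EXAMINATION:"]
    for heading, body in zip(parts[1::2], parts[2::2]):
        body = body.strip()
        body = body[:1].upper() + body[1:]  # capitalize the first letter only
        lines.append(f"{heading.upper()}: {body}")
    return "\n".join(lines)


draft = (
    "vital signs BP 120/80, pulse 75, temperature 99.1. "
    "lungs clear to A&P. "
    "heart regular rate and rhythm. S1, S2 normal. "
    "extremities without edema."
)
print(format_physical_exam(draft))
```

Keyword matching of this kind is fragile: a phrase such as "heart rate" inside the vital signs would be mistaken for the HEART heading, which is one reason such formatting is difficult to get right without verbal directives from the speaker.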
Further, formatting corrections often involve complex manipulation of existing text, such as insertion of line-feeds and punctuation marks, capitalization, and cursor movement, all to correct a “single” error. For example, to turn a particular sentence into an enumerated list item, a transcriptionist inserts a line-feed, inserts the number ‘1’, inserts a period and inserts a space.
Another class of verbalizations that is especially prone to automatic speech recognition errors is proper names, particularly those of medical providers who do not practice at the same institution as the speaker. Within-institution providers often dictate “contact names” as the referring providers, or as the persons to whom copies of the medical record should be sent. Since there is a large diversity of such proper names, and many are very rare, these words are especially susceptible to speech recognition errors.
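The enumerated-list correction described above can be sketched as a sequence of primitive insertions. This is a minimal illustration only; the function name `make_list_item` and the character-offset model of the cursor are assumptions, not part of any editing tool described here.

```python
def make_list_item(text: str, offset: int, number: int = 1) -> tuple[str, list[str]]:
    """Turn the sentence starting at `offset` into an enumerated list item.

    Returns the edited text and the individual insertions that a
    transcriptionist would have to type to achieve the same result.
    """
    # One line-feed, the digit(s) of the item number, a period, and a space.
    insertions = ["\n", *str(number), ".", " "]
    edited = text[:offset] + "".join(insertions) + text[offset:]
    return edited, insertions


note = "PLAN: Continue aspirin."
edited, insertions = make_list_item(note, offset=len("PLAN: "))
print(repr(edited))
print(len(insertions), "separate insertions for one formatting fix")
```

Even in this simplified model, a single logical correction costs four separate insertions, in addition to moving the cursor to the right position first.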