Researchers have been investigating the science of speech perception since at least the 1930s. As this area of linguistics has matured, a number of dictation tools and dictation post-processing tools have been developed. In general, existing speech recognition tools are directed towards free-text dictation and provide only minimal navigation capabilities, which costs users time and causes frustration.
At the same time, electronic health records (EHRs), which provide an electronic version of a patient's medical history, have developed to provide a more seamless flow of information within a digital health care infrastructure. Integrating dictation into EHRs has become necessary because of budget pressures, quality-of-care considerations, and compliance risks, among other concerns. However, existing dictation systems are often difficult to use in an EHR context because particular EHR sections often require a specific structure and, moreover, because that structure differs between EHR systems.
For example, U.S. Pat. No. 6,374,226, entitled “System and method for interfacing speech recognition grammars to individual components of a computer program,” describes a number of speech controller modules corresponding to program components within a computer program. Each speech controller module supports a speech recognition grammar having at least one rule, where the speech recognition grammar provides an interface to operations on the corresponding program component. A rule may include a reference to another local rule, or to a rule in a different speech recognition grammar, in which case a “link” to the other rule is formed. In this way, the disclosed system allows rules from the same or different grammars to be combined in order to build complex grammars. However, such a solution is impractical in an EHR context because separate rules must be pre-established and linked for every unique EHR-entry section.
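The idea of linking rules within and across grammars can be illustrated with a minimal sketch. It is not the implementation disclosed in the cited patent; the grammar names, the rule names, the `<grammar.rule>` reference syntax, and the `expand` function are all hypothetical and chosen only for illustration. The sketch assumes that rule references do not form cycles.

```python
from itertools import product

# Hypothetical grammars: each rule is a list of alternatives, and each
# alternative is a list of tokens. A token written as <rule> links to a rule
# in the same grammar; <grammar.rule> links to a rule in a different grammar.
GRAMMARS = {
    "editor": {
        "command": [["open", "<file.name>"], ["close", "<target>"]],
        "target": [["window"], ["tab"]],
    },
    "file": {
        "name": [["report"], ["notes"]],
    },
}


def expand(grammars, grammar, rule):
    """Return every phrase a rule accepts, following links to other rules.

    Assumes the rule references are acyclic; a cyclic reference would
    recurse without end.
    """
    phrases = []
    for alternative in grammars[grammar][rule]:
        options = []
        for token in alternative:
            if token.startswith("<") and token.endswith(">"):
                # Split "grammar.rule"; a bare "rule" stays in this grammar.
                target_grammar, _, target_rule = token[1:-1].rpartition(".")
                options.append(
                    expand(grammars, target_grammar or grammar, target_rule)
                )
            else:
                options.append([token])
        # Combine the choices for each token position into whole phrases.
        phrases.extend(" ".join(parts) for parts in product(*options))
    return phrases


print(expand(GRAMMARS, "editor", "command"))
```

Even in this small sketch, every phrase the system accepts has to be written out in advance as a rule or a link, which is the burden noted above for EHR-entry sections.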
U.S. Pat. No. 6,999,931, entitled “Spoken dialog system using a best-fit language model and best-fit grammar,” describes using likelihood scores from a large vocabulary continuous speech recognizer (LVCSR) to select the best-fit language model (LM) from among a general-task LM and dialog-state-dependent LMs. However, such a solution requires LMs to be predetermined for each dialog state.
U.S. Pat. No. 7,996,223, entitled “System and method for post processing speech recognition output,” describes a post-processing system configured to implement rewrite rules and process raw speech recognition output or other raw data according to those rewrite rules. The application of the rewrite rules may format and/or normalize the raw speech recognition output into formatted or finalized documents and reports. However, such systems lack any user-interface functionality that might allow the user to adjust to the system and the system to adjust to the user.
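Rewrite-rule post-processing of this general kind can be sketched as an ordered list of pattern and replacement pairs applied to the raw recognizer output. The sketch below is an illustration only and is not taken from the cited patent; the rules in `REWRITE_RULES` and the function name `apply_rewrite_rules` are hypothetical.

```python
import re

# Hypothetical rewrite rules, applied in order: (pattern, replacement).
REWRITE_RULES = [
    # Spoken punctuation becomes the punctuation mark itself.
    (re.compile(r"\s+period\b", re.IGNORECASE), "."),
    (re.compile(r"\s+comma\b", re.IGNORECASE), ","),
    # Normalize a spoken unit to its abbreviation.
    (re.compile(r"\bmilligrams\b", re.IGNORECASE), "mg"),
]


def apply_rewrite_rules(raw, rules=REWRITE_RULES):
    """Apply each rewrite rule in turn to raw speech recognition output."""
    text = raw
    for pattern, replacement in rules:
        text = pattern.sub(replacement, text)
    return text


print(apply_rewrite_rules("no fever comma patient takes 20 milligrams daily period"))
```

The rules run in one direction only, from raw output to formatted text, with no step at which the user can interact with or correct the result.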
U.S. Pat. No. 8,670,987, entitled “Automatic speech recognition with dynamic grammar rules,” describes an automatic speech recognition (‘ASR’) engine whose speech recognition grammar includes at least one static rule and defines, at run time, a dynamic rule that the ASR engine does not process until after the at least one static rule has been matched. However, this dynamic-rule system depends on a set of interrelated grammar rules, each relying on other rules that have already been matched, and it imposes a sequential, order-based dependency on processing the input stream.
Therefore, there is a need for systems and methods that can more efficiently integrate dictation systems into EHR systems and solve the dictation problems exemplified by the patents described above.