1. Field of Invention
The present invention relates to machine and voice stenographic education and more specifically to a software program that performs error analysis and evaluation and dynamically creates prescriptive practice from submitted stenographic writings when such writings are compared to reference masters. The present invention relates further to the creation of a method and process which extensively categorizes identified errors, thus passively identifying the relationships among, and the frequency of, a plurality of types of errors in such a manner that the writer is easily able to determine the relative importance of error types and efficiently prioritize practice. The present invention relates further to the use of error identification to dynamically create, from each performed analysis, remedial practice specific to each writer's need for corrective activity.
2. Prior Art
When training a machine or voice stenographic writer to perform the required functions of the profession, writers typically listen to a plurality of practice dictations of judicial, broadcast, and related spoken word narratives. While listening, each writer writes each dictation using a stenographic shorthand machine or voice writing software program(s), applying a previously learned shorthand theory to translate the spoken word heard into either strokes executed on a stenographic machine or voice commands re-dictated into speech recognition software. The result is a paper text or electronic file representing the English (or other language) text of the original dictation with varying degrees of accuracy depending upon the writer's proficiency. The process of rendering acceptable translation is complicated by the need of each writer to capture the spoken word at speeds up to and sometimes exceeding 240 words per minute in order to be able to create an accurate record of multiple speakers.
Writers have typically attempted to develop the psychomotor skill of stenographic writing by repeatedly writing dictations at one speed until an acceptable degree of accuracy is achieved and then moving on to a higher level of speed and repeating the practice process. At each practice level, a certain portion of the writer's writing does not correspond to the original spoken text, producing errors in the stenographic writing. When this occurs, these errors must be identified so that appropriate corrective action may be taken to eliminate them in future writings.
To determine each writer's progress in these activities, writers and their teachers typically attempt to identify errors and manually create remedial practice material designed to correct identified deficiencies. To accomplish these objectives, it has been traditionally required that an expert writer review the writer or trainee's writing element by element against a reference master document of the original writing, noting discrepancies. The expert writer may then attempt to manually compile lists of related discrepancies, and from these, suggest corrective practice.
This approach is labor intensive, slow, and error prone, and cannot be manually performed with sufficient frequency for each typical writer to create a large enough body of identified errors to accurately deduce error patterns, relationships between accurate translation and errors, and prioritize practice appropriately. Corrective action is, therefore, generally limited to admonition to writers to perform simple repetition of previous practice punctuated by drill practice on standardized material containing content similar to words and passages in which writers have made errors in the past or on word lists composed of corrected versions of the errors themselves. Such corrective action tends to create a practice environment in which each writer attempts to correct each error in sequence before moving on to the next error, and does not prioritize practice to try to eliminate the largest causes of errors or the most frequently made errors first, the second most frequent next, and so forth. Thus practice tends to be time-intensive, demotivating, and tends to confuse improvement in translation accuracy based simply on acquired familiarity with the repeated material with improvement based on elimination of the writing habits which caused the errors.
Furthermore, current practice does not easily allow data on the practice results and progress of large groups of writers to be aggregated and analyzed with a view toward predicting which habits and sequences of activities, if performed, have the highest likelihood of achieving success among writers at large.
Each writer, therefore, practices alone with substantial uncertainty as to the efficacy of his or her efforts. As a result, machine and voice stenographic training programs experience exceptionally high levels of attrition as writers become discouraged and frustrated by lack of perceived progress.
To address this problem, computer programs have been developed to help writers and teachers identify errors in writings. These programs have been primarily concerned with presenting a translation of the stenographic or voice-written writing with errors noted in some sequential fashion. The writer then is expected to view the errors identified. This focus on linear error identification is rooted in the manner in which stenographic machine writers traditionally have performed their jobs: capturing the record in some form of a file, reviewing the file after capture to identify and correct errors, and then producing a typed or word-processed transcript of the file with corrections as a work product. However, as a tool to eliminate future errors, these error identification-oriented programs display significant problems with respect to performing true writing analysis and evaluation beyond simple error identification.
Current Art, Type I
The first type of program is the computer-assisted translation (CAT) software program. Examples of this type of program are Case CATalyst™ by Stenograph, L.L.C.; Total Eclipse™ by Advantage Software; PROCAT by Advance Translations Technology; DigitalCAT by Stenovations; and others. While not developed for the purpose of error identification, these programs have some capacity to perform this function.
These programs contain databases of “outlines” and their respective English language equivalents. These databases are called “CAT dictionaries.”
When the stenographic writer writes a file into the program, the program will display a text of the translation performed against the CAT dictionary and will note certain types of errors. Writing is compared to entries in the CAT dictionary to achieve the translation. For example, if an entered “outline” does not match any entry in the CAT dictionary, the display will note the untranslated outline in some fashion. If a specific outline has been entered into the dictionary to represent more than one English language word, the CAT program will display the possible translations as “conflicts” which the user must select from to create an appropriate translation. The CAT program, when prompted, will typically display a percentage score of accuracy of translation.
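The dictionary lookup described above can be illustrated with a minimal sketch. It is not taken from any actual CAT product; the dictionary entries, the outline spellings, the display markers, and the function name translate are all hypothetical, and the percentage shown is simply the share of outlines that translated without ambiguity.

```python
# Hypothetical sketch of CAT-dictionary translation. All names, outlines,
# and display markers are illustrative assumptions, not a real product's API.
from typing import Dict, List, Tuple

# A toy CAT dictionary: stenographic outline -> possible English translations.
CAT_DICTIONARY: Dict[str, List[str]] = {
    "-T": ["the"],
    "TH": ["this"],
    "KAT": ["cat"],
    "THR": ["there", "their"],  # one outline entered for two words: a "conflict"
}


def translate(outlines: List[str],
              dictionary: Dict[str, List[str]]) -> Tuple[List[str], float]:
    """Return (display tokens, percent of outlines translated unambiguously)."""
    display: List[str] = []
    translated = 0
    for outline in outlines:
        candidates = dictionary.get(outline)
        if candidates is None:
            # No dictionary entry: note the untranslated outline.
            display.append(f"<untranslated:{outline}>")
        elif len(candidates) > 1:
            # Several entries: the user must choose among them.
            display.append("<conflict:" + "|".join(candidates) + ">")
        else:
            display.append(candidates[0])
            translated += 1
    percent = 100.0 * translated / len(outlines) if outlines else 0.0
    return display, percent


tokens, score = translate(["-T", "KAT", "THR", "SKWR"], CAT_DICTIONARY)
print(tokens, score)
```

In this sketch nothing is compared against the dictated text, so a valid outline written in place of the intended one translates without any flag.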
Using a CAT software program imposes several significant disadvantages on the writer with respect to error identification. First, since there is no reference master text against which the writing is compared, missing elements of the dictation not included in the writing are not identified. Second, writing the wrong word (e.g., writing “the” instead of “this”) will not be identified as an error if the incorrectly written word's outline and translation exist in the CAT dictionary. Third, many CAT programs include artificial intelligence features which automatically correct errors before display, thus preventing the writer from recognizing some portion of his or her errors at all. Fourth, since translation in a CAT program depends upon the incorporation of a CAT dictionary, each writer must maintain and constantly update this dictionary in order to achieve a useful level of translation. This presents a particularly difficult problem for student writers, who typically have not yet constructed extensive CAT dictionaries. When such an appropriate dictionary is unavailable, many correctly written words may display as errors simply because their outlines are not contained in the CAT dictionary. This deficiency tends to render performance progress difficult to discern. Fifth, current CAT programming technology makes no attempt to analyze or compare identified errors as to type, frequency, or relationship. Sixth, current CAT technology, because it does not store reference masters for comparison, also does not dynamically create prescriptive practice from the context of submitted writings and/or identified errors.
Current Art, Type II
The second type of current art comprises web-resident or local computer-resident programs that allow a user to input a file written either on a stenographic writing machine connected to a computer equipped with a CAT software package or in a voice recognition software program and compare said file to a stored text document. Such comparison typically produces a report which generally presents the writer a reproduction of the writing with errors noted sequentially as they occurred in the writing, and gives some indication of the percentage of accuracy of the input file compared to the reference master. Examples of current and prior art of this type within the stenographic industry include: The Professor by Stenograph, L.L.C.; Mentor by Advance Translation Technology, dba PROCAT; and Realtime Coach™ by Realtime Learning Systems, Inc.
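A comparison of this general kind can be approximated with a word-level diff. The following is a minimal sketch only, assuming whitespace-separated words and using the difflib module of the Python standard library; it does not represent how any of the named products is implemented, and the function name compare_to_master and the error labels are hypothetical.

```python
# Hypothetical sketch of comparing a writing to a reference master.
# Assumes whitespace tokenization; the labels and names are illustrative.
import difflib
from typing import List, Tuple


def compare_to_master(master: str,
                      writing: str) -> Tuple[List[Tuple[str, str, str]], float]:
    """Return (errors in sequence, percent of master words matched).

    Each error is (kind, master text, written text).
    """
    master_words = master.split()
    written_words = writing.split()
    matcher = difflib.SequenceMatcher(a=master_words, b=written_words,
                                      autojunk=False)
    labels = {"delete": "missing", "insert": "extra", "replace": "discrepancy"}
    errors: List[Tuple[str, str, str]] = []
    correct = 0
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            correct += i2 - i1
        else:
            errors.append((labels[tag],
                           " ".join(master_words[i1:i2]),
                           " ".join(written_words[j1:j2])))
    accuracy = 100.0 * correct / len(master_words) if master_words else 100.0
    return errors, accuracy


errors, accuracy = compare_to_master("the witness saw this car leave",
                                     "the witness saw the car")
print(errors, round(accuracy, 1))
```

The errors come back in the order in which they occur in the writing, which mirrors the sequential report format described above.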
The current art of this type is generally adequate for identifying many errors sequentially as they occur in a writing. In such examples of the current art, identification of errors is typically based on document compare software technology wherein discrepancies and missing words are noted. As such, this type of current art still displays significant deficiencies with respect to writing analysis and evaluation.
First, the error reports typically generated by the current art compel the writer to review errors in the sequence in which they occurred rather than by commonality of error, frequency of occurrence, or other deduced pattern. Thus each writer must address each error individually.
Second, error identification alone leaves the writer with few options for prioritizing practice time beyond simply repeating the practice or segments thereof sequentially, attempting to correct each error in sequence during the repetition. Indeed, much of the current art of this type is primarily concerned with “realtime” error identification. In such an environment, where the programs are analyzing writings as they are being created, linear display of identified errors in sequential format may be the only feasible method of display. However, with respect to remedial practice, this presentation format does not allow writers to focus easily on types or kinds of similar errors. Beyond distinguishing between punctuation and word errors, the current art typically does not aggregate errors by type or specifically disclose the frequency of different types of errors. This deficit also makes it difficult for the current art to facilitate easy identification of the relationships between different types and kinds of errors or the relationship of accurate or good writing to writing errors.
Third, the current art apparently does not attempt to create unique, dynamically generated, contextually extracted, prescriptive practice based upon extracting the errors in a writing, correcting them, and then extracting surrounding context and presenting the corrected error within it to allow the writer to focus subsequent practice on remedial needs properly.
Fourth, because the current art apparently does not concern itself with extensive categorization of error types, kinds, or frequencies, it does not offer a method to concentrate or focus practice on error types. The writer is thus left to analyze the report, trying to determine practice priorities and take appropriate corrective action. This process is labor intensive and significantly reduces the amount of writing which may be done in any given period of practice time, as some portion of each practice session must be dedicated to interpreting the error data presented.
Fifth, the current art does not provide sufficient capability to create remedial practice based upon contextually anchored corrections of the writer's identified errors as a part of the analysis process. The writer is then required either to simply repeat the original practice, hoping to recall the errors made and try not to make them again, or to practice pre-populated practice material. The pre-populated material, while it may include words or other elements similar to those in which the writer made errors, does not include corrected versions of the exact errors or the precise context in which each error occurred. Since a significant portion of stenographic writing errors occur because of difficulty in correctly translating context previously heard, the present lack of contextually derived prescriptive practice creation significantly reduces the usefulness of the error identification in terms of facilitating writing improvement.
Sixth, the current art favors the writing of each file to be performed “realtime” into the error analysis program, and such programs typically take input directly from a realtime session invoked in the CAT program. Thus, files cannot be written or edited and uploaded for later analysis. This tends to limit the use of the current systems to those writers who are proficient in, or at least comfortable with, realtime stenographic writing, the most difficult of all stenographic writing. It therefore tends to restrict the use of the current art to those practice instances which are likely to be the most inaccurate. Such restriction poses the disadvantage of making it more difficult to distinguish a writer's structural writing problems from those which result from simply attempting new material containing unfamiliar vocabulary or performing at a new and higher rate of dictation speed.
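The aggregation by type and frequency that the second and fourth points above find lacking can be pictured with a short sketch. It is illustrative only: it assumes each identified error has already been assigned a category, and the category names and the function name rank_error_types are hypothetical rather than drawn from any existing program.

```python
# Hypothetical sketch of ranking error categories by frequency.
# Assumes errors arrive as (category, detail) pairs; names are illustrative.
from collections import Counter
from typing import Iterable, List, Tuple


def rank_error_types(errors: Iterable[Tuple[str, str]]) -> List[Tuple[str, int]]:
    """Count errors per category and return categories, most frequent first."""
    counts = Counter(category for category, _detail in errors)
    return counts.most_common()


sample_errors = [
    ("punctuation", "missing comma"),
    ("dropped word", "leave"),
    ("wrong word", "the for this"),
    ("dropped word", "of"),
    ("dropped word", "a"),
    ("punctuation", "missing period"),
]
for category, count in rank_error_types(sample_errors):
    print(category, count)
```

A ranking of this kind would let a writer address the most frequent category first instead of working through errors in the order in which they occurred.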
Current Art, Type III
The third type of current art available is the traditional word processing applications software program with document-compare capabilities such as Microsoft Word™. To use this type of program, the writer would need to have access to a CAT program able to send output to the application program, and the writer would need to have a reference master text file of the original dictation written. Since the latter is rarely available to the writer, this type of current art is rarely used for stenographic error identification.
In those instances where such a reference master is available, the limitations of the document-compare capabilities of such programs place the writer at the following disadvantages. First, the discrepancies are noted in linear fashion with no automatic capability to organize them by categories without intervention by the viewer. Such necessary organization would then be labor intensive and error prone. Second, such programs do not typically create prescriptive practice materials extracted from the context of the writing. Finally, the current art does not contain features which might easily be used to aggregate writer performance data from large groups of writers, to associate writing events with performance conditions at the time of writing, or to facilitate data reduction to help create predictive practice models.
What is needed is a computer program method and process that is widely accessible to a plurality of stenographic writers and which can be used to perform not only error identification, but instantaneous extensive category-based analysis of writings allowing writers to easily see comparative frequency, distribution, and relationships among errors and error patterns. Further, this method should also automatically create from each analysis unique, contextually derived prescriptive practice constructed from the precise stenographic writings input and errors made therein, regardless of CAT program type or CAT dictionary used, with the ability to analyze both realtime and non-realtime writings. The method should collect data on performance conditions at the time of writing, associate such data with writing analysis, and store data in such form and to such extent as to facilitate the reduction of said data through commonly used statistical analysis tools with a view toward constructing predictive practice models for use by future writers.
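The notion of contextually derived practice can be illustrated with a minimal sketch, which should not be read as the method of the present invention. It assumes the reference master is available as a list of words and that the positions of the erroneous words within it are already known; the function name practice_phrases and the window parameter are hypothetical.

```python
# Hypothetical sketch of extracting practice phrases from a reference master.
# Assumes error positions index into the master's word list; names are illustrative.
from typing import List


def practice_phrases(master_words: List[str],
                     error_positions: List[int],
                     window: int = 2) -> List[str]:
    """Return the correct word at each error position with surrounding context."""
    phrases: List[str] = []
    for pos in error_positions:
        start = max(0, pos - window)                      # clamp at the beginning
        end = min(len(master_words), pos + window + 1)    # clamp at the end
        phrases.append(" ".join(master_words[start:end]))
    return phrases


master = "the witness saw this car leave the lot".split()
for phrase in practice_phrases(master, [3, 5]):
    print(phrase)
```

Because the phrases are drawn from the master rather than from the writing, each one presents the corrected word inside the context in which the error occurred.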