1. Field of the Invention
This invention pertains generally to spelling checker dictionaries for computer-based word processing programs and other text-handling applications, and more particularly to network-based dictionaries and techniques for updating such dictionaries to add new words based on information provided by users of the programs.
2. Description of the Background Art
Word processing programs for computers generally include a system for allowing the user to check the spelling in the text that is being processed. Users of such programs occasionally make spelling mistakes unknowingly when writing text, and text that is created externally and imported into the computer for further word processing may also contain misspelled words, of which the user may be unaware. With modern computer technology, most advanced word processing programs have "desk-top publishing" capability, enabling users to generate and process very large volumes of text for which proofreading by the traditional word-by-word reading method is a lengthy and formidable task. Therefore, a spelling checker system is almost a necessary feature of such word processing programs for generating large text files.
A spelling checker is a program that typically runs in conjunction with a word processing program and includes a spelling dictionary that identifies the correct spellings of a collection of words. A few dictionaries are morphological in character, in that they apply a set of spelling rules to any given word to determine the correct spelling. However, most dictionaries are databases containing lists of correctly spelled words, and a spelling checker compares a given word in text with each word in the dictionary to verify the spelling. Of course, such spelling checkers are strictly limited by the size of the dictionary. The typical size of an English language dictionary for current word processor spelling checkers is approximately 100,000 words. By comparison, a current unabridged edition of Webster's Dictionary contains over a quarter million entries, and of course the Oxford English Dictionary is substantially larger.
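By way of illustration only, the list-based checking described above might be sketched as follows. The word list, the function name find_unknown_words and the tokenizing pattern are assumptions made for this example and are not taken from any particular word processor. A hash-based set lookup stands in here for the word-by-word comparison.

```python
import re

# Hypothetical word list standing in for a vendor-supplied dictionary database.
MAIN_DICTIONARY = {"the", "a", "spelling", "checker", "dictionary", "word"}


def find_unknown_words(text, dictionary):
    """Return the words of `text` that are absent from `dictionary`.

    Words are lower-cased before lookup, and each unknown word is
    reported once, in order of first appearance.
    """
    unknown = []
    for word in re.findall(r"[A-Za-z']+", text):
        key = word.lower()
        if key not in dictionary and key not in unknown:
            unknown.append(key)
    return unknown
```

Any word missing from the list is flagged, whether it is misspelled or merely absent, which is why such a checker is limited by the size of its dictionary.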
Word processors normally provide the capability for a user to add words to a supplemental dictionary that is stored on the user's computer. In practical terms, this capability cannot fill the gap between the size of any typical main dictionary provided with the spelling checker system and an unabridged dictionary of the English language. Often, however, a given user tends to repeatedly use or encounter only a certain limited set of special or customized words and names in word processing text. For example, a person writing a novel may need a dictionary with the proper spelling of names of various characters in the story. Individuals doing word processing in large business organizations often need a dictionary with labels and names of various business products, as well as the names of other individuals in the organization. Technical writers in certain fields, such as electronics and computer technologies, constantly encounter new words and acronyms that are coined far too rapidly to be included in any normal dictionary. In all of these instances the supplemental dictionary enables the user to build up a customized database of special words and to enhance the spelling checking process to include these words.
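A minimal sketch of a checker that consults a supplemental dictionary alongside the main dictionary is given below. The class name SpellingChecker and its methods are hypothetical, and lower-casing every entry is a simplifying assumption that a real checker handling proper names would likely refine.

```python
class SpellingChecker:
    """Hypothetical checker with a fixed main list and a user-editable supplement."""

    def __init__(self, main_words):
        # Main dictionary: supplied once and never modified afterwards.
        self.main = frozenset(w.lower() for w in main_words)
        # Supplemental dictionary: starts empty and grows as the user adds words.
        self.supplemental = set()

    def add_word(self, word):
        """Record a user-approved word in the supplemental dictionary."""
        self.supplemental.add(word.lower())

    def is_known(self, word):
        """Accept a word found in either the main or the supplemental dictionary."""
        key = word.lower()
        return key in self.main or key in self.supplemental
```

For example, a novelist could call add_word once for a character's name, after which is_known accepts that name in all later checks.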
In an organizational environment, computers are generally connected together to form a local area network (LAN). It is often the case that the word processor users in such an environment generate supplemental dictionaries having many common entries, thus duplicating each other's efforts. Further, the spellings of commonly used words may vary between the local supplemental dictionaries created by different users because of spelling errors by individual users or ambiguities in the spelling of any given word. Clearly, in a LAN environment it is desirable to provide a commonly shared dictionary in which words can be entered by different users, with some means for verifying the spelling accuracy of the entries.
An attempt to provide such a shared dictionary is made in the network version of a word processing program produced by Frame Technology Corporation of San Jose, Calif., sold under the trademark "FRAMEMAKER®". This word processor gives each computer user in the network access to four different types of dictionaries for use in checking the spelling of text. The Main dictionary is provided by the vendor of the word processor (Frame Technology Corporation) and is a database which cannot be altered by any user. Each user also may have one or more Personal dictionaries, which contain words entered only by that user and may be modified by the user at any time. In addition, each document may have a Document dictionary, which can be modified by any user who is creating or editing that document.
Finally, this word processing system provides a Site dictionary which is accessible to all users in the LAN at a given site. The Site dictionary generally contains technical words and words that are commonly used at the site, such as the company name and product names. This Site dictionary thus fulfills some of the dictionary-sharing objectives which are useful in a network environment. However, the Site dictionary in this word processor can only be altered by the user designated as the site administrator. If another user wishes to add, delete or change any word in the Site dictionary, the proposed modification must be communicated to the site administrator by means external to the word processing system, and all changes in the Site dictionary require that individual's personal attention. In this sense the Site dictionary is a supplemental dictionary only for the user who is the site administrator. Clearly, it is desirable to provide a supplemental shared dictionary for word processing in a network environment with automated means for updating the dictionary based on information from all users in the network.
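The arrangement of Main, Personal, Document and Site dictionaries described above, together with the administrator-only restriction on the Site dictionary, can be modeled roughly as follows. This is a sketch under stated assumptions, not a description of the actual FrameMaker implementation: the class LayeredDictionaries, its method names and the use of a Python PermissionError are all hypothetical, and each user is given a single Personal dictionary for brevity.

```python
class LayeredDictionaries:
    """Hypothetical model of Main, Site, Personal and Document dictionaries."""

    def __init__(self, main_words, site_admin):
        self.main = frozenset(main_words)  # vendor-supplied, not alterable
        self.site = set()                  # shared by all users at the site
        self.site_admin = site_admin       # the only user allowed to edit `site`
        self.personal = {}                 # user name -> that user's words
        self.document = {}                 # document name -> that document's words

    def add_site_word(self, user, word):
        # Only the designated administrator may change the Site dictionary.
        if user != self.site_admin:
            raise PermissionError(f"{user} may not modify the Site dictionary")
        self.site.add(word)

    def add_personal_word(self, user, word):
        self.personal.setdefault(user, set()).add(word)

    def add_document_word(self, document, word):
        self.document.setdefault(document, set()).add(word)

    def is_known(self, word, user, document):
        """Check the word against every dictionary visible to this user and document."""
        return (
            word in self.main
            or word in self.site
            or word in self.personal.get(user, ())
            or word in self.document.get(document, ())
        )
```

In this model a request by any other user to add a Site word is simply refused, which mirrors the need to reach the administrator by some means outside the word processing system.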
With the advent and increasing popularity of Internet computing, it is also desirable to provide a supplemental dictionary which is accessible over wide-area networks. An automated updating feature for a shared dictionary encounters several problems in such an environment. Such a dictionary may be used simultaneously by thousands of users. With each user being allowed to modify the dictionary, the number of proposed modifications sent to the dictionary may become very large, and some means must be provided for organizing this volume of information. In particular, there may be spelling errors, conflicts and ambiguities in the proposed words received from a large population of users. Within a given language the population of users may speak various different dialects, and the correct spellings of many words may depend on the dialect of the user. Some system is required for resolving these problems and updating the dictionary in a controlled and accurate manner.
Finally, the expense of dictionary maintenance presents a special problem in the Internet context. In a LAN environment the cost of supporting a Site dictionary can be borne by the organization where the LAN is installed. However, on a wide-area network a shared dictionary generally must be provided and maintained by some entity that is independent of most of the users of this dictionary. These maintenance services include sorting through the modifications that are proposed by users and selecting the spelling for those words that are being added to the dictionary. This selection process cannot be completely automated, and requires the efforts of personnel who are skilled in lexicography. A practical shared dictionary system must provide some means for equitable apportionment of the maintenance expenses, preferably including incentives for users to contribute new words and proposed modifications to the dictionary.