Many languages contain a large number of ambiguous words, i.e., words that have several meanings. When a human reader encounters such a word in a text, he or she can usually select the proper meaning from context and intuition. The situation is different when a text is analyzed by a computer system. Existing systems for text disambiguation are mostly based on lexical resources, such as dictionaries. Given a word, such methods extract all of its possible meanings from the lexical resource. Various techniques may then be applied to determine which of these meanings is the correct one. The majority of these techniques are statistical, i.e., based on the analysis of large text corpora, while some rely on dictionary information (e.g., counting overlaps between a dictionary gloss and the word's local context). Given a word to be disambiguated, such methods usually solve a classification problem: the possible meanings of the word are treated as categories, and the word has to be classified into one of them.
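The gloss-overlap idea mentioned above can be illustrated with a minimal sketch in the spirit of the simplified Lesk approach. It is not the implementation of any particular existing system. The toy sense inventory, the stopword list, and the names `SENSE_INVENTORY`, `tokenize`, and `disambiguate` are all assumptions introduced here for illustration only.

```python
import re

# Hypothetical toy sense inventory: word -> {sense label: dictionary gloss}.
# A real system would load this from a lexical resource.
SENSE_INVENTORY = {
    "bank": {
        "bank/finance": "an institution that accepts deposits of money and makes loans",
        "bank/river": "the sloping land along the edge of a river or lake",
    },
}

# Small illustrative stopword list (an assumption, not a standard list).
STOPWORDS = {"a", "an", "the", "of", "and", "or", "that", "to", "in", "on", "at", "is", "along"}


def tokenize(text):
    """Lowercase the text and return its set of non-stopword alphabetic tokens."""
    return {t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOPWORDS}


def disambiguate(word, context, inventory=SENSE_INVENTORY):
    """Classify `word` into the sense whose gloss shares the most tokens with `context`.

    Returns None when the word is absent from the inventory. Ties are resolved
    in favor of the sense listed first.
    """
    senses = inventory.get(word.lower())
    if not senses:
        return None
    # The target word itself carries no evidence, so it is excluded from the context.
    context_tokens = tokenize(context) - {word.lower()}
    scores = {label: len(tokenize(gloss) & context_tokens) for label, gloss in senses.items()}
    return max(scores, key=scores.get)


print(disambiguate("bank", "She sat on the bank of the river and watched the water"))
print(disambiguate("bank", "He went to the bank to deposit money and ask about loans"))
print(disambiguate("blog", "She wrote a new post on her blog"))
```

In this sketch each sense is a category and the overlap count acts as the classification score. A word that is missing from the inventory yields no result at all, since there are no candidate senses to score.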
Existing methods address the disambiguation of polysemous words and homonyms, where a word is regarded as polysemous or homonymous if it appears several times in the sense inventory being used. None of these methods deals with words that do not appear in the lexical resource at all. Moreover, the sense inventories used by existing methods cannot be modified and therefore do not reflect ongoing changes in the language. Only a few methods are based on Wikipedia, and even these methods do not themselves make any changes to the sense inventory.
Today the world changes rapidly: many new technologies and products appear, and language changes accordingly. New words emerge to denote new concepts, and existing words acquire new meanings. Therefore, methods for text disambiguation should be able to deal efficiently with new words that are not covered by the sense inventory in use, to add the corresponding concepts to the sense inventory, and thus to use them in subsequent analysis.