A short form is a word that is usually formed by combining selected characters from a long form term while ignoring the long form's other characters. Typical short forms include acronyms, abbreviations, and initialisms. For example, IBM is a short form of the term “International Business Machines,” which is IBM's corresponding long form. A long form typically consists of one or more words. Prior art approaches for detecting short forms and expanding them to their respective long forms have been constrained by language-specific rules, which limit their usefulness in systems deployed in multi-lingual environments. More efficient, language-independent short form detection and long form expansion are beneficial, as the resulting short forms and their corresponding long forms can be used for, among other things, query expansion, improved search results, search indexing, terminology extraction, and ontology population.
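
By way of illustration only, the notion that a short form combines selected characters of a long form while ignoring the others can be sketched as an ordered-character check. The function name `is_candidate_short_form` is hypothetical and does not appear in the original text. The sketch assumes case-insensitive matching and that the selected characters keep their original order. It illustrates the definition and is not a description of any particular detection or expansion method.

```python
def is_candidate_short_form(short_form: str, long_form: str) -> bool:
    """Illustrative check (hypothetical helper): True if every character of
    short_form appears in long_form in the same order, with any number of
    long-form characters skipped in between."""
    if not short_form:
        return False  # assumption: an empty string is not a short form
    remaining = iter(long_form.casefold())
    # `ch in remaining` consumes the iterator up to the first match,
    # so each later character must be found after the previous one.
    return all(ch in remaining for ch in short_form.casefold())


print(is_candidate_short_form("IBM", "International Business Machines"))
print(is_candidate_short_form("IBM", "Business Machines"))
```

Under these assumptions, the first call accepts the pairing from the example above, while the second rejects it because no "b" follows the first "i" in "Business Machines". A check of this kind uses only character order and no language-specific word lists, but it is permissive and would need further constraints in practice.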