The background description provided herein is for the purpose of generally presenting the context of the disclosure. Work of the presently named inventors, to the extent it is described in this background section, as well as aspects of the description that may not otherwise qualify as prior art at the time of filing, are neither expressly nor impliedly admitted as prior art against the present disclosure.
Sequence models can be used for building natural language systems, such as part-of-speech taggers, named entity segmenters, and mention chunkers (shallow/light parsers). Sequence models can be trained using training data that is labeled or annotated by linguists. Natural language systems, however, may be limited to specific languages for which sufficient labeled training data exists, also known as resource-rich languages (e.g., English, French, German, Spanish, Japanese, and Korean). Obtaining labeled training data for other languages, also known as resource-poor languages (e.g., Catalan, Estonian, Norwegian, and Ukrainian), can be costly and/or time-consuming. Therefore, efficient techniques for cross-lingual learning of sequence models are needed.