Artificial intelligence (AI) is a new technical science that studies and develops theories, methods, techniques and application systems for simulating, extending and expanding human intelligence. AI is a branch of computer science that seeks to understand the essence of intelligence and to produce intelligent machines capable of acting in a way similar to human intelligence. Research in the field of AI includes robotics, speech recognition, image recognition, natural language processing and expert systems.
Chinese word segmentation refers to segmenting a sequence of Chinese characters into separate words. Chinese word segmentation is a basis of text mining. When Chinese word segmentation is performed successfully on a text sequence input into a computer, the computer can automatically recognize the meaning of the text sequence.
An existing segmentation model is generally statistics-based or dictionary-based, and therefore has poor generalization ability. Even for a supervised, statistics-based segmentation model with a certain generalization ability, few manually annotated corpora are available, so the resulting segmentation model is small and generalization errors occur easily. In the prior art, a segmentation model with a certain generalization ability may be acquired by re-training the segmentation model using generalized feature vectors.
However, re-training the segmentation model is time-consuming and laborious, and the quality of word segmentation is difficult to ensure.
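By way of illustration only, the limited generalization ability of a dictionary-based segmentation model can be seen in a minimal sketch of forward maximum matching, one common dictionary-based technique. The function name `forward_max_match`, the parameter `max_word_len`, the toy dictionary and the sample sentences are all assumptions introduced for this example and are not taken from the prior art discussed above.

```python
def forward_max_match(text, dictionary, max_word_len=4):
    """Greedy dictionary-based segmentation (forward maximum matching).

    Hypothetical helper for illustration; not the method of any
    particular existing segmentation model.
    """
    words = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, shrinking until a dictionary hit.
        for size in range(min(max_word_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            # A single character is always accepted as a fallback.
            if size == 1 or candidate in dictionary:
                words.append(candidate)
                i += size
                break
    return words


# Toy dictionary (assumed for this example only).
toy_dictionary = {"人工", "智能", "人工智能", "技术", "是", "新"}

# Every word of this sentence is covered by the dictionary.
in_vocab = forward_max_match("人工智能是新技术", toy_dictionary)

# "区块链" is absent from the dictionary, so it falls apart
# into single characters.
out_of_vocab = forward_max_match("区块链是新技术", toy_dictionary)

print(in_vocab)
print(out_of_vocab)
```

In this sketch, a word that is missing from the dictionary is broken into single characters, which is one way the poor generalization of a dictionary-based model manifests on unseen vocabulary.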