1. Field of the Invention
The disclosure generally relates to an information extraction method capable of providing reliable information and having a self-rebuilding function, and to a system and a computer program product thereof.
2. Description of Related Art
With the rapid development of the Internet, more and more dynamic information (e.g. weather information, stock market information) can be accessed or downloaded from the Internet. Information extractor technology has been developed to extract specific information from an information source (e.g. a webpage).
An information extractor allows a user to conveniently extract desired dynamic information from an information source. However, when the format of the information source changes (e.g. the webpage is redesigned), the extraction rule of the information extractor usually has to be updated in accordance with the new format of the information source. Otherwise, the information extractor becomes incapable of correctly extracting information from the corresponding information source.
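The dependence of an extraction rule on the format of the information source may be illustrated by the following minimal sketch. It is an illustration only and is not taken from the disclosure: the rule is assumed to be a regular expression, and the page contents, the element names and the function `extract_temperature` are hypothetical.

```python
import re
from typing import Optional

# Hypothetical extraction rule: a regular expression tied to one page layout.
RULE = re.compile(r'<span id="temp">(-?\d+(?:\.\d+)?)</span>')


def extract_temperature(html: str) -> Optional[float]:
    """Apply the rule; return None when the rule no longer matches."""
    match = RULE.search(html)
    return float(match.group(1)) if match else None


# Hypothetical page content before and after a redesign of the webpage.
old_page = '<div>Taipei: <span id="temp">28.5</span> C</div>'
new_page = '<div class="city">Taipei</div><td class="temperature">28.5</td>'

old_value = extract_temperature(old_page)  # the rule matches the old layout
new_value = extract_temperature(new_page)  # the rule finds nothing in the new layout
```

In this sketch the same temperature is present in both pages, but the unchanged rule yields a value only for the old layout, so the rule would have to be updated before extraction could resume.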
The formats of information sources may be updated frequently and unexpectedly. As such, manually maintaining an information extractor so that it operates normally is an arduous and difficult job. Further, when many different types of dynamic information are to be extracted, it becomes practically impossible to maintain the information extractors (e.g. information extractors for extracting closing indices of a stock market and temperatures of Taipei, respectively) for all of the types of information. Furthermore, the reliability of dynamic information extracted from a specific information source often cannot be guaranteed due to unexpected factors (e.g. the dynamic information is not updated in a timely manner at the connected information source). Therefore, those skilled in the art seek to provide a mechanism capable of self-recovering or rebuilding an abnormal information extractor so as to provide reliable dynamic information.