1. Field of the Invention
The present invention relates generally to a method of converting the data of a database and creating an eXtensible Markup Language (XML) document and, more particularly, to a method that converts large-capacity data stored in a database using XML replacement technology and then creates a dynamic, well-formed XML document.
2. Description of the Related Art
The Internet interconnects a large number of communication networks spread all over the world, and computers connected to the Internet communicate with one another using a communication protocol such as Transmission Control Protocol/Internet Protocol (TCP/IP).
Further, HyperText Markup Language (HTML) is one of the data formats used on the World Wide Web (WWW) and is a scheme for describing hypermedia documents. HTML defines the logical structure of hypertext using a standardized document format called Standard Generalized Markup Language (SGML), and an HTML document is stored as an ordinary text file.
In order to view a specific web page using a web browser such as Internet Explorer, a user generally enters the Uniform Resource Locator (URL) of the web page. Therefore, when the user does not know the URL of the relevant web page, it is difficult to access the target document.
Software is therefore required that can easily find the information desired by the user among the vast amount of information on the Internet, even if the user does not know the URLs of individual Internet sites. Such software is commonly referred to as a search engine.
A search engine operates on the following principle: a search robot, or a predetermined search program called a spider program, visits a plurality of sites open on the Internet and collects pieces of information about the websites in advance; the collected information is stored in a database (DB); and when the user enters a specific keyword, only websites containing content matching the entered keyword are selected from the DB and presented. In more detail, the search engine runs the spider program via a Common Gateway Interface (CGI) when a search request, such as keyword entry, is received from a user computer.
Here, the term “CGI” denotes a standard interface which is disposed between a web server and an external program and is configured to receive data from a web browser installed on a user computer, execute an externally installed program according to the received data and receive the results of the execution from the executed program. The running spider program receives the results of the search from an index DB in which the URL addresses and pieces of information of various types of websites are stored, converts the search results into a document in HTML format, and transmits the HTML document to the user computer.
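The search flow described above (keyword entry, lookup in an index DB, conversion of the results into an HTML document) can be illustrated with a minimal Python sketch. The in-memory `INDEX_DB` dictionary, the example URLs, and the function names `search` and `to_html` are assumptions introduced only for illustration; they do not reflect any particular search engine.

```python
from html import escape

# Hypothetical index DB: keyword -> list of (URL, title) entries that a
# spider program is assumed to have collected in advance.
INDEX_DB = {
    "xml": [
        ("http://example.com/xml-intro", "Introduction to XML"),
        ("http://example.com/xml-db", "Storing XML in a database"),
    ],
    "html": [
        ("http://example.com/html-basics", "HTML basics"),
    ],
}


def search(keyword):
    """Return the index entries stored under the keyword, ignoring case."""
    return INDEX_DB.get(keyword.strip().lower(), [])


def to_html(keyword, results):
    """Convert search results into a small HTML document."""
    items = "\n".join(
        f'<li><a href="{escape(url, quote=True)}">{escape(title)}</a></li>'
        for url, title in results
    )
    return (
        "<html><body>\n"
        f"<h1>Results for {escape(keyword)}</h1>\n"
        f"<ul>\n{items}\n</ul>\n"
        "</body></html>"
    )


# The page that would be transmitted to the user computer.
page = to_html("XML", search("XML"))
```

In this sketch, every search result passes through a separate HTML conversion step before it is transmitted to the user computer.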
When Internet services were first introduced, such search engines used a directory-based search scheme, in which the search engine searches and classifies individual Internet sites and web documents, arranges the results of the search and classification into a DB, and allows the user to reach the final data by gradually subdividing the classes of preset themes according to the theme search or menu search selected by the user.
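A directory-based scheme of this kind can be pictured as a tree of preset themes that the user descends one level at a time. The following Python sketch assumes a hypothetical nested dictionary `DIRECTORY` with made-up theme names and URLs; `choices` and `browse` are illustrative helper names.

```python
# Hypothetical theme directory: inner dictionaries are sub-themes and
# lists are the final data (site URLs) filed under a theme.
DIRECTORY = {
    "Computers": {
        "Internet": {
            "Markup languages": [
                "http://example.com/xml",
                "http://example.com/html",
            ],
        },
        "Databases": ["http://example.com/db"],
    },
    "Science": {"Physics": ["http://example.com/physics"]},
}


def choices(node):
    """List the sub-themes offered at the current level (none at a leaf)."""
    return sorted(node) if isinstance(node, dict) else []


def browse(node, path):
    """Follow the themes selected by the user, one level per selection."""
    for name in path:
        node = node[name]
    return node


# The user narrows the theme step by step until the final data is reached.
sites = browse(DIRECTORY, ["Computers", "Internet", "Markup languages"])
```

Every entry in such a tree has to be classified and filed under a theme before a user can reach it.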
However, as the World Wide Web has rapidly expanded and the number of Internet sites has sharply increased, it has become impossible to search smoothly for desired information using such a directory-based search scheme. That is, the data held by the search engine must increase in proportion to the scale of the rapidly expanding World Wide Web, but the existing method used by conventional search engines, in which web pages are checked one at a time and manually stored in the DB, cannot keep up with the growth of the World Wide Web.
For this reason, search engines have appeared that introduce the above-described search robot and that automatically search for web pages, index them, and provide a search service. Such a search engine uses a keyword (search word)-based search method: it searches for all web documents related to a keyword entered by the user and provides the search results to the user's computer or the like. However, this is inconvenient in that an excessively large number of web documents are found, so that the user must search again for the desired content on the screen showing the search results.
Meanwhile, XML, which stands for eXtensible Markup Language, is a next-generation Internet document standard that is expected to be essential in the future Internet age. It was defined as a standard format for Internet documents by the World Wide Web Consortium (W3C) in 1998. XML has a structure that can be easily understood by humans and easily read by machines, and is a language created by making up for the disadvantages of SGML while overcoming the restrictions in the representation of HTML.
HTML, which has been the most widely used content representation language on the Internet to date, is suitable for presentation but has limitations when documents are to be reused or searched. XML is attracting attention as the next-generation Internet language for overcoming these limitations because it enables scalability, compatibility, and the structuring of information.
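The benefit of structured information can be illustrated with a short Python sketch that uses the standard `xml.etree.ElementTree` module. The `catalog` document, its element names, and the helper `is_well_formed` are assumptions made up for this example. Because each value is labeled by an element or attribute, a program can search the document by meaning rather than by its visual layout.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML document: each value is labeled by its element name.
DOC = """<catalog>
  <book id="b1"><title>XML Basics</title><price>25.00</price></book>
  <book id="b2"><title>Database Design</title><price>40.00</price></book>
</catalog>"""

root = ET.fromstring(DOC)

# Search by structure: collect every title, then the ids of books under 30.
titles = [book.findtext("title") for book in root.findall("book")]
cheap = [
    book.get("id")
    for book in root.findall("book")
    if float(book.findtext("price")) < 30
]


def is_well_formed(text):
    """Return True if the text parses as a well-formed XML document."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False
```

The parser rejects a document whose tags are not properly nested, such as `<a><b></a>`, which is what "well-formed" means in practice.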
Meanwhile, a DB is a structure in which data is stored according to specific relationships reflecting the meaning of the data. Since a DB, as a repository of information, is used by a large number of application programs, it must be possible to modify the structure of the DB without having to revise the application programs.
Generally, DB-based data model schemes have been developed in such a way that information is stored in a DB, search results are received from the DB at the request of a user and converted into a document in HTML format, and the HTML document is transmitted to a user computer.
For example, a method of searching for Internet data and arranging the Internet data into a DB is disclosed in Korean Patent Application No. 10-1998-0006152. This application discloses a scheme for separately arranging only data belonging to a specific field, among pieces of information on the Internet, into a DB, and for enabling a commercial search service to be provided using such a separate DB. In addition, a web browsing system and web browsing method for attaching additional link information to an HTML document provided at the request of a user are disclosed in Korean Patent Application No. 10-2008-0015282. In this web browsing system and method, additional link information is selectively attached to an HTML document that is received from a specific web server at the request of the user and interpreted by the web browser, thus allowing the user to perform web surfing and searching conveniently and efficiently. However, as described above, these patents are problematic in that search results must be converted into HTML format and then transmitted to a user computer, which reduces the speed of a data search, and in that, when errors occur while the search results are being received and converted into HTML format, incorrect search results may be displayed.