1. Field of Invention
The rapid advancement of computer and database technologies has allowed people to produce numerous computerized documents to date. Meanwhile, progress in electronic communications has offered the opportunity to make all of these documents remotely accessible via the network.
A typical example is the Internet, on which millions of document systems containing hypertext information are becoming available during the next decade. The World Wide Web (WWW) allows individuals to set up a home page describing their own document content and particular subjects of interest. Meanwhile, the WWW allows individuals to establish hyperlinks referring to any relevant document systems on the Internet around the world.
However, document systems today still employ traditional computer architectures, including RISC (Reduced Instruction Set Computing) and CISC (Complex Instruction Set Computing), which are optimized for data computation, not for document processing. Therefore, properly establishing these WWW hyperlinks and properly identifying the relevant Internet document systems that conform with an individual's particular subject of interest becomes a monumental task, since millions of document search, retrieval, and evaluation operations will be required.
As a result, it becomes necessary to explore a more advanced computing architecture which can be particularly optimized for the efficient processing, storage, referencing, transmission, and retrieval of various document data types.
The present invention is related to integrated circuit system technologies according to a novel Document-Instruction-Set-Computing (DISC) principle.
More specifically, the present invention defines the core functions for a DISC integrated circuit to autonomously perform document processing without human intervention. In particular, these DISC microprocessors can effectively perform distributed document storage, processing, transmission, and retrieval operations for Internet-based systems, services, and applications, including personal communications systems, interactive databases, HDTV, object-oriented systems, and multimedia computing devices.
The Internet is used here to illustrate the performance requirements of real-time document processing for very large networks and very large databases. The invention is not limited to the Internet.
2. Description of the Prior Art
Recent interest in efficient document retrieval on massive computer networks such as the Internet has triggered demand to explore on-line autonomous transaction processing and storage system architectures, i.e., the referencing, manipulation, or retrieval of electronic documents from local or remote host systems. This will allow the enterprise, consumer, business, educational, military, and medical industries to develop novel Internet system applications employing smart cards, personal communications, multimedia, and compound document databases. However, due to the real-time performance requirement for transaction processing of these document files, novel methods are required for the effective on-line computation of incoming digital document signals.
Namely, for these newly emerging transaction-oriented document processing applications, the signal channels would typically remain silent until selected authorized users or applications initiate a request for channel usage in order to perform local or remote access, transmission, storage, or retrieval of particular document data files. The incoming document signal sequence from the remote or local storage, i.e., a file or database, will comprise data in variable block form, which includes a user identification code, document file description commands, and the relevant content data as organized in paragraphs, sections, chapters, or other human-comprehensible forms. Because transactions can happen at any time instant and in a totally random fashion, unless systems are equipped with special hardware architectural improvements, it is not possible to predict, anticipate, and schedule these events employing the traditional scheduling, optimization, and computation methods described in the background art.
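As a rough illustration of the variable block form described above, the following Python sketch decodes one incoming document signal into its three parts. The framing used here (a 4-byte user identification code, a 2-byte command length, and a 4-byte content length, all big-endian) and the names `DocumentBlock`, `encode_block`, and `decode_block` are assumptions made for illustration only and are not taken from the disclosure.

```python
import struct
from dataclasses import dataclass


@dataclass
class DocumentBlock:
    user_id: int    # user identification code
    commands: str   # document file description commands
    content: str    # relevant content data (paragraph, section, chapter, ...)


def encode_block(block: DocumentBlock) -> bytes:
    """Pack a block using the assumed layout: id, then two length-prefixed fields."""
    cmd = block.commands.encode("utf-8")
    body = block.content.encode("utf-8")
    header = struct.pack(">IH", block.user_id, len(cmd))
    return header + cmd + struct.pack(">I", len(body)) + body


def decode_block(data: bytes) -> DocumentBlock:
    """Unpack one variable-size block; its size is known only from its own headers."""
    user_id, cmd_len = struct.unpack_from(">IH", data, 0)
    offset = struct.calcsize(">IH")
    commands = data[offset:offset + cmd_len].decode("utf-8")
    offset += cmd_len
    (body_len,) = struct.unpack_from(">I", data, offset)
    offset += 4
    if offset + body_len > len(data):
        raise ValueError("truncated block")
    content = data[offset:offset + body_len].decode("utf-8")
    return DocumentBlock(user_id, commands, content)
```

Because each field carries its own length, a receiver under this assumed framing does not need to know the block size in advance, which is the property that distinguishes variable block input from a fixed continuous bit stream.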
Although there is plenty of background art, such as RISC (Reduced Instruction Set Computing) and CISC (Complex Instruction Set Computing), which teaches how to improve traditional data computation methods and to further apply them to document processing, it is essential to remember that its main object is still to improve data computation rather than document processing. For example, the Intel 80x86 and Motorola 680x0 microprocessor families have evolved over the past twenty years to improve floating point or fixed point data computation performance by providing architecture and instruction set improvements on the pipeline of the ALU (Arithmetic Logic Unit). As a result, the only possible means for this prior art to improve document processing performance is to increase the system clock speed. However, this is not sufficient for the real-time transaction processing performance of document data types.
All of the prior art methods require the document data signal to be first captured as a continuous bit stream, since they are not able to capture data in variable block sizes. Next, the entire command and data content must be buffered in local storage. Finally, the appropriate instruction set is fetched for execution. These methods, though practical, require expensive high-speed processing and memory circuits in order to reach real-time performance. Furthermore, these circuits must constantly remain active in order to continuously monitor the signal channel for any incoming signal sequence. None of these methods teaches how to provide better architectural support and instruction sets for the optimum performance of run-time document data processing. It is conceived that this background art imposes serious cost and power consumption disadvantages on product implementation, and subsequently limits the market realization potential of these emerging technologies and applications.
Therefore, it would be highly desirable to develop an integrated circuit according to a novel computing architecture in order to reach the optimum run-time performance in processing document-related data files. This architecture would instead give much lower priority to traditional data computation tasks. Furthermore, there is a significant demand for an integrated circuit with such capability to be embedded as a transaction processor engine within document storage, transmission, or processing machines such as printers, copiers, scanners, and fax machines, and to selectively connect these machines to provide transmission, reception, manipulation, retrieval, or storage of document data.
A closer look at the document data type reveals that the stream input/output that the prior art, i.e., CISC and RISC, provides is not appropriate, since document data consists of variable-size text data blocks organized in paragraphs, sections, chapters, or any other human-comprehensible form. Therefore, it would be highly desirable for the integrated circuit to support variable-size data blocks, and to provide the capability to parse the entire document into a selected plurality of segments, wherein each segment can be further parsed into smaller blocks in order to search for the relevant subject of interest for selected users or applications. As a result, a designated reference engine would also be desirable for the corresponding document segment for selected users or applications.
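The hierarchical parsing described above can be sketched in software as follows. This is only a minimal illustration under stated assumptions: segments are taken to be paragraphs separated by blank lines, blocks are taken to be sentences, and a subject of interest is a plain keyword. The function names `parse_segments`, `parse_blocks`, and `find_subject` are hypothetical and do not describe the integrated circuit itself.

```python
import re


def parse_segments(document: str) -> list[str]:
    """Split a document into segments (assumed here: blank-line separated paragraphs)."""
    return [s.strip() for s in re.split(r"\n\s*\n", document) if s.strip()]


def parse_blocks(segment: str) -> list[str]:
    """Split one segment into smaller blocks (assumed here: sentences)."""
    return [b.strip() for b in re.split(r"(?<=[.!?])\s+", segment) if b.strip()]


def find_subject(document: str, subject: str) -> list[tuple[int, int]]:
    """Return (segment index, block index) pairs whose block mentions the subject."""
    hits = []
    needle = subject.lower()
    for si, segment in enumerate(parse_segments(document)):
        for bi, block in enumerate(parse_blocks(segment)):
            if needle in block.lower():  # case-insensitive substring match
                hits.append((si, bi))
    return hits
```

Each hit identifies a segment and a block within it, so a caller could reference only the matching portion of a document rather than the whole file.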
After further examination, it would be highly desirable to allow the integrated circuit to be optimized for procedural calls based on high level programming languages. This is because the referencing and parameter passing mechanisms between individual document segments closely resemble high level programming language procedural calls. As a result, we feel it is most efficient to provide the capability to directly execute these high level language instructions, including procedural calls, for the optimum performance of document processing.
Furthermore, the prior art in software technology has made significant advances in the areas of database and object-oriented programming, and these technologies have been used extensively for various document-related applications. However, conventional CISC and RISC hardware technologies can only provide extremely poor performance in supporting them. Closer examination reveals that this is because these software technologies do not require heavy floating point or fixed point data computation, but rely more on procedural call processing and referencing.
Finally, digital dictionaries have only been explored by the prior art for the learning of foreign languages, i.e., the inclusion of audio, video, or graphical script for each vocabulary entry can significantly assist the human user in the proper spelling, understanding, and pronunciation of each word. However, none of the prior art makes use of the principle of a dictionary to make the system or computer understand a selected collection of subjects of interest for each individual user or application. None of the prior art further provides a flexible interface environment allowing users or applications to create, delete, augment, or update their subjects of interest interactively. It is also our understanding that none of the prior art has been based on the principle of a dynamic dictionary to directly execute high level language instructions for the identification, retrieval, and referencing of the relevant document materials.
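To make the notion of a dynamic dictionary of subjects of interest more concrete, a minimal software sketch is given below. It assumes that a subject is a name mapped to a set of related terms and that relevance is simple word overlap. The class `SubjectDictionary` and its methods are hypothetical illustrations of the create, delete, augment, and update operations mentioned above, not the disclosed implementation.

```python
import re


class SubjectDictionary:
    """Per-user dictionary mapping each subject of interest to its related terms."""

    def __init__(self) -> None:
        self._subjects: dict[str, set[str]] = {}

    def create(self, subject: str, terms: list[str]) -> None:
        """Add a new subject; refuse to overwrite an existing one."""
        if subject in self._subjects:
            raise KeyError(f"subject already exists: {subject}")
        self._subjects[subject] = {t.lower() for t in terms}

    def delete(self, subject: str) -> None:
        """Remove a subject entirely."""
        del self._subjects[subject]

    def augment(self, subject: str, terms: list[str]) -> None:
        """Add terms to an existing subject."""
        self._subjects[subject].update(t.lower() for t in terms)

    def update(self, subject: str, terms: list[str]) -> None:
        """Replace the terms of an existing subject."""
        if subject not in self._subjects:
            raise KeyError(subject)
        self._subjects[subject] = {t.lower() for t in terms}

    def relevant_subjects(self, text: str) -> list[str]:
        """Return the subjects, sorted by name, that share a word with the text."""
        words = set(re.findall(r"[a-z0-9]+", text.lower()))
        return sorted(s for s, terms in self._subjects.items() if terms & words)
```

For example, after `create("printing", ["printer", "toner"])`, a call to `relevant_subjects("The printer is offline")` would report `printing` as a matching subject, and later `augment` or `update` calls change what matches without rebuilding the dictionary.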