The Deep Web is the part of the Internet that is inaccessible to conventional search engines and, consequently, to most users. Deep Web content includes information in private databases that are indirectly accessible over the Internet but not crawlable by typical search engines. For example, libraries maintain databases of books that are open to the public, but a user must fill out a web form in order to access the content.
In general, it is assumed that the Deep Web is growing much more quickly than the surface Web and that the quality of its content is significantly higher than that of the vast majority of surface Web content. Although most of this content is publicly available, its accessibility to typical Internet end users is very limited.
It would be highly advantageous to have a platform that would enable end users to effectively configure and access surface web and deep web data sources remotely.
Unless otherwise defined, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. The materials, methods, and examples provided herein are illustrative only and not intended to be limiting.
Implementation of the method and system of the present invention involves performing or completing certain selected tasks or stages manually, automatically, or a combination thereof. Moreover, according to actual instrumentation and equipment of preferred embodiments of the method and system of the present invention, several selected stages could be implemented by hardware or by software on any operating system of any firmware or a combination thereof. For example, as hardware, selected stages of the invention could be implemented as a chip or a circuit. As software, selected stages of the invention could be implemented as a plurality of software instructions being executed by a computer using any suitable operating system. In any case, selected stages of the method and system of the invention could be described as being performed by a data processor, such as a computing platform for executing a plurality of instructions.
Although the present invention is described with regard to a “computer” on a “computer network”, it should be noted that optionally any device featuring a data processor and/or the ability to execute one or more instructions may be described as a computer, including but not limited to a PC (personal computer), a server, or a minicomputer. Any two or more of such devices in communication with each other, and/or any computer in communication with any other computer, may optionally comprise a “computer network”.
The present invention relates at least partly to a method for enabling automated content aggregation based on deep Web sources, comprising: analyzing a plurality of deep web sources to detect a plurality of fields; selecting at least one field; and aggregating content provided to a plurality of deep web sources through the at least one field. The present invention also relates at least partly to a method for automatically generating domain based knowledge bases based on deep Web sources, comprising: selecting a plurality of deep web sources according to a domain; analyzing the plurality of deep web sources to detect a plurality of fields; selecting at least one field at least partially according to the domain; and aggregating content provided to a plurality of deep web sources through the at least one field.
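By way of non-limiting illustration only, the three stages recited above (analyzing sources to detect fields, selecting at least one field, and aggregating content through that field) may be sketched in Python as follows. The sketch rests on several assumptions that are not part of the description: each deep web source is represented by the HTML of its search form, a field is identified by the name attribute of a form control, the selected field is simply the one shared by the most sources, and submission to a source is simulated by a hypothetical in-memory function (fake_submit) rather than by a network request. All function names, source names, and sample records are illustrative.

```python
from collections import Counter
from html.parser import HTMLParser


class FormFieldParser(HTMLParser):
    """Collects the name of every input/select/textarea found inside a <form>."""

    def __init__(self):
        super().__init__()
        self.in_form = False
        self.fields = []

    def handle_starttag(self, tag, attrs):
        if tag == "form":
            self.in_form = True
        elif self.in_form and tag in ("input", "select", "textarea"):
            name = dict(attrs).get("name")
            if name:
                # Normalize case so "Title" and "title" count as one field.
                self.fields.append(name.lower())

    def handle_endtag(self, tag):
        if tag == "form":
            self.in_form = False


def detect_fields(html):
    """Stage 1: analyze one source's form HTML and return its field names."""
    parser = FormFieldParser()
    parser.feed(html)
    parser.close()
    return parser.fields


def select_common_field(sources):
    """Stage 2: select the field present in the most sources (assumed rule)."""
    counts = Counter()
    for html in sources.values():
        counts.update(set(detect_fields(html)))
    if not counts:
        return None
    field, _ = counts.most_common(1)[0]
    return field


def aggregate(sources, field, value, submit):
    """Stage 3: submit `value` through `field` to every source exposing that
    field and merge the returned records, tagging each with its source."""
    results = []
    for source_id, html in sources.items():
        if field in detect_fields(html):
            for record in submit(source_id, field, value):
                results.append({"source": source_id, **record})
    return results


# Hypothetical sample data: three library search forms and their catalogs.
SOURCES = {
    "library_a": '<form><input name="title"><input name="author"></form>',
    "library_b": '<form><input name="Title"><select name="language"></select></form>',
    "library_c": '<form><input name="isbn"></form>',
}
CATALOG = {
    "library_a": [{"title": "Dune"}, {"title": "Emma"}],
    "library_b": [{"title": "Dune Messiah"}],
    "library_c": [{"title": "Dune"}],
}


def fake_submit(source_id, field, value):
    """Stand-in for a real form submission; filters the in-memory catalog."""
    return [r for r in CATALOG[source_id]
            if value.lower() in r.get(field, "").lower()]


if __name__ == "__main__":
    chosen = select_common_field(SOURCES)
    print(chosen)
    print(aggregate(SOURCES, chosen, "dune", fake_submit))
```

In this sketch, a source that does not expose the selected field (here library_c) is skipped during aggregation. For the domain-based variant, the same functions could be applied after first restricting the sources to those belonging to the domain of interest; how a domain is determined is left open here.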
It will be appreciated that for simplicity and clarity of illustration, elements shown in the drawings have not necessarily been drawn to scale. For example, the dimensions of some of the elements may be exaggerated relative to other elements for clarity. Further, where considered appropriate, reference numerals may be repeated among the drawings to indicate corresponding or analogous elements throughout the several views.