The present invention relates to electronic search engines and more specifically to a system wherein, when a data reference which references a record is provided by a system user, a processor performs a tiered search of linked databases for the sought record and, depending upon characteristics of the data reference (e.g., content characteristics and other non-content characteristics), limits the search to a specific database/database sub-set or extends the search to additional databases/directories.
Herein, unless indicated otherwise, the term “record” is used to refer generally to electronically stored and accessible data including, among other things, word processor documents, web browser pages, pictures, tables, charts, video clips, audio clips, multi-media presentations, etc. Also, hereinafter, the term “database” will be used to refer to a collection of data stored on an electronic medium while the term “directory” will be used to refer to a data construct which resides on a database and which catalogues at least a sub-set of database data in an ordered fashion to expedite database searching.
The data processing industry generally has developed several tools that enable a system user to locate specific records stored on databases linked to workstations. To this end, early computing systems typically included a workstation linked to a single database with separate records stored at specific memory addresses. To access a record, a system user had to provide the precise name of the record to access or the record's address and then a workstation used the provided information to locate the sought record and facilitate access thereto.
Eventually the industry developed networking systems referred to generally as local area networks (LANs) and wide area networks (WANs) which linked several workstations to a plurality of different databases so that a system user could have access to data in many different databases from a single computer. To this end, in addition to providing the exact address on a database or name of a record to locate, a system user also had to indicate the database on which a record was stored so that the processor could locate the record.
With the advent of the Internet, in addition to their proximately linked memories and company-wide LAN and WAN databases, many workstations are now linked to tens of thousands of databases via massive electronic networks. In fact, database storage is quickly becoming “commoditized” as storage industry leaders construct and service server and database warehouses sometimes referred to as storage area networks (SANs). Thus, database storage and support is quickly becoming an outsourced application.
While many advantages are associated with the massive Internet network, the sheer amount of data accessible via the Internet poses many problems. This is particularly true as the boundaries between a “proximate” (i.e. directly connected) workstation memory and Internet databases (e.g., SANs and other portal support databases) become blurred.
Various tools have been developed to help Internet users locate and access records in virtually all network-linked databases. As with vintage systems, to access a record on a proximate database, typically a system user has to specify a database or a directory in which the record sought is stored. Thereafter, the record name has to be provided and a database managing processor searches for the named record on the database or in the directory specified. Where the record is not stored on the specified database or in the specified directory, the processor indicates so and the user must select another database or directory to search.
In the case of the Internet often a system user does not actually know the exact address or name of the record sought. Instead, the user only knows the general nature of the record sought. To facilitate the task of locating a record and rendering the record accessible, the Internet industry has developed search engine technology whereby a system user can provide a general description of a record sought in a query box. A processor then uses the general description to identify a specific database that likely includes the record sought. Next the processor searches the identified database rendering a list including many different records, each of which may in fact be the record sought. Thereafter the system user can select a record from the list for viewing.
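The two-step flow just described (identify one likely database, then list candidate records from it) can be sketched as follows. This is only an illustrative sketch: the function names `tokenize` and `route_and_search`, the per-database `topics` labels and the term-overlap matching rule are all assumptions made for the example and are not taken from any particular search engine.

```python
def tokenize(text):
    """Split text into a set of lowercase terms (hypothetical helper)."""
    return set(text.lower().split())


def route_and_search(query, databases):
    """Pick one likely database for the query, then list candidate records.

    `databases` is an assumed structure for this sketch:
    name -> {"topics": set of terms, "records": {title: body}}.
    """
    terms = tokenize(query)
    # Step 1: identify the single database whose topic labels best match the query.
    best = max(databases, key=lambda name: len(terms & databases[name]["topics"]))
    # Step 2: search only that database and list every record sharing a query term.
    records = databases[best]["records"]
    hits = [title for title, body in records.items()
            if terms & tokenize(title + " " + body)]
    return best, sorted(hits)


databases = {
    "illinois": {
        "topics": {"illinois", "chicago"},
        "records": {
            "Illinois tax guide": "state tax rules",
            "Chicago transit map": "bus and rail",
        },
    },
    "ohio": {
        "topics": {"ohio"},
        "records": {"Ohio tax guide": "state tax rules"},
    },
}

chosen, candidates = route_and_search("Illinois tax", databases)
```

In this sketch only the chosen database is searched, so a matching record held in any other database never appears in the candidate list.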
Another tool that has been developed to access records and link related records together is generally referred to as hyper-linking. In hyper-linking schemes, text and pictorial references in one record to other existing records may be distinguished and linked to addresses corresponding to the referenced records. By selecting a phrase or image, a user instructs the processor to access the record at the related address and provide the record for review.
Various tools have been developed to enable insertion of hyper-linking references in records for linking to other records. For example, e-mail software and word processors enable a user to enter an Internet address into record text. A processor recognizes the address as a hyperlink address, highlights the address (e.g., presents the address in a distinct color) and facilitates linking to the record stored at the address through selection of the highlighted hyper-link phrase. While this tool is useful, it requires the system user to input the hyperlink address without error, a daunting task in many cases, especially as Internet addresses become longer and longer. In addition, inserting an address into record text tends to break up a reader's train of thought.
Other tools have been developed which allow a user to earmark text phrases within a record for linking to web browser pages and then to manually provide a linking address for each earmarked phrase. These tools render more readable records but still require a user to enter complete Internet addresses which is a tedious task.
In addition to the systems described above the industry has also developed tools that enable a user to publish records as web documents that can be linked to other documents via addresses. Again the addresses have to be specified by the system user during publishing and each time the record is to be linked the system user has to again specify the address.
Efforts have been made to automate the web publishing process. One particularly useful effort is described in U.S. patent application Ser. No. 09/247,349 (hereinafter “the '349 application”) which was filed by the present inventor on Feb. 10, 1999, is entitled “Method and System for Automated Data Storage and Retrieval” and is incorporated herein by reference. The '349 application describes a system wherein a processor recognizes keywords, keyword phrases or data references (DRs) in a first record which reference other records stored on one or more databases and then generates links to the referenced records so that a person examining the first record can easily access any of the referenced records. Preferably access to the referenced records is facilitated by visually displaying the keyword phrase or DR in a highlighted format (e.g., similar to a hypertext linking phrase) which is selectable by a system user via a mouse controlled cursor or the like. Upon selection of the phrase or DR, the associated record is provided.
U.S. Pat. No. 5,895,461, which issued on Apr. 20, 1999 and is entitled “Method and System for Automated Data Storage and Retrieval with Uniform Addressing Scheme”, is a parent patent to the '349 application and is also incorporated herein by reference. That patent teaches a system whereby a system user can indicate a keyword phrase (e.g., via entry of a special character earmarking the phrase) within a first record which is meant to reference another record. When the keyword phrase is identified, a processor uses the phrase to determine which record the phrase should be linked to and then renders the referenced record accessible within the first record.
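By way of illustration only, earmarking and linking of a keyword phrase might be sketched as below. The `#` delimiter, the `link_earmarked` function, the lookup table `index` and the HTML anchor output are assumptions chosen for this example; the referenced patent does not specify them here.

```python
import re
from html import escape

# Assumption: a phrase is earmarked by wrapping it in '#' characters.
EARMARK = re.compile(r"#([^#]+)#")


def link_earmarked(text, index):
    """Replace each earmarked phrase with a selectable link to its record.

    `index` is a hypothetical mapping of lowercase phrase -> record address.
    Phrases with no matching record are left as plain text.
    """
    def replace(match):
        phrase = match.group(1).strip()
        address = index.get(phrase.lower())
        if address is None:
            return phrase
        return f'<a href="{escape(address, quote=True)}">{escape(phrase)}</a>'

    return EARMARK.sub(replace, text)


index = {"admission report": "records/admission-0001.html"}
linked = link_earmarked("See the #Admission Report# for details.", index)
```

Here the user never types the address of the referenced record; the processor resolves the address from the phrase itself.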
Again, in one preferred embodiment, the referenced record is rendered accessible by visually highlighting the keyword phrase in the first record in a format that is selectable to access the referenced record. This patent also teaches a system whereby a user can enter a phrase into a special search request field and, thereafter, the processor will locate a specific record stored on a database linked thereto that is referenced by the request. This patent contemplates that any database, including databases linked to a processor via the Internet, may be searched for a record referenced in another record.
While the above described searching and linking systems have many advantages, they still suffer from several important shortcomings. First, in many cases a system user does not know exactly on which of several different databases a record is stored. For example, with respect to a LAN or WAN, often there are many different databases which a specific system user may use to store a record. Subsequent to storing a record the user may not remember which of several different databases to search to locate the record. In this case the search process entails searching each database separately, often a time consuming process.
Second, in many cases the tools provided to search one database may be completely different from the tools provided to search another database. For example, to search a memory that is directly linked to a computer a document manager may be employed, while to search an Internet database a search engine may be employed. Thus, a complete search in this case would require a system user to use many different tools in order to locate a specific record.
Third, vintage database systems typically either require a system user to specify a database to be searched (e.g., this is true in the case of a LAN or a WAN) or include a managing processor which identifies a single database to be searched when a record query is made, as in the case of the Internet. In many cases such simple searching routines fail to search all of the possible records which may be referenced by a query. For example, in the case of a LAN that searches only one database at a time, records stored on other databases would not be considered. Similarly and as another example, in the case of the Internet, assuming a search request for a record wherein a primary term in the query is “Illinois”, an exemplary search engine typically includes a managing processor which identifies a server and related databases that correspond to Illinois, and the search is limited to the identified server and linked databases, so that records stored elsewhere are not searched.
Fourth, the systems described above fail to contemplate that, in a universe of databases, efficient database searching should follow a specific order wherein searching begins in the most likely location in which a specific sought record will be located followed by less likely locations.
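The ordering principle stated above can be sketched in a few lines. This is a minimal sketch under stated assumptions: the name `tiered_search`, the representation of each database as an in-memory mapping and the tier names are all hypothetical, and the caller is assumed to supply the tiers already ordered from most likely to least likely location.

```python
from typing import Mapping, Optional, Sequence, Tuple

# Assumption: each tier is a (name, records) pair, where records maps a
# reference to the stored record.
Tier = Tuple[str, Mapping[str, str]]


def tiered_search(reference: str, tiers: Sequence[Tier]) -> Optional[Tuple[str, str]]:
    """Search tiers in order and stop at the first one holding the record."""
    for name, records in tiers:
        if reference in records:
            return name, records[reference]
    # The record was not found in any tier.
    return None


tiers = [
    ("local", {"budget 2000": "local copy of the budget"}),
    ("lan", {"budget 2000": "shared copy of the budget", "staff list": "names"}),
    ("internet", {"press release": "public announcement"}),
]

first_hit = tiered_search("budget 2000", tiers)
later_hit = tiered_search("press release", tiers)
```

Because the search stops at the first tier holding the record, less likely locations are consulted only when the more likely ones fail.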
Fifth, while systems exist for identifying and accessing records which correspond to randomly selected references in a first record, there is no system as of yet which facilitates inserting quick links between randomly selected references in a first record and second records associated with the selected references.
Sixth, often novice database users fail to recognize that a record may be stored on more than one database. As a result, after a single database has been searched to locate a record, often a novice user will assume the record has been lost or is inaccessible for some other reason. In this case the user would likely attempt to either recreate the lost record or access the record in some other form (e.g., hard copy stored in a traditional filing cabinet) despite the fact that, by simply specifying another database for searching, the record may be easily located.
Thus, it would be advantageous to have a system and method that overcome the limitations described above. Specifically, it would be advantageous to have a system that automatically facilitates efficient database searching, that limits or extends database searches as a function of characteristics (e.g., reference content, reference context, the processor used to indicate the reference, etc.) of a reference to the record sought and that enables easy linking between a located record and a reference in another record.