The present invention generally relates to database retrieval methods, and more particularly, to a search method that provides quantifiable analysis for determining the distribution of topics, and identifying evolving or declining topics, within a specialized area.
Databases are software tools that contain records. These records are arranged in different fields. The fields found in a bibliographic database might include the following: title of the book or article, authors, institution(s), source, abstract, keywords, bibliography, etc. This arrangement allows the user to search in a field or a combination of fields (utilizing the Boolean operators AND, OR, and NOT) to find desired information.
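By way of illustration only, a fielded Boolean search of this kind can be sketched as follows. The record layout, the field names, and the helper names (`field_contains`, `all_of`, `any_of`, `negate`) are assumptions made for this example and do not correspond to any particular database product.

```python
from typing import Callable, Dict

Record = Dict[str, str]
Predicate = Callable[[Record], bool]

def field_contains(field: str, term: str) -> Predicate:
    """Match records whose named field contains the term (case-insensitive)."""
    return lambda rec: term.lower() in rec.get(field, "").lower()

def all_of(*preds: Predicate) -> Predicate:
    """Boolean AND over any number of field predicates."""
    return lambda rec: all(p(rec) for p in preds)

def any_of(*preds: Predicate) -> Predicate:
    """Boolean OR over any number of field predicates."""
    return lambda rec: any(p(rec) for p in preds)

def negate(pred: Predicate) -> Predicate:
    """Boolean NOT of a field predicate."""
    return lambda rec: not pred(rec)

# Hypothetical bibliographic records, for illustration only.
records = [
    {"title": "Insulin signalling in muscle", "authors": "Smith J",
     "keywords": "insulin; muscle"},
    {"title": "Insulin therapy: a review", "authors": "Lee K",
     "keywords": "insulin; review"},
    {"title": "Glucagon and the liver", "authors": "Smith J",
     "keywords": "glucagon; liver"},
]

# title contains "insulin" AND NOT keywords contains "review"
query = all_of(field_contains("title", "insulin"),
               negate(field_contains("keywords", "review")))
hits = [rec["title"] for rec in records if query(rec)]
```

Because each predicate is an ordinary function, the same helpers can be nested to express any combination of fields.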
The bibliography field typically contains one or more citations. A citation may acknowledge the source of another document that is cited as a support for a point of view, or as an authority. In the past, Eugene Garfield, Ph.D. at the Institute for Scientific Information(copyright) (ISI) has used the bibliographic field format to produce statistical data. One of these sets of statistical data pertains to what is known as the Citation Index by which the frequency of a citation reflects the impact that such publication has had within its discipline. This concept is known as Impact Factor. ISI publishes on a yearly basis a list containing thousands of journals according to their ranked Impact Factor and their relation to a plurality of specific subjects (Journal Citation Reports(copyright), JCR(copyright)). Other lists provided by ISI categorize journals by immediacy index, citation half-life, total number of citations, etc.
ISI also works in the area known as bibliometrics. As proposed by Pritchard in 1969, bibliometrics has been defined as "the application of mathematics and statistical methods to books and other media of communication". Thus, ISI has been addressing the following questions: What are the largest journals? What journals are the most frequently used? What are the "hottest" journals? What are the "hottest" articles? What journals have the highest impact factor? What publications does a journal cite, and which ones cite it? What is the historical origin of a new topic, etc.?
ISI and others have expanded this technology and are now capable of addressing other questions outside of the bibliographic field, thus giving rise to the science of informetrics. Other questions being addressed include authorship, country, institution and journal analysis, etc. The areas impacted by this type of analysis cover a wide spectrum of subjects including: broadcasting, ethics, geology, psychology, management, chemistry, biology, medicine, etc.
Despite this progress in the informetrics field, two fundamental questions remain unanswered: what is happening in a specific topic, and what is at the forefront of that topic. Prior methods for predicting trends in specialized areas include hiring consultants who provide an opinion on the evolving areas. Some disadvantages of using consultants are that: (1) consultants are expensive and typically limited to their specific area of expertise; and (2) consultants may incorporate their preferences into their opinions without easy detection. Thus, their opinions might be biased. A research facility interested in directing its research efforts at the evolving areas, to maximize its funding for research projects, must expend a large amount of money to hire multiple consultants to predict trends for each specialty within the research facility. Another disadvantage is that even after this large expenditure of money, the research facility still does not know the relative funding to apply among the specialties, due to the speculative nature of consulting.
Databases contain a concept tree structure composed of keywords. These keywords are tags assigned to each article in the database so that any user can retrieve the same articles with consistency. Thus, these keywords operate as a means to recognize specific articles related to that topic. Prior art computer retrieval systems have been able to combine specific keywords with a set of journals. However, these systems lack the statistical analysis, depth, integrity, comprehensiveness and completeness required to consider these studies scientific. Furthermore, there are no published studies or systems that address the specific questions of what is happening in a specific topic, or what is at the forefront of that topic, by the use of keyword databases and/or keyword tree structures. Given these limitations, there is a need for a system of analyzing trends in research in a way that is efficient, unbiased and reproducible. In addition, a system that would allow a user to know the distribution of specific keywords in a given topic does not exist.
The present invention provides a system and method with the capacity to compare and analyze keywords of a specific area of study. By the use of the methods of the present invention, some sets of keywords will be seen as "warming up" due to their upward trends whereas other keywords might be seen as "cooling down" due to their downward trends. Given the accepted fact that growing areas of research are the ones that are more likely to produce scientific breakthroughs, the system identifies these emerging ("hot") areas of research that may accelerate the scientific advances of users. Similarly, users are able to view and shift from non-productive ("cool") areas of research to productive "hot" areas.
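As a minimal sketch of how an upward or downward trend might be detected, the example below fits a least-squares line to the yearly occurrence counts of a keyword and labels the keyword by the sign of the slope. The use of a linear fit, the `tolerance` parameter, and the function names are assumptions made for illustration; the text above does not prescribe a particular trend measure.

```python
from typing import Dict

def trend_slope(counts_by_year: Dict[int, int]) -> float:
    """Least-squares slope of keyword occurrences per year (needs >= 2 years)."""
    years = sorted(counts_by_year)
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(counts_by_year[y] for y in years) / n
    num = sum((y - mean_x) * (counts_by_year[y] - mean_y) for y in years)
    den = sum((y - mean_x) ** 2 for y in years)
    return num / den

def classify_trend(counts_by_year: Dict[int, int], tolerance: float = 0.0) -> str:
    """Label a keyword by the sign of its slope (the tolerance is an assumption)."""
    slope = trend_slope(counts_by_year)
    if slope > tolerance:
        return "warming up"
    if slope < -tolerance:
        return "cooling down"
    return "stable"

# Hypothetical yearly counts for two keywords.
rising = {1995: 2, 1996: 4, 1997: 6}
falling = {1995: 6, 1996: 4, 1997: 2}
```

With the hypothetical counts above, `classify_trend(rising)` yields "warming up" and `classify_trend(falling)` yields "cooling down".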
The process involves the utilization of a database program and provides specific keywords associated with the investigated topic. The present invention also provides a method for indexing the keywords using a keyword tree structure so the data is in the correct format for analysis. The process also provides a method for analyzing the number of occurrences of keywords along with the analysis of an impact factor associated with the keywords. The formatted data then allows the construction of several charts so a user can easily assess the state and forefront of a specified topic.
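The counting step can be sketched as below. How the impact factor is combined with the occurrence count is not specified at this point in the text, so the sketch simply sums, per keyword, the impact factor of the journal of each article carrying that keyword; this weighting, the record layout, and the name `keyword_stats` are assumptions for illustration only.

```python
from collections import Counter, defaultdict
from typing import Dict, List, Tuple

def keyword_stats(articles: List[dict],
                  impact_factors: Dict[str, float]) -> Tuple[Counter, Dict[str, float]]:
    """Return (occurrence count, summed journal impact factor) per keyword."""
    counts: Counter = Counter()
    weighted: Dict[str, float] = defaultdict(float)
    for article in articles:
        # A keyword is counted once per article, even if it is listed twice.
        for keyword in set(article["keywords"]):
            counts[keyword] += 1
            weighted[keyword] += impact_factors.get(article["journal"], 0.0)
    return counts, dict(weighted)

# Hypothetical articles and impact factors.
articles = [
    {"journal": "Journal A", "keywords": ["insulin", "diabetes"]},
    {"journal": "Journal B", "keywords": ["insulin", "obesity"]},
    {"journal": "Journal B", "keywords": ["obesity"]},
]
impact_factors = {"Journal A": 2.0, "Journal B": 5.0}

counts, weighted = keyword_stats(articles, impact_factors)
```

The two resulting tables (raw counts and impact-weighted totals) are the kind of formatted data from which charts could then be drawn.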
The process involves the input of the name of the journal to be investigated, removal of non-original articles such as editorials, news, comments, etc. from the query built in the retrieval process, limiting the query by the different years to be investigated, and downloading the articles according to the years or group of years to be investigated.
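The retrieval steps just described can be sketched as a simple filter over in-memory records. The record fields (`journal`, `type`, `year`), the set of non-original article types, and the function name `retrieve` are hypothetical; a real database would express the same limits in its own query language.

```python
from typing import Dict, List, Tuple

# Article types treated as non-original (an assumption for this sketch).
NON_ORIGINAL = {"editorial", "news", "comment"}

YearGroup = Tuple[int, int]  # inclusive (first_year, last_year)

def retrieve(records: List[dict], journal: str,
             year_groups: List[YearGroup]) -> Dict[YearGroup, List[dict]]:
    """Keep original articles from one journal, batched by year group."""
    batches: Dict[YearGroup, List[dict]] = {group: [] for group in year_groups}
    for rec in records:
        if rec["journal"] != journal or rec["type"] in NON_ORIGINAL:
            continue
        for first, last in year_groups:
            if first <= rec["year"] <= last:
                batches[(first, last)].append(rec)
                break  # each article belongs to at most one group
    return batches

# Hypothetical records.
records = [
    {"id": 1, "journal": "Journal A", "type": "article", "year": 1995},
    {"id": 2, "journal": "Journal A", "type": "editorial", "year": 1995},
    {"id": 3, "journal": "Journal A", "type": "article", "year": 1998},
    {"id": 4, "journal": "Journal B", "type": "article", "year": 1996},
]
batches = retrieve(records, "Journal A", [(1995, 1996), (1997, 1998)])
```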
In another embodiment, the process is best suited for the study of a keyword or a small set of keywords that do not require a pre-search to find out the specialized area keywords. The purpose of this process is to find additional keywords related to the keywords under investigation. Thus, this embodiment does not require selecting any journals from the database, since the user wishes to know which keywords relate to the query regardless of where the articles are published. The user inputs the keyword(s) into the query, removes non-original articles such as editorials, news, and comments from the query, limits the query by the different years to be investigated, and downloads the articles according to the years or group of years to be investigated. This process accounts for all of the focused keywords. The process of this embodiment then calculates a correction factor and applies it following the pre-indexing process. The process then continues to the index step to sort the keywords. The process then proceeds with the statistical analysis.
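A minimal sketch of finding the additional keywords that accompany a query keyword is given below, using a plain co-occurrence count over the keyword lists of the retrieved articles. The function name `related_keywords` and the sample data are assumptions; the correction factor and indexing steps mentioned above are not defined in this passage and are therefore not modeled.

```python
from collections import Counter
from typing import Iterable, List

def related_keywords(article_keywords: List[List[str]],
                     query: Iterable[str]) -> Counter:
    """Count keywords that co-occur with any query keyword, in any journal."""
    wanted = {q.lower() for q in query}
    related: Counter = Counter()
    for keywords in article_keywords:
        present = {k.lower() for k in keywords}
        if present & wanted:                   # article matches the query
            related.update(present - wanted)   # tally the other keywords
    return related

# Hypothetical keyword lists, one per retrieved article.
article_keywords = [
    ["Aspirin", "Inflammation"],
    ["Aspirin", "Stroke", "Inflammation"],
    ["Ibuprofen", "Inflammation"],
]
related = related_keywords(article_keywords, ["aspirin"])
```

Sorting the result with `related.most_common()` lists the most frequently associated keywords first.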
This type of investigation provides the user with a tool to identify the areas related to the keyword or small set of keywords under investigation. One interesting aspect is the discovery of new correlations between the keyword(s) and unsuspected topics. This type of search is particularly appealing to anyone searching for new uses. For instance, very often pharmaceutical compounds have multiple applications. Novel research that applies to a compound related to the one being investigated might be picked up by the user's search, since the two compounds might share higher hierarchical keywords.
In another embodiment, the process combines the specialized keywords with a selection of all the available specialty journals together with the top non-specialty journals that have a higher impact factor than the best specialized journal. The question best addressed with this system is "what is happening in a specialized area of research."
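One reading of this journal selection rule, taken here as an assumption, is that every specialty journal is kept and a non-specialty journal is added only when its impact factor exceeds that of the best specialty journal. The sketch below implements that reading; the record fields and the name `select_journals` are hypothetical.

```python
from typing import List

def select_journals(journals: List[dict]) -> List[str]:
    """All specialty journals, plus general journals that outrank the best one."""
    specialty = [j for j in journals if j["specialty"]]
    if not specialty:
        raise ValueError("at least one specialty journal is required")
    best = max(j["impact_factor"] for j in specialty)
    general = [j for j in journals
               if not j["specialty"] and j["impact_factor"] > best]
    return [j["name"] for j in specialty + general]

# Hypothetical journal list.
journals = [
    {"name": "Specialty X", "specialty": True, "impact_factor": 3.0},
    {"name": "Specialty Y", "specialty": True, "impact_factor": 4.5},
    {"name": "General P", "specialty": False, "impact_factor": 20.0},
    {"name": "General Q", "specialty": False, "impact_factor": 2.0},
]
selected = select_journals(journals)
```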
The information produced by this method gives managers a novel tool to establish current needs and anticipate future requirements that will ultimately maximize their efforts and gains. Beneficiaries of this system would include the following: scientists, managers, strategists, venture capitalists, investment bankers, foundations, information and market analysts, publishers, historians, etc. At the institutional level, the beneficiaries include: companies, non-profit organizations, research centers and government agencies. The present unbiased and quantifiable system and method allows a user to see the reality of the distribution and trends of past and present topics within a specialized area. Moreover, by extrapolation of the data, a forecast of future trends is made possible.
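The extrapolation mentioned above could, for example, take the form of a straight-line projection of yearly keyword counts. The linear model and the name `forecast` are assumptions made for this sketch; other models may fit real data better.

```python
from typing import Dict

def forecast(counts_by_year: Dict[int, int], target_year: int) -> float:
    """Project a keyword count to target_year with a least-squares line."""
    years = sorted(counts_by_year)
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(counts_by_year[y] for y in years) / n
    slope = (sum((y - mean_x) * (counts_by_year[y] - mean_y) for y in years)
             / sum((y - mean_x) ** 2 for y in years))
    # Evaluate the fitted line relative to the mean point for stability.
    return mean_y + slope * (target_year - mean_x)

# Hypothetical yearly counts for one keyword.
history = {1995: 2, 1996: 4, 1997: 6}
projected = forecast(history, 1998)
```

Such a projection is only as reliable as the trend it extends, so it is best read as an indication rather than a prediction.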