The present invention relates generally to the field of bioinformatics. In particular, the invention relates to methods, media and systems for graphically displaying computer-based biomolecular sequence information.
Informatics is the study and application of computer and statistical techniques to the management of information. Bioinformatics includes the development of methods to search computer databases of biomolecular sequence information (e.g., nucleic acid and protein) quickly, to analyze and display biomolecular sequence information, and to predict protein sequence, structure and function from DNA sequence data.
Increasingly, molecular biology is shifting from the laboratory bench to the computer desktop. Today's researchers require advanced quantitative analyses, database comparisons, and computational algorithms to explore the relationships between sequence and phenotype. Thus, by all accounts, researchers cannot and will not be able to avoid using computer resources to explore gene sequencing, gene expression, and molecular structure.
One use of bioinformatics involves studying an organism's genome to determine the sequence and placement of its genes and their relationship to other sequences and genes within the genome or to genes in other organisms. Such information is of significant interest in biomedical and pharmaceutical research, for instance to assist in the evaluation of drug efficacy and resistance. To make genomic information manipulation easy to perform and understand, sophisticated computer database systems have been developed. Incyte Genomics, Inc. of Palo Alto, Calif., has developed several such databases (for example LifeSeq® Gold), including some in which genomic sequence data is electronically recorded and annotated with information available from public sequence databases. Examples of such public sequence databases include GenBank (NCBI) and SWISSPROT. The resulting information is stored in a relational database that may be employed to determine relationships between sequences and genes within and among genomes.
While genetic data processing and relational database systems such as those developed by Incyte Genomics, Inc. provide great power and flexibility in analyzing genetic information, further improvements in these systems will help accelerate biological research for numerous applications.
One area of interest in this regard is the display and viewing of biomolecular sequence information. As noted above, an important goal of genome research is to determine the sequence and placement of an organism's genes and their relationship to other sequences and genes within the genome, to genes in other organisms, and to related protein sequences. The ability to clearly and effectively display gene loci information for a given organism or organisms would greatly assist this task.
Accordingly, the development of a display viewing tool which allows a user to clearly and effectively display gene loci information for a given organism or organisms and/or other biomolecular sequence information is desirable.
The present invention meets this and other needs by providing methods, media and systems for graphically depicting computer-based biomolecular sequence information. Generally, biomolecular sequence information may be graphically depicted in a variety of different forms in accordance with the present invention. In particular, the tools of the present invention facilitate data manipulation permitting detailed analysis of selected portions of the biomolecular sequence information. The biomolecular sequence information may be composed of nucleotide or amino acid sequence information or both. The graphical depictions may be in several different formats providing different information relating to the sequences, and may be displayed in one or more screens of a computer user interface.
A graphical viewer in accordance with the present invention preferably has a plurality of panels, each panel displaying information about the biomolecular sequence data of interest in a different way. For example, a first panel could show a graphical representation of an entire biomolecular sequence, or a portion of the sequence of interest, with the locations of particular sub-sequences of interest indicated. A second panel could show a more detailed graphical representation of all or a selected portion of the sequence represented in the first panel, allowing a user to focus on particular sub-sequences of interest. This second panel view could depict additional information relating to the particular sub-sequences of interest. A third panel could show additional information which may include annotations or graphical representations of the number or type of sequencing operations used to generate the biomolecular sequence data. Alternatively, the third panel could show, for example, the confidence level or origination of the biomolecular sequence data represented in one or more of the other panels. Additional panels on the same or additional screens could show, for example, the actual nucleotide or amino acid sequence of, or relating to, a selected sub-sequence of interest represented in one or more of the other panels, or other information relating to the biomolecular sequence data. Also, additional panels may comprise one or more Working Basket panels which could show, for example, nucleotide or amino acid sequence information selected from among the other panels and collected in the Working Basket panel, wherein further detailed analysis could be conducted on the collected sequence information.
In accordance with the present invention, one embodiment comprises a computer implemented method for presenting biomolecular sequence data. This embodiment includes retrieving biomolecular sequence data from a database and graphically depicting elements of the biomolecular sequence data in a user interface of a computer system. The data can be retrieved in response to a user generated query. Additionally, one or more components of the depicted biomolecular sequence data are graphically selected. A further embodiment can comprise displaying the biomolecular sequence data in a plurality of panels comprised within a single frame. Still another embodiment of the present invention can include graphically selecting the one or more components of the biomolecular sequence data by selecting the components of the biomolecular sequence data as a group of biomolecular sequence data or individually selecting the components of biomolecular sequence data. In one embodiment, the selected data can be stored in a Working Basket panel.
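By way of illustration only, the retrieve, select, and store steps described above can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the implementation of the invention: the names `SequenceComponent`, `WorkingBasket`, `retrieve`, `select_group`, `select_one`, and the in-memory `DATABASE` keyed by a reference ID are all hypothetical, and graphical selection is reduced to a coordinate range (group selection) or an identifier (individual selection).

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class SequenceComponent:
    """One displayable element of a sequence (hypothetical structure)."""
    component_id: str
    start: int
    end: int
    residues: str


@dataclass
class WorkingBasket:
    """Collects components the user has selected for further analysis."""
    items: list = field(default_factory=list)

    def add(self, component):
        # Ignore repeated selections of the same component.
        if component not in self.items:
            self.items.append(component)


def retrieve(database, query):
    """Return the components stored under the reference ID named by the query."""
    return list(database.get(query, []))


def select_group(components, start, end):
    """Group selection: every component lying entirely within [start, end]."""
    return [c for c in components if c.start >= start and c.end <= end]


def select_one(components, component_id):
    """Individual selection by identifier; returns None if no match exists."""
    return next((c for c in components if c.component_id == component_id), None)


# Hypothetical in-memory stand-in for the sequence database.
DATABASE = {
    "REF001": [
        SequenceComponent("exon1", 10, 50, "ATGGCC"),
        SequenceComponent("exon2", 120, 180, "GGTTAA"),
        SequenceComponent("exon3", 300, 360, "CCGTAG"),
    ]
}

components = retrieve(DATABASE, "REF001")      # user generated query
basket = WorkingBasket()
for component in select_group(components, 0, 200):   # select as a group
    basket.add(component)
basket.add(select_one(components, "exon3"))          # select individually
```

In this sketch the Working Basket is an ordered collection without duplicates, so a component picked up by both a group selection and an individual selection is stored once.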
One preferred embodiment includes a method for retrieving the biomolecular sequence data, presenting biomolecular sequence data in the plurality of panels which include a first legend panel graphically depicting at least a portion of a biomolecular sequence associated with a reference ID, a second target or reference panel graphically depicting at least a portion of the biomolecular sequence depicted in said legend panel, and a third panel which can selectably indicate further information including annotated information or details concerning the number and type of sequencing operations conducted to determine the sequence data depicted in other panels.
Another aspect of the invention is reflected in an embodiment wherein the graphically selected components of biomolecular sequence data stored or displayed in a Working Basket panel are subjected to further detailed analysis. Additional methods of further analyzing the sequences include manipulating the data in the third panel to conduct the detailed analysis by scrolling up and down the third panel to further examine details of the biomolecular sequence data displayed therein. In one particular embodiment the details examined can include the number and type of sequencing operations used to determine the biomolecular sequence data. Another aspect embodied by the invention includes highlighting graphically selected components of biomolecular sequence data, wherein the highlighted data can be hidden from view on selected panels, leaving viewable certain remaining biomolecular sequence data. In addition to making the desired information easily viewable, this aspect of the embodiment allows the viewable remaining biomolecular sequence data to be manipulated to analyze the viewable data in detail, wherein such manipulation can be accomplished by scrolling up and down panels displaying the viewable data.
An additional embodiment of the invention comprises a computer implemented method having programming instructions allowing a user to focus in on certain biomolecular sequence data of interest by retrieving biomolecular sequence data from a database, graphically depicting the data in a plurality of panels displayed on a user interface of a computer system, and graphically selecting one or more components of the biomolecular sequence data. Also, the programming instructions can include a user generated query which determines which biomolecular sequence data is to be retrieved. The programming instructions enable the graphically selected components of the biomolecular sequence data to be stored and analyzed in a Working Basket. Alternatively, or additionally, the embodiment includes programming instructions for highlighting the graphically selected one or more components of the biomolecular sequence data, and for hiding from view the highlighted data, leaving viewable certain remaining biomolecular sequence data. The programming instructions can also include instructions enabling the viewable remaining biomolecular sequence data to be manipulated to analyze the viewable data in detail, wherein such manipulation can be accomplished by scrolling up and down panels displaying the viewable data.
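The highlight, hide, and scroll behavior described above can be illustrated with a small Python model of a single panel. This is a sketch only, assuming a panel is an ordered list of row identifiers with a fixed-size visible window; the class `Panel` and its methods `highlight`, `hide_highlighted`, `viewable`, `scroll`, and `window` are hypothetical names introduced for illustration and are not taken from the specification.

```python
from dataclasses import dataclass, field


@dataclass
class Panel:
    """Hypothetical model of one viewer panel holding rows of sequence data."""
    rows: list
    page_size: int = 2                      # rows visible at one time
    highlighted: set = field(default_factory=set)
    hidden: set = field(default_factory=set)
    scroll_offset: int = 0

    def highlight(self, *row_ids):
        """Mark graphically selected rows as highlighted."""
        self.highlighted.update(row_ids)

    def hide_highlighted(self):
        """Hide every highlighted row, leaving the remaining rows viewable."""
        self.hidden |= self.highlighted
        self.scroll_offset = 0              # restart at the top of what remains

    def viewable(self):
        """Rows that are still shown, in their original order."""
        return [r for r in self.rows if r not in self.hidden]

    def scroll(self, delta):
        """Scroll up (negative) or down (positive), clamped to the viewable rows."""
        max_offset = max(0, len(self.viewable()) - self.page_size)
        self.scroll_offset = min(max(0, self.scroll_offset + delta), max_offset)

    def window(self):
        """The rows currently visible in the panel."""
        rows = self.viewable()
        return rows[self.scroll_offset:self.scroll_offset + self.page_size]


panel = Panel(rows=["seqA", "seqB", "seqC", "seqD", "seqE"])
panel.highlight("seqB", "seqD")
panel.hide_highlighted()
first_view = panel.window()
panel.scroll(1)
second_view = panel.window()
```

Keeping the hidden set separate from the highlighted set means, in this sketch, that hiding does not discard the selection itself, so the highlighted components remain available for other uses such as storage in a Working Basket.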
In still another aspect, the invention provides a computer-readable medium containing programmed instructions arranged to graphically depict biomolecular sequence data. The computer-readable medium includes programmed instructions for retrieving biomolecular sequence data from a computer system database in response to a user query, and graphically depicting elements of the biomolecular sequence data in a user interface for the computer system. Additionally the embodiment includes instructions enabling graphically selecting one or more components of the biomolecular sequence data. These graphically selected components of biomolecular sequence data can be graphically displayed in a manner which depicts the further detailed sequence information which can include, but is not limited to, the number and type of sequencing operations used to generate the biomolecular sequence data. Further embodiments of the invention can store the graphically selected components of biomolecular sequence data in a Working Basket, wherein the contents of the Working Basket can be manipulated to provide further analysis of the biomolecular sequence data.
In yet another embodiment, a computer system comprises a database including biomolecular sequence data configured in a format having a plurality of data fields independent of the source of the data, and a user interface capable of receiving from the database biomolecular sequence data responsive to a query and graphically displaying the biomolecular sequence data in a plurality of panels. Additionally, the database is configured to include data fields comprising a query object handler, a hit object handler, a feature handler, an analysis tool or method object handler, and a unique features object handler. Alternatively, the database is configured to include an object handler which can reformat the biomolecular sequence data into a plurality of data fields which comprise a query object handler, a hit object handler, a feature field, an analysis tool or method object handler, and a unique features object handler.
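One way to picture a source-independent format of this kind is as a single record type into which differently shaped source records are reformatted. The Python sketch below is illustrative only and rests on assumptions: the record type `UniformRecord`, the function `reformat`, the source names `source_a` and `source_b`, and every raw field name in `FIELD_MAPS` are hypothetical and do not correspond to any actual database schema or to the handlers of the invention.

```python
from dataclasses import dataclass, field


@dataclass
class UniformRecord:
    """Hypothetical source-independent record with the five kinds of fields."""
    query: str                  # the sequence that was searched with
    hit: str                    # the sequence that was found
    features: list              # annotated features of the hit
    method: str                 # analysis tool or method that produced the hit
    unique_features: dict = field(default_factory=dict)  # source-specific extras


# Hypothetical mapping from each source's own field names to the uniform fields.
FIELD_MAPS = {
    "source_a": {"query": "query_id", "hit": "subject_id",
                 "features": "annotations", "method": "program"},
    "source_b": {"query": "q", "hit": "h",
                 "features": "feats", "method": "tool"},
}


def reformat(raw, source):
    """Convert one raw source record (a dict) into a UniformRecord."""
    mapping = FIELD_MAPS[source]
    mapped_keys = set(mapping.values())
    return UniformRecord(
        query=raw[mapping["query"]],
        hit=raw[mapping["hit"]],
        features=list(raw.get(mapping["features"], [])),
        method=raw[mapping["method"]],
        # Anything the source provides beyond the common fields is preserved.
        unique_features={k: v for k, v in raw.items() if k not in mapped_keys},
    )


record_a = reformat(
    {"query_id": "Q1", "subject_id": "H9", "annotations": ["exon"],
     "program": "toolX", "score": 42},
    "source_a",
)
record_b = reformat({"q": "Q1", "h": "H3", "tool": "toolY"}, "source_b")
```

Because both records end up with the same fields, display code in a sketch like this can treat them identically regardless of where the data originated, while source-specific details remain available in `unique_features`.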
These and other features and advantages of the present invention will be presented in more detail in the following specification of the invention and the accompanying figures which illustrate by way of example the principles of the invention.