Biotechnology data is collected and analyzed for many diverse purposes. As is known in the art, biotechnology data typically includes data obtained from biological systems, biological processes, biochemical processes, biophysical processes, or chemical processes. For example, sequences of deoxyribonucleic acid ("DNA") from many different types of living organisms are often determined and mapped. DNA is a double-stranded polynucleotide including a continuous string of four nucleotide base elements. The four nucleotide base elements include deoxyadenosine, deoxycytidine, deoxyguanosine, and deoxythymidine. The four nucleotide bases are usually abbreviated as "A," "C," "G" and "T" respectively. DNA is used to make ribonucleic acid ("RNA"), which in turn is used to make proteins. "Genes" include regions of DNA that are transcribed into RNA, which encodes a translated protein.
One fundamental goal of biochemical research is to map and characterize all of the protein molecules from genes in a living organism. The existence and concentration of protein molecules typically help determine if a gene is "expressed" or "repressed" in a given situation. Protein characterization includes identification, sequence determination, expression, characteristics, concentrations and biochemical activity. Responses of proteins to natural and artificial compounds are used to develop new treatments for diseases, to improve existing drugs, to develop new drugs, and for other medical and scientific applications.
Biotechnology data is inherently complex. For example, DNA sequences include large numbers of A's, C's, G's and T's that need to be stored and retrieved in a manner that is appropriate for analysis. There are a number of problems associated with collecting, processing, storing and retrieving biotechnology data using "bioinformatics" techniques known in the art. As is known in the art, bioinformatics is the systematic development and application of information technologies and data mining techniques for processing, analyzing and displaying data obtained by experiments, modeling, database searching and instrumentation to make observations about biological processes. Biotechnology data is commonly presented as graphical plots of two or more variables. A "peak," i.e., a local maximum in a plot of two or more variables, is often a feature of interest in biotechnology data.
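To make the notion of a "peak" concrete, the following is a minimal sketch of locating local maxima in a sampled one-dimensional signal. The function name find_peaks, the strict greater-than-both-neighbors criterion, and the sample values are illustrative assumptions and are not taken from any particular bioinformatics tool.

```python
def find_peaks(values):
    """Return the indices of interior samples that exceed both neighbors.

    This is a hypothetical helper: a sample counts as a peak only if it is
    strictly greater than the sample before it and the sample after it.
    Endpoints are never reported because they have only one neighbor.
    """
    return [
        i
        for i in range(1, len(values) - 1)
        if values[i - 1] < values[i] > values[i + 1]
    ]


# Hypothetical fluorescence intensities sampled along one lane of a gel.
signal = [0.1, 0.4, 1.2, 0.5, 0.3, 0.9, 2.1, 0.8, 0.2]
peaks = find_peaks(signal)
print(peaks)  # [2, 6]
```

A practical peak detector would usually also apply smoothing or a minimum-height threshold so that noise is not reported as a peak; those refinements are omitted here for brevity.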
When biotechnology data is collected, the collection process often introduces variability based on an environment used to conduct the experiment. For example, DNA sequences may be determined by processing samples using gel-electrophoresis. A label (e.g., a dye) is incorporated into the samples placed on gel-plates for detection by laser-induced fluorescence.
Gel-electrophoresis resolves molecules from the samples into distinct bands of measurable lengths on a gel plate. Gel-plates created with different batches of the same gel may be used to complete the same experiment, with the same target (e.g., the same polynucleotide sample), multiple times. All of the experiments should ideally yield the same results, since the same target is used in the same experiment. However, the gel-electrophoresis process typically introduces small errors in the biotechnology data due to variability in the gel-electrophoresis process.
For example, the gels used may have been prepared by two different lab technicians, may have come from two packages of the same product, may have been purchased at different times, or may have been applied to gel-plates at slightly different consistencies or thicknesses, either by a lab technician or by an automated process (e.g., a robot). These and other factors typically introduce "experiment-to-experiment variability" into an experiment that is completed multiple times and that ideally should yield exactly the same results.
Another problem is that biotechnology data is also collected with micro-arrays, which can be used instead of gel-electrophoresis to provide sequence information. Micro-arrays may also introduce variability into the same experiment due to variations in sample preparation for the micro-arrays. Yet another problem is that biotechnology data collected with experiment-to-experiment variability is typically only grossly appropriate for visual display using bioinformatics techniques known in the art.
As is known in the art, one of the most commonly used methodologies in biotechnology is "comparison." Many biological objects are associated with families that share the same structural or functional features. For example, many proteins with a similar sequence may have common functionality. If a protein with a sequence similar to a known protein is located, the located protein may have a common functionality, and thus may have a common response to an environmental condition (e.g., a new drug).
Visual display of biotechnology data is typically recognized as being "necessary" for biotechnology research. Visual display tools allow creation of complex views of large amounts of inter-related data. Experimental data is typically displayed using a Graphical User Interface ("GUI") that may include a multiple-windowed display on a computer display.
Visual display and comparative analysis are typically hampered by variability introduced into experimental data. For example, if five iterations of the same experiment with the same target are visually displayed, the output values should ideally be superimposed on one another. However, due to experiment-to-experiment variability, the output values for the five iterations of the experiment typically will differ slightly, and a visual display will tend to "magnify" experiment-to-experiment variability. This may lead to confusion during analysis and cause a user to lose confidence in a process used to collect and display experimental data.
In addition, in many instances, experiment-to-experiment variability is of the same order of magnitude as the desired experimental results. When experimental results that include experiment-to-experiment variability are visually displayed, a user may not be able to determine whether differences in results are due to a new target (e.g., a new polynucleotide sequence) or to experiment-to-experiment variability.
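The difficulty described above can be illustrated numerically. The following is a minimal sketch, assuming five hypothetical measurements of the same target and a simple rule that treats a new measurement as distinguishable only if it lies more than k sample standard deviations from the replicate mean. The function name distinguishable, the threshold k = 2.0, and all data values are assumptions chosen for illustration only.

```python
from statistics import mean, stdev


def distinguishable(replicates, new_value, k=2.0):
    """Return True if new_value stands out from replicate scatter.

    Hypothetical rule: the new value must differ from the replicate mean
    by more than k sample standard deviations. The choice of k is an
    illustrative assumption, not a recommended statistical test.
    """
    return abs(new_value - mean(replicates)) > k * stdev(replicates)


# Hypothetical band positions from five runs of the same experiment
# on the same target; ideally all five would be identical.
same_target_runs = [100.0, 101.5, 99.0, 100.5, 99.0]

# A shift of 1.0 is within the run-to-run scatter, so it cannot be
# attributed to a new target with any confidence.
print(distinguishable(same_target_runs, 101.0))  # False

# A shift of 5.0 is well outside the scatter.
print(distinguishable(same_target_runs, 105.0))  # True
```

In this sketch the replicate scatter (a sample standard deviation of roughly 1.06) is comparable in size to the smaller shift, which is the situation in which a visual comparison alone becomes unreliable.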
Thus, it is desirable to reduce experiment-to-experiment variability in data obtained from experiments. The reduction of experiment-to-experiment variability should allow visual display and comparative analysis to be completed without confusion or loss of confidence in processes used to collect, process and display experimental data.