A number of patents and publications are cited herein in order to more fully describe and disclose the invention and the state of the art to which the invention pertains. Full citations for these documents are provided herein. Each of these documents is incorporated herein by reference in its entirety into the present disclosure, to the same extent as if each individual document was specifically and individually indicated to be incorporated by reference. For the avoidance of doubt, the citation of a document herein is not an admission that the document is in fact prior art.
Throughout this specification, including the claims which follow, unless the context requires otherwise, the word “comprise,” and variations such as “comprises” and “comprising,” will be understood to imply the inclusion of a stated integer or step or group of integers or steps but not the exclusion of any other integer or step or group of integers or steps.
It must be noted that, as used in the specification and the appended claims, the singular forms “a,” “an,” and “the” include plural referents unless the context clearly dictates otherwise.
Ranges are often expressed herein as from “about” one particular value, and/or to “about” another particular value. When such a range is expressed, another embodiment includes from the one particular value and/or to the other particular value. Similarly, when values are expressed as approximations, by the use of the antecedent “about,” it will be understood that the particular value forms another embodiment.
Biosystems can conveniently be viewed at several levels of bio-molecular organisation based on biochemistry, i.e., genetic and gene expression (genomic and transcriptomic), protein and signalling (proteomic) and metabolic control and regulation (metabonomic). There are also important variations in cellular ionic regulation that relate to genetic, proteomic and metabolic activities, and these should also be studied systematically, even at the cellular and sub-cellular level, to complete the description of the bio-molecular organisation of a biosystem.
Significant progress has been made in developing methods to determine and quantify the biochemical processes occurring in living systems. Such methods are valuable in the diagnosis, prognosis and treatment of disease, in the development of drugs, in improving therapeutic regimes for current drugs, and the like.
While genomic and proteomic methods may be useful aids, for example, in drug development, they do suffer from substantial limitations. A “metabonomic” approach has been developed which is aimed at augmenting and complementing the information provided by genomics and proteomics. “Metabonomics” is conventionally defined as “the quantitative measurement of the multiparametric metabolic response of living systems to pathophysiological stimuli or genetic modification.” This concept has arisen primarily from the application of 1H NMR spectroscopy to study the metabolic composition of biofluids, cells, and tissues and from studies utilising pattern recognition (PR), expert systems and other chemoinformatic tools to interpret and classify complex NMR-generated metabolic data sets. Metabonomic methods have the potential, ultimately, to determine the entire dynamic metabolic make-up of an organism.
As outlined above, each level of bio-molecular organisation requires a series of analytical bio-technologies appropriate to the recovery of the individual types of bio-molecular data. Genomic, proteomic and metabonomic technologies by definition generate massive data sets that require appropriate multivariate statistical tools (chemometrics, bio-informatics) for data mining and for extracting useful biological information. These data exploration tools also allow the inter-relationships between multivariate data sets from the different technologies to be investigated; they facilitate dimension reduction and the extraction of latent properties, and they allow multidimensional visualization.
This leads to the concept of “bionomics”, the quantitative measurement and understanding of the integrated function (and dysfunction) of biological systems at all major levels of bio-molecular organisation. In the study of altered gene expression (known as transcriptomics), the variables are mRNA responses measured using gene chips; in proteomics, protein synthesis and associated post-translational modifications are typically measured using (mainly) gel electrophoresis coupled to mass spectrometry. In both cases, thousands of variables can be measured and related to biological end-points using statistical methods. In metabolic (metabonomic) studies, NMR (especially 1H) and mass spectrometry have been used to provide this level of data density on bio-materials, although these data can be supplemented by conventional biochemical assays.
For in vivo mammalian studies, the ability to perform metabonomic studies on biofluids is very important because it gives integrated systems-based information on the whole organism. Furthermore, in clinical settings, for the full utilization of functional genomic knowledge in patient screening, diagnostics and prognostics, it is much more practical and ethically acceptable to analyse biofluid samples than to perform human tissue biopsies and measure gene responses.
Metabonomics offers a number of distinct advantages (over genomics and proteomics) in a clinical setting: firstly, it can often be performed on standard preparations (e.g., of serum, plasma, urine, etc.), circumventing the need for specialist preparations of cellular RNA and protein required for genomics and proteomics, respectively. Secondly, many of the risk factors already identified with a particular disorder are small molecule metabolites that will contribute to the metabonomic dataset.
A limiting factor in understanding high-content biochemical information (e.g., NMR spectra, mass spectra) is its complexity. The most efficient way to investigate these complex multiparametric data is to employ the metabonomic approach in combination with computer-based “pattern recognition” (PR) methods and expert systems. These statistical tools are similar to those currently being explored by workers in the fields of genomics and proteomics.
Pattern recognition (PR) methods can be used to reduce the complexity of data sets, to generate scientific hypotheses and to test hypotheses. In general, the use of pattern recognition algorithms allows the identification and, with some methods, the interpretation of non-random behaviour in a complex system that can be obscured by noise or random variations in the parameters defining the system. Also, the number of parameters used can be so large that visualisation of the regularities, which for the human brain is best in no more than three dimensions, can be difficult. Usually the number of measured descriptors is much greater than three, and so simple scatter plots cannot be used to visualise any similarity between samples. Pattern recognition methods have been used widely to characterise many different types of problem, ranging, for example, over linguistics, fingerprinting, chemistry, and psychology. In the context of the methods described herein, pattern recognition is the use of multivariate statistics, both parametric and non-parametric, to analyse spectroscopic data, and hence to classify samples and to predict the value of some dependent variable based on a range of observed measurements.
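By way of illustration only, the dimension reduction referred to above may be sketched as follows. The sketch assumes principal component analysis (one common unsupervised pattern recognition method, chosen here for illustration and not stated above to be the method used) and computes the first principal component by power iteration. The function name `first_principal_component` and the sample values are hypothetical; the values are invented and do not represent measured spectra.

```python
import math
import random


def first_principal_component(rows, iterations=200, seed=0):
    """Return (direction, scores) for the first principal component.

    rows: list of equal-length lists, one per sample
          (e.g., hypothetical binned spectral intensities).
    """
    n_samples = len(rows)
    n_vars = len(rows[0])

    # Mean-centre each variable (column).
    means = [sum(row[j] for row in rows) / n_samples for j in range(n_vars)]
    centred = [[row[j] - means[j] for j in range(n_vars)] for row in rows]

    # Power iteration on X^T X, without forming the matrix explicitly.
    rng = random.Random(seed)
    direction = [rng.uniform(-1.0, 1.0) for _ in range(n_vars)]
    for _ in range(iterations):
        scores = [sum(x * d for x, d in zip(row, direction)) for row in centred]
        new_dir = [
            sum(scores[i] * centred[i][j] for i in range(n_samples))
            for j in range(n_vars)
        ]
        norm = math.sqrt(sum(v * v for v in new_dir))
        if norm == 0.0:  # no variance in the data; nothing to extract
            break
        direction = [v / norm for v in new_dir]

    # Project every sample onto the component: one number per sample.
    scores = [sum(x * d for x, d in zip(row, direction)) for row in centred]
    return direction, scores


# Invented example: six samples, four descriptors; the third descriptor
# differs strongly between the first three and the last three samples.
samples = [
    [1.0, 0.9, 5.0, 0.2],
    [1.1, 1.0, 5.2, 0.1],
    [0.9, 1.1, 4.9, 0.3],
    [1.0, 1.0, 1.0, 0.2],
    [1.2, 0.9, 1.1, 0.1],
    [0.8, 1.0, 0.9, 0.3],
]
direction, scores = first_principal_component(samples)
```

In this invented example, each sample is reduced from four descriptors to a single score, and the two groups of samples receive scores of opposite sign, so that the grouping can be seen on a single axis rather than in a four-dimensional space.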
Although the utility of the metabonomic approach is well established, its full potential has not yet been exploited. The metabolic variation is often subtle, and powerful analysis methods are required for detection of particular analytes, especially when the data (e.g., NMR spectra) are so complex. New methods to extract useful metabolic information from biofluids are needed in order to be able to achieve clinically useful diagnosis of disease. Methods of analysing data (e.g., NMR spectral data), such as those described herein, may be used to identify diagnostic chemical species (e.g., biomarkers) that may subsequently be used to classify a test sample or subject, for example, in diagnosis, prognosis, etc. These methods represent a significant advance over previously described methodologies.