Our world is a complex place. It faces many difficult, interdependent problems, some caused by nature, others caused by man. As individuals, families, communities, and nations, we face an ever-changing and compounding series of perplexing challenges spanning numerous domains: defense, health, climate, food, cyber, energy, transportation, education, weather, and the economy. Mounting pressures in each of these areas threaten our health, our safety, our security, our livelihood, and our sustainability. We seek improved capabilities to detect, understand, mitigate, and prevent the threats of this brave new world. To address these challenges, we invariably resort to science, our systematic enterprise for building and organizing knowledge that helps us understand, explain, and predict the world around us. At the core of science is inquiry. We formulate questions. We generate hypotheses. We predict consequences. We experiment. We analyze. We evaluate. We repeat. Our problems are complex; the process is slow.
Fueling the scientific process are the observations we make and the data we collect. With the advent of the 21st-century telecommunications explosion, data is now flowing and evolving all around us in massive volumes, with countless new streams mixing and shifting each minute. This data space is enormous and continuously changing. And by many accounts, its expansion and movement have only just begun. Analyzing and understanding this vast new ocean of data is now of paramount importance to addressing many of the complex challenges facing our world.
Today's data analytics industry is vibrant, with a continuous supply of new and innovative products, services, and techniques that thrive and prosper based on their relative merits in their respective marketplaces. Unfortunately, these components are rarely interoperable at any appreciable scale. Moreover, the rapid proliferation of analytic tools has further compounded the problem. With only loose coordination, these partial solutions are ineffective at combating the broad spectrum of problems. Attempting to impose a “one-size-fits-all” analytic solution across today's tremendous data expanse, however, poses significant scientific, technical, social, political, and economic concerns. Consequently, enormous resources must regularly be expended to address isolated issues and mitigate specific threats. Thus, the analytic community faces considerable challenges in dealing with major classes of problems, particularly those at national and international levels.
The present embodiments describe an approach for organizing and analyzing our new data-centric world, rapidly sharing information derived from that data, unifying this information into a common global knowledge framework, and enabling scientific analysis at a scale never before possible. The approach is transformative and multi-disciplinary, and is directed to collaborative, global-scale integrative research. It offers a new method for dramatically reducing complexity and accelerating the scientific discovery process, and was specifically formulated to address extremely complex problems related to global security, world health, and planet sustainability. The approach advocates the construction of a unified global systems model. At the core of its design is an extensible knowledge representation architecture specifically crafted for large-scale, high-performance, real-time scientific analysis of heterogeneous, dynamic, and widely distributed data sources without compromising data security, safety, or privacy.
The approach employs a powerful integration of advanced concepts and emerging capabilities selected from across the government, commercial, laboratory, and academic sectors. The resulting framework offers participants a pragmatic means to think globally, leveraging the aggregate of combined knowledge to further science and better inform local, regional, and international decision making. This approach exploits the uniqueness of heterogeneous data sources and diverse scientific theories, weaving all such points together into a powerful collaborative fabric. Great care, however, must be taken with this stitching process. The movement and replication of data are costly and risk inappropriate information disclosure and/or compromise. The farther data flows from its source, and the more information is aggregated by an increasing number of parties, the greater the security and privacy concerns and the accompanying loss of autonomy. As described herein, the approach addresses these concerns in its foundational design tenets.
A primary goal of this approach is to achieve a computational science analog to self-sustained nuclear fission. That is, the approach advocates a method for reaching a critical knowledge (mass) density that is capable of sustaining an unprecedented analytic (energy) yield. To achieve such high levels, the approach grapples with scale from the outset, encouraging knowledge enrichment “downward” in addition to traditional data pipeline scaling “upward”. To this end, the approach includes the construction of a knowledge representation structure that spans the planet and many of its complex, interdependent subsystems. The construction of this structure is accomplished via carefully formulated transformations of the world's exponentially increasing data into an asymptotically limited information space. This technique enables global-scale computational tractability, promising a reduction in integration time and cost from quadratic to linear in the number of source data systems. Thus, the analytic framework provides a practical, achievable means for accomplishing multi-disciplinary research across many diverse, complex, and often heavily interdependent domains. The results of this work offer a conceptually simple and elegant method for the scientific community to manage complexity across many heterogeneous domains at a scale never before possible.
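As a rough illustration of the quadratic-to-linear reduction noted above, the following minimal sketch counts integration links under two hypothetical arrangements: direct point-to-point integration between every pair of source data systems, and a single mapping from each source data system into a common knowledge model. The function names, and the assumption that integration cost is proportional to the number of links, are illustrative only and are not taken from the embodiments.

```python
def pairwise_links(n: int) -> int:
    """Links needed if each of n source systems integrates directly with every other.

    This is the number of unordered pairs, n * (n - 1) / 2, which grows quadratically.
    """
    if n < 0:
        raise ValueError("number of source systems must be non-negative")
    return n * (n - 1) // 2


def common_model_links(n: int) -> int:
    """Links needed if each of n source systems maps once into a shared model.

    One mapping per source system, which grows linearly.
    """
    if n < 0:
        raise ValueError("number of source systems must be non-negative")
    return n


if __name__ == "__main__":
    # Compare the two arrangements for a few illustrative system counts.
    for n in (10, 100, 1000):
        print(f"{n} sources: pairwise={pairwise_links(n)}, common model={common_model_links(n)}")
```

Under these assumptions, 1000 source data systems would require 499500 pairwise links but only 1000 mappings into a common model.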