1. Field of Invention
The invention relates to hardware, software and electronic service components and systems to provide large-scale, reliable, and secure foundations for distributed databases and content management systems, combining unstructured and structured data, and allowing post-input reorganization to achieve a high degree of flexibility.
2. Description of Related Art
One can envision highly distributed databases capable of managing simultaneous participation by billions of users, and highly distributed content management systems coordinating the contributions of billions, routinely integrating the contributions of both people and machines, and spanning multiple organizations, firms, and the globe itself. One can imagine flexible systems, where data is input in unstructured as well as structured forms, and subsequent users can access and present the data in flexible, evolving forms not anticipated at the point of data entry. Massively parallel processing—envisioned as occurring inside one machine or cluster of machines—was once the premier challenge facing the database and content management community. The new challenge, in our view, is massively parallel, and flexible, participation of billions.
In order to accomplish this, the world will need a new “business ecosystem.” Advances in information technology often show three related themes that may be thought of as analogous to the biological processes of expansion and species succession in natural ecosystems. First, non-expert end-users will be empowered to solve problems. Second, technology platforms will be created that modularize technology contributions into niches. The niche contributions interrelate with each other through standard protocols and interfaces that are made “open” to technologists and the general public, so that tens, hundreds, and sometimes millions of innovators can contribute to the resulting business ecosystem, each according to his or her choice, creativity and competence. Third, new niches will in turn be established and opened up, bringing in further new contributors and contributions.
As the business ecosystem expands, some specific technological components will become critical enablers to the continuing advance of the whole. Issues of flexibility, scale, reliability, and security will become vital to the community. These vital components, for example microprocessors, storage controllers, and network devices in the personal computer ecosystem, will require systematic application of research and development, capital investment, and coordination with industry partners in order that the whole ecosystem can progress. If the world is to make real the vision of the flexible participation of billions, there are a number of core components and systems that have not been invented, and will need to be invented.
The flexible participation of billions has been presaged by blogging—that is, the act of individuals creating Web sites and adding to them more or less daily. By dramatically increasing production and sharing of Web-based content, the blogging movement now produces a virtual river of content—available continuously and with global circulation. Just as word processing empowered millions to create their own documents, blogging software has made it relatively easy for millions to produce their own Web sites and keep them continually updated. By the promotion of a simple underlying standard for sharing text and other media, blogging has popularized the “syndication” or passing on of content borrowed from others—extending the reach of any given blogger and further increasing the total quantity of information in circulation.
A number of companies have emerged as niche players targeting various aspects of large-scale distributed databases, content management, and group participation. For example, some companies such as FeedDemon, NewsGator, myYahoo (Yahoo), and Bloglines have focused on client-side aggregation and presentation. Companies such as Technorati, Google, and Feedster have focused on the complementary services of searching for data feeds of interest. Other companies, such as SixApart, Drupal, TypePad, Flickr, Picasa (Google), and Blogger (Google), have focused on technologies for providing syndicated data streams. Still others have positioned themselves as content providers, including new companies such as Engadget, Weblogs Inc., Topix.net, and MySpace, as well as established media companies such as the New York Times and BBC. Of course, various generic Internet technologies are also relevant to the rapidly growing weblog data flow, such as BitTorrent or Akamai's EdgePlatform.
While offering significant advancement in terms of experiences such as sharing news, music, videos and other items, as well as enabling players of games to interact with each other individually and in groups, the value chain is weak, fragmented, and closed to interoperability among contributors in many areas. The value chain will benefit from improved contributions in specific functions or niches, as well as from a more comprehensive overall vision of a possible “flexible participation of billions” ecosystem, additional niches (layers and modules) of functionality, recast functionality among modules, rationalization of protocols and interfaces among modules, and custom combinations of functions that establish end-to-end solutions for specific purposes. For example, available services are weak in presentation, search, signal, and network routing. Aggregators that centralize content use display formats that are widely criticized, despite a general agreement among users that they improve over conventional search engine displays. Storage of most blog content is in proprietary, isolated data sets controlled by blog service operators, and the data cannot be easily restructured or even moved from one provider to another. In their current form, services fail to provide enterprise-class features such as security, privacy, data integrity, and quality of service.
There remains a vital need for components and services that explicitly address the challenge of enabling the “flexible participation of billions” and that are capable of levels of scale, reliability, security and flexibility as yet unrealized and perhaps unimagined. There is a need for a new global business ecosystem, within which innovation by millions of people will be embraced, in order to meet the challenge. In order to stimulate the formation and rapid evolution of such a business ecosystem, there will have to be systematic development of general purpose software, systems and protocols specifically engineered to enable the flexible participation of billions. There also remains a need for such an infrastructure in the health care industry.