The present invention relates to distributed computing and, more particularly, to the scalability of applications that, for performance reasons, require the set of resources being queried to be stored in memory.
Current distributed computing models generally involve tightly coupled nodes (i.e., clustered computing), distribution of workload based on service type (i.e., SOA or 3/N-tier architectures), or virtualized services (e.g., cloud computing). These models focus on services rather than on the resources those services provide. They address neither the scalability of the data provided by a given service nor instances where discrete sets of data are used as resources for a set of services.
Several existing solutions address some of these issues, for instance virtualization of databases through the combination of several underlying RDBMS resources. However, these solutions fall short in one of two ways. Some combine data that is distributed by type (e.g., one table is sourced from one database while another table is sourced from another database), which does not address the scalability of data of the same type. Others gather resources by querying multiple data repositories to locate the required information, which is inefficient because several nodes must be queried before the data is located. Finally, current distributed computing models do not address data aggregation in which relationships between resources in the aggregated sets are preprocessed as part of the overall application's initialization procedure in order to improve runtime performance. This is particularly relevant for data such as RDF, where information about a single resource may reside in a number of locations.
Therefore, it would be desirable to have a mechanism for distributing workload by data set rather than by service, one that overcomes the issues associated with current methods.
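By way of illustration only, the difference between locating a resource by querying several repositories in turn and going directly to the node that holds a given data set can be sketched as follows. This is a minimal sketch under stated assumptions, not the mechanism of the invention: the `Node` class, the `scatter_lookup` and `routed_lookup` functions, the node and data set names, and the assumption that the caller already knows which data set a resource belongs to are all hypothetical and introduced solely for this example.

```python
from typing import Dict, List, Optional


class Node:
    """Hypothetical node holding one in-memory data set of resources."""

    def __init__(self, name: str, resources: Dict[str, str]) -> None:
        self.name = name
        self.resources = resources
        self.queries = 0  # number of lookups this node has served

    def get(self, key: str) -> Optional[str]:
        self.queries += 1
        return self.resources.get(key)


def scatter_lookup(nodes: List[Node], key: str) -> Optional[str]:
    """Query nodes one at a time until the resource is found."""
    for node in nodes:
        value = node.get(key)
        if value is not None:
            return value
    return None


def routed_lookup(index: Dict[str, Node], dataset: str, key: str) -> Optional[str]:
    """Go straight to the node that owns the named data set (assumed known)."""
    node = index.get(dataset)
    return node.get(key) if node is not None else None


def total_queries(nodes: List[Node]) -> int:
    """Sum the lookups served across all nodes."""
    return sum(node.queries for node in nodes)


def build_nodes() -> List[Node]:
    """Create three hypothetical nodes, each holding one small data set."""
    return [
        Node("node-a", {"r1": "alpha"}),
        Node("node-b", {"r2": "beta"}),
        Node("node-c", {"r3": "gamma"}),
    ]
```

In this sketch, looking up `r3` with `scatter_lookup` touches every node before the resource is found, whereas `routed_lookup` touches only the node registered for the data set, at the cost of maintaining the hypothetical data-set-to-node index.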