In many situations, a database application needs to rely on external data that is provided outside of the primary application persistence, for example, data from an external data provider. Examples include:
    Product data
    Ratings
    Price calculations
    Address data
    Geo-location
If the application needs live access to this external data, the overall responsiveness for the end user can dramatically decrease, as the performance of the system becomes tightly coupled to the response times of the external data provider. Often, external data providers do not have all the data already in place and pre-calculated, but instead need to build up the dataset on-the-fly, for example by triggering additional calls to other (external or internal) data providers.
These external systems may be old legacy systems that do not leverage state-of-the-art database technologies. Requests can take many seconds (up to minutes), which may be unacceptable to the end user in some online application scenarios. Consequently, these systems may not always fulfill the requirements of a given application, especially when the load on the system is very high, or in a consumer application scenario where many thousands of users access the service in parallel. The following key performance indicators may be important considerations when building such systems:
    Single user end-to-end response time
    Number of parallel users/requests
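The coupling between the two indicators above and the external provider's response time can be made concrete with a small back-of-the-envelope model. The following is a minimal sketch only: the hop names, the latency figures, and the number of concurrent request slots are hypothetical values chosen for illustration, not measurements of any real system. It assumes the external calls are made serially and uses Little's law to bound throughput.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Call:
    """One external hop in the request path (latency is an assumed value)."""
    name: str
    latency_s: float


def end_to_end_latency(local_s: float, calls: List[Call]) -> float:
    """With serial calls, the user waits for local work plus every external hop."""
    return local_s + sum(c.latency_s for c in calls)


def max_throughput(concurrent_slots: int, latency_s: float) -> float:
    """Upper bound on requests per second when each request occupies one
    slot for latency_s seconds (Little's law: throughput = slots / latency)."""
    return concurrent_slots / latency_s


# Hypothetical live path: a provider that itself calls a slow legacy backend.
live_path = [Call("price_provider", 4.0), Call("legacy_backend", 8.0)]

live = end_to_end_latency(0.05, live_path)   # request with live external access
local_only = end_to_end_latency(0.05, [])    # same request with no external hops

print(f"live access: {live:.2f} s per request, "
      f"at most {max_throughput(100, live):.1f} requests/s with 100 slots")
print(f"local only:  {local_only:.2f} s per request, "
      f"at most {max_throughput(100, local_only):.1f} requests/s with 100 slots")
```

Under these assumed numbers, both indicators are dominated by the external hops: the local processing time is negligible next to the provider's latency, so neither response time nor the sustainable number of parallel requests can be improved by tuning the application alone.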
One approach to overcome this situation is to replicate the external data into the enterprise running the database application. However, certain challenges present themselves with regard to the external data:
    Magnitude of the data:
        How big is the set of external data?
        Can it be replicated completely?
        Does it contain variable content, for example calculations?
    Up-to-dateness:
        How often does the external data change?
        What percentage of the data changes in which timeframe?
        How long does it take to update the complete dataset?
    Legal perspective:
        Can the data be replicated from a legal perspective?
        Can the data be replicated from a privacy protection perspective?