Today, consumers of information services in a cloud infrastructure expect certain Quality of Service (QoS) guarantees to be fulfilled. The information services wrap ETL (Extract, Transform, and Load) functions, such as address standardization, matching and de-duplication, transcoding, cross-referencing, transformation, or cleansing, into re-usable services. An ETL function (also known as a job) usually includes a number of operators, such as database connectors, pivot operators, and standardization operators. From a cloud service provider perspective, the challenges in information services are the following.
(1) One challenge is atomicity. Some tasks (such as de-duplication) are set operations, which cannot be split. Other tasks (such as address standardization) are individual record operations. Since each service call incurs a processing overhead, it is beneficial not to pay this overhead more often than necessary. Ideally, these individual record operations are therefore grouped together for batch processing, as long as the response time QoS is not violated. The per-call processing overhead can be substantial, because memory has to be allocated for each ETL operator even if only a single row is processed. Tests with a state-of-the-art ETL system reveal that 12 GB of memory can easily be exhausted with approximately 150 ETL operators.
(2) Another challenge is processing locations. In a cloud environment, particularly for large volumes of data, the processing locations matter. There are only two options: either the data is transferred to the processing location, or the processing task is moved to the location where the data resides.
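The grouping of individual record operations described under challenge (1) can be illustrated with a minimal sketch. It is not taken from any particular ETL system: the names `BatchPolicy`, `plan_batches`, and `total_overhead_ms`, the fixed per-call and per-record costs, and the rule that a batch is dispatched when its last record arrives are all assumptions made for illustration. A record joins the current batch only if the oldest record in that batch would still be answered within the response time QoS.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class BatchPolicy:
    """Hypothetical cost model; all values are in milliseconds."""
    call_overhead_ms: int   # fixed cost paid once per service call
    per_record_ms: int      # processing cost of one record
    response_qos_ms: int    # maximum allowed response time per record


def plan_batches(arrivals_ms: List[int], policy: BatchPolicy) -> List[List[int]]:
    """Group record arrival times into batches without violating the QoS.

    Assumes a batch is dispatched when its last record arrives, so the
    oldest record waits the longest and determines whether the batch fits.
    """
    batches: List[List[int]] = []
    current: List[int] = []
    for t in sorted(arrivals_ms):
        if current:
            # Completion time if this record is added and the batch is sent now.
            finish = (t + policy.call_overhead_ms
                      + (len(current) + 1) * policy.per_record_ms)
            # Close the batch if the oldest record would exceed its limit.
            if finish - current[0] > policy.response_qos_ms:
                batches.append(current)
                current = []
        current.append(t)
    if current:
        batches.append(current)
    return batches


def total_overhead_ms(batches: List[List[int]], policy: BatchPolicy) -> int:
    """Per-call overhead paid in total: once per batch, not once per record."""
    return len(batches) * policy.call_overhead_ms


if __name__ == "__main__":
    policy = BatchPolicy(call_overhead_ms=200, per_record_ms=10, response_qos_ms=500)
    plan = plan_batches([0, 50, 100, 400, 420, 900], policy)
    print(plan, total_overhead_ms(plan, policy))
```

With the assumed values, six records are served by three service calls instead of six, so the per-call overhead is paid half as often. A set operation such as de-duplication would not fit this scheme, because its input cannot be split into independently processed groups.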
For example, in data profiling, if a source system holds several TB of data, moving the data to the cloud service provider location is impractical because the transfer takes too long. However, if the data and the profiling task are far apart in network terms (e.g., throughput, geography, or firewalls), the processing time will also be long.
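The transfer-time argument can be made concrete with a rough calculation. The sketch below is illustrative only: the function names, the 100 Mbit/s link, and the one-hour limit are hypothetical, and the estimate ignores latency, protocol overhead, and firewall restrictions.

```python
def transfer_seconds(data_bytes: int, throughput_bits_per_s: int) -> float:
    """Idealized time to move the data over a link of the given throughput."""
    return data_bytes * 8 / throughput_bits_per_s


def choose_processing_location(data_bytes: int,
                               throughput_bits_per_s: int,
                               max_transfer_s: float) -> str:
    """Pick one of the two options: move the data or move the task.

    max_transfer_s is an assumed limit on acceptable transfer time.
    """
    if transfer_seconds(data_bytes, throughput_bits_per_s) <= max_transfer_s:
        return "move_data"
    return "move_task"


if __name__ == "__main__":
    five_tb = 5 * 10**12          # 5 TB, decimal units
    link = 100 * 10**6            # assumed 100 Mbit/s link
    hours = transfer_seconds(five_tb, link) / 3600
    print(f"{hours:.1f} hours")   # about 111 hours under these assumptions
    print(choose_processing_location(five_tb, link, max_transfer_s=3600))
```

Under these assumptions, moving 5 TB takes more than four days, so the sketch selects moving the task to the data.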