An application such as Microsoft SQL Server Integration Services (SSIS), from the Microsoft Corporation, is a platform for building high-performance data integration solutions, including extraction, transformation, and load (ETL) packages for data warehousing. SSIS may include graphical tools and wizards for building and debugging packages; tasks for performing workflow functions such as File Transfer Protocol (FTP) operations, executing Structured Query Language (SQL) statements, or sending email messages; data sources and destinations for extracting and loading data; transformations for cleaning, aggregating, merging, and copying data; a management service for administering SSIS; and application programming interfaces (APIs) for programming the Integration Services object model.
Currently, data transform applications developed using SSIS are only able to run on a single computer. Typically, such an application reads from a file and writes to a file. That may be sufficient on a single machine; however, it may not be sufficient when multiple machines are used, because coordinating multiple machines that each write to multiple files may be too cumbersome.
Because of the single-machine limitation, certain SSIS tasks may be impossible or difficult to complete, for example because their memory requirements exceed what one machine provides, or may be unable to meet time requirements because too few computers are available to share the work. Even if one were to manually run SSIS on two machines in parallel, one may still face the problems of distributing the input data between the two machines and of performing aggregations across all of the data.
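
The two problems named above, distributing the input and aggregating across all of the data, can be illustrated with a minimal sketch. It is not SSIS code and uses no SSIS API; the function names `partition`, `partial_aggregate`, and `merge`, the round-robin split, and the sample rows are all assumptions introduced only for illustration. Each simulated machine computes a per-key sum and count over its share of the rows, and a separate merge step combines those partial results into a global average.

```python
from collections import defaultdict

def partition(rows, num_workers):
    """Hypothetical round-robin split of input rows across workers."""
    parts = [[] for _ in range(num_workers)]
    for i, row in enumerate(rows):
        parts[i % num_workers].append(row)
    return parts

def partial_aggregate(rows):
    """Per-machine step: keep (sum, count) per key, not the average."""
    acc = defaultdict(lambda: [0.0, 0])
    for key, value in rows:
        acc[key][0] += value
        acc[key][1] += 1
    return dict(acc)

def merge(partials):
    """Cross-machine step: combine sums and counts, then divide once."""
    total = defaultdict(lambda: [0.0, 0])
    for partial in partials:
        for key, (s, c) in partial.items():
            total[key][0] += s
            total[key][1] += c
    return {key: s / c for key, (s, c) in total.items()}

# Illustrative input: (key, value) rows that would normally come from a file.
rows = [("a", 1.0), ("b", 10.0), ("a", 2.0), ("a", 6.0), ("b", 20.0)]
parts = partition(rows, 2)
global_avg = merge([partial_aggregate(p) for p in parts])
```

The per-machine step carries sums and counts forward because averaging the per-machine averages would, in general, give a different answer when the machines hold unequal numbers of rows for a key. Running two independent copies of a package side by side provides neither the split nor the merge step, which is the gap described above.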