Structure and data (together hereafter referred to as “information”) can be moved between relational databases using an infrastructure such as Data Transformation Services (DTS) in the Microsoft® SQL Server™. An improved way to move such information is to take advantage of XML (eXtensible Markup Language) by applying it to the DTS, such that a script file in XML format is able to drive the transfer process of the DTS, allowing for selective transfer of information stored in SQL format in a database. The XML driven process allows for the selective moving of information because the script file in XML format is based on the structure of the information to be transferred. With this general background, the following sections provide brief background of DTS, XML, and SQL.
DTS
Most organizations have multiple formats and locations in which data is stored. To support decision-making, improve system performance, or upgrade existing systems, data often must be moved from one data storage location to another. One way to move data is by using DTS in the Microsoft® SQL Server™.
DTS is a set of tools that can be used to import, export, and transform heterogeneous data between one or more data sources, such as Microsoft SQL Server, Microsoft Excel, or Microsoft Access. Connectivity is provided through Object-Linking-and-Embedding Data Base (OLE DB), an open-standard for data access. Open Database Connectivity (ODBC) data sources are supported through the OLE DB Provider for ODBC.
A DTS solution is created as one or more packages. Each package may contain an organized set of tasks that define work to be performed, transformations on data and objects, workflow constraints that define task execution, and connections to data sources and destinations. DTS packages also provide services, such as logging package execution details, controlling transactions, and handling global variables. Using a DTS object model, packages can be created and run programmatically and custom tasks and custom transformations can be built.
A DTS package usually includes one or more tasks. Each task defines a work item that may be performed during package execution. For example, tasks can be used to: (1) move data between a source and a destination and to optionally apply column-level transformations to the data, (2) quickly load large amounts of data into a SQL Server table or view, or (3) retrieve values from sources outside a DTS package at package run time and assign those values to selected package properties.
When a group of tasks is defined, there is usually an order in which these tasks should be performed. When tasks have an order, each task becomes a step of a process. For instance, in a DTS Designer tasks are manipulated on the DTS Designer design sheet and use precedence constraints to control the sequence in which the tasks execute. Precedence constraints sequentially link tasks in a package.
The DTS data pump is a DTS object that drives the import, export, and transformation of data. The data pump is used during the execution of the Transform Data, Data Driven Query, and Parallel Data Pump tasks. These tasks work by creating rowsets on the source and destination connections, then creating an instance of the data pump to move rows between the source and destination. Transformations occur on each row as the row is copied. After the data pump processes the last row of data, the task is finished and the data pump operation terminates.
SQL
In the simplest terms, SQL is a language used by relational databases to query, update, and manage data. To work with data in a database, a set of commands and statements (language) defined by the DBMS (database management system) software must be used. Several different languages can be used with relational databases, but SQL happens to be the most common. The American National Standards Institute (ANSI) and the International Standards Organization (ISO) define software standards, including standards for the SQL language.
XML
XML is a World Wide Web Consortium (W3C) endorsed standard for document and data representation that provides a generic syntax to mark up data with human-readable tags. XML does not have a fixed set of tags and thus allows users to define such tags as long as they conform to the XML standard. Data may be stored in XML documents as strings of text that are surrounded by text markup. The W3C has codified XML's abstract data model in a specification called the XML information set (XML Infoset). XML Schemas also may be used to apply a structure to the XML format and content. In the case of an XML Schema, a diagram, plan, or framework for XML data in a document may be defined.
With knowledge of the format, developers can write programs that interact with, massage, and manipulate data in XML documents. Off-the-shelf software like web browsers and text editors can be used to work with XML documents. Some tools are able to work with any XML document. Others are customized to support a particular XML application in a particular domain like vector graphics and may not be of much use outside that domain. But in all cases, the same underlying syntax is used even if the syntax is hidden by more user-friendly tools or restricted to a single application. One unexpected but advantageous development in XML was its enthusiastic adoption of ‘data-heavy’ structured documents such as spreadsheets, financial statistics, mathematical tables, and software file formats.
Although XML is flexible in the elements it allows to be defined, it is strict in many other respects. It provides a grammar for XML documents that regulates placement of tags, where tags appear, which element names are legal, how attributes are attached to elements, and so forth. This grammar is specific enough to allow development of XML parsers that can read and understand any XML document. Documents that satisfy this grammar are said to be ‘well formed.’ Documents that are not well-formed are not allowed any more than a C program containing a syntax error would be.
The markup in an XML document describes the document's structure. It lets you see which elements are associated with which other elements. In a well-designed XML document, the markup also describes the document's semantics. For instance, the markup can indicate that an element is a date, a person, or a bar code. In well-designed XML applications, the markup says nothing about how the document should be displayed. That is, it does not say that an element is bold, italicized, or a list item. XML is a structural and semantic markup language, not a presentation language. A few XML applications, like XSL Formatting Objects, are designed to describe text presentation. However, these are exceptions to the general rule. Although XSL-FO describes presentation, one would not write an XSL-FO document directly. Instead, one would write a more semantically marked-up XML document, then use an XSL Transformations stylesheet to change the semantic-oriented XML into presentation-oriented XML.
One of the key advantages of XML is that it offers the tantalizing possibility of truly cross-platform, long-term data formats. It has long been the case that a document written by one piece of software on one platform is not necessarily readable on a different platform, by a different program on the same platform, or even by a future or past version of the same software on the same platform. When the document can be read, all the information may not necessarily come across. For example, much of the data from the original moon landings in the late 1960s and early 1970s is now effectively lost. Even if a tape drive were found that read the obsolete tapes, nobody understands the format in which the data is stored.
Conclusion
Thus, it would be desirable to take advantage of these above mentioned properties of XML and combine them with a SQL Server DTS, such that data and structure stored in SQL on a database could be selectively transferred from a source database to a target database. It would be further advantageous to use a script file in XML format to drive the DTS transfer process, where the script file reflects the structure of what is to be transferred.