The growing importance of the data integration software segment to the use and management of data has been described by a leading technology industry analyst group, the Gartner Group. The following assessments and discussion are taken from Gartner's Sep. 22, 2008 Magic Quadrant for Data Integration Tools report.
To quote the report, organizations increasingly view investments in data integration tools as a strategic basis for enterprise data management . . . . Organizations recognize the role of these technologies in support of high-profile initiatives such as master data management (MDM), business intelligence (BI) and delivery of service-oriented architectures (SOAs). Recent focus on cost control has made data integration tools a surprising priority as organizations realize the “people” commitment for implementing and supporting custom-coded or semimanual data integration approaches is no longer reasonable. Vendor consolidation continues, driven by the convergence of single-purpose tools into data integration suites or platforms . . . . Buyers must recognize that, as an evolving market, disruptions caused by merger and acquisition activity are likely as smaller vendors with valuable technology continue to be subsumed into larger entities to form more complete data integration tools portfolios.
The discipline of data integration comprises the practices, architectural techniques and tools for achieving consistent access to, and delivery of, data across the spectrum of data subject areas and data structure types in the enterprise, to meet the data consumption requirements of all applications and business processes. As such, data integration capabilities are at the heart of the information-centric infrastructure . . . . Business drivers, such as the imperative for speed to market and agility to change business processes and models, are forcing organizations to manage their data assets differently. Simplification of processes and the IT infrastructure are necessary . . . .
The data integration tools market comprises vendors that offer software products to enable . . . :
Data acquisition for BI and data warehousing: Extracting data from operational systems, transforming and merging that data, and delivering it to integrated data structures for analytic purposes.
Creation of integrated master data stores: Enabling the consolidation and rationalization of the data, representing critical business entities such as customers, products and employees.
Data migrations/conversions: Traditionally . . . via the custom coding of conversion programs, data integration tools are increasingly addressing the data movement and transformation challenges . . . in the replacement of legacy applications and consolidation efforts during merger and acquisition.
Synchronization of data between operational applications: . . . data integration tools provide the capability to ensure database-level consistency across applications, both on an internal and interenterprise basis.
Creation of federated views of data from multiple data stores: Data federation, or enterprise information integration (EII), . . . providing real-time integrated views across multiple data stores without physical movement of data.
Delivery of data services . . . SOA context: An architectural technique, rather than a data integration usage itself, data services are the emerging trend for the role and implementation of data integration capabilities within SOAs.
Unification of structured and unstructured data: there is an early but growing trend . . . .
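By way of illustration only, the first category quoted above (extracting data from operational systems, transforming and merging that data, and delivering it to an integrated data structure) can be sketched in a few lines of Python. This is a minimal sketch and is not taken from the Gartner report or from any vendor tool; the two source systems, their record layouts, and the function names extract, transform, merge, and deliver are all hypothetical.

```python
# Hypothetical sketch of extract -> transform -> merge -> deliver.
# The source systems, field names, and sample values are assumptions
# made for illustration; no real product or API is represented.

def extract():
    """Stand-in for reading rows from two operational systems."""
    crm_rows = [
        {"CustID": "001", "Name": " Ada Lovelace "},
        {"CustID": "002", "Name": "Alan Turing"},
    ]
    billing_rows = [
        {"customer_id": 1, "balance": "120.50"},
        {"customer_id": 3, "balance": "15.00"},
    ]
    return crm_rows, billing_rows


def transform(crm_rows, billing_rows):
    """Normalize both sources to a shared key name and consistent types."""
    crm = [
        {"customer_id": int(r["CustID"]), "name": r["Name"].strip()}
        for r in crm_rows
    ]
    billing = [
        {"customer_id": int(r["customer_id"]), "balance": float(r["balance"])}
        for r in billing_rows
    ]
    return crm, billing


def merge(crm, billing):
    """Combine records sharing a customer_id; unmatched records are kept."""
    merged = {}
    for row in crm + billing:
        key = row["customer_id"]
        merged.setdefault(key, {"customer_id": key}).update(row)
    return merged


def deliver(merged):
    """Return the integrated structure, ordered by key, for analytic use."""
    return [merged[key] for key in sorted(merged)]


integrated = deliver(merge(*transform(*extract())))
for record in integrated:
    print(record)
```

Even in this toy form, each source needs its own key mapping and type conversion, which is the custom-coded effort the quoted passage says data integration tools are meant to replace.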
To summarize, the prior art suffers from the following data integration software issues. Data integration technology reality: data has overwhelmed technology; there is no single data architecture or data truth; and data is a company's most important asset in the 21st century.
Technology reality: the issues and realities described below result from a number of technology, systems design, and software architecture limitations that have remained constant for more than 25 years. The major data integration vendors are currently addressing the issues outlined here by doing more of the same design, doing it on a bigger scale (acquisitions), and expecting companies to pay for a multi-step, expensive, technically high-risk integration process over time.
The alternative offered by vendors is to hide these technical issues of data complexity and chaos behind strategies such as Cloud Computing, Virtualization, Outsourcing, or Software as a Service.
Further Issues include those concerning Technology, Software Strategy, and Business Economics.
For example, software vendors no longer provide the marketplace with high-level architectures for multiple application vendors to map to. Instead, the market is given a data integration “vision” by vendors. The vision is being implemented through software acquisitions having disparate designs that are merged over time into a single level of data integration software. The vision is really a compatibility strategy and not an architecture. This is the result of almost 40 years of data integration technology solutions coming to market. Data technology is getting more complex and more chaotic.
Another example issue is fragmented customer data environments. First, 40 years of vendor software, architectures, and processes have left companies with a fragmented, disparate data environment. Second, the investment in data is large and growing. Third, data is an investment and an asset that must function in order for a company to run. Fourth, the business groups within a company are developing their own data solutions at a departmental level.
A further issue is vendor-generated data silos. As the integration of software acquisitions goes forward, the vendors' value proposition to companies is that data issues are high-risk technical and business problems, and that the vendor will remove these risks in return for permission to take over, de facto, the data sources and data management responsibilities through a number of options. Vendor-generated data silos eliminate a company's ability to make choices, because the investment in software, implementation, maintenance, and ongoing consulting services makes changing vendors financially and technically infeasible without a great deal of cost and transition pain. Data silos have been a part of data management issues for decades. Each vendor's software, and each generation of software acquired, typically has its own data and data management infrastructure. The situation has been manageable because companies leave the solutions in place for years and have slowed the rate of data software innovation to reduce cost and technical implementation challenges.
Another issue is that the vendors' data integration strategy is to acquire enough software disciplines to offer a one-stop-shopping solution whenever a company needs data integration and data management software. Vendors are attempting to dominate the market by being the most extensive and robust solution, so that a customer makes a single-vendor software choice. To date, analysts have not confirmed that any software vendor has achieved the technical wherewithal to do this. It is an expensive, unproven, technically high-risk option.
A further issue relates to the license revenue model: how much function is enough? For over 40 years, software has been sold under the license revenue model. It is a model that has now become a technical problem rather than a key revenue strategy for vendors and their customers. Today, software revenue consists of a version license fee, an annual maintenance fee, and consulting service fees to implement and modify the software over time. To be justified, the license fee requires new function in new versions sold to companies periodically (on a 3 to 4 year cycle). The license model forces vendors to expand the scope and depth of function in each new release to justify the cost. This model is followed so that publicly owned companies can retain or increase shareholder value. Functionality growth has reached the stage where it is a problem for companies in terms of implementation costs and technical risks, diminished or no value from the new function, the complexity and risks of change, and the limited technical skill sets available to support the process. This is another reason why architecture is no longer the focus and has instead become a marketing vision for designing new software functionality.
A still further issue relates to affordability versus software complexity. Small to medium businesses, enterprises, and business groups within companies can neither afford nor technically support the data integration software trends. Whether the single-vendor option is viable is a matter of debate. That is why self-developed or hand-coded solutions are a major trend in the data integration segment. The cost basis must be realigned to the budgets available.
Yet another issue is financial operational criteria and group funding. CFOs set the financial and operational parameters for IT. IT investments are being held to ROI or cost-benefit criteria. This is leading to SaaS, outsourcing, Cloud Computing, etc. The financial community is sending the IT budgets to the business groups and letting the groups make their own decisions. This leaves IT with maintenance, computing operations, and data storage support requirements.
A still further issue relates to user group strategies. User groups have the computing skills and budgets to make their own decisions for their departmental or group operations. The PC/departmental compute and storage trend is still a major factor in software requirements. The vendor single-solution approach does not fit the business, financial, or technical model used by departmental data solutions. User groups continue to develop and hire IT skills within their groups. The challenge is how to provide a software design that effectively meets their data integration requirements.
A yet further issue is sub-vendor technology. It is common development practice to use other vendors' software to support the value-add portion of data integration software instead of doing the entire development in-house. This creates dependencies on the sub-vendors. A case in point is the Microsoft software-based GUI employed by the major software vendors. Often the sub-vendor software and/or its architecture is not fully compatible with, or has diverged from, the vendor's own development requirements. When Microsoft announced that VISTA was to replace XP based software, the data integration vendors did not move to the VISTA platform for a number of technical and business reasons. Technically, with icons, pop-up screens, and workflow design being tightly connected to the GUI design, VISTA represented a major rewrite of the software even before any function enhancements were considered. The vendors opted to stay with their current GUI design. The trend is that software vendors are delegating a part of their architecture and design to other vendors' standards.
A further issue is the limited technical skill resources available to implement and support complex software and data integration projects. Staff trained in using data integration software work in very small groups, with a few dozen trained data integration staff supporting a user community of 5,000 to 10,000 or more. Their role becomes one of maintaining control rather than supporting business/ROI initiatives. The pool of highly skilled technical resources is not growing, and therefore a new software design is needed to work with user groups that have different and fewer skills than a corporate IT staff possesses.
The technology design challenges are also contributing to data integration issues. New, innovative design and GUI solutions are slow in coming because the underlying technology and design have not progressed. There are significant core technology design issues that are impacting data integration software's capacity to support companies in leveraging their structured data sources.
The design, use, technical risk, and cost issues outlined above are key factors facing data integration and management software. There is thus a need in the art, first, to reduce software complexity and data chaos and to provide the maximum flexibility for a company to use its data as required. Second, there is a need to improve the ROI and business value proposition by placing the data integration benefits and risks with those who require the solutions: the Subject Matter Expert users. Third, there is a need to provide software that manages the disparate structured data sources at the Subject Matter Expert user's computing skill level. Finally, there is a need to provide an architecture and solution at a stable cost. Achieving this outcome requires a new design and invention basis for data integration and management software.
Such a design addresses current issues and supports the emerging technologies of the 21st century: Cloud Computing, Virtualization, Software as a Service (SaaS), Service-Oriented Architecture (SOA), Master Data Management (MDM), Business Intelligence (BI), Customer Relationship Management (CRM), outsourcing, open source software, and complex vendor software products (ERP, Real Time BI, Web 2.0, enterprise mash-ups, etc.).
Fourth Generation Language (4th GL) interpretive software base: the standard software programming base is 4th GL programming running on an interpretive code execution basis. The design point of 4th GL is interpretive, productivity-oriented programming whose purpose is to insulate programmers from machine-level code. As the number of applications and the amount of data being processed grow over time, the size of the server or mainframe must be increased to maintain performance.
Further, the visualization processes deploy icon commands, pop-up screens for selecting options, workflows for object-based design, IE file folders, and dashboard monitoring, using a traditional Windows based GUI to provide function and visualization for the data integration processes. The changes that have occurred are incremental to the core design first employed in the 1980s under the then-new PC based (client-server design) initiatives running in private networks using dumb devices for users. This type of software design requires significant IT skills and experience to install, maintain, and use in the operations/business environment. The population in a company with the training and skills is 10 to 25 people supporting 5,000 or more employees or customers.
A second software design impact is the required use of script programming to support 4th GL data integration solutions and the connections between programs. Script programming requires scarce, trained, and experienced infrastructure programmers to write the support code that vendors' data integration software needs in order to function in a production or development environment. It is an intensive process.
A third impact is the very complex screen design of data software GUIs, together with the processes supporting the data integration software and its interpretive 4th GL base. These screens are very icon intensive and make extensive use of file folder schemas, pop-up screens, and program workflows within the GUI screen design. Using software of this complexity requires IT-skilled users. The design inherently restricts how non-IT users can access data.
The fourth major design factor is the central control of the data integration content and data sources. Security and control have been considered the value proposition for data integration design and use for over 40 years. Design issues also include the fact that the Internet came 10 years after the core design and architecture for data integration software tools were developed, and that the design has not progressed away from its first-generation design point.
The design and programming retrofits by the major software vendors have improved the design only marginally. With the emergence of the new 21st century technology trends outlined above, the design, or rather the combination of multiple software company acquisitions having very disparate architectures, is a business market-share program, not a design solution.
The interpretive code and code execution design architecture requires dedicated computing resources for designing and then executing the code in a production computing environment. The software must reside on the production system in order to execute. Since the large data integration vendors today employ interpretive code design, this supports the single-vendor business model and the complementary vendor-generated data silo problem. Though 4th GL was designed as productivity software, it has become part of the problem: vendor versions of the code are incompatible, and functions that one vendor offers and another does not leave vendors' software and structured data formats incompatible with one another. This is good for vendor revenue but a problem for a company that needs an open, flexible data platform in order to maximize business opportunities and ROI.
The acquisition strategy and trends are delaying redesign or new software design platforms because compatibility issues need to be resolved first. Customers do not want to deal with vendor software incompatibilities that arise from a vendor's master market strategy. The latest trend among large vendors that indicates the scope of the design issues is the recent announcement of dedicated computing hardware and storage merged with software to create “data warehouse machines” or “data machines/platforms”. It is another indication that thirty-year-old designs and inventions are no longer meeting requirements.
In all of these software design and architecture issues, a requirement for extensive technical skills is always part of the designs offered. The central tenet of data integration software design is that control of the data is the most important functional design aspect. The control criteria also include the security access processes, but these are only a small part of the design complexity of current data integration software standards. Thus, providing additional user access within the controlled data design is viewed as placing the data at high risk, for a number of valid business, security, and IT management reasons. The net result, however, is that for the past 25 years departmental or group computing solutions have become a standard way of supporting users that is not fully managed by IT or the vendors. Control measures such as data de-duplication have already been bypassed by the departmental solutions trend. Users are pulling the needed data into their own compute and storage environments or are creating the data sources within the group.
The most serious software design issue with central control is that the demand for data exceeds the limited IT resources. A staff of 25 IT-trained people supporting 5,000 or more employees, and controlling data access and data source creation, has been overwhelmed by the dynamics of the business model and workload. IT does not possess the skills, even when working with group representatives, to support a dynamic 21st century global business environment. This is why the statement that data has overwhelmed technology reflects a very significant data problem: the software design and the processes are not structured to support the data environment into the future.
The conclusion of the technical review and assessment is that data integration software designs and architectures have become functionally overloaded and too complex for users. The fact that alternative options such as Cloud Computing are being implemented is the real indicator of the need to start the design process from a new base point. This requires new software design criteria based on proven, best, and practical results. The major design criteria are a mixture of technology, performance, ease of use, and support of a business ROI for data integration and for how structured data sources are utilized.
In summary, data integration's complexity and chaos issues are growing larger and more difficult. The business performance criteria, data ROI, the data software complexity and chaos issue, and the emerging technology trends lead to the conclusion that more function, more complex single-vendor solutions, and more consulting services are no longer the future of an effective, flexible data integration solution strategy. That approach has failed, and a new invention and design is required. What is needed is a new solution, based on the current issues and addressing the lack of innovative design thinking, that produces cost-effective software having the core function to support a company's IT data requirements and that gives the Subject Matter Expert user direct access to the needed structured data at their existing computing skill level.