1. Field of the Invention
The present invention relates generally to data processing and an improved data processing system. More specifically, the present invention relates to optimized approaches to creating large information technology systems. Still more particularly, the present invention relates to optimized approaches for storing and processing data for a large project.
2. Description of the Related Art
Large corporations or other large entities use information technology systems to manage their operations. An information technology system is a system of data processing systems, applications, data, reports, flows, algorithms, databases, and other infrastructure used to maintain the data and operations of the organization. A large scale information technology system is not necessarily located in one single physical location, but can be situated in many different physical sites implemented using numerous physical devices and software components. A large scale information technology system can be referred to as a major information technology system.
Major information technology system projects, such as those used by large corporations, often fail and some fail disastrously. Failure often costs millions of dollars, tens of millions of dollars, or even more in wasted time, manpower, and physical resources. Thus, substantial effort is usually exerted in planning the construction of a major information technology system. Planning construction of a major information technology system, at least in theory, reduces the chances of failure.
Major information technology system projects are beyond the abilities of a single individual to implement alone. Likewise, construction of major information technology system projects cannot be viewed as a single monolithic project due to the vastness and complexity of these system projects. Thus, major information technology system projects are often constructed in phases using groups of sub-projects. Various groups of people work to complete each sub-project. As work progresses, the sub-projects are assimilated together in order to create the major information technology system project.
However, even with planning and the use of sub-projects, most major information technology system projects fail or are never completed. Even if the major information technology system project is implemented, the resulting major information technology system project does not function optimally with respect to maximizing the efficiency of the organization for which the major information technology system project is constructed. For example, subsets of the whole major information technology system project may not match data, business requirements, and/or resources in an optimal manner. As a result, the organization suffers from the inefficiencies of the final major information technology system project. Correcting or adjusting these inefficiencies may be cost prohibitive due to the fundamental nature of how the major information technology system project was constructed.
The most typical reason for failure or inefficiency of these system projects is that the construction of these system projects is approached from a non-data centric viewpoint. Instead, design of sub-projects of major information technology system projects often is performed by managers, executives, or others who are experts at understanding where a business should go or how a business should operate, but are not technically proficient at implementing or constructing a major information technology system project. As a result, the sub-projects “look good on paper” but, when implemented, fail or, if successful individually, cannot be integrated together in a desired manner. An entire major information technology system project may fail or be inefficient if sub-projects that were designed to build the major information technology system project cannot be integrated. Currently available methods and system projects do not provide a means to reliably create efficient major information technology system projects. Therefore, it would be advantageous to have an improved method and apparatus for creating optimized sub-projects useful for creating and implementing a major information technology project.
Additionally, an extremely complex problem can arise regarding how to store data for large information technology system projects. A particular enterprise may need to access many different types of data, and possibly vast amounts of data of each data type. For example, data can be warehoused on-site using a process known as “extract, transform, and load,” often referred to as “ETL” in the industry. Once available, ETL data is time efficient and easy to access, but requires possibly extremely large data storage facilities and complex database technology. In another example, data can be federated. Federated data is stored offsite, often in many different databases, and is accessed via a network. Federated data requires less maintenance relative to data that has been ETL'd; however, federated data is often slow to access, relative to ETL'd data. Federated data also consumes vast amounts of networking resources and is dependent on target data schemas.
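The difference between the two storage approaches described above can be illustrated with a minimal sketch. The sketch is not part of the described invention; the source names, the sample rows, and the functions `etl_load`, `warehouse_total`, and `federated_total` are hypothetical and are chosen only for illustration. An in-memory SQLite database stands in for the on-site warehouse, and a plain dictionary stands in for the offsite databases that would, in practice, be reached over a network.

```python
import sqlite3

# Hypothetical "offsite" sources. In a real deployment each entry would be a
# separate database reached over a network, each with its own schema.
SOURCES = {
    "sales_eu": [("widget", "12.50"), ("gadget", "7.25")],
    "sales_us": [("widget", "3.00")],
}


def etl_load(sources):
    """Extract rows from every source, transform them, and load them on-site."""
    warehouse = sqlite3.connect(":memory:")
    warehouse.execute("CREATE TABLE sales (region TEXT, item TEXT, amount REAL)")
    for region, rows in sources.items():  # extract
        cleaned = [
            (region, item.upper(), float(amount))  # transform
            for item, amount in rows
        ]
        warehouse.executemany("INSERT INTO sales VALUES (?, ?, ?)", cleaned)  # load
    warehouse.commit()
    return warehouse


def warehouse_total(warehouse, item):
    """Answer a query from the local copy only; no source is contacted."""
    row = warehouse.execute(
        "SELECT COALESCE(SUM(amount), 0) FROM sales WHERE item = ?",
        (item.upper(),),
    ).fetchone()
    return row[0]


def federated_total(sources, item):
    """Answer the same query by visiting every source at access time."""
    total = 0.0
    for rows in sources.values():  # one network round trip per source in practice
        for name, amount in rows:
            if name == item:  # depends on each source's own schema and formats
                total += float(amount)
    return total
```

In this sketch the warehoused approach pays the cost of copying and storing all rows once, after which each query touches only local storage, whereas the federated approach stores nothing locally but must visit every source on every query.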
Complicating the question of how data is stored is the determination of the form in which data is to be stored. For example, data can be stored in the form of pictures, simple text, in the form of specialized databases, in a form that is application-specific, in a mark-up language, or in many other different data types.
The determination of how data is stored and in what format data is stored can be extremely difficult and complex for large information technology system projects. Today, solutions are often sub-optimal, due to human limitations and possibly due to political decisions that impact how a project is put together. Thus, an improved method and apparatus are needed for optimally determining how data is stored together and in what format data is stored.