Generally, cloud computing refers to the use and access of multiple server-based computational resources using a digital network, such as the Internet. Cloud system users access the web server services of the cloud using client devices, such as a desktop computer, laptop computer, tablet computer, smartphone, personal digital assistant (PDA), or similar type device (hereinafter collectively referred to as a “client device” or “client devices”).
In cloud computing, applications are provided and managed by a cloud server and data is stored remotely in a cloud database. Typically, cloud system users do not download and install applications that exist in the cloud on their own computing device because processing and storage is maintained by the cloud server and cloud database, respectively.
Typically, online services are provided by a cloud provider or private organization. This obviates the need for cloud system users to install application software on their own separate client devices. As such, cloud computing differs from the classic client-server model by providing applications on a cloud server that are executed and managed by a client service with no installed client version of the application being required on the client device. The centralization of cloud services gives a cloud service provider control over versions of the browser-based applications provided to clients. This also removes the need for version upgrades of applications on individual client devices.
In operation, the cloud system user will log onto a public or private cloud. Computing is then carried out on a client/server basis using web browser protocols. The cloud provides server-based applications and all data services to the cloud system user with the results then being displayed on the client device. As such, the cloud system user will have access to desired applications running remotely through a server which displays the work being done using the cloud application on the client device.
Cloud database storage-allocated client devices are used to make applications appear on the client device display. However, all computations and changes are recorded by the cloud server, and files that are created and altered are permanently stored in the cloud database storage.
Cloud computing, when implemented, includes provisioning of dynamically scalable and virtualized resources. This may be carried out by cloud providers without cloud system users' knowledge of the physical location and configuration of the system that delivers the requested services. As such, cloud computing infrastructures consist of services delivered through shared data centers. However, from the client side, the cloud appears as a single point of access.
A generic cloud architecture includes an architecture of hardware and software systems involved in the delivery of the cloud computing services. Two significant components of the cloud computing architecture are the “front-end” and “back-end.” The front-end is what is seen by the cloud system user at his/her client device. This would include the client device application used to access the cloud via the user interface, such as a web browser, business intelligence (“BI”) tool, mobile device, or through some other system. The back-end of the cloud computing architecture is the cloud itself consisting of various computers, servers, and data storage devices of which the cloud system user has no knowledge.
The shared services within a typical cloud computing environment are shown in FIG. 1 generally at 100. Client 102 is the client device with its internal software that relies on cloud computing for application delivery through web services. Cloud application 104 is cloud application services also referred to as “Software as a Service (SaaS).” This is the delivery of software over the Internet that eliminates the need to install and run an application on the cloud system user's computing device. Since the applications are cloud applications, maintenance and support of these applications are greatly simplified.
Cloud platform 106 is cloud platform services also referred to as “Platform as a Service (PaaS).” PaaS is the delivery of a computing platform and/or solution stack as a service that uses the cloud infrastructure and cloud applications. This facilitates the deployment of applications from the cloud.
Cloud infrastructure 108 is cloud infrastructure services also referred to as “Infrastructure as a Service (IaaS).” IaaS is the delivery of computer infrastructure as a service typically in the form of platform virtualization. Cloud infrastructure services may be in the form of data centers operating virtual machines that run on physical machines.
Server 110 refers to the server layer of the cloud. This includes computer hardware and software for delivery of cloud services to client 102.
As previously stated, the cloud may be a public or private cloud. There also are other cloud configurations that may involve elements of both. Some of the well-known cloud types will now be briefly discussed.
A “public cloud” is a cloud in which resources are dynamically provisioned over the Internet using web applications and services from a third-party provider.
A “community cloud” is one that is established where several organizations have similar requirements and seek to share infrastructure to realize the benefits of cloud computing.
A “hybrid cloud” is one that recognizes the need of companies to deliver services in a traditional way to some in-house operating methods and provide technology to manage the complexity in managing the performance, security and privacy concerns that result from the fixed delivery methods of the company. A hybrid cloud uses a combination of public and private storage clouds.
A “combined cloud” is one in which two clouds are joined together. In such a configuration, there will be multiple internal and/or external cloud providers.
A “private cloud” is essentially the emulation of a public cloud operating on a private network. Through virtualization, a private cloud gives an enterprise the ability to host applications on virtual machines enterprise-wide. This provides benefits of shared hardware costs, better service recovery, and the ability to scale up or scale down depending on demand.
In the past, many computer-based data warehouse implementations could be considered for extensive cloud use but there were problems because they were single-tenant systems. Single-tenant systems of this type were configured as a seven (7) layer stack of dedicated hardware and software for each tenant (client) deployment. Each stack would at least include (1) an application layer, (2) a database layer, (3) an OS layer, (4) a cluster/management layer, (5) a server layer, (6) a fabric channel layer, and (7) a storage layer. On an enterprise-wide basis, the stack would need to be replicated a large number of times to accommodate each client deployment, which makes the maintenance and updating of client systems both time consuming and costly for the Information Technology (“IT”) professionals tasked with these responsibilities. As such, the traditional single-tenant implementations, though applicable, were not particularly desirable for warehousing data in a cloud environment. Companies such as Teradata, IBM, and Oracle offer database platforms, which are generalized platforms for data management. Applications are built on top of these generalized data management platforms to be either single-tenant or multi-tenant.
An example of a single tenant system is Eagle PACE™, which is a software application with the data warehouse model and functionality specifically designed for buy—side financial services organizations. This product needs to be implemented and maintained by professional IT services. As such, it takes extensive training to be able to set up the system data map, rules, and process logic tool used to load data. Therefore, typically, end-user clients cannot use Eagle PACE™ as an “out-of-the-box” solution. Eagle PACE™ has been implemented, for example, on top of Oracle, Sybase, and Microsoft SQL data management servers.
Further, Eagle PACE™ is not designed to support multiple clients on a single platform deployment. Separate infrastructure and software are required to be installed for each set of client data that requires separate reference, processing, or data security, e.g., a single-tenant system. Eagle PACE™ is also not designed to accept real-time updates or near real-time message flow. As such, Eagle PACE™ is a static load (files) rather than a dynamic load near-real time messages and data replication system.
Conventional data warehouse implementations do not offer self-service at the business deployment level. Further, conventional data warehouse software product applications, for example, Eagle PACE™, are not SaaS platforms that are capable of supporting multiple clients. Other known limitations of conventional data warehouse software products include, but are not limited to, a lack of data lineage tracking back to the origin of the data, lack of an independent database proxy connection. One must use the database client provided by the data base vendor, e.g., Oracle which can increase security risk. Additionally, conventional data warehouse software products are not dynamic, i.e., they do not have the ability to define data structures based on the meta-data and data being loaded. Instead, these systems are static, which means the data structures must be pre-defined at the database level before the data is loaded.
Conventional data warehouse models that are designed for handling “Big Data” generally are not particularly effective in areas of data integration and data governance. In this context, “data integration” is the development of a framework that will enable non-technical system users to directly access the data they need for analysis. Further, “data governance” is the managing of big data in such a way that roles and responsibilities may be delineated for every individual within a business that accesses, analyzes, reports on a derives new data, and governing processes that ensure data quality, data integrity, and a single source of truth with respect to such data. Data governance includes clear ownership of all data in the warehouse, tracking of data back to its origin source, and tracking all changes to data over time. All data must be tracked in multiple dimensions of time. ASOF a point in time, ASAT a point in time when changing data ASOF a point in time and by the ACTUAL time the data was posted to the warehouse.
Additional limitations of conventional data warehouse software products include, but are not limited to, a lack of the capability for data mart construction definition, data mart reuse, and automatic data mart refresh. For purposes of the present invention, a data mart is a subset of the data warehouse that pertains to data for a single department, business unit or specific use case. A data mart consists of data that has been selected from one or more of the many sources and categories of data stored in the data warehouse. This enables the department or business unit to use, manipulate, and develop the data for the data mart in any way they see fit without altering information inside other data marts or the original data loaded to the data warehouse. These conventional data warehouse software products also require that a separate copy of the product, infrastructure, and database be installed for each client deployment.
However, there is a need in computer-based private cloud systems for implementation of better systems and methods for cloud computing and cloud application development and deployment on an enterprise-wide basis. The system and method of the present invention solves these needs.
Therefore, there also is a need to overcome the limitations of conventional data warehouse implementations and provide a self-service capability for end-users/consumers to access, load, discover, select, filter, merge, aggregate analyze and visualize data in a permissioned, governance framework that supports multiple tenants concurrently in a data cloud.