1. Technical Field
The embodiments herein are generally related to data processing and analytics systems and particularly related to a system and method for Big Data processing and analytics. The embodiments herein are more particularly related to an actor-oriented system, platform or application model and a method for Big Data processing and analytics.
2. Description of the Related Art
Big Data is a large data set that is distributed over a set of storage devices. A typical problem is to find actionable insights in such data. The data may be stored beforehand or may arrive as a continuous stream to be accessed, stored and analyzed with distributed algorithms and frameworks. Big Data analytics inherently requires a set of distributed computing, networking and storage resources that may be available locally or rented from a cloud infrastructure. In this manner, Big Data is related to cloud computing. The ultimate objective in the Big Data market is to gain an effective advantage in terms of time and cost, or a more convenient and easier-to-use system, for handling the processing of large data assets using an efficient software stack and hardware platform.
Big Data takes different forms, such as unstructured data and semi-structured data like machine logs, text documents and media frames. Big Data includes any data that is not relationally structured over managed tables and predefined data schemas in dedicated database systems. Put simply, Big Data is a distributed data set that is not stored in a database. This definition even includes relations between database tuples that are not stored directly using foreign keys. Mining such relations is also a Big Data problem.
A set of sample problems with respect to Big Data includes: monitoring all data on a network to guard against cyber security attacks; monitoring social media outlets for trends and patterns; analyzing current and historical customer transaction data from various sources to detect fraud patterns or opportunities during the life cycle of a customer; generating various algorithms to stay current on large volumes of information; running highly complex data queries on transaction data to see past and present patterns; predicting behavior and patterns by leveraging current and historical data; monitoring transactions and fraud rings; managing and analyzing the flow of people, business and assets from various data sources; managing and analyzing the flow of information to find trends or key words in conversations, documents, news articles and social media outlets to fulfill National Security Missions; incorporating other data sources such as news, weather and stock information into scenarios for better predictions; performing entity identification and disambiguation automatically for modeling; running highly complex data queries on transaction data (spanning months, years or decades) to see past and present patterns in governmental data; combined processing of a variety of data sources to study patterns and to make predictions based on the patterns; managing patient transaction data in electronic medical records, including medical information, medical procedures and prescriptions; capturing and analyzing information from all branches of stores, locations and departments; analyzing information from different sources such as stores and sensor devices for managing pricing, inventory and distribution operations; processing customer data such as billions of call records, texts, streaming media and GPS history; analyzing customer churn, usage behavior patterns, and failed or dropped calls for prevention; analyzing frequent caller data for planning cross-sell and up-sell strategies; selecting easy steps to access, visualize and explore data; getting an integrated, strategic view across multiple operational systems; measuring and optimizing agent performance, customer satisfaction and marketing ROI; real-time monitoring of large distributed systems; processing complete, rich streams of social networking data; real-time analysis of log information generated from widely distributed systems; and statistical analysis of real-time vehicle traffic information on a global basis.
Consumer product companies and retail organizations monitor social media such as Facebook and Twitter to get an unprecedented view into customer behavior, preferences and product perception. Manufacturers monitor minute vibration data from their equipment, which changes slightly as the equipment wears down, to predict the optimal time to replace or maintain the equipment or devices. Manufacturers also use the monitored data to detect aftermarket support issues before a warranty failure becomes publicly detrimental. Financial services organizations use the data mined from customer interactions to slice and dice their users into finely tuned segments. This enables the financial institutions to create increasingly relevant and sophisticated offers. Advertising and marketing agencies track social media to understand responsiveness to campaigns, promotions and other advertising mediums. Insurance companies can identify the home insurance applications that can be processed immediately. Retail organizations sell their products to brand advocates and enthusiastic customers by changing the perception of brand antagonists and by embracing social media. Hospitals predict which patients are likely to seek readmission by analyzing patient records. Appealing recommendations and more successful coupon programs can be generated for web-based businesses. Governments make data public so that new applications can be developed. Strategies for sports teams are estimated or planned by tracking ticket sales data.
The abovementioned variety of Big Data problems has created a driving and differentiating force for a large set of products for Big Data and high-throughput stream processing. Rapid growth in the market is another cause of the diversity of tools and products among competitors. The differentiating factors for products and services reside on three levels: framework, distribution and packaging. On the other hand, some quality features are shared by the outstanding competitors, such as fault tolerance, large storage capacity, scalability, extensibility and cluster management for enterprise applications.
Hence, there is a need for a system and method for analyzing and processing Big Data using actor systems. Further, there is a need for an application model and methods for writing Big Data programs using actor systems. Still further, there is a need for developing an application model using runtime actor systems, asynchronous messaging, event-driven middleware and a scalable general-purpose language. Yet further, there is a need for a distributed processing system for connecting, organizing and balancing a set of chained processes acting on distributed Big Data.
The abovementioned shortcomings, disadvantages and problems are addressed herein, as will be understood by reading and studying the following specification.