1. Technical Field
The present teaching relates to methods, systems, and programming for data processing. Particularly, the present teaching is directed to methods, systems, and programming for scheduling transactions in a data system.
2. Discussion of Technical Background
Advances in the Internet have made a tremendous amount of information accessible to users located anywhere in the world. This introduces new challenges in data processing for “big data,” where a data set can be so large or complex that traditional data processing applications are inadequate. Scheduling is critical to achieving efficient big data processing, especially for in-memory engines.
Because in-memory engines schedule transactions serially at each executor, conventional approaches do not allow mixed workloads on a single copy of data. As a result, a long-running transaction blocks every transaction queued behind it at the same executor, including transactions that are short-lived or have a higher priority. A traditional solution is to separate long-running transactions from short-lived transactions, e.g., by separating transactional and analytical workloads, which leads to two types of systems. With this separation, however, recent transactional data can only be used by analytical workloads after a long delay. In addition, maintaining two systems significantly increases the total cost of ownership.
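The blocking effect described above can be illustrated with a minimal sketch. It is not taken from any particular engine: the function name run_serially, the transaction names, and the durations (in arbitrary time units) are all hypothetical, and the executor is modeled simply as a first-in, first-out queue advanced by a virtual clock.

```python
from collections import deque


def run_serially(transactions):
    """Model one executor that runs (name, duration) transactions
    strictly one at a time, in arrival order.

    Returns a dict mapping each transaction name to the time it
    spent waiting before it was allowed to start.
    """
    queue = deque(transactions)
    clock = 0  # virtual time, arbitrary units (assumption for illustration)
    wait_times = {}
    while queue:
        name, duration = queue.popleft()
        wait_times[name] = clock  # time spent queued behind earlier work
        clock += duration         # the executor is busy until this finishes
    return wait_times


# Hypothetical mixed workload: one long analytical scan arrives first,
# followed by two short transactional requests.
waits = run_serially([
    ("analytical_scan", 1000),
    ("point_update", 1),
    ("urgent_read", 1),
])
print(waits)
```

In this sketch the two short transactions each need one time unit of work but wait 1000 and 1001 units respectively, because nothing can start until the long-running scan at the head of the queue completes.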
Therefore, there is a need to develop techniques for scheduling transactions in a data system that overcome the above drawbacks.