1. Field of the Invention
The present invention is directed to verification of query results returned by untrusted servers, and more specifically to a query verifier that uses a synopsis to verify query results.
2. Brief Discussion of Related Art
Due to the overwhelming flow of information in many data stream applications, data outsourcing is a natural and effective paradigm for individual businesses to address the issue of scale. In conventional data outsourcing models, the data owner outsources streaming data to one or more third-party servers, which answer queries posed by a potentially large number of clients on the data owner's behalf. Data outsourcing intrinsically raises issues of trust. Conventional approaches to query verification build upon cryptographic primitives, such as signatures and collision-resistant hash functions, which typically work only for certain types of queries, such as simple selection/aggregation queries.
Industrial and academic Data Stream Management Systems (DSMSs) have been developed in recent years. The need for such DSMSs is mainly driven by the continuous nature of the data being generated by a variety of real-world applications, such as telephony and networking. Providing fast and reliable querying services on the streaming data to clients is central to many businesses. However, due to the overwhelming data flow observed in most data streams, companies typically do not possess the necessary resources for deploying a DSMS. In these cases, outsourcing the data stream and the desired computations to a third-party server can be the only alternative. Outsourcing also solves the issue of scale. That is, as the number of clients increases, the number of mirroring servers employed by the data owner can be increased. In addition, outsourcing can often lead to faster query responses, since these servers can be closer to the clients than a single centralized server. However, because data outsourcing and remote computations raise issues of trust, outsourced query verification on data streams is a problem with important practical implications.
For example, a data owner with limited resources, such as memory and bandwidth, may outsource its data stream to one or more remote, untrusted servers that can be compromised, malicious, running faulty software, etc. A client registers a continuous query on the DSMS of the server and receives results upon request. If the server charges the data owner according to the computational resources consumed and the volume of traffic processed for answering the queries, the server has an incentive to deceive the owner and the client for increased profit. Furthermore, the server might have a competing interest in providing fraudulent answers to a particular client. Hence, a passive malicious server could drop query results or provide random answers in order to reduce the computational resources required for answering queries, while a compromised or active malicious server might be willing to spend additional computational resources to provide fraudulent results (by altering, dropping, or introducing spurious answers). In other cases, incorrect answers might simply be a result of faulty software, or of load shedding strategies, which are essential tools for dealing with bursty streaming data.
Ideally, the data owner and the client should be able to verify the integrity of the computation performed by the server using significantly fewer resources than would be needed to answer the query directly, i.e., where the data owner evaluates the query locally and then transmits the entire query result to the client. If a client wants to verify the query results with absolute confidence, the only solution is for the data owner to evaluate the query exactly and transmit the entire result to the client, which defeats the purpose of outsourcing.
Further, the client should be able to tolerate errors caused by load shedding algorithms or other non-malicious operations, while at the same time being able to identify malicious attacks that have a significant impact on the result.