Currently, the telecommunications industry is rapidly developing and deploying distributed communications networks to provide data communications to personal mobile computing and communications devices, such as cell phones, pagers, hand-held and laptop computers, wearable computers, vehicle-based computers, and so on. These distributed communications networks provide a communications medium over which information service providers can deliver individually personalized or customized information, such as stock quotes, sports scores, news articles, etc., via data/text, voice and/or video messages (e.g., email, instant messaging, voice messaging, etc.) to the users' personal mobile devices.
In the near future, it will be desirable for the information service providers (and advantageous to the users) to provide information services that further deliver contextually relevant information to individual users. In other words, the services provide information relevant to the user's current context (e.g., location, activity, setting, social/business relationships, etc., as well as personal preferences). For example, a contextual information service may deliver a notification that a social acquaintance is physically nearby when the user is off work in a public place, or that a business contact is scheduled to attend the same event as the user, among many others. As another example, a contextual information service may deliver a weather or news report localized for the user's present geographical location, or in anticipation of a ski or golf outing on the user's calendar.
When such contextual information services are operated on a large scale (herein termed a “context megaservice”), the information service likely will require processing queries on large collections of sensitive personal information (e.g., location, preferences, circle of friends, etc.). Further, to scale performance to a large-scale operation, this processing may be distributed across many server computers, including servers operated by various different entities (e.g., other information providers).
For many people, such large-scale distributed processing of personal information raises privacy concerns. Many people therefore will be reluctant to disclose personal data to anybody other than (at most) a few trusted entities. Such concerns may limit the adoption and scale of context megaservices, despite their potential utility to the users.
The present invention is directed towards ways to distribute processing based on confidential information without making the confidential information available to untrusted information processing servers in an intelligible form (i.e., plain text). The present invention opaquely encapsulates the confidential information in the form of a “software black box” on a trusted computer (e.g., the user's computer, or a trusted server) where the raw confidential information resides. This black box encapsulates the confidential information in a manner from which the confidential information cannot be explicitly derived, but which still allows queries on the confidential information to be answered. The black box can be distributed or published to other computers, where the black box can be used to answer queries without revealing the confidential information.
In one embodiment of the invention illustrated herein, the black box takes the form of a set of query, answer pairs, where the query of each pair is represented as a hash result produced by applying a one-way hashing function to a set of query input values. This set of query, answer pairs is distributed to other computers, which can then effectively query the confidential information without having access to or directly processing the raw confidential information. Instead, a query comprising a set of the query input values is hashed using the same one-way hashing function. The hash result of the query is used as a look-up key into the set of query, answer pairs to obtain the appropriate answer to the query.
This form of black box encapsulation of confidential information queries protects the confidential information from discovery on the computers to which it is distributed in at least two ways. First, due to the one-way hashing function, the initial set of input values from which each individual query, answer pair in the set was generated cannot be directly reconstructed from the query value of the pair. More significantly, however, the logic or reasoning, and possibly other confidential data values (in addition to the query input values), that determine the answer for the set of query input values are not visible from the set of query, answer pairs, even if it were possible to reverse-hash the query hash result in each pair to its initial query input values.
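The query, answer pair form of the black box described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of the illustrated embodiment: SHA-256 stands in for the unspecified one-way hashing function, and the function names, the friends list, the place categories, and the notification rule are all hypothetical.

```python
import hashlib

def hash_query(inputs):
    """One-way hash of an ordered sequence of query input values.

    SHA-256 is an assumption; the source does not name a hashing function.
    """
    h = hashlib.sha256()
    for value in inputs:
        data = str(value).encode("utf-8")
        # Length prefix keeps ("ab", "c") distinct from ("a", "bc").
        h.update(len(data).to_bytes(4, "big"))
        h.update(data)
    return h.hexdigest()

def build_black_box(candidate_queries, answer_fn):
    """Runs on the trusted computer: keep only (query hash, answer) pairs."""
    return {hash_query(q): answer_fn(q) for q in candidate_queries}

def query_black_box(black_box, inputs, default=None):
    """Runs on any computer: hash the inputs and look up the answer."""
    return black_box.get(hash_query(inputs), default)

# Hypothetical confidential data and decision logic. These stay on the
# trusted computer and are never distributed.
_friends = {"alice", "bob"}
_public_places = {"cafe", "park"}

def _may_notify(query):
    requester, place = query
    return requester in _friends and place in _public_places

# Enumerate the queries the black box should be able to answer.
candidates = [(r, p)
              for r in ("alice", "bob", "carol")
              for p in ("cafe", "park", "office")]

# Only this dictionary of hash -> answer is published to other computers.
box = build_black_box(candidates, _may_notify)
```

A computer holding only `box` can call `query_black_box(box, ("alice", "cafe"))` to obtain an answer, yet the dictionary contains neither the friends list nor the rule that combines it with the place category. Queries outside the enumerated set return the default value.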
Additional features and advantages will be made apparent from the following detailed description of the illustrated embodiment, which proceeds with reference to the accompanying drawings.