With the proliferation of mobile devices such as smartphones, such devices may be used for detecting objects and running augmented reality applications. However, due to the limited storage space on such mobile devices, a local database, if one exists, may store only a limited number of objects for supporting such visual search, which can adversely affect the accuracy of object recognition. To obtain better search quality, a mobile device may send captured images to a server that performs the object recognition.
FIG. 1 illustrates a conventional approach to performing a visual search. As shown in FIG. 1, a mobile device 102 sends a query image 104 as the query data to a server 106 (also referred to as the cloud). The server 106 may then extract two-dimensional (2D) features and associated descriptors from the query image 104 and match these descriptors against descriptors having three-dimensional (3D) positions in an object database (not shown) in the server 106, thereby finding 2D-to-3D correspondences. This approach may be employed to identify an object and yield a pose relative to the mobile device 102 at the time the query image 104 was taken.
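For illustration only, the server-side matching step described above may be sketched as follows. The sketch assumes that descriptors are plain numeric vectors compared by Euclidean distance and that ambiguous matches are rejected with a nearest-neighbor ratio test; neither assumption is taken from the conventional approach of FIG. 1. The function name match_2d_to_3d, the ratio parameter, and the data layout are hypothetical.

```python
import math


def match_2d_to_3d(query_features, database, ratio=0.8):
    """Pair 2D query features with 3D database points by descriptor distance.

    query_features: list of ((x, y), descriptor) from the query image.
    database: list of ((X, Y, Z), descriptor) from the object database.
    ratio: hypothetical threshold for the nearest-neighbor ratio test.
    Returns a list of ((x, y), (X, Y, Z)) correspondences.
    """
    correspondences = []
    for point_2d, query_desc in query_features:
        # Rank database entries by Euclidean distance to the query descriptor.
        ranked = sorted(database, key=lambda entry: math.dist(query_desc, entry[1]))
        if not ranked:
            continue
        best = ranked[0]
        if len(ranked) > 1:
            d_best = math.dist(query_desc, best[1])
            d_second = math.dist(query_desc, ranked[1][1])
            # Assumed ratio test: skip the feature when the best match is
            # not clearly closer than the second-best match.
            if d_best > ratio * d_second:
                continue
        correspondences.append((point_2d, best[0]))
    return correspondences


# Toy data: two database points with 2-element descriptors.
database = [((0, 0, 1), (1.0, 0.0)), ((1, 0, 1), (0.0, 1.0))]
query = [((10, 20), (0.9, 0.1)), ((30, 40), (0.5, 0.5))]
matches = match_2d_to_3d(query, database)
```

In this toy data, the first query feature is clearly closest to the first database point and is kept, while the second is equidistant from both database points and is discarded as ambiguous. Correspondences of this kind are the input from which a pose would subsequently be computed.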
One of the problems with the above conventional approach is that there is a long latency involved in sending the query data to the server 106, processing the query, and returning the response data to the mobile device 102. During this time, the mobile device 102 may have moved from the position where the query image 104 was captured. As a result, a pose of the mobile device 102 computed by the server 106 may be out of date.
Another problem with the conventional approach shown in FIG. 1 is that the query image 104 contains a large amount of data that needs to be sent from the mobile device 102 to the server 106. This problem can get worse as the image resolution of the mobile device 102 continues to increase, for example, from 2 megapixels to 4 megapixels, etc. This increased query data can further lengthen the duration between the time the query image 104 is sent and the time a query response is received from the server 106.
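The effect of image resolution on transfer time may be estimated with simple arithmetic, sketched below. The figures are assumptions chosen for illustration and are not taken from FIG. 1: the image is treated as uncompressed with 3 bytes per pixel, and the uplink rate of 5 megabits per second is hypothetical. A compressed image would be smaller, but the transfer time would still grow with the pixel count.

```python
def raw_image_bytes(megapixels, bytes_per_pixel=3):
    """Size of an uncompressed image, assuming 3 bytes (24-bit RGB) per pixel."""
    return int(megapixels * 1_000_000 * bytes_per_pixel)


def upload_seconds(num_bytes, uplink_mbps):
    """Ideal transfer time in seconds over an uplink of uplink_mbps megabits per second."""
    return num_bytes * 8 / (uplink_mbps * 1_000_000)


# Hypothetical 5 Mbit/s uplink; protocol overhead and compression are ignored.
size_2mp = raw_image_bytes(2)               # 6,000,000 bytes
size_4mp = raw_image_bytes(4)               # 12,000,000 bytes
time_2mp = upload_seconds(size_2mp, 5)      # 9.6 seconds
time_4mp = upload_seconds(size_4mp, 5)      # 19.2 seconds
```

Under these assumptions, doubling the resolution from 2 megapixels to 4 megapixels doubles the upload time, before any server processing or response time is added.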
Therefore, there is a need for a method, apparatus, and computer program product that can address the above issues of the conventional approach.