The Internet has become a vast source of information, accessible to the majority of North Americans. However, as the amount of available information grows by orders of magnitude over time, many users become bogged down in “information overload” when they attempt to use the Internet to search for specific information. In the case of local merchants and service providers, the Internet simultaneously offers consumers more and often better choices than traditional methods do, and makes those choices harder to find because of the vast amount of information that must be manually combed through to determine its validity.
People using search engines to locate products and services in their community can get thousands of results even when using very specific search terms. While a savvy user understands that most of the results further down the list will not be of use to them, they have to manually scan down the list and review many of the results to determine if any of them meet the criteria of what they were looking for. The user must decide if they should continue to scan or if their time would be better spent altering their search parameters. In other words, the consumer does not know if they are not finding what they want because they have not entered the proper search terms or because it is simply not there. This is further complicated because companies that design web pages are familiar with the tactics that can move their clients' web pages up in the rankings of popular web search engines. This floods a consumer with a large number of false positives, which results in a lot of wasted time. In the case of local service providers, there are many sites on the Internet that allow a local service provider to advertise their service for very little money or even for free. However, the consumer is once again forced to sort through lists of “want ad” type listings to find the right service, determine if the service is available in their area and determine if it is at a price they are comfortable paying. Often, some of the information required is not made available unless the consumer calls the advertiser. Again, this results in hassle for the consumer and wastes time.
These processes are contrary to the modern person's desire and need to obtain the information they seek quickly and efficiently. These problems exist because the current solutions do not utilize tools that can help the consumer vastly narrow down their search results. Even if they did, virtually none of the information on the Internet has the necessary infrastructure to permit these tools to be used optimally. Giving consumers and businesses access to these tools provides the means for consumers to quickly locate the business that meets their exact requirements (and, just as important, informs them if nothing meets those requirements instead of listing a number of results that “might” meet them) and gives businesses the means to measure the markets that they work in for interest in their product or service.
The typical use of a database is to store vast amounts of information so that specific information can be queried on demand. This returns a subset of the entire database which is relevant to the person performing the query. For example, a person could query a database of employees for those only having the first name Jim. Also typical of most databases is how data is associated into records and cross indexed with separate data that is relevant to the record. Thus, the person can query for a known element (e.g., the name ‘Jim’) in a manner that displays all the related information associated with the records that satisfy the query in order to discover unknown information (e.g., the last names of employees named Jim).
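The employee example above can be sketched as follows. This is only an illustration using Python's built-in sqlite3 module; the table name `employees`, its columns, the sample rows and the helper `last_names_for` are assumptions made for the sketch and do not come from any referenced system.

```python
import sqlite3

def last_names_for(conn: sqlite3.Connection, first_name: str) -> list:
    """Query by a known element (first name) to discover unknown
    information (the associated last names)."""
    rows = conn.execute(
        "SELECT last_name FROM employees WHERE first_name = ? ORDER BY last_name",
        (first_name,),
    ).fetchall()
    return [row[0] for row in rows]

# Hypothetical in-memory database with made-up sample records.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (first_name TEXT, last_name TEXT)")
conn.executemany(
    "INSERT INTO employees VALUES (?, ?)",
    [("Jim", "Okafor"), ("Ana", "Lopez"), ("Jim", "Baker")],
)

print(last_names_for(conn, "Jim"))
```

The query returns only the subset of records that satisfy the criterion, together with the related field the person wished to discover.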
A geospatial query (or GSQ) is a specialized search performed in a database of information where all the database records requiring searching have associations with location or geographical information, and the criterion used in the query also includes location or geographical information. The results of a geospatial query are determined by how well the spatial information supplied in the query matches the spatial information stored in the database records. For example, a user could supply a geographic location in a query of known areas in order to retrieve the records which contain an area that includes the location provided.
Prior art teaches how forward and reverse geocoding can be used to determine if a geographical location (or “point”) or a geographical area (or “zone”) is contained within the boundaries of a polygon shape. For example, forward geocoding is used in U.S. Pat. No. 5,961,572 (Craport et al.) for determining if a point is contained in the bounds of a polygon and U.S. Pat. No. 5,978,747 (Craport et al.) for determining if a zone is contained in a polygon. U.S. Pat. No. 6,868,410 teaches a high performance method in determining if a given point or zone is contained in a polygon. U.S. Pat. No. 7,287,002 (Kothuri) teaches how that invention employs a user interface to return to a user the data associated with a polygon (or polygons) which encompass the geocoded location as described by the user.
U.S. Pat. No. 5,991,739 uses latitudinal/longitudinal co-ordinates to geocode customers and vendors (collectively, ‘Users’). The reference provides an Internet Ordering Machine for customers and vendors that attempts to partially automate one aspect of a vendor's day to day operation, namely the order taking process. Vendors provide a product list so that customers can choose the product they wish to purchase. The machine then replaces the act of a person picking up the phone and calling the business to order by relaying what the customer chose on the web page to the business through simulated voice calls, faxes and/or emails. The transaction is an actual ordering of product from the vendor. The reference statically geocodes users (assigning one location each) and does not appear to allow a user to define multiple locations. It uses a grid system that progressively shrinks the resolution of an area until a single unit is of sufficient area to represent the smallest measure possible to accomplish the point A to point B calculation. It provides two services in a one way direction, as it is only the consumer who queries the system. The consumer's service is ease of locating a vendor and the ability to order on the spot, while the vendor's service is the ability to obtain new business automatically. Fax, email and voice notifications are sent to vendors. The reference goes so far as to include document reading and voice recognition software to interpret commands, requests and responses so that it can facilitate the partial automation of the vendor's business, resulting in a relatively complex system.
U.S. Pat. No. 6,363,392 appears to focus on its ability to take unformatted information, extract spatial information from it and assign a “confidence” factor as to how well the extracted location would match the actual real world location, and also seems to pertain to mass uploading of information. Included is a means to use a map to adjust a location, more specifically to “correct” misinterpreted location information. Another focus is its use of spatial indexing to determine proximity. This uses a process that breaks down locations into ever smaller quadrants to determine locations and their proximity to one another. In this case, it is a method that trades pinpoint accuracy (while maintaining a tolerable accuracy) for extremely fast comparisons. It also seems to be a point A to point B comparison. In summary, this patent is about taking in vast amounts of information in various stages of formatting. It formats the information, determines location and assigns an indexing system to speed up proximity search capabilities.
U.S. Pat. No. 6,571,279 teaches displaying information to a viewer based on optimizing a match between information purveyors, such as advertisers, and the viewer in a manner that is executed local to an information delivery system (column 1, line 50); the use of advanced user profiles which can be coupled with location information and information delivery systems to optimise subscriber customized information delivery to identify subscribers (column 4, line 40); that a buyer may desire to be targeted for certain mailings that describe products that are related to his or her interests, and that a seller may desire to target users who are predicted to be interested in goods and services that the seller provides (column 8, line 51); the use of a pseudonym (column 8, line 60); provision for authenticating a user's right to access particular target objects, such as target objects that are intended to be available only upon payment of a subscription fee (column 9, line 28); allowing advertisers to access a database of consumer information to gauge receptiveness to a product type and to target consumers that meet certain criteria, including alerting a vendor to a consumer request for information, if compliant with users' privacy policies (column 16, line 8); and an example of how a business benefits from knowledge of a consumer's location (column 17, lines 35-65). This is designed as a tool for giant corporate stores to profile mass numbers of consumers in order to better target advertising and marketing to locations where consumers congregate or pass through. It uses real time location information to measure congregation and predict arrival times. Information and profiles of consumers are covertly amassed through numerous online and real world means. The location information that makes the invention function relies on technology that can reveal an individual's real world location. Calculations seem to be based on the distance between point A and point B.
The reference has consumers go to vendors by getting the right advertisement or incentives into their field of view or into their hands. It allows vendors to search the accumulated data (in an anonymous way) and to send information to consumers that permit it. It states that information is stored on vendor computers and processing duties are distributed to client computers to mitigate bandwidth and processor intensive activities, perhaps due to a combination of the vast amounts of information that must be handled and the now antiquated state of computer technology at the time.
Aforementioned U.S. Pat. No. 5,961,572 uses the number of times a line intersects the boundaries of a shape in the determination of whether the point from which the line is drawn is inside or outside the shape in question. This patent's means of determining the location to be geocoded may be considered quite inefficient: if a provided address does not geocode, increasingly wider ranges of landmarks must be specified until a geocode can be discovered. This method is designed to discover a single point and then cycle through a list of known shapes until the correct one is found.
Aforementioned U.S. Pat. No. 6,868,410 is in essence an improvement over the previous patent. It claims the same method of determining whether a point is in a shape through the “line intersects” method. However, this patent improves upon the method by using additional databases of information and using efficient indexing to allow that data to be referenced faster.
U.S. Pat. No. 6,701,307 appears to teach use of a map to define a location, use of a “quad key” system of indexing locations for fast searching, “spidering” documents (aka, using Google bots) on the web and creating an indexing database of captured material and creating quad keys to further tag the information, and using “location to location” as a means of deriving a search area. This usually results in a circle, but the description states “other shapes” may be used, such as common shapes like triangles, squares and rectangles.
US Patent Application Publication 2004/0133471 teaches pay for performance advertising, but appears to lack a sophisticated “geospatial query”.
U.S. Pat. No. 5,978,747 deals with zone in zone comparisons, using the principles of U.S. Pat. No. 5,961,572.
US Patent Application Publication 2006/0155609 appears to simply use zip codes, area codes or other predefined areas as a means to represent geographic information.
U.S. Pat. No. 7,403,939 is a patent for returning query results based on geographic information about the requestor. “General” methods of passively acquiring geographical information (zip codes, area codes, IP address) are described, and an indication is made that geographic information about a user is retrieved from an electronic store (a database) and that the query may also apply to “proprietary” data. The geographic information still applies to predefined categories like zip codes, cities or neighbourhoods.
U.S. Pat. No. 6,789,102 has teachings designed for use in a vehicle and deals with repositioned data terminals.
U.S. Pat. No. 7,024,250 appears to use mobile phones as a part of the apparatus, with geospatial queries executing based on location updates from mobile devices.
U.S. Pat. No. 6,546,374 teaches a process where databases are searched and client/vendor information is exchanged so that a traditional transaction may commence. There is reference to relevant results based on “proximity.”
U.S. Pat. No. 6,473,692 pertains to a more efficient way to store landmarks at a geographic location that can be easily reverted to traditional location identifiers (e.g. Latitude/longitude).
While the prior art teaches how a plurality of known areas defined as polygons stored in various types of databases can be queried, it does not appear to teach dynamic addition, modification or removal of polygons by someone who is not trained in or familiar with the art. Without such an apparatus, which is provided in embodiments of the present invention, one or more administrators (an administrator being someone who is specially trained to perform advanced functions that are denied to or beyond the ability of a typical person using the invention) are required to import known polygon information. This is a problem when data associated with geographical points or zones, and the points and zones themselves, must be able to be added, removed, re-associated or otherwise modified simultaneously and in real time by a plurality of users. As the simultaneous user count increases, it becomes prohibitively expensive to hire, train and provide infrastructure to administrators who can effect the required changes on behalf of the users.
Prior art teaches that a point can be determined to be inside or outside of a polygon by extending an imaginary line from the point in one direction, parallel to the x axis, and counting the number of times this line intersects with the boundaries of the polygon. If the number of intersections equals zero, the imaginary line is extended from the point in the opposite direction along the x axis. If the number of intersections for this line also equals zero, then the point is not inside the polygon. If, however, one of the imaginary lines does intersect with the polygon boundary, then a count totaling an even number of intersections confirms that the point is outside the polygon, while a count totaling an odd number of intersections indicates that the point is inside the polygon.
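By way of illustration only, the even/odd intersection count can be sketched in Python as below. This is a minimal sketch of the commonly used single-direction form of the test (a ray extended toward increasing x only), not the method of any particular cited patent; the function name `point_in_polygon` and the vertex-list representation are assumptions made for the sketch. Points lying exactly on a boundary are not treated specially.

```python
from typing import List, Tuple

Point = Tuple[float, float]

def point_in_polygon(x: float, y: float, vertices: List[Point]) -> bool:
    """Even-odd test: count crossings of a ray extended from (x, y)
    toward increasing x; an odd count means the point is inside."""
    inside = False
    n = len(vertices)
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]  # wrap around to close the polygon
        # Does this edge straddle the horizontal line through the point?
        if (y1 > y) != (y2 > y):
            # x coordinate where the edge meets that horizontal line
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x_cross > x:          # crossing lies on the ray
                inside = not inside  # each crossing flips the parity
    return inside

# Hypothetical L-shaped (concave) polygon used only as sample data.
l_shape = [(0, 0), (4, 0), (4, 2), (2, 2), (2, 4), (0, 4)]
print(point_in_polygon(1, 3, l_shape))  # inside the upright arm
print(point_in_polygon(3, 3, l_shape))  # in the notch, outside
```

Toggling a flag on each crossing is equivalent to counting the crossings and testing whether the total is odd.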
Prior art also teaches that when a point must be checked against a plurality of polygons, the Minimum Enclosing Rectangle (or MER) technique is used. Simply put, the MER is the smallest possible rectangle that could be drawn around the polygon that would contain all of the polygon's vertices. By calculating in advance the MER for each polygon, a geospatial query can first check to see if the point is enclosed inside the MER. This is in effect a point-in-rectangle calculation. To test for point-in-rectangle, the x and y coordinates of the point are tested against the coordinates of P1 (having coordinates x1, y1) and P2 (having coordinates x2, y2), where P1 and P2 are located in opposite corners of the rectangle. A point-in-rectangle calculation is true when ((x1>=x>=x2 or x2>=x>=x1) AND (y1>=y>=y2 or y2>=y>=y1)).
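The MER and the point-in-rectangle condition stated above can be sketched as follows. The function names `minimum_enclosing_rectangle` and `point_in_rectangle` and the sample triangle are assumptions made for illustration; the comparison mirrors the stated condition and therefore accepts the two corners in either order.

```python
from typing import List, Tuple

Point = Tuple[float, float]

def minimum_enclosing_rectangle(vertices: List[Point]) -> Tuple[Point, Point]:
    """Return two opposite corners (lower left, upper right) of the
    smallest axis-aligned rectangle containing every vertex."""
    xs = [vx for vx, _ in vertices]
    ys = [vy for _, vy in vertices]
    return (min(xs), min(ys)), (max(xs), max(ys))

def point_in_rectangle(x: float, y: float, p1: Point, p2: Point) -> bool:
    """True when (x, y) lies within the rectangle whose opposite corners
    are p1 and p2, whichever corner happens to be given first."""
    x1, y1 = p1
    x2, y2 = p2
    return (x1 >= x >= x2 or x2 >= x >= x1) and (y1 >= y >= y2 or y2 >= y >= y1)

# Hypothetical triangle used only as sample data.
triangle = [(1, 1), (5, 2), (3, 6)]
p1, p2 = minimum_enclosing_rectangle(triangle)
print(p1, p2)
print(point_in_rectangle(3, 3, p1, p2))
```

Because the MER depends only on the polygon's vertices, it can be computed once when the polygon is stored and reused for every later query.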
Clearly, a point-in-rectangle search is extremely fast in comparison to a point-in-polygon search. It is even faster if it is known that all the MER's are stored using the same two corners (for example, the upper right point and the lower left point), because if x1 is known to always be greater than x2, there is no need to also calculate the x2>=x>=x1 portion of the calculation (the same applies to y1 and y2). Thus, a geospatial query will only perform the more time intensive calculations on a polygon that is first found to have a MER that contains the point in question. A further advantage of the MER is that it acts as an artificial boundary when extending the imaginary line away from the point during the point-in-polygon calculations. In other words, the imaginary line need not extend beyond the bounds of the rectangle, as by definition the polygon will not have boundaries outside of its MER.
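The two-stage check described above can be sketched as follows, assuming every MER is stored in a fixed corner order (lower left, then upper right) so that only one comparison per bound is needed. The names `mer_of`, `in_mer`, `crossings_odd` and `find_containing`, and the sample polygons, are hypothetical and are used only for this sketch.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, float]
Rect = Tuple[float, float, float, float]  # (min_x, min_y, max_x, max_y)

def mer_of(vertices: List[Point]) -> Rect:
    """Precompute the MER in a fixed corner order."""
    xs = [vx for vx, _ in vertices]
    ys = [vy for _, vy in vertices]
    return (min(xs), min(ys), max(xs), max(ys))

def in_mer(x: float, y: float, mer: Rect) -> bool:
    """Cheap test; the fixed corner order removes the 'or' branches."""
    min_x, min_y, max_x, max_y = mer
    return min_x <= x <= max_x and min_y <= y <= max_y

def crossings_odd(x: float, y: float, vertices: List[Point]) -> bool:
    """Costlier even-odd point-in-polygon test (ray toward increasing x)."""
    inside = False
    n = len(vertices)
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]
        if (y1 > y) != (y2 > y):
            if x1 + (y - y1) * (x2 - x1) / (y2 - y1) > x:
                inside = not inside
    return inside

def find_containing(x: float, y: float,
                    polygons: Dict[str, List[Point]],
                    mers: Dict[str, Rect]) -> List[str]:
    """Run the polygon test only where the MER already contains the point."""
    hits = []
    for name, vertices in polygons.items():
        if not in_mer(x, y, mers[name]):
            continue  # rejected by the cheap test; polygon test skipped
        if crossings_odd(x, y, vertices):
            hits.append(name)
    return hits

# Hypothetical sample data.
polygons = {
    "triangle": [(0, 0), (4, 0), (0, 4)],
    "square": [(10, 10), (12, 10), (12, 12), (10, 12)],
}
mers = {name: mer_of(v) for name, v in polygons.items()}
print(find_containing(1, 1, polygons, mers))
print(find_containing(3, 3, polygons, mers))  # inside the MER, outside the triangle
```

The point (3, 3) shows why the second stage is still required: it falls inside the triangle's MER but outside the triangle itself.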
When dealing with extremely large numbers of MER's, the prior art also teaches that a well known spatial access method called an R-Tree can be used to cluster groups of MER's in close proximity into larger MER's (called “nodes”). These larger MER's can in turn be contained in even larger MER's and so on until the highest level of nodes (known as the root nodes) are reached. Root nodes are the largest MER's in the tree and combined encompass the entire collection of MER's in the database.
Using geospatial querying in a database that uses the R-tree structure adds a relatively small number of additional point-in-rectangle calculations in order to bypass what could be a number of calculations several orders of magnitude greater. This is because point-in-rectangle searches of the root nodes will only perform further searches on “child nodes” (the MER's that are contained within the node being searched) contained within a node that has been found to contain the point being queried. Each child node may have child nodes of its own, depending on how deeply nested the R-tree is. As child nodes are accessed deeper in the nested structure of the R-tree, “leaf nodes” are encountered. A leaf node contains (or points to the location of) the MER and the vertices of a polygon. The point-in-rectangle calculations performed on leaf nodes will return a result of “point not found” if none of the leaf nodes' MER's contains the point. If one or more leaf nodes' MER's are found to contain the point, then the point-in-polygon calculations are used to determine which, if any, of the polygons contain the point. It is not uncommon that a geospatial query may have to search along more than one “branch” (another way of describing a chain of nested nodes extending from a root node) of an R-tree. However, once all branches end in leaf nodes, the query is complete and a result of none, one or many polygons found to contain the point is returned.
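The descent through nested nodes can be sketched as follows. This sketch covers only the search over an already built tree; it does not show how an R-tree is constructed or balanced, and the `Node` class, the `search` function and the hand-built sample tree are assumptions made for illustration. The leaves returned are candidates only, which would then be confirmed with a point-in-polygon calculation.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

Rect = Tuple[float, float, float, float]  # (min_x, min_y, max_x, max_y)

@dataclass
class Node:
    mer: Rect
    children: List["Node"] = field(default_factory=list)
    polygon_id: Optional[str] = None  # set only on leaf nodes

def contains(mer: Rect, x: float, y: float) -> bool:
    min_x, min_y, max_x, max_y = mer
    return min_x <= x <= max_x and min_y <= y <= max_y

def search(nodes: List[Node], x: float, y: float) -> List[str]:
    """Return the ids of leaf nodes whose MER contains the point,
    descending only into nodes whose own MER contains it."""
    hits: List[str] = []
    for node in nodes:
        if not contains(node.mer, x, y):
            continue  # the whole branch is skipped
        if node.polygon_id is not None:
            hits.append(node.polygon_id)  # leaf: candidate polygon
        else:
            hits.extend(search(node.children, x, y))
    return hits

# Hypothetical hand-built tree: two root nodes, three leaf nodes.
roots = [
    Node((0, 0, 10, 10), [
        Node((0, 0, 4, 4), polygon_id="A"),
        Node((3, 3, 10, 10), polygon_id="B"),
    ]),
    Node((20, 20, 30, 30), [
        Node((21, 21, 25, 25), polygon_id="C"),
    ]),
]
print(search(roots, 3.5, 3.5))
print(search(roots, 15, 15))
```

In the first query only the first root is descended into, and both of its leaves are returned because their MER's overlap at the queried point; in the second query neither root contains the point, so no leaf is examined at all.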
When a geospatial query is only required to find a point inside a plurality of polygons that are known to be nonintersecting, or inside polygons that are known to overlap in only a limited number of situations, the prior art is sufficient for the execution of the query. However, embodiments of the present invention permit users to create polygon boundaries that can overlap the polygons already stored in the database. As a result, an indefinite number of polygons can be defined by users that are close to, partially overlapping or completely containing (or contained by) an indefinite number of other polygons previously defined.
Accordingly, there remains room for improvement in geospatial query techniques to address some of the shortcomings of the prior art.