Technical Field
The present disclosure relates to computerized systems and methods for data processing and, more generally, to search and information retrieval technologies. By way of example, and without limitation, the present disclosure relates to computerized systems and methods for indexing recurrent calendar event information, and for scoring and providing search results including this information.
Background
Use of information retrieval services, such as search engines, has grown significantly over the last decade. People can now submit queries and access information using a variety of devices, such as personal computers, laptops, tablets, personal digital assistants (PDAs), personal organizers, mobile phones, smartphones, televisions, and other devices. Queries for information can be performed locally on a device, or over a network such as the Internet. With increased access to such technologies over a wide variety of devices, people have become more reliant than ever on applications and services for accessing desired information.
Many information retrieval systems, such as Internet search engines, operate by identifying terms of a search string, and comparing the identified terms against an index of documents. For example, a provider of search services may collect, parse, and store data from a collection of documents, such as web pages on the World Wide Web, in an index. The index may facilitate the fast and accurate retrieval of relevant documents based on queries from users. Without such an index, a search engine would have to scan through every document in the collection for each query, which would require considerable time and/or processing power for a large collection of documents.
Some search engines use an inverted index to identify documents that include a word or phrase matching a query term. An example of an inverted index 100 is illustrated in FIG. 1. By using an inverted index, a search engine can identify each document that contains each term of the query. For a search query including the term “pizza,” for example, a search engine using the exemplary index of FIG. 1 would identify documents 1 and 3 as including this term. Once the documents containing a term are identified, they can be ranked based on one or more of a variety of factors, such as location of the term in the document, frequency of the appearance of the term in the document, etc.
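By way of illustration only, the lookup described above can be sketched in Python as follows. The index contents, the function names documents_for_term and documents_for_query, and the choice to intersect the per-term results are assumptions made for this sketch; only the mapping of “pizza” to documents 1 and 3 follows the example in the text, and the sketch is not a reproduction of FIG. 1.

```python
# Hypothetical inverted index: term -> sorted list of document IDs.
# All entries are illustrative; only "pizza" -> documents 1 and 3
# mirrors the example given in the text.
INVERTED_INDEX = {
    "pizza": [1, 3],
    "pasta": [2, 3],
    "delivery": [1, 2],
}


def documents_for_term(index, term):
    """Return the IDs of documents that contain the given term."""
    return index.get(term.lower(), [])


def documents_for_query(index, query):
    """Return the IDs of documents that contain every term of the query."""
    terms = query.lower().split()
    if not terms:
        return []
    # Start from the postings of the first term, then intersect the rest.
    result = set(documents_for_term(index, terms[0]))
    for term in terms[1:]:
        result &= set(documents_for_term(index, term))
    return sorted(result)
```

In this sketch, a single-term query is answered with one dictionary lookup, and a multi-term query is answered by intersecting the document lists of its terms, so no document text needs to be scanned at query time.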
An inverted index may be created by parsing each document in a collection to identify the terms included in the document. For example, computer systems can be programmed to identify certain sequences of characters as terms (e.g., words, phrases, or other elements, such as HTML code). The terms can then be associated with the document in the inverted index.
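A minimal sketch of this parsing step is given below, assuming a deliberately simple rule in which any run of letters or digits is treated as a term. The sample documents and the names tokenize and build_inverted_index are hypothetical and serve only to illustrate the idea.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Split text into lowercase terms (assumed rule: runs of letters/digits)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_inverted_index(documents):
    """Map each term to the sorted list of IDs of documents containing it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for term in tokenize(text):
            index[term].add(doc_id)
    return {term: sorted(doc_ids) for term, doc_ids in index.items()}


# Hypothetical document collection, keyed by document ID.
documents = {
    1: "Best pizza in town",
    2: "Fresh pasta daily",
    3: "Pizza and pasta specials",
}
index = build_inverted_index(documents)
```

A production system would likely apply a richer notion of a term (phrases, markup elements, and so on); the regular expression here is only a stand-in for that step.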
In addition to storing identified terms, the index may store other information regarding each term, such as a location of where the term appeared in the document, the part of speech of the term (e.g., noun, verb), etc. This additional information can be used in ranking documents, including web pages. For example, for a search query that includes two terms next to each other, such as “George Washington,” location information may be used to rank a document having the terms located next to each other with a higher ranking than a document that also contains the two terms, but in different locations in the document.
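The use of location information in ranking can be sketched as follows. This is an illustrative example under simplifying assumptions: locations are word offsets within the document, only two-term queries are handled, and a document in which the terms are adjacent is simply placed ahead of one in which they are not. The function names and sample documents are hypothetical.

```python
from collections import defaultdict


def build_positional_index(documents):
    """Map each term to {doc_id: [word positions of the term in that doc]}."""
    index = defaultdict(dict)
    for doc_id, text in documents.items():
        for position, term in enumerate(text.lower().split()):
            index[term].setdefault(doc_id, []).append(position)
    return index


def rank_two_term_query(index, first, second):
    """Rank documents containing both terms; adjacent occurrences rank first."""
    first_postings = index.get(first.lower(), {})
    second_postings = index.get(second.lower(), {})
    scored = []
    for doc_id in first_postings.keys() & second_postings.keys():
        second_positions = set(second_postings[doc_id])
        # Adjacent means `second` appears immediately after `first`.
        adjacent = any(p + 1 in second_positions for p in first_postings[doc_id])
        scored.append((0 if adjacent else 1, doc_id))
    return [doc_id for _, doc_id in sorted(scored)]


# Hypothetical documents, keyed by document ID.
documents = {
    1: "george washington was the first president",
    2: "washington state is home to george and his family",
    3: "george is a common name",
}
positional_index = build_positional_index(documents)
ranking = rank_two_term_query(positional_index, "George", "Washington")
```

Here document 1 ranks ahead of document 2 because the two terms appear next to each other, and document 3 is excluded because it contains only one of the terms.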
In order to provide the ability to search the most current information on the Internet, search providers may continuously update the index. For example, search providers may continuously retrieve and re-index web pages to account for changes in their content. Such retrieval of web pages is known as “crawling” the web.
An example of a process 200 for returning web search results based on a query is illustrated in FIG. 2. In step 210, a query is sent from a client to a web server. In step 220, the terms of the query are sent to one or more index servers. The index servers identify which web pages contain each of the query terms. In step 230, the identified web pages may be retrieved from one or more servers. These web pages may be ranked in order of relevance based on a variety of factors, as noted above. Furthermore, a portion of the text, or “snippet,” from the document may be retrieved for each of the search results. For example, a portion of the text surrounding the query term in the document may be retrieved as a snippet to provide a client with a context for a search result. In step 240, links to the documents may be provided as search results to the client, along with the snippets.
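The snippet retrieval mentioned above can be illustrated with the following sketch, which assumes that a snippet is a fixed number of words on either side of the first occurrence of the query term. The function name make_snippet, the context_words parameter, and the sample text are hypothetical.

```python
def make_snippet(text, term, context_words=3):
    """Return the words surrounding the first occurrence of term, or None."""
    words = text.split()
    for i, word in enumerate(words):
        # Ignore surrounding punctuation and case when matching the term.
        if word.lower().strip(".,;:!?\"'") == term.lower():
            start = max(0, i - context_words)
            end = min(len(words), i + context_words + 1)
            prefix = "... " if start > 0 else ""
            suffix = " ..." if end < len(words) else ""
            return prefix + " ".join(words[start:end]) + suffix
    return None


# Hypothetical document text.
page_text = (
    "Our restaurant serves wood-fired pizza every evening "
    "from five until late."
)
snippet = make_snippet(page_text, "pizza", context_words=2)
```

With these inputs, the snippet is "... serves wood-fired pizza every evening ...", which gives the client some context for why the document matched the query.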
Web pages are not the only type of information stored on the Internet or accessible through search engines. Today, people store all types of information, including calendar information, documents, photos, social networking information, and much more. It would be useful to provide an index for the quick and accurate retrieval of this information using search engines. However, generating an index can be complicated for certain types of information. Such complications arise, for example, when indexing calendar events and providing search capabilities for them.
The use of electronic calendar programs is common today, and many people rely on electronic calendars to organize their daily commitments. Electronic calendars, such as Google Calendar, allow users to store calendar events on network servers, so that they can be retrieved from anywhere and from a variety of different devices. Such calendars typically allow various types of calendar events to be saved, modified, or deleted.
One type of calendar event is an event with a single occurrence at a particular date and time. Such an event may be stored as a data entry including attributes for information describing the event, such as a title of the event, description of the event, comments, list of participants attending the event, location of the event, date of the event, and start and end time of the event. An exemplary illustration of such a data entry is provided in FIG. 3A.
Another type of calendar event is a recurrent event. For example, a user may wish to schedule an event that occurs repeatedly, such as a meeting that occurs on Wednesday from 2:00 p.m. to 3:00 p.m. every week. Rather than requiring the user to create an entry for every instance of the recurring meeting, many calendar applications allow the user to enter the information for the event once, and to set an attribute that causes the event to recur at the desired interval. Such an event may be stored as a data entry including attributes for all of the information normally associated with a single event occurrence, as well as attributes indicating that the event recurs, a start and end day for the recurrence, and a pattern of its recurrence. Such a data entry may be called a master data entry. The calendar application can use the recurrence pattern to compute the individual dates of the event, as needed. An exemplary illustration of such a master data entry is provided in FIG. 3B.
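For illustration, a master data entry and the computation of its individual dates may be sketched as below. The sketch assumes a weekly recurrence on a single weekday and an inclusive end day; the class MasterEntry, its field names, and the function expand_occurrences are hypothetical and are not taken from FIG. 3B or from any particular calendar application.

```python
from dataclasses import dataclass
from datetime import date, time, timedelta


@dataclass
class MasterEntry:
    """Hypothetical master data entry for a weekly recurring event."""
    title: str
    start_time: time
    end_time: time
    recurrence_start: date  # first day on which the event may occur
    recurrence_end: date    # last day on which the event may occur (inclusive)
    weekday: int            # recurrence pattern: 0 = Monday ... 6 = Sunday


def expand_occurrences(entry):
    """Compute the individual dates of the event from its recurrence pattern."""
    # Advance from the start day to the first matching weekday.
    days_ahead = (entry.weekday - entry.recurrence_start.weekday()) % 7
    current = entry.recurrence_start + timedelta(days=days_ahead)
    dates = []
    while current <= entry.recurrence_end:
        dates.append(current)
        current += timedelta(weeks=1)
    return dates


# A meeting every Wednesday from 2:00 p.m. to 3:00 p.m. during January 2024.
meeting = MasterEntry(
    title="Team meeting",
    start_time=time(14, 0),
    end_time=time(15, 0),
    recurrence_start=date(2024, 1, 1),
    recurrence_end=date(2024, 1, 31),
    weekday=2,  # Wednesday
)
occurrences = expand_occurrences(meeting)
```

The single stored entry yields five dates (January 3, 10, 17, 24, and 31, 2024) only when expanded, which is why the individual occurrences are not directly available as separate entries.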
As noted above, it would be useful to index calendar events in order to provide for the quick and accurate retrieval of calendar information. However, the existence of recurring calendar events makes it difficult to create such an index. Accordingly, an efficient solution is needed for indexing and providing information regarding recurrent calendar events.