Accurate estimation of origin/start and destination, hereinafter called OD estimation, is of interest in a number of use cases. It is, for example, of importance as basic data when planning a public transport system and when planning investments in new infrastructure like railways, trains, roads and buses.
Today, OD estimation is generally performed in one of two ways: either by surveys among people or by measurements of, e.g., the number of road vehicles that pass a certain place (link traffic).
Surveys have some advantages over other techniques. They can, for example, provide information on why a trip was made, how many people traveled in the same car, and whether they would have used public transportation if it had been available. A downside of surveys is that they are very expensive. Another downside of manually collected surveys is that people tend to forget how they traveled, when they left their place of origin/starting point, when they arrived, etc. Sometimes people also lie about their traveling for one reason or another. All in all, surveys might not give a good foundation on which to build accurate estimations.
Link traffic flows can be obtained from measurement devices, such as cameras and inductive sensors, installed along most major roads and the data from these measurement devices could be used for OD estimation. Even though this is probably cheaper than surveys, it still requires installation and maintenance of measurement devices. Also, it can only give an OD estimation along roads where there are measurement devices installed and the result will only be a rough estimate.
The estimate is often in the form of a two-dimensional OD matrix having, for example, the destinations on one axis and the origins on the other. The values in the matrix relate to the frequency of each O-D pair, e.g. how many vehicles started from a certain origin and ended their trip at a certain destination. Typically, there is an OD matrix for a certain period of time. Thus, in a three-dimensional matrix, a third axis could represent the time period.
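As an illustration of the matrix layout described above, the following minimal sketch counts trips per time period, origin and destination. The `Trip` record, the zone names, the hour-based periods and the function `build_od_matrices` are assumptions made for this example only and are not taken from any of the cited methods.

```python
from collections import namedtuple

# Hypothetical trip record: origin zone, destination zone, departure hour (0-23).
Trip = namedtuple("Trip", ["origin", "destination", "hour"])


def build_od_matrices(trips, zones, periods):
    """Count trips per (period, origin, destination).

    `periods` is a list of (start_hour, end_hour) half-open intervals.
    Returns od[p][o][d], where o and d index into `zones`.
    """
    index = {zone: i for i, zone in enumerate(zones)}
    n = len(zones)
    # One n-by-n matrix per time period: the third axis is time.
    od = [[[0] * n for _ in range(n)] for _ in periods]
    for trip in trips:
        for p, (start, end) in enumerate(periods):
            if start <= trip.hour < end:
                od[p][index[trip.origin]][index[trip.destination]] += 1
                break
    return od


zones = ["A", "B", "C"]
periods = [(6, 10), (10, 16)]  # assumed periods: morning peak, midday
trips = [
    Trip("A", "B", 7),
    Trip("A", "B", 8),
    Trip("B", "C", 9),
    Trip("C", "A", 12),
]
od = build_od_matrices(trips, zones, periods)
print(od[0])  # morning matrix: rows are origins, columns are destinations
```

Here each entry `od[p][o][d]` is the number of trips that started in zone `o` and ended in zone `d` during period `p`.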
The idea of making an OD estimation with the help of a cellular telecommunications network has been disclosed in [1] N. Caceres et al., “Deriving origin-destination data from a mobile phone network”, IET Intell. Transp. Syst., Vol. 1, No. 1, pp. 15-26, March 2007 and [2] K. Sohn and D. Kim, “Dynamic Origin-Destination Flow Estimation Using Cellular Communication System”, IEEE Transactions on Vehicular Technology, Vol. 57, No. 5, September 2008. In the two studies, OD matrices are calculated or enhanced in a simulation model by use of cellular network data. One approach taken is to use cellular network data to obtain link traffic counts, and then use the counts in the calculation of an OD matrix. A big advantage of the methods described in [1] and [2] from a privacy perspective is that it is very hard, if not impossible, to identify a single individual, at least as long as the link traffic flows on busy roads are studied. A drawback with e.g. [2] is, however, that since handover information is used to estimate link traffic volume, the method can only be applied to mobile phones that are in a call or sending/receiving data. It can also only give a very approximate solution, since the problem of finding OD matrices from traffic link counts is underdetermined: many different OD matrices can produce the same link counts. The simulation also assumes that a mobile telephone is always connected to the closest base station, which is a simplification that is not quite true.
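Why link counts alone leave the OD matrix underdetermined can be shown with a small made-up example. The sketch below assumes a single road passing three zones A, B and C with one counting point on each link, and traffic in one direction only; the network, the flows and the function `link_counts` are illustrative assumptions and do not come from [1] or [2].

```python
def link_counts(od):
    """Map OD flows to counts on the two links of a road A - B - C.

    `od` holds one-directional flows keyed by (origin, destination).
    A trip from A to C traverses both links.
    """
    ab = od[("A", "B")] + od[("A", "C")]
    bc = od[("B", "C")] + od[("A", "C")]
    return ab, bc


# Two clearly different travel patterns (assumed numbers).
short_trips = {("A", "B"): 10, ("B", "C"): 10, ("A", "C"): 0}
through_trips = {("A", "B"): 0, ("B", "C"): 0, ("A", "C"): 10}

print(link_counts(short_trips))
print(link_counts(through_trips))
```

Both patterns yield identical counts on both links, so the counts cannot distinguish twenty short trips from ten through trips. With three unknown flows and only two measurements, additional information is needed to select one OD matrix.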
Another approach to tracking people using mobile phones is disclosed in [3] Y. Asakura and E. Hato, “Tracking survey for individual travel behavior using mobile communication instruments”, Transportation Research Part C 12, 273-291, 2004. This article discloses a methodology where applications are installed on mobile communication instruments (not mobile phones) in order to collect data about which base station the mobile communication instruments are connected to. Using that data and an algorithm, it is determined whether a person is standing still or moving. A similar algorithm is also known from [4] J. H. Kang et al., “Extracting Places from Traces of Locations”, Mobile Computing and Communications Review, Volume 9, No. 3, July 2005. However, [3] never addresses the problem of sparser data, which would be the case with data obtained in a cellular network today. In [3], special devices are used to collect data on the signal strength from nearby base stations at given, regular and quite frequent times. The retrieved information is therefore much denser in time than in a network with cellular telephones for consumer use. Furthermore, there is no aggregation of data into something that could be utilized for OD estimation; in particular, no OD matrix is generated.
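To make the notion of deciding between standing still and moving more concrete, the following is a simple dwell-time heuristic operating on a sequence of base-station observations. It is not the algorithm of [3] or [4]; the function `stationary_periods`, the threshold `min_dwell`, the cell identifiers and the sample trace are all assumptions chosen for illustration.

```python
def stationary_periods(observations, min_dwell):
    """Return (cell_id, start, end) for runs on one cell lasting >= min_dwell.

    `observations` is a time-ordered list of (timestamp_seconds, cell_id).
    Runs shorter than `min_dwell` seconds are treated as movement.
    """
    periods = []
    run_start = 0
    for i in range(1, len(observations) + 1):
        # A run ends at the end of the data or when the cell changes.
        if i == len(observations) or observations[i][1] != observations[run_start][1]:
            start_time = observations[run_start][0]
            end_time = observations[i - 1][0]
            if end_time - start_time >= min_dwell:
                periods.append((observations[run_start][1], start_time, end_time))
            run_start = i
    return periods


# Assumed trace: regular, frequent samples as collected by a dedicated device.
trace = [(0, "cell_1"), (600, "cell_1"), (1200, "cell_1"),
         (1500, "cell_2"), (1800, "cell_3"),
         (2400, "cell_3"), (3600, "cell_3")]
print(stationary_periods(trace, min_dwell=900))
```

A heuristic of this kind relies on frequent observations. With the sparse data of a consumer cellular network, long gaps between observations make both the run boundaries and the dwell times uncertain.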