1. Field of the Invention
The present invention relates to a method and system for compressing and retrieving Light Detection and Ranging output data and, more specifically, to a method and system for compressing Light Detection and Ranging output data by Run Length Encoding or by lossless compression, and for rapidly accessing the compressed data, filtered by attribute, without the need to read or decompress the entire collection of data.
2. Description of the Related Art
LiDAR is an acronym for Light Detection and Ranging. As it pertains to the geospatial industry, LiDAR generally refers to an airborne, near-infrared laser that scans the surface of the earth to produce highly accurate horizontal and vertical data points that define the shape of the earth and the elevations of above-ground features. One benefit of LiDAR is that it can be collected either during daylight or at night. Once “raw” data has been collected, a series of semi-automated software techniques is used to clean up the data to produce a uniformly spaced set of data points that can then be used to generate accurate terrain and/or surface models. LiDAR output data is typically stored in the industry standard LAS file format. The LAS specification is published by the industry consortium known as the American Society for Photogrammetry and Remote Sensing (ASPRS). The current released version of the LAS specification is 1.4, which contains record formats 0-10.
Typical LAS files contain from 1 million to more than 1.5 billion points. To give a sense of how these numbers relate to file size and data storage requirements, one must consider the parameters used when specifying LiDAR data delivery requirements. LiDAR “collects,” or data collection missions, are tailored to meet specifications that can be unique to a specific project. Parameters that affect output file sizes include the following: Point Density/Spacing (the relative spacing between measured points and the total number of points in a given area, typically 1 square meter); Multiple Returns (multiple returns provide information pertaining to the distance to the measured surface and the return signal strength from the reflecting object); Pulse Rate (the speed at which the laser emits pulses of light; higher pulse rates yield increased point density); and Altitude (the altitude and velocity of the aircraft directly affect the point density, field of view (size of the laser spot on the ground), and pulse rate settings; flight plans must also consider air traffic control regulations and traffic conditions).
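As a rough illustration of how point counts translate into storage, the following Python sketch multiplies a point count by a fixed record size. The function names are hypothetical, and the 20-byte default is an assumption corresponding to LAS point data record format 0; other record formats are larger, and the file header and any variable length records are ignored here.

```python
def points_for_area(area_m2, points_per_m2):
    """Approximate number of points collected over an area at a given density."""
    return int(area_m2 * points_per_m2)


def estimate_point_data_bytes(point_count, record_bytes=20):
    """Approximate size of the point records alone.

    record_bytes=20 is an assumption (LAS point data record format 0);
    header and variable length records are not counted.
    """
    return point_count * record_bytes


if __name__ == "__main__":
    # The upper end of the range quoted above: 1.5 billion points.
    size = estimate_point_data_bytes(1_500_000_000)
    print(f"{size / 1e9:.1f} GB of point records")
```

Under these assumptions, 1.5 billion points occupy about 30 GB before any per-point attributes beyond the base record are added.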
LAS datasets are commonly used to create digital surface models, contours, intensity images, and 3D renderings for a wide range of applications. Examples include base mapping and contour generation, support for orthorectification of aerial imagery, floodplain mapping, natural resource management, transportation and utility corridor mapping, and urban modeling and planning.
LAS datasets, if not cut into manageable tiles (that is, gridded files), can grow to multiple terabytes at full resolution and can benefit from a compressed data structure. Currently, most local and state government sponsored projects use LiDAR specifications developed by the Federal Emergency Management Agency (FEMA) and published in 2000. The American Society for Photogrammetry and Remote Sensing (ASPRS) is another common reference; its Guidelines for Vertical Accuracy Reporting for LiDAR Data were produced in 2004 and incorporate relevant sections of the National Digital Elevation Program's Guidelines for Digital Elevation Data. These guidelines provide recommendations for scaling data collection parameters to best match the intended application, thereby saving collection costs.
While the LAS specification is helpful in standardizing data between vendors and producers, it is not particularly efficient. Its primary goal is readability of data to facilitate an easy exchange of information between subject matter experts in the geospatial domain.
An LAS file is structured to contain all “points” in series as shown in FIG. 1. It is organized in what can be called a row-first format. If a point attribute is defined in the LAS record format being used, it takes up space in the file, even if there is no information to be conveyed. There is no concept of what is commonly referred to as a null pointer. For example, the User Data field takes only one byte in each record. If there is no user data to convey, the field is typically filled with a zero value, which still takes one byte. Even if the collection has only a relatively small number of points, such as 10 million, that is still 10 million bytes of storage wasted.
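The row-first layout and the cost of an unused attribute can be sketched in Python as follows. This is a minimal illustration, not the method of the invention: the field layout is an assumption modeled on LAS point data record format 0 (20 bytes, little-endian, User Data at byte offset 17), and the function names are hypothetical. The run-length encoder is a generic textbook version, included only to show how a column of identical values collapses.

```python
import struct

# Assumed layout of one point record (modeled on LAS record format 0):
# X, Y, Z (int32), intensity (uint16), return flags (uint8),
# classification (uint8), scan angle rank (int8), user data (uint8),
# point source ID (uint16) = 20 bytes.
POINT_RECORD = struct.Struct("<lllHBBbBH")
USER_DATA_OFFSET = 17  # byte position of User Data inside each record


def pack_point(x, y, z, intensity=0, flags=0, classification=0,
               scan_angle=0, user_data=0, source_id=0):
    """Serialize one point; every field occupies space even when it is zero."""
    return POINT_RECORD.pack(x, y, z, intensity, flags, classification,
                             scan_angle, user_data, source_id)


def user_data_column(buffer):
    """Pull the User Data byte out of every row-first record."""
    return buffer[USER_DATA_OFFSET::POINT_RECORD.size]


def run_length_encode(values):
    """Collapse consecutive equal values into (value, count) pairs."""
    runs = []
    for value in values:
        if runs and runs[-1][0] == value:
            runs[-1] = (value, runs[-1][1] + 1)
        else:
            runs.append((value, 1))
    return runs


if __name__ == "__main__":
    # 1,000 points with no user data: the field still costs 1,000 bytes.
    buffer = b"".join(pack_point(i, i, i) for i in range(1000))
    column = user_data_column(buffer)
    print(len(buffer), len(column), run_length_encode(column))
```

In this sketch the 1,000 zero-valued User Data bytes reduce to a single (value, count) pair once the attribute is viewed as a column rather than interleaved with the other fields of each record.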