Future-proof Data Storage - 25/07/2014

File and Database Storage Methods for Huge Point Clouds

Martin Kodde, contributing editor, GIM International

Point cloud datasets have proven to be useful for many applications, ranging from engineering design to asset management. While point clouds are becoming denser and more accurate, new software is allowing an ever-broader user group for these datasets. However, due to their size and dependence on specialised tools, data management of point clouds is still complicated. A major consideration for data managers is the choice of point cloud storage format, as several different formats are available.

Point cloud data can be collected in numerous ways, with laser scanning being the most common method. In terms of data management, it is useful to distinguish between dynamic and static acquisition. Examples of dynamic acquisition are airborne and mobile Lidar, while static acquisition can be achieved with terrestrial laser scanning. Static datasets result in ‘organised point clouds’, which means that the interval between subsequent points is constant. This knowledge can be employed by storing the scan data in a raster where each cell corresponds to a laser return. Rasters can be stored and queried efficiently, thus simplifying the point cloud storage problem. Manufacturers of terrestrial scanners have introduced their own formats for storing points in this way. However, this method of storage is no longer applicable when registering multiple point clouds or when using data from dynamic acquisition. Instead, the real 3D coordinates for each point need to be saved individually. It is this [E1] type of data that poses the biggest challenges in terms of storage size and performance.

File or database

In the world of traditional vector GIS data, there has been a shift from file-based storage to databases. This is not yet the case for point cloud data; datasets are most commonly stored as a set of files on a local drive or a shared network location. The two major spatial databases currently on the market, Oracle Spatial and PostGIS, do provide point cloud support but their functionality is still limited and does not yet scale very well with data size. Database providers are actively working on improving point cloud support. Until such support is sufficiently reliable, file-based storage is recommended. Therefore, in practice, organisations store their point clouds subdivided into files with tiles or strips.

Data model

In the field of GIS it is common to separate the semantics of data from the actual storage, and this is also a relevant consideration for point clouds. The geometrical part of a point cloud is clear: each point is defined by a set of 3 coordinates. In addition, each point can be enriched by attributes such as point colour or intensity. There is no official semantic definition available for point clouds, but the LAS file standard does implicitly define a data model. The LAS file format was defined by the American Society for Photogrammetry and Remote Sensing (ASPRS) in 2003 and has grown to be the de facto standard in practice. The specifications of this file format define which attributes can be stored for each point, and these include class code, colour, time, flight line and pulse count.

 

Standards

The naive way to store a point cloud would be to generate a regular text file, providing one point per row with coordinates separated by a pre-defined character. This is a convenient format that can easily be read by many applications. However, the resulting files will be large, and data exchange can be unpredictable due to misunderstandings in the meaning of the fields in the file. Hence, various organisations have tried to standardise the storage of point clouds efficiently. The abovementioned LAS format was developed by users from the airborne Lidar community, which resulted in a file format that was well designed for such datasets. Over time, the file format also found its use for mobile Lidar, terrestrial laser scanning and point clouds from photogrammetric dense matching. Since LAS is a binary format, it results in smaller file sizes than simple ASCII storage.

Continue reading in the online edition of GIM International.

 
Last updated: 26/04/2017