Digital Image Matching for Easy 3D Modelling - 28/06/2017

New Threat and Opportunity for Professional Surveyors

Normally when something is too good to be true, it is not true. But in the case of digital image matching (DIM), professional surveyors should look twice – because their clients, particularly those in the construction, infrastructure and 3D city mapping markets, certainly will. 'GIM International' asked John Taylor, who is responsible at Bentley Systems for reality modelling in Asia Pacific and in the global defence market, for his insights into DIM.

Digital image matching (DIM) did not arrive like a bolt from the blue. It has been around in the photogrammetric world for the past two decades but, through further improvements to filter approaches and thanks to better adaptation of algorithms for surface interpretation and object reconstruction, the technique has been refined to generate photo meshes out of 3D point clouds. It is now possible to produce 3D mesh models, point cloud data, digital surface models (DSMs) and true ortho images as data products by processing source data – primarily imagery and/or laser-scanned point clouds. The imagery can be obtained from aerial surveys, unmanned aerial vehicles (UAVs or ‘drones’), and ground-based mobile and static image capture using a variety of camera types, ranging from GoPro through to purpose-built air survey camera systems. Bentley is a prominent market player with its ContextCapture application, which is being embraced by a growing number of companies, including, recently, Topcon. GIM International asked Taylor for a peek inside the ‘black box’.

Pixel resolution

It is claimed that DIM makes it easy to produce 3D models using up to 300 gigapixels of photos taken with an ordinary camera, resulting in fine details, sharp edges and geometric accuracy. Getting straight to the point, Taylor is clearly very confident about the possibility of millimetre accuracy. “Accuracy will typically be twice the pixel resolution. The image sets consisting of aerial and terrestrial images need sufficient overlap, of course. Virtually any digital camera can be used. However, cameras with larger sensors and high-quality lenses will provide more information, allowing for the potential of better results. It should be noted that data of differing resolutions can be processed into a single model, so model accuracies may vary depending on the geospatial extent of the source data used in the processing.”
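To make the "twice the pixel resolution" rule of thumb concrete, the following minimal sketch computes the object-space pixel size (ground sample distance) from the standard pinhole-camera relation and then applies the factor of two quoted above. The function names, the example camera values and the default factor are illustrative assumptions, not part of ContextCapture; real-world accuracy also depends on overlap, lens quality and control.

```python
def ground_sample_distance(pixel_pitch_um, focal_length_mm, distance_m):
    """Object-space size of one pixel, in metres (pinhole-camera relation)."""
    pixel_pitch_m = pixel_pitch_um * 1e-6
    focal_length_m = focal_length_mm * 1e-3
    return pixel_pitch_m * distance_m / focal_length_m


def expected_accuracy(gsd_m, factor=2.0):
    """Rule-of-thumb accuracy: a multiple of the pixel resolution.

    The default factor of 2 follows the statement quoted in the article;
    it is a rough guide, not a guarantee.
    """
    return factor * gsd_m


def distance_for_gsd(target_gsd_m, pixel_pitch_um, focal_length_mm):
    """Capture distance (metres) needed to reach a target pixel resolution."""
    return target_gsd_m * (focal_length_mm * 1e-3) / (pixel_pitch_um * 1e-6)


if __name__ == "__main__":
    # Hypothetical camera: 4 micrometre pixels, 24 mm lens, 60 m from the object.
    gsd = ground_sample_distance(4.0, 24.0, 60.0)
    print(f"pixel resolution: {gsd * 100:.2f} cm")
    print(f"rule-of-thumb accuracy: {expected_accuracy(gsd) * 100:.2f} cm")
    print(f"distance for 1 mm pixels: {distance_for_gsd(0.001, 4.0, 24.0):.1f} m")
```

Under these assumed values, millimetre-level pixel resolution requires capturing from only a few metres away, which is why such accuracy is typically associated with close-range terrestrial or UAV imagery rather than conventional aerial surveys.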

The process of creating a 3D model or point clouds in ContextCapture starts with adding images and/or point clouds as the data sources. When photos are used, the images are automatically aerotriangulated through a process of image comparison. This process automatically extracts tie points, matches pairs and determines the orientation and positioning of the block of images. “The aerotriangulation process can be computed entirely without control or camera positions, or else controlled using the camera’s positional metadata or surveyed control points,” explains John Taylor. The aerotriangulated block can then be processed into a 3D model. This step determines the exact extent of the resultant data, and sets the desired coordinate system and data formats. It cleans up the point cloud and produces a triangulated mesh with a high geometric precision. The mesh is then textured from the photos, using the best resolution available among the various photos used in the process.
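Once two images have been oriented, a matched tie point can be located in 3D by intersecting the viewing rays from both camera positions. The sketch below illustrates that single geometric step with the textbook midpoint method; it is an assumption-laden illustration and not a description of how ContextCapture performs aerotriangulation, which the article does not detail. The function name and the example coordinates are hypothetical.

```python
import math


def _dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def triangulate_midpoint(c1, d1, c2, d2):
    """Intersect two viewing rays (camera centre c, direction d).

    Returns the midpoint of the shortest segment joining the rays and the
    length of that segment (a simple indicator of how well the rays agree).
    """
    w0 = tuple(p - q for p, q in zip(c1, c2))
    a, b, c = _dot(d1, d1), _dot(d1, d2), _dot(d2, d2)
    d, e = _dot(d1, w0), _dot(d2, w0)
    denom = a * c - b * b
    if abs(denom) < 1e-12:
        raise ValueError("rays are (nearly) parallel; no stable intersection")
    s = (b * e - c * d) / denom  # parameter along ray 1
    t = (a * e - b * d) / denom  # parameter along ray 2
    p1 = tuple(ci + s * di for ci, di in zip(c1, d1))
    p2 = tuple(ci + t * di for ci, di in zip(c2, d2))
    midpoint = tuple((u + v) / 2 for u, v in zip(p1, p2))
    return midpoint, math.dist(p1, p2)


if __name__ == "__main__":
    # Two hypothetical camera centres 10 m apart, both observing the same point.
    point, gap = triangulate_midpoint((0, 0, 0), (5, 2, 20), (10, 0, 0), (-5, 2, 20))
    print(point, gap)
```

In a full aerotriangulation, many thousands of such tie points are adjusted together with the camera orientations; the nearly parallel case rejected here corresponds to image pairs with too little baseline to give reliable depth.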

Additional detail

The biggest difference between ‘traditional’ digital photogrammetry and digital image matching software lies in the output. In standard photogrammetry, the result is usually a 2D product or perhaps a DSM, and most vector data products are an abstraction of the source content. DIM uses a combination of photogrammetry and computer vision to create realistic 3D models in mesh or point-cloud formats. However, for survey managers, the single biggest difference might be the efficiency of the production process. “The automated aerotriangulation and resultant production of data outputs has minimal resource requirements,” Taylor affirms. “While the overall processes are similar, the difference lies in the huge number of images that can be automatically processed, using relatively low-cost computing resources, in a fraction of the time taken using a specialised digital photogrammetric workstation. Additionally, various parts of a scene can be captured using different cameras and at different resolutions to enable the production of multi-resolution data outputs.” The benefit of this is that large areas can be captured at a lower resolution (e.g. 5-10cm), and then specific parts of the scene (buildings, utility infrastructure, etc.) can be captured at higher resolutions (e.g. 1mm-2cm) to provide a wealth of additional detail, which could not be managed easily using traditional digital photogrammetry. Multiple jobs can be loaded, set up and left to process outside of working hours as part of a prioritised job schedule, which is particularly useful when multiple (as-built) surveys are being captured daily.

Hybrid processing with laser scan data is also possible. “Laser scanning data does have advantages in low-light or night-time capture conditions, but can be noisy depending on airborne dust particles or moving objects in the scene during capture. Imagery clearly requires suitable lighting conditions (natural or artificial), but is less noisy in terms of processing. Being able to process both sources into a single model is clearly advantageous. Adding point cloud data is useful on any project where a combination of point clouds and photos are available. Our software will use the highest-quality data in these cases, making use of the points where they are dense and accurate and the photos where point data may be absent. It can also process laser scanning data into surfaces without imagery present, so there is some flexibility in how these data sources can be employed.”

The surveying profession

When talking about ContextCapture, Bentley claims “You just need somebody with a camera”. On the other hand, the company names the surveying industry as a main target group. How can DIM be commercially attractive for the surveying professional rather than adding another nail to the discipline’s coffin? Taylor is convinced of the joint opportunities in surveying and engineering: “New value opportunities for surveyors are provided throughout the infrastructure lifecycle by ‘continuous surveying’. That also goes for surveying in those complex situations where laser scanning has a much longer acquisition time and delivers a less dense survey while image interpretation is crucial.” As an example, he takes the London Bridge Station project in the UK (see below and Figure 2). “Given the age of the structure at London Bridge Station and the logistical limitations of laser scanning – a process that would take too long to complete with one or two (more expensive) scanners – the engineers leveraged photogrammetry for the initial survey and regular updates. It was less disruptive to the on-site workers, given the speed and size of a small digital camera to survey the site. Those are new kinds of services to deliver by surveying companies. They themselves or the client can use DIM to process the images into accurate 3D mesh models to facilitate decision-making or provide as-built documentation.”

With regard to complex conditions, he also refers to a smart city project in the city of Coatesville, USA. An engineering group was tasked with providing 3D design and conceptual planning services as part of the city’s ‘The Flats’ brownfield redevelopment, a rugged 30-acre former steel-mill site which contained hazardous materials. These conditions made it expensive (USD40,000) and potentially dangerous to perform a traditional on-site survey, so the engineers decided to use aerial photos and DIM. They took 750 aerial photos in 20 minutes. It took 8 hours to produce a 3D engineering-ready model and three days for a final engineered plan, thus achieving a significant saving for the city.

Data formats

ContextCapture supports different software workflows within a single organisation, including a variety of CAD and GIS platforms. Platforms that fully embrace 3D will often support the basic 3D mesh formats OBJ, OSGB and Collada, as these formats have been around for some years. That is the case for Esri, which has also chosen to develop its own 3D scene format, i3s; Bentley offers it as a standard export format. Specific CAD solutions are supported by formats such as Bentley’s 3MX, 3SM and DGN and Autodesk’s FBX. Additionally, as 3D GIS increasingly looks towards the web as the core delivery platform, Bentley has partnered with AGI to form the Cesium Consortium and to enable Cesium 3D tile export as standard. If the GIS still requires 2.5D data for analysis, ContextCapture provides editing capabilities that automate DEM extraction and then export the resultant DEM data to suit GIS import, either in text or Grid formats at user-selected data densities. Ortho images are another standard output that supports GIS data needs, and for systems that are point cloud capable, 3D data can be exported as a LAS point cloud file.
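As an illustration of what a text-based DEM export for GIS import can look like, the sketch below writes a small elevation grid in the widely used Esri ASCII grid layout (a six-line header followed by rows of values, northernmost row first). This is not ContextCapture's exporter; the function name, the sample heights and the three-decimal formatting are assumptions made for the example. The `cellsize` argument plays the role of the user-selected data density.

```python
def dem_to_ascii_grid(rows, xllcorner, yllcorner, cellsize, nodata=-9999.0):
    """Serialise a DEM as Esri ASCII grid text.

    rows      -- list of rows of heights, northernmost row first;
                 use None for cells without data
    xllcorner -- easting of the lower-left corner of the grid
    yllcorner -- northing of the lower-left corner of the grid
    cellsize  -- grid spacing in map units (the chosen data density)
    """
    if not rows or not rows[0]:
        raise ValueError("DEM must contain at least one cell")
    ncols = len(rows[0])
    if any(len(row) != ncols for row in rows):
        raise ValueError("all DEM rows must have the same length")

    def fmt(value):
        return f"{(nodata if value is None else value):.3f}"

    header = [
        f"ncols {ncols}",
        f"nrows {len(rows)}",
        f"xllcorner {xllcorner}",
        f"yllcorner {yllcorner}",
        f"cellsize {cellsize}",
        f"NODATA_value {nodata}",
    ]
    body = [" ".join(fmt(v) for v in row) for row in rows]
    return "\n".join(header + body) + "\n"


if __name__ == "__main__":
    # Hypothetical 2 x 3 grid at 0.5 m spacing with one missing cell.
    heights = [[12.4, 12.6, 12.9],
               [12.1, None, 12.7]]
    print(dem_to_ascii_grid(heights, 500000.0, 4420000.0, 0.5), end="")
```

A text grid like this is 2.5D by construction: each cell holds a single height, so overhangs and vertical facades captured in the 3D mesh cannot be represented, which is why the mesh and point cloud formats remain the primary outputs.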

London Bridge Station

Plans for London Bridge Station included reconstructing its concourse to include 15 new platforms, as well as establishing new retail stores and facilities. The Costain Group, which led the engineering project, needed an accurate 3D representation of the aging masonry structures to understand the subsurface for reconstruction potential. The model would also enable stakeholders to make better decisions on a tight schedule. While Costain had previously used laser scanners to survey and document site conditions precisely, it now experimented with DIM. Using a simple camera to capture the old surface area delivered a denser survey than a scanner would, and also provided colour, enabling designers to quickly distinguish the bricks from the mortar joints. The project team then used ContextCapture to process the images into accurate 3D mesh models that facilitated decision-making and provided documentation of existing conditions that could be used throughout the lifecycle of the infrastructure. The use of DIM technology reduced the time it took to collect data and eliminated the process bottleneck associated with sharing a scanner among two dozen surveyors. Moreover, it streamlined workflows and improved design efficiency.


Last updated: 26/07/2017