Towards All-in-one Photogrammetry - 26/04/2012

A Case Study of Dense Image Matching Using Oblique Imagery

Dieter Fritsch, Jens Kremer and Albrecht Grimm, Germany

Modern photogrammetric camera architectures provide high-resolution imagery in nadir and oblique views. This case study sets out to prove the potential of oblique imagery to deliver dense point clouds to be used for DEM, 3D city model and true orthophoto generation – an important step towards ‘all-in-one’ photogrammetry.

3D point clouds for DEM or orthophoto production are typically collected using airborne Lidar systems. The recent development of dense image matching methods provides an efficient alternative by deriving point clouds from imagery. Within this fully automated image processing pipeline, corresponding pixels between multiple images are determined. Each pixel represents a sight ray from the object through the focal point of the lens to this pixel on the image plane. Since the length of this ray is not known, the point must be observed in multiple images to determine its 3D position. This is achieved by intersecting the rays of the corresponding pixels, which yields a 3D point. By repeating this process for many pixels, a set of 3D points, also referred to as a point cloud, can be computed. In particular, dense image matching methods determine a correspondence for almost every pixel of the overlapping imagery. Thus, very dense point clouds can be derived by Semi Global Matching (SGM) algorithms, where the ground resolution of the points is close to the ground sampling distance (GSD) of the image dataset. The accuracy also depends strongly on this ground sampling distance, which allows the flight parameters to be selected according to the requirements of the specific application. Although SGM delivers excellent results for the nadir case, its suitability for airborne oblique imagery has yet to be proven.

The modular DigiCAM aerial camera system consists of one or more camera modules. Each camera module is a fully equipped camera and can be operated as a single unit (Figure 1). Depending on the requirements of the aerial survey mission, single modules can be bundled to fulfil different tasks. The modular concept of sensor management and data storage allows an operation of the bundled multi camera system as one camera with one single Graphical User Interface. If the camera configuration is changed, the modification of the mechanical configuration is reproduced by the configuration of the Graphical User Interface.

For the evaluated project, two different configurations of a four-camera system were used. To optimise the configurations for the aerial image products, the operator of the system used the cameras with 100mm lenses for the nadir case and with 80mm lenses for the oblique case.

The Quattro DigiCAM installation (Figure 2, left) creates four near-nadir images in a two-by-two pattern with small overlap. The images are taken synchronously and with the same camera settings. This results in four images that are stitched together geometrically and radiometrically into one uniform large-format image. The resulting virtual image is handled like an image from a one-sensor camera.

In the Quattro DigiCAM Oblique configuration (Figure 2, right), the camera modules are mounted to point forward, backward, left and right with an oblique angle of 45°. Because the images are synchronised, the same time stamp can be used to obtain the exterior orientation of all four images from the integrated AEROcontrol GNSS/IMU system. Apart from this, the four images do not overlap and are therefore treated as independent images.

Data Collection and Image Orientation
The datasets were collected by the company AEROWEST GmbH, Dortmund, over the area of Lünen (Germany). The nadir images were captured on 4th March 2011 and the oblique images on 1st May in the same year.

The complete nadir block contained 917 large-format virtual images at 50% side overlap and 60% forward overlap. At a flying height of 830m, aerial images with a GSD of 10cm were captured. The full oblique dataset contained 757 oblique images for each of the four cameras. The flying height of 760m above ground resulted in a GSD ranging from 6.7cm to 13.6cm.
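
The dependence of the GSD on flying height and viewing angle can be illustrated with a short sketch. It assumes flat terrain and measures the GSD perpendicular to the line of sight; the 6 µm pixel pitch and the function name are assumptions made for illustration only and are not taken from the article, so the output is not expected to reproduce the figures quoted above.

```python
import math

def slant_gsd(height_m, focal_mm, pixel_um, off_nadir_deg=0.0):
    """GSD perpendicular to the line of sight over flat terrain.

    The slant range grows with 1 / cos(off-nadir angle), and the footprint
    of one pixel scales linearly with that range.
    """
    slant_range = height_m / math.cos(math.radians(off_nadir_deg))
    return slant_range * (pixel_um * 1e-6) / (focal_mm * 1e-3)

# Assumed pixel pitch (hypothetical, not stated in the article).
PIXEL_UM = 6.0

# Oblique case from the text: 760 m above ground, 80 mm lens.
for angle in (30.0, 45.0, 60.0):
    gsd = slant_gsd(760.0, 80.0, PIXEL_UM, angle)
    print(f"off-nadir {angle:4.1f} deg: GSD ~ {gsd * 100:.1f} cm")
```

The loop shows how the GSD grows from the near edge to the far edge of an oblique image, which is the reason a range rather than a single value is given for the oblique flight.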

For the analysis described in this work, a sub block consisting of 48 nadir images (in four strips) and 170 oblique images was selected.

The camera system was equipped with an AEROcontrol GPS/IMU system so that the position and orientation of all images were directly measured. To ensure an optimal match between the nadir and the oblique images, an Integrated Sensor Orientation (ISO) was performed using the nadir and the oblique images together. For the measurement of the tie points, the automatic AT software MATCH AT from INPHO, Stuttgart, Germany, was used. The actual adjustment was calculated with BINGO from GIP, Aalen, Germany.

Dense image matching methods such as Semi Global Matching are now widely accepted as suitable for deriving high-quality point clouds from imagery acquired in the standard nadir configuration. In particular, combining multiple stereo image pairs with different base-to-height ratios efficiently increases the accuracy and reliability of the point clouds.
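
The influence of the base-to-height ratio can be made concrete with the textbook approximation for the normal stereo case, in which the depth precision is the GSD times the matching precision divided by the base-to-height ratio. The sketch below is not taken from the article; the matching precision of 0.5 pixels and the function name are assumptions.

```python
def depth_precision(gsd_m, base_to_height, match_sigma_px=0.5):
    """Normal-case stereo approximation: sigma_Z = GSD * sigma_d / (B/H)."""
    return gsd_m * match_sigma_px / base_to_height

# A 10 cm GSD, as in the nadir block, with three example base-to-height ratios.
for b_h in (0.15, 0.3, 0.6):
    sigma_z = depth_precision(0.10, b_h)
    print(f"B/H = {b_h:.2f}: sigma_Z ~ {sigma_z * 100:.1f} cm")
```

Pairs with a short base are easier to match but less precise in depth, while pairs with a long base behave the other way round, which is why combining several ratios is attractive.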

Oblique Imagery: Opportunities and Challenges
The development of photogrammetric camera systems for the acquisition of oblique imagery, such as the modular Quattro DigiCAM described above, enables the observation of the ground from different angles. To this end, the four cameras are mounted to provide images in the forward, backward, left and right viewing directions. Thus, slanted and vertical surfaces such as house walls, which are often hidden in nadir imagery, become visible. This additional information is of particular interest for deriving point clouds of urban scenes, since the standard nadir case captures mainly horizontal structures.

Applying dense image matching methods to this oblique imagery can also increase the completeness of the 3D point clouds. This is particularly useful for the automatic derivation of 3D city models and for avoiding occlusions when capturing scenes with large height variations, such as urban areas. Figure 3 shows an example point cloud retrieved by dense image matching using oblique imagery.

The key challenges in processing oblique imagery are the varying image scale and the low image similarity caused by large perspective changes between the images. Consequently, the acquisition geometry is comparable to that of typical photogrammetric close-range applications.

The varying image scale is caused by the angled view, where foreground objects have a small ground sampling distance while background objects have a large ground sampling distance. The Semi Global Matching method, proposed by H. Hirschmüller (2008), uses a global smoothness constraint to derive a low-noise disparity for each pixel. Originally, a constant disparity search space is used, which is suitable for nadir imagery. For convergent imagery, however, the required disparity search space increases significantly. This increases processing time and memory requirements, which makes the original Semi Global Matching algorithm unsuitable for processing high-resolution oblique datasets.
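
The smoothness constraint can be illustrated by the cost aggregation along a single path, which is the core recurrence of Semi Global Matching. The sketch below is a simplified pure-Python version for one path only; the penalty values and the function name are assumptions, and a full implementation sums the aggregated costs of several path directions over the whole image.

```python
def aggregate_path(costs, p1=1.0, p2=8.0):
    """Aggregate matching costs along one path.

    costs[x][d] is the matching cost of pixel x at disparity d.
    A disparity change of one step is penalised with p1, any larger
    change with p2; the previous minimum is subtracted to keep values bounded.
    """
    agg = [list(costs[0])]
    for x in range(1, len(costs)):
        prev = agg[-1]
        prev_min = min(prev)
        row = []
        for d, cost in enumerate(costs[x]):
            candidates = [prev[d], prev_min + p2]
            if d > 0:
                candidates.append(prev[d - 1] + p1)
            if d + 1 < len(prev):
                candidates.append(prev[d + 1] + p1)
            row.append(cost + min(candidates) - prev_min)
        agg.append(row)
    return agg

# Five pixels, four disparity candidates, true disparity d = 2.
costs = [[10, 10, 0, 10] for _ in range(5)]
costs[2] = [4, 10, 5, 10]  # one noisy pixel whose raw minimum lies at d = 0

agg = aggregate_path(costs)
raw = [row.index(min(row)) for row in costs]
smoothed = [row.index(min(row)) for row in agg]
print(raw)       # [2, 2, 0, 2, 2]
print(smoothed)  # [2, 2, 2, 2, 2]
```

Every pixel carries one cost value per disparity candidate, so both the run time and the memory footprint grow in proportion to the size of the disparity search space.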

Therefore, we employ a modified Semi Global Matching method with a dynamic disparity search space. By using a hierarchical approach employing different resolution levels, the search space is determined for each pixel individually. This is not only beneficial for the reduction of processing time and memory requirements, but also for the resolution of matching ambiguities.
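
One plausible way to derive such a per-pixel search space is sketched below. The article does not describe the actual scheme, so the fixed margin, the factor of two between resolution levels, the handling of unmatched pixels and the function name are all assumptions.

```python
def search_ranges_from_coarse(coarse_disp, margin=2):
    """Per-pixel (d_min, d_max) search ranges for the next finer level.

    coarse_disp is a 2D list of disparities at half resolution, with None
    marking unmatched pixels. Each coarse pixel covers a 2x2 block at the
    finer level, and disparities double when the resolution doubles.
    A range of None means that no prior is available for that pixel.
    """
    fine = []
    for row in coarse_disp:
        fine_row = []
        for d in row:
            rng = None if d is None else (2 * d - margin, 2 * d + margin)
            fine_row.extend([rng, rng])
        fine.append(fine_row)
        fine.append(list(fine_row))
    return fine

coarse = [[3, 40],
          [None, 5]]
ranges = search_ranges_from_coarse(coarse)
print(ranges[0][0])  # (4, 8)
print(ranges[1][3])  # (78, 82)
```

In this sketch each pixel is searched over five disparity candidates only, although the disparities in the scene span a much wider interval.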

In order to use redundant information, a multi-stereo approach is employed. Each suitable stereo pair is matched using the modified Semi Global Matching method. The derived disparity information is used in a triangulation step, where all corresponding stereo models are used for each image. Within this step, the disparity image of each model provides the correspondences for the selected base image. By intersecting the resulting rays for each pixel of the base image, a dense point cloud is generated, while the redundancy is used to eliminate outliers and to reduce the noise.

Compared to classic nadir imagery, variances of image content are rather large due to perspective changes. This makes the matching more difficult, which generally leads to lower completeness. Thus, the main challenge is to provide a method that delivers results with maximal completeness.

Point Cloud Generation
The evaluated oblique dataset provides 66% overlap of images in flight direction. Note that, due to changes in perspective, variances in image content are rather large in contrast to a similar nadir configuration. For the processing, one base image is selected, and its two neighbouring images from the same flight strip are chosen for the point cloud generation. Moreover, overlapping images from neighbouring flight strips are incorporated into the matching process, but only image pairs providing an overlap of more than 20% are considered. The main challenge in matching these image pairs is the change of image content due to the strongly differing image scale. For the present oblique dataset, only views possessing the same line of sight were incorporated into the matching process. As a result, the overlap of all processed models amounts to at least 20%, and the angles between the view directions are smaller than 30 degrees. All selected images are matched against the central base image, and a dense disparity map is computed for each stereo pair. The redundant information contained in the multiple models is later fused in the triangulation step. Note that only oblique imagery was used to generate the point clouds.
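
The selection of match images described above can be sketched as a simple filter. The two thresholds are taken from the text; the data layout, the image names, the example values and the function names are hypothetical.

```python
import math

def angle_between_deg(u, v):
    """Angle in degrees between two 3D direction vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    cos_angle = max(-1.0, min(1.0, dot / (norm_u * norm_v)))
    return math.degrees(math.acos(cos_angle))

def select_match_images(base_dir, candidates,
                        min_overlap=0.20, max_angle_deg=30.0):
    """Keep candidates with enough overlap and a similar view direction.

    candidates is a list of (name, overlap_fraction, view_direction).
    """
    return [name for name, overlap, direction in candidates
            if overlap > min_overlap
            and angle_between_deg(base_dir, direction) < max_angle_deg]

# Forward-looking base image at 45 degrees (x right, y forward, z up).
base = (0.0, 1.0, -1.0)
candidates = [
    ("strip_prev", 0.66, (0.0, 1.0, -1.0)),   # same strip, same camera
    ("strip_next", 0.66, (0.0, 1.0, -1.0)),   # same strip, same camera
    ("cross_a", 0.30, (0.1, 1.0, -1.0)),      # neighbouring strip
    ("cross_far", 0.10, (0.0, 1.0, -1.0)),    # too little overlap
    ("right_cam", 0.40, (1.0, 0.0, -1.0)),    # other camera, 60 degrees away
]
print(select_match_images(base, candidates))
```

Images from the right-looking camera are rejected here even though they overlap the base image, because their view direction differs by more than 30 degrees.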

Processing overlapping imagery results in measurements of the same object point in multiple stereo models. This redundancy can be used to efficiently eliminate outliers and to increase the accuracy of the generated point clouds. An efficient method for detecting and eliminating mismatches was implemented for the processing of oblique imagery. In this method, the measurements of the single stereo models and their matching uncertainties are transferred into object space and checked for consistency. Only if a certain number of measurements are consistent are they assumed to be correctly matched and used for triangulation. The actual triangulation problem is formulated as a linear system and solved using a singular value decomposition approach. Figure 4 shows the consistent and successfully triangulated object points from the stereo models for an exemplary base image and six match images. Matching and triangulation of each base image results in one point cloud. All generated point clouds are eventually merged in object space.
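
A triangulation of this kind can be sketched with the standard linear (DLT) formulation. The article states only that a linear system is solved by singular value decomposition, so the exact formulation below, the camera matrices of the example and the function name are assumptions.

```python
import numpy as np

def triangulate(projections, pixels):
    """Linear triangulation of one object point from two or more views.

    projections is a list of 3x4 camera matrices, pixels a list of (u, v)
    image coordinates. Each view contributes two rows to A, and the
    homogeneous solution of A X = 0 is the right singular vector that
    belongs to the smallest singular value.
    """
    rows = []
    for P, (u, v) in zip(projections, pixels):
        rows.append(u * P[2] - P[0])
        rows.append(v * P[2] - P[1])
    _, _, vt = np.linalg.svd(np.asarray(rows))
    X = vt[-1]
    return X[:3] / X[3]

# Two synthetic cameras with focal length 1000 px and a base of 1 m.
K = np.diag([1000.0, 1000.0, 1.0])
P1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = K @ np.hstack([np.eye(3), np.array([[-1.0], [0.0], [0.0]])])

point = np.array([1.0, 2.0, 10.0, 1.0])
pixels = []
for P in (P1, P2):
    x = P @ point
    pixels.append((x[0] / x[2], x[1] / x[2]))

result = triangulate([P1, P2], pixels)
print(result)
```

Additional consistent stereo models simply add further rows to the system, so the redundancy of the multi-stereo approach enters the solution directly.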

Extremely Dense and Accurate
As shown in Figure 5, the implemented matching strategy results in extremely dense surface models providing good accuracy. For a 570x630m test area, approximately 105 million points were extracted. The point density might be sufficient for the generation of true orthophotos.

Table 1 shows the average matching success rates for the matching configurations displayed in Figure 4. As expected, due to the higher similarity of image content, the number of successfully matched pixels in image pairs from the same flight strip is slightly greater. Nevertheless, cross-strip stereo pairs can also be successfully matched, even though the same ground areas have different image scale. Note that the density is reduced within the subsequent consistency check in the triangulation process.

Table 1. Matching success rates of the image configurations in Fig. 4 (a)-(c).

Match image | Matched pixels [%]
In-strip (a) |
Cross-strip 1 (b) |
Cross-strip 2 (c) |

In contrast to nadir imagery, for which typically only 2.5D models can be generated, dense 3D reconstructions can be obtained for the present oblique dataset. In addition to house facades, small 3D structures such as balconies or cars (Figure 5) can also be extracted with good precision. Even objects which are difficult to match, such as solar panels or vegetation, were generally reconstructed successfully from the current dataset thanks to the high redundancy. As is typical for the Semi Global Matching algorithm, abrupt steps in height are maintained and distinct border lines can be reconstructed reliably.

This case study has proven that extended Semi Global Matching algorithms can be used to fully exploit the spatial content of oblique imagery. The simultaneous fusion of point clouds derived from nadir and oblique imagery is a further step and will ultimately deliver a very dense surface model, which is superior to airborne Lidar point clouds. Thus, photogrammetry has the potential to deliver dense DEMs, 3D city models and true orthophotos using one image flight only, which we call ‘all-in-one’ photogrammetry.

This case study is the result of a close cooperation between the Institute for Photogrammetry, University of Stuttgart, Germany, as specialist in dense image matching algorithmisation, and IGI mbH, Kreuztal, Germany, a systems manufacturer delivering airborne and mobile Lidar systems, digital camera modules and integrated systems for flight navigation. The authors would like to acknowledge Mathias Rothermel and Konrad Wenzel for their contribution to this article.

Last updated: 20/11/2019