Drone based Roof Inspections with UNPLANNED outcomes!

A short story of a roof inspection with UNPLANNED outcomes!

Written by Walter and Oliver Volkmann of Drone2GIS Inc. and Micro Aerial Projects LLC

We were recently asked to do aerial inspections of the roofs of some 70 newly constructed houses. The client requested complete visible-light photographic coverage of each roof with sufficient resolution to detect cracked or slipped concrete tiles. Here we share our experience on a drone based roof inspection project.

Manually Composed Aerial Image Acquisition with First Person View (FPV) Facility.

We went to the site with our Solo drone equipped with a GoPro Hero 3+ Black camera with a 5.4mm lens and diligently covered each roof by means of manually composing and capturing mainly oblique aerial images. We could complete the inspection of two houses per flight battery. After downloading the images, we bundled them into folders – one folder per house. Each image set had to contain one “index” image that clearly portrayed the house numbers which were displayed above the garage doors of the dwellings. This was the only identifier the client required in terms of “geo-referencing”.  Figure 1 below is an example of an index image, which, when displayed on an average sized screen shows the dwelling numbers in legible size.

Figure 1: The “index” photo in which the presumably unique house numbers, displayed above the garage doors, are captured.

The individual roof inspection images were carefully composed by means of “first person view” (FPV) – a facility which presents the drone operator with a real-time view of the scene as it is being captured by the on-board camera. To ensure completeness, a methodical sequence was followed in a flight that generally went around the roof in a counter-clockwise fashion. It took us two days to complete the field work and to package the deliverables. On the morning of the third day we hand-delivered the product to our client whose offices were located nearby.

Figure 2 below shows a few examples of the vertical and oblique images we manually captured with our Solo drone.

Figure 2: A selection of oblique and vertical images, all manually composed with the aid of first person viewing.

Given the complex design and considerable height of the roof profiles – the highest point of the roof being some 8.5m (almost 28 usft) above ground level – the inspection by air certainly saved considerable costs in time and money, and generally reduced the risk of injury and the potential for disturbance to the occupants. The survey was done with the permission of the occupants, hence concern for privacy was in this case not an issue to be considered in our operation.

Figure 3 below shows the most common defects, namely cracked or slipped roof tiles, identifiable on the manually composed aerial images. This level of detectability, as well as the speed and efficiency of our operation, completely satisfied our client’s expectations.

Figure 3: The most common defects, cracked and slipped tiles, identifiable in the oblique, manually composed aerial imagery.

Notwithstanding our client’s satisfaction with the manually conducted aerial roof inspection, we decided to also use this opportunity to test an alternative approach by covering the entire subdivision – i.e. all 70 houses – with vertical imagery from which to produce geo-spatial products such as an ortho-photo and a digital surface model.

Fully Automated Vertical Image Acquisition for Structure from Motion (SfM) modeling without Ground Control.

The SfM method of modeling the real world entails the use of overlapping aerial imagery (often also referred to as aerial stereo photography) and some means to geo-reference – i.e. place, scale and orient – the model correctly within a given spatial reference frame. To date the most common method of geo-referencing a model is the use of so-called Ground Control Points (GCPs). These are points which, through appropriate marking on the ground, are identifiable on the aerial images and whose spatial reference frame coordinates are known. Typically, some six to twelve evenly spaced GCPs would be used to geo-reference a project of this size. An alternative approach to the task of geo-referencing entails the use of Camera Exposure Positions (CEPs) instead of GCPs. In this approach a carrier phase capable Global Navigation Satellite System (GNSS) receiver is connected to the airborne camera to determine accurate coordinates of the camera at the precise moment of exposure. Since our survey copter is equipped with the V-Map system from Micro Aerial Projects, we chose to follow the latter of the two methods of geo-referencing our map. Although not necessary, we decided to also add to our tasks the provision of six check points – points which for all practical purposes are GCPs, but which are used strictly for independently checking the accuracy of the model rather than for geo-referencing it.

You will notice that we are using the word “model” whereas you may quite rightly have expected us to talk about maps. The reason for this vocabulary is to emphasize that the SfM process commences with the establishment of relative camera positions and a point cloud from which a three-dimensional digital surface model (DSM) is built. A digital ortho photo or SfM derived map, i.e. a flat visualization of the point cloud, is just one of several geo-spatial products that can be derived from the model.

Figure 4 below shows a typical SfM work flow in which use is made of GCPs. Setting ground control points, the step marked in red, is the only task in the work flow requiring manual inputs. From a practical and an operational point of view, the most significant difference between using CEPs and GCPs is that the former can eliminate or at least reduce the amount of manual labor required in SfM mapping.

Figure 4: Structure from Motion Work Flow with the use of Ground Control Points

To ensure that our map was going to be usable in conjunction with other geo-spatial information, we decided to reference our map to the official State Plane Coordinate (SPC) System. For this purpose we had to establish a reference point. Hence the first thing we did in the field was to choose a suitable location for a reference point which we demarcated with a 6d nail. We made sure that the location was free of any obstructions to the sky so that our V-Map receiver, when installed over the point as reference station, would be able to receive un-obstructed signals from the GNSS satellites. The reference station was to serve three purposes:

  • Provide an accurate set of SPC coordinates for spatial reference relative to the official state plane coordinate system.
  • Provide reference station observations, which, in combination with the raw observations to be recorded by the roving, rod mounted V-Map receiver, would be used in a post processed kinematic (PPK) global navigation satellite system (GNSS) survey to determine accurate SPC coordinates in all terrestrial surveys on this project.
  • Provide reference station observations, which in combination with the raw observations recorded by the roving, drone mounted V-Map receiver, would be used in a post processed kinematic (PPK) global navigation satellite system (GNSS) survey to determine accurate SPC coordinates of each of the 622 camera exposure positions.

Before starting with any field measurements, we powered on our reference station so that it could observe and record raw dual frequency GPS observations during all the subsequent ground and air survey activities. Figure 5 shows the V-Map receiver installed over our base point as GNSS reference station.

Figure 5: The V-Map receiver being used as a dual frequency GPS reference station. (Note the survey copter can be seen to the left above the reference receiver.)

To assess the geometric quality of our work, we set and surveyed six check points. For this purpose, we used the same V-Map receiver that would later be used on the drone to capture camera exposure positions in the air. Figure 6 shows how the light weight equipment can be transported from point to point and how points are demarcated and surveyed “on the fly”. In this application of PPK GNSS surveying the surveyor simply places the rod on the 6d nail at the center of the target, centers the rod bubble and takes a picture. There is no need for any extended occupation time in this type of surveying. To provide for some level of redundancy that could alert us to incorrect occupation, whether by human error or as a result of a bad rod bubble adjustment, we surveyed each point twice – each time with a different orientation of the bubble relative to the point to be surveyed. The check point survey procedure took us in this case less than half an hour to complete.  Note that the reference station was actively recording reference observations during the entire duration of the check point survey. On completion of the check point survey we powered down both reference station and rover and downloaded the raw observations of both from their respective SD cards.

Figure 6: The V-Map receiver mounted on a standard survey rod for kinematic GNSS surveying of check points.

Next we used the open source program Mission Planner to design a flight path for our survey copter for the automatic acquisition of 622 vertically aligned aerial images with a ground sampling distance (GSD) of 11mm, lateral overlap of 70% and forward overlap of 80%.  To achieve these requirements with our Sony a6000 camera and 16mm fixed focal length, the flight altitude had to be 50m above ground level and the camera had to be triggered every 9.8 m.  Figure 7 below shows our flight plan.
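The relationship between altitude, focal length, overlap and trigger distance can be sketched in a few lines. This is only an illustration and not how Mission Planner computes its grid: the sensor dimensions and pixel counts below are nominal a6000 values that we assume here, and the function name is our own. Under these assumptions the trigger distance comes out at the 9.8 m quoted above, while the computed GSD is close to 12 mm, slightly coarser than the rounded figure given in the text.

```python
# Nominal Sony a6000 sensor values: assumptions, not taken from the flight plan.
SENSOR_WIDTH_MM = 23.5
IMAGE_WIDTH_PX = 6000
IMAGE_HEIGHT_PX = 4000


def flight_geometry(altitude_m, focal_mm, forward_overlap, lateral_overlap):
    """Return GSD, trigger distance and flight line spacing, all in metres."""
    pixel_pitch_mm = SENSOR_WIDTH_MM / IMAGE_WIDTH_PX
    # Similar triangles: ground size of one pixel = altitude * pixel pitch / focal length.
    gsd_m = altitude_m * pixel_pitch_mm / focal_mm
    # Assumes the short image side points along the flight direction.
    footprint_along_m = gsd_m * IMAGE_HEIGHT_PX
    footprint_across_m = gsd_m * IMAGE_WIDTH_PX
    trigger_m = footprint_along_m * (1.0 - forward_overlap)
    line_spacing_m = footprint_across_m * (1.0 - lateral_overlap)
    return gsd_m, trigger_m, line_spacing_m


gsd, trigger, spacing = flight_geometry(50.0, 16.0, 0.80, 0.70)
print(f"GSD {gsd * 1000:.1f} mm, trigger every {trigger:.1f} m, "
      f"line spacing {spacing:.1f} m")
```

Changing the altitude or overlap arguments shows how quickly the number of images grows when a finer GSD is required.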

Figure 7: Flight Plan to acquire 622 overlapping aerial images from an altitude of 50m

After moving the roving V-Map receiver from the survey rod to the copter we could begin with the preparation of the automatic image acquisition flight by uploading the flight plan to the memory of the copter. Prior to launching our aerial image acquisition flight we powered on the reference station receiver and then we made sure that we were in full compliance with FAA regulations, that our flight plan was safe to execute (the power transmission line visible in Figure 8 below has a height of 30m) and that we had attended to each and every point on our check list. Figure 8 below shows our survey copter being launched on a fully automated image acquisition flight at 50m operational altitude.

Figure 8: Our PPK GNSS V-Map equipped copter being launched on an automatic 25-minute, 50 meter altitude, image acquisition flight.

Some 25 minutes later the copter landed in automatic mode. Now we downloaded the images on the SD card of the camera and the raw observations recorded on the SD cards of the reference station and air-borne V-Map rover respectively.

Packing up the base station and drone equipment completed the entire field work component of this mapping exercise. Figure 9 below shows some of our standard selection of field equipment. It consists of three interchangeable V-Map receivers (one to serve as reference station, one to serve as drone or rod mounted rover and the third one as spare), check lists, critical air frame spares, drone repair tools, a small laptop (with a matte screen!) to interface with the flight controller, two fully assembled APM-controlled survey copters, flight batteries for six 30-minute flights, two Sony a6000 cameras with 16mm fixed focal length lenses, spare SD cards for V-Map equipment and cameras, survey rods, tripods, shovel, hammer and survey point demarcation materials such as rebar rods, nails, spray paint and GCP targets. We strictly follow the “one is none, two is one” rule for all airborne components.

Figure 9: A view of the bed of our field crew vehicle.

Table 1 below shows the time it took to complete the various field tasks.

Table 1: Duration of Field Tasks

The last thing we did prior to departing the project site was to scan through the aerial images to make sure that their quality was acceptable.

The first thing we did back in the office was to obtain precise SPC coordinates for our reference station. We converted the two V-Map reference station raw observation files (one created during the check point survey and the other during the aerial image acquisition flight) to receiver independent exchange (RINEX) format and uploaded them to the Online Positioning User Service (OPUS), a free service made available to the American public by the National Geodetic Survey (NGS) of the National Oceanic and Atmospheric Administration (NOAA) of the United States. A few minutes after uploading the RINEX files, OPUS returned the reference station coordinates via e-mail. In addition to the coordinates, OPUS also reports the estimated errors. Table 2 below shows these estimated errors for each of the two sessions.

Table 2: Estimated Errors in the OPUS coordinates of the Reference Station

A comparison of the two sets of coordinates, shown in Table 3 below, confirms that the estimates provided by OPUS are realistic. Since the differences are well within prescribed limits for general mapping work, we adopted the average of the two solutions provided by OPUS as the final SPC.

Table 3: Differences in the coordinates as derived from the raw observations recorded during Sessions 1 and 2

Now that we had exact reference station coordinates we could start with the computation of the camera exposure positions. We used CamPos, a GNSS post processing workflow suite using components of the open source RTKLIB set of GNSS programs. The highly-automated process of computing camera exposure position coordinates with CamPos took some 10 minutes to complete. The final output from CamPos consists of a Google Earth compatible kml file displaying CEPs in color codes per GNSS solution type. Green symbols indicate optimum phase differential GNSS accuracy (i.e. integer ambiguities were fully resolved in the solution – so-called “fixed” solutions), orange symbols indicate so-called float solutions (integer ambiguities could not be fully resolved) and red symbols indicate autonomous “stand-alone” solutions.  In addition to the kml file CamPos also produces a csv file containing the coordinates of CEPs together with accuracy attributes expressed in integers, the so-called Q-factor (Q for quality), ranging from 1 to 5; Q=1 indicating optimum accuracy based on a solution with fully resolved phase integer ambiguities, Q=2 indicating a float solution and Q=5 indicating an autonomous solution (i.e. the solution not differentially corrected with reference station observations at all). In our approach, we tag each CEP with a numerical value of the accuracy per Q factor. We assume an a-priori accuracy of 5cm for fixed solutions (Q=1). We attach a nominal a-priori accuracy of 10m for all other solutions (and we exclude these solutions from the constraint of any adjustments made to the camera alignment or point cloud). Table 4 below shows an extract of the CEP coordinate table which we use as input to the SfM process.
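The tagging rule described above is simple enough to sketch. The snippet below is an illustration only: the column names and the sample rows are hypothetical and do not reproduce the actual CamPos csv layout; only the two accuracy values (5cm for Q=1, 10m for everything else) come from the text.

```python
import csv
import io

FIXED_ACCURACY_M = 0.05   # a-priori accuracy for Q=1 (fixed) solutions
OTHER_ACCURACY_M = 10.0   # nominal accuracy for every other solution type


def tag_accuracy(rows):
    """Attach an a-priori accuracy to each CEP row based on its Q factor."""
    tagged = []
    for row in rows:
        q = int(row["q"])
        accuracy = FIXED_ACCURACY_M if q == 1 else OTHER_ACCURACY_M
        tagged.append({**row, "accuracy_m": accuracy})
    return tagged


# Hypothetical sample: column names and coordinates are made up for illustration.
sample = io.StringIO(
    "image,easting,northing,height,q\n"
    "IMG_0001.JPG,1000.000,2000.000,60.000,1\n"
    "IMG_0002.JPG,1009.800,2000.010,60.020,2\n"
    "IMG_0003.JPG,1019.600,2000.020,59.980,5\n"
)
ceps = tag_accuracy(list(csv.DictReader(sample)))
for cep in ceps:
    print(cep["image"], cep["q"], cep["accuracy_m"])
```

With a 10m tag, float and autonomous positions carry practically no weight in the adjustment, which is the intent of the rule.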

Table 4: CEP coordinate file for input to SfM

The Google Earth visualization of all 622 CEPs is shown in Figure 10 below. Note that seven of the CEP solutions turned out to be “float” and are thus shown in orange.

Figure 10: Visualization of CEPs in Google Earth (Green = accurate, Orange = approximate)

Once we had computed the CEP coordinates we could proceed with the SfM workflow. For this we used a program from Agisoft called Photoscan Professional. After importing the 622 aerial images and the CEPs we performed a high accuracy camera alignment and investigated the degree of correspondence between SfM aligned positions and V-Map results. Figure 11 below illustrates the comparison.

Figure 11: Correspondence between SfM camera alignment and V-Map derived CEP coordinates

The correspondence between SfM camera alignment and the V-Map derived CEP coordinates turned out to be remarkably good, thus confirming that our a priori accuracy estimate of 5cm for fixed CEP coordinate solutions was not at all too optimistic. It is interesting to note that there is a concentration of the largest discrepancies right over the middle of the retention pond. This observation confirms that the high degree of accuracy in the V-Map derived CEP coordinates is sufficient to expose the weakness in the SfM camera alignment over areas which lack texture – such as water bodies!

The next thing we did was to set up a batch process which would complete the entire SfM workflow without any human inputs. The tasks and corresponding processing times were as follows:

Summing up, the whole SfM process took a period of 16 hours 13 minutes and 32 seconds. We did this processing on a computer with configuration as shown in Table 5 below:

Table 5: Components of Computer used in SfM processing

Once the SfM processing was completed we had to edit the point cloud by removing outliers resulting from the lack of texture over water surfaces. Next we exported from Photoscan Professional a digital ortho photo with GSD of 11mm and a DSM with cell size 22mmx22mm, both in geo-referenced TIFF formats. This process takes about half an hour. The ortho photo and DSM were then loaded into our GIS program, Global Mapper from Blue Marble Geographics. Since TIFF files are rather bulky, CAD and GIS programs tend to take rather long to render these types of files. Hence, we used Global Mapper to create an ortho photo in ecw format in which the data is more compressed and thus rendered significantly more quickly by programs such as Global Mapper. To make their appearance more pleasing and to exclude any regions where the SfM process failed to produce reliable results we also “trimmed” the ortho photo and DSM around the edges. Figure 12 below shows the entire ortho photo as displayed in Global Mapper.

Figure 12: Ortho Photo of the entire development GSD = 0.011m

The first thing we wanted to find out was to see whether we could detect on our ortho photo the same defects that we could identify on the manually captured oblique GoPro photography. So we zoomed in on the same area of detail on the ortho photo as depicted on the oblique GoPro photo shown in Figure 3. The magnified section of the ortho photo is shown in Figure 13 below.

Figure 13: Magnifying the Ortho Photo for detection of defects. Compare to Figure 3.

The above figure shows that an ortho photo with a GSD of 11mm will be suitable for detection of slipped, but not of cracked, tiles.  To improve the resolution for the detection of cracks, we would have had to reduce the flying altitude. Flying lower would decrease the image footprint and hence shorten the distance between successive exposures. Shortening the exposure distance interval shortens the time interval – thus increasing the rate at which the camera must expose and store images. To mitigate the possibility of missed exposures due to camera over-load, the flight plan can be changed to reduce the speed of the survey copter – thus increasing the time interval between successive exposures and giving the camera a chance to cope with the exposure rate.
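The trade-off between altitude, ground speed and exposure rate can be put into numbers with a rough sketch. It assumes the same lens and overlap as in our flight, so that the trigger distance scales linearly with altitude from the 9.8 m used at 50m. The cruise speed and the shortest interval the camera can sustain are hypothetical values chosen for illustration, not measured properties of our equipment.

```python
REFERENCE_ALTITUDE_M = 50.0   # flight altitude used in this project
REFERENCE_TRIGGER_M = 9.8     # trigger distance at that altitude


def trigger_distance_m(altitude_m):
    """Trigger distance for the same lens and overlap, scaled linearly with altitude."""
    return REFERENCE_TRIGGER_M * altitude_m / REFERENCE_ALTITUDE_M


def exposure_interval_s(altitude_m, ground_speed_ms):
    """Time between successive exposures at a given ground speed."""
    return trigger_distance_m(altitude_m) / ground_speed_ms


def max_ground_speed_ms(altitude_m, min_interval_s):
    """Fastest ground speed that still leaves the camera min_interval_s per image."""
    return trigger_distance_m(altitude_m) / min_interval_s


GROUND_SPEED_MS = 5.0   # hypothetical cruise speed
MIN_INTERVAL_S = 1.5    # hypothetical shortest interval the camera can sustain

for altitude in (50.0, 25.0):
    interval = exposure_interval_s(altitude, GROUND_SPEED_MS)
    limit = max_ground_speed_ms(altitude, MIN_INTERVAL_S)
    print(f"{altitude:.0f} m: {interval:.2f} s between exposures at "
          f"{GROUND_SPEED_MS:.1f} m/s, stay below {limit:.2f} m/s")
```

Halving the altitude halves the exposure interval at constant speed, so the speed limit has to be halved as well to keep the camera within its capabilities.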

In addition to the insufficient resolution we notice that the shaded areas are too dark to inspect details in them. We have not yet taken the time to investigate whether selective manipulation of exposure values in the areas of interest on the ortho photo will render the dark shadow areas useful for inspection purposes. Another possible remedy to under-exposure is to fly under conditions of diffuse illumination, ideally created by, for example, high cirrus clouds.

Furthermore, the vertical ortho projection prevents visibility of spaces obstructed by overhead features such as roof overhangs. Only full 3D visualization of elaborately constructed 3D models will facilitate the inspection of obstructed spaces – an effort which we could not accommodate in the scope of this project, but which we will be investigating at the next best opportunity.

Having concluded that the ortho photo is inadequate for the detection of cracked tiles, we had a look at the images from which the ortho photo was made. On only two of the 13 images in which the cracked tile appears could we discern, not necessarily detect, never mind identify, the crack in the tile. These two images, shown in Figure 14 below, happen to be the ones closest to the zenith of the defect.

Figure 14: Appearance of cracked tile in vertical imagery as captured by Sony a6000 from a height of 50m above the ground.

The next thing we wanted to establish was the accuracy of our map. Using the raw observations of our check point survey and CamPos, we computed coordinates for each of the two occupations of all the check points in our check point survey and compared the resulting pairs of the two independent occupations at each point to ascertain that no errors had occurred in centering the V-Map antenna over the point. The insignificant differences between the results of the independent occupations shown in Table 6 confirm that the V-Map observations yielded fully resolved integer phase ambiguities with correspondingly high accuracy and that the rod was accurately centered over the check points during the independent occupations. Being suitably satisfied with the good agreement between the independent occupations we decided to compute and adopt the mean values as final coordinates of the check points.
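The comparison of two occupations and the adoption of their mean can be sketched as follows. The coordinates and the 3cm tolerance are hypothetical values for illustration; they are not taken from Table 6, and the function name is our own.

```python
import math


def compare_occupations(first, second, tolerance_m=0.03):
    """Return difference, mean and pass/fail for two (E, N, H) occupations."""
    diff = tuple(b - a for a, b in zip(first, second))
    mean = tuple((a + b) / 2.0 for a, b in zip(first, second))
    horizontal = math.hypot(diff[0], diff[1])
    # Accept the pair only if both the horizontal and the height difference are small.
    ok = horizontal <= tolerance_m and abs(diff[2]) <= tolerance_m
    return diff, mean, ok


# Hypothetical coordinates of one check point, occupied twice.
occupation_a = (1000.012, 2000.004, 12.345)
occupation_b = (1000.020, 1999.998, 12.357)

diff, mean, ok = compare_occupations(occupation_a, occupation_b)
print("difference (E, N, H):", [round(d, 3) for d in diff])
print("adopted mean:", [round(m, 3) for m in mean])
print("within tolerance:", ok)
```

A pair that fails the check would point to a centering blunder or an unresolved ambiguity and should be re-observed rather than averaged.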

Table 6: Comparison between successive V-Map occupations of Check Points

Figure 14 below shows the distribution of our check points.

Figure 14: Spatial distribution of check points.

Now that the check points were superimposed in a layer above our ortho photo and DSM, we zoomed in on each of the check points and captured the pre-marked target center coordinates (visible in the ortho photo) for comparison with the coordinates determined by terrestrial V-Map survey as described above.

The numerical comparison between the two independently determined coordinates sets is shown in Table 7 below.

Table 7: Comparison between coordinates derived from terrestrial and aerial (ortho photo and DSM) surveys.

Note that the two data sets – terrestrial survey and aerial survey – are, apart from being based on the same coordinates for the reference station, in all other respects completely independent of one another. Note also that in the comparison above there is one significant outlier, namely the elevation of check point 2. We suspect that this is a result of the check point being located at the edge of the mapping area where image overlap is sparse. In fact, check point 2 appears in only 8 aerial images while nearby check point 5 and check point 3, for example, appear in 14 aerial images. Removing this outlier would bring the standard deviation in the elevation from 4cm to 1.6cm. The lesson to be learnt from this is that one should always provide a generous margin of aerial stereo image cover around the area to be mapped. Ideally only areas which are covered by at least three strips of aerial imagery should be trusted for good mapping in all three dimensions.

Figure 15 is another visualization of the horizontal errors at the six check points in our map. Note that the dimensions of the panels are 20cm x 20 cm (8” x 8”).

Figure 15: Horizontal errors at the check points. Note the target panel has sides of 20cm (8″) length.

Having established that our map is as accurate as can be expected from this type of mapping we were interested to see how well one could use the DSM on hard surfaces.

Figure 16: Photographic record of monument being surveyed with a rod-mounted V-Map receiver.

Having performed an internal quality check on our map we decided to use third party data for an even more independent verification of our mapping accuracy. Our client was kind enough to furnish us with a digital copy of the licensed land surveyor’s plat of the subdivision in dwg format. When we imported the plat to Global Mapper we noticed that there was a systematic shift between the parcel corners as depicted on the plat and where one would expect them to be on the ortho photo. Since land survey regulations in the state of Florida still do not prescribe that cadastral surveys have to be referenced to an official datum and coordinate system, we concluded that the plat was referenced to what practicing land surveyors in the state of Florida often refer to as a “pseudo state plane” coordinate system. Because the digital plat did not come with any geo-referencing meta data, we went back to the site, located three monuments that appeared on the plat and used our V-Map receivers to obtain official state plane coordinates for them.  From this information we could calculate that a shift of 38.7cm (1.27’) in a direction of 69.5391° would significantly improve the agreement between the plat and our ortho photo. Figure 16 shows a photograph that was taken at the moment that one of the monuments was surveyed with the V-Map receiver mounted on the survey rod. This photograph is an objective and unimpeachable record of the feature that was surveyed and in many ways, improves on the conventional written notes and hand-drawn sketches required in terms of current practice. The resolution is actually sharp enough to interpret the lettering stamped into the disc embedded in the concrete monument.
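Applying such a constant shift to the plat is a matter of simple trigonometry. The sketch below assumes that the direction of 69.5391° is an azimuth measured clockwise from grid north, which is our reading and not stated explicitly above. The plat corner coordinates are hypothetical.

```python
import math

SHIFT_DISTANCE_M = 0.387        # 38.7cm, from the text
SHIFT_DIRECTION_DEG = 69.5391   # from the text; assumed to be a grid azimuth


def shift_components(distance_m, direction_deg):
    """Split a shift into (east, north), assuming an azimuth clockwise from north."""
    azimuth = math.radians(direction_deg)
    return distance_m * math.sin(azimuth), distance_m * math.cos(azimuth)


def apply_shift(points, d_east, d_north):
    """Translate a list of (easting, northing) points by a constant shift."""
    return [(e + d_east, n + d_north) for e, n in points]


d_east, d_north = shift_components(SHIFT_DISTANCE_M, SHIFT_DIRECTION_DEG)

# Hypothetical plat corner coordinates in metres.
plat_corners = [(500.000, 800.000), (520.000, 800.000)]
shifted = apply_shift(plat_corners, d_east, d_north)

print(f"dE = {d_east:.3f} m, dN = {d_north:.3f} m")
for corner in shifted:
    print(f"{corner[0]:.3f}, {corner[1]:.3f}")
```

Under the azimuth assumption the shift is mostly easterly, roughly 36cm east and 14cm north.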

In Figure 17 we show the plat (i.e. the parcel boundaries) after application of the systematic shift overlaid on the ortho photo.  The comparative insets show how well the V-Map survey results compare with the plat, thus confirming that the application of the shift properly referenced the plat to the same reference frame as the ortho photo.

Figure 17: Discrepancies between V-Map survey and Plat after applying a constant shift: At B, 0.000m (0.000′); at C, 0.038m (0.125′); at D: 0.006m (0.020′)

Now that we had transformed the plat from a local reference frame to the official state plane coordinate system we could begin comparing the plat with physical features shown on the ortho photo. Figure 18 shows a region where we detected a consistent east west discrepancy between boundaries and roof tops of 40cm (1.31’) which is significant in terms of cadastral surveying accuracy. Is this discrepancy a result of errors in the original plat survey or in the setting out of the foundations? One way to find out would be to search and pre-mark the boundary monuments and then to re-map the area. Alternatively, the more arduous conventional approach of searching and surveying the monuments as well as the houses accurately on the state plane coordinate system could be followed.

Figure 18: East-West discrepancy of some 40cm between roof-tops and plat.

Figure 19 is an extract of a region in the south-east of the ortho photo where the roof tops seem to align perfectly with the plat lines. However there seems to be an asymmetric placement of the block relative to the street alignments – also in the east-west direction.

Are the discrepancies, uncovered here by SfM surveying, perhaps due to a change in spatial reference between original platting and setting out of streets and parcel corners? To get to the bottom of the discrepancies displayed here between plat and ortho photo would require the survey of a larger number of corner monuments and other, well defined features. However, considering that apart from setting up and powering on the reference receiver, the ortho photo was produced with virtually no conventional survey work on the ground – the agreement between it and the cadastral survey is truly remarkable.

Watch this space for a continuation of our sojourn where we will be contemplating topics such as





Reporting Horizontal Errors in SfM Mapping

Fully appreciating the powerful role played by the Structure from Motion (SfM) technique in the mapping revolution, we understand that embedded in the fiber of an SfM produced map are some very important quality characteristics which, when not specifically disclosed and effectively displayed, will remain hidden to the user of the map. These characteristics can be understood as the quality DNA of the individual map. A responsible map maker should thus proactively determine and display this DNA for each of the maps he or she makes, thereby providing the individual map with an authentic certification that is much more relevant and meaningful than the credentials of the map maker himself.

One way to display horizontal errors is to plot them at a much larger scale than the map itself, as shown in the figure below.

Horizontal Error Visualization

The advantage of this type of error visualization is that it would very efficiently expose bias in position or rotation across the mapping domain. The disadvantage is that it is not all that easy to visualize the magnitudes of the errors.
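The distinction between bias and magnitude can also be expressed numerically. The sketch below is only an illustration with made-up error vectors: it computes the mean error vector (which exposes a systematic shift) alongside the root mean square of the horizontal errors (which describes their size). The function name and sample values are hypothetical.

```python
import math


def error_summary(errors):
    """Summarise horizontal errors given as (dE, dN) pairs in metres."""
    n = len(errors)
    mean_e = sum(e for e, _ in errors) / n
    mean_n = sum(north for _, north in errors) / n
    magnitudes = [math.hypot(e, north) for e, north in errors]
    rmse = math.sqrt(sum(m * m for m in magnitudes) / n)
    return {
        "bias": (mean_e, mean_n),
        "bias_magnitude": math.hypot(mean_e, mean_n),
        "rmse": rmse,
        "max": max(magnitudes),
    }


# Hypothetical error vectors: same magnitudes, different behaviour.
scattered = [(0.03, 0.04), (0.03, 0.04), (-0.03, -0.04), (-0.03, -0.04)]
shifted = [(0.03, 0.04)] * 4

for name, errors in (("scattered", scattered), ("shifted", shifted)):
    summary = error_summary(errors)
    print(f"{name}: bias {summary['bias_magnitude']:.3f} m, "
          f"rmse {summary['rmse']:.3f} m")
```

Both sample sets have the same RMSE, yet only the second one has a bias. A table of magnitudes alone would hide this, whereas a vector plot shows it immediately.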

So we are looking for effective ways to attach a clear picture of the true DNA to each of the maps we make. In the figure below there is another example of error reporting that may communicate better than numerical tables of errors or statements regarding the “class of accuracy” as defined by some or other organization. Here we show the “true” location of check points – determined to 10mm accuracy by total station – superimposed as yellow dots over the corresponding targets in the Ortho Photo. The distance between dot and target center represents the local horizontal mapping error.

Computed Errors

For a realistic understanding of the significance of these errors all the reader has to know is that the square targets have dimensions 20cm x 20cm and that the round white targets have a radius of 5cm.

Aren’t these “error-pictures” much more realistic and relevant than all the other ancillary, albeit interesting information such as the flying height of 50m, the wind speed of 5m/s, the GSD of 13mm, the MAP-M4 multirotor platform, the airframe mounted Sony a6000 camera with 14mm lens that was connected to a dual frequency on board V-Map GNSS receiver weighing only 130g (www.v-map.net), the average estimated 3D camera position accuracy of 3cm and the combined number of more than 50 man-years of experience in mapping under the belt of the map makers?

Oh, and never mind the fact that no Ground Control Points were used to make this map!