Is Big Data a Big Problem for Surveyors?

Laser scanning has made data acquisition substantially simpler but the accurate analysis and registration of data requires both effective software and the core skills of the land surveyor. Is the software doing the job and is it encouraging those without the core skills to have a go? GW has discussed the subject with several surveyors with wide experience of laser scanning.

For some time now the world has been getting excited about Big Data, the ability to mine the huge amounts of data that companies collect about us and put it all together to deduce. . . well goodness knows what. It’s almost certainly the cause of those irritating phone calls from scripted over-enthusiastic people wanting to sell us something we don’t need and those emails from companies we’ve never heard of. More positively, big data promises major advances in fields like astronomy and genomics.

For ‘Mega’ Read ‘Giga’

Surveying, however, has its own problems with big data, and they are rather different from unsolicited marketing. Our big data problem comes from high-resolution laser scanning and imaging, and it is aggravated by stitching scans together to make composite point clouds, 3D models and mosaics.

For surveyors, moving from techniques that rely on relatively few field observations, made with a total station or by manual methods, to technology that gathers millions of observations per set-up with a laser scanner is a daunting prospect. Even with traditional techniques, it can be a challenge to manage the raw data, computed data and results in such a way that they can be revisited easily. With laser scanning we are looking at gigabytes of data even for small projects, and some projects can easily demand hundreds of scans. Are the hardware and software up to this and, if not, what are the issues that need addressing?

A User’s View

According to Patrick Collins of chartered surveyors Michael Gallie & Partners, none of the software packages recommended by the laser scanner vendors are good at handling large datasets containing hundreds of scans. There are limitations on the file size of individual scans and the software is slow at merging scans, so slow in fact that hours (or more) can pass whilst your high-end 64-bit mega PC with gigabytes of RAM grinds away. Also, the available software cannot handle the complete survey process from collection to analysis through to deliverable, necessitating intermediary formats in which vital metadata, such as scan names, target positions and control information, is lost.

‘They’re also not very good at overlaying’, says Collins, ‘to the point where, when you attempt an overlay, the model collapses’. He adds, ‘One might expect, as new versions appear, that these problems would gradually go away but instead menu options are changed and you find you can’t do what you previously did. Processes which worked in a previous version of the software suddenly no longer work after so-called upgrades.’

Collins continues: ‘All of this can lead to a lack of confidence in the overall accuracy of the model, as scans are merged and overlaid. In a nutshell, some of the designers of these packages seem unaware of the underlying principles of surveying and the need to maintain dimensional accuracy, calibration and control. There are ongoing issues with the use of global coordinates in 3D software with one of the market leaders currently dealing with a bug where the overall scale of the point cloud is readjusted upon import, leading to small but important changes to the real world data.’

A Solution to BIM?

With the recent push towards all things BIM, and the clever placement of laser scanning as a solution to BIM as-built data acquisition, Collins finds that his company is seeing increasing demand for “laser scan surveys” unrelated to the actual client requirement. ‘We’ve long been a profession eager to use new technology and demonstrate it to clients to show them what it can offer (whether they need it or not!). Many of us are also rather ingenious at putting together systems of sensors and computers. Alas none of this is of much interest to clients. Once a surveyor has ascertained exactly what it is his client wants surveyed, at what scale, accuracy and the end purpose of the survey, then it should be up to the surveyor to decide whether he uses a tape or an airborne or terrestrial laser scanner or an imaging system.’

Many chartered surveyors who use scanning for measured and area referencing surveys face competition from unqualified or part-time surveyors who have bought or rented a scanner and offer the service but who don’t have the fundamental survey knowledge. At worst they’re just black box, button-pushing operators. These issues are not new to the profession. We’ve been there in the past with EDM and GPS. As these revolutionary technologies appeared, there were always those keen to exploit them but without much knowledge of the basic principles of survey work. Collins cites a major supermarket chain which recently appointed just such a firm. The client was left with hundreds of unreferenced scans, no control and, on one project alone, a dimensional variation of up to 2.5 metres.

Who Needs Control?

Concerning the registration of large datasets, Collins has heard from one manufacturer of a client who has been able to register 700 scans, but without survey control. He observes: ‘Our experience has shown you will get “drift” across a dataset that size, especially in height, which is exactly what an experienced surveyor would expect. Ultimately the registration of 700 scans without survey control provides an un-provable result and cannot be considered a survey, but there is an inexperienced scanner user base who will accept the green lights from their software package as being millimetre correct!’
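The drift Collins describes can be illustrated with a simple error-propagation sketch. It assumes, purely for illustration, that scans are registered one after another in a single chain and that each scan-to-scan link contributes an independent random height error of the same size. The 1 mm per-link figure and both function names are hypothetical, and the model ignores the systematic errors that real projects also suffer.

```python
import math


def open_chain_sigma(per_link_sigma_mm: float, n_scans: int) -> float:
    """Std. dev. of accumulated error at the last scan of an uncontrolled chain.

    Assumes independent, equal random errors on every scan-to-scan link
    (an illustrative assumption, not a model of any particular package).
    """
    if n_scans < 1:
        raise ValueError("need at least one scan")
    links = n_scans - 1
    return per_link_sigma_mm * math.sqrt(links)


def controlled_chain_sigma(per_link_sigma_mm: float, n_scans: int, k: int) -> float:
    """Std. dev. at scan index k when the first and last scans are held to control.

    Same assumptions as above, with the chain adjusted between two fixed ends;
    the uncertainty is zero at both ends and largest near the middle.
    """
    links = n_scans - 1
    if links < 1 or not 0 <= k <= links:
        raise ValueError("k must index a scan within a chain of two or more scans")
    return per_link_sigma_mm * math.sqrt(k * (links - k) / links)


if __name__ == "__main__":
    n_scans = 700          # dataset size quoted in the text
    per_link_sigma = 1.0   # hypothetical 1 mm random error per link
    print(open_chain_sigma(per_link_sigma, n_scans))
    print(controlled_chain_sigma(per_link_sigma, n_scans, (n_scans - 1) // 2))
```

Under these assumptions the far end of an uncontrolled 700-scan chain carries an uncertainty of roughly 26 mm, while tying both ends to control brings the worst case (mid-chain) down to roughly 13 mm and, more importantly, makes the result checkable. Real scan networks are not simple chains, so the numbers indicate the trend only.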

Laser scanning has made data acquisition substantially simpler but the accurate analysis and registration of data still requires the core skills of the land surveyor. The lack of understanding of “good” data (accurate, repeatable, reliable) is not just a problem for non-surveyors but extends into the software. Not only does much of the current offering fail to establish and maintain a proper control/registration structure but it also does not offer proper management of the data throughout the life-cycle of the project, through maintaining unique scan names, scan targets, co-ordinates and change management.

As one would expect, the processing software sold by established surveying equipment manufacturers structures data from the whole to the part (control to detail), as a surveyor does, but software without the surveyor behind it treats control and detail as the same. If one ignores the principles of survey, one can get an answer, and quickly (and therefore price competitively), but it will not be robust.

Data - How Long and What For?

Turning to the question of long-term storage of laser-scanned big data, the first crucial question is what data has to be kept, why and for how long. Why is a fairly straightforward question to answer: it has to be kept in case the survey is called into question in the future. Contracts normally stipulate five years or a bit longer. For that period, it has to be possible for any surveyor to follow the process used by the original surveyor from fieldwork through to products. For the raw data, this could mean recording not only the data in a suitable format, and possibly imagery as well, but also the location and orientation of the scanner, the date and time of observation and so on.
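One way to keep this information attached to each raw scan is a small sidecar record stored alongside the data file. The sketch below is an illustration only: the `ScanRecord` class, its field names and the JSON layout are hypothetical and do not correspond to any vendor format.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class ScanRecord:
    """Hypothetical per-scan metadata kept next to the raw data file."""
    scan_name: str                                        # unique within the project
    data_file: str                                        # raw scan file as delivered
    file_format: str                                      # format of the raw file
    scanner_position_m: tuple[float, float, float]        # E, N, H of the instrument
    scanner_orientation_deg: tuple[float, float, float]   # instrument orientation angles
    observed_at_utc: str                                  # ISO 8601 date and time
    image_files: tuple[str, ...] = ()                     # associated imagery, if any


def to_sidecar_json(record: ScanRecord) -> str:
    """Serialise a record to JSON text suitable for a sidecar file."""
    return json.dumps(asdict(record), indent=2, sort_keys=True)


def from_sidecar_json(text: str) -> ScanRecord:
    """Rebuild a record from sidecar JSON, restoring tuples from JSON lists."""
    raw = json.loads(text)
    return ScanRecord(
        scan_name=raw["scan_name"],
        data_file=raw["data_file"],
        file_format=raw["file_format"],
        scanner_position_m=tuple(raw["scanner_position_m"]),
        scanner_orientation_deg=tuple(raw["scanner_orientation_deg"]),
        observed_at_utc=raw["observed_at_utc"],
        image_files=tuple(raw["image_files"]),
    )


if __name__ == "__main__":
    # All values below are made up for illustration.
    example = ScanRecord(
        scan_name="SCAN_001",
        data_file="SCAN_001.raw",
        file_format="vendor-raw",
        scanner_position_m=(1000.0, 2000.0, 50.0),
        scanner_orientation_deg=(0.0, 0.0, 90.0),
        observed_at_utc="2015-01-15T10:30:00+00:00",
    )
    print(to_sidecar_json(example))
```

Because the record is plain text, it can survive a change of processing package even when an intermediary point-cloud format drops the same details.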

The next stage in the process is to register the scans together and to add survey control. This involves recording control positions, the results of registration computations and the unified point cloud data. At this point, if the point cloud is required as a deliverable, the surveyor has to remove spurious points from the dataset in a data-cleaning process. By this stage the surveyor may have three largely duplicate sets of data (raw or pre-registration, post-registration and edited), all of which have to be kept until the registered composite point cloud is verified as accurate and complete.

Generally, the client deliverable is a 2D drawing or a 3D vector model. This is the point at which the surveyor may need to convert the composite point-cloud data into yet another format to be able to select laser scanned points to use to define edges and surfaces. The surveyor might use software tools to identify some features and manual techniques to complete the survey. The result is a much smaller file but the surveyor has to consider how to model surfaces, which will never be truly flat, for the client’s purposes. This information also has to be recorded and presented to the client as required.

What we have is a huge amount of data in various formats with only parts of the process maintaining required survey information. This makes it difficult, if not impossible, to preserve any reasonable sort of traditional survey audit control.
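One modest step towards an audit trail, whichever packages are used, is to record a checksum for every file at each stage of processing so that a later reviewer can confirm which files are unchanged. The sketch below is a minimal illustration rather than an established survey procedure; the stage names and function names are hypothetical, and only Python's standard `hashlib` and `pathlib` modules are used.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks to limit memory use."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(stage_dirs: dict[str, Path]) -> dict[str, dict[str, str]]:
    """Map each stage name to {relative file path: checksum} for its directory."""
    manifest: dict[str, dict[str, str]] = {}
    for stage, directory in stage_dirs.items():
        files = sorted(p for p in directory.rglob("*") if p.is_file())
        manifest[stage] = {
            p.relative_to(directory).as_posix(): sha256_of_file(p) for p in files
        }
    return manifest


def changed_files(old: dict[str, dict[str, str]],
                  new: dict[str, dict[str, str]]) -> list[str]:
    """List 'stage/file' entries that were added, removed or altered between manifests."""
    changed = []
    for stage in sorted(set(old) | set(new)):
        before = old.get(stage, {})
        after = new.get(stage, {})
        for name in sorted(set(before) | set(after)):
            if before.get(name) != after.get(name):
                changed.append(f"{stage}/{name}")
    return changed
```

A manifest built over, say, hypothetical `raw`, `registered` and `cleaned` directories and stored with the project shows that files are intact; it does not by itself record control values or registration results, which still need to be kept alongside.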

The second question is: what should you do with this vast quantity of data after the retention period? This is a commercial decision although there may be legal implications. The surveyor has to decide its value (or potential value) and, if he decides to retain it, what information has to be retained with the point cloud and modelled outputs to give future potential users confidence in its quality. With a 100-scan project likely to reach 50GB across the various versions and formats, data retention is neither simple nor cheap.
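To get a feel for the scale, the figure quoted above (roughly 50 gigabytes for a 100-scan project) can be extrapolated to larger jobs. The sketch assumes storage grows in direct proportion to the number of scans, which is a simplification; the function name is hypothetical and actual volumes depend on resolution, imagery and how many intermediate versions are kept.

```python
def estimate_storage_gb(n_scans: int, gb_per_scan: float = 50 / 100) -> float:
    """Rough project storage in gigabytes, scaling linearly with scan count.

    The default of 0.5 GB per scan comes from the article's figure of about
    50 gigabytes for a 100-scan project; linear scaling is an assumption.
    """
    if n_scans < 0:
        raise ValueError("scan count cannot be negative")
    return n_scans * gb_per_scan


if __name__ == "__main__":
    for scans in (100, 300, 700):
        print(scans, "scans:", estimate_storage_gb(scans), "GB")
```

On this basis a 700-scan project would need in the region of 350 gigabytes for the retention period, before any backup copies are counted.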

It’s Not that Simple!

Two main conclusions can be drawn from this discussion. Firstly, professional surveyors tend to use a number of different processing packages because there is not as yet a single piece of software that can act as a one-stop shop. This multiplies the task of data management and demands that interfaces are robust, otherwise data and metadata get lost in translation. The situation has been exacerbated as laser scanning is used for larger and more demanding projects with ever bigger datasets.

Secondly, there needs to be a common understanding between scanner suppliers, surveyors and clients, that laser scanning involves more than pushing a button. This is a surveying instrument and, like any other surveying instrument, its use should follow good survey practice both in the field and especially when processing the data in the office. Salesmen should be emphasising the issues regarding data processing to their customers. Surveyors should understand and follow good survey practice. Clients should appreciate that this is the way to achieve good quality reliable data. By employing ‘surveyors’ who do not follow good practice, clients increase the risk of receiving poor quality data and the associated costs when it is put to use. Ignorance of this should be no excuse.

This article was published in Geomatics World January/February 2015

--------------------

Acknowledgements

This article began life as a chat over lunch and has escalated from there! Our thanks to Pat Collins, Ian Coddington, Stuart Robson, Jan Boehm, Paul Burrows and others in the industry for their contributions and comments.
