January 6, 2019
We live in an increasingly data-centric world and are consuming data at a rate never before seen in history. Yet our understanding of the data we generate does not always take advantage of its true potential. To realise that potential, data needs to be standardised, readable and easily accessible.
Within the Ground Investigation (GI) industry, we are fortunate to have a data format which satisfies these criteria: the AGS (Association of Geotechnical and Geoenvironmental Specialists) data format. Implemented in 1992, the AGS data format has been used to consolidate information generated by laboratories, engineers, drillers, technicians, designers and geologists into a single data file system.
Today the AGS data format is the standard not just in the UK but in America, Australia, Singapore and many other countries around the world. This is testament to the dedicated volunteers from across the industry who have worked hard to promote the format, and also to the format's ease of use and accessibility. The primary purpose of the AGS data format is to transfer information from one geotechnical system to another. The most frequently used ground investigation systems are data management programs, which use GI data to produce logs and 3D models and to analyse geotechnical and geochemical data. Each data management system has its own pros and cons but, whichever one you are familiar with, it should always check the AGS data automatically before importing and exporting.
There are still, however, some occasions when the format or the data goes wrong and the source of the problem is not immediately obvious. This article should help you understand some of the most commonly encountered errors and give you a better understanding of the AGS data format and what you should be watching out for. If you or your company are a member of the AGS, you should check out the AGS data format website (links provided), where you can download examples of AGS data, suggest codes and changes, and get a list of what the codes stand for. The types of error encountered can be divided into two categories: data errors and data format errors.
Data Errors
The AGS data format is designed as a transfer medium from one system to another. AGS checkers are available to validate the structure of an AGS file (they are detailed on the AGS data format website). However, although a checker may confirm that an AGS file is structurally correct, the data it contains could still be complete gobbledegook. It is therefore the responsibility of the data managers, engineers, drillers, laboratories and technicians to do everything they can to prevent bad data being included in their datasets.
The types of error you can expect are too numerous to fit into one article, but a few examples illustrate the range. An error may be as simple as a misspelt colour, or as serious as a misplaced decimal point in an in-situ test result that jeopardises an entire project, leading to unnecessary costs and programme delays. Samples deeper than the base of the hole, core recoveries of 1000%, boreholes plotted somewhere in space around the north pole and holes drilled in the year 2119 are all examples of situations that (unless you're reading this 100 years from publication) are unlikely to happen, yet it is these kinds of error that we are most likely to encounter.
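Checks for the implausible values described above can be automated. The following is a minimal sketch in Python, not a replacement for the validation built into your data management software. The record layout and field names (`final_depth_m`, `sample_tops_m`, `core_recovery_pct`, `start_date`) are hypothetical and chosen only for illustration; they are not AGS headings.

```python
from datetime import date


def find_implausible_values(hole, today=None):
    """Return a list of plain-English problems found in one borehole record.

    `hole` is a hypothetical dictionary layout used only for this sketch.
    """
    today = today or date.today()
    problems = []

    # A hole cannot have been started in the future.
    if hole["start_date"] > today:
        problems.append(
            f"{hole['id']}: start date {hole['start_date'].isoformat()} is in the future"
        )

    # A sample cannot come from below the base of the hole.
    for top in hole["sample_tops_m"]:
        if top > hole["final_depth_m"]:
            problems.append(
                f"{hole['id']}: sample at {top} m is below the final depth "
                f"of {hole['final_depth_m']} m"
            )

    # Core recovery is a percentage of the run length, so it must lie in 0 to 100.
    for pct in hole["core_recovery_pct"]:
        if not 0 <= pct <= 100:
            problems.append(f"{hole['id']}: core recovery of {pct}% is outside 0 to 100%")

    return problems


# Made-up record containing three of the errors mentioned in the text.
example_hole = {
    "id": "BH01",
    "final_depth_m": 10.0,
    "start_date": date(2119, 1, 6),
    "sample_tops_m": [1.0, 5.5, 12.5],
    "core_recovery_pct": [95, 1000],
}

problems = find_implausible_values(example_hole, today=date(2019, 1, 6))
for problem in problems:
    print(problem)
```

Range checks like these only catch values that are impossible. A plausible but wrong value, such as a decimal point shifted within a realistic range, still needs a human review against the original record.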
As humans, we’re all susceptible to making these kinds of mistakes, whether through an accidental keystroke, a misreading of handwriting or simply the pressure of time constraints. Technology can help us minimise these errors as well as save time and paperwork by reducing the double handling (rewriting) of data. Software already exists to aid primary data collection, but expect to see a seismic shift in the coming years as the industry moves from paper techniques to digital systems. The BDA has previously stated the benefits of switching some tasks to digital systems, and development continues for other roles.
How many of us are guilty of writing “as above” or “see previous” on logsheets when it comes to monotonous information such as serial numbers, dates, staff, units and methods? This seemingly redundant information is known as metadata and is invaluable not just when errors occur (by allowing data to be traced back to its origin) but also in allowing statistics to be generated quickly and accurately. It also substantially increases confidence in the data generated, as it removes any ambiguity about specifics. Yet many metadata fields in AGS files are left empty. Digital systems will also help collect more data than ever before by filling out these fields automatically.
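One simple way to see how much metadata is missing is to count the blank entries in each field. The sketch below assumes the records have already been loaded as a list of dictionaries; the field names `operator` and `equipment_serial` are hypothetical and used only for illustration.

```python
def empty_field_rates(rows, fields):
    """Return, for each field, the fraction of rows where it is blank or missing."""
    if not rows:
        return {field: 0.0 for field in fields}

    def is_blank(value):
        # Treat a missing key, None, and whitespace-only text as blank.
        return value is None or not str(value).strip()

    return {
        field: sum(1 for row in rows if is_blank(row.get(field))) / len(rows)
        for field in fields
    }


# Made-up records: the operator is recorded half the time, the serial number never.
records = [
    {"operator": "A. Driller", "equipment_serial": ""},
    {"operator": "", "equipment_serial": ""},
    {"operator": "A. Driller"},
    {"operator": "  ", "equipment_serial": None},
]

rates = empty_field_rates(records, ["operator", "equipment_serial"])
for field, rate in rates.items():
    print(f"{field}: {rate:.0%} empty")
```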
The software you use to manage your AGS data will determine how the data is validated. If you are unsure what validation protocols are available, contact the software developer for tips and advice on how stricter controls can catch these types of data error.
Now that we know which types of primary data error to look for, we can turn to some of the errors that can occur in the AGS data format itself.
Understanding the Data Format
To understand format errors, we need to look at how the data is presented and how it is structured.
The AGS version 4.0 data format is written in plain text, readable using any text editor and compatible across all operating systems. Data is separated into groups, each represented by a four-letter code signifying the table the data is to be stored in, e.g. SAMP for Samples, LOCA for Locations and so on. Next, there are the identifiers HEADING, UNIT and TYPE. These indicate, respectively, which field of the group the data should be stored in, whether the field has a unit associated with it (metres, degrees etc.) and what format it should be stored in (text, a number of decimal places, a value from a list etc.). The main chunk of the data follows on after the identifiers.
Every value is enclosed in double quotes ("DATA") and values are separated by commas (,).
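Because the file is quoted, comma-separated plain text, the structure described above can be read with very little code. The following Python sketch is illustrative only and is not an AGS checker. The sample text is made up, and the sketch assumes that every line begins with a row tag ("GROUP", "HEADING", "UNIT", "TYPE" or "DATA"); confirm the exact layout against the published AGS 4 specification before relying on it.

```python
import csv
import io

# Made-up sample: one group with two fields and two rows of data.
SAMPLE = '''"GROUP","LOCA"
"HEADING","LOCA_ID","LOCA_FDEP"
"UNIT","","m"
"TYPE","ID","2DP"
"DATA","BH01","10.00"
"DATA","BH02","15.50"
'''


def parse_ags4(text):
    """Split AGS-style text into groups, keeping headings, units, types and data."""
    groups = {}
    current = None
    # The csv module handles the quoting and the comma separation for us.
    for row in csv.reader(io.StringIO(text)):
        if not row:
            continue  # skip blank lines
        tag, values = row[0], row[1:]
        if tag == "GROUP":
            current = {"HEADING": [], "UNIT": [], "TYPE": [], "DATA": []}
            groups[values[0]] = current
        elif current is None:
            raise ValueError(f"{tag} row found before any GROUP row")
        elif tag in ("HEADING", "UNIT", "TYPE"):
            current[tag] = values
        elif tag == "DATA":
            # Pair each value with the heading of the field it belongs to.
            current["DATA"].append(dict(zip(current["HEADING"], values)))
    return groups


groups = parse_ags4(SAMPLE)
for record in groups["LOCA"]["DATA"]:
    print(record["LOCA_ID"], record["LOCA_FDEP"])
```

Note that every value is read as text. Converting values according to the TYPE row (for example, "2DP" to a number with two decimal places) is left out of this sketch.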
Data Format Errors
Now that we have a simple understanding of the structure, we can begin to investigate some of the data format errors.
Authors:
Ben Swallow – Member of BDA Technical Standards Sub-Committee
Paul Hadlum – Data Manager at WYG