Knowledge journal / Edition 1 / 2019

Methods for determining groundwater levels: can we do better?

During the extreme summer of 2018, we became used to news reports about groundwater levels. But how are groundwater levels actually measured, and how are these measurements processed into groundwater maps? This article lists the established methods, including their possible sources of error. The results make clear that independent validation of a groundwater monitoring network is not an unnecessary luxury.

In the Netherlands, a groundwater characteristic (GWC) for an area is determined in three steps: (1) in-situ monitoring of the groundwater level (in the field); (2) interpolation and extrapolation over time; and (3) spatial interpolation and aggregation. This procedure is not standardized. In practice, several definitions and different combinations of methods are used. We have analysed the methods and techniques used over the past sixty years. The aim was to determine the impact of the errors that can occur at each step, and the consequences of these errors for the resulting groundwater characteristic.

What are the definitions that are used?

Several definitions of “groundwater level” are in use. As a result, it is often not clear whether the phreatic groundwater level, a perched water table, the hydraulic head or the (geohydrological) groundwater level has been measured. In addition, it is customary in the Netherlands to characterize the groundwater situation at a measuring location with averages, the so-called GxGs. These GxGs are not uniformly determined either.
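As an illustration of how such an average can be computed, the sketch below follows one common convention, which is an assumption here and not a definition taken from this article: per hydrological year (taken as 1 April to 31 March) the three highest and the three lowest measured levels are averaged, and these yearly values are then averaged over the available years to give a mean highest (GHG) and a mean lowest (GLG) groundwater level. The function names are hypothetical.

```python
from collections import defaultdict
from datetime import date
from statistics import mean


def hydrological_year(d: date) -> int:
    """Label of the hydrological year (ASSUMED to run 1 April to 31 March)."""
    return d.year if d.month >= 4 else d.year - 1


def gxg(series, n_extremes=3):
    """Return (GHG, GLG) from an iterable of (date, level) pairs.

    Levels are heights (e.g. cm relative to a datum), so a larger value
    means a shallower water table. Hydrological years with fewer than
    n_extremes measurements are skipped.
    """
    by_year = defaultdict(list)
    for d, level in series:
        by_year[hydrological_year(d)].append(level)

    yearly_high, yearly_low = [], []
    for levels in by_year.values():
        if len(levels) < n_extremes:
            continue
        ordered = sorted(levels)
        yearly_low.append(mean(ordered[:n_extremes]))    # three lowest levels
        yearly_high.append(mean(ordered[-n_extremes:]))  # three highest levels

    if not yearly_high:
        raise ValueError("no hydrological year has enough measurements")
    return mean(yearly_high), mean(yearly_low)
```

A convention of this kind is often applied to measurements taken around the 14th and 28th of each month over at least eight years. Changing the year boundaries, the measurement frequency or the number of extremes changes the outcome, which is one way in which GxGs end up not being uniformly determined.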

Figure 1. Diagram of groundwater level measurements with a groundwater observation well (fully perforated) and a deep piezometer (short perforated filter)

How are the measurements carried out?

Four methods are used to measure the groundwater level: (i) groundwater observation wells; (ii) piezometers; (iii) open bore holes; (iv) field estimates (Figure 1). These methods are standardized (type of pipe, locations, measuring methods and data storage); however, they leave room for interpretation (location, depth, filter length, frequency, relationship with soil type, etc.). For example, if a groundwater level standpipe is drilled through a poorly permeable layer on which a so-called perched water table has formed, you will not know exactly what you are measuring. The measurement then has to be interpreted, which can lead to subjectivity.
A piezometer measures the head and not the phreatic groundwater level. Piezometers provide a more accurate measurement than groundwater standpipes. This also applies to open bores provided that drilling takes place up to a poorly permeable layer on which there is a difference in head or a perched water table and, if necessary, several bores are drilled at different depths to determine the occurrence of perched water tables. A comparison of the four measurement methods using 13 assessment criteria showed that the groundwater level standpipe and the piezometer performed best (Table 1).

Table 1. Assessment of the four methods used to measure the water table depth (++ stands for "complies with criterion" and -- for "does not comply at all")

Errors in measuring the groundwater level are found not so much in the measurement itself as in the uncertainty about what is being measured: the phreatic groundwater level in a stationary or non-stationary state, the depth to a perched water table, or the head? Additionally, errors are introduced when standpipes are moved or lengthened over time, or when the filter length is adjusted. Uncertainty also lies in the fact that it is not known whether the water level in the standpipe is in equilibrium with the water in the soil (just think of influences such as groundwater flow, air pressure and temperature). In addition, accuracy is often low when pressure sensors are used. Each of these causes can lead to errors ranging from a few centimetres to tens of centimetres.
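To give a feel for how air pressure enters a pressure sensor measurement, the sketch below converts the reading of an absolute (non-vented) sensor into a water column height by subtracting the barometric pressure. This is a minimal illustration under stated assumptions, not a procedure described in the underlying study: the constants for water density and gravity and the function name are assumptions.

```python
RHO_WATER = 1000.0  # kg/m^3, ASSUMED density of fresh water
GRAVITY = 9.81      # m/s^2, ASSUMED gravitational acceleration


def water_column_cm(p_abs_hpa: float, p_baro_hpa: float) -> float:
    """Height of the water column above an absolute pressure sensor, in cm.

    p_abs_hpa  : pressure recorded by the submerged sensor (water + air), hPa
    p_baro_hpa : barometric pressure at the same moment, hPa
    """
    pressure_pa = (p_abs_hpa - p_baro_hpa) * 100.0      # 1 hPa = 100 Pa
    return pressure_pa / (RHO_WATER * GRAVITY) * 100.0  # metres -> centimetres


# Hypothetical readings: the same sensor value with two barometric values.
correct = water_column_cm(1113.0, 1013.0)
biased = water_column_cm(1113.0, 1008.0)  # barometer value off by 5 hPa
```

With these hypothetical numbers the two results differ by about 5 cm, because 1 hPa corresponds to roughly 1 cm of water. A barometric record that is missing, poorly timed or taken far from the well therefore translates directly into centimetre-scale errors in the water level.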

How is upscaling over time done?

Upscaling groundwater level measurements over time can be done using (i) a direct calculation from time-series; (ii) statistical models; (iii) process models and (iv) expert knowledge. Upscaling in time is easy when the time series are sufficiently long and sufficient measurements have been carried out. A time series that is too short can be extended using statistical models and/or process models.
Statistical models describe the groundwater level at a given time as a function of the groundwater level at the previous time plus other relevant information, for example the precipitation surplus in the intervening period. Process models have the advantage that the most relevant physical laws are explicitly part of the model. The disadvantages are that building and calibrating the model are more laborious and that more input data are required. One caveat is that calibration can flatten the GxG when only the “fit” is considered and not the “noise” (which includes the more extreme values).
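The simplest statistical model of the kind described above can be sketched as a first-order autoregressive model with the precipitation surplus as external input, h[t] = c + a*h[t-1] + b*p[t]. The linear form, the least-squares fit and all function names are assumptions chosen for illustration; the time series models used in practice are generally more elaborate.

```python
def solve_linear(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    m = [row[:] + [value] for row, value in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system: the data do not identify the model")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= factor * m[col][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_arx(levels, surplus):
    """Least-squares estimate of (c, a, b) in h[t] = c + a*h[t-1] + b*p[t].

    surplus[t] is the precipitation surplus between time t-1 and time t,
    so surplus[0] is not used.
    """
    if len(levels) != len(surplus):
        raise ValueError("levels and surplus must have the same length")
    rows = [(1.0, levels[t - 1], surplus[t]) for t in range(1, len(levels))]
    targets = levels[1:]
    # Normal equations: (X^T X) beta = X^T y
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * y for r, y in zip(rows, targets)) for i in range(3)]
    c, a, b = solve_linear(xtx, xty)
    return c, a, b


def simulate_arx(h0, surplus, c, a, b):
    """Noise-free simulation from starting level h0, one step per surplus value."""
    out = [h0]
    for p in surplus:
        out.append(c + a * out[-1] + b * p)
    return out
```

Once fitted on a short measured series, `simulate_arx` can be driven with a longer precipitation surplus record to extend the series. Because this simulation leaves out the noise term, it reproduces only the fitted part of the behaviour, so extremes tend to be smoothed.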
Temporal upscaling can also be done on the basis of expert knowledge: with profile and field characteristics, a field expert can estimate a GxG. In practice, multiple aggregation methods are often combined, for example expert knowledge in combination with process models or a process model with a statistical model.
A measurement series sometimes contains data from several standpipes, because a standpipe has been moved or extended, and is sometimes supplemented with model-calculated water levels (e.g. from a time series model). In temporal aggregation, the errors are partly averaged out, but new sources of error can also be introduced. As a result, the error in the GxG of a standpipe location can reach a few decimetres and in some cases even more than one metre. The errors are often greater at locations with short time series.

How is spatial interpolation and aggregation carried out?

Upscaling an area or a region to a groundwater characteristic is done using four methods: (i) expert knowledge; (ii) methods based on random sampling theory; (iii) geostatistical models; (iv) physical-mechanistic models.
Experts can estimate the average groundwater characteristic (GWC) for an area on the basis of their knowledge of the area. In doing so, they use not only established GWCs at measuring locations within the area (and outside the area if these locations are near the target area) but also additional information such as the elevation map, the soil map and the water levels of surface water.
Random samples are only used for spatial aggregation. This involves choosing random locations for determining the GWC and then based on the sample, estimating the frequency distribution of GWCs or the parameters such as the average for the area in which the sample was taken.
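A minimal sketch of this idea, assuming simple random sampling (other sampling designs need different estimators): locations are drawn at random from a list of candidates, and the GWCs determined there give an estimate of the areal mean together with its standard error. The function names and the example values are hypothetical.

```python
import random
from math import sqrt
from statistics import mean, stdev


def draw_locations(candidates, n, seed=None):
    """Select n locations at random, without replacement."""
    rng = random.Random(seed)
    return rng.sample(candidates, n)


def estimate_areal_mean(sample_values):
    """Return (estimated areal mean, standard error) for a simple random sample.

    ASSUMPTION: the sample is small relative to the area, so no finite
    population correction is applied.
    """
    n = len(sample_values)
    if n < 2:
        raise ValueError("at least two sample values are needed")
    return mean(sample_values), stdev(sample_values) / sqrt(n)


# Hypothetical GWC values (cm below surface) at four randomly chosen locations.
estimate, standard_error = estimate_areal_mean([80, 100, 120, 100])
```

The appeal of this approach is that the standard error follows from the sampling design itself and does not depend on a model of the spatial variation.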
Geostatistical models do not require random sampling, but they do require a minimum number of measurement locations. Another difference is that the results only apply under certain model assumptions (e.g. linear relationships, constant variance of regression residuals, stationarity of the semi-variogram).
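The semi-variogram mentioned here summarizes how strongly values at two locations differ as a function of the distance between them. The sketch below computes an empirical semi-variogram with the classical estimator, in which half the mean squared difference is taken over all point pairs in a distance class. The binning scheme and the function name are assumptions made for illustration.

```python
from math import hypot


def empirical_semivariogram(points, bin_width, n_bins):
    """Empirical semi-variogram of (x, y, value) points.

    Returns a list of (lag class centre, semivariance, number of pairs)
    for every distance class that contains at least one pair.
    """
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    for i in range(len(points)):
        xi, yi, zi = points[i]
        for j in range(i + 1, len(points)):
            xj, yj, zj = points[j]
            k = int(hypot(xi - xj, yi - yj) // bin_width)  # distance class index
            if k < n_bins:
                sums[k] += (zi - zj) ** 2
                counts[k] += 1
    return [
        ((k + 0.5) * bin_width, sums[k] / (2 * counts[k]), counts[k])
        for k in range(n_bins)
        if counts[k] > 0
    ]
```

Distance classes with only a handful of pairs give unstable estimates, which is one reason why a minimum number of measurement locations is needed before a geostatistical model can be fitted.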
Physical-mechanistic models have the advantage that all kinds of (often non-linear) physical relations can be included. With process models, it is possible to extrapolate to other situations (in space and time) and to calculate measures and/or scenarios. A disadvantage is that often a lot of the required input data and process parameters are not known and are based on assumptions.
Temporal and spatial aggregation methods are often combined. Quantifying model errors often proves difficult, and these errors are therefore often disregarded. With spatial interpolation, the interpolation error comes on top of the errors in measurement and temporal aggregation, which means that the error can ultimately average about 20 to 50 centimetres, depending on the area and the method used. On the other hand, spatial aggregation averages out random errors in the point estimates of GWCs.
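How errors from successive steps add up, and why averaging helps only for the random part, can be sketched with two textbook rules. Both rest on the assumption that the error components are independent, which real groundwater data may well violate; the function names are hypothetical.

```python
from math import sqrt


def combined_sd(*component_sds):
    """Standard deviation of a sum of INDEPENDENT error components."""
    return sqrt(sum(s * s for s in component_sds))


def sd_of_areal_mean(point_sd, n_points, shared_sd=0.0):
    """Standard deviation of the mean of n point estimates.

    point_sd  : random error per point; it shrinks with the square root of n
    shared_sd : error common to all points (systematic); it does not shrink
    """
    if n_points < 1:
        raise ValueError("n_points must be at least 1")
    return sqrt(point_sd ** 2 / n_points + shared_sd ** 2)
```

For example, `combined_sd(3, 4)` gives 5, and `sd_of_areal_mean(30, 100)` gives 3, whereas `sd_of_areal_mean(30, 100, shared_sd=4)` gives 5. A systematic error shared by all points therefore sets a floor that no amount of spatial averaging removes.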

What have we learned?

Over the past sixty years, the methods for measuring and interpreting groundwater levels have changed in order to obtain groundwater characteristics (GWCs) for an area. Four different methods can be used for each of the three steps in this process. The combination that is used is usually based on the purpose of the research and the available data. The use of different combinations of methods and techniques has certainly led to a better understanding of the GWCs in the areas studied.
The impact of measurement errors and of inaccuracies in temporal and/or spatial interpolation and aggregation on the final result has never been systematically investigated. Our analysis shows that changes in the methods for measuring and interpreting groundwater levels lead to significant systematic differences in the GWCs. However, we were unable to assess the accuracy of the established GWCs, partly because the errors interact and can compensate for one another.
Determining groundwater characteristics is an iterative process in which each step must be repeated several times to check that the assumptions required to complete a particular step do not impede the next steps. An analysis should be made at the beginning of each study based on the research questions in order to select the most suitable measurement sites (datasets) and then the most suitable methods to arrive at a GWC. It is essential to assess whether the resulting accuracy is sufficient to answer the research questions.

Validation network

Our research shows that, although we have a national monitoring network for measuring groundwater levels, we do not know for sure whether drought stress or flooding in the Netherlands is underestimated or overestimated. To assess groundwater levels objectively and independently of the process used to derive them, independent validation data are required at randomly selected locations. Unfortunately, existing monitoring wells rarely meet this criterion. Introducing a validation monitoring network, independent of the existing monitoring networks, to determine objectively the reliability of the models and maps used would solve this problem.
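What such a validation could compute is sketched below, assuming that map or model predictions and independent observations are available at the same randomly selected locations. The mean error indicates systematic over- or underestimation and the root mean squared error indicates overall accuracy. The choice of these two measures and the function name are assumptions, not a prescription from the underlying study.

```python
from math import sqrt


def validation_stats(predicted, observed):
    """Return (mean error, root mean squared error) of predictions.

    A positive mean error means the predictions are, on average, higher
    than the independent observations.
    """
    if len(predicted) != len(observed) or not predicted:
        raise ValueError("need two non-empty sequences of equal length")
    errors = [p - o for p, o in zip(predicted, observed)]
    n = len(errors)
    mean_error = sum(errors) / n
    rmse = sqrt(sum(e * e for e in errors) / n)
    return mean_error, rmse
```

Because the validation locations are selected at random and are not used to build the map or model, these statistics describe the quality of the product for the area as a whole and not only near the existing wells.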

Henk Ritzema
(Wageningen University)
Martin Knotters
(Wageningen Environmental Research)

Summary

In the Netherlands, groundwater levels are important for nature and agriculture and especially for foundations. However, different methods are used to measure groundwater levels and to obtain a regional and national picture. We have analysed how commonly used combinations of methods influence the resulting groundwater characteristic (GWC). We show that there has been no systematic assessment of the reliability of used measuring methods and models. As a result, it is uncertain whether drought stress or flooding in the Netherlands is underestimated or overestimated. We recommend a more systematic approach to reduce the impact of errors and we favour the introduction of a validation measurement network independent of existing networks.

Literature

Ritzema, H.P. et al. (2018). Analysis of the methodologies used to derive groundwater characteristics for a specific area. Geoderma Regional, published online, doi: 10.1016/j.geodrs.2018.e00182

Ritzema, H.P. et al. (2012). Meten en interpreteren van grondwaterstanden. Analyse van methodieken en nauwkeurigheid. Wageningen, Alterra report 2345. 122 pages.

This article is a summary of Wageningen Environmental Research's project “Verdroging” [“Desiccation”]. The project was initiated following discussions in recent years on “numerical desiccation” and was financed by WEnR and the Ministry of Economic Affairs’ Knowledge Base project “Duurzame ontwikkeling van de groenblauwe ruimte” [“Sustainable development of the green-blue space”].
