# Reach Scale Hydrology

Constrained Kalman Filter (CKF)

## CKF Balance Enforcement

This is a statistical method to enforce (exactly or approximately) a balance constraint (e.g. mass balance, energy balance, or both at the same time) or an arbitrary equality constraint on an initial estimate of a vector variable given its error covariance matrix. It is also known as "Constrained Kalman Filter (CKF)" since the constraint is enforced through a Kalman filter update with the constraint being a "perfect" measurement (zero error).

### How it works

The methodology was first introduced for water balance constraining in Pan and Wood (2006) and later applied to a number of water budget studies like Pan et al. (2012), Sahoo et al. (2011), Zhang et al. (2016), Zhang et al. (2018), as well as Inverse Routing studies like Pan and Wood (2013), Yang et al. (2019), and Fisher et al. (2020). The basic operating principle is explained in Pan and Wood (2006).

Suppose our target of estimation (a vector variable) **x** is subject to an equality constraint: **Gx = g**. and our initial estimate of **x** doesn't satisfy that constraint. We look for a solution that satisfies the constraint while being closest to the initial estimate in a least squares sense, or in other words, to find the orthogonal projection of the guess on the constraint surface in Hilbert space. Note that under Gaussian and linear error assumptions, least squares solution, orthogonal projection, Lagrange multipliers, and Kalman Filter with a perfect measurement are all equivalent to each other. Note that the symbol for the measurement operator is usually **H** and we use **G** only to highlight its special purpose - there is no actual difference in terms of its role in the Kalman filer.

We assume we know the error covariance of the initial estimate **P**.

The math and operating principle of CKF is fairly simple, and most of the complexities and difficulties lie in the process to build the proper constraint measurement operator **G**. In some cases, this operator can be very simple and extremely complicated in others. We'll show three case studies to as examples from simple to advanced.

### Simple Study Case: 1-D Terrestrial Water Budget

This is to close the water budget over a single pixel/basin for four terrestrial budget terms: Precipitation (*P*), Evapotranspiration (*E*), Runoff (*Q*), and change in terrestrial water storage (*ΔS*). The water mass balance equations says:

*ΔS = P - E - Q* or *P - E - Q - ΔS* = 0

In this case, the target vector to estimate would be:

**x **= [*P*, *E*, *Q*, *ΔS*]^{T}

And the constraint operator would be:

**G **= [1, -1, -1, -1] and **g** = 0

Here are some results from Pan et al. (2012) for water budget terms over the Amazon River Basin before and after the mass balance constraining:

### More Advanced Study Case: Enforce Mass Balance on Reanalysis Moisture Fluxes

For various reasons like data assimilation increments, reanalysis data products usually do not close the water mass balance in the atmosphere. The atmospheric water balance can be written for every modeling grid pixel as, assuming the time scale is long enough such that the moisture storage (precipitable water) can be ignored:

div(*q*) = *E* - *P*

This differential equation needs to hold over the entire domain of interest. Let's assume that we trust the source/sink term *E - P* and only correct the vertically integrated moisture flux *q* (thus a 2-D variable just like *E* or *P*). Note that *q* is a vector and has two components *q*_{u} (eastward) and *q*_{v} (northward), thus the estimation target is *x* = [*q*_{u}, *q*_{v}]^{T}. We use bold letters **P**, **E**, **q** that include *P*, *E*, *q* values over all grid pixels of interest. And now:

**x** = **q** = [**q**_{u}, **q**_{v}]^{T}

**G** is the divergence operator that converts a vector field into a divergence field (a bit labor to build that up) such that:

**G** = div(**·**) or **Gx** = **Gq** = div(**q**)

Since all variables are evaluated in the model on a grid of pixels, **G** would in fact be a finite difference approximation for the div operator.

**g** = **E** - **P**

Here we took the MERRA-2 moisture fluxes and *E*, *P* data for a numerical experiment. The figure below shows the monthly moisture fluxes (top left panel) and *E* - *P* (top right panel) in MERRA-2 over the continental United States area in January 2020 (regridded to 1-degree resolution and units converted to mm/day). The bottom left panel shows the mass balance errors.

Once the constrain operator/divergence operator **G** has been constructed for the domain according to the width/height/area of each grid pixel (central difference scheme, see the sample Python code for details), we apply the constraining filter to update the moisture flux field. The figure bellows shows the constrained moisture fluxes and the divergence calculated from it (top left panel). The divergence values are now identical to the *E* - *P* (top right panel) and thus the balance errors are eliminated (bottom left panel). Finally, the bottom right panel shows how the filter makes adjustment on the flux field and its consequence on divergence. Pay attention to how under- and over-estimated divergence gets compensated (strengthened up or dissipated out) by the flux corrections.

Note that the adjusted flux field looks unreasonable along the boundaries (top left and bottom right panels). This is because the divergence is poorly constrained over the boundaries. Such boundary artifacts will improve for global (closed domain) applications or can be remedied by asserting additional constraints along the flux borders.

### Script for the MERRA-2/CONUS example analysis

This zip file on Google Drive contains the script, written in Python, and all the sample input data and output PNG figures.

Several Python libraries are required: numpy, netCDF4, and matplotlib. Miniconda is recommended for installing these libraries.

### Very Advanced Study Case

So far the most complicated application of the CKF strategy is the Inverse Streamflow Routing (ISR) and its downstream applications. ISR essentially builds an entire streamflow routing model into the measurement operator with state augmentation (in time) to accommodate the time delay between state and measurement. Follow the link to see details.

### Reference

The main CKF paper:

Pan M. and E. F. Wood, 2006: Data Assimilation for Estimating Land Water Budget Using a Constrained Ensemble Kalman Filter, *J. Hydromet.*, Vol. 7, No. 3, pp. 534–547, https://doi.org/10.1175/JHM495.1.

Application papers in water budget analysis:

Pan, M., A. K. Sahoo, T. J. Troy, R. Vinukollu, J. Sheffield and E. F. Wood, 2012: Multi-source estimation of long-term Terrestrial Water Budget for major global river basins. *Journal of Climate*, 25, 3191–3206, https://doi.org/10.1175/JCLI-D-11-00300.1.

Sahoo, A. K., M. Pan, T. J. Troy, R. K. Vinukollu, J. Sheffield, and E. F. Wood, 2011: Reconciling the global terrestrial water budget using satellite remote sensing. Remote Sens. Env., 115, 1850-1865, https://doi.org/10.1016/j.rse.2011.03.009.

Zhang, Y., M. Pan, and E. F. Wood, 2016: On Creating Global Gridded Terrestrial Water Budget Estimates from Satellite Remote Sensing. Surveys in Geophysics, https://doi.org/10.1007/s10712-015-9354-y.

Zhang, Y., M. Pan, J. Sheffield, A. Siemann, C.K. Fisher, M. Liang, H. Beck, N. Wanders, R. MacCracken, P.R. Houser, T. Zhou, D.P. Lettenmaier, Y. Ma, R.T. Pinker, J. Bytheway, C.D. Kummerow, E.F. Wood, 2018: A Climate Data Record (CDR) for the global terrestrial water budget: 1984-2010. Hydrol. Earth Syst. Sci., 22(1), https://doi.org/10.5194/hess-22-241-2018.

Application papers in Inverse Streamflow Routing:

Pan, M.. and E. F. Wood, 2013: Inverse Streamflow Routing. *Hydrol. Earth Syst. Sci.*, 17, 4577-4588, https://doi.org/10.5194/hess-17-4577-2013.

Fisher, C. K., M. Pan, and E. F. Wood, 2020: Spatiotemporal Assimilation/Interpolation of Discharge Records through Inverse Streamflow Routing. *Hydrol. Earth Syst. Sci.*, https://doi.org/10.5194/hess-24-293-2020.

Yang, Y., P. Lin, C. K. Fisher, M. Turmon, J. Hobbs, C. M. Emery, J. T. Reager, C. H. David, H. Lu, K. Yang, Y. Hong, E. F. Wood, and M. Pan, 2019: Enhancing SWOT Discharge Interpolation through Spatio-temporal Correlations. *Remote Sensing of Environment*, https://doi.org/10.1016/j.rse.2019.111450.