Group-based estimation of missing hydrological data

Elshorbagy, Amin
Water resources planning and management require complete data sets of many variables, such as rainfall, streamflow, and temperature. Unfortunately, records of hydrologic processes are usually short and often have missing observations. Motivated by the importance of estimating missing data, hydrologic researchers have adopted and developed various models and techniques to in-fill missing data. The diversity of the adopted techniques does not necessarily indicate diversity in the approach. A major commonality exists in most applications of these techniques: any hydrologic time series record is perceived as a sequence of single-valued observations, irrespective of the time scale of the data or their underlying structure. In this research, a group approach, different from the traditional single-valued approach, is proposed. The approach perceives periodic hydrologic data as a sequence of groups rather than single-valued observations. The techniques suggested to handle the group approach, after modification, are regression, time series analysis, partitioning modeling, and artificial neural networks. Various models representing these four techniques are briefly presented and applied to single-series and bi-series cases, respectively. Group time series models are also developed in this thesis for the same purpose. It turns out that the group approach is highly useful for estimating consecutive missing values, and possibly for other applications, such as long-term forecasting. On the other hand, for non-periodic data (e.g., daily flows), where seasonality does not play a major role and a definite number of repetitive low-dimensional groups of observations cannot be found in the geophysical year, another approach to identifying and modeling groups is sought.
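The group view of periodic data can be illustrated with a minimal sketch. It is not the thesis's model: the function names `to_groups` and `fill_missing_groups`, the group length, and the choice of ordinary multivariate least squares are all assumptions made for illustration. The sketch reshapes a series into fixed-length groups and, in a bi-series setting, estimates whole missing groups of one series from the concurrent groups of another.

```python
import numpy as np

def to_groups(series, period):
    """Reshape a 1-D periodic series into rows of `period` consecutive values.

    Hypothetical helper: each row is one "group" (e.g. one season or year).
    """
    series = np.asarray(series, dtype=float)
    if series.size % period != 0:
        raise ValueError("series length must be a multiple of period")
    return series.reshape(-1, period)

def fill_missing_groups(target, predictor):
    """Estimate whole missing groups in `target` from a second series.

    Any row of `target` containing NaN is treated as a missing group and is
    replaced entirely. The mapping from predictor groups to target groups is
    fitted by least squares on the complete rows (an assumed, simple choice).
    """
    target = np.array(target, dtype=float)          # copy, so the input is untouched
    predictor = np.asarray(predictor, dtype=float)
    missing = np.isnan(target).any(axis=1)
    # Append a column of ones so the fit includes an intercept per output column.
    X = np.hstack([predictor, np.ones((predictor.shape[0], 1))])
    coef, *_ = np.linalg.lstsq(X[~missing], target[~missing], rcond=None)
    target[missing] = X[missing] @ coef
    return target
```

Because every value in a missing group is estimated in one step from a whole predictor group, a run of consecutive missing values is handled as a single vector rather than one observation at a time. With few complete groups relative to the group length, the least-squares fit becomes poorly determined, so short groups are the safer choice in this sketch.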
The nonlinearity and dynamic behavior of non-periodic hydrologic data sets have been indicated in the water resources literature as issues that influence the performance of modeling tools that ignore the nonlinearity and dynamics inherent in the data structure. Consecutive missing streamflows are estimated, using the principles of chaos theory, in two steps. First, the existence of chaotic behavior in the daily flows of the river is investigated. Second, the analysis of chaos is used to configure two models employed to estimate missing data: artificial neural networks and K-nearest neighbors. A local linear model is also applied for comparison purposes. The results highlight the utility of using the analysis of chaos for configuring the models. In a trial unprecedented in the chaos literature in water resources, the effect of chaotic behavior on the analysis of two cross-correlated time series is investigated. The effect of both nonlinearity and dynamics is shown through application to daily streamflows. Other issues, such as noise reduction and the reliability of its application to hydrologic time series, are discussed. It is recommended that current noise reduction algorithms be applied with caution and used for better estimation of chaotic invariants. The raw data should always be the basis for any further hydrologic analysis. After decades of adopting stochastic hydrology, chaos analysis, which has been recently introduced to hydrology, provides challenges and opportunities in hydrologic research. It has the potential to change the way in which hydrologic, and other real, processes are perceived, analyzed, and interpreted. A phenomenon that used to be treated as random may turn out to be a nonlinear deterministic (chaotic) process.
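The idea of configuring a K-nearest-neighbors estimator from a chaos analysis can be sketched as follows. This is an illustrative sketch only: the names `delay_embed` and `knn_forecast` are hypothetical, and the embedding dimension `dim`, delay `tau`, and neighbor count `k` are placeholder values that would in practice come from the chaos analysis of the flow record. The sketch reconstructs the state space by time-delay embedding and then estimates a run of consecutive values after the last observation by averaging the successors of the nearest past states.

```python
import numpy as np

def delay_embed(x, dim, tau):
    """Time-delay embedding: row j is [x[j], x[j+tau], ..., x[j+(dim-1)*tau]]."""
    x = np.asarray(x, dtype=float)
    n = x.size - (dim - 1) * tau
    if n <= 0:
        raise ValueError("series too short for this dim and tau")
    return np.column_stack([x[i * tau : i * tau + n] for i in range(dim)])

def knn_forecast(x, steps, dim=3, tau=1, k=3):
    """Estimate `steps` consecutive values following the end of `x`.

    Each estimate is the mean successor of the k embedded states closest to
    the current state; estimates are fed back in to reach the next step.
    """
    history = list(np.asarray(x, dtype=float))
    for _ in range(steps):
        emb = delay_embed(history, dim, tau)
        query, candidates = emb[-1], emb[:-1]
        # The value observed one step after the last coordinate of each candidate row.
        successors = np.asarray(history)[(dim - 1) * tau + 1 :]
        dist = np.linalg.norm(candidates - query, axis=1)
        nearest = np.argsort(dist)[:k]
        history.append(successors[nearest].mean())
    return np.array(history[-steps:])
```

For example, `knn_forecast(flows, steps=5, dim=3, tau=1, k=3)` would estimate five consecutive values after the end of a hypothetical array `flows`. This version uses only the observations before the gap; using data on both sides of a gap, or a second cross-correlated series, would need a different construction.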