Exploring Hydrological Processes, Modeling Decisions, and Deep Learning for Improved Hydrologic Assessment
Abstract
Hydrological modeling is complicated by the challenges and uncertainties involved in understanding hydrological processes. This thesis addresses three key research questions in hydrological modeling: identifying the dominant hydrological processes governing a given catchment, evaluating the impact of subjective modeling decisions and choices, and estimating the parameters of a hydrological model. These questions led to the formulation of the specific objectives of the thesis. The research uses the process-based hydrological modeling framework Structure for Unifying Multiple Modeling Alternatives (SUMMA), together with the mizuRoute tool for routing.
Recent research suggests that hydrological models should be tailored to particular objectives and to the specific basins under investigation. This underscores the importance of understanding the hydrologic processes that are relevant to those objectives. In the first part of the thesis, a sequential sensitivity analysis using the Efficient Elementary Effect (EEE) method is employed to select the sensitive parameters for each model structure considered. This makes it possible to capture the dominant hydrological processes of the catchment under study, irrespective of the choice of model structure. The study is carried out in the Netravathi basin of Karnataka, India. The dominant processes identified in this study agree with the existing literature, which affirms the efficacy of SUMMA in modeling hydrological processes under Indian conditions. Notably, this is the first application of SUMMA to Indian basins. Identifying the most sensitive parameters of the highly parameterized SUMMA model also reduces the dimensionality of the problem, and with it the computational demand and complexity. Giving precedence to the parameters with the most substantial influence also helps to mitigate the challenge of parameter equifinality. Furthermore, the selection of sensitive parameters eases model calibration.
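The idea of screening parameters by their elementary effects can be illustrated with a minimal sketch. It shows the basic one-at-a-time elementary effects calculation, not the sequential EEE procedure used in the thesis, and it does not call SUMMA. The function `toy_model`, the step size `delta`, and the number of trajectories are assumptions made only for illustration.

```python
import random


def elementary_effects(model, n_params, n_trajectories=20, delta=0.25, seed=0):
    """Return the mean absolute elementary effect (mu*) of each parameter.

    Parameters are assumed to be scaled to [0, 1]; `model` maps a list of
    floats to a single float.
    """
    rng = random.Random(seed)
    sums = [0.0] * n_params
    for _ in range(n_trajectories):
        # Random base point chosen so that x + delta stays inside [0, 1].
        x = [rng.uniform(0.0, 1.0 - delta) for _ in range(n_params)]
        y = model(x)
        # Perturb one parameter at a time, in random order, along a trajectory.
        order = list(range(n_params))
        rng.shuffle(order)
        for i in order:
            x_new = list(x)
            x_new[i] += delta
            y_new = model(x_new)
            sums[i] += abs((y_new - y) / delta)
            x, y = x_new, y_new
    return [s / n_trajectories for s in sums]


def toy_model(x):
    # Hypothetical stand-in for a model output: x[0] matters most,
    # x[1] matters less, and x[2] has no influence at all.
    return 5.0 * x[0] + 0.5 * x[1] ** 2


mu_star = elementary_effects(toy_model, n_params=3)
ranking = sorted(range(3), key=lambda i: mu_star[i], reverse=True)
```

Parameters with a large mean absolute effect would be retained for calibration, while those with an effect near zero (here the third parameter) could be fixed at default values.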
The second part of the thesis addresses the impact of subjective modeling decisions in hydrological modeling. Setting up a hydrological model requires the modeler to make various subjective decisions, such as selecting the model structure, discretizing the modeling domain, representing the forcing data spatially, and choosing the performance metrics used during calibration, among many others. The influence of each of these decisions is investigated in turn within a standardized framework to understand how differences in the decisions affect flood simulations. SUMMA is well suited to this purpose because it allows a straightforward comparison of different model structure choices and can easily be reconfigured for different spatial organizations. Based on the various choices for each modeling decision, 36 unique model configurations are constructed, and each is calibrated separately against the chosen performance metrics. The impact and relative importance of the modeling decisions are quantified by Analysis of Variance (ANOVA) and effect size, respectively. As a test case, flood peaks that occurred in the Netravathi basin are simulated, and the accuracy of the simulations (expressed as deviation in peak flow, deviation in peak time and relative volume error) is used to study the impact of the modeling decisions. For floods in this catchment, the choice of spatial discretization of the modeling domain is the most impactful decision, followed by the choice of objective function during calibration, the model structure and the spatial representation of forcing, in that order. The results provide key insights into the option a modeler should choose for each modeling decision when simulating floods in Netravathi. More generally, this study shows that model configuration decisions that are often made based on convenience or habit can strongly affect simulation accuracy for specific modeling purposes. Alongside the need to quantify more traditional uncertainty sources, such as data and parameter uncertainty, there is a need to quantify the impact of these model configuration decisions.
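The three flood accuracy measures named above can be sketched as follows. The abstract does not give their formulas, so the definitions here (percentage deviation of the simulated peak, time offset between peaks, and percentage error in total event volume) are commonly used forms adopted as assumptions, and the discharge values are made-up illustration data.

```python
def flood_event_metrics(observed, simulated, dt_hours=1.0):
    """Compare simulated and observed discharge for a single flood event.

    Both series are assumed to share the same time steps of length dt_hours.
    """
    if not observed or len(observed) != len(simulated):
        raise ValueError("series must be non-empty and of equal length")
    obs_peak = max(observed)
    sim_peak = max(simulated)
    # Index of the first occurrence of each peak gives its timing.
    obs_peak_step = observed.index(obs_peak)
    sim_peak_step = simulated.index(sim_peak)
    obs_volume = sum(observed)
    sim_volume = sum(simulated)
    return {
        # Negative values mean the simulated peak is too low.
        "peak_flow_deviation_pct": 100.0 * (sim_peak - obs_peak) / obs_peak,
        # Negative values mean the simulated peak arrives too early.
        "peak_time_deviation_h": (sim_peak_step - obs_peak_step) * dt_hours,
        # Negative values mean the simulated event volume is too small.
        "relative_volume_error_pct": 100.0 * (sim_volume - obs_volume) / obs_volume,
    }


# Hypothetical hourly discharge values for one event (illustration only).
observed = [10.0, 20.0, 50.0, 100.0, 60.0, 30.0]
simulated = [10.0, 25.0, 90.0, 70.0, 40.0, 20.0]
metrics = flood_event_metrics(observed, simulated)
```

Computing such metrics for every calibrated configuration would yield the response values that an ANOVA over the modeling decisions can then partition.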
The third part of the thesis addresses the challenges of calibrating hydrological models for parameter estimation. In hydrological modeling, parameters must be estimated because most of them are conceptual descriptions of physical processes and cannot be measured directly. Modelers commonly employ optimization algorithms to calibrate hydrological models. However, these algorithms are often computationally demanding, especially for complex physics-based and distributed models. In this study, a novel approach called hydroCNN+DDS is introduced. By combining the strengths of Convolutional Neural Networks (CNN) and the Dynamically Dimensioned Search (DDS) algorithm, hydroCNN+DDS simplifies the calibration of complex physics-based models. The approach captures the general patterns and relationships between discharge time series and parameters without compromising the underlying physics. HydroCNN+DDS is used to estimate the parameters of the highly parameterized hydrological model SUMMA from hourly observed discharge. Notably, hydroCNN quickly generates sub-optimal parameters that serve as a good initial solution for DDS, which helps DDS converge faster towards an optimal solution. A notable advantage of the hydroCNN+DDS approach is its potential for spatial and temporal transferability. This feature is valuable in dynamic systems and in regions with limited historical data, and it broadens the applicability of the methodology. Furthermore, the proposed methodology is versatile: it can be applied to simple or complex models and to any variable of interest. The approach follows the best practices of good model calibration. The methodology is demonstrated for the CAMELS (Catchment Attributes and Meteorology for Large-sample Studies) basins of CONUS (Contiguous United States).
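The role of a good starting point in DDS can be illustrated with a minimal sketch of the search itself. This is a simplified rendering of the published DDS algorithm (greedy search, a neighbourhood that shrinks with the iteration count, and reflection at the bounds), not the thesis implementation. The CNN is not shown: the vector `cnn_guess` is a hypothetical placeholder for a hydroCNN estimate, and `objective` is a toy function standing in for a model error metric.

```python
import math
import random


def dds(objective, lower, upper, x0, max_evals=500, r=0.2, seed=0):
    """Minimise `objective` within box bounds, starting from x0."""
    if max_evals < 2:
        raise ValueError("max_evals must be at least 2")
    rng = random.Random(seed)
    n = len(x0)
    best_x = list(x0)
    best_f = objective(best_x)
    for i in range(1, max_evals):
        # Probability of perturbing each parameter decreases as the search proceeds.
        p = 1.0 - math.log(i) / math.log(max_evals)
        chosen = [j for j in range(n) if rng.random() < p]
        if not chosen:
            chosen = [rng.randrange(n)]  # always perturb at least one parameter
        candidate = list(best_x)
        for j in chosen:
            span = upper[j] - lower[j]
            value = best_x[j] + r * span * rng.gauss(0.0, 1.0)
            # Reflect at the bounds; fall back to the bound if still outside.
            if value < lower[j]:
                value = lower[j] + (lower[j] - value)
                if value > upper[j]:
                    value = lower[j]
            elif value > upper[j]:
                value = upper[j] - (value - upper[j])
                if value < lower[j]:
                    value = upper[j]
            candidate[j] = value
        f = objective(candidate)
        if f <= best_f:  # greedy acceptance
            best_x, best_f = candidate, f
    return best_x, best_f


# Toy objective: squared distance to a hypothetical "true" parameter set.
target = [0.3, 0.7, 0.5, 0.2]


def objective(x):
    return sum((a - b) ** 2 for a, b in zip(x, target))


lower, upper = [0.0] * 4, [1.0] * 4
cnn_guess = [0.4, 0.6, 0.6, 0.3]  # placeholder for a hydroCNN estimate
best_x, best_f = dds(objective, lower, upper, cnn_guess)
```

Because acceptance is greedy, the returned solution is never worse than the starting point, so a better initial estimate directly reduces the remaining search effort.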
The thesis comprises six chapters. Chapter 1 introduces the research problem and outlines the objectives of the thesis. Chapter 2 details the hydrological modeling framework and the routing tool used in the work. It gives a detailed overview of the initial data preparation steps and the setup of the models, and it discusses the computational demands of the research and the use of high-performance computing on a supercomputer. Chapter 3 presents the approach for identifying the sensitive parameters of SUMMA using the Efficient Elementary Effect (EEE) method, and discusses the connection between these sensitive parameters and the dominant hydrological processes governing the catchment. Chapter 4 presents the experimental strategy for understanding the influence of subjective decisions made during hydrological modeling, particularly in the context of flood simulation, along with the statistical methods used to assess the significance and relative importance of the modeling decisions and choices. Chapter 5 introduces the hydroCNN+DDS methodology for efficiently estimating hydrological model parameters and demonstrates its temporal and spatial applicability while upholding the best practices of model calibration. Finally, Chapter 6 concludes the thesis by summarizing the key findings and contributions of the research and outlining the future scope based on the insights gained from the study.
All simulations and analyses presented in Chapters 3, 4 and 5 of the thesis were conducted at an hourly scale, leading to a considerable volume of data points for each model simulation. Moreover, screening the sensitive parameters of SUMMA from about a hundred parameters required thousands of model runs. The details of these analyses are given in Chapter 3. Furthermore, the calibration strategy adopted for assessing the impact of the 36 model configurations, as explained in Chapter 4, required each configuration to be run a thousand times, resulting in a very high computational demand. The hyperparameter tuning needed to establish the optimal hydroCNN architecture, as detailed in Chapter 5, involved exploring multiple hyperparameter options. Processing each of these tasks serially was not practically feasible. Hence, all computationally demanding tasks were processed in parallel on the Param Pravega supercomputer. Param Pravega is part of the National Supercomputing Mission and is integrated into the high-performance computing class of systems at the Supercomputer Education and Research Centre of the Indian Institute of Science, Bangalore, India. It is one of India's most powerful supercomputers and has a mix of heterogeneous nodes, with Intel Xeon Cascade Lake processors on the CPU nodes and NVIDIA Tesla V100 cards on the GPU nodes. The methodology and findings presented in this study contribute to the advancement of hydrological modeling and indicate promising and practical avenues for future research.