|dc.description.abstract||Floods cause widespread damage to property and life in different parts of the world. Hence there is a paramount need to develop effective methods for design flood estimation to alleviate risk associated with these extreme hydrologic events. Methods that are conventionally considered for analysis of floods focus on estimation of continuous frequency relationship between peak flow observed at a location and its corresponding exceedance probability depicting the plausible conditions in the planning horizon. These methods are commonly known as at-site flood frequency analysis (FFA) procedures.
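The frequency relationship described above can be illustrated with an empirical estimate. The sketch below is only an illustration, not a procedure taken from the thesis: it assumes the Weibull plotting position, p = m / (n + 1), where m is the rank of a peak flow in descending order and n is the record length. The function name and sample values are hypothetical.

```python
import numpy as np

def weibull_exceedance(peaks):
    """Return peaks sorted in descending order with empirical exceedance probabilities.

    Assumes the Weibull plotting position p = m / (n + 1), m = 1 for the largest peak.
    """
    ordered = np.sort(np.asarray(peaks, dtype=float))[::-1]
    ranks = np.arange(1, ordered.size + 1)
    return ordered, ranks / (ordered.size + 1.0)

# Hypothetical annual peak flows (arbitrary units)
flows, prob = weibull_exceedance([10.0, 30.0, 20.0])
return_period = 1.0 / prob  # return period in years for annual maxima
```

A parametric or nonparametric FFA procedure replaces these discrete points with a continuous curve that can be extrapolated to rarer events.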
The available FFA procedures can be classified as parametric and nonparametric. Parametric methods assume that the sample (at-site data) is drawn from a population with a known probability density function (PDF). These procedures carry uncertainty associated with the choice of PDF and the method used to estimate its parameters. Moreover, parametric methods are ineffective in modeling flood data if multimodality is evident in their PDF. To overcome these limitations, a few studies have used kernel-based nonparametric (NP) methods as an alternative to parametric methods. The NP methods are data driven and can characterize the uncertainty in data without prior assumptions as to the form of the PDF. Conventional kernel methods have shortcomings associated with the boundary leakage problem and the normal reference rule (considered for estimation of bandwidth), which have implications for flood quantile estimates. To alleviate these problems, the focus of NP flood frequency analysis has been on the development of new kernel density estimators (kdes).
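The two shortcomings of conventional kernel methods mentioned above can be made concrete with a small sketch. It is not the estimator developed in the thesis: it assumes a fixed-bandwidth Gaussian kernel, the common form of the normal reference rule h = 1.06 σ n^(-1/5), and synthetic non-negative data standing in for peak flows. All function names are hypothetical.

```python
import math
import numpy as np

def normal_reference_bandwidth(x):
    """Normal reference rule for a Gaussian kernel: h = 1.06 * sigma * n^(-1/5)."""
    x = np.asarray(x, dtype=float)
    return 1.06 * x.std(ddof=1) * x.size ** (-0.2)

def gaussian_kde_pdf(grid, x, h):
    """Fixed-bandwidth Gaussian kernel density estimate evaluated on `grid`."""
    grid = np.asarray(grid, dtype=float)
    x = np.asarray(x, dtype=float)
    z = (grid[:, None] - x[None, :]) / h
    return np.exp(-0.5 * z ** 2).sum(axis=1) / (x.size * h * math.sqrt(2.0 * math.pi))

def leaked_mass_below(bound, x, h):
    """Probability mass that the Gaussian estimate places below `bound`."""
    x = np.asarray(x, dtype=float)
    z = (bound - x) / h
    # Standard normal CDF of each kernel evaluated at the bound, averaged over kernels
    return float(np.mean([0.5 * (1.0 + math.erf(v / math.sqrt(2.0))) for v in z]))

rng = np.random.default_rng(0)
peaks = rng.exponential(scale=100.0, size=40)  # synthetic, non-negative sample
h = normal_reference_bandwidth(peaks)
leak = leaked_mass_below(0.0, peaks, h)
print(f"bandwidth = {h:.2f}, mass assigned to negative flows = {leak:.3f}")
```

Any mass reported below zero is probability assigned to physically impossible negative flows, so the estimate restricted to the valid range no longer integrates to unity.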
Another issue in FFA is that information on the whole hydrograph (e.g., time to the peak flow, volume of the flood flow and duration of the flood event) is needed, in addition to peak flow for certain applications. An option is to perform frequency analysis on each of the variables independently. However, these variables are not independent, and hence there is a need to perform multivariate analysis to construct multivariate PDFs and use the corresponding cumulative distribution functions (CDFs) to arrive at estimates of characteristics of design flood hydrograph. In this perspective, recent focus of flood frequency analysis studies has been on development of methods to derive joint distributions of flood hydrograph related variables in a nonparametric setting.
Further, in real-world scenarios, it is often necessary to estimate design flood quantiles at target locations that have limited or no data. Regional flood frequency analysis (RFFA) procedures have been developed for use in such situations. These procedures use a regionalization procedure to identify a homogeneous group of watersheds that are similar to the watershed of the target site in terms of flood response. Subsequently, regional frequency analysis (RFA) is performed, wherein the information pooled from the group (region) forms the basis for frequency analysis to construct a CDF (growth curve) that is then used to arrive at quantile estimates at the target site. Though there are various procedures for RFFA, they are largely confined to a univariate framework and rely on a parametric approach to arrive at the required quantile estimates.
Motivated by these findings, this thesis concerns the development of methodologies, based on a linear diffusion process based adaptive kernel density estimator (D-kde), for at-site as well as regional FFA in univariate as well as bivariate settings. The D-kde alleviates the boundary leakage problem and also avoids the normal reference rule when estimating the optimal bandwidth, by using the Botev-Grotowski-Kroese estimator (BGKE). The potential of the proposed methodologies in both univariate and bivariate settings is demonstrated by application to synthetic data sets of various sizes drawn from known unimodal and bimodal parametric populations, and to real-world data sets from India, USA, United Kingdom and Canada.
In the context of at-site univariate FFA (considering peak flows), the D-kde was found to perform better than four parametric distribution based methods (Generalized extreme value, Generalized logistic, Generalized Pareto, Generalized Normal), thirty-two ‘kde and bandwidth estimator’ combinations that resulted from application of four commonly used kernels in conjunction with eight bandwidth estimators, and a local polynomial-based estimator.
In the context of at-site bivariate FFA considering ‘peak flow-flood volume’ and ‘flood duration-flood volume’ bivariate combinations, the proposed D-kde based methodology was shown to be effective when compared to seven commonly used copulas (Gumbel-Hougaard, Frank, Clayton, Joe, Normal, Plackett, and Student’s t copulas) and to a Gaussian kernel in conjunction with conventional as well as BGKE bandwidth estimators. Sensitivity analysis indicated that selection of the optimum number of bins is critical in implementing the D-kde in a bivariate setting.
In the context of univariate regional flood frequency analysis (RFFA) considering peak flows, a methodology based on D-kde and Index-flood methods is proposed and its performance is shown to be better when compared to that of widely used L-moment and Index-flood based method (‘regional L-moment algorithm’) through Monte-Carlo simulation experiments on homogeneous as well as heterogeneous synthetic regions, and through leave-one-out cross validation experiment performed on data sets pertaining to 54 watersheds in Godavari river basin, India. In this context, four homogeneous groups of watersheds are delineated in Godavari river basin using kernel principal component analysis (KPCA) in conjunction with Fuzzy c-means cluster analysis in L-moment framework, as an improvement over heterogeneous regions in the area (river basin) that are currently being considered by Central Water Commission, India.
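The index-flood idea referred to above can be sketched as follows. This is a simplified illustration under stated assumptions, not the methodology proposed in the thesis: the index flood is taken to be the at-site mean peak flow, and an empirical quantile of the pooled, scaled data stands in for the growth curve that the thesis derives with the D-kde. Function names and data are hypothetical.

```python
import numpy as np

def regional_growth_quantile(site_series, non_exceedance_prob):
    """Pool peak flows scaled by each site's index flood and return a growth-curve quantile.

    Assumption: the index flood of a gauged site is its mean peak flow.
    An empirical quantile is used here in place of a fitted growth curve.
    """
    scaled = [np.asarray(q, dtype=float) / np.mean(q) for q in site_series]
    pooled = np.concatenate(scaled)
    return float(np.quantile(pooled, non_exceedance_prob))

def site_quantile(index_flood, growth_quantile):
    """Rescale the dimensionless regional quantile by the target site's index flood."""
    return index_flood * growth_quantile

# Two hypothetical gauged sites in one region; the second is a scaled copy of the first
base = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
region = [10.0 * base, 100.0 * base]
growth_median = regional_growth_quantile(region, 0.5)
target_median = site_quantile(300.0, growth_median)  # 300.0 is an assumed index flood
```

At an ungauged target site the index flood itself has to be estimated, for example from catchment attributes, which this sketch does not attempt.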
In the context of bivariate RFFA, two methods are proposed. They involve forming site-specific pooling groups (regions) based on either an L-moment based bivariate homogeneity test (R-BHT) or a bivariate Kolmogorov-Smirnov test (R-BKS), followed by RFA based on the D-kde. Their performance is assessed by application to data sets pertaining to stations in the conterminous United States. Results indicate that the R-BKS method is better than R-BHT in predicting quantiles of bivariate flood characteristics at ungauged sites, although the pooling groups formed using R-BKS are, in general, smaller than those formed using R-BHT. In general, the performance of the methods is found to improve with increase in the size of the pooling groups.
Overall, the results indicate that the D-kde always yields a bona fide PDF (and CDF) in the context of univariate as well as bivariate flood frequency analysis, as the probability density is nonnegative for all data points and integrates to unity over the valid range of the data. The D-kde based at-site as well as regional FFA methodologies are found to be effective in univariate as well as bivariate settings, irrespective of the nature of the population and the sample size.
A primary assumption underlying conventional FFA procedures has been that the time series of peak flow is stationary (temporally homogeneous). However, recent studies carried out in various parts of the world question the assumption of flood stationarity. In this perspective, a Time Varying Gaussian Copula (TVGC) based methodology is proposed in the thesis for flood frequency analysis in a bivariate setting, which allows the assumption of stationarity in flood related variables to be relaxed. It is shown to be more effective than seven commonly used stationary copulas through Monte-Carlo simulation experiments and by application to data sets pertaining to stations in the conterminous United States for which the null hypothesis that the peak flow data are non-stationary cannot be rejected.||en_US