A comparison of missing value imputation methods applied to daily precipitation in a semi-arid and a humid region of Mexico
Handling unreliable or missing values in climatological data is an important area of research, and multiple methods are available to fill in missing data and evaluate data quality. This study compares the performance of methods for estimating missing values, covering both methods designed explicitly for precipitation and multipurpose methods for hydrological data. The climate variable analyzed was daily precipitation. We considered two regions with different climate and orography to evaluate the effects of altitude, precipitation regime, and percentage of missing data on the Mean Absolute Error (MAE) of the imputed values, and we evaluated the homogeneity of the meteorological stations. Stations with more than 25% missing data were excluded from the analysis. In the semi-arid region, ReddPrec (optimal for nine stations) and GCIDW (optimal for eight stations) were the best-performing methods among the 23 stations, with average MAE values of 1.63 mm/day and 1.46 mm/day, respectively. In the humid region, GCIDW was optimal in ~59% of stations, EM in ~24%, and ReddPrec in ~17%, with average MAE values of ~6.0 mm/day, 6.5 mm/day, and ~9.8 mm/day, respectively. This research helps identify the most appropriate methods for imputing daily precipitation in different climatic regions of Mexico, based on efficiency indicators and homogeneity evaluation.
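As an illustration of the comparison criterion, the Mean Absolute Error is the average of the absolute differences between observed and imputed values, so it carries the units of the data (mm/day here). The Python sketch below shows one way a per-station comparison could be scored. It assumes that known observations are withheld and then re-estimated by each method, which the abstract does not state. The function names, method labels, and numbers are hypothetical and are not taken from the study.

```python
import numpy as np


def mean_absolute_error(observed, imputed):
    """MAE in the units of the inputs (mm/day for daily precipitation)."""
    observed = np.asarray(observed, dtype=float)
    imputed = np.asarray(imputed, dtype=float)
    if observed.shape != imputed.shape:
        raise ValueError("observed and imputed must have the same shape")
    return float(np.mean(np.abs(observed - imputed)))


def best_method(observed, imputed_by_method):
    """Return (method name, MAE) for the method with the lowest MAE."""
    scores = {
        name: mean_absolute_error(observed, values)
        for name, values in imputed_by_method.items()
    }
    name = min(scores, key=scores.get)
    return name, scores[name]


# Hypothetical withheld observations (mm/day) for one station and the
# estimates from two placeholder methods; not data from the study.
withheld = [0.0, 2.0, 10.0, 0.0]
estimates = {
    "method_a": [0.0, 1.0, 8.0, 1.0],
    "method_b": [1.0, 4.0, 6.0, 0.0],
}

winner, winner_mae = best_method(withheld, estimates)
print(winner, winner_mae)
```

Repeating this selection for every station and counting how often each method wins would give per-method shares of the kind reported above, under the same assumptions.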
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.
Once an article is accepted for publication, the author(s) agree that, from that date on, the owner of the copyright of their work(s) is Atmósfera.
Reproduction of the published articles (or sections thereof) for non-commercial purposes is permitted, as long as the source is provided and acknowledged.
Authors are free to upload their published manuscripts to any non-commercial open access repository.