Venema, V., O. Mestre, E. Aguilar, I. Auer, J.A. Guijarro, P. Domonkos, G. Vertacnik, T. Szentimrey, P. Stepanek, P. Zahradnicek, J. Viarre, G. Müller-Westermeier, M. Lakatos, C.N. Williams, M.J. Menne, R. Lindau, D. Rasol, E. Rustemeier, K. Kolokythas, T. Marinova, L. Andresen, F. Acquaotta, S. Fratianni, S. Cheval, M. Klancar, M. Brunetti, Ch. Gruber, M. Prohom Duran, T. Likso, P. Esteban, Th. Brandsma. Benchmarking homogenization algorithms for monthly data. Climate of the Past, 8, pp. 89-115, doi: 10.5194/cp-8-89-2012, 2012.
Abstract. The COST (European Cooperation in Science and Technology) Action ES0601: advances in homogenization methods of climate series: an integrated approach (HOME) has executed a blind intercomparison and validation study for monthly homogenization algorithms. Time series of monthly temperature and precipitation were evaluated because of their importance for climate studies and because they represent two important types of statistics (additive and multiplicative). The algorithms were validated against a realistic benchmark dataset. The benchmark contains real inhomogeneous data as well as simulated data with inserted inhomogeneities. Random independent break-type inhomogeneities with normally distributed breakpoint sizes were added to the simulated datasets. To approximate real world conditions, breaks were introduced that occur simultaneously in multiple station series within a simulated network of station data. The simulated time series also contained outliers, missing data periods and local station trends. Further, a stochastic nonlinear global (network-wide) trend was added.
Participants provided 25 separate homogenized contributions as part of the blind study. After the deadline at which details of the imposed inhomogeneities were revealed, 22 additional solutions were submitted. These homogenized datasets were assessed by a number of performance metrics including (i) the centered root mean square error relative to the true homogeneous value at various averaging scales, (ii) the error in linear trend estimates and (iii) traditional contingency skill scores. The metrics were computed both using the individual station series as well as the network average regional series. The performance of the contributions depends significantly on the error metric considered. Contingency scores by themselves are not very informative. Although relative homogenization algorithms typically improve the homogeneity of temperature data, only the best ones improve precipitation data. Training the users on homogenization software was found to be very important. Moreover, state-of-the-art relative homogenization algorithms developed to work with an inhomogeneous reference are shown to perform best. The study showed that automatic algorithms can perform as well as manual ones.