Climate patterns of political division units obtained using automatic classification trees

Sergio R. Coria, Carlos Gay-García, Lourdes Villers-Ruiz, Adolfo Guzmán-Arenas, Óscar Sánchez-Meneses, Oswaldo R. Ávila-Barrón, Mónica Pérez-Meza, Xóchitl Cruz-Núñez, Gilberto Lorenzo Martínez-Luna

Abstract

This article proposes a methodology for discovering patterns in observed climatological data, particularly temperature and rainfall, for subnational political division units using an automatic classification algorithm (a decision tree produced by the C4.5 algorithm). The patterns are therefore represented as classification trees, under two assumptions: (1) every political division unit contains at least one climatological station, and (2) the recording periods of the stations are relatively similar in duration and in their initial and final years. A series of classification models is produced from different subsets of an experimental dataset. This dataset contains information from 3606 climatological stations in Mexico whose recording periods vary in duration and in initial and final years. The target (dependent) variable in all these models is the name of the political unit (i.e., the state). The predictors are 36 monthly features per climatological station: 12 for minimum temperature, 12 for maximum temperature, and 12 for cumulative rainfall. Altitude is added as a 37th predictor, but only to quantify its additional contribution to the modelling. The results show that classification trees are effective models for describing and representing non-trivial patterns that characterize the political division units based on their monthly temperatures and rainfall. One remarkable finding is that cumulative rainfall in May is the feature with the highest discrimination capability for the characterization task, which is consistent with the theoretical background on Mexican climatology. In addition, classification trees offer higher expressivity for non-experts in machine learning.
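To make the idea of a feature's "discrimination capability" concrete, the following is a minimal sketch of the gain-ratio criterion that C4.5 uses to choose a split on a numeric feature. It is not the authors' implementation and omits most of C4.5 (recursion, pruning, missing values). The feature names (tmin_jan, tmax_may, rain_may), the state labels, and all station values are made up for illustration; they are not taken from the experimental dataset.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def gain_ratio(values, labels, threshold):
    """Gain ratio of the binary split `value <= threshold`."""
    left = [y for x, y in zip(values, labels) if x <= threshold]
    right = [y for x, y in zip(values, labels) if x > threshold]
    if not left or not right:
        return 0.0  # degenerate split: nothing is separated
    p_left = len(left) / len(labels)
    p_right = 1 - p_left
    info_gain = (entropy(labels)
                 - p_left * entropy(left)
                 - p_right * entropy(right))
    split_info = -(p_left * math.log2(p_left) + p_right * math.log2(p_right))
    return info_gain / split_info


def best_split(rows, labels, feature_names):
    """Return (feature, threshold, gain ratio) of the best single split.

    Candidate thresholds are midpoints between consecutive distinct
    values of each feature (a simplifying assumption of this sketch).
    """
    best = (None, None, 0.0)
    for j, name in enumerate(feature_names):
        values = [row[j] for row in rows]
        distinct = sorted(set(values))
        for lo, hi in zip(distinct, distinct[1:]):
            threshold = (lo + hi) / 2
            score = gain_ratio(values, labels, threshold)
            if score > best[2]:
                best = (name, threshold, score)
    return best


# Synthetic toy records: one row per hypothetical station.
FEATURES = ["tmin_jan", "tmax_may", "rain_may"]
ROWS = [
    [5.0, 30.0, 10.0],
    [6.0, 31.0, 12.0],
    [5.5, 30.5, 80.0],
    [6.5, 31.5, 85.0],
]
LABELS = ["State A", "State A", "State B", "State B"]

feature, threshold, score = best_split(ROWS, LABELS, FEATURES)
print(feature, threshold, round(score, 3))  # prints: rain_may 46.0 1.0
```

In this toy data only the May rainfall column separates the two hypothetical states cleanly, so it receives the highest gain ratio and would become the root test of the tree. A full C4.5 run applies the same selection recursively to each resulting subset.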

Keywords

Climate patterns; political division; Mexico climate; data mining; data science; classification algorithms; classification trees; C4.5 algorithm
