Using clustering algorithms and GPM data to identify spatial precipitation patterns over southeastern Brazil


Bruno Guerreiro Miranda
Rogério Galante Negri
Luana Albertani Pampuch


Southeastern Brazil is an important geoeconomic and populous region of South America. Consequently, it is essential to analyze and understand the precipitation profiles in this region. Among the data sources and techniques available for such a study, the combination of clustering algorithms and data from the Global Precipitation Measurement (GPM) project emerges as a convenient yet little-exploited alternative. Specifically, this study employs the K-Means, Ward’s Hierarchical, and Self-Organizing Maps methods to cluster the annual and seasonal precipitation data from the GPM project recorded from 2001 to 2019. The adopted methods are compared in terms of quantitative measures and of the number of clusters defined through a well-established rule. The results demonstrate that the annual and seasonal periods are organized into different numbers of clusters. Moreover, the results make it possible to: identify a spatially heterogeneous precipitation distribution in the study area; conclude that K-Means is a suitable clustering method in the context of this investigation when compared to Ward’s Hierarchical and Self-Organizing Maps methods in terms of the Calinski-Harabasz and Davies-Bouldin measures; and establish that the spatial precipitation distribution over Southeastern Brazil is represented by 10 clusters in the annual and summer periods, 11 clusters in autumn and spring, and 9 clusters in winter.
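As an illustration of the kind of comparison described above, the following minimal sketch scores K-Means and Ward's hierarchical clustering with the Calinski-Harabasz and Davies-Bouldin measures using scikit-learn. It is not the authors' pipeline: the data are synthetic blobs standing in for per-grid-cell precipitation features, the function name `compare_clusterings` and the choice of 4 clusters are assumptions made for the example, and Self-Organizing Maps are omitted because scikit-learn does not provide them.

```python
from sklearn.cluster import AgglomerativeClustering, KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import calinski_harabasz_score, davies_bouldin_score


def compare_clusterings(features, n_clusters, seed=0):
    """Fit K-Means and Ward clustering; return {method: (CH, DB)}.

    Hypothetical helper for illustration only.
    """
    models = {
        "kmeans": KMeans(n_clusters=n_clusters, n_init=10, random_state=seed),
        "ward": AgglomerativeClustering(n_clusters=n_clusters, linkage="ward"),
    }
    scores = {}
    for name, model in models.items():
        labels = model.fit_predict(features)
        scores[name] = (
            calinski_harabasz_score(features, labels),  # higher is better
            davies_bouldin_score(features, labels),  # lower is better
        )
    return scores


# Synthetic stand-in for precipitation features (assumption: NOT GPM data).
features, _ = make_blobs(n_samples=300, centers=4, n_features=5, random_state=0)
scores = compare_clusterings(features, n_clusters=4)

for name, (ch, db) in scores.items():
    print(f"{name}: Calinski-Harabasz={ch:.1f}, Davies-Bouldin={db:.3f}")
```

A higher Calinski-Harabasz value and a lower Davies-Bouldin value both indicate more compact, better-separated clusters, so the two measures can be read side by side when ranking methods for a given number of clusters.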



Article Details

Author Biographies

Rogério Galante Negri, São Paulo State University (UNESP), Institute of Science and Technology, São José dos Campos, 12245-000, São Paulo, Brazil.

Graduate Program in Natural Disasters, São Paulo State University (UNESP), National Center for Monitoring and Early Warning of Natural Disasters (CEMADEN), São José dos Campos, 12247-004, São Paulo, Brazil.

Luana Albertani Pampuch, São Paulo State University (UNESP), Institute of Science and Technology, São José dos Campos, 12245-000, São Paulo, Brazil.

Graduate Program in Natural Disasters, São Paulo State University (UNESP), National Center for Monitoring and Early Warning of Natural Disasters (CEMADEN), São José dos Campos, 12247-004, São Paulo, Brazil.


