Revista Mexicana de Ingeniería Química, Vol. 22, No. 2 (2023), IA235


Comprehensive assessment of groundwater quality in Mexico and application of new water classification scheme based on machine learning

L. Díaz-González, M. Rosales-Rivera, L.A. Chávez-Almazán

https://doi.org/10.24275/rmiq/IA235

Supplementary material


Abstract

 

This study conducted a comprehensive evaluation of groundwater quality at 1,068 monitoring sites across all hydrologic-administrative regions in Mexico. Based on the analysis of 14 physicochemical and microbiological parameters (fluorides, fecal coliforms, nitrate-nitrogen, arsenic, cadmium, chromium, mercury, lead, manganese, iron, alkalinity, conductivity, water hardness, and total dissolved solids), 41% of the sites exhibited good water quality, 23% presented regular water quality, and 36% showed poor water quality. Sites with good water quality exhibited lower concentrations of major ions (Ca, Mg, Na, K, SO4, Cl, and HCO3) than sites with regular and poor water quality. Water nomenclature was also estimated using the VL model, which is based on Support Vector Machines with a linear kernel, statistical techniques, and Monte Carlo simulation. This model classified 87% of the monitoring sites into four basic water classes: Na HCO3 (47%); Na Cl (18%); Ca HCO3 (17%); and Na SO4 (5%). Furthermore, the t-SNE algorithm was applied to reduce the dimensionality of the data, here the chemical concentrations of major ions and contaminants, and to visualize them in a 2D plot. The resulting clustering was consistent with the water nomenclature estimated by the VL model. The contaminant study revealed that every hydrologic-administrative region presented at least one physicochemical or microbiological parameter that exceeded the acceptable levels defined by Mexican regulations. Therefore, the implementation of environmental sanitation strategies is crucial to ensure the availability of high-quality water resources that are safe for human health.

Keywords: Support Vector Machine, Gradient Boosting, Log-ratio transform, Hill-Piper diagram, t-SNE 2D visualization.
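
The workflow named in the abstract and keywords (a log-ratio transform of major-ion concentrations, a Support Vector Machine with a linear kernel, and a 2D t-SNE projection) can be illustrated with a minimal Python sketch. This is not the VL model itself, whose training data and coefficients are not given here. The sketch assumes scikit-learn and NumPy, uses a centered log-ratio transform as one possible log-ratio choice, and runs on purely synthetic, hypothetical concentrations for two water classes.

```python
import numpy as np
from sklearn.manifold import TSNE
from sklearn.svm import SVC

# Major ions listed in the abstract; all values below are synthetic.
IONS = ["Ca", "Mg", "Na", "K", "SO4", "Cl", "HCO3"]
rng = np.random.default_rng(0)


def clr(concentrations):
    """Centered log-ratio transform of strictly positive compositions."""
    logs = np.log(concentrations)
    return logs - logs.mean(axis=1, keepdims=True)


def synthetic_samples(dominant_ions, n_samples):
    """Hypothetical samples in which the given ions are enriched 8-fold."""
    data = rng.lognormal(mean=1.0, sigma=0.3, size=(n_samples, len(IONS)))
    for ion in dominant_ions:
        data[:, IONS.index(ion)] *= 8.0
    return data


# Two illustrative water classes (assumed labels, 60 samples each).
X = np.vstack([
    synthetic_samples(["Na", "HCO3"], 60),
    synthetic_samples(["Ca", "HCO3"], 60),
])
y = np.array(["Na-HCO3"] * 60 + ["Ca-HCO3"] * 60)

# Log-ratio transform, then a linear-kernel SVM fitted on the transformed data.
X_clr = clr(X)
classifier = SVC(kernel="linear").fit(X_clr, y)
train_accuracy = classifier.score(X_clr, y)

# 2D t-SNE embedding of the same transformed data, ready for a scatter plot.
embedding = TSNE(n_components=2, perplexity=30, random_state=0).fit_transform(X_clr)
```

Plotting `embedding` colored by `y` (or by the classifier's predictions) gives the kind of 2D view in which cluster structure can be compared against the assigned water classes.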

 
