

Vol. 13 No. 2 (2021)

Quantitative structure activity relationship of bioconcentration factor of polychlorinated biphenyls in fish species using machine learning

May 11, 2021


Polychlorinated biphenyls (PCBs) are persistent pollutants that greatly affect marine ecosystems. Machine learning techniques were used to build quantitative structure-activity relationship (QSAR) models that predict the bioconcentration factor (BCF) of PCBs. These models were built from topographic 2D and 3D descriptors calculated for molecular structures optimized at the molecular mechanics level of theory. After analysis of their statistical parameters, two models were found to be robust enough to predict logBCF: M_4_LR, built with two molecular descriptors (r² = 0.9154, Q²LOO = 0.8944, and Q²ext = 0.9119), and M_13, built with four molecular descriptors (r² = 0.9375, Q²LOO = 0.9155, and Q²ext = 0.844). Both models passed the double validation phase and satisfied the criteria of Tropsha's test. This implies that the logBCF predictions are quite accurate, as shown by the results of the present study.
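The three statistics quoted for each model (r², Q²LOO, Q²ext) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the data are synthetic rather than PCB descriptors, the model is plain ordinary least squares with two stand-in descriptors, and Q²ext follows the common Q²F1 convention (test-set residuals scaled by deviations from the training mean), which the abstract does not confirm as the definition used in the study. The helper names `fit_ols`, `predict`, `r_squared`, `q2_loo` and `q2_ext` are hypothetical.

```python
import random

def fit_ols(X, y):
    """Return [intercept, b1, ..., bk] from the least-squares normal equations."""
    rows = [[1.0] + list(r) for r in X]
    k = len(rows[0])
    # Augmented matrix [A^T A | A^T y]
    A = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * t for r, t in zip(rows, y))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting
    for c in range(k):
        p = max(range(c, k), key=lambda i: abs(A[i][c]))
        A[c], A[p] = A[p], A[c]
        for i in range(k):
            if i != c:
                f = A[i][c] / A[c][c]
                A[i] = [a - f * b for a, b in zip(A[i], A[c])]
    return [A[i][k] / A[i][i] for i in range(k)]

def predict(coef, X):
    return [coef[0] + sum(b * x for b, x in zip(coef[1:], r)) for r in X]

def r_squared(y, y_hat):
    """Coefficient of determination on the fitted data."""
    mean = sum(y) / len(y)
    ss_res = sum((a - b) ** 2 for a, b in zip(y, y_hat))
    ss_tot = sum((a - mean) ** 2 for a in y)
    return 1.0 - ss_res / ss_tot

def q2_loo(X, y):
    """Leave-one-out Q2: refit without each sample, then predict that sample."""
    press = 0.0
    for i in range(len(y)):
        coef = fit_ols(X[:i] + X[i + 1:], y[:i] + y[i + 1:])
        press += (y[i] - predict(coef, [X[i]])[0]) ** 2
    mean = sum(y) / len(y)
    return 1.0 - press / sum((a - mean) ** 2 for a in y)

def q2_ext(X_tr, y_tr, X_te, y_te):
    """External Q2 (Q2_F1 convention, an assumption): uses the training mean."""
    coef = fit_ols(X_tr, y_tr)
    mean_tr = sum(y_tr) / len(y_tr)
    press = sum((a - b) ** 2 for a, b in zip(y_te, predict(coef, X_te)))
    return 1.0 - press / sum((a - mean_tr) ** 2 for a in y_te)

# Synthetic stand-in data: two "descriptors" and a noisy linear response.
rng = random.Random(0)
X = [[rng.uniform(0, 5), rng.uniform(0, 5)] for _ in range(30)]
y = [1.0 + 0.8 * a - 0.5 * b + rng.gauss(0, 0.2) for a, b in X]
X_tr, y_tr, X_te, y_te = X[:24], y[:24], X[24:], y[24:]

coef = fit_ols(X_tr, y_tr)
r2_train = r_squared(y_tr, predict(coef, X_tr))
q2_loo_val = q2_loo(X_tr, y_tr)
q2_ext_val = q2_ext(X_tr, y_tr, X_te, y_te)
print(f"r2={r2_train:.4f}  Q2_LOO={q2_loo_val:.4f}  Q2_ext={q2_ext_val:.4f}")
```

For a least-squares fit, Q²LOO can never exceed r² on the same training set, so a small gap between the two (as reported for both selected models) suggests that no single compound dominates the fit.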


