Enhancing Predictions in Ungauged Basins Using Machine Learning to Its Full Potential


  • Takudzwa Fadziso, Institute of Lifelong Learning and Development Studies, Chinhoyi University of Technology, Zimbabwe


Machine learning
ungauged basins
long short-term memory (LSTM) networks
Sacramento Soil Moisture Accounting (SAC-SMA) model


Long short-term memory (LSTM) networks offer a level of prediction accuracy in ungauged basins that existing models have not matched. Using k-fold validation, we trained and tested several LSTMs on 531 basins from the CAMELS data set, so that every prediction was made in a basin that contributed no training data. The training and test data contained 30 years of daily rainfall-runoff data from US catchments ranging in size from 4 to 2,000 km², with aridity indexes ranging from 0.22 to 5.20, and covering 12 of the 13 IGBP vegetated land cover classes. Over a 15-year validation period, this effectively "ungauged" model was compared with the Sacramento Soil Moisture Accounting (SAC-SMA) model and with the NOAA National Water Model reanalysis. SAC-SMA was calibrated separately for each basin using 15 years of daily data. Across the 531 basins, the out-of-sample LSTM achieved a higher median Nash-Sutcliffe Efficiency (0.69) than either the calibrated SAC-SMA (0.64) or the National Water Model (0.58). This indicates that the available catchment attribute data usually carry enough information about similarities and differences in catchment-level rainfall-runoff behavior to produce out-of-sample simulations that are generally more accurate than current models run under ideal (i.e., calibrated) conditions. We found evidence that adding physical constraints to the LSTM models improves simulations, and we suggest this as a focus for future physics-guided machine learning research.
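The two evaluation ideas in the abstract, basin-wise k-fold validation and the Nash-Sutcliffe Efficiency (NSE), can be illustrated with a minimal Python sketch. This is not the study's code: the function names `nse` and `basin_folds`, the random seed, and the choice of k = 10 are assumptions made for illustration only, since the abstract does not state the number of folds. The sketch shows how holding out whole basins keeps each test basin "ungauged" with respect to the model that predicts it.

```python
import numpy as np


def nse(observed, simulated):
    """Nash-Sutcliffe Efficiency: 1 - SSE / sum of squared deviations of
    the observations from their mean. 1.0 is a perfect fit; 0.0 matches
    the skill of always predicting the observed mean."""
    obs = np.asarray(observed, dtype=float)
    sim = np.asarray(simulated, dtype=float)
    # Drop time steps where either series is missing.
    valid = ~np.isnan(obs) & ~np.isnan(sim)
    obs, sim = obs[valid], sim[valid]
    denom = np.sum((obs - obs.mean()) ** 2)
    if denom == 0:
        raise ValueError("NSE is undefined for constant observations")
    return 1.0 - np.sum((obs - sim) ** 2) / denom


def basin_folds(basin_ids, k, seed=0):
    """Yield (train_ids, test_ids) pairs in which whole basins are held out.
    Each basin appears in exactly one test fold, so the model evaluated on
    it never saw that basin's data during training."""
    rng = np.random.default_rng(seed)
    shuffled = rng.permutation(np.asarray(basin_ids))
    folds = np.array_split(shuffled, k)
    for i in range(k):
        train = np.concatenate([folds[j] for j in range(k) if j != i])
        yield train, folds[i]


# Hypothetical usage: 531 placeholder basin IDs, k = 10 (assumed value).
basin_ids = np.arange(531)
splits = list(basin_folds(basin_ids, k=10))
held_out = np.sort(np.concatenate([test for _, test in splits]))

# Toy series: predicting the observed mean everywhere gives NSE = 0.
toy_obs = [1.0, 2.0, 3.0, 4.0]
nse_perfect = nse(toy_obs, toy_obs)
nse_mean = nse(toy_obs, [2.5, 2.5, 2.5, 2.5])
```

In a study of this kind, one model would be trained per fold, `nse` would be computed for each held-out basin over the validation period, and the per-basin scores would be summarized with a median (for example with `np.median`), which is the statistic the abstract reports.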








How to Cite

Fadziso, T. (2019). Enhancing Predictions in Ungauged Basins Using Machine Learning to Its Full Potential. Asian Journal of Applied Science and Engineering, 8(1), 35–50. https://doi.org/10.18034/ajase.v8i1.10