A machine-learning framework for virus early warning from weather and urban sustainability signals
Cities shape mosquito and arboviral risk through the way heat, water, vegetation, and drainage are managed. We present a fully reproducible early-warning pipeline that predicts trap–week West Nile virus (WNV) positivity one to two weeks ahead from programmatically accessible sources (Socrata WNV pools and Meteostat weather), followed by a graph-aware post hoc smoothing step that enforces spatial coherence of the predicted risks. Because direct API access to the Chicago portal requires a Socrata App Token, all reported experiments use an offline, synthetic trap–week panel that mirrors a Chicago-like spatio-seasonal structure; we ship turn-key code to re-run the identical analysis on the real data once a token is provided. Across forward-chaining evaluations, gradient boosting with Laplacian smoothing delivers discrimination comparable to that of a strong elastic-net baseline, as measured by the area under the receiver operating characteristic curve, but substantially better probabilistic calibration (lower Brier score and improved reliability). Feature profiles emphasize antecedent heat and relative dryness, in line with ecological priors, and enable tiered, probability-based operational playbooks for vector control and sustainability co-actions (cooling, drainage, vegetation). The pipeline is designed for transparency, portability, and policy relevance: calibrated probabilities support graded interventions and top-K targeting under budget constraints, while code parity between synthetic and real modes facilitates external replication.
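The smoothing and calibration ideas summarized above can be illustrated with a minimal sketch. It is not the pipeline's actual implementation: it assumes the smoothing takes the common Tikhonov form, solving (I + λL)s = p with L = D − W the combinatorial Laplacian of a trap adjacency graph, and it scores calibration with the Brier score (mean squared difference between predicted probability and binary outcome). The function names, the four-trap chain graph, the raw probabilities, and λ = 0.5 are all hypothetical values chosen for illustration.

```python
import numpy as np

def laplacian_smooth(p, W, lam=1.0):
    """Smooth per-trap probabilities p over a graph with adjacency W.

    Assumed form (not confirmed by the source): solve (I + lam * L) s = p,
    where L = D - W is the combinatorial graph Laplacian.
    """
    p = np.asarray(p, dtype=float)
    W = np.asarray(W, dtype=float)
    L = np.diag(W.sum(axis=1)) - W
    return np.linalg.solve(np.eye(len(p)) + lam * L, p)

def brier_score(y, p):
    """Mean squared error between binary outcomes y and probabilities p."""
    y = np.asarray(y, dtype=float)
    p = np.asarray(p, dtype=float)
    return float(np.mean((p - y) ** 2))

# Hypothetical chain of four traps: 0 - 1 - 2 - 3
W = np.array([
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
], dtype=float)

raw = np.array([0.10, 0.80, 0.15, 0.20])   # hypothetical model outputs
smoothed = laplacian_smooth(raw, W, lam=0.5)

# Top-K targeting under a budget of K = 2 trap visits
top_k = np.argsort(-smoothed)[:2]
```

With this form of smoothing the mean predicted risk is unchanged, isolated spikes are pulled toward their neighbours, and every smoothed value stays within the range of the raw probabilities, so the outputs remain valid probabilities. Larger λ enforces stronger spatial coherence at the cost of local detail.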
