American Journal of Mathematics and Statistics
p-ISSN: 2162-948X e-ISSN: 2162-8475
2026; 15(2): 21-30
doi:10.5923/j.ajms.20261502.01
Received: May 6, 2026; Accepted: May 28, 2026; Published: Jun. 4, 2026

Nicole I. Rodriguez1, Aarya Satardekar1, Namit Choudhari2, Spuritha Bhandaru1, Anusha Parajuli1, Rishil Shah3, Benjamin G. Jacob1
1Samuel P. Bell III College of Public Health, University of South Florida, Tampa, Florida, United States of America
2School of Geosciences, College of Arts & Sciences, University of South Florida, Tampa, Florida, United States of America
3Department of Computer Science and Engineering, Bellini College of Artificial Intelligence, Cybersecurity and Computing, University of South Florida, Tampa, Florida, United States of America
Correspondence to: Aarya Satardekar, Samuel P. Bell III College of Public Health, University of South Florida, Tampa, Florida, United States of America.
Copyright © 2026 The Author(s). Published by Scientific & Academic Publishing.
This work is licensed under the Creative Commons Attribution International License (CC BY).
http://creativecommons.org/licenses/by/4.0/

Chlamydia remains one of the most prevalent sexually transmitted infections in the United States, with substantial geographic and demographic disparities at fine spatial scales. Traditional surveillance methods often lack the environmental and contextual resolution needed to identify fine-scale spatial clustering patterns in disease burden. The emerging integration of satellite remote sensing and machine-learning approaches offers new opportunities for high-resolution spatial risk mapping. This study aimed to develop a ZIP Code Tabulation Area (ZCTA)-level spatial modeling framework that integrates Sentinel-2 multispectral satellite data with machine-learning algorithms to identify and map exploratory spatial clustering patterns in estimated chlamydia vulnerability, while stratifying risk by socioeconomic, sociodemographic, and racial characteristics. Because ZCTA-level chlamydia case data were unavailable, incidence was approximated using proportional allocation based on population distributions and subsequently linked with Sentinel-2 satellite spectral bands. Socioeconomic, sociodemographic, and racial covariates were incorporated from census-derived data sources. Three supervised machine-learning algorithms (Random Forest, Support Vector Machine, and Extreme Gradient Boosting [XGBoost]) were evaluated to explore spatial patterns in estimated chlamydia vulnerability proxies across ZCTAs. The analyses revealed pronounced disparities, with persistent high-risk clusters concentrated in socioeconomically disadvantaged and racially marginalized ZCTAs. Integrating Sentinel-2 satellite-derived environmental covariates with machine-learning models may support exploratory fine-scale mapping of estimated chlamydia vulnerability patterns. This approach supports targeted surveillance and intervention strategies and provides a scalable framework for studying chlamydia in relation to environmental and social determinants of health.
Keywords: Chlamydia, Sentinel-2, Machine Learning, Poisson, XGBoost, Random Forest, Spatial Hotspot Analysis, Health Disparities
Cite this paper: Nicole I. Rodriguez, Aarya Satardekar, Namit Choudhari, Spuritha Bhandaru, Anusha Parajuli, Rishil Shah, Benjamin G. Jacob, Exploratory Spatial Risk Modeling of Chlamydia Vulnerability at the ZCTA Level Using Spatial Filtering and Sentinel-2 Data, American Journal of Mathematics and Statistics, Vol. 15 No. 2, 2026, pp. 21-30. doi: 10.5923/j.ajms.20261502.01.
Figure 1. Study Site Map of Hillsborough County, Florida
In the model specification, the response was the estimated chlamydia case count in each ZCTA, the ZCTA population served as the offset, and the predictors included the sampled online, socioeconomic, racial, and sociodemographic variables, machine-learning-derived Sentinel-2 features, and spatial covariates. Overdispersion was assessed. Spatial dependence was further incorporated through random effects and spatial filtering terms derived from eigenfunctions.

First, pixel-level spectral features were aggregated to ZCTA boundaries using spatial averaging and histogram-based descriptors (mean, variance, and texture metrics). Second, to enable inference at broader spatial scales (county level), we applied machine-learning-based spectral upscaling models. Specifically, Random Forest regression, Support Vector Regression (SVR), and Gradient Boosting Machines (GBM) were trained as independent supervised learning models to capture nonlinear relationships between environmental features and estimated chlamydia vulnerability proxies. Model performance was evaluated using cross-validated R² and root mean squared error (RMSE), and the best-performing model was used to generate continuous county-level environmental surfaces from ZCTA inputs.
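The role of the population offset and the overdispersion check can be illustrated with a minimal sketch. It assumes a deliberately simplified, intercept-only Poisson rate model, log(μ_i) = log(population_i) + β₀, rather than the full covariate model described above. The function name `pearson_dispersion` and all counts and populations are hypothetical and are not study data.

```python
def pearson_dispersion(counts, populations, n_params=1):
    """Pearson dispersion for an intercept-only Poisson model with a population offset.

    Simplifying assumption: log(mu_i) = log(population_i) + beta0, so the
    maximum-likelihood rate exp(beta0) is total cases divided by total population.
    """
    if len(counts) != len(populations):
        raise ValueError("counts and populations must have the same length")
    if len(counts) <= n_params:
        raise ValueError("need more areas than fitted parameters")
    if any(p <= 0 for p in populations):
        raise ValueError("populations must be positive")
    if sum(counts) <= 0:
        raise ValueError("at least one case is required")

    # Common rate implied by the offset model (cases per person).
    rate = sum(counts) / sum(populations)
    # Expected count in each area: rate scaled by that area's population.
    fitted = [rate * p for p in populations]
    # Pearson chi-square: squared residuals scaled by the Poisson variance (= mean).
    chi_square = sum((y - mu) ** 2 / mu for y, mu in zip(counts, fitted))
    # Dispersion estimate: chi-square divided by residual degrees of freedom.
    dispersion = chi_square / (len(counts) - n_params)
    return rate, fitted, dispersion


# Hypothetical ZCTA-level counts and populations (illustrative only).
example_counts = [12, 30, 7, 55, 21]
example_populations = [2000, 4000, 1500, 5000, 3500]

rate, fitted, dispersion = pearson_dispersion(example_counts, example_populations)
print(f"rate per 1,000: {1000 * rate:.2f}; Pearson dispersion: {dispersion:.2f}")
```

A dispersion estimate close to 1 is consistent with the Poisson assumption that the variance equals the mean. Values well above 1 suggest overdispersion, in which case a quasi-Poisson or negative binomial specification is a common alternative.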
Figure 2. Spatial Autocorrelation Report
Figure 3. Hot and cold spot map of Chlamydia cases in Hillsborough County by ZCTA
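For readers unfamiliar with the kind of statistic summarized in a spatial autocorrelation report such as Figure 2, the following minimal sketch computes global Moran's I. That the report is based on Moran's I is an assumption, since the statistic is not named in this section. The function name `morans_i`, the chain-adjacency weights, and the values are hypothetical and are not study data.

```python
def morans_i(values, weights):
    """Global Moran's I for area values and a square spatial weights matrix.

    I = (n / S0) * sum_ij(w_ij * z_i * z_j) / sum_i(z_i ** 2),
    where z_i are deviations from the mean and S0 is the sum of all weights.
    """
    n = len(values)
    if len(weights) != n or any(len(row) != n for row in weights):
        raise ValueError("weights must be an n x n matrix")
    mean = sum(values) / n
    z = [v - mean for v in values]
    s0 = sum(sum(row) for row in weights)
    denominator = sum(zi * zi for zi in z)
    if s0 == 0 or denominator == 0:
        raise ValueError("weights and values must both have nonzero variation")
    # Cross-products of deviations for every pair of areas, weighted by adjacency.
    numerator = sum(
        weights[i][j] * z[i] * z[j] for i in range(n) for j in range(n)
    )
    return (n / s0) * numerator / denominator


# Hypothetical example: four areas in a row, each adjacent to its neighbors.
chain_weights = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]

clustered = morans_i([1, 2, 3, 4], chain_weights)    # similar values are adjacent
alternating = morans_i([1, 4, 1, 4], chain_weights)  # dissimilar values are adjacent
print(f"clustered: {clustered:.3f}, alternating: {alternating:.3f}")
```

Positive values indicate that similar values tend to be neighbors, and negative values indicate that dissimilar values tend to be neighbors. Under the null hypothesis of no spatial autocorrelation, the expected value is -1/(n - 1).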