Uncertainty-Aware Machine Learning for Ambient Air-Pollution Exposure Surfaces in Biomedical Research: From Data Fusion to Neuroepidemiology-Ready Inference

Betekhtin AA

doi:10.37871/jbres2245

ISSN: 2766-2276

2025 December 30;6(12):1984-1995. doi: 10.37871/jbres2245.


Mini Review


Betekhtin AA*

ITMO University, Lomonosova St., 9, 191002, Saint Petersburg, Russia

*Corresponding author: Betekhtin AA, ITMO University, Lomonosova St., 9, 191002, Saint Petersburg, Russia

Received: 12 December 2025 | Accepted: 29 December 2025 | Published: 30 December 2025

How to cite this article: Betekhtin AA. Uncertainty-Aware Machine Learning for Ambient Air-Pollution Exposure Surfaces in Biomedical Research: From Data Fusion to Neuroepidemiology-Ready Inference. J Biomed Res Environ Sci. 2025 Dec 30; 6(12): 1996-2001. doi: 10.37871/jbres2245, Article ID: jbres2245

Keywords

Air pollution
Exposure modelling
Machine learning
Uncertainty quantification
Spatiotemporal deep learning
PM_2.5
NO₂
Epidemiology


Abstract

Ambient air pollution remains a major, preventable driver of cardiometabolic and neurological disease burden. For biomedical studies, the central methodological bottleneck is not only prediction of pollutant concentrations but trustworthy exposure assessment: leakage-safe validation, uncertainty quantification (UQ), models that remain transportable to low-monitor regions, and transparent propagation of exposure uncertainty into health-effect estimates. This mini-review synthesizes recent advances in global and regional PM_2.5 mapping, spatiotemporal deep learning, virtual monitoring stations, and gap-filling, and links these developments to the rapidly expanding evidence on dementia risk. We provide a practical checklist and worked calculations that translate modern machine learning (ML) exposure products into epidemiology-ready inputs.

Key Points (What Biomedical Reviewers Usually Look for)

Exposure surfaces: ML models must be evaluated with spatial and temporal cross-validation that matches the target use (e.g., out-of-region prediction), not only random splits [1,2].
Uncertainty: point predictions are insufficient; credible intervals (or full predictive distributions) are needed to propagate exposure error into health-effect inference [3].
Transportability: hybrid “physics + ML” approaches and geophysical priors reduce degradation far from monitors.
Open data: harmonized monitoring streams (e.g., OpenAQ) and standardized metadata improve reproducibility, but versioning and API changes must be documented [4].
Biomedical relevance: recent systematic reviews and large cohorts support associations between long-term pollution exposure and incident dementia, motivating higher-resolution and better-validated exposure models [5-9].

Introduction (Why “ML for air quality” is Now a Biomedical Methods Topic)

The 2021 WHO Global Air Quality Guidelines substantially tightened recommended levels for key pollutants, including PM_2.5 (annual mean 5 µg/m³; 24-hour mean 15 µg/m³) [10]. In Europe, updated indicators continue to report a large burden attributable to PM_2.5 exposure [2]. Regulatory tightening (e.g., the EU recast Ambient Air Quality Directive) and new accountability mechanisms (including legal avenues for affected citizens) increase demand for transparent, uncertainty-aware evidence [11-15].

For biomedical research, the key deliverable is an exposure surface: a spatial–temporal field x(s,t) that can be linked to participants by location history. Modern surfaces are typically produced by data fusion (monitors + satellite AOD + chemical transport models + meteorology + land use) and increasingly by spatiotemporal deep learning [16-18]. However, an exposure model that minimizes mean squared error can still be unsafe for epidemiology if it leaks information across space or time, fails in low-monitor regions, or provides no UQ.

Core Definitions (Terms Used Consistently in This Paper)

Exposure surface x(s,t): Estimated pollutant concentration at location s and time t, aligned to the health-study time scale (daily, monthly, annual).
Data fusion: Combining multiple information sources (monitors, satellites, chemical transport models (CTMs), land-use predictors) to estimate x(s,t) [18,19].
Spatial cross-validation: Validation that withholds entire regions (or monitors) to test transportability; contrasts with random splits, which can overestimate performance [1,2].
Uncertainty quantification (UQ): Reporting predictive uncertainty (e.g., a predictive standard deviation σ(s,t) or predictive intervals) and propagating it into downstream analyses [3].
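
As an illustration of the spatial cross-validation definition above, the following minimal sketch builds leave-one-region-out folds in plain Python. The function name, the region labels, and the monitor-to-region assignment are hypothetical and chosen only for illustration; how regions are delineated in practice is a study-specific decision that this sketch does not address.

```python
from collections import defaultdict


def leave_one_region_out(region_labels):
    """Yield (held_out_region, train_idx, test_idx), one fold per region.

    region_labels[i] is the region of monitor i. Every monitor in the
    held-out region goes to the test set, so no test monitor has a
    same-region neighbour in the training set.
    """
    by_region = defaultdict(list)
    for idx, region in enumerate(region_labels):
        by_region[region].append(idx)
    for held_out in sorted(by_region):
        test_idx = list(by_region[held_out])
        train_idx = sorted(
            i
            for region, members in by_region.items()
            if region != held_out
            for i in members
        )
        yield held_out, train_idx, test_idx


# Hypothetical assignment of seven monitors to three regions.
monitor_regions = ["north", "north", "south", "south", "east", "east", "east"]
folds = list(leave_one_region_out(monitor_regions))
for held_out, train_idx, test_idx in folds:
    print(held_out, "train:", train_idx, "test:", test_idx)
```

A random split of the same seven monitors would usually place same-region monitors on both sides of the split, which is the situation the definition warns can overestimate performance.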

What the Last Wave of Global PM_2.5 ML Mapping Changed (2019-2025)

Three trends dominate recent high-impact exposure modelling:

Global, long-term PM_2.5 fields with consistent methodology

High-resolution, long-term global PM_2.5 products now combine satellites, models, and monitors with statistical/ML layers, enabling decade-scale exposure assessment [16-18]. These surfaces are attractive for cohort studies because they offer wide coverage and consistent back-casting.

“Physics + ML” to improve low-monitor transportability

Purely data-driven models often degrade far from monitors. Incorporating geophysical a priori estimates into deep learning explicitly targets this failure mode [1]. The implication for biomedical studies is straightforward: improved out-of-sample performance reduces differential exposure misclassification between urban (monitor-rich) and rural (monitor-sparse) participants.

Epidemiology-facing UQ and reproducibility

Methodological work increasingly emphasizes uncertainty-aware fusion and explicit validation protocols [3]. In parallel, open monitoring infrastructures facilitate reproducible pipelines, but only if API versions, licensing, and provenance are recorded [4,20].

Practical checklist for an epidemiology-ready ML exposure model

Table 1 summarizes failure modes that frequently trigger reviewer pushback.

Table 1: Epidemiology-ready checklist for ML exposure surfaces.
Item	What to report / do
Target time scale	Define t (daily / monthly / annual) and justify for disease latency (e.g., dementia: multi-year means) [5,6]
Spatial CV	Report region-holdout / monitor-holdout performance (not only random CV) [1,2]
Uncertainty	Provide predictive intervals or distributions; show calibration (empirical coverage) [3]
Data provenance	Document monitoring sources and versions (e.g., OpenAQ v3; retired v1/v2 endpoints) [4]
Missingness	Describe gap-filling strategy for monitors/time series if used [25]
Non-stationarity	Address trend/drift (policy changes, emissions shifts) in training/validation [18]
Leakage controls	Ensure no future data inform past predictions; avoid spatial “bleed” from nearby monitors in random splits [2]
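
The "Uncertainty" row of the checklist asks for calibration in terms of coverage. One way to check this, sketched below under the assumption of normal predictive distributions, is to count how often held-out monitor values fall inside the nominal central interval. The function name and the five held-out values are hypothetical, not taken from any cited product.

```python
from statistics import NormalDist


def interval_coverage(observed, pred_mean, pred_sd, level=0.90):
    """Fraction of observations inside the central `level` predictive interval.

    Assumes each predictive distribution is normal with the given mean and
    standard deviation (an assumption of this sketch).
    """
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    hits = sum(
        1
        for y, mu, sd in zip(observed, pred_mean, pred_sd)
        if mu - z * sd <= y <= mu + z * sd
    )
    return hits / len(observed)


# Hypothetical held-out monitor values and predictions (µg/m³).
obs = [10.0, 14.0, 8.0, 21.0, 12.0]
mu = [11.0, 12.0, 9.0, 14.0, 12.5]
sd = [2.0, 2.0, 1.5, 3.0, 2.0]
coverage = interval_coverage(obs, mu, sd, level=0.90)
print(f"nominal 0.90, empirical {coverage:.2f}")
```

Empirical coverage well below the nominal level suggests overconfident intervals; coverage well above it suggests intervals that are wider than necessary. In a real evaluation the held-out set would come from region-holdout folds rather than a random split.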

Worked Examples / Calculations (With Sanity Checks)

Example 1: Exceedance probability using a predictive distribution

Suppose an ML surface provides, for a given day and location, a predictive mean µ and standard deviation σ for daily PM_2.5. To estimate the probability of exceeding the WHO 24-hour guideline g = 15 µg/m³, a simple and commonly used approximation treats the predictive distribution as normal:

P(exceed) = P(X > g) ≈ 1 − Φ((g − µ)/σ), (1)

where Φ is the standard normal CDF.

Numerical example (units and sanity check). Let µ = 12 µg/m³ and σ = 4 µg/m³. Then

P(exceed) ≈ 1 − Φ((15 − 12)/4) = 1 − Φ(0.75) ≈ 1 − 0.773 = 0.227.

Sanity check: since µ < g, exceedance probability should be < 0.5; 22.7% is plausible.
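
The same calculation can be reproduced in a few lines of Python using only the standard library. This is a minimal sketch of the normal approximation 1 − Φ((g − µ)/σ); the function name is ours, and a normal predictive distribution is an assumption that may fit skewed daily concentrations poorly.

```python
from statistics import NormalDist


def exceedance_probability(mu, sigma, threshold):
    """P(X > threshold) for X ~ Normal(mu, sigma**2)."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    return 1.0 - NormalDist(mu, sigma).cdf(threshold)


# Values from the numerical example: µ = 12, σ = 4, g = 15 (µg/m³).
p = exceedance_probability(12.0, 4.0, 15.0)
print(round(p, 3))  # 0.227
```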

Example 2: Attenuation of a health-effect estimate by classical exposure error

Let the (unobserved) true long-term exposure be X∗ and the estimated exposure be X = X∗ + ε, with noise ε independent of X∗. Under classical measurement error, regression coefficients are attenuated approximately by the factor

λ = Var(X∗) / (Var(X∗) + Var(ε)). (2)

Thus, a “true” association β∗ may be observed as β ≈ λβ∗. This is a central motivation for UQ and transportability-focused modelling.

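Numerical example. Assume between-person long-term exposure variability SD(X∗) = 6 µg/m³, so Var(X∗) = 36. If the exposure model has RMSE ≈ 3 µg/m³, a rough proxy is Var(ε) ≈ 9. Then λ = Var(X∗) / (Var(X∗) + Var(ε)) = 36 / (36 + 9) = 0.80, so the observed coefficient would be roughly 80% of the true one.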

Sanity check: better models (smaller RMSE) increase λ toward 1, reducing attenuation.
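
A short sketch makes the dependence of the attenuation factor on model error concrete. It computes λ = Var(X∗) / (Var(X∗) + Var(ε)) and, as in the numerical example, uses squared RMSE as a rough proxy for Var(ε); the function name and the extra RMSE values are illustrative assumptions.

```python
def attenuation_factor(var_true, var_error):
    """Classical-error attenuation factor: Var(X*) / (Var(X*) + Var(eps))."""
    if var_true <= 0 or var_error < 0:
        raise ValueError("var_true must be positive and var_error non-negative")
    return var_true / (var_true + var_error)


sd_true = 6.0  # between-person SD of long-term exposure, µg/m³
lam = attenuation_factor(sd_true**2, 3.0**2)  # RMSE = 3 µg/m³ as proxy for SD(eps)
print(f"lambda = {lam:.2f}")

# Illustrative sweep: smaller RMSE moves lambda toward 1.
sweep = {rmse: attenuation_factor(sd_true**2, rmse**2) for rmse in (1.0, 3.0, 6.0)}
for rmse, value in sweep.items():
    print(f"RMSE {rmse:.0f} -> lambda {value:.3f}")
```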

Example 3: Monte Carlo propagation of exposure uncertainty into a Cox model

When an exposure surface provides (µ_i, σ_i) for participant i, a simple uncertainty-propagation workflow is:

For m = 1,...,M draws, sample X_i^(m) ~ N(µ_i, σ_i²) for each participant i (or draw from the model’s full predictive distribution).
Fit the health model (e.g., Cox) to each draw to obtain β̂^(m).
Report the distribution of β̂^(m) (mean, CI), separating statistical uncertainty from exposure uncertainty.
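
The three steps above can be sketched with the standard library alone. To keep the example self-contained, a simple linear regression stands in for the Cox model; this is a substitution made for illustration, and any fitting function that returns an estimate and its squared standard error could be passed as `fit`. The pooling uses Rubin-style rules (within-draw plus between-draw variance), which is our assumption about how to separate the two sources; the synthetic cohort, effect size, and seeds are hypothetical.

```python
import random
from statistics import mean, variance


def ols_slope(x, y):
    """Slope and squared standard error of a simple linear regression of y on x."""
    n = len(x)
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    intercept = y_bar - slope * x_bar
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    return slope, rss / (n - 2) / sxx


def propagate_exposure_uncertainty(mu, sigma, outcome, fit=ols_slope,
                                   n_draws=200, seed=0):
    """Refit the health model on M exposure draws and pool the results."""
    rng = random.Random(seed)
    estimates, squared_ses = [], []
    for _ in range(n_draws):
        # Step 1: draw one exposure per participant from N(mu_i, sigma_i^2).
        x_draw = [rng.gauss(m, s) for m, s in zip(mu, sigma)]
        # Step 2: fit the health model to this draw.
        beta_hat, se2 = fit(x_draw, outcome)
        estimates.append(beta_hat)
        squared_ses.append(se2)
    # Step 3: pool. "within" is statistical, "between" is exposure-driven.
    within = mean(squared_ses)
    between = variance(estimates)
    total = within + (1 + 1 / n_draws) * between
    return {"beta": mean(estimates), "within": within,
            "between": between, "total": total}


# Hypothetical synthetic cohort of 300 participants.
rng = random.Random(42)
mu = [rng.gauss(12.0, 6.0) for _ in range(300)]
sigma = [3.0] * 300
outcome = [0.5 * m + rng.gauss(0.0, 2.0) for m in mu]

pooled = propagate_exposure_uncertainty(mu, sigma, outcome)
half_width = 1.96 * pooled["total"] ** 0.5
print(f"beta = {pooled['beta']:.3f} +/- {half_width:.3f}")
print(f"within = {pooled['within']:.5f}, between = {pooled['between']:.5f}")
```

Note that drawing exposures independently of the outcome adds noise to the regressor, so the pooled coefficient in this sketch remains attenuated relative to the simulated effect of 0.5. The sketch shows how the two variance components can be reported separately; it is not a bias correction.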

Why Dementia is a Compelling “biomedical endpoint” for ML Exposure Methods

The evidence base linking long-term ambient pollution to incident dementia has expanded rapidly in recent years. A 2025 systematic review and meta-analysis synthesized the growing observational literature [21], complementing earlier broad syntheses. Large cohort studies report associations between long-term PM_2.5 and NO₂ exposure and the incidence of dementia and Alzheimer’s disease. Mechanistically adjacent neurodegenerative outcomes are also being investigated; for example, a 2025 Science study reported links between long-term PM_2.5 exposure and Lewy body dementia.

For such endpoints, the methodological requirement is stronger than for short-latency outcomes: multi-year averaging, sensitivity analyses to mobility, and robust out-of-region exposure prediction become essential. Hence, “physics + ML” transportability gains and UQ are not cosmetic features; they directly affect bias and interpretability.
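
A minimal sketch of multi-year averaging with a mobility sensitivity check is given below. It computes a time-weighted mean over a residence history and compares it with the value obtained by assuming the participant never left the baseline address. The function name and the two-address history are hypothetical.

```python
def time_weighted_mean(residence_history):
    """Time-weighted mean exposure over a residence history.

    residence_history is a list of (years_at_address, mean_concentration)
    pairs, with concentrations in µg/m³.
    """
    total_years = sum(years for years, _ in residence_history)
    if total_years <= 0:
        raise ValueError("residence history must cover a positive duration")
    return sum(years * conc for years, conc in residence_history) / total_years


# Hypothetical participant: 6 years at 14 µg/m³, then 4 years at 8 µg/m³.
history = [(6.0, 14.0), (4.0, 8.0)]
with_mobility = time_weighted_mean(history)
baseline_only = history[0][1]  # exposure if the move is ignored
print(f"with mobility: {with_mobility:.1f}, baseline address only: {baseline_only:.1f}")
```

The gap between the two values indicates how much exposure misclassification a baseline-address analysis could introduce for participants who move.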

Emerging Methods That Reviewers Now Expect You to Cite

Beyond global mapping, biomedical submissions increasingly cite:

Forecasting architectures that couple decomposition + graph learning + sequence models (useful for short-term health endpoints and operational warnings).
Virtual monitoring stations that estimate concentrations in unmonitored locations using ML (relevant when residential geocoding is fine-grained).
Gap-filling benchmarks for incomplete monitoring time series (important if you build local fusion models from raw monitors).
Map recovery / sparse sensing concepts that formalize reconstruction from limited sensors.
Policy context that motivates thresholds and public-health interpretation (WHO guidelines; EU Directive 2024/2881) [22-31].
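
To make the gap-filling item in the list above concrete, the sketch below shows the basic shape of a masked-value benchmark: hide some observed values, fill them, and score the fill against the hidden truth. Linear interpolation is used here only as a simple baseline of our choosing, not as the method evaluated in the cited benchmarks; the function names and the six-day series are hypothetical.

```python
import math


def linear_gap_fill(series):
    """Fill None gaps by linear interpolation between observed neighbours.

    Gaps at either end take the nearest observed value.
    """
    known = [i for i, v in enumerate(series) if v is not None]
    if not known:
        raise ValueError("series has no observed values")
    filled = list(series)
    for i, v in enumerate(series):
        if v is not None:
            continue
        left = max((k for k in known if k < i), default=None)
        right = min((k for k in known if k > i), default=None)
        if left is None:
            filled[i] = series[right]
        elif right is None:
            filled[i] = series[left]
        else:
            w = (i - left) / (right - left)
            filled[i] = series[left] + w * (series[right] - series[left])
    return filled


def masked_rmse(truth, mask_idx, fill=linear_gap_fill):
    """Hide truth[i] for i in mask_idx, fill the gaps, and return the RMSE."""
    masked = [None if i in mask_idx else v for i, v in enumerate(truth)]
    filled = fill(masked)
    squared = [(filled[i] - truth[i]) ** 2 for i in mask_idx]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical complete daily PM_2.5 series (µg/m³); days 1 and 3 are hidden.
daily = [10.0, 12.0, 14.0, 13.0, 11.0, 9.0]
score = masked_rmse(daily, {1, 3})
print(f"masked RMSE: {score:.3f}")
```

Any other filling method can be compared on the same mask by passing it as `fill`, which keeps the comparison between methods like-for-like.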

Conclusion

Machine learning has shifted ambient air-pollution exposure assessment from coarse averages to high-resolution global and regional surfaces. For biomedical research, the next bar is trust: spatially honest validation, calibrated uncertainty, and transparent propagation of exposure error into health models. These requirements align with regulatory tightening and a rapidly growing neuroepidemiology literature on dementia risk. A pragmatic path for submissions to ML-focused biomedical journals is to present exposure modelling as an inference pipeline rather than a pure prediction task: data provenance (e.g., OpenAQ), transportability (physics + ML), UQ, and sensitivity analyses that match the disease time scale.

Data Availability Statement

This mini-review used publicly accessible documentation and published literature. No new human subject data were collected.

References


This work is licensed under a Creative Commons Attribution 4.0 International License.
