Theoretical and Applied Climatology · 2026

Visibility nowcasting in South Korea: a machine learning approach to class imbalance and distribution shift

Bong Gyun Shin1, Chan Sik Lee2, Hyesun Suh1

1Daejin University · 2Soongsil University

Volume 157, Article 283 · Published 10 April 2026 · DOI: 10.1007/s00704-026-06219-6

Abstract

Atmospheric visibility affects transportation safety, aviation operations, and environmental risk management, but low-visibility events are rare and arise from intertwined meteorological and air-pollution conditions. This project studies visibility nowcasting for six major South Korean cities—Seoul, Busan, Incheon, Daegu, Daejeon, and Gwangju—using observed weather and air-quality data from 2018 to 2021.

The paper combines ASOS meteorological observations with AirKorea air-quality measurements, handles class imbalance in the 2018–2020 training period with SMOTENC and CTGAN-based augmentation, and evaluates five machine-learning and deep-learning model families with a CSI-focused objective. Its central finding is two-sided: augmentation and ensembling can help the modeling workflow, but performance drops on the 2021 test period reveal a temporal distribution shift that operational nowcasting needs to address explicitly.

Method

Overall visibility-nowcasting framework from data collection and augmentation to model training, ensemble prediction, and distribution analysis.
Overall framework. The workflow merges KMA ASOS meteorological observations with AirKorea air-quality measurements, fills missing values, applies time-aware training splits, augments rare low-visibility classes, and evaluates the resulting models with a distribution-shift analysis.

The study treats visibility nowcasting as a tabular time-series classification problem in which the minority low-visibility classes carry the operational risk. This page follows the paper's sequence: build a regional observation table, address missing values and class imbalance, train machine-learning and deep-learning models, then examine why validation performance does not always transfer to the 2021 test period.
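The table-building and hold-out steps described above can be sketched as follows. This is a minimal illustration, not the paper's pipeline: the record fields (`city`, `time`, `rh`, `pm25`, `visibility_km`), the helper names, and the toy values are all hypothetical, and matching each weather record to an air-quality record by city name stands in for whatever station-pairing rule the paper actually uses. Only the 2018–2020 training period and the 2021 test period come from the paper.

```python
from datetime import datetime


def merge_observations(weather_rows, air_rows):
    """Inner-join weather and air-quality records on (city, timestamp)."""
    air_by_key = {(r["city"], r["time"]): r for r in air_rows}
    merged = []
    for w in weather_rows:
        a = air_by_key.get((w["city"], w["time"]))
        if a is not None:  # keep only hours present in both sources
            merged.append({**w, **a})
    return merged


def split_by_year(rows, last_train_year=2020):
    """Time-aware hold-out: earlier years for training, later years for testing."""
    train = [r for r in rows if r["time"].year <= last_train_year]
    test = [r for r in rows if r["time"].year > last_train_year]
    return train, test


# Hypothetical toy records (not data from the paper).
weather = [
    {"city": "Seoul", "time": datetime(2020, 1, 1, 9), "rh": 82.0, "visibility_km": 4.2},
    {"city": "Seoul", "time": datetime(2021, 1, 1, 9), "rh": 60.0, "visibility_km": 12.0},
    {"city": "Busan", "time": datetime(2021, 1, 1, 9), "rh": 55.0, "visibility_km": 15.0},
]
air = [
    {"city": "Seoul", "time": datetime(2020, 1, 1, 9), "pm25": 48.0},
    {"city": "Seoul", "time": datetime(2021, 1, 1, 9), "pm25": 21.0},
]

table = merge_observations(weather, air)
train_rows, test_rows = split_by_year(table)
```

Splitting by calendar year rather than at random keeps every 2021 observation out of training, which is what makes a later validation-to-test comparison meaningful.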

SMOTENC and CTGAN are used because the dataset combines meteorological, air-quality, temporal, and location-related variables, and augmentation targets the minority visibility classes. The modeling stage compares XGBoost, LightGBM, ResNet-like, FT-Transformer, and DeepGBM families. Ensemble voting then combines the optimized models' probabilities; the paper does not claim that a single augmentation method universally dominates.
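One common way to combine model probabilities is soft voting, sketched below. Whether the paper uses equal or tuned weights is not stated here, so the `weights` argument, the function name `soft_vote`, and the example probabilities are assumptions for illustration only.

```python
def soft_vote(prob_sets, weights=None):
    """Average per-model class probabilities; return (labels, averaged_probs).

    prob_sets: one list per model, each holding per-sample probability rows.
    weights:   optional per-model weights (assumed equal when omitted).
    """
    if weights is None:
        weights = [1.0] * len(prob_sets)
    total_weight = sum(weights)
    n_samples = len(prob_sets[0])
    n_classes = len(prob_sets[0][0])

    labels, averaged = [], []
    for i in range(n_samples):
        row = [
            sum(w * probs[i][c] for w, probs in zip(weights, prob_sets)) / total_weight
            for c in range(n_classes)
        ]
        averaged.append(row)
        # Predicted class is the one with the highest averaged probability.
        labels.append(max(range(n_classes), key=row.__getitem__))
    return labels, averaged


# Hypothetical probabilities for two samples over Classes 0, 1, 2.
model_a = [[0.7, 0.2, 0.1], [0.1, 0.5, 0.4]]
model_b = [[0.3, 0.5, 0.2], [0.1, 0.2, 0.7]]

labels, averaged = soft_vote([model_a, model_b])
```

In this toy case the two models disagree on both samples, and the averaged probabilities settle on Class 0 for the first sample and Class 2 for the second.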

Map highlighting Seoul, Busan, Incheon, Daegu, Daejeon, and Gwangju as the six study cities.
Study area. The nowcasting experiments focus on six major South Korean cities—Seoul, Busan, Incheon, Daegu, Daejeon, and Gwangju—where ASOS weather observations can be paired with nearby AirKorea air-quality measurements.

Results

Because rare low-visibility detection matters more than majority-class accuracy, the paper emphasizes the Critical Success Index (CSI). Visibility is grouped into three classes: Class 0 for severe low visibility below 1 km, Class 1 for reduced visibility from 1 to 5 km, and Class 2 for normal visibility above 5 km. The result figures below show the main performance signal first, then the distribution-shift diagnosis that explains why cross-validation gains can weaken on a later test period.
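The class grouping and the CSI can be written out in a few lines. CSI for a target class is hits / (hits + misses + false alarms), so correct predictions of other classes do not inflate it. The sketch below makes two assumptions that are not taken from the paper: a reading of exactly 5 km is assigned to Class 1, and CSI is computed per class in a one-vs-rest fashion (how the paper aggregates across classes is not stated here). The sample visibilities and predictions are hypothetical.

```python
def visibility_class(visibility_km):
    """Map a visibility reading in km to Class 0, 1, or 2."""
    if visibility_km < 1.0:
        return 0  # severe low visibility
    if visibility_km <= 5.0:  # assumption: exactly 5 km counts as Class 1
        return 1  # reduced visibility
    return 2  # normal visibility


def csi(y_true, y_pred, target_class):
    """Critical Success Index for one class: hits / (hits + misses + false alarms)."""
    pairs = list(zip(y_true, y_pred))
    hits = sum(1 for t, p in pairs if t == target_class and p == target_class)
    misses = sum(1 for t, p in pairs if t == target_class and p != target_class)
    false_alarms = sum(1 for t, p in pairs if t != target_class and p == target_class)
    denominator = hits + misses + false_alarms
    return hits / denominator if denominator else float("nan")


# Hypothetical readings and predictions.
visibility_km = [0.4, 0.8, 3.0, 4.5, 9.0, 12.0]
y_true = [visibility_class(v) for v in visibility_km]  # [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 2, 2, 2]

csi_class0 = csi(y_true, y_pred, 0)  # 1 hit, 1 miss, 0 false alarms -> 0.5
csi_class1 = csi(y_true, y_pred, 1)  # 1 hit, 1 miss, 1 false alarm -> 1/3
csi_class2 = csi(y_true, y_pred, 2)  # 2 hits, 0 misses, 1 false alarm -> 2/3
```

Plain accuracy on the same toy predictions would be 4/6, which hides the fact that half of the Class 0 events were missed.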

CSI performance improvement from data augmentation for XGBoost and ResNet-like models across South Korean cities.
CSI summary. The paper reports city- and model-dependent CSI changes after SMOTENC, CTGAN, and hybrid augmentation. This comparison is a representative example of the evaluation focus rather than a full catalog of every model-family plot.

The reported pattern is deliberately nuanced. Augmentation can help rare-event detection, and ensembling can stabilize predictions, but the 2021 test period still exposes a temporal distribution shift. The paper first makes that drop explicit by comparing each region's validation CSI with its held-out test CSI.

Table 8 from the accepted paper comparing validation CSI, test CSI, performance gap, and percent change for six South Korean cities.
Table 8. The final ensemble loses CSI on the 2021 test period in every region. Seoul and Daegu show the largest drops, Daejeon also degrades substantially, while Busan, Incheon, and Gwangju decline more modestly.
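The gap and percent-change columns follow from simple arithmetic on the two CSI values. The sketch below assumes the gap is defined as test minus validation and the percent change is taken relative to the validation CSI; the paper's sign convention may differ, and the numbers used are hypothetical rather than values from Table 8.

```python
def csi_gap(validation_csi, test_csi):
    """Return (gap, percent_change) of test CSI relative to validation CSI."""
    gap = test_csi - validation_csi  # negative when performance drops
    percent_change = 100.0 * gap / validation_csi
    return gap, percent_change


# Hypothetical CSI values, not taken from the paper.
gap, percent_change = csi_gap(validation_csi=0.50, test_csi=0.40)
```

With these placeholder inputs the gap is about -0.10 and the percent change about -20%, which is the kind of per-region comparison the table reports.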

To understand whether the drop comes from confusing reduced and normal visibility, the paper narrows the evaluation to Classes 1 and 2. This keeps the argument in the same order as the published Section 4.4: overall test degradation first, then the class-pair behavior that motivates the distribution-shift analysis.

Table 9 from the accepted paper comparing Class 1 and Class 2 validation and test CSI across regions.
Table 9. When Classes 1 and 2 are evaluated as a two-class problem, Seoul, Daegu, and Daejeon again show the sharpest CSI declines, supporting the paper's focus on class-boundary instability.

The next step is to connect those performance changes to the feature space. The paper uses SHAP to identify influential variables and then measures how the distribution of relative humidity shifts between the training and test periods; the KDE figure below first shows the paper's synthetic-data fidelity check before the direct train-test shift evidence in Table 11.

KDE plots comparing original and augmented RH and PM2.5 distributions for Incheon fold 1.
Distribution diagnostics. This KDE plot evaluates synthetic-data fidelity for Incheon Fold 1 by comparing original and augmented distributions for relative humidity and PM2.5. The direct train-test distribution-shift evidence is then summarized by the RH Wasserstein analysis in Table 11.

Table 11 from the accepted paper comparing RH Wasserstein Dbase, Dshift, and percent change by region.
Table 11. The RH Wasserstein analysis quantifies the paper's distribution-shift explanation by comparing Dbase and Dshift. In severe-degradation regions such as Seoul, Daegu, and Daejeon, Dshift becomes smaller than Dbase, indicating that the Class 2 test distribution moved closer to the Class 1 training distribution.

This supports the paper's interpretation that the learned Class 1–Class 2 boundary became less reliable when applied to the 2021 test distribution.
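The Dbase versus Dshift comparison can be illustrated with a one-dimensional Wasserstein distance between empirical RH samples. The definitions used below are an assumption inferred from the Table 11 description, not confirmed by the paper: Dbase is taken as the distance between Class 1 training RH and Class 2 training RH, and Dshift as the distance between Class 1 training RH and Class 2 test RH. The function `wasserstein_1d` is a small pure-Python stand-in (libraries such as SciPy provide an equivalent), and all RH values are hypothetical.

```python
from bisect import bisect_right


def wasserstein_1d(u, v):
    """Wasserstein-1 distance between two empirical 1-D samples.

    Integrates |CDF_u(x) - CDF_v(x)| over the pooled, sorted sample values.
    """
    u_sorted, v_sorted = sorted(u), sorted(v)
    points = sorted(u_sorted + v_sorted)
    total = 0.0
    for left, right in zip(points, points[1:]):
        cdf_u = bisect_right(u_sorted, left) / len(u_sorted)
        cdf_v = bisect_right(v_sorted, left) / len(v_sorted)
        total += abs(cdf_u - cdf_v) * (right - left)
    return total


# Hypothetical relative-humidity samples in percent (not data from the paper).
rh_class1_train = [80, 85, 90, 95]
rh_class2_train = [40, 50, 60, 70]
rh_class2_test = [60, 70, 80, 90]

d_base = wasserstein_1d(rh_class1_train, rh_class2_train)   # 32.5
d_shift = wasserstein_1d(rh_class1_train, rh_class2_test)   # 12.5
percent_change = 100.0 * (d_shift - d_base) / d_base
```

In this toy setup d_shift is smaller than d_base, meaning the Class 2 test humidity sits closer to the Class 1 training humidity than the Class 2 training humidity did. That is the pattern the paper associates with the regions showing severe degradation.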

BibTeX

@Article{Shin2026,
author="Shin, Bong Gyun
and Lee, Chan Sik
and Suh, Hyesun",
title="Visibility nowcasting in South Korea: a machine learning approach to class imbalance and distribution shift",
journal="Theoretical and Applied Climatology",
year="2026",
month="Apr",
day="10",
volume="157",
number="5",
pages="283",
issn="1434-4483",
doi="10.1007/s00704-026-06219-6",
url="https://doi.org/10.1007/s00704-026-06219-6"
}