Few-shot Cross-country Generalization of Tabular Machine Learning and Foundation Models for Childhood Anemia Prediction under Distribution Shift
Published in arXiv/preprint, 2026
Childhood anemia affects roughly 40% of children aged 6–59 months worldwide and is driven by heterogeneous, context-dependent factors that make model generalization difficult. We evaluate a transformer-based tabular foundation model (TabPFN v2.6) against classical methods (Logistic Regression, XGBoost, LightGBM) using Demographic and Health Surveys (DHS) data from 16 countries (n=68,856) under cross-country and data-scarce settings. Across leave-one-country-out (LOCO), reverse-LOCO, and few-shot evaluations, TabPFN performs best in low-data regimes, with improved calibration (Brier score: 0.042; expected calibration error, ECE: 0.203) and strong discrimination. Although performance differences on the full data are modest, the results indicate that population heterogeneity dominates predictive performance and that foundation models offer advantages for robust, low-resource global health prediction.
Recommended citation: Brima Y, Atemkeng M, Kallon LH, Niyukuri D, Vacavant A, Saidu S, Chen DG. Few-shot Cross-country Generalization of Tabular Machine Learning and Foundation Models for Childhood Anemia Prediction under Distribution Shift. arXiv preprint arXiv:2605.26589. 2026 May 26. https://arxiv.org/abs/2605.26589
