Serialization-Induced Prediction Drift

Khudran M. Alzhrani

doi:10.48084/etasr.18259

Authors

Khudran M. Alzhrani, Computers Department, College of Engineering and Computing in Al-Qunfudhah, Umm Al-Qura University, Al-Qunfudhah, Mecca, Saudi Arabia. ORCID: https://orcid.org/0000-0003-2212-0233

Volume: 16 | Issue: 3 | Pages: 35359-35365 | June 2026 | https://doi.org/10.48084/etasr.18259

Received: 18 February 2026 | Revised: 6 March 2026, 22 March 2026, and 2 April 2026 | Accepted: 10 April 2026 | Online: 6 June 2026

Corresponding author: Khudran M. Alzhrani

Abstract

Tabular Machine Learning (ML) workflows often export and reload numeric features through formats such as CSV and Parquet, sometimes rounding values or casting between floating-point precisions (e.g., float64 to float32). Although commonly treated as engineering details, these steps can introduce systematic numerical perturbations that propagate into model predictions. This study presents a methodology to quantify how routine data-representation changes affect prediction drift and performance. Starting from a float64 Parquet baseline, four variants are generated: CSV round trips with 6, 3, and 1 decimal places, and a float32 Parquet file. Fixed train-validation-test splits are reused across treatments, and two scenarios are evaluated: train-on-variant and evaluation-only (baseline-trained, perturbed-test). Value-level drift, prediction drift (score drift, rank correlation, and classification churn), and performance deltas are measured, and the results are aggregated across three random seeds with bootstrap confidence intervals and Wilcoxon signed-rank tests. Experiments on the Breast Cancer Wisconsin (Diagnostic) classification dataset and the Diabetes and California Housing regression datasets, using multiple model families, show that mild perturbations (CSV with 6 or 3 decimal places, and float32) generally yield negligible drift and no meaningful performance change. Rounding to 1 decimal place, by contrast, triggers a sharp onset of instability, including threshold-crossing effects in classification and marked drift amplification in the most sensitive regression settings. Sensitivity to aggressive rounding varies by model family, and an additional analysis of representative linear models shows that 1-decimal rounding perturbs the internal linear score and can also change the coefficient structure learned during retraining.
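The evaluation-only scenario described above can be illustrated with a minimal sketch. This is not the study's code: the synthetic Gaussian features, the fixed linear scorer with hypothetical coefficients, and the helper names (`csv_round_trip`, `cast_float32`, `drift_report`) are all assumptions made for illustration, and an in-memory float32 cast stands in for the float32 Parquet variant. Rank correlation, bootstrap intervals, and the Wilcoxon tests are omitted.

```python
import csv
import io
import math
import random
import struct


def csv_round_trip(rows, decimals):
    """Write rows to CSV with a fixed number of decimals, then parse them back."""
    buf = io.StringIO(newline="")
    writer = csv.writer(buf)
    for row in rows:
        writer.writerow([f"{v:.{decimals}f}" for v in row])
    buf.seek(0)
    return [[float(v) for v in row] for row in csv.reader(buf)]


def cast_float32(rows):
    """Round each value to the nearest float32, then widen back to a Python float."""
    return [[struct.unpack("f", struct.pack("f", v))[0] for v in row] for row in rows]


def scores(rows, weights, bias):
    """Logistic scores from a fixed linear model (hypothetical stand-in for a trained one)."""
    out = []
    for row in rows:
        z = bias + sum(w * v for w, v in zip(weights, row))
        out.append(1.0 / (1.0 + math.exp(-z)))
    return out


def drift_report(base_rows, variant_rows, weights, bias, threshold=0.5):
    """Value-level drift, mean absolute score drift, and classification churn."""
    base = scores(base_rows, weights, bias)
    var = scores(variant_rows, weights, bias)
    value_drift = max(
        abs(a - b)
        for r0, r1 in zip(base_rows, variant_rows)
        for a, b in zip(r0, r1)
    )
    score_drift = sum(abs(a - b) for a, b in zip(base, var)) / len(base)
    # Churn: fraction of rows whose predicted label flips across the threshold.
    churn = sum((a >= threshold) != (b >= threshold) for a, b in zip(base, var)) / len(base)
    return {"max_value_drift": value_drift, "mean_score_drift": score_drift, "churn": churn}


rng = random.Random(0)
baseline = [[rng.gauss(0.0, 1.0) for _ in range(5)] for _ in range(500)]  # synthetic float64 data
weights, bias = [1.5, -2.0, 0.7, 0.3, -1.1], 0.1  # hypothetical coefficients, not from the paper

variants = {
    "csv_6dp": csv_round_trip(baseline, 6),
    "csv_3dp": csv_round_trip(baseline, 3),
    "csv_1dp": csv_round_trip(baseline, 1),
    "float32": cast_float32(baseline),
}
reports = {name: drift_report(baseline, rows, weights, bias) for name, rows in variants.items()}
for name, report in reports.items():
    print(name, report)
```

Because the scorer is held fixed and only the inputs change, any difference in scores or labels is attributable to the representation change alone. The train-on-variant scenario would additionally refit the model on each variant before scoring.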

Keywords:

prediction drift, data serialization, numerical precision, machine learning pipelines, tabular data

