Review of Medical Data Cleaning and Multimodal Fusion Methods for Prediction

Authors

  • Yue Teng, Huaibei Normal University, Tongling City, Anhui Province, China

Keywords:

Medical data cleaning, Multimodal data fusion, K-means clustering, Principal Component Analysis (PCA), Dynamic Time Warping (DTW), Deep Neural Network (DNN), Attention mechanism, Average Length of Stay (ALOS) prediction

Abstract

With the explosive growth of biomedical data, medical data cleaning, fusion, and modeling have become central challenges in intelligent healthcare research. This paper systematically reviews existing studies on medical data preprocessing and multimodal data fusion, focusing on four key techniques: K-means clustering, Principal Component Analysis (PCA), Dynamic Time Warping (DTW), and Deep Neural Networks with attention mechanisms (DNN-Attention). First, from the perspective of data cleaning, it summarizes methods for noise reduction, redundancy elimination, and clustering-guided dimensionality optimization, and discusses their roles in improving data quality. Second, it reviews multimodal medical data fusion techniques, emphasizing DTW-based temporal alignment and spatiotemporal modeling strategies for heterogeneous data. Third, it summarizes and compares recent advances in Average Length of Stay (ALOS) prediction models, covering traditional statistical models, ensemble learning methods, and deep learning frameworks, and highlights the advantages of attention mechanisms in capturing nonlinear relationships among multimodal features. Finally, it identifies key challenges (the lack of standardized data formats, insufficient model interpretability, and limited cross-institutional generalization) and outlines future directions in explainable modeling, privacy-preserving computation, and real-time intelligent analysis. Overall, this review aims to provide a systematic theoretical reference and methodological guidance for research and practice in intelligent healthcare systems, promoting data-driven medical decision-making and resource optimization.

Published

2025-10-31