Data Imputation Methods - Handling Missing Values: Reviewing data imputation methods for handling missing values in datasets to prevent bias and improve predictive performance

Authors

  • Dr. Anke Helsloot, Professor of Human-Computer Interaction, Eindhoven University of Technology, Netherlands

Keywords:

Data Imputation, Missing Values

Abstract

This paper reviews data imputation methods for handling missing values in datasets. Missing data is common in fields such as healthcare, finance, and the social sciences, and it can bias results and reduce predictive performance if it is not handled properly. The paper explains why missing data must be addressed, discusses the types and causes of missingness, and reviews widely used imputation methods. These range from traditional approaches (mean imputation, median imputation, and regression imputation) to more advanced techniques (k-nearest neighbors (KNN) imputation, multiple imputation, and matrix factorization-based imputation). For each method, the paper discusses advantages and limitations, and it offers guidelines for selecting an imputation method based on the characteristics of the dataset and the research objectives. The paper concludes with future research directions in data imputation.
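As a rough illustration of three of the methods named above (mean, median, and KNN imputation), the following minimal Python sketch may help. It is not taken from the paper: the function names `impute_column` and `knn_impute`, the use of `None` to mark a missing value, and the distance rule (root mean squared difference over features observed in both rows) are all assumptions made for this example.

```python
import math
import statistics


def impute_column(values, strategy="mean"):
    """Fill None entries of one numeric column with the mean or median
    of the observed entries (hypothetical helper, not from the paper)."""
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("column has no observed values")
    if strategy == "mean":
        fill = statistics.fmean(observed)
    elif strategy == "median":
        fill = statistics.median(observed)
    else:
        raise ValueError(f"unknown strategy: {strategy}")
    return [fill if v is None else v for v in values]


def knn_impute(rows, k=2):
    """Fill each None with the average of that feature over the k nearest
    rows in which the feature is observed (assumed distance rule below)."""

    def distance(a, b):
        # Compare only features observed in both rows.
        shared = [(x, y) for x, y in zip(a, b)
                  if x is not None and y is not None]
        if not shared:
            return math.inf
        return math.sqrt(sum((x - y) ** 2 for x, y in shared) / len(shared))

    completed = []
    for i, row in enumerate(rows):
        new_row = list(row)
        for j, value in enumerate(row):
            if value is not None:
                continue
            # Candidate donors: other rows that have feature j observed.
            candidates = sorted(
                (distance(row, other), other[j])
                for m, other in enumerate(rows)
                if m != i and other[j] is not None
            )
            neighbours = [v for d, v in candidates[:k] if d != math.inf]
            if neighbours:  # leave the gap if no comparable row exists
                new_row[j] = statistics.fmean(neighbours)
        completed.append(new_row)
    return completed


column = [1.0, None, 3.0, 8.0]
mean_filled = impute_column(column, "mean")
median_filled = impute_column(column, "median")

table = [[1.0, 2.0], [1.1, None], [5.0, 10.0], [0.9, 1.8]]
knn_filled = knn_impute(table, k=2)
```

In this toy data the mean and median fills differ because the value 8.0 pulls the mean upward, while the KNN fill for the second row borrows only from the two rows closest to it on the observed feature. Regression imputation, multiple imputation, and matrix factorization-based imputation are not covered by this sketch.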

Published

15-02-2023

How to Cite

[1]
Dr. Anke Helsloot, “Data Imputation Methods - Handling Missing Values: Reviewing data imputation methods for handling missing values in datasets to prevent bias and improve predictive performance”, Distrib Learn Broad Appl Sci Res, vol. 9, pp. 58–70, Feb. 2023, Accessed: Dec. 22, 2024. [Online]. Available: https://dlabi.org/index.php/journal/article/view/68
