Paper Details
- Authors: Suhasnadh Reddy Veluru, Sai Teja Erukude, and Viswa Chaitanya Marella
- Computer Science and Engineering
- Paper ID: MIJRDV4I40014
- Volume: 04
- Issue: 04
- Pages: 121-130
- ISSN: 2583-0406
- Publication Year: 2025
Abstract
Machine learning model architectures have grown increasingly powerful over the last decade. However, gains have started to plateau when performance is pursued through model architecture alone. This trend has shifted attention to the data itself, a fundamental yet often neglected component of an AI system. Data-centric AI (DCAI) is the practice of systematically improving datasets to improve machine learning performance. In this paper, we examine the DCAI landscape by synthesizing literature published between 2022 and 2025, focusing on data quality, data cleaning, data labeling, data augmentation, and data monitoring. We discuss the methods and tools of DCAI; its successful use in healthcare, computer vision, and privacy-preserving synthetic data; and the significant difficulties it faces, including cost, bias, and evaluation. Lastly, we discuss future directions, including automating the data pipeline and moving toward a more holistic approach to dataset-model co-design. The transition from model-centered to data-centered development is foundational to building better, more reliable, fairer, and more beneficial AI systems.
Keywords
Data-Centric AI, Data Quality, Data Augmentation, Bias Mitigation, Automation in AI Pipelines, Synthetic Data Generation.
Cite this Publication
Suhasnadh Reddy Veluru, Sai Teja Erukude, and Viswa Chaitanya Marella (2025), Data-Centric AI: A Systematic Review of Methods, Challenges, and Future Directions. Multidisciplinary International Journal of Research and Development (MIJRD), Volume: 04 Issue: 04, Pages: 121-130. https://www.mijrd.com/papers/v4/i4/MIJRDV4I40014.pdf