Lung Disease Classification Using the Decision Tree Method
Keywords:
Lung Cancer, Risk Classification, Decision Tree C4.5, Train_Test_Split, Scikit-Learn
Abstract
Lung disease is a health problem that greatly affects quality of life, and its various forms, such as pneumonia, bronchitis, tuberculosis, asthma, and COPD, require special attention. Accurate classification is essential to ensure effective treatment and prevent complications. This research used the C4.5 Decision Tree algorithm to classify lung cancer risk on a dataset of 309 records with 16 attributes covering symptoms and risk factors, including age, shortness of breath, and smoking habits. The train_test_split function from Scikit-learn was used to split the data into 70% for training and 30% for testing. Evaluated on the test data with a confusion matrix, the C4.5 model achieved 89% accuracy, 70% precision, and 74.5% recall, meaning that 83 of the 93 test predictions were correct. This research concludes that the Decision Tree algorithm can support the diagnosis of lung cancer. However, the model's performance could be improved further, and comparing it with other algorithms may yield better results.