COMPARISON OF K-NEAREST NEIGHBOR AND NAÏVE BAYES FOR BREAST CANCER CLASSIFICATION USING PYTHON

Irma Handayani(1), Ikrimach Ikrimach(2)


(1) Department of Informatics, University of Technology Yogyakarta
(2) Department of Information System, University of Technology Yogyakarta
Corresponding Author

Abstract


Classification is widely used to support decisions based on knowledge extracted from past data by an algorithm. The number of attributes can affect an algorithm's performance. K-Nearest Neighbor (k-NN) and Naïve Bayes are among the data mining methods most widely used for classification. An algorithm that performs best on one type of data does not necessarily perform well on another, and may even perform very poorly. This study therefore analyzes the accuracy of the K-Nearest Neighbor and Naïve Bayes algorithms for breast cancer classification, so that, given a patient's existing parameters, a case can be predicted as malignant or benign. The resulting pattern can be used as a diagnostic aid so that cancer is detected earlier, which is expected to reduce the breast cancer mortality rate.
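The comparison described in the abstract can be illustrated with a minimal from-scratch sketch in Python. It is not the authors' implementation: the two-feature samples, the choice of k = 3, the Gaussian form of Naïve Bayes, and all function names (knn_predict, fit_gaussian_nb, nb_predict, accuracy) are assumptions made purely for illustration, and the toy values are not taken from any breast cancer dataset.

```python
import math
from collections import Counter

# Hypothetical toy samples (two made-up numeric features per case).
# These are illustrative values only, NOT real patient data.
TRAIN = [
    ((1.0, 1.2), "benign"),
    ((1.3, 0.9), "benign"),
    ((0.8, 1.1), "benign"),
    ((1.1, 1.4), "benign"),
    ((3.0, 3.2), "malignant"),
    ((3.4, 2.9), "malignant"),
    ((2.8, 3.1), "malignant"),
    ((3.1, 3.5), "malignant"),
]
TEST = [
    ((1.2, 1.0), "benign"),
    ((0.9, 1.3), "benign"),
    ((3.2, 3.0), "malignant"),
    ((2.9, 3.3), "malignant"),
]


def knn_predict(train, x, k=3):
    """Majority vote among the k training samples closest to x (Euclidean)."""
    neighbors = sorted(train, key=lambda sample: math.dist(sample[0], x))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]


def fit_gaussian_nb(train):
    """Estimate a class prior and per-feature (mean, variance) for each label."""
    model = {}
    for label in {lab for _, lab in train}:
        rows = [features for features, lab in train if lab == label]
        prior = len(rows) / len(train)
        stats = []
        for column in zip(*rows):
            mean = sum(column) / len(column)
            # Small constant avoids division by zero for constant features.
            var = sum((v - mean) ** 2 for v in column) / len(column) + 1e-9
            stats.append((mean, var))
        model[label] = (prior, stats)
    return model


def nb_predict(model, x):
    """Pick the label with the highest log posterior under the Gaussian model."""
    def log_posterior(label):
        prior, stats = model[label]
        score = math.log(prior)
        for value, (mean, var) in zip(x, stats):
            score += -0.5 * math.log(2 * math.pi * var)
            score -= (value - mean) ** 2 / (2 * var)
        return score

    return max(model, key=log_posterior)


def accuracy(predict, test):
    """Fraction of test samples whose predicted label matches the true label."""
    hits = sum(predict(x) == label for x, label in test)
    return hits / len(test)


nb_model = fit_gaussian_nb(TRAIN)
knn_acc = accuracy(lambda x: knn_predict(TRAIN, x, k=3), TEST)
nb_acc = accuracy(lambda x: nb_predict(nb_model, x), TEST)
print(f"k-NN accuracy: {knn_acc:.2f}")
print(f"Naive Bayes accuracy: {nb_acc:.2f}")
```

On real data the two accuracies would generally differ, and the comparison would normally also involve feature scaling for k-NN and a held-out or cross-validated split; those steps are omitted here to keep the sketch short.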


Keywords


data mining; classification; k-NN algorithm; Naïve Bayes algorithm; breast cancer



DOI: 10.56327/ijiscs.v5i1.953
