Research Article

Development of a mobile application with machine learning methods for predicting Covid-19 diagnosis from routine blood tests

Year 2021, Volume: 60 Issue: 4, 384 - 393, 22.12.2021

Abstract

Objective: Since December 2019, the whole world has been trying to cope with the SARS-CoV-2 virus. Because the early signs of the disease overlap with other common conditions such as the common cold and influenza, early diagnosis is of great importance for physicians. Using an openly shared, anonymized hospital data set, this study aims to develop a mobile application that predicts the SARS-CoV-2 result (positive/negative) from routine blood test results by means of machine learning algorithms.
Materials and Methods: After the problems of missing observations, class imbalance, outliers, and irrelevant variables in the data set were resolved, the classification performance of machine learning methods was tested, and a logistic regression model for detecting COVID-19 was then built with the appropriate variables. A machine learning-based mobile application was designed using this model.
Results: The variables that gave the best diagnostic results were eosinophils, leukocytes, platelets, monocytes, red blood cells, and basophils. After the data preprocessing problems were resolved, the classification performance of the algorithms rose considerably compared with their performance on the raw data.
Conclusion: With the developed mobile application, a Covid-19 diagnosis can be predicted quickly and easily from routine blood test results.


Development of a mobile application using machine learning methods for the prediction of Covid-19 diagnosis with routine blood tests


Abstract

Objective: The whole world has been struggling with the SARS-CoV-2 virus since December 2019. Because the early symptoms of the disease overlap with those of other common conditions such as the common cold and influenza, early diagnosis is of great importance for physicians. This study aims to develop a mobile application that predicts the SARS-CoV-2 result (positive/negative) from routine blood test results with machine learning algorithms, using anonymized, openly shared hospital data.
Materials and Methods: After the problems of missing observations, class imbalance, outliers, and irrelevant variables in the data set were addressed, the classification performance of machine learning methods was tested, and a logistic regression model for detecting COVID-19 was then built with the appropriate variables. A machine learning-based mobile application was designed using this model.
Results: The variables that gave the best diagnostic results were eosinophils, leukocytes, platelets, monocytes, red blood cells, and basophils. After the data preprocessing problems were addressed, the classification performance of the algorithms increased considerably compared with their performance on the raw data.
Conclusion: With the developed mobile application, a Covid-19 diagnosis can be predicted quickly and easily from routine blood test results.
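
The workflow summarized in the abstract (handle missing values and class imbalance, then fit a logistic regression on the selected blood-test variables) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not state which imputation or balancing techniques were used, so median imputation and inverse-frequency class weights are assumptions chosen for simplicity. The data are synthetic toy values with an arbitrary shift between classes, and all function and variable names are hypothetical.

```python
import math
import random
from statistics import median

# Feature order follows the six variables named in the abstract.
FEATURES = ["eosinophils", "leukocytes", "platelets",
            "monocytes", "red_blood_cells", "basophils"]


def impute_median(rows):
    """Replace None in each column with the median of that column's observed values."""
    cols = list(zip(*rows))
    meds = [median(v for v in col if v is not None) for col in cols]
    filled = [[m if v is None else v for v, m in zip(row, meds)] for row in rows]
    return filled, meds


def sigmoid(z):
    # Numerically safe logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict_proba(w, b, x):
    """Probability of a positive result for one feature vector."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)


def fit_logistic(X, y, lr=0.1, epochs=500):
    """Batch gradient descent; class weights are inversely proportional to class frequency."""
    n, d = len(X), len(X[0])
    n_pos = sum(y)
    weight = {1: n / (2 * n_pos), 0: n / (2 * (n - n_pos))}
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = weight[yi] * (predict_proba(w, b, xi) - yi)
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


# Synthetic toy data (assumption): NOT clinical values, only for illustration.
rng = random.Random(0)
rows, labels = [], []
for i in range(200):
    label = 1 if i % 5 == 0 else 0        # 20% positives, imbalanced on purpose
    center = -1.0 if label else 1.0       # arbitrary shift between the classes
    row = [rng.gauss(center, 1.0) for _ in FEATURES]
    if i % 10 == 3:
        row[i % len(FEATURES)] = None     # simulate a missing test result
    rows.append(row)
    labels.append(label)

X, medians = impute_median(rows)
w, b = fit_logistic(X, labels)
accuracy = sum((predict_proba(w, b, x) >= 0.5) == bool(y)
               for x, y in zip(X, labels)) / len(X)
```

In a mobile application, only the fitted coefficients `w` and `b` would need to be shipped: scoring a new patient reduces to a single call such as `predict_proba(w, b, values)`, which is one practical reason a logistic regression model suits on-device use.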

There are 20 citations in total.

Details

Primary Language Turkish
Subjects Health Care Administration
Journal Section Research Articles
Authors

Mert Demirarslan 0000-0001-8848-7340

Aslı Suner 0000-0002-6872-9901

Publication Date December 22, 2021
Submission Date April 28, 2021
Published in Issue Year 2021 Volume: 60 Issue: 4

Cite

Vancouver Demirarslan M, Suner A. Rutin kan testleriyle Covid-19 tanı tahmininde makine öğrenmesi yöntemleriyle mobil uygulama geliştirilmesi. EJM. 2021;60(4):384-93.