Please use this identifier to cite or link to this item: http://localhost:8080/xmlui/handle/123456789/8668
Title: Evaluation of Corona Virus Mutations Using Deep Learning Algorithm
Authors: Anzi, Mohammed
Dawood, Ali
Keywords: DNA
K-mer
TF-IDF
Coronavirus
Machine Learning
Deep Learning
Convolutional Neural Networks
Feature Extraction
Issue Date: 1-Jan-2022
Publisher: University of Anbar
Abstract: The new coronavirus SARS-CoV-2 (also known as the COVID-19 virus) has spread rapidly worldwide and continues to mutate, and the high transmission rate of this pathogenic virus demands early prediction and accurate identification to support treatment. However, the polymorphic nature of the virus allows it to adapt to and persist in different environments, which makes it difficult to predict. Such large viral outbreaks require the genetic sequence of the virus to be determined early so that an effective system can be designed for identifying known and unknown variants. The genomic sequence carries the vast majority of the genetic information about variations and variants of the coronavirus. The goal of this work is to predict codon mutations likely to occur in sequences using historical DNA sequencing data. This work proposes a deep learning system that predicts mutations in complete coronavirus genomes. It focuses on codon mutations and uses a Convolutional Neural Network (CNN) as an alignment-free method. The k-mer technique is applied to fragment the DNA of coronavirus mutants and build a unique vocabulary. TF-IDF is then used to extract features from the mutant sequences, and the extracted features are fed as inputs to the proposed CNN model and to the three machine learning algorithms used in this work: logistic regression (LR), decision tree (DT), and random forest (RF). The results showed that the proposed CNN model achieved a prediction accuracy of 99%. In contrast, the three machine learning algorithms achieved lower accuracy than the proposed CNN model: 75% for RF, 64% for DT, and 48% for LR. The proposed approach can correctly predict and characterize other coronavirus strains, such as MERS-CoV, SARS-CoV-2, SARS-CoV, Alpha-CoV, Beta, and Delta.
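The feature-extraction step described in the abstract (k-mer fragmentation followed by TF-IDF weighting) can be illustrated with a minimal sketch. This is not the authors' implementation: the record does not state the value of k, whether the k-mers overlap, or which TF-IDF variant was used. The sketch assumes overlapping k-mers with k = 3 (codon-sized) and a smoothed IDF of ln((1 + n) / (1 + df)) + 1. The function names `kmers` and `tfidf` and the two toy sequences are hypothetical and are not real viral data.

```python
import math
from collections import Counter


def kmers(seq, k=3):
    """Split a DNA string into overlapping k-mers (sliding window, step 1)."""
    seq = seq.upper()
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def tfidf(sequences, k=3):
    """Return (vocabulary, rows): one TF-IDF weight vector per sequence."""
    # Each sequence becomes a bag of k-mer counts.
    docs = [Counter(kmers(s, k)) for s in sequences]
    # The vocabulary is the sorted set of all k-mers seen in any sequence.
    vocab = sorted(set().union(*docs))
    n = len(docs)
    # Document frequency: number of sequences that contain each k-mer.
    df = {t: sum(1 for d in docs if t in d) for t in vocab}
    # Smoothed IDF (an assumption, not taken from the source).
    idf = {t: math.log((1 + n) / (1 + df[t])) + 1 for t in vocab}
    rows = []
    for d in docs:
        total = sum(d.values())
        # Term frequency is the k-mer count divided by the k-mers in the sequence.
        rows.append([d[t] / total * idf[t] if total else 0.0 for t in vocab])
    return vocab, rows


# Toy sequences for illustration only.
toy_sequences = ["ATGCGT", "ATGCGA"]
vocab, rows = tfidf(toy_sequences, k=3)
```

Each row of `rows` is a fixed-length numeric vector aligned with `vocab`, which is the kind of input a classifier such as a CNN, logistic regression, decision tree, or random forest could consume. K-mers shared by every sequence receive the lowest IDF weight, while k-mers unique to one sequence are weighted more heavily.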
URI: http://localhost:8080/xmlui/handle/123456789/8668
Appears in Collections: Department of Computer Science

Files in This Item:
File: Mohammed kareem.pdf
Size: 4.58 MB
Format: Adobe PDF


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.