Analysis of Genomic Sequences for Prediction of Cancerous Cells using Wavelet Technique


International Research Journal of Engineering and Technology (IRJET)

e-ISSN: 2395-0056

Volume: 04 Issue: 04 | Apr-2017

p-ISSN: 2395-0072

www.irjet.net

T. Thillai Gayathri
PG Scholar, Dept. of ECE, PSG College of Technology, Coimbatore, Tamilnadu, India.

Abstract - Researchers today face many challenges in analyzing vast DNA sequences. Signal processing concepts, especially wavelet transform techniques, play an important role in the analysis of such enormous amounts of data. Genetic abnormality is one of the main causes of deadly diseases such as cancer: abnormal changes in genes (mutations of DNA sequences) can cause cancer. The main aim of this paper is to identify the major changes that occur in normal sequences due to mutation. The EIIP representation technique is used, which has the major advantage of reducing computational overhead compared with other representation techniques. MATLAB R2014a, which supports the Bioinformatics Toolbox, is used for the analysis. The DNA sequences were collected from the NCBI website.

Key Words: Deoxyribonucleic Acid (DNA), Digital Signal Processing (DSP), Discrete Wavelet Transform (DWT), Genomic Signal Processing (GSP), Electron Ion Interaction Potential (EIIP).

1. INTRODUCTION

Genomic Signal Processing (GSP) [1] is an emerging technology in research on deadly diseases such as cancer. GSP is the analysis of genes or genomic data by applying Digital Signal Processing techniques. Bioinformatics [2] is an interdisciplinary field that combines biology with computer science. It describes disease phenomena at the molecular (gene) level, improves the accuracy of results, and reduces the time and cost of drug discovery.

Cancer, a malignant neoplasm in medical terms, is a deadly disease caused by mutations (alterations) in DNA sequences [3]. The mortality rate due to cancer has been increasing. During cell division, cells divide to make new ones as the human body grows. Cancer can also be caused by environmental agents such as exposure to chemicals or radiation. According to scientific discoveries [4], DNA plays a major role in the study of cancer.

Genomic data such as DNA, which is basically in character form (A, T, G, C), is converted into digital form, known as a genomic signal. To analyze DNA sequences, it is mandatory to convert the biological sequences into numerical sequences. Many mapping techniques [5] are available for this conversion, among them the Integer, Voss, Tetrahedron, and Complex representations. In [6], Abo-Zahhad explains the various conversion techniques clearly, with their merits and demerits. The Voss representation [7] is a numerical representation that converts one DNA sequence into four binary sequences. Because it produces four sequences, it increases computational complexity. To overcome this drawback, the EIIP representation technique was introduced. It reduces computational overhead by 75% [8] by assigning the corresponding EIIP value to each nucleotide (A, T, G, C), which yields a single numerical sequence instead of four. It exhibits the periodicity property and also improves the capability to discriminate between genes.

An important step in the analysis of genomic sequences is identifying the protein coding regions [9]. In [10], Anastassiou demonstrates the identification of protein coding and non-coding regions by applying DSP techniques to the numerical values of DNA strings. In [11], Jianchang Ning explains the applications of wavelets in analyzing biological problems. The main conclusion of that paper is that the wavelet technique is the best way of analyzing biological sequences. Wavelets provide a multiscale representation of signals.

The paper is organized as follows. Section I is the introduction. Section II presents the molecular biology concepts and signal processing techniques. Section III introduces the proposed methodology. Section IV describes the algorithm of the proposed method. Section V reviews the results and simulation obtained with that algorithm, which clearly show the discrimination between normal and mutated genes. Finally, Section VI concludes the paper.
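The Voss and EIIP mappings and the multiscale idea mentioned in the introduction can be illustrated with a small sketch. It is written in Python rather than the MATLAB environment the paper uses, and it is not the paper's implementation. The EIIP values are the ones commonly cited in the literature and are not listed in this section; the function names, the toy sequence, and the choice of a single-level Haar wavelet are all assumptions made for illustration.

```python
import math

# Commonly cited EIIP values per nucleotide (assumption: not given in this section).
EIIP = {"A": 0.1260, "C": 0.1340, "G": 0.0806, "T": 0.1335}


def voss_indicators(seq):
    """Voss mapping: one binary indicator sequence per nucleotide (four in total)."""
    seq = seq.upper()
    return {base: [1 if s == base else 0 for s in seq] for base in "ACGT"}


def eiip_signal(seq):
    """EIIP mapping: a single real-valued sequence, one value per nucleotide."""
    return [EIIP[s] for s in seq.upper()]


def haar_dwt(signal):
    """One level of the Haar DWT, returning (approximation, detail) coefficients.

    A trailing unpaired sample is dropped; this is a simplification for the sketch.
    """
    r = math.sqrt(2.0)
    approx, detail = [], []
    for a, b in zip(signal[0::2], signal[1::2]):
        approx.append((a + b) / r)  # low-pass: local average (coarse scale)
        detail.append((a - b) / r)  # high-pass: local difference (fine scale)
    return approx, detail


if __name__ == "__main__":
    dna = "ATGCGTAC"  # hypothetical toy sequence, not real NCBI data
    voss = voss_indicators(dna)
    eiip = eiip_signal(dna)

    # Voss stores four sequences of length N, EIIP stores one.
    voss_samples = sum(len(v) for v in voss.values())
    print("Voss samples:", voss_samples, "EIIP samples:", len(eiip))
    print("Reduction in samples:", 1 - len(eiip) / voss_samples)

    approx, detail = haar_dwt(eiip)
    print("Approximation:", approx)
    print("Detail:", detail)
```

Counting samples this way shows where a 75% figure can come from: any transform that would run on four indicator sequences runs on one sequence instead. In a comparison of normal and mutated sequences, the same mapping and transform would be applied to both before comparing coefficients.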


