
Classification And Prediction Of Epitopes Using Machine Learning Algorithms


International Research Journal of Engineering and Technology (IRJET)

e-ISSN: 2395-0056

Volume: 11 Issue: 05 | May 2024

p-ISSN: 2395-0072

www.irjet.net

Pradeep Kumar H S1, Amrutha C Bhat2, Anagha A S3, Isha N4, Khushi M S5

1,2,3,4,5 The National Institute of Engineering, Mysuru-570008, Karnataka, India

Abstract - Epitope classification stands as a cornerstone in vaccine development. The accurate classification of epitopes and non-epitopes significantly influences vaccine effectiveness. This project employs feature engineering techniques to raise the quality of the input data. Several machine learning algorithms, namely Support Vector Machine, k-Nearest Neighbours, Logistic Regression, Random Forest and XGBoost, alongside deep learning algorithms such as the Convolutional Neural Network, are evaluated and compared to identify the most effective methods for precise epitope prediction. The exploration further extends to transfer learning, which leverages pre-existing knowledge to enhance epitope classification performance. The primary objective is to assess the precision of epitope classification, a critical aspect of immunology and vaccine development. Clearly separating epitopes from non-epitopes is crucial in designing vaccines, because it ensures that the vaccine prompts the right immune reactions while minimising unwanted responses. Through feature engineering and systematic algorithm evaluation, the project strives to optimise the accuracy of epitope classification.

Understanding linear BCEs is crucial for pinpointing the exact regions that trigger the immune response [10]. This knowledge is instrumental in designing targeted vaccines that stimulate a robust immune response against specific pathogens [10]. Furthermore, BCE classification can offer insights into autoimmune diseases and allergies, aiding in the development of therapeutic interventions [10].

In silico prediction of linear B-cell epitopes (BCEs) has evolved from rudimentary sequence-based methods to more advanced Machine Learning (ML) techniques, yet substantial challenges remain [11]. Earlier models focused primarily on compositional properties and physicochemical characteristics of proteins, including antigenicity, torsion, and surface accessibility [12]. Although some of them, such as PREDITOP [13], BcePred [14], BEPITOPE [15], and PEOPLE [16], reported seemingly high performance, later studies revealed that their predictive accuracy had been overestimated [11, 17]. The growing availability of proteomic data has led to the application of various ML techniques to BCE prediction, aiming to address the limitations of prior methods. BepiPred 2.0, for instance, achieved an Area Under the Curve (AUC) of 0.671 through a Hidden Markov Model (HMM) incorporating secondary structure and hydrophilicity propensity scales [18]. ABCpred, introduced in 2006, uses recurrent neural networks (RNNs) and achieves an accuracy (ACC) of 0.66 and a Matthews Correlation Coefficient (MCC) of 0.319 with a sliding window size of 16 [19]. However, it relies on a combination of biochemical and physicochemical features that may not fully capture the complexities of BCEs.
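The figures quoted above (ACC, MCC, AUC) can be made concrete with a small worked example. The following is a minimal sketch in plain Python, not the evaluation code of any cited tool; the function names and the toy labels and scores are assumptions introduced purely for illustration. Labels use 1 for epitope and 0 for non-epitope, and AUC is computed as the probability that a randomly chosen epitope receives a higher score than a randomly chosen non-epitope.

```python
import math


def confusion_counts(y_true, y_pred):
    """Return (tp, tn, fp, fn) for binary labels where 1 = epitope."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, tn, fp, fn


def accuracy(y_true, y_pred):
    """Fraction of peptides whose predicted label matches the true label."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    return (tp + tn) / (tp + tn + fp + fn)


def mcc(y_true, y_pred):
    """Matthews Correlation Coefficient; 0.0 when the denominator vanishes."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


def roc_auc(y_true, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical toy data, invented for illustration only.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.1, 0.7, 0.2]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

print(accuracy(y_true, y_pred))  # 0.75
print(mcc(y_true, y_pred))       # 0.5
print(roc_auc(y_true, scores))   # 0.9375
```

On this toy data one epitope is missed and one non-epitope is flagged, so accuracy is 0.75 while MCC is only 0.5. MCC uses all four cells of the confusion matrix, which is one reason it is often reported alongside accuracy when epitope and non-epitope classes are imbalanced.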

Keywords - Machine Learning; Feature Engineering; B-cell Epitopes; Immunology; Vaccine Development

1. INTRODUCTION

Immunoinformatics is a swiftly advancing field, pivotal to vaccine development and to the study of immune responses. At its heart are B-cell epitopes (BCEs): the precise regions of antigens (the proteins presented by pathogens) that prompt the creation of antibodies, the immune system's defenders [9]. Understanding BCEs is paramount for designing effective vaccines, antibodies, and immunotherapies [9]. Traditional experimental methods for identifying BCEs are laborious and time-consuming [10]. This research addresses this gap by proposing a novel approach to BCE classification using machine learning (ML) algorithms.

Support Vector Machines (SVMs) have also been applied. SVMTriP, for example, integrates tri-peptide composition and propensity scales to predict linear antigenic B-cell epitopes, achieving a precision (Prec) of 55.20% and an AUC of 0.702 [20]. LBtope uses a wider range of primary sequence-based features and achieves accuracies ranging from 58.39% to 66.7% and AUC values ranging from 0.60 to 0.73 [21]. Deep learning approaches have recently shown significant promise. NetBCE, a deep learning framework, outperforms conventional methods by a substantial margin, achieving an AUC of 0.8400; its authors attribute this success to feature analysis, encoding, and a ten-layer architecture with CNN, BLSTM, and attention mechanisms [22]. SEMA, another recent development, focuses on antigen B-cell conformational epitope prediction using deep transfer learning. It achieves an ROC AUC of 0.76, demonstrating

BCEs can be classified into two main types: linear, where the antibody binds a continuous stretch of the amino acid chain, and conformational, where the binding site is formed by amino acids that are discontinuous in the sequence but brought together when the protein chain folds [10].
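Because a linear epitope is a contiguous stretch of residues, candidate peptides can be enumerated directly from the antigen sequence. The sketch below illustrates two ideas mentioned in the literature discussed in this paper: a fixed-size sliding window (16 residues, the window size reported for ABCpred) and tri-peptide composition (the kind of feature SVMTriP integrates). It is a simplified illustration under assumptions, not a reimplementation of either tool: the function names, the toy antigen sequence, and the residue indices are all hypothetical.

```python
from collections import Counter


def sliding_windows(sequence, size=16):
    """Yield (start, peptide) for every contiguous window of `size` residues.

    Yields nothing when the sequence is shorter than the window.
    """
    for start in range(len(sequence) - size + 1):
        yield start, sequence[start:start + size]


def tripeptide_composition(peptide):
    """Relative frequency of each overlapping tri-peptide in `peptide`."""
    total = len(peptide) - 2  # number of overlapping tri-peptides
    if total <= 0:
        return {}
    counts = Counter(peptide[i:i + 3] for i in range(total))
    return {tri: n / total for tri, n in counts.items()}


# Hypothetical toy antigen sequence (22 residues), invented for illustration.
antigen = "MKTAYIAKQRQISFVKSHFSRQ"

# Linear candidates: each window is one contiguous stretch of the chain.
windows = list(sliding_windows(antigen, size=16))
print(len(windows))   # 7 windows of length 16
print(windows[0])     # (0, 'MKTAYIAKQRQISFVK')

# A conformational epitope cannot be written as a single slice; it is better
# described by a set of residue positions (indices here are made up).
conformational_positions = [3, 4, 17, 20]
print("".join(antigen[i] for i in conformational_positions))  # AYHR

# Feature vector for the first window: tri-peptide frequencies summing to 1.
features = tripeptide_composition(windows[0][1])
print(len(features))
```

Each window would then be paired with a label (epitope or non-epitope) and its feature vector passed to a classifier. The contrast with the index-based representation shows why sequence-window methods suit linear epitopes but not conformational ones.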

© 2024, IRJET | Impact Factor value: 8.226 | ISO 9001:2008 Certified Journal | Page 837

