Predicting ATP binding sites for protein sequences

Predicting where ATP binds to proteins is important in biology and medicine. Traditionally, research in this field relied on time- and resource-intensive ‘wet experiments’ conducted in laboratories. In recent years, however, there has been a shift towards computational methods, in particular Deep Learning and Natural Language Processing (NLP) algorithms.

This project aims to improve existing algorithms for classifying ATP-protein binding sites. Our approach is a series of experiments that use Position-Specific Scoring Matrices (PSSMs) and word embeddings as the main features. The primary models in our study are 2D Convolutional Neural Networks (CNNs), a Deep Learning method, and LightGBM classifiers, which are gradient-boosted decision trees. Our experiments showed incremental improvements over the published benchmark results in the field.
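As a rough illustration of how a PSSM can be turned into per-residue inputs for both kinds of model, the sketch below cuts a sliding window around each residue. It is a minimal example under stated assumptions, not the project's actual pipeline: the function name `pssm_windows`, the window size of 17, the zero padding at the sequence ends, and the random toy matrix are all hypothetical choices made for illustration.

```python
import numpy as np


def pssm_windows(pssm: np.ndarray, window: int = 17) -> np.ndarray:
    """Return one (window, n_columns) slice of the PSSM centred on each residue.

    `pssm` is assumed to have shape (sequence_length, 20): one row per residue,
    one column per amino acid. The window size is an illustrative assumption.
    """
    if window % 2 == 0:
        raise ValueError("window must be odd so that it has a central residue")
    half = window // 2
    # Zero-pad both ends so that terminal residues also get a full-size window.
    padded = np.pad(pssm, ((half, half), (0, 0)), mode="constant")
    return np.stack([padded[i:i + window] for i in range(pssm.shape[0])])


# Toy stand-in for a real PSSM: 30 residues x 20 amino-acid columns.
rng = np.random.default_rng(0)
toy_pssm = rng.integers(-5, 6, size=(30, 20)).astype(np.float32)

windows = pssm_windows(toy_pssm, window=17)      # shape (30, 17, 20)
cnn_input = windows[..., np.newaxis]             # shape (30, 17, 20, 1), channels-last
flat_input = windows.reshape(len(windows), -1)   # shape (30, 340), one row per residue
```

Each window is a small 2D "image" that a 2D CNN could classify as binding or non-binding for its central residue, while the flattened rows are the kind of tabular input a gradient-boosting classifier such as LightGBM expects.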