SOMPNN: An Efficient Non-Parametric Model for Predicting Transmembrane Helices

Introduction

Accurately predicting the transmembrane helices (TMH) in a helical membrane protein is an important but challenging task. Recent research has demonstrated that statistics-based methods are a promising route to improving TMH prediction accuracy. However, most existing TMH predictors are parametric models: they must make assumptions about several, or even hundreds of, adjustable parameters based on the underlying probability distribution, which is difficult when no a priori knowledge is available. The performance of these parametric predictors depends significantly on the estimated parameters. In addition, some of them need to use the entire training dataset in the prediction stage, which leads to low prediction efficiency, a problem that becomes worse with large-scale datasets. In this paper, we propose a novel model, SOMPNN, for TMH prediction that requires minimal parameter assumptions and offers high computational efficiency. In the SOMPNN model, a self-organizing map (SOM) is used to adaptively learn the helix distribution knowledge hidden in the training data, and a probabilistic neural network (PNN) is then adopted to predict TMH segments based on the knowledge learned by the SOM. Experimental results on two benchmark datasets show that the proposed SOMPNN outperforms existing popular TMH predictors and is a promising candidate for extension to other complicated biological problems.
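The two-stage idea described above (a SOM condenses the training data into a small set of prototypes, and a PNN then classifies using only those prototypes) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names train_som and pnn_predict, the 3x3 grid, the learning-rate and radius schedules, the kernel width sigma, and the toy 2-D data are all assumptions chosen for illustration. The features and settings actually used in the paper are not given on this page.

```python
import numpy as np

def train_som(data, grid=(3, 3), epochs=20, lr0=0.5, radius0=1.5, seed=0):
    """Train a small SOM; return its codebook vectors (one per grid node).

    All hyperparameters here are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    rows, cols = grid
    # Grid coordinates of every node, used by the neighbourhood function.
    coords = np.array([(r, c) for r in range(rows) for c in range(cols)], dtype=float)
    # Initialise codebook vectors from randomly chosen training samples.
    weights = data[rng.integers(0, len(data), rows * cols)].astype(float)
    total_steps = epochs * len(data)
    step = 0
    for _ in range(epochs):
        for x in data[rng.permutation(len(data))]:
            frac = step / total_steps
            lr = lr0 * (1.0 - frac)                  # linearly decaying learning rate
            radius = radius0 * (1.0 - frac) + 1e-3   # shrinking neighbourhood radius
            # Best-matching unit: the node whose codebook vector is closest to x.
            bmu = np.argmin(np.sum((weights - x) ** 2, axis=1))
            # Gaussian neighbourhood on the grid, centred on the BMU.
            grid_dist2 = np.sum((coords - coords[bmu]) ** 2, axis=1)
            h = np.exp(-grid_dist2 / (2.0 * radius ** 2))
            # Pull each node towards x, weighted by its neighbourhood value.
            weights += lr * h[:, None] * (x - weights)
            step += 1
    return weights

def pnn_predict(x, prototypes_by_class, sigma=0.5):
    """PNN-style decision: mean Gaussian kernel response per class, take the largest."""
    scores = {}
    for label, protos in prototypes_by_class.items():
        d2 = np.sum((protos - x) ** 2, axis=1)
        scores[label] = np.mean(np.exp(-d2 / (2.0 * sigma ** 2)))
    return max(scores, key=scores.get)

# Toy stand-in for residue feature vectors (hypothetical 2-D data, two classes).
rng = np.random.default_rng(1)
tmh = rng.normal([2.0, 2.0], 0.5, size=(40, 2))
non_tmh = rng.normal([-2.0, -2.0], 0.5, size=(40, 2))

# One SOM per class; the PNN pattern layer holds 9 prototypes per class
# instead of all 40 training samples.
prototypes = {"TMH": train_som(tmh), "non-TMH": train_som(non_tmh)}
print(pnn_predict(np.array([1.8, 2.1]), prototypes))
```

In this sketch the prediction cost depends on the number of SOM nodes rather than on the size of the training set, which is the efficiency argument made in the paragraph above.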

Code and Dataset

Click here to download the datasets and source code of SOMPNN.

Link

MemBrain: Improving the Accuracy of Predicting Transmembrane Helices.

Reference

Dong-Jun Yu, Hong-Bin Shen, Jing-Yu Yang, SOMPNN: An Efficient Non-Parametric Model for Predicting Transmembrane Helices, Amino Acids, 2012, 42: 2195-2205.