SOMRuler: A Novel Interpretable Transmembrane Helices Predictor
Introduction
Transmembrane helix (TMH) identification is one of the most important steps in membrane protein structure prediction.
Existing TMH predictors tend to pursue accurate computational models without carefully considering the interpretability of those models,
and thus act as black boxes. This paper presents SOMRuler, a novel TMH predictor that offers strong interpretability while retaining high
prediction accuracy. SOMRuler uses a self-organizing map (SOM) to learn helix distribution knowledge from the training samples;
this knowledge is encoded in the codebook vectors of the trained SOM. Human-interpretable fuzzy rules are then
extracted from those codebook vectors. Extracting fuzzy rules from the learned knowledge rather than from the original
training samples has two benefits: the computational burden of rule extraction is greatly reduced, and the
reliability of the extracted rules is enhanced, because noise in the original samples is smoothed out by the SOM learning procedure.
The validity of the fuzzy rules extracted by SOMRuler is analyzed both qualitatively and quantitatively. Experimental results on the benchmark
dataset show that SOMRuler outperforms most existing popular TMH predictors and is flexible enough to suit a wide variety of problems in bioinformatics.
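The two-stage idea described above (train a SOM, then read fuzzy rules off its codebook vectors) can be illustrated with a minimal sketch. This is not the SOMRuler implementation: the feature encoding, SOM settings, and rule-extraction details of the paper are not given on this page, so the sketch assumes generic real-valued feature vectors, a small rectangular grid, and Gaussian membership functions of a fixed width centered on each codebook vector. All function names and parameters (`train_som`, `label_units`, `describe_rule`, `predict`, `width`) are hypothetical.

```python
import numpy as np

def train_som(samples, grid=(3, 3), epochs=20, lr0=0.5, sigma0=2.0, seed=0):
    """Train a small rectangular SOM; return codebook vectors of shape (rows*cols, dim)."""
    rng = np.random.default_rng(seed)
    rows, cols = grid
    # Grid coordinates of each unit, used by the neighborhood function.
    coords = np.array([(r, c) for r in range(rows) for c in range(cols)], dtype=float)
    # Initialize codebook vectors from randomly chosen training samples.
    codebook = samples[rng.integers(0, len(samples), rows * cols)].astype(float)
    total, step = epochs * len(samples), 0
    for _ in range(epochs):
        for x in samples[rng.permutation(len(samples))]:
            frac = step / total
            lr = lr0 * (1.0 - frac)                  # linearly decaying learning rate
            sigma = sigma0 * (1.0 - frac) + 1e-3     # shrinking neighborhood radius
            bmu = np.argmin(((codebook - x) ** 2).sum(axis=1))  # best-matching unit
            d2 = ((coords - coords[bmu]) ** 2).sum(axis=1)
            h = np.exp(-d2 / (2.0 * sigma ** 2))     # Gaussian neighborhood on the grid
            codebook += lr * h[:, None] * (x - codebook)
            step += 1
    return codebook

def label_units(codebook, samples, labels):
    """Give each unit the majority label of the samples mapped to it (-1 if none)."""
    dists = ((samples[:, None, :] - codebook[None]) ** 2).sum(axis=2)
    bmus = np.argmin(dists, axis=1)
    unit_labels = np.full(len(codebook), -1)
    for u in range(len(codebook)):
        hits = labels[bmus == u]
        if len(hits):
            unit_labels[u] = np.bincount(hits).argmax()
    return unit_labels

def describe_rule(center, label):
    """Render one codebook vector as a human-readable fuzzy rule."""
    conditions = " AND ".join(f"x{i + 1} is about {v:.2f}" for i, v in enumerate(center))
    return f"IF {conditions} THEN class {label}"

def predict(x, codebook, unit_labels, width=0.5):
    """Fire every rule with Gaussian memberships (assumed shape) and return the strongest rule's class."""
    strengths = np.exp(-((x - codebook) ** 2).sum(axis=1) / (2.0 * width ** 2))
    strengths[unit_labels < 0] = 0.0                 # ignore units that received no samples
    return unit_labels[np.argmax(strengths)]

# Toy data: two well-separated clusters stand in for "helix" and "non-helix" features.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0.0, 0.3, (50, 2)), rng.normal(3.0, 0.3, (50, 2))])
y = np.array([0] * 50 + [1] * 50)

codebook = train_som(X)
unit_labels = label_units(codebook, X, y)
for center, label in zip(codebook, unit_labels):
    if label >= 0:
        print(describe_rule(center, label))
```

Because the rules are built from the nine codebook vectors rather than from the 100 raw samples, there are far fewer rules to extract and inspect, and each rule center is an average over nearby samples rather than a single, possibly noisy, observation.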
SOMRuler software package
Figure 1 shows the flowchart of SOMRuler. The whole software package, the benchmark dataset,
and the mined fuzzy rules of SOMRuler are available for download.

Figure 1. Flowchart of SOMRuler.
Mined Fuzzy Rules
Figure 2 shows four fuzzy rules mined by SOMRuler for predicting TMHs; more results are available online.

Figure 2. Mined fuzzy rules by SOMRuler for TMH prediction.
Link
MemBrain: Improving the Accuracy of Predicting Transmembrane Helices.
Reference
Dong-Jun Yu, Hong-Bin Shen, Jing-Yu Yang, SOMRuler: A Novel Interpretable Transmembrane Helices Predictor, IEEE Transactions on NanoBioscience, 2011, 10: 121-129.