GraphLoc: a graph neural network model for predicting protein subcellular localization from immunohistochemistry images


Introduction


Recognizing protein subcellular distribution patterns and identifying location biomarker proteins in cancer tissues are important for understanding protein functions and related diseases. Immunohistochemistry (IHC) images visualize the distribution of proteins at the tissue level, providing an important resource for protein localization studies. Over the past decades, several image-based methods for predicting protein subcellular location have been developed, but their prediction accuracy still leaves considerable room for improvement. The difficulty stems from the complexity of protein patterns, which arises from multi-label proteins and from variation of location patterns across cell types or states. Here, we propose GraphLoc, a multi-label multi-instance model based on deep graph convolutional neural networks, to recognize protein subcellular location patterns. GraphLoc builds a graph from the multiple IHC images of one protein, learns protein-level representations by graph convolutions, and predicts multi-label information with a dynamic threshold method. Our results show that GraphLoc is a promising and interpretable model for image-based protein subcellular location prediction. We applied GraphLoc to detect candidate location biomarkers and to predict potential new members of protein networks; some of the mined results are supported by the literature, and the others provide an informative candidate set for further experimental screening. These results suggest the effectiveness and utility of our method.

Fig. 1. Flowchart of the proposed GraphLoc model.
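
The three steps named above (building a graph over one protein's images, graph convolution, and dynamic thresholding) can be illustrated with a minimal pure-Python sketch. It is not the released GraphLoc code: the k-nearest-neighbour graph construction, the single GCN-style layer, the mean pooling, the "fraction of the top score" threshold rule, and all function names (`knn_graph`, `graph_conv`, `mean_pool`, `dynamic_threshold`) are assumptions made here for illustration only.

```python
import math


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def knn_graph(features, k):
    """Assumed graph construction: one node per image, each linked to its
    k nearest neighbours in feature space (symmetric, with self-loops)."""
    n = len(features)
    adj = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for i in range(n):
        # Squared Euclidean distance is enough for ranking neighbours.
        dists = sorted(
            (sum((p - q) ** 2 for p, q in zip(features[i], features[j])), j)
            for j in range(n)
            if j != i
        )
        for _, j in dists[:k]:
            adj[i][j] = adj[j][i] = 1.0
    return adj


def graph_conv(adj, features, weight):
    """One GCN-style layer: ReLU(D^-1/2 A D^-1/2 X W)."""
    n = len(adj)
    deg = [sum(row) for row in adj]
    norm = [[adj[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]
    out = matmul(matmul(norm, features), weight)
    return [[max(0.0, v) for v in row] for row in out]


def mean_pool(node_features):
    """Average node features into a single protein-level vector."""
    n = len(node_features)
    return [sum(col) / n for col in zip(*node_features)]


def dynamic_threshold(scores, ratio=0.5):
    """Assumed rule: keep every label whose score is at least `ratio`
    times the highest score, so the cutoff adapts to each protein."""
    cutoff = ratio * max(scores)
    return [i for i, s in enumerate(scores) if s >= cutoff]


if __name__ == "__main__":
    # Toy input: three images of one protein, each with a 2-d feature vector.
    image_features = [[0.2, 0.9], [0.3, 0.8], [0.9, 0.1]]
    weight = [[1.0, 0.0], [0.0, 1.0]]  # identity weights, for illustration only
    adj = knn_graph(image_features, k=1)
    protein_vector = mean_pool(graph_conv(adj, image_features, weight))
    print(dynamic_threshold(protein_vector, ratio=0.5))
```

In this sketch the pooled vector is thresholded directly for brevity; a real model would first pass the protein-level representation through a trained classifier to obtain per-location scores.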



Supplementary Data and Code

The supplementary data and code are contained in the following compressed files:

Click here to download the supplementary files (28 MB), including our dataset information and part of the results. Click here to download the source code (433 KB, last updated: 12 Jun 2022). The code package has been tested with Python 3.6 on 64-bit Ubuntu 14.04.



Reference

Jin-Xian Hu, Yang Yang, Ying-Ying Xu, and Hong-Bin Shen. GraphLoc: a graph neural network model for predicting protein subcellular localization from immunohistochemistry images.