DimiG: Inferring disease-associated microRNAs using semi-supervised multi-label graph convolutional networks
Introduction
MicroRNAs (miRNAs) play crucial roles in many biological processes involved in diseases. Associations between diseases and protein-coding genes (PCGs) have been well investigated, and miRNAs interact with PCGs to regulate their function. It is therefore important to computationally infer disease-miRNA associations in the context of interaction networks.
In this study, we present a computational method, DimiG, to infer miRNA-associated diseases using a semi-supervised graph convolutional network (GCN) model. DimiG is a multi-label framework that integrates PCG-PCG interactions, PCG-miRNA interactions, PCG-disease associations and tissue expression profiles. DimiG is trained with a semi-supervised GCN on disease-PCG associations and a graph constructed from the PCG-PCG and miRNA-PCG interaction networks; the trained model is then used to score associations between diseases and miRNAs. We evaluate DimiG on a benchmark set collected from verified disease-miRNA associations. Our results demonstrate that DimiG yields promising performance: it outperforms the best published baseline method not trained on disease-miRNA associations by 11%, and it is comparable to two state-of-the-art supervised methods that are trained on disease-miRNA associations. Three case studies on prostate cancer, lung cancer and inflammatory bowel disease further demonstrate the efficacy of DimiG: the top miRNAs it predicts for these diseases are supported by the literature or by databases.
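The semi-supervised setup described above can be illustrated with a masked multi-label loss: only PCG nodes carry disease labels, so miRNA nodes are left out of the loss while still taking part in graph propagation. The following is a minimal sketch in plain Python, assuming a binary cross-entropy objective (the text above does not state the exact loss). The function name `masked_multilabel_bce` and all variable names are hypothetical and are not taken from the DimiG code.

```python
import math

def masked_multilabel_bce(probs, labels, labeled_mask, eps=1e-12):
    """Mean binary cross-entropy over labeled nodes only (assumed objective).

    probs[i][d]     : predicted probability that node i is associated with disease d
    labels[i][d]    : 1 if the association is known, else 0 (ignored for unlabeled nodes)
    labeled_mask[i] : True for PCG nodes (labeled), False for miRNA nodes (unlabeled)
    """
    total, count = 0.0, 0
    for p_row, y_row, is_labeled in zip(probs, labels, labeled_mask):
        if not is_labeled:
            continue  # unlabeled miRNA nodes contribute nothing to the loss
        for p, y in zip(p_row, y_row):
            p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
            total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
            count += 1
    return total / count if count else 0.0

# Toy example: two PCGs (labeled) and one miRNA (unlabeled), two diseases.
probs = [[0.9, 0.2], [0.1, 0.8], [0.6, 0.4]]
labels = [[1, 0], [0, 1], [0, 0]]
mask = [True, True, False]
loss = masked_multilabel_bce(probs, labels, mask)
```

Because the miRNA row is masked out, changing its predicted probabilities leaves the loss unchanged; after training, those same probabilities are what would be read off as disease-miRNA scores.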
Figure 1. The flowchart of DimiG with a two-layer GCN. Each node (gene) is represented by a vector of expression values across tissues from GTEx, and the network is constructed from PCG-PCG and PCG-miRNA interactions. During forward propagation, the embedding of the red node in each network is the weighted sum of the embeddings of its neighbors, and all nodes in the network are updated simultaneously. The label is a multi-hot vector indicating the presence of diseases. Finally, DimiG infers the association probabilities between diseases and unlabeled miRNAs.
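The forward propagation described in Figure 1 can be sketched as a two-layer GCN with one sigmoid output per disease. This is an illustrative sketch in plain Python, not the DimiG implementation: it assumes the commonly used symmetric normalization with self-loops, D^-1/2 (A + I) D^-1/2, and a ReLU hidden layer, neither of which is confirmed by the text above. The toy graph, features, weights and all function names are hypothetical.

```python
import math

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def normalize_adjacency(adj):
    """Return D^-1/2 (A + I) D^-1/2 (assumed normalization)."""
    n = len(adj)
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in a_hat]  # at least 1 because of the self-loop
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]

def gcn_forward(adj, features, w1, w2):
    """Two graph-convolution layers; returns per-node, per-disease probabilities."""
    a_norm = normalize_adjacency(adj)
    # Layer 1: every node aggregates its neighbors' transformed features at once.
    hidden = matmul(a_norm, matmul(features, w1))
    hidden = [[max(0.0, v) for v in row] for row in hidden]  # ReLU
    # Layer 2: aggregate again and map to one logit per disease.
    logits = matmul(a_norm, matmul(hidden, w2))
    return [[1.0 / (1.0 + math.exp(-v)) for v in row] for row in logits]

# Toy graph: PCG0 - PCG1 - miRNA2 (undirected chain).
adj = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
features = [[1.0, 0.5], [0.2, 0.1], [0.3, 0.9]]  # expression in 2 tissues
w1 = [[0.5, -0.3, 0.1], [0.2, 0.4, -0.1]]        # 2 tissues -> 3 hidden units
w2 = [[0.3, -0.2], [0.1, 0.5], [-0.4, 0.2]]      # 3 hidden units -> 2 diseases
scores = gcn_forward(adj, features, w1, w2)
mirna_scores = scores[2]  # disease probabilities for the unlabeled miRNA node
```

Independent sigmoid outputs, rather than a softmax, allow a node to be associated with several diseases at once, which matches the multi-hot label described in the caption.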
Code and Datasets
The program package consists of the main Python program together with the disease-PCG associations, PCG-PCG interactions, PCG-miRNA interactions and tissue expression profiles. To install the programs, download the source code from GitHub (DimiG).
Reference
Xiaoyong Pan and Hong-Bin Shen. Inferring disease-associated microRNAs using semi-supervised multi-label graph convolutional networks. Submitted.