In this study, we present ToxDL 2.0, a novel multimodal deep learning model that integrates both evolutionary and structural data for protein toxicity prediction. The ToxDL 2.0 model consists of three key modules: (1) a Graph Convolutional Network (GCN) module for generating protein graph embeddings, (2) a domain embedding module for capturing protein domain representations, and (3) a dense module that combines these embeddings to predict toxicity using a multilayer perceptron. We first construct a large toxicity benchmark dataset; experimental results on both the test and independent test sets demonstrate that ToxDL 2.0 outperforms existing state-of-the-art methods. Furthermore, we apply Integrated Gradients to identify known motifs associated with protein toxicity.
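As an illustration of the attribution step, the following is a minimal sketch of Integrated Gradients on a toy logistic model. It is not the ToxDL 2.0 implementation: the model, its weights `w` and bias `b`, the all-zero baseline, and the five input features are hypothetical stand-ins for a trained network and real per-position protein features. The sketch approximates the path integral of the gradient from the baseline to the input with a midpoint Riemann sum, and checks the completeness property (attributions sum to the change in model output).

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def model(x, w, b):
    """Toy stand-in for a toxicity predictor: logistic regression."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)

def grad(x, w, b):
    """Analytic gradient of the toy model with respect to its input."""
    p = model(x, w, b)
    return [p * (1.0 - p) * wi for wi in w]

def integrated_gradients(x, baseline, w, b, steps=200):
    """Midpoint Riemann approximation of Integrated Gradients."""
    total = [0.0] * len(x)
    for k in range(steps):
        alpha = (k + 0.5) / steps
        # Point on the straight line from the baseline to the input.
        point = [x0 + alpha * (xi - x0) for xi, x0 in zip(x, baseline)]
        total = [t + g for t, g in zip(total, grad(point, w, b))]
    # Scale the averaged gradient by the input difference.
    return [(xi - x0) * t / steps for xi, x0, t in zip(x, baseline, total)]

# Hypothetical weights: positions 2 and 3 play the role of a "motif".
w = [0.2, -0.1, 1.5, 1.2, -0.3]
b = -1.0
x = [1.0, 1.0, 1.0, 1.0, 1.0]
baseline = [0.0] * len(x)

attributions = integrated_gradients(x, baseline, w, b)
output_gap = model(x, w, b) - model(baseline, w, b)
top_position = max(range(len(x)), key=lambda i: attributions[i])
print(attributions, output_gap, top_position)
```

In this toy setting the positions with the largest attributions are the ones carrying the largest positive weights. With a real network, the gradient would come from automatic differentiation rather than a closed-form expression.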
Availability: ToxDL 2.0 is available at www.csbio.sjtu.edu.cn/bioinf/ToxDL2.
Figure 1. The flowchart of the proposed ToxDL 2.0
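The three-module design described above can be sketched in a few lines of NumPy. This is only an illustrative forward pass under stated assumptions, not the authors' code: the layer count, embedding sizes, ReLU activations, mean pooling, random weights, the toy contact graph, and the random `domain_emb` vector are all hypothetical choices made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

def normalize_adjacency(adj):
    """Symmetric normalization D^-1/2 (A + I) D^-1/2 of a standard GCN layer."""
    a_hat = adj + np.eye(adj.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    return a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def relu(x):
    return np.maximum(x, 0.0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gcn_module(node_feats, adj, weights):
    """Apply stacked GCN layers, then mean-pool residues into one graph embedding."""
    a_norm = normalize_adjacency(adj)
    h = node_feats
    for w in weights:
        h = relu(a_norm @ h @ w)
    return h.mean(axis=0)

def dense_module(graph_emb, domain_emb, w1, b1, w2, b2):
    """Concatenate the two embeddings and score toxicity with a small MLP."""
    z = np.concatenate([graph_emb, domain_emb])
    hidden = relu(z @ w1 + b1)
    return float(sigmoid(hidden @ w2 + b2))

# Toy protein: 6 residues, each with an 8-dimensional feature vector (assumed sizes).
n_res, feat_dim, hid_dim, dom_dim = 6, 8, 4, 5
node_feats = rng.normal(size=(n_res, feat_dim))

# Hypothetical contact graph: backbone neighbours plus one long-range contact.
adj = np.zeros((n_res, n_res))
for i in range(n_res - 1):
    adj[i, i + 1] = adj[i + 1, i] = 1.0
adj[0, 4] = adj[4, 0] = 1.0

gcn_weights = [rng.normal(size=(feat_dim, hid_dim)),
               rng.normal(size=(hid_dim, hid_dim))]
domain_emb = rng.normal(size=dom_dim)  # placeholder for the domain embedding module
w1, b1 = rng.normal(size=(hid_dim + dom_dim, 8)), np.zeros(8)
w2, b2 = rng.normal(size=8), 0.0

graph_emb = gcn_module(node_feats, adj, gcn_weights)
score = dense_module(graph_emb, domain_emb, w1, b1, w2, b2)
print(graph_emb.shape, score)
```

Mean pooling is used here only because it yields a fixed-size embedding regardless of protein length. The weights are random, so the printed score is not a meaningful prediction.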
The previous version, ToxDL, remains available at http://www.csbio.sjtu.edu.cn/bioinf/ToxDL