Background
Circular RNAs (circRNAs) interact with RNA-binding proteins (RBPs) to modulate gene expression. To date, most computational methods for predicting RBP binding sites on circRNAs operate on circRNA fragments rather than on full-length circRNAs. These methods detect whether a circRNA fragment contains a binding site, but cannot determine where the binding sites are or how many there are on the whole circRNA. We report a hybrid deep learning-based tool, called CircSite, that predicts RBP binding sites at single-nucleotide resolution and detects the key contributing sequence contents on circRNAs. CircSite takes advantage of a convolutional neural network (CNN) and a Transformer to learn local and global representations, respectively. We construct 37 datasets of RBP-binding circRNAs, and the experimental results show that CircSite offers accurate predictions of RBP binding nucleotides and detects known binding motifs. To the best of our knowledge, CircSite is the first computational tool to explore the binding nucleotides of RBPs on circRNAs. The source code of CircSite can also be found at CircSite
Method
We design a hybrid deep network consisting of a CNN, a BiGRU and a Transformer to predict RBP binding nucleotides on a circRNA. First, we use a sliding window with a step size of one to split each circRNA into fragments, and these fragments are represented as one-hot encoded matrices. Each matrix is first fed into a 1-D CNN, whose output is then passed to the BiGRU and the Transformer, respectively. Then, the two learned representations are concatenated and fed into an MLP classifier to obtain a binding score for each fragment. Finally, these scores are post-processed using a median filter and threshold binarization to obtain the binding nucleotides on the circRNAs.
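The pre- and post-processing steps described above can be sketched in a few lines of Python. This is a minimal illustration, not the CircSite implementation: the function names, the window length, the median-filter kernel size, the 0.5 threshold and the example scores are all hypothetical, and the network itself is replaced by a hard-coded score list. The sketch also treats the sequence as linear (no wrap-around across the back-splice junction) and shortens the median window at the edges, both of which are assumptions.

```python
from statistics import median

ALPHABET = "ACGU"  # column order of the one-hot encoding (assumed)


def sliding_fragments(seq, window):
    """Scan seq with a step size of one, returning every fragment of length `window`."""
    return [seq[i:i + window] for i in range(len(seq) - window + 1)]


def one_hot(fragment):
    """Encode a fragment as a list of 4-element rows; unknown bases become all zeros."""
    return [[1 if base == ref else 0 for ref in ALPHABET] for base in fragment]


def median_filter(scores, kernel=3):
    """Smooth scores with a running median; the window is truncated at both ends."""
    half = kernel // 2
    return [median(scores[max(0, i - half):i + half + 1]) for i in range(len(scores))]


def binarize(scores, threshold=0.5):
    """Mark a position as binding (1) when its score reaches the threshold."""
    return [1 if s >= threshold else 0 for s in scores]


if __name__ == "__main__":
    seq = "AUGGCUAAGC"                      # toy sequence, not real data
    fragments = sliding_fragments(seq, 4)   # hypothetical window length
    encoded = [one_hot(f) for f in fragments]
    # Placeholder scores standing in for the network output, one per fragment.
    scores = [0.1, 0.2, 0.9, 0.1, 0.8, 0.9, 0.7]
    print(binarize(median_filter(scores)))
```

In this toy run the median filter removes the isolated high score at the third position and fills the isolated low score at the fourth, so the binarized output forms one contiguous binding region instead of scattered single positions.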