Benchmark Data
Online Supporting Information A. The learning dataset for Virus-mPLoc includes 252 viral protein sequences (207 different proteins), classified into 6 subcellular locations. Among the 207 different proteins, 165 belong to one subcellular location, 39 to two locations, and 3 to three locations. Both the accession numbers and sequences are given. None of the proteins has more than 25% sequence identity to any other in the same subset (except the Capsid subset). See the text of the paper for further explanation.
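
The two totals quoted above (252 sequences versus 207 different proteins) can be reconciled with a short check. The sketch below assumes that a protein found in several locations is listed once in each of its location subsets; that reading is an assumption, not something stated in this note. The variable and function names are illustrative only.

```python
# Number of different proteins that occur in exactly k subcellular
# locations, as quoted in the dataset description.
proteins_by_location_count = {1: 165, 2: 39, 3: 3}


def count_distinct(by_k):
    """Total number of different proteins."""
    return sum(by_k.values())


def count_sequence_entries(by_k):
    """Total entries, assuming each protein is listed once per location."""
    return sum(k * n for k, n in by_k.items())


distinct = count_distinct(proteins_by_location_count)
entries = count_sequence_entries(proteins_by_location_count)
print(distinct, entries)  # prints: 207 252
```

Under that assumption, 165 + 39 + 3 gives the 207 different proteins, and 165 + 2 × 39 + 3 × 3 gives the 252 sequence entries.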
