Online Supporting Information A. The benchmark dataset for Gpos-mPLoc includes 523 Gram-positive bacterial protein sequences (519 different proteins), classified into 4 subcellular locations. Of the 519 different proteins, 515 belong to one location and 4 belong to two locations. Both the accession numbers and the sequences are given. None of the proteins has more than 25% sequence identity to any other in the same subset (subcellular location). See the text of the paper for further explanation.
Click Supp-A to download the dataset.
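The two totals quoted above (523 sequences versus 519 different proteins) are consistent if a protein assigned to two locations is listed once under each location. That reading is an assumption drawn from the arithmetic, not a statement taken from the dataset file. The sketch below only checks the counts; the function name `count_sequence_entries` is hypothetical.

```python
def count_sequence_entries(proteins_by_num_locations):
    """Return the number of sequence entries, assuming each protein is
    listed once for every subcellular location it belongs to.

    proteins_by_num_locations maps "number of locations" to
    "number of different proteins with that many locations".
    """
    return sum(
        num_locations * num_proteins
        for num_locations, num_proteins in proteins_by_num_locations.items()
    )


# Counts quoted in the dataset description:
# 515 proteins with one location, 4 proteins with two locations.
counts = {1: 515, 2: 4}

different_proteins = sum(counts.values())         # 515 + 4
sequence_entries = count_sequence_entries(counts)  # 515*1 + 4*2

print(different_proteins, sequence_entries)
```

Under this assumption the check reproduces both figures in the description: 519 different proteins and 523 sequence entries.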