Benchmark Data
Online Supporting Information A. This dataset contains 6,181 locative protein sequences (5,618 different proteins), classified into 22 eukaryotic subcellular locations. A protein that occurs in more than one location is counted once per location among the locative sequences, which is why that count exceeds the number of different proteins. Of the 5,618 different proteins, 5,091 belong to one subcellular location, 495 to two locations, 28 to three locations, and 4 to four locations. Both the accession numbers and the sequences are given. Within each location subset, no protein has more than 25% sequence identity to any other, with three exceptions: acrosome, melanosome, and synapse. Applying the 25% cutoff to these three subsets would leave too few proteins to have statistical significance. See the reference given on the top page of the web-server for further explanation. Click Supp-A to download the dataset.
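The two totals quoted above can be cross-checked from the per-multiplicity counts. The short sketch below is only an illustration of that arithmetic, assuming that each protein contributes one locative sequence per location it belongs to; the variable and function names are hypothetical and do not come from the web-server or the dataset file.

```python
# Number of different proteins, keyed by how many subcellular
# locations each protein belongs to (figures taken from the text above).
proteins_by_location_count = {1: 5091, 2: 495, 3: 28, 4: 4}


def count_different_proteins(counts):
    """Each protein is counted once, regardless of its number of locations."""
    return sum(counts.values())


def count_locative_proteins(counts):
    """Each protein is counted once per location it belongs to (assumption)."""
    return sum(n_locations * n_proteins
               for n_locations, n_proteins in counts.items())


different = count_different_proteins(proteins_by_location_count)
locative = count_locative_proteins(proteins_by_location_count)

print(different)  # 5618
print(locative)   # 6181
```

Both results match the figures stated in the description: 5,618 different proteins and 6,181 locative sequences.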