Online Supporting Information A. This dataset contains 6,181 locative protein sequences
(5,618 different proteins), classified into 22 eukaryotic subcellular locations. Among the 5,618
different proteins, 5,091 belong to one subcellular location, 495 to two locations, 28 to three
locations, and 4 to four locations. Both the accession numbers and sequences are given. Within each
subset, no protein has more than 25% sequence identity to any other, except in the following three
locations: acrosome, melanosome and synapse. The 25% cutoff was not applied to these three subsets because
too few proteins would remain in them to have statistical significance. See the reference given on the top page of the web-server for further
explanation. Click Supp-A to download the dataset.
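
The two totals quoted above can be cross-checked with a short calculation. The sketch below assumes that a "locative protein" is a protein counted once for each location it is assigned to; this reading is consistent with the figures above but is not stated explicitly here. The variable and function names are illustrative only and are not part of the dataset or the web-server.

```python
# Number of different proteins, keyed by how many subcellular
# locations each protein is assigned to (figures from the text above).
proteins_by_location_count = {1: 5091, 2: 495, 3: 28, 4: 4}


def count_different_proteins(counts):
    """Total number of different proteins, each counted once."""
    return sum(counts.values())


def count_locative_proteins(counts):
    """Total number of locative proteins, assuming each protein is
    counted once per location it belongs to (an assumption)."""
    return sum(n_locations * n_proteins
               for n_locations, n_proteins in counts.items())


different_total = count_different_proteins(proteins_by_location_count)
locative_total = count_locative_proteins(proteins_by_location_count)

print("different proteins:", different_total)
print("locative proteins:", locative_total)
```

Under that assumption, the sums reproduce the 5,618 different proteins and 6,181 locative protein sequences given in the description.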