Online Supporting Information A. The benchmark dataset (HumB) consists of human proteins extracted from the SWISS-PROT release of January 2012. It includes 4,229 protein sequences (3,129 different proteins), classified into 12 human subcellular locations. Among the 3,129 different proteins, 2,306 belong to only 1 location; 595 belong to 2 locations; 186 belong to 3 locations; 36 belong to 4 locations; 5 belong to 5 locations; and 1 belongs to 6 locations. Both the accession numbers and the sequences are given. None of the proteins has more than 25% sequence identity to any other protein in this benchmark.
See the text of the paper for further explanation.
Click Supp-A to download the benchmark dataset (HumB).
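The counts stated above can be cross-checked with a short calculation. The sketch below is illustrative only: it assumes that the figure of 4,229 counts each protein once for every location it belongs to, an interpretation that is consistent with the arithmetic but should be confirmed against the text of the paper. The variable and function names are hypothetical and are not part of the dataset.

```python
# Number of different proteins that belong to exactly k subcellular locations,
# copied from the description of the HumB benchmark above.
proteins_by_location_count = {1: 2306, 2: 595, 3: 186, 4: 36, 5: 5, 6: 1}


def total_proteins(counts):
    """Total number of different proteins across all multiplicity classes."""
    return sum(counts.values())


def total_locative_entries(counts):
    """Total entries when a protein is counted once per location.

    Assumption: this is how the figure of 4,229 is obtained.
    """
    return sum(k * n for k, n in counts.items())


distinct = total_proteins(proteins_by_location_count)
entries = total_locative_entries(proteins_by_location_count)
print(distinct, entries)  # prints: 3129 4229
```

Both totals agree with the figures quoted in the description: 3,129 different proteins and 4,229 entries.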