Benchmark Datasets
Online Supporting Information A: The benchmark dataset used to discriminate secretory proteins from non-secretory proteins and to identify the signal peptides of the secretory proteins. The dataset consists of six subsets: (1a) 3,694 eukaryotic secretory proteins, (1b) 3,785 eukaryotic non-secretory proteins, (2a) 699 Gram-negative secretory proteins, (2b) 721 Gram-negative non-secretory proteins, (3a) 303 Gram-positive secretory proteins, and (3b) 306 Gram-positive non-secretory proteins. Each entry has three lines: (1) the accession number and the positions of the first and last amino acids of the signal peptide; (2) the first 100 N-terminal amino acids; and (3) a symbol line in which 'S' marks the residues belonging to the signal peptide and 'M' marks those belonging to the mature protein. To download the data, click Supplementary under Online Supporting Information A.
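The three-line entry layout described above can be read with a short script. The following Python sketch is illustrative only and is not part of the supplementary material. It assumes that entries follow one another as consecutive groups of three non-blank lines, that the symbol line has the same length as the sequence line, and that non-secretory entries carry only 'M' symbols. The names Entry, parse_entries and signal_peptide are hypothetical, and the sample record is made up to show the layout. Because the exact layout of the first line is not specified here, the sketch keeps that line verbatim and recovers the signal peptide from the symbol line instead.

```python
from typing import Iterator, NamedTuple


class Entry(NamedTuple):
    header: str    # line 1: accession number and signal peptide positions, kept verbatim
    sequence: str  # line 2: N-terminal residues (up to 100)
    labels: str    # line 3: 'S' = signal peptide residue, 'M' = mature protein residue


def parse_entries(text: str) -> Iterator[Entry]:
    """Yield one Entry per group of three non-blank lines (assumed layout)."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    if len(lines) % 3 != 0:
        raise ValueError("expected a multiple of three non-blank lines")
    for i in range(0, len(lines), 3):
        header, sequence, labels = lines[i:i + 3]
        if len(sequence) != len(labels):
            raise ValueError(f"sequence/symbol length mismatch in entry {header!r}")
        if set(labels) - {"S", "M"}:
            raise ValueError(f"unexpected symbol in entry {header!r}")
        yield Entry(header, sequence, labels)


def signal_peptide(entry: Entry) -> str:
    """Return the residues marked 'S'; empty if the entry has no 'S' symbols."""
    return "".join(aa for aa, lab in zip(entry.sequence, entry.labels) if lab == "S")


if __name__ == "__main__":
    # Made-up record for illustration; real entries come from the downloaded file.
    sample = "EXAMPLE1 1-5\nMKKLLAQDEF\nSSSSSMMMMM\n"
    for entry in parse_entries(sample):
        print(entry.header, signal_peptide(entry))
```

Reading the symbol line rather than the positions on the first line keeps the sketch independent of how the accession number and positions are delimited. If the downloaded files use a different record separator, only parse_entries needs to change.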