NLSExplorer Nuclear Localization Signal Exploration

Nuclear localization signals (NLSs) are short amino acid sequences within proteins that direct their transport into the cell nucleus. NLS-mediated entry into the nucleus is an essential step of nuclear import. Adding these segments to a protein, or cleaving them from it, can markedly change its nuclear localization. Systematically exploring and locating these signals therefore helps clarify how nuclear transport works and which principles govern it.

Nuclear Localization Signal Candidate Library

Interactive NLS map discovered by NLSExplorer

Search, project, and collect NLS segments of interest

The first time you access this map, preloading the data will take some time. Thank you for your patience.

Interactive Nuclear Transport Pattern Map among species

The pattern map displays the patterns of nuclear transport segments mined from various species, arranged according to their occurrence. Using the tools in the upper-right corner of each map, you can view it from various perspectives.

NLS prediction

Step 1 : Input protein sequence (Example):

Note that this prediction detects NLSs from the full protein sequence, which is a context-dependent process. If your input is a segment of interest or an NLS pattern, we suggest using the search function of the NLS candidate library above, which is a context-free application. Prediction results are kept for two days and shown on the result board below. Please note that the submitted sequence should be less than 1022 amino acids long.

You can also upload a sequence file (in FASTA format); each protein sequence in it should be less than 1022 amino acids long.
If a sequence is longer than 1022 amino acids, we suggest cutting it into pieces.
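
The suggestion to cut long sequences into pieces can be sketched in Python as follows. This is only an illustration, not part of NLSExplorer: the function names, the piece length of 1021 residues (chosen to stay strictly under the stated limit), and the 50-residue overlap between neighboring pieces (so that a signal lying on a cut point still appears whole in one piece) are all assumptions.

```python
def read_fasta(text):
    """Parse FASTA text into a list of (header, sequence) pairs."""
    records, header, chunks = [], None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def split_sequence(seq, max_len=1021, overlap=50):
    """Cut seq into (start, piece) windows of at most max_len residues.

    Consecutive windows share `overlap` residues; `start` is 0-based.
    Both defaults are assumptions made for this sketch.
    """
    if not 0 <= overlap < max_len:
        raise ValueError("overlap must be non-negative and smaller than max_len")
    if len(seq) <= max_len:
        return [(0, seq)]
    step = max_len - overlap
    pieces, start = [], 0
    while True:
        pieces.append((start, seq[start:start + max_len]))
        if start + max_len >= len(seq):
            break
        start += step
    return pieces


def split_fasta(text, max_len=1021, overlap=50):
    """Return FASTA text in which every record has at most max_len residues."""
    out = []
    for header, seq in read_fasta(text):
        pieces = split_sequence(seq, max_len, overlap)
        for start, piece in pieces:
            if len(pieces) == 1:
                out.append(">" + header)
            else:
                # Append the 1-based, inclusive residue range of this piece.
                out.append(">%s|%d-%d" % (header, start + 1, start + len(piece)))
            out.append(piece)
    return "".join(line + "\n" for line in out)
```

Because the pieces overlap, a segment predicted in two neighboring pieces refers to the same residues; the range appended to each header lets you map piece coordinates back to the original protein.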

Step 2 : Choose parameters (optional)

This step is optional and lets you choose the parameters used to run NLSExplorer. The defaults are "Exploration cofactor equal to 0.2, No filter, No stretch".

Exploration cofactor: Indicates the scale of exploration. A higher value gives more chances to detect NLSs but also introduces more noise.

Filter single residues: If you choose not to filter, single residues are randomly stretched to form an output segment.

Stretch prediction: Decides whether the predicted segment is stretched according to a predefined distribution.
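
To make the trade-off behind these parameters concrete, here is a toy Python sketch. It is not the NLSExplorer procedure, which is described in the cited paper and may differ substantially. It assumes, purely for illustration, that each residue has a score, that the exploration cofactor is the fraction of top-scoring residues kept, and that filtering drops segments made of a single residue; the function name and the scores are hypothetical, and stretching is not modeled.

```python
def select_segments(scores, cofactor=0.2, filter_single=False):
    """Group the top-scoring residues into contiguous segments.

    scores: one hypothetical score per residue.
    cofactor: assumed here to be the fraction of residues to keep.
    Returns (start, end) pairs, 0-based and end-inclusive.
    """
    if not 0 < cofactor <= 1:
        raise ValueError("cofactor must be in (0, 1]")
    if not scores:
        return []
    n_keep = max(1, int(cofactor * len(scores)))
    # Rank residue indices by score, keep the best n_keep, restore order.
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    kept = sorted(ranked[:n_keep])
    # Merge neighboring indices into segments.
    segments = []
    start = prev = kept[0]
    for i in kept[1:]:
        if i == prev + 1:
            prev = i
            continue
        segments.append((start, prev))
        start = prev = i
    segments.append((start, prev))
    if filter_single:
        # Drop segments that consist of one isolated residue.
        segments = [(s, e) for s, e in segments if e > s]
    return segments


toy_scores = [0.05, 0.9, 0.8, 0.1, 0.12, 0.7, 0.11, 0.2, 0.13, 0.14]
print(select_segments(toy_scores, cofactor=0.2))
print(select_segments(toy_scores, cofactor=0.5))
print(select_segments(toy_scores, cofactor=0.5, filter_single=True))
```

In this toy setting, raising the cofactor from 0.2 to 0.5 adds several isolated low-scoring residues to the output, which mirrors the "more detection, more noise" behavior described above.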

Step 3 : Input protein structure (in PDB format) for 3D-attention Visualization (optional).

If you choose to input a PDB file, make sure the structure file is in the correct format (click the example button below for details).
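
Before uploading, a quick local check of the structure file can catch obvious format problems. The sketch below is not the server's own validation; it assumes only the standard fixed-column layout of PDB ATOM records and reads the residue sequence of each chain from the alpha-carbon atoms. The function name is hypothetical.

```python
THREE_TO_ONE = {
    "ALA": "A", "ARG": "R", "ASN": "N", "ASP": "D", "CYS": "C",
    "GLN": "Q", "GLU": "E", "GLY": "G", "HIS": "H", "ILE": "I",
    "LEU": "L", "LYS": "K", "MET": "M", "PHE": "F", "PRO": "P",
    "SER": "S", "THR": "T", "TRP": "W", "TYR": "Y", "VAL": "V",
}


def ca_sequences(pdb_text):
    """Return {chain_id: one-letter sequence} read from CA atoms.

    Raises ValueError if there are no ATOM records or if an ATOM
    record is truncated or has non-numeric coordinates.
    """
    chains, seen, n_atoms = {}, set(), 0
    for lineno, line in enumerate(pdb_text.splitlines(), start=1):
        if not line.startswith("ATOM  "):
            continue
        if len(line) < 54:
            raise ValueError("line %d: ATOM record is shorter than 54 columns" % lineno)
        try:
            # x, y, z occupy columns 31-38, 39-46 and 47-54.
            float(line[30:38]), float(line[38:46]), float(line[46:54])
        except ValueError:
            raise ValueError("line %d: coordinates are not numeric" % lineno)
        n_atoms += 1
        # Alpha carbon is written " CA " in columns 13-16; keep only
        # the first alternate location (blank or "A", column 17).
        if line[12:16] != " CA " or line[16] not in " A":
            continue
        chain = line[21]                  # column 22
        residue_key = (chain, line[22:27])  # residue number + insertion code
        if residue_key in seen:
            continue
        seen.add(residue_key)
        # Unknown residue names are reported as "X".
        chains.setdefault(chain, []).append(THREE_TO_ONE.get(line[17:20], "X"))
    if n_atoms == 0:
        raise ValueError("no ATOM records found")
    return {chain: "".join(residues) for chain, residues in chains.items()}
```

Comparing the returned sequence with the sequence submitted in Step 1 is a simple way to confirm that the structure and the sequence describe the same protein.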

You can also upload your protein file (in PDB format) below:

Step 4 : Email to receive the results (optional):

Additional Information

We have developed several online applications to advance research on the NLS universe.

The NLS candidate library is built by leveraging the exploration capabilities of NLSExplorer and includes not only known NLSs but also potential ones. The similarity between sequences and structures is reflected in the neighborhood relationships of this map, which are based on the cosine similarity of segment embeddings. The map helps build a comprehensive landscape for each prediction by automatically searching for the nearest neighbors, and it provides customizable parameters to meet various usage requirements.

NLSExplorer-SCNLS is a tool for highlighting the core amino acids of NLSs and discovering discontinuous NLS patterns. It facilitates NLS template finding and the discovery of novel types of NLS patterns.

The Nuclear Transport pattern map, mined by the SCNLS algorithm, provides a reference for potential NLS patterns and other key segments important for nuclear transport. The map helps analyze potential NLS characteristics and uncover the evolutionary relationships of NLSs among species. In addition, it may support advances in applications such as targeted drug delivery, novel treatments for nuclear-protein-related diseases, and the development of new nuclear proteins for biological research.
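
The nearest-neighbor search over segment embeddings mentioned for the candidate library can be illustrated with a minimal Python sketch. The segment labels and the two-dimensional vectors are made-up placeholders (real embeddings have many more dimensions), and the function names are hypothetical; only the cosine similarity measure itself is taken from the description above.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # convention chosen for this sketch
    return dot / (norm_u * norm_v)


def nearest_neighbors(query, library, k=3):
    """Return the k library entries most similar to `query`.

    library: dict mapping a segment label to its embedding vector.
    Returns (label, similarity) pairs, most similar first.
    """
    scored = [(cosine_similarity(query, emb), label) for label, emb in library.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [(label, score) for score, label in scored[:k]]


# Placeholder embeddings, for illustration only.
toy_library = {
    "segment_a": [1.0, 0.0],
    "segment_b": [0.0, 1.0],
    "segment_c": [1.0, 1.0],
}
for label, score in nearest_neighbors([1.0, 0.1], toy_library, k=2):
    print(label, round(score, 3))
```

Cosine similarity compares the direction of two embeddings and ignores their length, so two segments count as neighbors when their embeddings point the same way regardless of magnitude.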

For a given protein segment, the significance of each signal peptide fragment varies depending on the function of interest. Suppose first that an expert already possesses sufficient knowledge and understanding. When presented with a set of materials, the expert's gaze will naturally focus on areas of personal interest. Suppose also that a recorder logs and analyzes the frequency of these gaze patterns, thereby reflecting the expert's attention distribution throughout the test. Now consider a different scenario, in which the expert's gaze is directed by a specific requirement. For example, if the task is to determine whether a protein is localized within the nucleus, the expert's attention will shift to nucleus-related information. Our model operates under the assumption that language models possess a substantial amount of knowledge. In this context, the knowledgeable expert is replaced by a language model, the task presented to it is to identify nuclear-localized proteins, and the tools used to record the patterns are A2KA and SCNLS.

If you use NLSExplorer, please cite the following paper:

Yi-Fan Li, Xiaoyong Pan, and Hong-Bin Shen, Discovering nuclear localization signal universe through a novel deep learning model with interpretable attention units, Patterns, 2025, 6: 101262.

The software is free for academic users ONLY. For commercial use, please contact us.

Contact: Hongbin Shen (hbshen@sjtu.edu.cn)