DeepQs

DeepQs is an efficient deep-learning approach for estimating the local quality of cryo-EM 3D density maps, and it is parameter-free for the user. In this work, we bring the map-model score, Q-score, into resolution estimation. To achieve this, we use a 3D Vision Transformer (ViT) network that predicts the score from the density map alone, so no atomic model is needed as input when the map-model fit score is calculated. The entire program is implemented in Python using the PyTorch package.


Training pipeline

1.Prepare dataset

The raw dataset for this study consists of a collection of cryo-EM maps obtained from the Electron Microscopy Data Bank (EMDB) and their corresponding atomic models obtained from the Protein Data Bank (PDB).

2.Rescale and Q-score calculation

To ensure consistency, all training maps should be resampled to a grid spacing of 1 Å using trilinear interpolation (when using DeepQs025A.py, maps should instead be resampled to a grid spacing of 0.25 Å). The Q-score is then calculated for each atom from the rescaled map and the corresponding PDB file to determine the atom's resolvability.
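As an illustration of the rescaling step, the following is a minimal NumPy sketch of trilinear resampling, not the code shipped with DeepQs. The function name `resample_trilinear` and its arguments are hypothetical, and the sketch assumes a cubic voxel whose size is known in Å and a map already loaded as a 3D array.

```python
import numpy as np

def resample_trilinear(volume, voxel_size, target_voxel_size=1.0):
    """Resample a 3D map to the target voxel size (in Å).

    Linear interpolation is applied along each axis in turn, which is
    equivalent to trilinear interpolation on a regular grid.
    """
    out = np.asarray(volume, dtype=np.float32)
    for axis in range(3):
        n = out.shape[axis]
        # Number of output samples that covers the same physical extent.
        m = max(1, int(round(n * voxel_size / target_voxel_size)))
        # Output positions in input-voxel units; samples past the last
        # input voxel are clamped to it (edge replication).
        pos = np.minimum(np.arange(m) * target_voxel_size / voxel_size, n - 1)
        lo = np.floor(pos).astype(int)
        hi = np.minimum(lo + 1, n - 1)
        shape = [1, 1, 1]
        shape[axis] = m
        w = (pos - lo).astype(np.float32).reshape(shape)
        out = np.take(out, lo, axis=axis) * (1 - w) + np.take(out, hi, axis=axis) * w
    return out

# Example: a 2x2x2 map with 2 Å voxels becomes a 4x4x4 map with 1 Å voxels.
demo = np.arange(8, dtype=np.float32).reshape(2, 2, 2)
print(resample_trilinear(demo, voxel_size=2.0).shape)
```

For the 0.25 Å variant, the same sketch would be called with `target_voxel_size=0.25`.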

3.Dataset preparation

Boxes of 9×9×9 voxels are sampled around the atomic coordinates. Each coordinate is rounded to the nearest grid point, and an atom is filtered out if the distance between its original coordinate and the rounded one is larger than 0.2 Å.
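The sampling rule can be sketched as follows. This is an illustrative NumPy example under stated assumptions, not the authors' implementation: the function name `sample_boxes` is hypothetical, the map is assumed to sit on a 1 Å grid with its origin at voxel (0, 0, 0), atomic coordinates are assumed to be ordered like the array axes, and the distance is taken to be Euclidean. Skipping atoms whose box would extend past the map edge is an additional choice made here.

```python
import numpy as np

BOX = 9            # box edge length in voxels (from the text)
HALF = BOX // 2
MAX_SHIFT = 0.2    # maximum rounding distance in Å (from the text)

def sample_boxes(volume, atom_coords, q_scores):
    """Return (boxes, labels) for atoms that lie close to a grid point."""
    coords = np.asarray(atom_coords, dtype=float)
    centers = np.rint(coords).astype(int)
    # Distance between each original coordinate and its rounded counterpart.
    shift = np.linalg.norm(coords - centers, axis=1)
    boxes, labels = [], []
    for center, dist, q in zip(centers, shift, q_scores):
        if dist > MAX_SHIFT:
            continue  # atom is too far from the nearest grid point
        lo, hi = center - HALF, center + HALF + 1
        if np.any(lo < 0) or np.any(hi > volume.shape):
            continue  # assumption: skip boxes that would leave the map
        boxes.append(volume[lo[0]:hi[0], lo[1]:hi[1], lo[2]:hi[2]])
        labels.append(q)
    boxes = np.array(boxes, dtype=np.float32).reshape(-1, BOX, BOX, BOX)
    return boxes, np.array(labels, dtype=np.float32)
```

Each returned box is centred on the rounded atom position, so the voxel at index (4, 4, 4) is the grid point nearest to the atom.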

4.Training

Once all data have been processed, network training can begin. In this work, we train a Vision Transformer (ViT) network using the PyTorch framework. Each 9×9×9 box is used as the network input, with the Q-score serving as the label. The network is optimized with the mean squared error (MSE) loss function.
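To make the input, label and loss concrete, here is a minimal PyTorch sketch of one training step. The `TinyViT3D` class is a hypothetical stand-in and not the DeepQs architecture: its patch size, embedding width, depth, head count, mean pooling and the choice of the Adam optimizer are all assumptions made for illustration. Only the 9×9×9 input, the Q-score label and the MSE loss come from the description above.

```python
import torch
from torch import nn

class TinyViT3D(nn.Module):
    """Hypothetical small 3D ViT: 3x3x3 patches of a 9x9x9 box become 27 tokens."""

    def __init__(self, box=9, patch=3, dim=64, depth=2, heads=4):
        super().__init__()
        self.patch = patch
        self.n = box // patch                      # patches per axis
        self.embed = nn.Linear(patch ** 3, dim)    # linear patch embedding
        self.pos = nn.Parameter(torch.zeros(1, self.n ** 3, dim))
        layer = nn.TransformerEncoderLayer(
            d_model=dim, nhead=heads, dim_feedforward=2 * dim, batch_first=True
        )
        self.encoder = nn.TransformerEncoder(layer, num_layers=depth)
        self.head = nn.Linear(dim, 1)              # regression head for the Q-score

    def forward(self, x):                          # x: (batch, 9, 9, 9)
        b, n, p = x.shape[0], self.n, self.patch
        # Split each axis into (patch index, offset within patch), then group offsets.
        x = x.reshape(b, n, p, n, p, n, p).permute(0, 1, 3, 5, 2, 4, 6)
        tokens = self.embed(x.reshape(b, n ** 3, p ** 3)) + self.pos
        pooled = self.encoder(tokens).mean(dim=1)  # average over tokens
        return self.head(pooled).squeeze(-1)       # (batch,)

def train_step(model, optimizer, boxes, q_scores):
    """Run one optimization step with the MSE loss and return the loss value."""
    model.train()
    optimizer.zero_grad()
    loss = nn.functional.mse_loss(model(boxes), q_scores)
    loss.backward()
    optimizer.step()
    return loss.item()

# Example with random stand-in data.
model = TinyViT3D()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
loss = train_step(model, optimizer, torch.randn(8, 9, 9, 9), torch.rand(8))
```

In practice the random tensors would be replaced by batches of sampled boxes and their Q-scores drawn from a `DataLoader`.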

Instructions

1.Environment for Python Source Code

The whole program is implemented in Python 3 using the PyTorch package. The Python packages used by the program are listed in 'requirements.txt' under the source code folder. We suggest using Conda to create a Python environment for running the source code.

2.Input and Output

When a new cryo-EM map is obtained, please resize it to a grid size of 1 Å using our program. A map and a mask are required to obtain the final result. The program can create the mask from the recommended contour level given on the map's EMDB entry page, or you can provide a binary mask. The final output includes two maps: the network's original output score map and a local resolution map, which is converted from the former.
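One simple way to derive a binary mask from a contour level is to threshold the map, as in the sketch below. This shows the general idea only and may differ from what the program does internally. The function names `mask_from_contour` and `apply_mask` are hypothetical, and using `>=` as the comparison and 0 as the fill value are assumptions.

```python
import numpy as np

def mask_from_contour(volume, contour_level):
    """Binary mask: 1 where the density reaches the contour level, else 0."""
    return (np.asarray(volume) >= contour_level).astype(np.uint8)

def apply_mask(score_map, mask, fill=0.0):
    """Keep scores inside the mask and set everything outside to `fill`."""
    return np.where(np.asarray(mask).astype(bool), score_map, fill)

# Example on a tiny synthetic map with a contour level of 0.5.
density = np.array([[[0.1, 0.6], [0.5, 0.2]], [[0.9, 0.0], [0.3, 0.7]]])
mask = mask_from_contour(density, 0.5)
print(int(mask.sum()))  # number of voxels inside the mask
```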

3.Dataset

The EMDB codes and PDB IDs of the maps and atomic models used in network training are listed in the supplementary Excel file.

Code&Data download

Please fill in the relevant information (not mandatory) to help us improve this program and to get additional support. Click to get the ALL_FILES. More detailed comments and instructions are provided in the source code to help users understand the program.