Relion 3D classification

Michael A. Cianfrocco edited this page Jun 7, 2018 · 1 revision

##Relion 3D classification

The premise of 3D classification is to use an initial starting model to sort your particles into 3 or more groups. Using the 3D model as a reference, Relion applies maximum likelihood methods to assign each particle to a class, so that each class is more homogeneous than the full dataset.

3D classification in Relion computationally 'purifies' your dataset, allowing you to further analyze a homogeneous class.

This procedure can run on CTF-uncorrected or CTF-corrected data. Commonly, negative stain data will not be CTF corrected, but cryo-EM data will be CTF-corrected.

The following steps outline the options that you need to enter in the Relion GUI.

  1. Inputs for 3D classification in Relion
  2. Launching & using Relion's GUI
  3. Submitting job to cluster
  4. Example cluster submission script
  5. Outputs
  6. Continuing stopped Relion run

####Inputs for 3D classification in Relion

After extracting your particles in Relion, you will have all the inputs necessary to run 3D classification. This includes:

  • .star file generated with per-particle CTF and micrograph information. This is named [rootname].star by the extraction step.
  • .mrcs and .star files for each micrograph within the folder Particles/Micrographs
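
Before launching a run, it can help to confirm that the particle STAR file actually contains the columns you expect. The following is a minimal sketch, assuming the labels appear in the STAR header one per line as `_rlnLabel #N`. The function name `check_star_labels` and the file name in the usage line are hypothetical, and the labels shown are common Relion labels rather than a definitive list of what 3D classification requires.

```shell
# Report any of the given column labels that are missing from a STAR file header.
# Prints "ok" when every requested label is present.
check_star_labels() {
    star_file="$1"; shift
    missing=""
    for label in "$@"; do
        # Header lines look like "_rlnDefocusU #5"; compare the first field exactly.
        if ! awk -v want="$label" '$1 == want { found = 1 } END { exit !found }' "$star_file"; then
            missing="$missing $label"
        fi
    done
    if [ -n "$missing" ]; then
        echo "missing:$missing"
        return 1
    fi
    echo "ok"
}

# Usage (hypothetical file name):
# check_star_labels particles.star _rlnImageName _rlnMicrographName _rlnDefocusU
```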

####Launching & using Relion's GUI

$ relion & 

This opens the Relion GUI window, where you first set the following:

  • Input pixel size for your data. If you binned your data during extraction, enter the binned pixel size (the original pixel size multiplied by the binning factor).
  • The diameter selected must extend beyond the protein density that you expect, but if it is too wide, you will include extra background noise, which can lead to artifacts during analysis.

Now, click on the 3D classification tab in the left-hand list. You will need to enter the following inputs:
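
The arithmetic behind these two inputs can be sketched as follows. The function names are hypothetical. The 200 Angstrom diameter and 1.77 Angstrom pixel size come from the example command later on this page, while the unbinned pixel size of 0.885 is an assumed value chosen only to illustrate 2x binning.

```shell
# Pixel size after binning: original pixel size (Angstroms/pixel) times the binning factor.
binned_angpix() {
    awk -v apix="$1" -v bin="$2" 'BEGIN { printf "%.3f\n", apix * bin }'
}

# Mask diameter expressed in pixels, for comparison against the particle box size.
diameter_in_pixels() {
    awk -v diam="$1" -v apix="$2" 'BEGIN { printf "%.1f\n", diam / apix }'
}

binned_angpix 0.885 2        # prints 1.770
diameter_in_pixels 200 1.77  # prints 113.0
```

If the diameter in pixels approaches or exceeds the box size, the mask leaves little or no background around the particle, which is a sign that the box or the diameter should be revisited.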

  • Particle star file - this file was output after particle extraction and is the rootname specified during extraction with the .star file extension
  • Output file name - by default all outputs will be put into a 'Class3D' folder. You can change the run rootname to something more descriptive than 'run1' if you want.
  • Number of classes - this is an important input. The more classes requested, the longer the 3D classification run will take. A good number is 3 to 5 classes. Usually, a few of the classes will be 'bad' and will not make much sense at the end of classification, whereas 1 or 2 will be real classes that should be more homogeneous.

Clicking on the next tab Reference, you will provide inputs for the 3D model that you will use as an initial model:

  • Reference map - This must be a 3D model in .mrc format that has the same box size and pixel size as your single particle data.
  • Greyscale? - Since Relion uses image statistics during image analysis, the algorithms expect that the data conform to specific conventions for their average and standard deviation values. If this initial model did NOT come from Relion, then answer 'No' since the map is not on 'absolute greyscale.'
  • Initial low-pass filter - For 3D classification and refinement, it is important to input a filtered 3D model in order to avoid model bias. This means that you can 1) filter your model outside of Relion or 2) you could filter it using Relion as specified here. You should filter your model to 50 - 60 Angstroms to prevent model bias.
  • Symmetry - Input symmetry for your molecule. Asymmetric single particles are 'C1'.
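
Because the reference map must match the box size of the particle images, a quick header check can catch a mismatch before the job is submitted. This is a rough sketch that reads the first three 32-bit integers of an MRC header (NX, NY, NZ). It assumes a little-endian file read on a little-endian machine, and the function name `mrc_box_size` and the file names are hypothetical. It does not check the pixel size.

```shell
# Print NX NY NZ from an MRC header (the first three 32-bit integers).
# Assumption: little-endian file and little-endian host.
mrc_box_size() {
    od -An -t d4 -N 12 "$1" | awk '{ print $1, $2, $3 }'
}

# Usage (hypothetical file names):
# mrc_box_size initial_model.mrc
# mrc_box_size Particles/Micrographs/mic1_particles.mrcs
```

For a particle stack (.mrcs), the third number is the number of images in the stack, so only the first two values should be compared against the reference map.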

Clicking on to the next tab CTF, Relion asks if you will be performing CTF correction:

  • CTF - correction? - You can only perform CTF correction if you extracted particles using Relion with CTFFIND3 input log files. CTF-correction is recommended for cryo-EM data, and is not necessary for negative stain data.
  • Has reference been CTF corrected? - If the 3D model came from Relion as a CTF-corrected output model, then you can answer 'Yes' to this question. Otherwise, 'No'.
  • Have data been phase-flipped? - If you are analyzing negative stain data without CTF-correction, you can input phase flipped particles here and answer 'Yes' to this question. Normally, the answer is 'No'.
  • Ignore CTFs until first peak? - In case the CTF confidence is low for only low-resolution information, then you could set this to 'Yes'. Generally, this is not a problem and we have not seen anyone need to use this option.

Clicking through to the Optimisation tab, you will input options for the 3D classification run:

  • Number of iterations - provide the total number of iterations over which to run 3D classification. Typical numbers are 25 - 30 iterations.
  • Regularisation parameter - This sets how strongly the experimental data are weighted relative to the 3D model when Relion decides whether particles belong to a given group. Higher values weight the data more heavily. To more heavily weight the data, select a value of 4 for 3D classification, whereas 3 can be used to weight the data and model more equally.
  • Mask particles with zeros? - This will mask the particle data outside of the given radius to a value of 0. This will help to prevent overalignment of particles to background noise.
  • Reference mask - The reference mask is a shape with the value of '1' on the inside of the mask and '0' on the outside of the mask. This is very similar to 'solvent flattening' in X-ray crystallography. If you know the shape of your sample very well, then you can exclude all data that falls outside of a given mask.
  • Limit resolution E-step to (A) - If you think noise is overaligning your data, then limiting the resolution considered during analysis can help to prevent high resolution noise from affecting your analysis. Since 3D classification will not go past 11 or 12 Angstroms, you can limit high frequencies by specifying a limit to 12 Angstroms if you are having trouble classifying your data.

The last tab is the Sampling tab, where you input image alignment information.

  • Perform image alignment? - Set this to 'Yes' as you are trying to align and classify your data. If your data were already aligned, you could simply perform classification without alignment by choosing 'No'.
  • Angular sampling interval - This value specifies how finely 3D orientations will be searched during 3D classification. The smaller the angular sampling interval, the more views will be searched. A small interval implicitly assumes that your data contain high-resolution information. 7.5 degrees is a good angular sampling interval for many different sample types.
  • Offset search range - This value specifies the radius over which Relion will search for the center of each particle. In principle, if all particles were centered perfectly, there would be no need for a search range. In practice, particles are inevitably off-center by a few pixels, and the offset search corrects for this.
  • Offset search step - While searching through the search range, Relion will incrementally search at defined distances away from the center of the image. This increment is defined here as the 'step' of the searching.
  • Perform local angular searches? - After performing an initial alignment, you can perform 3D classification within defined angular ranges, thus confining the search to a smaller range, allowing the algorithm to classify more accurately conformational differences in your data.
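
To get a feel for how these sampling choices affect the size of the search, the arithmetic can be sketched as below. This is an approximation under stated assumptions, not Relion's code. It assumes the angular step is 60 / 2^(HEALPix order + oversampling) degrees, that C1 orientations are 12 * 4^order viewing directions times 360/step in-plane rotations, and that translations are grid points lying within the search radius. The function names are hypothetical. The formulas do reproduce the OrientationalSampling, NrOrientations and NrTranslations values in the log excerpt in the Outputs section.

```shell
# Angular step in degrees for a HEALPix order plus an oversampling order.
angular_step() {
    awk -v o="$1" -v s="$2" 'BEGIN { printf "%g\n", 60 / 2 ^ (o + s) }'
}

# Number of orientations for C1 at a given HEALPix order (no oversampling):
# 12 * 4^order viewing directions times 360/step in-plane rotations.
n_orientations() {
    awk -v o="$1" 'BEGIN { step = 60 / 2 ^ o; printf "%d\n", 12 * 4 ^ o * 360 / step }'
}

# Number of translations: grid points spaced `step` pixels apart within radius `range`.
n_translations() {
    awk -v r="$1" -v st="$2" 'BEGIN {
        m = int(r / st) + 1
        for (x = -m; x <= m; x++)
            for (y = -m; y <= m; y++)
                if ((x * st) ^ 2 + (y * st) ^ 2 <= r ^ 2) n++
        printf "%d\n", n
    }'
}

angular_step 2 1      # prints 7.5
n_orientations 3      # prints 36864
n_translations 6.5 3  # prints 13
```

The total number of sampling points per particle is the product of orientations and translations (36864 x 13 = 479232 in that log excerpt), which is why halving the angular interval increases the run time so sharply.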

At this point, you can click 'Just show command' and it will output a command that looks like:

 *** The command is:
`which relion_refine_mpi` --o Class3D/run1 --i particles.star --particle_diameter 200 --angpix 1.77 --ref initial_model.mrc --firstiter_cc --ini_high 60 --ctf --iter 25 --tau2_fudge 4 --K 4 --flatten_solvent --zero_mask --oversampling 1 --healpix_order 2 --offset_range 5 --offset_step 2 --sym C1 --norm --scale  --j 1 --memory_per_thread 4 

Take this command and incorporate it into a text file that you can use to submit to the cluster (e.g. relion3d_submit.run).
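
One way to keep the GUI choices readable inside a submission file is to hold them in shell variables and assemble the command from them. The sketch below uses only the flags and values from the example command on this page. The variable names are hypothetical, and the mapping of GUI fields to flags in the comments is a reading of that command rather than an exhaustive reference. The command is printed, not executed.

```shell
# GUI field                     -> command-line flag (taken from the example command)
out_root="Class3D/run1"         # Output file name            -> --o
particles="particles.star"      # Particle star file          -> --i
reference="initial_model.mrc"   # Reference map               -> --ref
diameter=200                    # Particle diameter (A)       -> --particle_diameter
angpix=1.77                     # Pixel size (A)              -> --angpix
lowpass=60                      # Initial low-pass filter (A) -> --ini_high
n_iter=25                       # Number of iterations        -> --iter
tau=4                           # Regularisation parameter    -> --tau2_fudge
n_classes=4                     # Number of classes           -> --K
symmetry="C1"                   # Symmetry                    -> --sym

relion_cmd="relion_refine_mpi --o $out_root --i $particles \
--particle_diameter $diameter --angpix $angpix --ref $reference \
--firstiter_cc --ini_high $lowpass --ctf --iter $n_iter \
--tau2_fudge $tau --K $n_classes --flatten_solvent --zero_mask \
--oversampling 1 --healpix_order 2 --offset_range 5 --offset_step 2 \
--sym $symmetry --norm --scale --j 1 --memory_per_thread 4"

# Print the assembled command for inspection; nothing is run here.
echo "$relion_cmd"
```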

####Submitting job to cluster

Depending on your specific cluster setup, these details will change. But, generally, you will need to place this Relion command into a shell script that can be submitted to your cluster (or the cloud) so that it can run over 100+ CPUs.

Most clusters will use a job scheduler such as Sun Grid Engine (Oracle), where you can submit a text file using the 'qsub' command:

$ qsub relion3d_submit.run

And then you can monitor the outputs in the folder Class3D.
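
A simple way to follow progress is to look at which iteration has most recently written its per-particle STAR file. The helper below is a small sketch with a hypothetical name (`latest_iteration`). It assumes output file names contain the iteration number immediately before `_data.star`.

```shell
# Print the highest iteration number that has written a *_data.star file
# in the given output folder. Prints nothing if no iteration has finished.
latest_iteration() {
    ls "$1"/*_it*_data.star 2>/dev/null \
        | sed 's/.*_it[a-z]*\([0-9][0-9]*\)_data\.star$/\1/' \
        | sort -n | tail -n 1
}

# Usage:
# latest_iteration Class3D
```

On Sun Grid Engine, `qstat` will also show whether the job is still queued or running.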

####Example cluster submission script

Here is an example of a cluster submission script:

#!/bin/csh

#Wall time in seconds
#$ -l h_rt=3600,m_mem_free=4g 

#Name of job
#$ -N submitrelion3d

#Use current working directory
#$ -cwd 

#Export current environment variables to the job
#$ -V

#Number of CPUs (the parallel environment name, here 'orte', is site-specific)
#$ -pe orte 120

#Submission command

mpirun -np $NSLOTS relion_refine_mpi --o Class3D/run1 --i particles.star --particle_diameter 200 --angpix 1.77 --ref initial_model.mrc --firstiter_cc --ini_high 60 --ctf --iter 25 --tau2_fudge 4 --K 4 --flatten_solvent --zero_mask --oversampling 1 --healpix_order 2 --offset_range 5 --offset_step 2 --sym C1 --norm --scale  --j 1 --memory_per_thread 4

####Outputs

As Relion runs on your cluster, there will be text output to the standard out files, which includes information that will look like this:

Auto-refine: Angular step= 3.75 degrees; local searches= false
 Auto-refine: Offset search range= 6.5 pixels; offset step= 1.5 pixels
 CurrentResolution= 18.6 Angstroms, which requires orientationSampling of at least 12.8571 degrees for a particle of diameter 160 Angstroms
 Oversampling= 0 NrHiddenVariableSamplingPoints= 479232
 OrientationalSampling= 7.5 NrOrientations= 36864
 TranslationalSampling= 3 NrTranslations= 13

For each iteration that it completes, Relion writes the following files. For example, iteration 8 for run1:

  • run1_it008_class00?.mrc - 3D volumes for each class, numbered from one to the number of classes requested.
  • run1_it008_class00?_angdist.bild - A file that shows a 3D histogram of the Euler angle directions used for each 3D model. Open in UCSF Chimera along with the 3D model to make sure that there are not any overrepresented views used in the 3D reconstruction.
  • run1_it008_data.star - STAR file with per-particle information, including CTF parameters, alignment parameters and class assignments
  • run1_it008_model.star - STAR file that should be opened in Relion to view 3D models
  • run1_it008_optimiser.star - STAR file with the general settings of the run; this is the file used to continue a stopped run
  • run1_it008_sampling.star - STAR file with the angular and translational sampling information
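
To see how particles were distributed across classes at a given iteration, the per-particle STAR file can be summarized from the command line. This is a minimal sketch, assuming the file has a `_rlnClassNumber` column declared in the header as `_rlnClassNumber #N` and whitespace-separated data rows. The function name `class_counts` is hypothetical.

```shell
# Count particles per class in a Relion data STAR file.
class_counts() {
    awk '
        # Header line "_rlnClassNumber #7" tells us the column index (7).
        $1 == "_rlnClassNumber" { col = substr($2, 2) + 0; next }
        # Skip other header lines, comments and blank lines.
        $1 ~ /^(_rln|data_|loop_|#)/ || NF == 0 { next }
        # Tally data rows once the class column is known.
        col && NF >= col { n[$col]++ }
        END { for (k in n) printf "class %s: %d particles\n", k, n[k] }
    ' "$1" | sort -k2,2n
}

# Usage:
# class_counts path/to/data.star
```

Classes that attract only a small fraction of the particles, or whose volumes look featureless, are often the 'bad' classes described earlier.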