= Name =
sxsort3d - 3D Clustering - SORT3D: Sort 3D heterogeneity based on the reproducible members of K-means and Equal K-means classification. It runs after 3D refinement where the alignment parameters are determined.

= Usage =


''usage in command line''

sxsort3d.py  stack  outdir  mask  --focus=3Dmask  --ir=inner_radius  --radius=outer_radius  --maxit=max_iter  --rs=ring_step  --xr=xr  --yr=yr  --ts=ts  --delta=angular_step  --an=angular_neighborhood  --center=centring_method  --nassign=nassign  --nrefine=nrefine  --CTF  --stoprnct=stop_percent  --sym=c1  --function=user_function  --independent=indenpendent_runs  --number_of_images_per_group=number_of_images_per_group  --low_pass_filter=low_pass_filter  --nxinit=nxinit  --unaccounted  --seed=random_seed  --smallest_group=smallest_group  --sausage  --chunk0=CHUNK0_FILE_NAME  --chunk1=CHUNK1_FILE_NAME  --PWadjustment=PWadjustment  --protein_shape=protein_shape  --upscale=upscale  --wn=wn --interpolation=method


=== Typical usage ===

sxsort3d exists only in MPI version.

''' mpirun -np 176 --host n1,n5,n6,n8,n9,n10,n0,n4,n3,n7 sxsort3d.py bdb:data sort3d_outdir1 mask.hdf --focus=ribosome_focus.hdf --chunkdir=/data/n10/pawel/ribosome_frank/ri3/main013 --radius=52 --CTF --number_of_images_per_group=2000 --low_pass_filter=.125 --stoprnct=5 ''' 


== Input ==
    stack:: Input images stack: (default required string)
    mask:: 3D mask: (default none)
    
    focus:: Focus 3D mask: Mask used for focused clustering (default none)
    radius:: Outer radius for rotational correlation [Pixels]: Must be smaller than half the box size. Please set to the radius of the particle. (default -1)
    delta:: Angular step for projections: (default '2')
    CTF:: Use CTF: Do a full CTF correction during the alignment. (default False) 
    sym:: Point-group symmetry: (default c1) 
    number_of_images_per_group:: Images per group: Critical number of images per group, defined by user. (default 1000) 
    nxinit:: Initial image size for sorting: (default 64)
    smallest_group:: Smallest group size: Minimum members for identified group. (default 500) 
    chunk0:: Chunk file name for 1st subset: Name of chunk file containing particle IDs of 1st subset (chunk0) for computing margin of error. (default none) 
    chunk1:: Chunk file name for 2nd subset: Name of chunk file containing particle IDs of 2nd subset (chunk0) for computing margin of error. (default none) 

    * The remaining parameters are optional.
    ir:: Inner radius for rotational correlation [Pixels]: Must be bigger than 1. (default 1)
    maxit:: Maximum iterations: (default 25)
    rs:: Step between rings in rotational correlation: Must be bigger than 0. (default 1)
    xr:: X search range [Pixels]: The translational search range in the x direction will take place in a +/xr range. (default 1)
    yr:: Y search range [Pixels]: The translational search range in the y direction. If omitted it will be set as xr. (default -1)
    ts:: Translational search step [Pixels]: The search will be performed in -xr, -xr+ts, 0, xr-ts, xr, can be fractional. (default 0.25)
    an:: Local angular search width [Degrees]: This defines the neighbourhood where the local angular search will be performed. (default '-1')
    center:: Centering method: 0 - if you do not want the volume to be centered, 1 - center the volume using the center of gravity. (default 0)
    nassign:: Number of reassignment iterations: Performed for each angular step. (default 1)
    nrefine:: Number of alignment iterations: Performed for each angular step. (default 0)
    stoprnct:: Assignment convergence threshold [%]: Used to asses convergence of the run. It is the minimum percentage of assignment change required to stop the run.  (default 3.0) 
    function:: Reference preparation function: Function used to prepare the reference volume. (default do_volume_mrk02) 
    independent:: Number of independent runs: Number of independent equal-Kmeans(default 3) 
    low_pass_filter:: Low-pass filter frequency [1/Pixel]: Low-pass filter used for the 3D sorting on the original image size. (default -1.0)
    unaccounted:: Reconstruct unaccounted images: (default False) 
    seed:: Seed: Seed used for the initial random assignment for EQ Kmeans. The program generates a random integer by default. (default -1) 
    sausage:: Use sausage filter: (default False) 
    PWadjustment:: Power spectrum reference: Text file containing a 1D reference power spectrum. (default none) 
    protein_shape:: Protein Shape: It defines protein preferred orientation angles. "g" is for globular proteins and "f" is for filament proteins. (default 'g')
    upscale:: Power spectrum adjustment strength: This parameters adjusts how strongly the power spectrum of the volume should be modified to match the reference. A value of 1 brings the volume's power spectrum completely to the reference, while a value of 0 means no modification. (default 0.5) 
    wn:: Target image size [Pixels]: If different than 0, then the images will be rescaled to fit this size. (default 0) 
    interpolation:: 3D interpolation method: Method interpolation in 3D. Options are tr1 or 4nn. (default '4nn')

== Output ==
    outdir:: Output directory: There is a log.txt that describes the sequences of computations in the program. (default required string)


= Description =

The clustering algorithm in the program combines a couple of computational techniques, equal-Kmeans clustering, K-means clustering, and reproducibility of clustering such that it not only has a strong ability but also a high efficiency to sort out heterogeneity of cryo-EM images. The command sxsort3d.py is the protocol I {P1). In this protocol, the user defines the group size and thus defines the number of group K. Then the total data is randomly assigned into K group and an equal-size K-means (size restricted K-means) is carried out. N independent equal-Kmeans runs would give N partition of the K groups assignment. Then two-way comparison of these partitions gives the reproducible number of particles.


= Method =

= Reference =
Described by A.Einstein in his first paper on spectrum of radiation
from a house heater kept at room temperature. Journal of Irreproducible
Results, 12, 1905, 12-1127.

= Author / Maintainer =
Zhong Huang

= Keywords =
    category 1:: APPLICATIONS

= Files =
applications.py

= See also =
[[http://sparx-em.org/sparxwiki/sxrsort3d|sxrsort3d]]

= Maturity =

    stable while under development:: Two programs (P1,P2) have been tested on both simulated and experimental ribosome data. For experimental ribosome data, P2 has a reproducible ratio-70-90%. P2 can 100%separate two conformations from the simulated ribosome data that contains 5 conformations. 

= Bugs =
None.  It is perfect.
