-
Notifications
You must be signed in to change notification settings - Fork 2
Processing MPCCD images by DIALS
MPCCD supports in DIALS is under development. This documents is for developers and experts!
Here, we will process a lysozyme dataset we processed with CrystFEL in earlier sections (CXIDB #33, runs 266711-266721).
First, read an excellent documentation "Processing XFEL data with cctbx.xfel and DIALS" in Computational Crystallography Newsletter 2016, July.
We assume that you have built the latest version of DIALS and CCTBX from the respository. In addition, the following patches must be applied. I am waiting for these patches to be reviewed and merged.
- https://github.com/dials/dials/issues/240
- https://github.com/dials/dials/issues/236
- https://github.com/dials/dials/issues/227
- https://github.com/dials/dials/issues/226
- https://gist.github.com/biochem-fan/fec354238323ea539ce1 (replace the file in modules/cctbx_project/dxtbx/format/)
Import one of the HDF5 files from Cheetah and run spot finding on the first several images.
dials.import run266711-0.h5
dials.find_spots datablock.json min_spot_size=1 d_max=20 scan_range=1,10
Inspect spot finding results in dials.image_viewer
. Detected spot centroids are marked by pink dots.
dials.image_viewer datablock.json strong.pickle
If the result is unsatisfactory, tweak min_spot_size
(try 1, 2, 3) and global_threshold
. The gain
is 10, because the Cheetah pipeline normalizes 1 photon to be 10 counts in the output. You do not have to specify this in a command line argument, because it is automatically set by the dxtbx
format class. Also check the resolution of the beam stop shadow.
The dxtbx
class assumes that the images came from MPCCD in the experimental hutch 4 (BL3 EH4) and the detector distance was 50.0 mm. If not, you can override the initial geometry by MPCCD_DISTANCE
and MPCCD_GEOMETRY
environmental variables. TODO: Write this
Prepare index.phil
as follows:
# input.reference_geometry = ../refined_geometry.json
# you can use 14 cores in the serial queue
mp.nproc = 14
# get more info
verbosity = 3
indexing {
known_symmetry {
space_group = P43212
unit_cell = 78.9 78.9 38.1 90 90 90
}
method = real_space_grid_search
# sometimes fft1d works better
# method = fft1d
# needs to be sufficiently high
refinement_protocol.d_min_start = 2.0
# dramatically increases accepted spots in real_space_grid_search
index_assignment.method = local
optimise_initial_basis_vectors = True
}
spotfinder {
filter.min_spot_size = 1
# mask the beam stop
filter.d_max = 20
}
refinement {
parameterisation {
# not sure which is better. The default is False
spherical_relp_model = True
beam.fix = all
detector.fix_list = Dist,Tau1
auto_reduction {
action = fix
min_nref_per_parameter = 1
}
crystal {
unit_cell {
restraints {
tie_to_target {
values = 78.9,78.9,38.1,90,90,90
sigmas = 1,1,1,0,0,0
apply_to_all = True
}
}
}
}
}
}
integration {
integrator = stills
profile.fitting = False
background {
algorithm = simple
simple {
model.algorithm = linear2d
outlier.algorithm = tukey
}
}
}
profile {
gaussian_rs {
min_spots.overall = 0
}
}
Run dials.stills_process
on first several runs. We use the following job submission script run-dials.sh
.
#!/bin/bash
#PBS -l nodes=1:ppn=14
#PBS -q serial
if [ -n "$PBS_O_WORKDIR" -a "$PBS_ENVIRONMENT" != "PBS_INTERACTIVE" ]; then
cd $PBS_O_WORKDIR
fi
source ~nureki/packages/dials/build/setpaths.sh
if [ $# -eq 1 ]; then
TARGET=$1
fi
if [ -z "$TARGET" ]; then
echo "Option: TARGET.h5 "
exit 1
fi
echo `hostname` $TARGET
mkdir $TARGET
cd $TARGET
MPCCD_DISTANCE=50.5 dials.stills_process ../index.phil ../../$TARGET/run$TARGET.h5
(We know the detector distance is 50.5 mm from our trials with CrystFEL)
And submit the job:
for i in {11..13}; do for j in {0..2}; do qsub -v TARGET=2667$i-$j run-dials.sh ; done; done
Note that this creates folders for each HDF5 file. Otherwise, we would have tens of thousands of files in a folder and suffer from huge file system overload!
As
find -name "int-*.pickle" | wc -l
shows, we could index only 1492 images out of 9470 images. This is because of the inaccurate initial metrology. We use cspad.cbf_metrology
to optimise the geometry.
First, make metrology1
folder and create metrology.phil
:
refinement {
parameterisation {
# not sure which is better. The default is False
spherical_relp_model = True
auto_reduction {
action = fix
min_nref_per_parameter = 1
}
crystal {
unit_cell {
restraints {
tie_to_target {
values = 78.9,78.9,38.1,90,90,90
sigmas = 1,1,1,0,0,0
apply_to_all = True
}
}
}
}
}
}
And run the refinement.
cspad.cbf_metrology ../2* ../metrology.phil tag=refine1 | tee refine1.log
It first refines the whole detector, then each sensor panel individually. After that, it tries to refine further levels and crashes. This is because CSPAD has one more level than MPCCD. We do not care.
The result was:
Detector 1 RMSDs by panel:
-----------------------------------------------------
| Panel | Nref | RMSD_X | RMSD_Y | RMSD_DeltaPsi |
| id | | (px) | (px) | (deg) |
-----------------------------------------------------
| 0 | 4119 | 1.0008 | 0.71287 | 0.054675 |
| 1 | 11411 | 0.54262 | 0.78264 | 0.078083 |
| 2 | 12583 | 0.55396 | 0.67085 | 0.080037 |
| 3 | 3991 | 0.97751 | 0.54345 | 0.052606 |
| 4 | 3040 | 1.0272 | 0.70569 | 0.051457 |
| 5 | 10849 | 0.57721 | 0.70723 | 0.0799 |
| 6 | 12502 | 0.51857 | 0.74374 | 0.079987 |
| 7 | 5419 | 0.9971 | 0.68209 | 0.056625 |
-----------------------------------------------------
We link metrology1/refine1_filtered_experiments_level1.json
to the top directory and add the following line to index.phil
to use the refined geometry instead of the hard-coded initial geometry.
input.reference_geometry = ../refine1_filtered_experiments_level1.json
Delete integration results from the initial trial or move it to metrology1
before running dials.stills_process
again. We got 5636 indexed images out of 9470 images. This is a wonderful improvement!
Now that we have many indexed images, we can take "half set" approach to examine the precision of the refinement.
cspad.cbf_metrology ../2* ../metrology.phil tag=refine2 split_dataset=True n_subset=5600 | tee refine.log
TODO: Write more...