Turning Frozen Biomolecules into High-Speed Science
Move over, X-ray crystallography. An alternative technique for producing images of biomolecules at near-atomic resolutions has entered the mainstream. Cryogenic electron microscopy (cryo-EM) garnered the 2017 Nobel Prize in Chemistry for three early innovators, and is now poised to transform structural biology. Unlike X-ray crystallography, cryo-EM doesn’t just capture a single static image of a molecule in a crystallized state. Instead, it captures the many different 3-D shapes of a biomolecule as it moves, twists and interacts with other molecules.
Cryo-EM eliminates the need to crystallize samples, so it can also be applied to a wider range of molecules. This opens the door to new research opportunities. For example, researchers have used cryo-EM to generate molecular models of some of the key structures implicated in Alzheimer’s disease and malaria, providing critical first steps toward developing safer and more effective drug therapies for these tragic, widespread diseases.
In a cryo-EM experiment, the sample is purified and then flash-frozen, locking a billion or so target molecules into their many different rotational and conformational states. Electrons are then beamed through the sample and captured by high-resolution detectors.
Analyzing the resulting electron densities is far from simple. Very low electron doses must be used to minimize sample movement and damage, and this results in extremely noisy data. Open-source RELION (REgularised LIkelihood OptimizatioN) has emerged as the leading application for turning these data into high-resolution images. Previous approaches required scientists to tune parameters at multiple stages of the analysis. RELION uses a largely self-contained statistical approach that has been shown to provide more reliable results across numerous experimental data sets.
Computing challenges of image reconstruction
Although RELION produces exceptionally reliable results, its compute requirements are extreme. As images are gradually isolated and classified from the raw data, every possible rotation and translation must be statistically correlated with every proposed model for hundreds of thousands to millions of particles. Given the magnitude of the task, compute times can be lengthy.
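To get a feel for why the search is so expensive, the following minimal sketch simply multiplies out the size of the comparison space. All of the numbers and the function name are hypothetical and chosen only for illustration; they are not taken from RELION or from any real data set.

```python
def count_comparisons(n_particles: int, n_rotations: int,
                      n_translations: int, n_models: int) -> int:
    """Number of particle/orientation/shift/model comparisons in one exhaustive pass."""
    return n_particles * n_rotations * n_translations * n_models


# Hypothetical workload: these values are illustrative assumptions only.
total = count_comparisons(
    n_particles=500_000,   # particle images picked from the micrographs
    n_rotations=10_000,    # sampled 3-D orientations
    n_translations=100,    # sampled in-plane shifts
    n_models=4,            # candidate 3-D models (classes)
)
print(f"{total:.2e} comparisons per pass")
```

Even with these modest assumed values, a single exhaustive pass involves trillions of comparisons, and the refinement repeats over many iterations.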
“Even with a large cluster of 10,000 to 15,000 cores, it can take days or weeks to reconstruct images,” said Erik Lindahl, professor of biophysics at Stockholm University.
Lindahl is an expert in computational biology. Several years ago, he and his team modernized GROMACS, a top industry code for biomolecular simulations, providing order-of-magnitude performance improvements. He believed a similar approach could be applied to RELION.
According to Lindahl, many popular scientific codes use only about 1 percent of the theoretical peak throughput of modern computing systems. That’s because they were not initially written to take advantage of the advanced parallelism in today’s processors and clusters. His work is focused on rectifying that situation to increase performance and scalability, and help scientists tackle problems that are too large for existing technologies.
To help ensure his modernization efforts met the exacting requirements of the scientific community, Lindahl teamed up with the original developer of RELION, Sjors Scheres of the UK MRC Laboratory of Molecular Biology.
“In my opinion, Sjors is the most important computational person in the cryo-EM field today,” said Lindahl. “He has been instrumental in ensuring that our modernized RELION code increases performance without losing even the smallest amount of information or accuracy. RELION is actually a dream scenario for modern hardware, but it took a lot of work to expose the parallelism. We had to change the way the program handles data, so that each thread can flow through huge blocks of data independently.”
According to Lindahl, it’s not that hard to improve individual software routines, but then Amdahl’s Law asserts itself, exposing bottlenecks in other routines. Success requires time and persistence, but modern software tools can help.
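Amdahl’s Law can be made concrete with a few lines of code. The sketch below uses the standard formula, overall speedup = 1 / ((1 - p) + p / s), where p is the fraction of runtime that is accelerated and s is the speedup of that fraction. The example fractions are hypothetical and do not describe RELION’s actual profile.

```python
def amdahl_speedup(accelerated_fraction: float, local_speedup: float) -> float:
    """Overall speedup when only part of the runtime is made faster (Amdahl's Law)."""
    if not 0.0 <= accelerated_fraction <= 1.0:
        raise ValueError("accelerated_fraction must be between 0 and 1")
    if local_speedup <= 0.0:
        raise ValueError("local_speedup must be positive")
    remaining = 1.0 - accelerated_fraction
    return 1.0 / (remaining + accelerated_fraction / local_speedup)


# Hypothetical case: one routine takes 80% of the runtime and is made 20x faster.
overall = amdahl_speedup(0.80, 20.0)

# Even an effectively infinite speedup of that routine cannot beat 1 / 0.20 = 5x.
ceiling = amdahl_speedup(0.80, 1e12)

print(f"overall: {overall:.2f}x, ceiling: {ceiling:.2f}x")
```

In this assumed case a 20x improvement to the dominant routine yields only about a 4x overall gain, because the untouched 20 percent of the runtime becomes the new bottleneck. That is why the remaining routines have to be tackled one after another.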
“Once we exposed the parallelism in a clean way,” said Lindahl, “Intel Compilers generated beautiful SIMD [single instruction multiple data] code for us. The Intel Math Kernel Library [Intel MKL] was also invaluable. It solved our challenges with Fourier transforms. These tools allow us to focus on high-level issues, such as extracting maximum value and quality from the data.”
Faster image reconstruction
During the software modernization work, the team optimized RELION for the latest Intel Xeon Scalable processors. RELION is floating-point intensive, and these processors support Intel Advanced Vector Extensions 512 (Intel AVX-512), which doubles the maximum number of floating-point operations per clock cycle versus previous-generation processors.
Lindahl and his team also found that they could replace many of the double-precision floating-point operations in RELION with single-precision operations, without sacrificing accuracy. This provided another doubling of maximum throughput for many key portions of the code. The optimized code is also able to make better use of the larger number of cores and the improved memory bandwidth of the newer processors to further improve parallel throughput.
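The two doublings described above follow from simple register arithmetic, sketched below. The sketch assumes that the previous-generation baseline uses 256-bit vector registers (as with Intel AVX2) and counts only how many values fit in one register; it says nothing about clock speeds or the number of vector units, which also affect real throughput.

```python
# Vector register widths in bits. The 256-bit baseline is an assumption
# standing in for the previous processor generation.
VECTOR_BITS = {"256-bit (previous generation)": 256, "AVX-512": 512}

# IEEE 754 element sizes in bits.
ELEMENT_BITS = {"double": 64, "single": 32}


def simd_lanes(vector_bits: int, element_bits: int) -> int:
    """How many floating-point values one vector register holds."""
    return vector_bits // element_bits


lanes = {
    (isa, precision): simd_lanes(vbits, ebits)
    for isa, vbits in VECTOR_BITS.items()
    for precision, ebits in ELEMENT_BITS.items()
}

for (isa, precision), count in lanes.items():
    print(f"{isa:>30} / {precision:<6}: {count} values per instruction")
```

Moving from 256-bit double precision to AVX-512 single precision quadruples the number of values processed per vector instruction, which matches the two successive doublings in peak throughput.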
The modernized RELION code was benchmarked using a common Plasmodium ribosome workload. Baseline tests showed that the unoptimized code took 47.6 hours to run on a two-socket server equipped with the previous-generation Intel Xeon processor E5-2697 v4. The optimized code running on the same processor performed the same task in 5.7 hours, an 8.4x speedup. The Intel Xeon Gold 6148 processor reduced image reconstruction time to just 4.6 hours, a 10.3x performance improvement versus the previous-generation hardware and software solution.
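The speedup figures can be reproduced directly from the reported run times. The short sketch below also separates out the share of the gain that comes from the newer processor alone, a ratio the benchmark summary does not state explicitly; the variable names are illustrative.

```python
# Run times in hours, as reported in the benchmark above.
baseline_hours = 47.6            # unoptimized code, Xeon E5-2697 v4
optimized_old_cpu_hours = 5.7    # optimized code, same processor
optimized_new_cpu_hours = 4.6    # optimized code, Xeon Gold 6148

# Gain from the software work alone (same hardware).
software_speedup = baseline_hours / optimized_old_cpu_hours

# Combined gain from new software and new hardware.
combined_speedup = baseline_hours / optimized_new_cpu_hours

# Gain from the hardware change alone, with the optimized code on both.
hardware_speedup = optimized_old_cpu_hours / optimized_new_cpu_hours

print(f"software: {software_speedup:.1f}x")
print(f"combined: {combined_speedup:.1f}x")
print(f"hardware: {hardware_speedup:.2f}x")
```

The breakdown shows that most of the improvement comes from the code modernization, with the newer processor adding roughly another quarter on top.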
Additional benchmarks were run to measure the performance of the modernized code on two-, four-, and eight-node server clusters. The results demonstrated good scaling: 1.77x faster on 2 nodes, 3.3x faster on 4 nodes, and 4.8x faster on 8 nodes. This offers a valuable pathway for reducing image reconstruction times even more and for accommodating the heavier workloads associated with larger biomolecules.
The order-of-magnitude performance leap for RELION and the improved scaling will have immediate benefits for cryo-EM researchers.
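One common way to read such scaling results is parallel efficiency, the measured speedup divided by the node count. The sketch below assumes the reported speedups are relative to a single node, which the benchmark summary implies but does not state.

```python
# Reported speedups by node count (assumed to be relative to one node).
measured_speedup = {2: 1.77, 4: 3.3, 8: 4.8}


def parallel_efficiency(speedup: float, nodes: int) -> float:
    """Fraction of ideal linear scaling actually achieved."""
    return speedup / nodes


efficiency = {
    nodes: parallel_efficiency(speedup, nodes)
    for nodes, speedup in measured_speedup.items()
}

for nodes, value in efficiency.items():
    print(f"{nodes} nodes: {measured_speedup[nodes]:.2f}x speedup, "
          f"{value:.1%} of linear scaling")
```

Under that assumption, efficiency is close to 90 percent on two nodes and falls to 60 percent on eight, which is consistent with communication overhead growing as more nodes are added.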
“Performance improvements of this magnitude change the computing paradigm,” said Lindahl. “Results are available in a fraction of the time. Computing resources can also be used more efficiently. Many workloads that used to require a supercomputer can now be run on a small local cluster. Perhaps most importantly, research teams can achieve high performance and the highest levels of accuracy on standard x86 computing systems, with no need for accelerators. This means the application can now make efficient use of large amounts of system memory, which opens up new avenues of research into larger and more complex molecules.”
The work to modernize RELION continues. One near-term goal is to improve scalability on larger server clusters. RELION is already MPI-enabled, but the faster per-node performance of the new code can lead to communication bottlenecks on larger clusters. Lindahl is working with Intel engineers to rewrite the message-handling layer of the application to provide more efficient I/O.
Lindahl believes that simple and efficient scaling of RELION on high-performance computing (HPC) systems is essential to fully empower cryo-EM researchers.
“Our ultimate goal is to enable near real-time image reconstruction. Even at relatively low resolutions, this will allow scientists to run an experiment, view their results, adjust their sample preparations and then rerun the experiment on the same day. It will be much easier for them to get the results they need in one session, rather than waiting the three months it often takes to reschedule time on an electron microscope,” said Lindahl.
Speed isn’t the only factor. Lindahl also wants to make HPC simpler and more intuitive for cryo-EM researchers to deploy and use.
Cryo-EM has become a uniquely powerful and flexible tool for structural biologists. By modernizing RELION to reduce image reconstruction times on standards-based servers, Lindahl and Scheres are helping cryo-EM researchers speed time to results by an order of magnitude, handle larger, more memory-intensive workloads and tackle bigger and more complex biomolecules.
As they optimize their fast new RELION code for efficient scaling on HPC clusters, these advantages will continue to increase. Ultimately, the two scientists expect to empower cryo-EM researchers with near real-time access to low-resolution models and visualizations, so they can detect experimental and sample errors quickly, make the necessary adjustments, and complete their experiments during a single microscope session—potentially reducing research timelines by a matter of months.
As the data processing constraints for cryo-EM continue to fall, one thing is becoming almost as clear as the resulting image: this revolution is just getting started.