
DiffFit: Visually-Guided Differentiable Fitting
of Molecule Structures to a Cryo-EM Map

This is an updated version of the original paper (DOI: 10.1109/TVCG.2024.3456404, published in the IEEE Transactions on Visualization and Computer Graphics 31(1)) that includes the changes from the subsequent errata (DOI: 10.1109/TVCG.2024.3502911, published in volume 31, number 2).

Zainab Alsuwaykit (ORCID: 0009-0009-7444-445X), Dawar Khan (ORCID: 0000-0001-5864-1888), Ondřej Strnad (ORCID: 0000-0002-8077-4692), Tobias Isenberg (ORCID: 0000-0001-7953-8644), and Ivan Viola (ORCID: 0000-0003-4248-6574)
Abstract

We introduce DiffFit, a differentiable algorithm for fitting protein atomistic structures into an experimentally reconstructed cryo-EM volume map. In structural biology, this process is necessary to semi-automatically composite large mesoscale models of complex protein assemblies and complete cellular structures that are based on measured cryo-EM data. Current approaches require manual fitting in three dimensions to start, resulting in approximately aligned structures, followed by an automated fine-tuning of the alignment. The DiffFit approach enables domain scientists to fit new structures automatically and visualize the results for inspection and interactive revision. The fitting begins with differentiable three-dimensional (3D) rigid transformations of the protein atom coordinates, followed by sampling the density values at the atom coordinates from the target cryo-EM volume. To ensure a meaningful correlation between the sampled densities and the protein structure, we propose a novel loss function based on a multi-resolution volume-array approach and the exploitation of the negative space. This loss function serves as the critical metric for assessing the fitting quality, ensuring both fitting accuracy and an improved visualization of the results. We assessed the placement quality of DiffFit with several large, realistic datasets and found it to be superior to that of previous methods. We further evaluated our method in two use cases: automating the integration of known composite structures into larger protein complexes and facilitating the fitting of predicted protein domains into volume densities to aid researchers in identifying unknown proteins. We implemented our algorithm as an open-source plugin (github.com/nanovis/DiffFit) in ChimeraX, a leading visualization software in the field. All supplemental materials are available at osf.io/5tx4q.
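To make the fitting step concrete, below is a minimal, hypothetical sketch of the scoring idea the abstract describes: density values are sampled at (transformed) atom coordinates and a loss summarizes the fit. This is not the authors' implementation — DiffFit uses differentiable transformations with automatic differentiation and a multi-resolution loss that also exploits the negative space; the NumPy helper here illustrates only the sampling and the occupancy term, and all names and conventions (z, y, x voxel coordinates, grid units) are assumptions.

```python
import numpy as np

def trilinear_sample(vol, pts):
    """Trilinearly interpolate density values of a 3D grid at fractional (z, y, x) points."""
    pts = np.clip(pts, 0.0, np.array(vol.shape, dtype=float) - 1.001)
    i0 = np.floor(pts).astype(int)  # lower-corner voxel indices
    f = pts - i0                    # fractional offsets in [0, 1)
    out = np.zeros(len(pts))
    for dz in (0, 1):               # accumulate the 8 corner contributions
        for dy in (0, 1):
            for dx in (0, 1):
                w = ((f[:, 0] if dz else 1 - f[:, 0])
                     * (f[:, 1] if dy else 1 - f[:, 1])
                     * (f[:, 2] if dx else 1 - f[:, 2]))
                out += w * vol[i0[:, 0] + dz, i0[:, 1] + dy, i0[:, 2] + dx]
    return out

def occupancy_loss(vol, atom_coords):
    """Negative mean density at the atom positions: lower means a better fit."""
    return -trilinear_sample(vol, atom_coords).mean()
```

Because the sampling is a smooth function of the atom coordinates, reimplementing it in an autodiff framework yields gradients of the loss with respect to the rigid-transformation parameters, which is what makes gradient-based fitting possible.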

keywords:
Scalar field data, algorithms, application-motivated visualization, process/workflow design, life sciences, health, medicine, biology, structural biology, bioinformatics, genomics, cryo-EM.

\authorfooter

Deng Luo, Zainab Alsuwaykit, Dawar Khan, Ondřej Strnad, and Ivan Viola are with the Visual Computing Center at King Abdullah University of Science and Technology (KAUST), Saudi Arabia. E-mail: {deng.luo | zainab.alsuwaykit | dawar.khan | ondrej.strnad | ivan.viola}@kaust.edu.sa.

Tobias Isenberg is with Université Paris-Saclay, CNRS, Inria, LISN, France. E-mail: [email protected].

\teaser

Compositing a protein (PDB 8SMK [Zhou:2024:ADI]) from its three unique chains. Top row, from left to right: three input chains, input target volume, the best fits in the first fitting round, the remaining voxels after zeroing-out, and the fitted chains. Bottom row, from left to right: two remaining input chains, remaining region of interest in the target volume from the first round, the best fits in the second round, the remaining voxels after zeroing-out, and the fitted chains. Right: the final composited structure overlaid on the original target volume (RMSD: 0.138). The involved computation takes 10 seconds in total, and the human-in-the-loop interaction takes ≈3 minutes.
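The caption describes an iterative workflow: after each fitting round, the voxels already explained by the placed chains are zeroed out so the next round only fits against the remaining density. A minimal sketch of that masking step (illustrative only — the per-atom radius, voxel units, and (z, y, x) coordinate convention are assumptions, not DiffFit's actual zero-out logic):

```python
import numpy as np

def zero_out(vol, placed_atoms, radius=2.0):
    """Return a copy of `vol` in which voxels within `radius` (in voxel units)
    of any placed atom are set to zero, leaving only the unexplained density
    as the region of interest for the next fitting round."""
    zz, yy, xx = np.indices(vol.shape)
    grid = np.stack([zz, yy, xx], axis=-1).astype(float)  # (D, H, W, 3) voxel centers
    out = vol.copy()
    for a in np.asarray(placed_atoms, dtype=float):
        mask = np.linalg.norm(grid - a, axis=-1) <= radius
        out[mask] = 0.0
    return out
```

Repeating fit → zero-out until no significant density remains yields the multi-round compositing shown in the teaser.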