Sunday, May 29, 2022 at 9:00:00 AM UTC
SynopSet: Multiscale Visual Abstraction Set for Explanatory Analysis of DNA Nanotechnology Simulations
Deng Luo
Application-Motivated Visualization, Data Abstractions & Types, Communication/Presentation, Storytelling, Temporal Data
We propose a new abstraction set (SynopSet) that has a continuum of visual representations for the explanatory analysis of molecular dynamics simulations (MDS) in the DNA nanotechnology domain. We compose this set by re-purposing the commonly used progress bar, designing novel visuals, and transforming the data from the domain format to a format that better fits the newly designed visuals. The set can show all spatial and temporal details and all structural complexity, or abstract these to various degrees. It thus supports slow playback of the simulation for detailed examination, very fast playback for an overview that helps to efficiently identify events of interest, and several intermediate levels between these two extremes. For any pair of successive representations, we demonstrate smooth, continuous transitions, enabling users to keep track of relevant information from one representation to the next. By providing multiple representations suited to different temporal resolutions and connected by smooth transitions, we enable time-efficient simulation analysis: users can examine and present important phases in great detail, or leverage abstract representations to go over uneventful phases much faster. Domain experts can thus gain actionable insight about their simulations and communicate it in a much shorter time. Further, the novel representations are more intuitive and enable researchers unfamiliar with MDS analysis graphs to better understand the simulation results. We assessed the effectiveness of SynopSet on 12 DNA nanostructure simulations together with a domain expert. We have also shown that our set of representations can be systematically located in a visualization space, dubbed SynopSpace, composed of three axes: granularity, visual idiom, and information layout type.
Proposed at the end are general guidelines on the use of SynopSpace to generate a new SynopSet for the visualization of other dynamic processes involving hierarchical structures.
DNA nanotechnology [1] uses DNA in its double helix form as a building block for more complex structures, by leveraging the Watson-Crick base pairing rule, rather than as a genetic information carrier. Two main methods have been developed to assemble DNA nanostructures: scaffolded DNA origami [2] and DNA bricks [3–5]. The assembled structures are ever-increasing in complexity and range from simple geometric primitives to large multi-component systems with dynamic behavior. Well-designed nanostructures have shown promising application scenarios, ranging from fundamental studies, where they act as substrates for biochemical analysis [6], to medical applications, where they facilitate targeted drug delivery [7]. A typical current workflow in this domain is to first design the shape of the nanostructure with computer-aided design tools. These will usually be able to generate specific DNA sequences for all the strands in the designed structure. The structure is then subjected to molecular dynamics simulation to investigate its structural stability and dynamic behavior; if it is not stable, or lacks desired properties, several rounds of modification followed by simulation are performed until the designer is satisfied. The finalized DNA sequences are then ordered as oligonucleotides from commercial companies, and lab experiments are performed to assemble the designed structure. Finally, tests are run, such as observing the shape of the final assembled structure under the microscope. DNA simulations may include tens of thousands of atoms, and many thousands of time steps. If a domain expert wishes to analyze a simulation in atomistic detail through visualization, they will therefore be presented with a great deal of information per step.
Visualizing all time steps in a continuous playback presents a vastly greater challenge still, for the sheer number of time steps means that, in total, there are tens of thousands of positions to analyze over many thousands of simulation steps, resulting in an enormous amount of data. Fast playback can in principle keep the analysis reasonably short, but in practice, this means that a very large amount of information per second is presented to the expert, vastly exceeding their ability to comprehend it and make use of it. Playing the simulation at a sufficiently low speed to understand it entails many hours of playback, sometimes more. Yet MDS often exhibit long periods devoid of any important events, which would make such a visualization process not only long and tedious, but also very inefficient. Such simulations are sometimes visualized at a slightly coarser granularity, by representing only nucleotides and not individual atoms, which somewhat reduces the amount of information per frame, but falls far short of fully solving the problem. A suitable solution would provide experts with abstract visualizations to efficiently analyze uneventful phases of their simulations, while still giving an overview of the simulation’s state and its dynamics; it would also provide very detailed visualizations for critical phases of the simulation. Finally, it would need to provide smooth, continuous transitions between these representations, for a seamless and time-efficient experience. To this end, we take advantage of the hierarchical nature of DNA structures to propose a solution that involves seven different representations, ranging from very abstract to very detailed. The more abstract ones present less information per simulation step, and involve less (or no) motion of the visual elements themselves, aside from changes in their colors. This makes such representations comfortable to view at high playback speeds, and thus suited to the relatively uneventful phases of DNA simulations.
They provide an overview of what happens over a long period of simulation time, but in a short amount of visualization time. On the other hand, they present insufficient detail for the analysis of important events, where experts need detailed information to gain insight into their DNA systems. For such cases, more detailed representations are required, which in turn require lower playback speeds. In summary, the contributions of our solution are:
• A new abstraction set (SynopSet) that has a continuum of visual representations for the explanatory analysis of MDS for DNA nanotechnology.
• For each representation, i.e., each SynoPoint in SynopSet, we lay out the design rationale and the use of the Ballchain technique used in 5 different SynoPoints. We also demonstrate smooth and continuous transitions between consecutive SynoPoints.
• We show the effectiveness of this novel approach to visual organization and information abstraction on the analysis of 12 DNA nanotechnology simulations, generate a summary video, and report the feedback from an expert in DNA nanotechnology.
• Finally, we observe that all SynoPoints in SynopSet can be located in a visualization space dubbed SynopSpace, composed of three axes: granularity, visual idiom, and information layout type. We also propose general guidelines on how SynopSpace can be used to generate a new SynopSet for the visualization of other dynamic processes involving hierarchical structures.
DNA nanostructures are ever-increasing in complexity, and understanding their dynamic behavior is key to their research. In this work, we have proposed a visual abstraction set, SynopSet, that spans 7 representations (SynoPoints) to convey the dynamics of DNA nanotechnology simulations at multiple scales and to allow interactive explanatory analysis of these simulations. The seamless transitions between those representations further help the user understand the dynamic behavior of the MDS with a lower cognitive burden, particularly when connecting the information shown in separate representations. The most interesting events during a long simulation can thus be quickly identified and examined in further detail. We formalized the description of the SynoPoints with a visualization space called SynopSpace, and introduced its construction and design principles. We believe there are a variety of other processes that can make use of our concept of visualization space. Biology alone invites further application of SynopSpace, as there are many different representations at different granularities, layouts and visual idioms. We submit that the space can be adapted to other domains by adapting the Granularity axis to the domain-relevant concepts, and the Visual Idiom axis to commonly used visual styles for those granularities. In protein simulation, for instance, one would only need to change NT to Amino-Acid on the Granularity axis to construct the corresponding space. For cell division, as another example, the Granularity axis could have: Cell, Compartment, Organelle, Macromolecule, Molecule, Atom; and the Visual Idiom axis could be adapted from: Progress bar, Ellipsoid, Stick. Our approach of placing standard visualization techniques in the context of a larger space can lead to systematic discoveries of new visual representations.
When inspecting the visualization space in detail, we might see that some idioms are strongly correlated with a certain layout or granularity. It might be that most idioms remain bound by these correlations, but it might also be that some idioms escape their original ties and form surprising new idioms outside their original habitat. For example, the progress bar by itself is the highest abstraction, where no structural information is represented. However, here we morph our Progress Bar into multiple Heatbars, Tubes, and eventually Snakes, which is an entirely new way of using the concept of a progress bar. Without placing the progress bar into the three-dimensional visualization space, we might not have found this promising extended form. Such a research methodology applies to various other visualization scenarios, where spanning a visualization space by recognizing important dimensions might lead to the development of new, surprisingly effective visual representations. While this work focuses on the visualization of trajectories that have already been generated through molecular dynamics simulations, the concepts of SynopSet and SynopSpace can also be leveraged in the generation of animations for such processes. Complex behaviors could first be authored in more abstract representations, and the more detailed animations then computed automatically, making the whole animation creation workflow much more efficient. The large data sets presented in our work are another side-contribution to the visualization community. Such data sets can be used to test, for example, multiscale visualization, automated camera management systems, automated identification of events, and smart labeling according to scales and events.
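To make the adaptation of SynopSpace to other domains more concrete, the following minimal Python sketch models the space as three ordered axes and enumerates the candidate representations they span. It is an illustration only, not the implementation used in this work. The class name `SynopSpace`, the method names, the DNA granularity levels other than NT and Atom, and the layout values ("1D", "2D", "3D") are assumptions made for the example; the cell division granularity levels and the three visual idioms are the ones listed above.

```python
from dataclasses import dataclass, replace
from itertools import product
from typing import Iterator, Tuple


@dataclass(frozen=True)
class SynopSpace:
    """Hypothetical model of a visualization space with three ordered axes."""
    granularity: Tuple[str, ...]
    visual_idiom: Tuple[str, ...]
    layout: Tuple[str, ...]

    def points(self) -> Iterator[Tuple[str, str, str]]:
        """Yield every (granularity, idiom, layout) combination in the space."""
        return product(self.granularity, self.visual_idiom, self.layout)

    def with_granularity(self, old: str, new: str) -> "SynopSpace":
        """Return a copy of the space with one granularity level renamed."""
        if old not in self.granularity:
            raise ValueError(f"{old!r} is not a level on the granularity axis")
        levels = tuple(new if g == old else g for g in self.granularity)
        return replace(self, granularity=levels)


# Idioms taken from the text; layout values are placeholders (assumption).
IDIOMS = ("Progress bar", "Ellipsoid", "Stick")
LAYOUTS = ("1D", "2D", "3D")

# DNA space: "Structure" and "Strand" are assumed levels, NT and Atom are
# mentioned in the text.
DNA_SPACE = SynopSpace(
    granularity=("Structure", "Strand", "NT", "Atom"),
    visual_idiom=IDIOMS,
    layout=LAYOUTS,
)

# Protein simulation: only NT changes to Amino-Acid on the granularity axis.
PROTEIN_SPACE = DNA_SPACE.with_granularity("NT", "Amino-Acid")

# Cell division: granularity levels as listed in the text.
CELL_SPACE = SynopSpace(
    granularity=("Cell", "Compartment", "Organelle",
                 "Macromolecule", "Molecule", "Atom"),
    visual_idiom=IDIOMS,
    layout=LAYOUTS,
)

if __name__ == "__main__":
    for name, space in [("DNA", DNA_SPACE), ("Protein", PROTEIN_SPACE),
                        ("Cell division", CELL_SPACE)]:
        print(name, "candidate points:", len(list(space.points())))
```

Not every combination produced by `points()` is necessarily a meaningful representation. The enumeration only lists candidate positions in the space; a new SynopSet would be a small, deliberately chosen sequence of those positions.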
