Error Tolerant NMR Structure Determination by Combining Structure Templates

Project: Research

Description

Proteins are important molecules in living organisms, and the determination of their structures is under intensive study. However, many protein structures remain unsolved. Two common techniques for determining protein structures are X-ray crystallography and Nuclear Magnetic Resonance (NMR) spectroscopy. Most known structures were determined by X-ray crystallography. Compared to X-ray crystallography, NMR spectroscopy can identify important structure dynamics. However, existing systems for NMR structure calculation require high-quality data, which complicates the wet lab technicians’ work. One NMR structure costs about US$150,000 and about half a year of an expert’s work. This burden could be significantly alleviated by a new calculation system that requires less data, is error tolerant, and is algorithmically efficient.

We propose such a system. Several ideas from our experience indicate that this is possible. First, we utilize the wealth of knowledge implicitly encoded in solved structures. Threading, an approach from protein structure prediction, works on this principle. Our plan is to extend threading to work on NMR data. However, threading is a hard problem, due to its similarity to the NP-hard subgraph isomorphism problem. Our system faces an even more difficult problem: even the optimal solution will not guarantee the actual structure, due to real-world factors. Hence, our new algorithm will need to be evaluated against real-world data to guide meaningful modifications.

Our second idea is to use probabilistic graphical models (PGMs), which capture uncertainties and dynamics in the data. The PGMs will solve three tasks. First, given the template structure and the target, we need to compute alignments. However, data errors will adversely affect the results. Here, a PGM can be used to sample alignments; correct alignments will be sampled with high probability. Second, each sampled alignment will be extended into structure models, from which we select good ones. A PGM, modeled with NMR constraints and existing energy functions, can solve this. Finally, structure dynamics in NMR data need to be modeled in structures. PGMs can be used to model the continuations of these dynamics. In all these tasks, knowledge from solved structures can be used to bias the PGMs into providing more accurate results.

The structure computed from the alignment will enter a final stage where it is refined into a “native” structure. Existing approaches are inefficient. Our project will study ways to use a pre-computed database to accelerate searches.

Expected output of this project includes high-quality research publications, algorithms, software for structural calculation, and training of PhD students.
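
The idea of sampling alignments, rather than committing to a single optimal one, can be illustrated with a minimal sketch. This is not the project's actual model. It assumes a simple Boltzmann-weighted global alignment in which `score[i][j]` is a hypothetical compatibility score between template position `i` and target position `j` (for instance, one derived from NMR restraints), and `gap` and `temperature` are illustrative parameters. A forward pass accumulates the total weight of all alignments, and a stochastic traceback then draws each alignment with probability proportional to its weight.

```python
import math
import random


def partition_table(score, gap=1.0, temperature=1.0):
    """Forward pass: Z[i][j] is the summed weight of all alignments
    of the first i template positions with the first j target positions."""
    n, m = len(score), len(score[0])
    w_gap = math.exp(-gap / temperature)
    Z = [[0.0] * (m + 1) for _ in range(n + 1)]
    Z[0][0] = 1.0
    for i in range(n + 1):
        for j in range(m + 1):
            if i == 0 and j == 0:
                continue
            total = 0.0
            if i > 0 and j > 0:  # align template i-1 with target j-1
                total += Z[i - 1][j - 1] * math.exp(score[i - 1][j - 1] / temperature)
            if i > 0:  # template position i-1 left unaligned
                total += Z[i - 1][j] * w_gap
            if j > 0:  # target position j-1 left unaligned
                total += Z[i][j - 1] * w_gap
            Z[i][j] = total
    return Z


def sample_alignment(score, gap=1.0, temperature=1.0, rng=random):
    """Stochastic traceback: returns a list of aligned (template, target)
    index pairs, drawn with probability proportional to the alignment weight."""
    Z = partition_table(score, gap, temperature)
    w_gap = math.exp(-gap / temperature)
    i, j = len(score), len(score[0])
    pairs = []
    while i > 0 or j > 0:
        moves = []
        if i > 0 and j > 0:
            w = Z[i - 1][j - 1] * math.exp(score[i - 1][j - 1] / temperature)
            moves.append(((i - 1, j - 1), w))
        if i > 0:
            moves.append(((i - 1, j), Z[i - 1][j] * w_gap))
        if j > 0:
            moves.append(((i, j - 1), Z[i][j - 1] * w_gap))
        # Pick a predecessor cell in proportion to its share of Z[i][j];
        # if rounding leaves r slightly positive, the last move is used.
        r = rng.random() * Z[i][j]
        for (ni, nj), w in moves:
            r -= w
            if r <= 0:
                break
        if ni == i - 1 and nj == j - 1:
            pairs.append((ni, nj))
        i, j = ni, nj
    pairs.reverse()
    return pairs
```

Under these assumptions, a noisy entry in `score` lowers the probability of the correct alignment but does not eliminate it, so drawing many samples still tends to recover it. Lowering `temperature` concentrates the samples around the highest-scoring alignment, while raising it explores more alternatives.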

Detail(s)

Project number: 9041805
Grant type: ECS
Status: Finished
Effective start/end date: 1/12/12 to 21/07/16