Function first: a powerful approach to post-genomic drug discovery Stephen F. Betz, Susan M. Baxter and Jacquelyn S. Fetrow GeneFormatics Presented by Jamie Duke April 7th, 2004
Goal: • To answer this question: • How can we effectively use the new information from the genome sequencing projects to accelerate the development of new therapeutics that target gene products or their functions?
A “deluge of targets” • An overwhelming number of nucleotide sequences has been generated by high-throughput sequencing and differential expression methods • Targets are not necessarily even expressed in vivo • The actual targets are waiting to be discovered
Post-genomic drug discovery • Methods must be able to deal with a large number of targets • General strategy relies on high-throughput screening of large compound libraries against target proteins • Requires knowledge of enzymatic activity, or of binding to a known ligand • Methods are costly, the number of targets analyzed is limited, and a second assay is generally required
Screening Strategies • Use of nuclear magnetic resonance (NMR) and X-ray crystallography • Structure-based drug design has also come into play with therapeutics • Both strategies still require analyzing single proteins serially • Best method for the future involves automation
Screening Strategies • Experimental efforts are resource-intensive and limited to proteins that can be cloned • X-ray crystallography is only possible if the protein can form diffraction-quality crystals • NMR is only possible if the protein is well behaved in solution • Structural biology is only possible with high-quality structures
Structure • Structural prediction is neither easy nor cheap • Knowing tertiary structure does not guarantee the transfer of function or small molecule binding sites • Inference of function from similar sequences with known function is correct less than 50% of the time • A “similar sequence” is a sequence that is 30% or more identical; most proteins do not meet this requirement • Additionally, different structures have been known to support the same activity
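The 30% identity cut-off on this slide can be illustrated with a minimal Python sketch. It assumes the two sequences have already been aligned (equal length, "-" for gaps). The function names `percent_identity` and `is_similar`, the toy sequences, and the choice to count only columns where both sequences have a residue are assumptions made for illustration; they are not taken from the presentation.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of aligned, non-gap columns that hold identical residues."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Keep only columns where neither sequence has a gap (assumed convention).
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if a != "-" and b != "-"]
    if not columns:
        return 0.0
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


def is_similar(aligned_a: str, aligned_b: str, threshold: float = 30.0) -> bool:
    """Apply the slide's '30% or more identical' rule of thumb."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# Toy, made-up sequences: 1 identical column out of 10.
print(is_similar("ACDEFGHIKL", "AXXXXXXXXX"))
```

Other conventions exist for the denominator (for example, the length of the shorter sequence), so the exact percentage depends on the convention chosen.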
Selectivity • The aim is to develop truly selective compounds from the beginning of the discovery process • Decrease the failure rate of compounds in development and ultimately lower cost and time
Function in Drug Discovery • Drug discovery starts by mining the genomic data to determine the function of the drug leads • Function can mean pathway involvement, catalytic activity, protein class, or active-site chemistry • Functional features can be used to develop assays for a more straightforward path • “Parallel large-scale processes and analyses to identify function first will be key for this lead discovery approach to be successful in the post-genomic era.”
Function Assignment • Function of the sequences is often inferred through sequence “similarity” • Function is automatically transferred, which can lead to misannotation and misinterpretation • SAGE (serial analysis of gene expression) and parallel protein analysis are generally used • These experimental procedures allow the gene product function to be identified in a complex environment, yielding data that are used to validate the target • Unfortunately, low-copy genes and a high false-positive rate limit the use of these methods
Function first approach to structural and chemo-proteomics • The process starts with identifying a set of protein sequences in the human proteome • Looking for sequences that have particular binding sites, carry out catalysis, or have been previously identified • Proteins are classified by their functional sites • Analysis of the families is key to specific drug design • Structures of family members are determined using protein folding algorithms • Small molecule binding sites are identified • This approach saves crucial time and money in the drug discovery process
Approximate Structure Analysis • For each protein, an approximate model is generated • Algorithms developed by Jeffrey Skolnick for the CASP competitions are used to predict the structure • Models are not perfect due to imperfect scoring functions and energy potentials
Fuzzy Functional Forms™ • A technique to identify biochemical function • An FFF is a motif that describes the chemistry and geometry of the functional site within the protein • Information is based upon known structures in the PDB • Functional residues are identified in related protein structures • Residues are selected based on the nature of the function, chemistry or structure of the site • Geometric constraints are defined for key residues in the structure
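The idea of a motif that combines chemistry (which residue types) with geometry (distance constraints between key residues) can be sketched in Python as follows. This is only an illustrative toy, not the actual FFF implementation: the motif definition, the use of C-alpha coordinates, the distance ranges, the brute-force search, and every name (`MOTIF_RESIDUES`, `MOTIF_DISTANCES`, `find_sites`, `TOY_MODEL`) are hypothetical.

```python
from itertools import permutations
from math import dist  # Python 3.8+

# Hypothetical motif: allowed residue types for each motif position...
MOTIF_RESIDUES = [{"CYS"}, {"CYS"}, {"PRO"}]
# ...and allowed distance ranges (angstroms) between pairs of motif positions.
MOTIF_DISTANCES = {(0, 1): (4.0, 7.0), (0, 2): (3.0, 6.5)}


def find_sites(residues, motif_residues, motif_distances):
    """Return index tuples of residues that satisfy the motif.

    residues: list of (residue_name, (x, y, z)) pairs, e.g. C-alpha
    coordinates taken from an approximate model.
    """
    hits = []
    # Brute force over ordered residue choices; fine for a toy example only.
    for combo in permutations(range(len(residues)), len(motif_residues)):
        # Chemistry: each chosen residue must be an allowed type.
        if any(residues[i][0] not in allowed
               for i, allowed in zip(combo, motif_residues)):
            continue
        # Geometry: each constrained pair must fall inside its distance range.
        if all(lo <= dist(residues[combo[p]][1], residues[combo[q]][1]) <= hi
               for (p, q), (lo, hi) in motif_distances.items()):
            hits.append(combo)
    return hits


# Made-up coordinates for a five-residue toy model.
TOY_MODEL = [
    ("CYS", (0.0, 0.0, 0.0)),
    ("GLY", (3.8, 0.0, 0.0)),
    ("PRO", (5.0, 2.0, 0.0)),
    ("CYS", (5.5, 0.0, 0.0)),
    ("ALA", (20.0, 0.0, 0.0)),
]
print(find_sites(TOY_MODEL, MOTIF_RESIDUES, MOTIF_DISTANCES))
```

Expressing the geometry as ranges rather than exact values is one way to make such a motif tolerant of imperfect, approximate models.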
Functional Family Approach • Functional sites are (generally) well conserved in families • FFFs are used to identify all proteins in the genome with the given functional site • Functional families are identified by: • Sites identified by FFF, and • Computational information on the functional site that yields valuable biologically relevant data necessary for drug discovery
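Grouping proteins into functional families by the sites found in them can be sketched as a simple inversion of a protein-to-sites mapping. The sketch below assumes the site search has already been run; the protein identifiers, site labels, and the function name `group_by_site` are all made up for illustration.

```python
from collections import defaultdict


def group_by_site(site_hits):
    """Invert {protein: [site labels]} into {site label: [proteins]}."""
    families = defaultdict(list)
    for protein, sites in site_hits.items():
        for site in sites:
            families[site].append(protein)
    # Sort members so the output is stable regardless of input order.
    return {site: sorted(members) for site, members in families.items()}


# Hypothetical search results: which functional sites matched which proteins.
SITE_HITS = {
    "protein_1": ["site_X"],
    "protein_2": ["site_X", "site_Y"],
    "protein_3": ["site_Y"],
    "protein_4": [],
}
print(group_by_site(SITE_HITS))
```

A protein with more than one matched site appears in more than one family, and a protein with no matches appears in none.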
Functional Family Approach • The ultimate goal is to identify small molecules that will selectively inhibit a single member of a family, thus reducing interaction with other proteins • This approach allows functionally related proteins to be accounted for and classified • Provides better assurance of the “druggability” of a putative target
Complementary to Cell-Based Information • This method allows for efficient target validation due to parallel identification • Allows for large-scale identification of function and structure without large-scale investments • Provides information that is relevant to assays being run to determine functionality and to the interpretation of microarray data
Rapid Analysis of the Structure • Because the structure of the protein does not need to be determined to the atomic level, it is much less computationally intensive • Far less protein folding is done in silico because only models identified through FFF or high-scoring models are folded • The process is automated • Scientists can compute more than 25,000 protein sequences and make structure-function assignments in weeks, not the months or years it would take to serially test each sequence through the experimental techniques • The combination of structure and function information allows for more reliable assignments than sequence-based methods
Alternative Drug Binding Sites • One key feature of FFF is its ability to identify multiple active sites for one protein • A protein may be annotated as a phosphatase, but it may also have a catalytic site, a metal-binding site, and a regulatory site, as does serine-threonine phosphatase • Alternative sites are also potentially druggable • The information from multifunctional
Key Information • It is key to know the structure of the functional site • Recognizing the similarities and differences among a set of potential targets allows small molecules to be designed that are specific for each member of the family
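One simple way to picture "similarities and differences" among family members is to compare their functional-site residues position by position. The Python sketch below assumes the site residues of each member have already been aligned into equal-length strings; the member names, residue strings, and the function name `compare_sites` are hypothetical.

```python
def compare_sites(site_residues):
    """Split aligned functional-site positions into conserved and variable.

    site_residues: dict mapping protein name -> string of one-letter site
    residues, one character per aligned site position.
    """
    lengths = {len(s) for s in site_residues.values()}
    if len(lengths) != 1:
        raise ValueError("need at least one site string, all of equal length")
    conserved, variable = [], []
    # zip(*...) walks the site column by column across all family members.
    for pos, column in enumerate(zip(*site_residues.values())):
        (conserved if len(set(column)) == 1 else variable).append(pos)
    return conserved, variable


# Made-up site residues for three hypothetical family members.
FAMILY_SITES = {
    "member_1": "DFGKE",
    "member_2": "DFGRE",
    "member_3": "DLGKE",
}
conserved, variable = compare_sites(FAMILY_SITES)
print("conserved positions:", conserved)
print("variable positions:", variable)
```

In this picture, conserved positions describe what the family shares, while variable positions are the places where a compound might, in principle, distinguish one member from another.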
Conclusion • The function first approach provides an effective way to mine the genomic data to lead to compounds that can be developed into drugs • Using this method in association with biological, structural, and chemical methods will lead to drug discovery that is more efficient and cost-effective