Receptor-based virtual screening Lab version 2
Virtual screening • Goal: identify ligands that bind tightly to a protein • Requirements: a computer database of random potential ligands and a structure of the target protein • Repeatedly dock new ligands to the protein • Score how tightly each ligand may bind • Keep the best ‘hits’; discard the other ligands
Ligand database • Often databases of commercially available compounds are used – up to 2 million compounds • These take some time to analyze • We will use an NCI diversity set of about 1800 diverse compounds available from the National Cancer Institute • This database contains many interesting compounds but is not exhaustive
Protein target • We need a structure to serve as a target for ligand binding • This can be an X-ray crystallographic structure or a high-quality homology model • We need some idea of where the binding site for ligands is as well • If the protein has multiple conformations, choose the appropriate one
Scoring • To find the best ligands we must score the docked complexes • Vina does this, giving a ΔG score (a predicted binding free energy in kcal/mol) • Other scoring methods are available, such as X-score and DrugScore
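As a minimal sketch of reading these scores: Vina writes one `REMARK VINA RESULT:` line per pose into its output PDBQT file, and the first number on that line is the predicted affinity in kcal/mol. The function name `vina_scores` and the two-pose example text are made up for illustration; the lab's own Perl script may collect scores differently.

```python
def vina_scores(pdbqt_text):
    """Return the predicted affinities (kcal/mol), one per docked pose."""
    scores = []
    for line in pdbqt_text.splitlines():
        if line.startswith("REMARK VINA RESULT:"):
            # The first number after the colon is the affinity
            scores.append(float(line.split(":")[1].split()[0]))
    return scores


# Made-up output with two poses (atom lines omitted)
example = """\
MODEL 1
REMARK VINA RESULT:      -8.1      0.000      0.000
ENDMDL
MODEL 2
REMARK VINA RESULT:      -7.4      1.932      2.511
ENDMDL
"""

scores = vina_scores(example)
best = min(scores)  # most negative = tightest predicted binding
```

Taking the minimum reflects the convention that a more negative score means tighter predicted binding.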
Automation • Virtual screening involves docking new ligands repetitively • We will dock with Vina and automate the docking with a Perl script • Automation includes selecting a new ligand from the database, running Vina, recording the docking score etc.
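The loop can be sketched in Python as follows. This is not the lab's Perl script: it only builds one Vina command line per ligand, using Vina's `--config`, `--ligand` and `--out` options. The `ligandN.pdbqt` naming follows the db_pdbqt folder; the output folder `docked_pdbqt`, the output file names, and the inclusive start/stop numbering are assumptions made for illustration.

```python
def vina_command(ligand_number, vina_exe="vina.exe", config="config2.txt",
                 db_dir="db_pdbqt", out_dir="docked_pdbqt"):
    """Build the command line for docking one ligand from the database."""
    ligand = f"{db_dir}/ligand{ligand_number}.pdbqt"
    # Hypothetical output name; the real script may use another convention
    docked = f"{out_dir}/ligand{ligand_number}_docked.pdbqt"
    return [vina_exe, "--config", config, "--ligand", ligand, "--out", docked]


def screening_commands(start, stop):
    """One command per ligand number from start to stop (inclusive, assumed)."""
    return [vina_command(n) for n in range(start, stop + 1)]


commands = screening_commands(1, 5)
for command in commands:
    print(" ".join(command))
```

Each list could be handed to `subprocess.run` to launch Vina; the sketch stops short of that so it can be tried without Vina installed.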
Output • You will get a list of hits (ligand numbers) • You can select in advance how many hits you want to look at – for a database of 2000, maybe 20 hits is a reasonable number • You can recover these hits as PDB files from the docked_pdb folder and view them docked to your protein
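Selecting a fixed number of hits amounts to sorting ligands by score and keeping the first few. A minimal sketch, assuming the best score for each ligand is already held in a dictionary; the name `top_hits` and the scores are made up for illustration.

```python
def top_hits(best_scores, n_hits=20):
    """Return (ligand number, score) pairs, most negative score first."""
    ranked = sorted(best_scores.items(), key=lambda pair: pair[1])
    return ranked[:n_hits]


# Made-up scores in kcal/mol, keyed by ligand number
made_up_scores = {1: -6.2, 2: -9.4, 3: -7.8, 4: -5.1}
hits = top_hits(made_up_scores, n_hits=2)
```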
Set up • Patience! • We are trying to emulate much more functional systems • Expect delays
Preparing your computer • Copy the folder VirtualScreen2 into the C: directory • VirtualScreen2 contains most of the files you will need and many of the folders
Installing Perl • Google ‘CPAN’ (the Comprehensive Perl Archive Network) • Download a ‘binary’ for Perl • For PCs this will probably be ActivePerl • Install Perl • Test Perl: get a ‘Command Prompt’ from Start > Programs > Accessories > Command Prompt • Type: perl -v • You should get information about the Perl version
Look at a PDBQT file • Ligands have torsion (twist and bend) features • Look in the database folder db_pdbqt • Look at ligand1.pdbqt • Open file by right-clicking and using ‘open with, wordpad’ • ‘BRANCH’ data indicates where ligand1 can rotate (3 places)
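The same check can be done programmatically by counting the `BRANCH` records. A minimal sketch: the fragment below is a made-up skeleton with the atom lines left out, not the contents of ligand1.pdbqt.

```python
def count_branches(pdbqt_text):
    """Count rotatable bonds by counting BRANCH records (not ENDBRANCH)."""
    return sum(1 for line in pdbqt_text.splitlines()
               if line.startswith("BRANCH"))


# Made-up PDBQT skeleton; real files have atom lines inside each section
skeleton = """\
ROOT
ENDROOT
BRANCH   1   7
BRANCH   7   9
ENDBRANCH   7   9
ENDBRANCH   1   7
BRANCH   3  12
ENDBRANCH   3  12
TORSDOF 3
"""

n_rotatable = count_branches(skeleton)
```

The `TORSDOF` line at the end of a ligand PDBQT file records the number of torsional degrees of freedom, which gives a quick cross-check.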
Check Vina • Test files are present in \lm\VirtualScreen • These are for a receptor and drug ligand • 2rhnh.pdbqt, carh.pdbqt, config2.txt • To run Vina type at command prompt: • \lm\downloads\vina.exe --config config2.txt • The program takes a minute or so to run • Test_vina.txt should give a list of energies for 9 alternative docked conformations
Check ligand database • Go to VirtualScreen2\db_pdbqt directory • NCI diversity set = about 1800 chemicals • Parent DB from NCI is called Ncidiv_p0.0 • These are chemicals available from NCI for testing • We have about 1800 .pdbqt files, one per chemical
Target protein • Much of VirtualScreen2 relies on the target protein for binding • A single name (ideally the PDB code) should be used throughout • Any name variation will stop the program
Prepare target • In VirtualScreen2, make a new directory with a one-word name for your target protein, for example 2rht_a • In your target directory place two files: • rech.pdbqt = your receptor/protein; must be called ‘rech.pdbqt’ • xtal-lig.pdb = a reference ligand that will be used to define the binding site • Look in folder 2rht_a to see an example
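Because any naming mistake stops the run, it can help to check the target folder before starting. A minimal sketch, assuming only the two file names given above; the helper `missing_target_files` is hypothetical, and the demonstration uses a temporary folder rather than the real VirtualScreen2 directory.

```python
import tempfile
from pathlib import Path

REQUIRED_FILES = ("rech.pdbqt", "xtal-lig.pdb")


def missing_target_files(base_dir, target_name):
    """List the required files that are absent from the target folder."""
    folder = Path(base_dir) / target_name
    return [name for name in REQUIRED_FILES if not (folder / name).is_file()]


# Demonstration: a target folder that has the receptor but no reference ligand
with tempfile.TemporaryDirectory() as base:
    target = Path(base) / "2rht_a"
    target.mkdir()
    (target / "rech.pdbqt").write_text("")
    demo_missing = missing_target_files(base, "2rht_a")

print(demo_missing)
```

An empty list means both files are in place.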
Making rech.pdbqt • Start with your receptor/protein without any ligand • Make a copy of the PDB file and delete lines referring to your ligand 3-letter code • Save
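Deleting the ligand lines by hand works, but the step can also be scripted. A minimal sketch, assuming standard PDB columns (residue name in columns 18-20); the function name, the ligand code `LIG`, and the two-atom example are made up for illustration.

```python
def strip_ligand(pdb_text, ligand_code):
    """Drop ATOM/HETATM lines whose residue name matches the ligand code."""
    kept = []
    for line in pdb_text.splitlines():
        is_atom = line.startswith(("ATOM", "HETATM"))
        # Residue name occupies columns 18-20 (string indices 17 to 19)
        if is_atom and line[17:20].strip() == ligand_code:
            continue
        kept.append(line)
    return "\n".join(kept) + "\n"


# Made-up complex: one protein atom and one ligand atom
example_complex = """\
ATOM      1  N   ALA A   1      11.104   6.134  -6.504
HETATM  101  C1  LIG A 201      12.000   7.000  -5.000
END
"""

receptor_only = strip_ligand(example_complex, "LIG")
```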
Making your rech.pdbqt file • Add hydrogens • There are two methods • Open your protein in DS Viewer: click on ‘tools’, then ‘hydrogens’, ‘add’ • You should see H’s added • Or use OpenBabel on the command line • Babel.exe -ipdb 2nht.pdb -opdb 2nhtH.pdb -h • (substitute the name of your protein)
Making your rech.pdbqt file • Now convert the PDB file to PDBQT, adding hydrogen bonding information • Use MGLtools (AutoDock tools) • Install if you do not have it • Start program; you will get a window • In the middle of the lower bar is ‘Grid’ • Click ‘Macromolecule’ on the menu and open your pdb+hydrogens file. • Then choose ‘output’ and save as a .pdbqt file
Making your rech.pdbqt file • The file should be ready at this point • Check that the file contains hydrogens (only polar hydrogens are included) • Check that the file has hydrogen-bonding information in the right margin, with entries like HD (indicating hydrogen donor), OA (oxygen hydrogen-bond acceptor), or C (carbon, which plays no hydrogen-bonding role)
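This check can be scripted by tallying the last field of each atom line, which holds the atom type. A minimal sketch; the four-atom fragment is made up and its charges are illustrative only.

```python
from collections import Counter


def atom_type_counts(pdbqt_text):
    """Tally the atom type found in the last field of each atom line."""
    counts = Counter()
    for line in pdbqt_text.splitlines():
        if line.startswith(("ATOM", "HETATM")):
            counts[line.split()[-1]] += 1
    return counts


# Made-up fragment; charges and coordinates are for illustration only
fragment = """\
ATOM      1  N   SER A   1      11.104   6.134  -6.504  1.00  0.00    -0.346 N
ATOM      2  H   SER A   1      11.500   5.300  -6.900  1.00  0.00     0.163 HD
ATOM      3  CA  SER A   1      11.639   6.071  -5.147  1.00  0.00     0.177 C
ATOM      4  OG  SER A   1      13.500   7.100  -4.200  1.00  0.00    -0.398 OA
"""

type_counts = atom_type_counts(fragment)
```

A receptor file with no HD entries at all would suggest the hydrogens were not added.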
Reference ligand • The reference ligand PDB file serves only one purpose: • It defines the region of the protein that Vina will search • If the ligand is in the wrong place, Vina will search the wrong place. • Copy the ligand from a trusted protein-ligand complex file
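One common way to turn a reference ligand into a search region is to take the bounding box of its atoms and pad it on every side. The sketch below does that and returns values under Vina's `center_x`/`size_x` style option names. How Virtual2.pl actually derives the box is not shown in these slides, so the method, the function name, and the 5 Å padding are all assumptions.

```python
def search_box(pdb_text, padding=5.0):
    """Bounding box of the ligand atoms, padded on each side (in angstroms)."""
    coords = []
    for line in pdb_text.splitlines():
        if line.startswith(("ATOM", "HETATM")):
            # x, y, z occupy columns 31-38, 39-46 and 47-54 of a PDB line
            coords.append((float(line[30:38]),
                           float(line[38:46]),
                           float(line[46:54])))
    box = {}
    for axis, values in zip("xyz", zip(*coords)):
        low, high = min(values), max(values)
        box[f"center_{axis}"] = round((low + high) / 2, 3)
        box[f"size_{axis}"] = round((high - low) + 2 * padding, 3)
    return box


# Made-up two-atom reference ligand
reference_ligand = """\
HETATM  101  C1  LIG A 201      12.000   7.000  -5.000
HETATM  102  C2  LIG A 201      16.000   9.000  -1.000
"""

box = search_box(reference_ligand)
```

If the reference ligand is moved, the box moves with it, which is why a misplaced ligand sends the search to the wrong place.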
Editing the Virtual2.pl script • Information on how the virtual screen should run is included in the script • You must tell the script what to do • At runtime this information is used
VS adjustable features • Edit Virtual2.pl • You can adjust: • Target_name – must match a folder name • Filenum (file number) – use new number to avoid deleting previous experiments • Number of ligands to screen – use ‘stop’ and ‘start’
Target_name • $target_name defines the target for analysis • It should = the name of the folder that holds rech.pdbqt • E.g. $target_name = “2rht_a”; • For the example search • There is a folder called 2rht_a that matches and has the files needed for the search
Number of ligands • You can adjust the start and stop points for searching the database • Do only 5 to start; 1800 may take days on your machine (21 hours on my machine) • Time how long 5 ligands take and multiply by 360 (1800 / 5) to estimate the time required for the whole database • The database can be split up using ‘stop’ and ‘start’ and run at different times
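The estimate is simple arithmetic and can be sketched as below. The 210-second timing is a hypothetical value chosen for the example, not a measurement.

```python
def estimate_full_run_hours(seconds_for_sample, sample_size=5,
                            database_size=1800):
    """Scale the time for a small sample up to the whole database."""
    seconds_per_ligand = seconds_for_sample / sample_size
    return seconds_per_ligand * database_size / 3600


# Hypothetical timing: 5 ligands docked in 210 seconds
hours = estimate_full_run_hours(210)
print(f"Estimated time for the whole database: {hours:.1f} hours")
```

Dividing by the sample size and multiplying by 1800 is the same as multiplying the 5-ligand time by 360.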
Editing the script • Right-click on Virtual2.pl and choose open with Wordpad • At the top of the script is information • The section labeled for editing can be changed • If you are going to make big changes, save a copy of the original script • You must enter the name of your protein exactly as the folder is named • Edit carefully, do not delete #’s or ;’s
Before you begin VS • Have you set the number of ligands to 5? (0-5) • This should take 3 – 30 minutes (you should time it) • If something goes wrong the first time (it usually does) no harm done. • To stop the program, use ctrl-C (repeat if necessary)
Running VS • Get a command prompt (Start > Programs > Accessories > Command Prompt) • Type: cd \virtualscreen2 • (this gets you to the right directory if needed) • Type: virtual2.pl • The program should run and stop in less than an hour if you are doing 5 ligands (2-10 minutes is likely)
Looking at the results • The results are in the vs_log folder (\virtualscreen2\vs_log) • The output file has the file numbers of the hits, ranked from best to worst. • Results files are marked with filenum to avoid overwriting • Sample file: 2rht_a_results2.txt
Looking at hits • Open your hits results file or open the example file 2rht_a_results.txt • The predicted ΔG of binding is shown along with the ligand number • A more negative ΔG indicates tighter binding • The average ΔG for all ligands is shown • For my data, ligand 438 is best
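The same summary can be computed from a table of scores. A minimal sketch: the function name and the three scores are made up, and the layout of the real results file is not assumed here.

```python
from statistics import mean


def summarize(best_scores):
    """Report the best-scoring ligand and the average over all ligands."""
    best_ligand = min(best_scores, key=best_scores.get)
    return {
        "best_ligand": best_ligand,
        "best_dg": best_scores[best_ligand],
        "mean_dg": mean(best_scores.values()),
    }


# Made-up scores in kcal/mol, keyed by ligand number
summary = summarize({1: -6.0, 2: -11.0, 3: -7.0})
```

Comparing the best score with the average gives a sense of how far a hit stands out from the rest of the database.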
Looking at one ligand • We can look at the best hit from 2rht_a • In db_pdb look for ligand438.pdb, the best hit for the example • (db_pdb contains un-docked molecules) • Look at this file with RasMol • It has a symmetric set of fused rings – this type of molecule is usually an artefact that binds to everything, so other hits may be better
Looking for a good pose • A ‘pose’ is a ligand conformation bound to a protein • To view the conformation of a docked ligand after VS, look in the docked_pdb folder • These files can also be added to a protein file to view docking • Save molecules you like, because they can be overwritten
Viewing complexes • The ligand .pdb file contents can be spliced onto the end of a copy of the receptor file used in virtual screening • The complex can be viewed in RasMol • Especially note what receptor residues the ligand contacts
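Splicing can be done in a text editor, or sketched in Python as below. The sketch drops the receptor's `END` record, appends the ligand atoms as HETATM records, and renames the ligand residue to `LIG`. The function name and the one-atom example files are made up for illustration.

```python
def splice_complex(receptor_pdb, ligand_pdb, ligand_name="LIG"):
    """Append ligand atoms to the receptor, renaming the ligand residue."""
    lines = [line for line in receptor_pdb.splitlines()
             if line.strip() != "END"]
    for line in ligand_pdb.splitlines():
        if line.startswith(("ATOM", "HETATM")):
            # Keep columns 7-17, overwrite the residue name in columns 18-20
            lines.append("HETATM" + line[6:17]
                         + ligand_name[:3].ljust(3) + line[20:])
    lines.append("END")
    return "\n".join(lines) + "\n"


# Made-up one-atom receptor and one-atom docked ligand
receptor = """\
ATOM      1  N   ALA A   1      11.104   6.134  -6.504
END
"""
docked_ligand = """\
ATOM      1  C1  UNK A   1      12.000   7.000  -5.000
"""

complex_pdb = splice_complex(receptor, docked_ligand)
```

Renaming the residue at this stage means the ligand can be picked out by a single three-letter code in later analysis.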
Ligand – protein contacts • Splice ligand onto receptor in PDB file • Ligand should be named LIG in PDB file • Run contact12.pl script • Example: • contact12.pl 2rht_lig438.pdb LIG • Contacts appear on screen and in file ‘contact_output.txt’
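The idea behind a contact listing can be sketched as a distance search: report every receptor residue with an atom within some cutoff of a ligand atom. This is not the contact12.pl script, whose method and output format are not shown here; the 4 Å cutoff and the function names are assumptions.

```python
import math


def parse_atoms(pdb_text):
    """Read residue name, chain, residue number and coordinates per atom."""
    atoms = []
    for line in pdb_text.splitlines():
        if line.startswith(("ATOM", "HETATM")):
            atoms.append({
                "resname": line[17:20].strip(),
                "chain": line[21],
                "resseq": int(line[22:26]),
                "xyz": (float(line[30:38]), float(line[38:46]),
                        float(line[46:54])),
            })
    return atoms


def contact_residues(pdb_text, ligand_name="LIG", cutoff=4.0):
    """Receptor residues with any atom within cutoff of any ligand atom."""
    atoms = parse_atoms(pdb_text)
    ligand = [a for a in atoms if a["resname"] == ligand_name]
    receptor = [a for a in atoms if a["resname"] != ligand_name]
    contacts = set()
    for atom in receptor:
        if any(math.dist(atom["xyz"], lig["xyz"]) <= cutoff for lig in ligand):
            contacts.add((atom["resname"], atom["chain"], atom["resseq"]))
    return sorted(contacts, key=lambda c: (c[1], c[2]))


# Made-up complex: one nearby residue, one distant residue, one ligand atom
complex_example = """\
ATOM      1  N   ALA A   1      11.104   6.134  -6.504
ATOM      2  CA  GLY A  50      30.000  30.000  30.000
HETATM  101  C1  LIG A 201      12.000   7.000  -5.000
END
"""

contacts = contact_residues(complex_example)
```

The cutoff is a judgment call: a larger value reports more residues, including ones that only line the pocket without touching the ligand.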
The role of good judgment • The value of virtual screening is that one can go from thousands or millions of candidate drugs with 0.01% - 0.1% leads to tens or hundreds of hits with 1% -10% leads • Hits are not leads • They are a step toward getting leads