A subsequent study of visualized DNA sequence comparison based on NKS. Dawei Li, Ph.D., The Rockefeller University. E-mail: dwlidwli@gmail.com or dli@rockefeller.edu
Basic terms about DNA sequence • 1. DNA is made up of nucleotides. • 2. The nucleotides are ‘A’, ‘G’, ‘C’, and ‘T’. • 3. For the same nucleotide composition, different sequences show clearly different DNA helix stability. For example, -GC- is much more stable than -CG-. • 4. There may be some interaction between the nucleotides.
Experimental approaches to study interaction between nucleotides • Atomic force microscope (AFM): a very high-resolution type of scanning probe microscope, with demonstrated resolution of fractions of a nanometer. • X-ray diffraction (XRD): techniques based on the elastic scattering of X-rays from structures that have long-range order. • Neutron scattering: the deflection of neutrons is used as a scientific probe. • Nuclear magnetic resonance (NMR): a physical phenomenon based on the quantum mechanical magnetic properties of an atom's nucleus. • Infrared spectroscopy (IR spectroscopy): the subset of spectroscopy that deals with the infrared region of the electromagnetic spectrum.
Questions Four DNA sequences: 1. CTCGGGTTATCGGCGTGGTCCGGCCGAGGGCGGCATTCCAGAAGAGGGACCCTCACGCCACCA 2. CCAGAGCGTCGCCGACCCTCTAATTGGTCTCCCCAGAAGAGGCTGAGAAGAAGGCCGAAACAG 3. AGAGTCCCAGGACACACTGTAGAAGATCAAAGCAGAAGAAGAGGGAAGAGTGGCTGAGGGAC 4. TCAGCCCATCTGCCATCCCCAAAGATAGAAGACACCCCCTTGGTTGCCCTCTAGAAGATCCAGT • Can you figure out which organisms the sequences belong to? • Is there any “inner difference” between the sequences? • Can you figure out what the differences and interactions are? These questions have not been answered successfully by the existing approaches based on traditional mathematical rules.
Here the ‘Wolfram approach’ can do something… • It may shape the current theory of DNA sequence analysis.
Wolfram approach The Wolfram approach, described in “A New Kind of Science” (2002), has attracted biologists’ attention. It provides a nontraditional, visualized model for DNA sequence comparison. Compared with traditional approaches: • 1) The Wolfram approach is based on the concept that simple rules can produce highly complicated behaviors. • 2) It pays attention to the interaction of power and regulation between any adjacent nucleotides. • 3) It can show alterations of DNA nucleotides dynamically, including transposition, insertion, deletion, and duplication.
It has become possible to study some biological issues that have never been successfully clarified by traditional mathematical methods. However, studies of DNA comparison with the Wolfram approach are still very few.
About our study • With the Wolfram approach, we 1. studied some simple rules; 2. analyzed some DNA sequences of different viruses with a special rule; 3. studied the images visually.
Hypotheses Our studies were based on four hypotheses: • 1) A DNA sequence is not random; it follows a rule. • 2) There is a still-unknown mode of nucleotide organization that each DNA sequence follows. • 3) Simple rules can also produce complex behaviors in living organisms. • 4) The Wolfram approach can reflect the rule rooted in a DNA sequence.
Rule in the hypotheses The eight arrangements can produce very complicated behaviors.
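The “eight arrangements” are the eight possible states of a cell together with its two neighbors when every cell is either black (1) or white (0). The following is a minimal Python sketch of how one elementary rule number decodes into an output for each arrangement. The slides do not say which of the 256 rules was used, so rule 90 is only an illustrative assumption, and `rule_table` is a hypothetical helper name.

```python
def rule_table(rule_number):
    """Map each (left, center, right) arrangement to the next state of the center cell."""
    if not 0 <= rule_number <= 255:
        raise ValueError("elementary rules are numbered 0..255")
    table = {}
    for index in range(8):
        # index = 4*left + 2*center + right, the standard Wolfram numbering
        neighborhood = ((index >> 2) & 1, (index >> 1) & 1, index & 1)
        table[neighborhood] = (rule_number >> index) & 1
    return table


RULE = 90  # assumption: chosen for illustration, not taken from the slides
table = rule_table(RULE)
for neighborhood in sorted(table, reverse=True):
    print(neighborhood, "->", table[neighborhood])
```

Changing `RULE` to any value from 0 to 255 gives the table for another elementary rule; the eight arrangements themselves stay the same.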
Nested structure The nested structure defined by Wolfram was generated from a single initial (parent) cell.
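How a nested pattern can grow from one cell may be easier to see in a small text rendering. The sketch below starts from a single black cell and applies an elementary rule row by row. Rule 90, the width, and the number of steps are assumptions picked for illustration (rule 90 is a well-known rule that gives a nested triangle pattern); the rule used in the study is not stated on the slides, and `step` and `evolve` are hypothetical names.

```python
def step(cells, rule_number):
    """Apply one elementary-CA update; cells beyond the edges are treated as white (0)."""
    padded = [0] + cells + [0]
    return [
        (rule_number >> (padded[i - 1] * 4 + padded[i] * 2 + padded[i + 1])) & 1
        for i in range(1, len(padded) - 1)
    ]


def evolve(cells, rule_number, steps):
    """Return the starting row followed by `steps` successive rows."""
    rows = [list(cells)]
    for _ in range(steps):
        rows.append(step(rows[-1], rule_number))
    return rows


WIDTH, STEPS, RULE = 31, 15, 90  # assumptions for illustration only
start = [0] * WIDTH
start[WIDTH // 2] = 1  # the single black parent cell
rows = evolve(start, RULE, STEPS)
for row in rows:
    print("".join("#" if cell else "." for cell in row))
```

Each printed row is one generation; the triangles nested inside larger triangles are the kind of structure the slides refer to.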
The sequences of SARS viruses • SARS BJ01, partial genome; • SARS BJ02, partial genome; • SARS BJ03, partial genome; • SARS BJ04, partial genome; • SARS CUHK-W1, complete genome; • SARS GZ01, partial genome; • SARS HKU-39849, complete genome; • SARS TOR2, complete genome; • SARS Urbani, complete genome; • SARS coronavirus CUHK-Su10, complete genome; • SARS coronavirus isolate SIN2774, complete genome; • SARS coronavirus TW1, complete genome; • SARS coronavirus, complete genome.
Machine: An SGI Origin 3000 computer (Silicon Graphics, Inc.; 64 500 MHz IP35 processors) was used throughout our study. Each sequence was run using the same programs. Images: More than 3,000 images (200,000 Mega) were generated.
The results of 13 SARS viruses The images were arranged according to the color order in chromatogram (Ref.1).
The SARS viruses behaved quite differently from other viruses. There was a very large nested structure across the first 10 kb of the genome.
By comparison, we found that the nested structures were mainly located in the regions of replicases 1A and 1B. The replicase 1A protein gene may control the activities of the replication complex of SARS viruses.
Comparisons between SARS-CUHK and SARS-GZ • Possible origin of SARS virus (Ref.1).
Comparison of images among five different viruses (a) and (b), SARS virus and equine rhinovirus (ER), respectively, showing the nested structures. (c) Another virus in which only two small nested structures were found. (d) A typical behavior of a common virus. (e) The behavior of HIV. Note: Images in (a), (c), (d) and in (b), (e) are reduced 15,000-fold and 1,600-fold, respectively (Ref.1).
The whole genomes of equine rhinovirus (ER) and SARS virus shared similar nested structures. • SARS virus and human coronavirus 229E were very different in behavior. • No nested structure was found in HIV.
To study whether the nested structures exist in other organisms, we analyzed ten other virus genomes with the Wolfram approach. • The ten types of viruses are as follows: • Avian infectious bronchitis virus (and avian infectious bronchitis virus messenger ribonucleic acid (mRNA)), • Bovine coronavirus, • Dengue virus type 4 strain 814669, • Human rhinovirus 1B, • Japanese encephalitis virus strain K94P05, • Murine hepatitis virus, • Pestivirus type 2, • Porcine epidemic diarrhea virus, • Porcine transmissible gastroenteritis virus minigenome, • West Nile virus.
The results showed that all the viruses can be classified into two groups by their behaviors: Group 1 with left bottom growth of white lines; Group 2 with right bottom growth of black lines. No nested structure was found.
We also studied the behavior of mRNA sequence of avian infectious bronchitis virus. • The current results may suggest that: • 1) the region of the nested structure may be involved in the reproduction of the virus; • 2) the coding sequence of the virus may share some kind of similar complex gene regulation cycle with SARS viruses. The nested structure may contain some special bio-information.
Results in summary • 1. SARS viruses showed nested-structure behavior. The results suggested that the genome sequences have a specific mode of nucleotide organization. • 2. HIV showed another mode of nucleotide organization. • 3. The unique characteristics found in the DNA sequence of SARS viruses and the mRNA sequence of avian virus suggested the importance of the nested-structure behavior.
Discussion • Advantages The Wolfram approach has some advantages: • 1. It can magnify tiny changes in a whole genome sequence for both overall and detailed analyses. • 2. It can also be used at the single-nucleotide scale, for example for DNA mutations and polymorphisms (SNPs and microsatellites). • 3. It pays more attention to the interaction network of power and regulation among adjacent nucleotides. • 4. It is applicable not only to DNA/RNA sequences, but also to protein sequences.
Disadvantages: • There are some disadvantages in our study; for example, only one of the 256 rules was adopted. • For encoding the nucleotides, a quaternary system should be better than a binary system; however, a quaternary system results in many more rules. (48)
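The trade-off between binary and quaternary encoding can be made concrete with a short Python sketch. The two-bit code below is a hypothetical choice (the slides do not give the encoding that was used), and `encode_binary` and `rule_count` are illustrative names. For a rule that looks at a cell and its two neighbors, the number of possible rules with k cell states is k^(k^3), which gives the 256 rules mentioned above for k = 2.

```python
# Hypothetical two-bit code; the slides do not state the encoding that was used.
BINARY_CODE = {"A": "00", "C": "01", "G": "10", "T": "11"}


def encode_binary(sequence):
    """Turn a nucleotide string into a list of 0/1 cells, two cells per base."""
    return [int(bit) for base in sequence.upper() for bit in BINARY_CODE[base]]


def rule_count(states, neighborhood_size=3):
    """Number of distinct rules for a cellular automaton with `states` cell colors."""
    return states ** (states ** neighborhood_size)


cells = encode_binary("CTCGGGTTAT")  # first ten bases of sequence 1 on the Questions slide
print(cells)
print(rule_count(2))  # binary cells: 256 rules
print(rule_count(4))  # quaternary cells: 4**64 rules
```

With four states, one cell per nucleotide is enough, but the rule space is far too large to search exhaustively, which is the cost the slide points to.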
A typical feature of the Wolfram approach: Each cell of the DNA sequence interacts with its adjacent cells. • Scores of power: The power from the two sides is not always equal; it is scored as 1 or 0, which represent ‘for’ and ‘against’, respectively.
Four behaviors The behaviors of the DNA sequences can be classified into four categories: • pure repetition with left growth, as most common viruses showed; • pure repetition with right growth, as HIV showed; • nested structure, as SARS viruses showed; • uniformly white or uniformly black.
Nested Structure • 1. The nested structure may result from the interaction of aggregation between black and white cells. • 2. It may represent a regulation cycle. A black line may signify the beginning of a protein production cycle, and a white line may signify the end by closing off the triangle.
Interaction network of power • 1. The interaction between nucleotides includes power and anti-power. Each nucleotide receives power from adjacent nucleotides and exerts power as well. • 2. The balance is sensitive and can easily be broken by a sequence alteration. A single mutation can cause death, possibly because the original nucleotide has a key role in the whole genome. This may also explain why some mutations can be ignored.
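The sensitivity to a single alteration can be illustrated with a small experiment: run the same rule on a row of cells and on a copy with one cell flipped, then count how many cells differ at each step. This is only a sketch under stated assumptions. The starting row is an arbitrary made-up pattern rather than real sequence data, rule 90 is an illustrative choice because the rule used in the study is not given, and `step` and `differences` are hypothetical names.

```python
def step(cells, rule_number):
    """Apply one elementary-CA update; cells beyond the edges are treated as white (0)."""
    padded = [0] + cells + [0]
    return [
        (rule_number >> (padded[i - 1] * 4 + padded[i] * 2 + padded[i + 1])) & 1
        for i in range(1, len(padded) - 1)
    ]


def differences(original, mutated, rule_number, steps):
    """Count the cells that differ between the two runs at each step, starting at step 0."""
    counts = []
    for _ in range(steps + 1):
        counts.append(sum(a != b for a, b in zip(original, mutated)))
        original = step(original, rule_number)
        mutated = step(mutated, rule_number)
    return counts


RULE = 90  # assumption: illustrative rule only
original = [(i * 7 % 5) % 2 for i in range(41)]  # arbitrary made-up starting row
mutated = list(original)
mutated[20] ^= 1  # a single "point mutation" in the middle cell
diffs = differences(original, mutated, RULE, 8)
print(diffs)
```

Under this rule the single change never dies out and keeps reaching cells farther from the mutated position. Under other rules a change may spread differently or vanish, which is one way to picture why some mutations matter and others can be ignored.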
Mutation & SNP • A mutation is a change in the nucleotide sequence of DNA or RNA. • A single nucleotide polymorphism (SNP) is a DNA sequence variation that occurs when a single nucleotide in the genome differs between members of a species.
In conclusion • The traditional intuition is that the behavior should be simple if the rule is simple. The data from both Wolfram’s work and our study show that this is not true. A simple rule may actually capture the essential mechanisms responsible for complex phenomena in living organisms. • We applied the Wolfram approach to DNA sequence analysis. Our results support the view that the approach is appropriate for visualized sequence comparison and is a useful categorization tool. • The results are basic but may be of interest for subsequent studies. Further systematic investigations are necessary, and the results also need to be confirmed by experimental work.
Reference • Ref.1: Li D, et al. Understanding SARS with Wolfram approach. Acta Biochimica et Biophysica Sinica. 2004; 36(1):1-10. • Acknowledgement • Lin He, Zhende Huang, Jurg Ott et al.