New questions about Tautomerism in Cytosine
Quantum chemical and matrix isolation spectroscopic studies
Géza Fogarasi, Research Laboratory of Theoretical Chemistry, Eötvös L. University, H-1518, Pf. 32, Budapest/Hungary.
1. Background and motivation
Nucleic acid bases are the most conspicuous examples of tautomerism. The double helix of DNA is held together by H-bonds, which rely on specific tautomeric forms of the bases.
The significance of tautomerism was recognized from the very beginning: Watson and Crick, Nature 1953.
For the structural chemist, the very first question concerns energy differences. However, the basic challenge for theory should be recognized at the outset: tautomers, unlike conformers, have completely different electronic structures. At the same time, the energy differences may be as low as for conformers, just a few kcal/mol. Then: can we calculate (relative) energies for different electronic systems with an accuracy of ~0.5 kcal/mol?! (This is realistic for conformers, but quite questionable for tautomers.)
2. Test calculations, relative energies
Test tautomer pairs: formamide / formamidic acid and vinylamine / acetaldimine.
Methods:
Electronic theory: RHF, B3LYP, MP2, CCSD(T)
Basis sets: from 6-31G(d,p) to 6-311++G(3df,3pd) and from cc-pVTZ to aug-cc-pV5Z
Table 1. Computed energies for tautomer pairs (energies, E, in a.u. = 4.3597 × 10⁻¹⁸ J; differences, ΔE, in kcal/mol, 1 kcal = 4.184 kJ).
Table 1 (continued).
a Notations follow standard convention except that in the correlation consistent basis sets "cc" is tacitly assumed and thus omitted for brevity. After the double slash "//" the level of geometry optimization is indicated; "~" means optimization at the same level as the energy calculation.
b All coupled cluster calculations were done at the MP2/aug-pVTZ geometry.
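The unit convention of Table 1 (total energies in a.u., differences in kcal/mol) can be sketched in a few lines. This is only an illustration: the value of the hartree and Avogadro's number are standard constants rather than numbers taken from the slides, and the two total energies are hypothetical placeholders, not entries of Table 1.

```python
# Standard constants (not taken from the slides).
HARTREE_IN_J = 4.3597447e-18   # 1 a.u. of energy (hartree) in joules
KCAL_IN_J = 4184.0             # 1 kcal in joules
AVOGADRO = 6.02214076e23       # particles per mole

# Conversion factor from hartree per molecule to kcal per mole.
HARTREE_IN_KCAL_PER_MOL = HARTREE_IN_J * AVOGADRO / KCAL_IN_J


def relative_energy_kcal(e_a_hartree, e_b_hartree):
    """Energy of form A relative to form B, in kcal/mol."""
    return (e_a_hartree - e_b_hartree) * HARTREE_IN_KCAL_PER_MOL


def kcal_to_hartree(delta_kcal):
    """Energy difference in kcal/mol expressed in hartree."""
    return delta_kcal / HARTREE_IN_KCAL_PER_MOL


# Hypothetical total energies of two tautomers (illustration only).
e_tautomer_a = -169.48000
e_tautomer_b = -169.50000
print(relative_energy_kcal(e_tautomer_a, e_tautomer_b))

# The target accuracy of ~0.5 kcal/mol, expressed in hartree.
print(kcal_to_hartree(0.5))
```

The second number shows why the target is demanding: 0.5 kcal/mol is less than one millihartree, on total energies of well over a hundred hartree.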
Conclusion of highest level test calculations
Tautomer energy differences with the largest basis sets from above, kcal/mol
Bad news:
a) the two systems behave differently!
b) the CC (coupled cluster) level is needed
Good news: CCSD and CCSD(T) give essentially the same result
3. CYTOSINE tautomers
The most prominent example of a "multiform" molecule: five low-energy isomers (three tautomers plus two rotamers).
[Figure: structures of the five isomers 1, 2a, 2b, 3a and 3b, with the ring atoms numbered 1 to 6]
Relative energies: summary of extensive calculations
1: the "canonical" oxo form; 2b: enol; 3a: imino

Theoretical results (relative energies, kcal/mol), e.g.:

Method                           1        2b      3a
CCSD(T)[f.c.]/cc-pVTZ// rfg      1.51     0.      1.49
B3LYP/6-311++G(2d,2p)           -0.54     0.      1.27
B3PW91/6-311++G(2d,2p)          -0.28     0.      1.74

Watch out, black-box users: DFT gives a qualitatively different picture!
Conclusion: the three low-energy tautomers of cytosine are within a range of ~2 kcal/mol.
Theoretical results, CCSD(T)[f.c.]/cc-pVTZ// rfg: 1 = 1.51, 2b = 0., 3a = 1.49.
And, rotamers: 2a is 0.8 kcal/mol above 2b; 3b is 2 kcal/mol above 3a.
The effect of entropy: differences between tautomers made even smaller (T = 298 K, kcal/mol)
a Geometries (moments of inertia) from CCSD/TZP, vibrational frequencies from MP2/TZP, electronic energies CCSD(T)/cc-pVTZ.
b B3LYP/6-311++G(2d,2p).
c Total Gibbs free energy from nuclear motions, including the constant contributions from translation and Hrot.
d Relative to tautomer 2b.
e After adding the electronic energies, from Tables 3 and 5, respectively, including the corrections for non-planarity from Table 4, see also text.
Note again the failure of DFT!
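How small free energy differences translate into tautomer ratios can be illustrated with a Boltzmann estimate. The sketch below is not the calculation behind the slides: as placeholder input it uses the CCSD(T) electronic energy differences quoted above (1.51, 0 and 1.49 kcal/mol for 1, 2b and 3a), whereas real populations would require the relative Gibbs free energies.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol K)


def population_ratios(relative_energies, temperature):
    """Boltzmann ratios, normalized to the most stable form.

    relative_energies: dict of label -> relative (free) energy in kcal/mol
    temperature: temperature in kelvin
    """
    lowest = min(relative_energies.values())
    rt = R_KCAL * temperature
    return {
        label: math.exp(-(energy - lowest) / rt)
        for label, energy in relative_energies.items()
    }


# Placeholder input: electronic energy differences, NOT Gibbs free energies.
energies = {"1": 1.51, "2b": 0.0, "3a": 1.49}

for temperature in (298.15, 500.0):
    ratios = population_ratios(energies, temperature)
    print(temperature, {k: round(v, 3) for k, v in ratios.items()})
```

Because the energy enters through an exponential, a shift of a few tenths of a kcal/mol changes the predicted ratios noticeably, which is why the accuracy target set out at the beginning matters.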
4. The mechanism of intramolecular proton transfer
Transition states
Tautomer pairs formamide - formamidic acid (FMD-FAC) and formamidine - formamidine (FIM-FIM). E(TS) in a.u., ΔE(TS) in kcal/mol.

Method / basis set             FMD-FAC                  FIM-FIM
                               E(TS)+167   ΔE(TS)       E(TS)+149   ΔE(TS)
B3LYP
  /6-31G(d,p)                  -2.82338    46.2         -0.93823    45.6
  /6-311++G(2d,2p)             -2.88299    48.1         -0.98977    47.7
  /6-311++G(3df,3pd)           -2.89097    47.7         -0.99666    47.2
MP2
  /6-31G(d,p)                  -2.34658    46.8         -0.48803    47.0
  /6-311++G(2d,2p)             -2.47062    47.4         -0.59244    47.6
  /6-311++G(3df,3pd)           -2.53860    45.5         -0.65601    45.5
  /pVTZ//~                     -2.53331    45.3         -0.64943    45.6
    Imag. frequency            (1894 cm⁻¹)              (1925 cm⁻¹)
  /aug-pVTZ//~                 -2.54854    45.3         -0.66421    45.6
  /pVQZ//aug-pVTZ              -2.59031    45.4         -0.70087    45.6
  /aug-pVQZ//aug-pVTZ          -2.59709    45.3         -0.70747    45.4
  /pV5Z//aug-pVTZ              -2.61135    45.3         -0.71972    45.5
  /aug-pV5Z//aug-pVTZ          -2.61443    45.3         -0.72260    45.4

Vibrational frequencies have been calculated at the MP2/pVTZ level. Each system has a single imaginary frequency, indicating that the TS is indeed a first-order saddle point.

Tautomer pairs formamide - formamidic acid and formamidine - formamidine, continued.

Method / basis set             FMD-FAC                  FIM-FIM
                               E(TS)+167   ΔE(TS)       E(TS)+149   ΔE(TS)
CCSD//MP2
  /aug-pVTZ                    -2.60032    50.0         -0.72427    50.7
  /pVQZ                        -2.67723    50.0         -0.79316    50.7
  /aug-pVQZ                    -2.68423    50.0         -0.80005    50.6
  /pV5Z                        -2.71257    50.1         -0.82606    50.8
  /aug-pV5Z                    -2.71611    50.1
CCSD(T)//MP2
  /aug-pVTZ                    -2.63290    47.2         -0.75677    47.8
  /pVQZ                        -2.71169    47.1         -0.82743    47.8
  /aug-pVQZ                    -2.71926    47.1         -0.83486    47.6
  /pV5Z                        -2.74835    47.2         -0.86156    47.8
  /aug-pV5Z                    -2.75210    47.1

TS in cytosine (amine - imine): ~40 kcal/mol
Conclusion: all barriers are far too high for proton transfer to occur.
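To give a feeling for what "far too high" means, a rough rate estimate from transition state theory can be made with the Eyring expression k = (k_B T / h) exp(-ΔG‡ / RT). The sketch below is an order-of-magnitude illustration only: it assumes that the computed electronic barrier can stand in for the activation free energy and it ignores tunnelling, neither of which is claimed in the slides.

```python
import math

K_B = 1.380649e-23     # Boltzmann constant, J/K
H = 6.62607015e-34     # Planck constant, J s
R_KCAL = 1.987204e-3   # gas constant, kcal/(mol K)
SECONDS_PER_YEAR = 3600.0 * 24.0 * 365.25


def eyring_rate(barrier_kcal, temperature):
    """Unimolecular rate constant in 1/s from the Eyring expression.

    barrier_kcal is used in place of the activation free energy
    (an assumption made for this estimate).
    """
    prefactor = K_B * temperature / H
    return prefactor * math.exp(-barrier_kcal / (R_KCAL * temperature))


def half_life_years(rate):
    """Half-life of a first-order process, in years."""
    return math.log(2.0) / rate / SECONDS_PER_YEAR


for barrier in (40.0, 45.0):
    k = eyring_rate(barrier, 298.15)
    print(barrier, k, half_life_years(k))
```

At room temperature a barrier of this size corresponds to half-lives far beyond any laboratory time scale, so unassisted proton transfer in the isolated molecule can be ruled out on this estimate.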
5. The effect of water
• a) Affects the relative energies
• b) Affects the TS barrier
Model: supermolecule, with water molecule(s) added explicitly
a) Relative energies of cytosine monohydrates
Questions:
1) favorite binding positions?
2) binding energy difference between tautomers?
The preferred binding site is the same in all three tautomers.
[Figure: optimized structures of the monohydrates]
The keto form binds water significantly more strongly.
Dissociation energies of cytosine monohydrates (stabilization by water), kcal/mol.
A single water molecule already makes the keto form 1 energetically more stable than the hydroxy form 2b.
b) The TS barrier: water as a catalyst
Test: formamide plus water; the water molecule may mediate the proton transfer.
Formamide ↔ formamidic acid monohydrate
For comparison, remember: without water, ΔE ~ 10.5 and ΔE(TS) ~ 45 kcal/mol.
Cytosine 1 ↔ 3a: TS with water
Quantum chemical transition state barrier for Cytosine.H2O and Cytosine.2H2O (kcal/mol)

Method               Basis set        Cyt.H2O    Cyt.2H2O
DFT (B3LYP)          6-31G(d,p)       15.8 a     15.8
                     6-311++G(d,p)    18.1       17.7
                     cc-pVTZ          17.7       17.8
MP2                  6-31G(d,p)       18.3       19.3
                     6-311++G(d,p)    19.6       19.7
                     cc-pVTZ          17.1       18.0
CCSD(T)//MP2/pVTZ    aug-pVDZ         19.5       --
                     cc-pVTZ          18.8       --

a Imaginary frequency: 1577 cm⁻¹
The TS barrier is reduced by more than a factor of 2; a second H2O makes no further difference.
6. The elusive question of cytosine tautomers (experimental and theoretical information)
Solid state and aqueous solution: general agreement that only the "canonical" keto form is present.
In the gas phase, however, the hydroxy form dominates.
Experimental data (spectroscopy; temperature poorly defined, 200-300 °C)
Estimated ratios for amino-hydroxy : amino-oxo : imino-oxo (2 : 1 : 3a)
Matrix isolation IR (Radchenko, .. Blagoi, 1984; Nowak, Lapinski, Fulara, 1989; Szczesniak, .. Person, 1988)
Theor., from ΔG above
1.0 : 0.4 : 0.4
1.0 (b+a rotam.) : 0.5 : 0.1
Discrepancy about the "rare" imino form
Molecular beam MW (Brown et al., 1989): 1.0 (b rotam. only) : 1.0 : 0.25
Own experiments 1.: matrix isolation IR spectrum
C1 (keto): 1730 (C=O), 1660 (C=C)
C3a (imino): 1680 (C=N)
Experimental details: cytosine sample from Sigma-Aldrich (solid sample, 99% purity), evaporation at 155 °C; matrix: argon (99.9997%) and krypton (99.998%). UV spectrometer: Varian Cary 3E UV-VIS (1 nm resolution); IR spectrometer: Bruker FTS 55 (1 cm⁻¹ resolution).
X-H stretching range
C2b (enol): 3620 (O-H)
C3a (imino): 3490 (N1-H), 3440 (N3-H)
UV and ionization spectra
Disturbing results in the literature:
1. Previous UV spectra were restricted to the solid state or solution, and all interpretations were based on the keto form.
2. Sophisticated, elegant new multiphoton experiments in the gas phase (molecular beams) yield contradictory results.
Theoretical calculation of the UV spectrum
We have carried out extensive calculations on all three tautomers.
Method
Electronic excited states: Equation of Motion Coupled Cluster (EOM-CC)
Vibrational structure: Linear Vibronic Coupling (LVC)
While previous computations were restricted to vertical excitations, we have performed the first simulation of the complete spectrum of cytosine, using the Linear Vibronic Coupling (LVC) method.[12] In this method the full (electronic and nuclear) wave function is evaluated as a coordinate-dependent combination of the excited electronic wave functions at a reference geometry, normally the ground state equilibrium. The calculations require the derivatives of the excited state energy with respect to the ground state normal coordinates. The nuclear wave functions are expanded in the harmonic oscillator basis of the ground state. We used in the LVC model the four lowest excited electronic states (Table 1) and fourteen vibrational modes with several quanta on each of them.
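The role of the energy derivatives along the ground state normal coordinates can be seen in the simplest limit of such a model: one excited state, one mode, and no coupling between electronic states. With the excited state written as E_v + (ω/2)(P² + Q²) + κQ in dimensionless ground state coordinates, the absorption spectrum becomes a Poisson progression with parameter S = κ²/(2ω²), starting at the 0-0 line E_v - κ²/(2ω). The sketch below implements only this textbook limit, not the four-state, fourteen-mode model described above; all numerical parameters are hypothetical.

```python
import math


def displaced_oscillator_sticks(e_vertical, omega, kappa, n_max=30):
    """Stick spectrum of a single state with linear coupling to one mode.

    e_vertical: vertical excitation energy (eV)
    omega:      ground state harmonic frequency (eV)
    kappa:      gradient of the excited state energy along the
                dimensionless normal coordinate (eV)
    Returns a list of (energy, intensity) pairs.
    """
    s = kappa ** 2 / (2.0 * omega ** 2)            # Poisson parameter
    e_00 = e_vertical - kappa ** 2 / (2.0 * omega)  # 0-0 transition energy
    sticks = []
    for n in range(n_max + 1):
        intensity = math.exp(-s) * s ** n / math.factorial(n)
        sticks.append((e_00 + n * omega, intensity))
    return sticks


# Hypothetical parameters, chosen only to produce a visible progression.
sticks = displaced_oscillator_sticks(e_vertical=4.6, omega=0.10, kappa=0.15)
for energy, intensity in sticks[:6]:
    print(round(energy, 3), round(intensity, 4))
```

In this limit the intensity-weighted mean of the progression coincides with the vertical excitation energy, which is one reason why vertical energies alone cannot reproduce a band shape. Coupling between electronic states, as in the full model, removes the closed-form solution and requires diagonalizing the Hamiltonian in the harmonic oscillator basis.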
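Own experiments 2.: matrix isolation UV spectrum
[Figure: experimental spectrum compared with theory for the keto form only; axes: intensity vs. energy (eV)]
[Figure: experimental spectrum compared with theory for a mixture of the 3 tautomers; axes: intensity vs. energy (eV)]
[Figure: theory, keto only, compared with theory, mixture]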
How may tautomers form during experiments?
Given that crystalline cytosine contains only the keto form, the basic question is: how do the tautomers form under the experimental conditions? As mentioned above, all quantum chemical calculations predict a high barrier of about 40 kcal/mol for the unassisted transformation of the cytosine monomer. Traces of water may play a crucial role, as water reduces the barrier by about a half [7]. In addition, the interesting possibility of bimolecular tautomerization was brought up recently by Rodgers [8,9]. We have made our own calculations to test this idea. Three H-bonded forms of 1:1 are shown in Fig. 3. From dimer a new dimer 2b:3a can be derived, leads to 2b:2b and to 3a:3a [5]. Each tautomerization has its own transition state barrier, as indicated under the picture.
Three H-bonded dimers derived from 1.
E(TS) for the three dimers, in the order shown: 6.3, 4.9 and 7.7 kcal/mol.
From dimer a new dimer 2b:3a can be derived, leads to 2b:2b and to 3a:3a. The lowest TS leads to the hydroxy form 2b. The transition states that lead, partly or fully, to the imino form 3a lie significantly higher. Thus, this model may explain why the imino form appears in the gas phase with a much smaller abundance than expected.
7. The mechanism of proton transfer
Ab initio simulation of dynamics
The notion of reaction mechanisms is based on the Born-Oppenheimer (B-O) approximation: atoms move on a potential energy surface (PES) defined by the electronic energy as a function of nuclear positions. In the simplest models, reactions follow the minimum energy pathway (MEP), going through a transition state (TS). The MEP expressed in mass-weighted Cartesians is referred to as the intrinsic reaction coordinate, IRC. Recent computations have shown that reactions may follow a route totally different from the IRC (W.L. Hase, Science 2002; M. Dupuis, Science 2003).
True dynamics calculations require knowledge of the complete PES, and recent methods generate it "on the fly". The well-known Car-Parrinello method is the most efficient computationally because the electronic wave function is "propagated", and not optimized, at the trajectory points. As a consequence, the system moves close to, but not exactly on, the B-O surface. In B-O dynamics, the wave function of a QC method is fully optimized at each step along the trajectory. The energy and its first derivatives are determined from the ab initio wave function, and the atomic movements are calculated from them classically, using Verlet's algorithm. This is the approach adopted here. The QC method was DFT(B3LYP)/3-21G.
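The classical propagation step can be sketched as follows. The example uses the velocity form of the Verlet algorithm (the slides do not say which variant was used) and replaces the ab initio gradient by the force of a one-dimensional harmonic model potential; the function names, units and parameters are hypothetical.

```python
def velocity_verlet(x, v, mass, force, dt, n_steps):
    """Propagate one coordinate with the velocity Verlet scheme.

    force(x) plays the role of the negative energy gradient, which in
    B-O dynamics would come from a converged electronic structure
    calculation at every step.
    Returns the list of (position, velocity) pairs along the trajectory.
    """
    trajectory = [(x, v)]
    f = force(x)
    for _ in range(n_steps):
        v_half = v + 0.5 * dt * f / mass   # half-step velocity update
        x = x + dt * v_half                # full-step position update
        f = force(x)                       # new force at the new position
        v = v_half + 0.5 * dt * f / mass   # complete the velocity update
        trajectory.append((x, v))
    return trajectory


# Model potential standing in for the PES: V(x) = 0.5 * K * x**2.
K = 1.0


def harmonic_force(x):
    return -K * x


def total_energy(x, v, mass):
    return 0.5 * mass * v * v + 0.5 * K * x * x


traj = velocity_verlet(x=1.0, v=0.0, mass=1.0, force=harmonic_force,
                       dt=0.01, n_steps=1000)
e_start = total_energy(*traj[0], 1.0)
e_end = total_energy(*traj[-1], 1.0)
print(e_start, e_end)
```

Only one force evaluation is needed per step, which matters when each evaluation is a full quantum chemical calculation; the near-constant total energy is the usual check on the time step.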
Ab initio simulation of cytosine tautomerization
Note the synchronous change of the relevant N-H bonds.
The End Acknowledgement. Financial support has been provided by Hungarian science grants NKTH-OTKA-A07, no. K 68427 and OTKA K72423.
Before drawing conclusions, we have to carefully analyze the available experimental information. Among numerous studies on cytosine, the UV spectra in trimethylphosphate solution[8], in films[9] and for the single crystal and aqueous solution[10] are of special interest for the present work. All spectra show broad, envelope-type bands; thus, the major question has been the number of individual electronic transitions. Although the deconvolution seems rather uncertain, it is common to all studies that the UV spectrum is discussed in terms of four electronic transitions. Specifically, a classic circular dichroism (CD) study on cytosine nucleosides from Eyring's group[11] relates the spectrum to benzene: "The results give […] conclusive evidence for four electronic transitions in the cytosine bases above 190 nm which may be related to the B2u, B1u and E1u bands of benzene." Other studies relied on semiempirical quantum chemical calculations to resolve the spectrum.[9b] A linear dichroism (LD) study on the single crystal gave better resolution of the absorption bands by making use of the variation of intensities as a function of the polarization direction.[10] Also, the polarization behavior indicates that no A'' transition gives an appreciable contribution to the spectrum. This study also reports the aqueous spectrum that we have redrawn in Figure 1a. The authors quote four excitations, with wavenumbers 37.5, 43.0, 45.2 and 50.0 × 10³ cm⁻¹, all of π-π* type.

While previous computations were restricted to vertical excitations, we have performed the first simulation of the complete spectrum of cytosine, using the Linear Vibronic Coupling (LVC) method.[12] In this method the full (electronic and nuclear) wave function is evaluated as a coordinate-dependent combination of the excited electronic wave functions at a reference geometry, normally the ground state equilibrium. The calculations require the derivatives of the excited state energy with respect to the ground state normal coordinates. The nuclear wave functions are expanded in the harmonic oscillator basis of the ground state. We used in the LVC model the four lowest excited electronic states (Table 1) and fourteen vibrational modes with several quanta on each of them. In addition, applying a new development to be published separately, the effect of non-adiabatic couplings was also included. The theoretical and experimental spectra are shown in Figure 1. The resolution of the simulated spectra was deliberately made low, to conform to the experimental spectrum. For details see the Supplement.
There is, however, a serious discrepancy with a REMPI study of laser-desorbed, jet-cooled cytosine.[14] In the limited range studied, 31000-38000 cm⁻¹, two vibronic regions were found and assigned to two tautomers! Region R1, 31800-32100 cm⁻¹, should belong to the keto form and region R2, 36000-37600 cm⁻¹, to the enol form. This interpretation would completely contradict our theoretical results: for the relevant transition I we have the range 36000-40000 cm⁻¹, and accepting R1 would mean that our calculation is off by ~1 eV. This seems highly improbable: for four analogous π-π* excitations in pyrimidine, the differences are 0.3 to 0.4 eV[15] at the CCSD level and even less if triples are considered. The REMPI study on cytosine is supported by spectral hole-burning[14b], but the vibrational structure of R1, with splittings of 40-50 cm⁻¹, is totally unexplained. It may be added that region R2, rather than R1, would be a reasonable candidate for the keto form in our results.

By contrast, two other experiments support our results. First, a recent study[16] measured the resonance Raman spectra of cytosine around its 267 nm absorption (band I above). When scanning the exciting laser wavelength from 290 nm to 244 nm, the spectra showed no change, "indicating that the Raman spectra are enhanced by a single electronic transition." At 244 nm a significant change in the excitation profile was observed, suggesting that "enhancement of the vibrations by the ~200 nm absorption band is becoming a factor". Thus, although the authors do not explicitly discuss the number of electronic transitions, their interpretation of the spectra is based on only two transitions, 267 and ~200 nm, corresponding to I and IV above.

Second, transition moments, an integral part of the calculations, can be compared with linear dichroism (LD) measurements on single crystals,[10,13] as well as on polycrystalline cytosine embedded in stretched polyvinyl alcohol sheets[9,17] (Table 2). The experimental values being by nature rather uncertain, we confirm ref.[17b] as opposed to ref.[17a] on band I. For band IV, theory clearly selects between the alternatives in ref.[10]. These LD measurements also give information about the spectrum interpretation discussed above: all experiments agree that the four absorptions involve transition moments in the plane of the molecule. In the theoretical results (Table 1), two of the four excitations are of type A'', involving transition moments perpendicular to the molecular plane.
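The text switches between wavenumbers (cm⁻¹), wavelengths (nm) and energies (eV), so a small conversion helper may make the comparisons easier to follow. In the sketch below, the four wavenumbers are the aqueous-solution band positions quoted from ref. [10] and the REMPI and theoretical ranges are those given above; labelling the middle two bands II and III is an assumption based on their order.

```python
CM1_PER_EV = 8065.544  # wavenumber (cm^-1) corresponding to 1 eV


def wavenumber_to_nm(wavenumber_cm1):
    """Wavelength in nm for a wavenumber in cm^-1."""
    return 1.0e7 / wavenumber_cm1


def wavenumber_to_ev(wavenumber_cm1):
    """Photon energy in eV for a wavenumber in cm^-1."""
    return wavenumber_cm1 / CM1_PER_EV


# Band positions quoted from the aqueous spectrum (cm^-1).
bands = {"I": 37500.0, "II": 43000.0, "III": 45200.0, "IV": 50000.0}
for label, wavenumber in bands.items():
    print(label,
          round(wavenumber_to_nm(wavenumber), 1), "nm",
          round(wavenumber_to_ev(wavenumber), 2), "eV")

# Gap between REMPI region R1 and the computed range for transition I.
gap_low = wavenumber_to_ev(36000.0 - 31800.0)
gap_high = wavenumber_to_ev(40000.0 - 32100.0)
print(round(gap_low, 2), round(gap_high, 2))
```

The first and last bands convert to about 267 nm and 200 nm, the two wavelengths that appear in the resonance Raman discussion, and the gap between region R1 and the computed range for transition I spans roughly half an eV to one eV.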