Non-canonical base pairing: Difference between revisions


Revision as of 15:58, 13 June 2023 by OhanaUnited (→History: replace)		Revision as of 16:07, 13 June 2023 by OhanaUnited (→Structure: continue replacement)
Line 24:		Line 24:

	Considering the immense importance of the non-canonical base pairs in RNA structure, folding and functions, researchers from multiple domains – biology, chemistry, physics, mathematics, computer science, etc., have joined in the effort to understand their structure, dynamics, function and their consequences. The complexities associated with experimental handling of RNA further underline the importance of diverse theoretical inputs towards addressing these issues.		Considering the immense importance of the non-canonical base pairs in RNA structure, folding and functions, researchers from multiple domains – biology, chemistry, physics, mathematics, computer science, etc., have joined in the effort to understand their structure, dynamics, function and their consequences. The complexities associated with experimental handling of RNA further underline the importance of diverse theoretical inputs towards addressing these issues.

			== Types of non-canonical base pairs ==
			Two bases may approach each other in various ways, eventually leading to specific molecular recognition mediated by base pairing interactions, which are often non-canonical, in addition to strong ]. These are essential for the process of RNA single strands folding into three-dimensional structures. Early studies on such unusual base pairs by Jiri Sponer, Pavel Hobza and their group were hampered by the lack of unambiguous, systematic naming schemes.<ref>{{Cite journal\|last=Šponer\|first=Jiří\|last2=Leszczynski\|first2=Jerzy\|last3=Hobza\|first3=Pavel\|date=1996-01\|title=Structures and Energies of Hydrogen-Bonded DNA Base Pairs. A Nonempirical Study with Inclusion of Electron Correlation\|url=http://dx.doi.org/10.1021/jp952760f\|journal=The Journal of Physical Chemistry\|volume=100\|issue=5\|pages=1965–1974\|doi=10.1021/jp952760f\|issn=0022-3654}}</ref> While some of the observed base pairs were assigned names following the ] nomenclature scheme,<ref>{{Cite book\|url=http://dx.doi.org/10.1007/978-1-4612-5190-3_1\|title=Principles of Nucleic Acid Structure\|last=Saenger\|first=Wolfram\|date=1984\|publisher=Springer New York\|isbn=978-0-387-90761-1\|location=New York, NY\|pages=1–8\|doi=10.1007/978-1-4612-5190-3}}</ref> others were arbitrarily assigned names by different researchers. 
It may be mentioned that some attempts were also made by ] and coworkers to classify base-base association in terms of adjacency of bases, through either pairing or stacking interactions.<ref>{{Cite journal\|last=Sykes\|first=Michael T.\|last2=Levitt\|first2=Michael\|date=2005-08\|title=Describing RNA Structure by Libraries of Clustered Nucleotide Doublets\|url=http://dx.doi.org/10.1016/j.jmb.2005.06.024\|journal=Journal of Molecular Biology\|volume=351\|issue=1\|pages=26–38\|doi=10.1016/j.jmb.2005.06.024\|issn=0022-2836}}</ref> There was clearly a need for a classification scheme for different types of non-canonical base pairs, which could comprehensively and unambiguously handle newer variants coming up due to the rapid increase in the sampling space. Different approaches which have evolved in response to this need are discussed below.

			=== Classification based on hydrogen bonding ===
			{\| class="wikitable sortable mw-collapsible"
			\|+
			!Interacting edges
			!Glycosidic bond orientation
			!Nomenclature
			!Local strand direction
			\|-
			\|Watson-Crick/Watson-Crick
			\|Cis
			\|cWW or ''cis'' Watson-Crick/Watson-Crick
			\|Antiparallel
			\|-
			\|Watson-Crick/Watson-Crick
			\|Trans
			\|tWW or ''trans'' Watson-Crick/Watson-Crick
			\|Parallel
			\|-
			\|Watson-Crick/Hoogsteen
			\|Cis
			\|cWH or ''cis'' Watson-Crick/Hoogsteen
			\|Parallel
			\|-
			\|Watson-Crick/Hoogsteen
			\|Trans
			\|tWH or ''trans'' Watson-Crick/Hoogsteen
			\|Antiparallel
			\|-
			\|Watson-Crick/Sugar edge
			\|Cis
			\|cWS or ''cis'' Watson-Crick/Sugar edge
			\|Antiparallel
			\|-
			\|Watson-Crick/Sugar edge
			\|Trans
			\|tWS or ''trans'' Watson-Crick/Sugar edge
			\|Parallel
			\|-
			\|Hoogsteen/Hoogsteen
			\|Cis
			\|cHH or ''cis'' Hoogsteen/Hoogsteen
			\|Antiparallel
			\|-
			\|Hoogsteen/Hoogsteen
			\|Trans
			\|tHH or ''trans'' Hoogsteen/Hoogsteen
			\|Parallel
			\|-
			\|Hoogsteen/Sugar edge
			\|Cis
			\|cHS or ''cis'' Hoogsteen/Sugar edge
			\|Parallel
			\|-
			\|Hoogsteen/Sugar edge
			\|Trans
			\|tHS or ''trans'' Hoogsteen/Sugar edge
			\|Antiparallel
			\|-
			\|Sugar edge/Sugar edge
			\|Cis
			\|cSS or ''cis'' Sugar edge/Sugar edge
			\|Antiparallel
			\|-
			\|Sugar edge/Sugar edge
			\|Trans
			\|tSS or ''trans'' Sugar edge/Sugar edge
			\|Parallel
			\|}


			The nucleotide bases are nearly planar heterocyclic moieties, with ], and with several hydrogen bonding donors and acceptors distributed around the edges, usually designated as W, H or S, based on whether the edges can respectively be involved in forming a Watson-Crick base pair or a Hoogsteen base pair, or whether the edge is adjacent to the C2’-OH group of the ribose sugar. ] and Neocles Leontis<ref name="leontis">{{Cite journal\|last=Leontis\|first=Neocles B.\|last2=Westhof\|first2=Eric\|date=2001-04\|title=Geometric nomenclature and classification of RNA base pairs\|url=http://www.journals.cambridge.org/abstract_S1355838201002515\|journal=RNA\|volume=7\|issue=4\|pages=499–512\|doi=10.1017/S1355838201002515\|pmc=PMC1370104\|pmid=11345429\|issn=1355-8382}}</ref> used these edge designations to propose a nomenclature scheme for base pairs that is now widely accepted. The hydrogen bonding donor and acceptor atoms could thus be classified in terms of their positioning along their three edges, namely the Watson-Crick or W edge, the Hoogsteen or H edge, and the Sugar or S edge. Since base pairs are mediated through hydrogen bonding interactions based on hydrogen bond donor-acceptor complementarity, this, in turn, provides a convenient bottom-up approach towards classifying base pair geometries in terms of the respective interacting edges of the participating bases. It may be noted that, unlike the Hoogsteen edge of purines, the corresponding edges of the pyrimidine bases do not have any polar hydrogen bond acceptor atom such as N7. 
However, these bases have C—H groups at their C6 and C5 atoms, which can act as ], as proposed by ].<ref>{{Cite journal\|last=Stombaugh\|first=Jesse\|last2=Zirbel\|first2=Craig L.\|last3=Westhof\|first3=Eric\|last4=Leontis\|first4=Neocles B.\|date=2009-04-01\|title=Frequency and isostericity of RNA base pairs\|url=https://academic.oup.com/nar/article-lookup/doi/10.1093/nar/gkp011\|journal=Nucleic Acids Research\|language=en\|volume=37\|issue=7\|pages=2294–2312\|doi=10.1093/nar/gkp011\|issn=0305-1048\|pmc=PMC2673412\|pmid=19240142}}</ref> The Hoogsteen edge, hence, is also called the Hoogsteen/C-H edge in a unified scheme for designating equivalent positions of purines as well as pyrimidines. Thus, the total number of possible edge combinations involved in base pairing is 6, namely Watson-Crick/Watson-Crick (or W:W), Watson-Crick/Hoogsteen (or W:H), Watson-Crick/Sugar (or W:S), Hoogsteen/Hoogsteen (or H:H), Hoogsteen/Sugar (or H:S) and Sugar/Sugar (or S:S).

			In the canonical Watson-Crick base pairs, the ]s attaching the N9 (of purine) and N1 (of pyrimidine) of the paired bases to their respective sugar moieties are on the same side of the mean hydrogen bonding axis, and are hence called Cis Watson-Crick base pairs. However, the relative orientations of the two sugars may also be Trans with respect to the mean hydrogen bonding direction, giving rise to a distinct Trans Watson-Crick geometric class, consisting of species which were earlier referred to as reverse Watson-Crick base pairs according to ] nomenclature. The possibility of both Cis and Trans glycosidic bond orientations for each of the 6 possible edge combinations gives rise to 12 geometric families of base pairs (see table).

			According to the Leontis-Westhof scheme, any base pair can be systematically and unambiguously named using the syntax <Base_1: Base_2><Edge_1: Edge_2><Glycosidic Bond Orientation> where Base_1 and Base_2 carry information on respective base identities and their nucleotide number. This nomenclature scheme also allows us to enumerate the total number of distinct possible base pair types. For a given glycosidic bond orientation, say ''Cis'', the four naturally occurring bases each have three possible edges for formation of base pairs, giving rise to 12 such possible base pairing edge identities, each of which can in principle form base pairing with any edge of another base, irrespective of complementarity. This gives rise to a 12×12 ] displaying 144 pairwise permutations of base pairing edge identities, where, apart from the 12 diagonal entries, others include repeat combinations. Thus, there are 78 (= 12 + 132/2) unique entries corresponding to the ''cis'' glycosidic bond orientation. Considering both ''cis'' and ''trans'' glycosidic bond orientations, the number of base pair types amounts to 156.

			Of course, this number 156 is only an indicator. It includes base-edge combinations where base pairs cannot be formed due to the absence of hydrogen bond donor-acceptor complementarity. For example, potential pairing between two guanine residues utilizing their Watson-Crick edges in ''cis'' form (cWW) is not supported by hydrogen bonding donor-acceptor complementarity, and is not observed with a consistent hydrogen bonding pattern. This method of enumerating the possible number of distinct base pair types also does not consider possibilities of multimodality or bifurcated base pairs, or even instances of base pairs involving modified bases, protonated bases and water or ion mediation in hydrogen bond formation. Two ] bases can form trans Watson-Crick/Watson-Crick (tWW) base pairing with their neutral as well as hemiprotonated forms, possibly both, giving rise to the ]. However, both C(+):C tWW and C:C tWW are counted as one type among the 156 possible types.
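
The counting argument above can be reproduced with a short script. This is only an illustrative sketch: the variable names are chosen here, and, as the text notes, the enumeration ignores whether a combination is actually supported by donor-acceptor complementarity.

```python
from itertools import combinations_with_replacement

BASES = ("A", "C", "G", "U")      # the four naturally occurring RNA bases
EDGES = ("W", "H", "S")           # Watson-Crick, Hoogsteen (C-H), Sugar
ORIENTATIONS = ("cis", "trans")   # glycosidic bond orientation

# Unordered edge combinations: W:W, W:H, W:S, H:H, H:S, S:S
edge_combinations = list(combinations_with_replacement(EDGES, 2))

# Each edge combination, taken in cis or trans, is one geometric family
geometric_families = [
    (orientation, e1, e2)
    for orientation in ORIENTATIONS
    for e1, e2 in edge_combinations
]

# 4 bases x 3 edges = 12 base pairing edge identities
base_edges = [(base, edge) for base in BASES for edge in EDGES]

# Unordered pairs of identities: 12 diagonal entries + 132/2 off-diagonal entries
pairs_per_orientation = list(combinations_with_replacement(base_edges, 2))

total_types = len(pairs_per_orientation) * len(ORIENTATIONS)

print(len(edge_combinations), len(geometric_families))  # 6 12
print(len(pairs_per_orientation), total_types)          # 78 156
```

Treating the 12×12 table as symmetric is what turns 144 ordered entries into 78 unordered ones; doubling for ''cis'' and ''trans'' gives 156.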

			=== Classification based on isostericity ===
			Although there are significant differences between the structures of non-canonical base pairs belonging to different geometric families, some base pairs within the same geometric family have been found to substitute for each other without disrupting the overall structure. These base pairs are called isosteric base pairs. Isosteric base pairs always belong to the same geometric family, but not all base pairs in a given geometric family are isosteric. Two base pairs are called isosteric if they meet the following three criteria: (i) the C1′–C1′ distances should be similar; (ii) the paired bases should be related by similar rotations in 3D space; and (iii) H-bond formation should occur between equivalent base positions.<ref name=":22">{{Cite journal\|last=Leontis\|first=N. B.\|date=2002-08-15\|title=The non-Watson-Crick base pairs and their associated isostericity matrices\|url=http://dx.doi.org/10.1093/nar/gkf481\|journal=Nucleic Acids Research\|volume=30\|issue=16\|pages=3497–3531\|doi=10.1093/nar/gkf481\|issn=1362-4962}}</ref><ref name=":32">{{Cite book\|url=http://dx.doi.org/10.1007/978-3-540-70840-7_1\|title=Non-Protein Coding RNAs\|last=Nasalean\|first=Lorena\|last2=Stombaugh\|first2=Jesse\|last3=Zirbel\|first3=Craig L.\|last4=Leontis\|first4=Neocles B.\|publisher=Springer Berlin Heidelberg\|isbn=978-3-540-70833-9\|location=Berlin, Heidelberg\|pages=1–26\|doi=10.1007/978-3-540-70840-7_1\|chapter=RNA 3D Structural Motifs: Definition, Identification, Annotation, and Database Searching\|editor1-first= Nils G. \|editor1-last=Walter\|editor2-first=Sarah A.\|editor2-last= Woodson\|editor3-first=Robert T. 
\|editor3-last=Batey\|date=2009}}</ref> A detailed approach towards quantifying isostericity, in terms of an IsoDiscrepancy Index (IDI), which can facilitate reliable prediction regarding which base pair substitutions can potentially occur in conserved motifs, was formulated by Neocles Leontis, Craig Zirbel and Eric Westhof.<ref name=":42">{{Cite journal\|last=Stombaugh\|first=Jesse\|last2=Zirbel\|first2=Craig L.\|last3=Westhof\|first3=Eric\|last4=Leontis\|first4=Neocles B.\|date=2009-02-24\|title=Frequency and isostericity of RNA base pairs\|url=http://dx.doi.org/10.1093/nar/gkp011\|journal=Nucleic Acids Research\|volume=37\|issue=7\|pages=2294–2312\|doi=10.1093/nar/gkp011\|issn=0305-1048}}</ref> Based on IDI values and available base pair structural data, the group maintains a curated online base pair catalogue and an updated set of Isostericity Matrices (IM) corresponding to each of the 12 geometric families. Using this resource, one can comprehensively classify different types of canonical and non-canonical base pairs in terms of their positions in the Isostericity Matrices. This approach, for example, indicates that the four base pair types: A:U cWW, U:A cWW, G:C cWW and C:G cWW are isosteric to each other. Thus, as also confirmed by detailed sequence comparisons, double mutations altering A:U cWW to U:A cWW or even to G:C cWW may not disturb the structure, and, unless stability issues are involved, the function of the related RNA. It was also found that the ] G:U cWW base pair is not really isosteric to U:G cWW base pair, indicating that such double mutations may significantly affect the functioning of the corresponding RNA. On the other hand, some of the base pairs which are stabilized involving Sugar edge of the bases are mutually isosteric.


			=== Classification based on local strand direction ===
			It may be noted here that because of the geometric relationship of the bases with the sugar phosphate backbone, these 12 geometric families of base pairs are associated with two possible local strand orientations, namely parallel and antiparallel. For the 6 families with edge combinations involving Watson-Crick and Sugar edges, W:W, W:S and S:S, ''cis'' and ''trans'' families are respectively associated with antiparallel and parallel 5' to 3' local strand direction. Introduction of the Hoogsteen edge, as one of the partners in the combination, causes an inversion in the relationship. Thus, for W:H and H:S, ''cis'' and ''trans'' respectively correspond to parallel and antiparallel local strand orientation. As expected, when both the edges are H, a double inversion is observed, and H:H ''cis'' and ''trans'' correspond respectively to antiparallel and parallel local strand orientations. The annotation of local strand orientation in terms of parallel and antiparallel directions helps to understand which faces of the individual bases can be seen for a given base pair from the 5’- or the 3’ sides. This annotation also helps in classifying the 12 geometries into two groups of 6 each, where the geometries can potentially interconvert within each group, by in-plane relative rotation of the bases. However, one should note that the above theory is applicable only when the glycosidic torsion angles of both the nucleotide residues are ''anti''. Notably, crystallographic observations<ref>{{Cite journal\|last=Sokoloski\|first=J. E.\|last2=Godfrey\|first2=S. A.\|last3=Dombrowski\|first3=S. E.\|last4=Bevilacqua\|first4=P. 
C.\|date=2011-08-26\|title=Prevalence of syn nucleobases in the active sites of functional RNAs\|url=http://dx.doi.org/10.1261/rna.2759911\|journal=RNA\|volume=17\|issue=10\|pages=1775–1787\|doi=10.1261/rna.2759911\|issn=1355-8382}}</ref> and energetic<ref>{{Cite journal\|last=Reichert\|first=J.\|date=2002-01-01\|title=The IMB Jena Image Library of Biological Macromolecules: 2002 update\|url=http://dx.doi.org/10.1093/nar/30.1.253\|journal=Nucleic Acids Research\|volume=30\|issue=1\|pages=253–254\|doi=10.1093/nar/30.1.253\|issn=1362-4962}}</ref> considerations indicate that ''syn'' glycosidic torsions are also quite possible. Hence the above classification of strand directions as parallel or antiparallel does not, by itself, always provide a complete understanding.
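
The inversion rule described above can be written as a parity check. The function below is a hypothetical helper written for illustration, not part of any published tool, and it carries the same assumption as the text: both glycosidic torsions are ''anti''.

```python
def local_strand_orientation(family: str) -> str:
    """Return the local strand orientation for a family label such as 'cWW' or 'tHS'.

    Assumes both nucleotides have anti glycosidic torsions; the rule does
    not hold when either residue is syn.
    """
    if len(family) != 3:
        raise ValueError(f"expected a three-character label, got {family!r}")
    orientation, edges = family[0], family[1:]
    if orientation not in ("c", "t") or any(e not in ("W", "H", "S") for e in edges):
        raise ValueError(f"unrecognized family label: {family!r}")

    # cis pairs on W/S edges are antiparallel; switching to trans inverts
    # this once, and every Hoogsteen edge inverts it once more.
    inversions = (1 if orientation == "t" else 0) + edges.count("H")
    return "antiparallel" if inversions % 2 == 0 else "parallel"


for label in ("cWW", "tWW", "cWH", "tWH", "cHH", "tHH"):
    print(label, local_strand_orientation(label))
```

An even number of inversions leaves the antiparallel arrangement of cWW unchanged, which is why H:H behaves like W:W while W:H and H:S are reversed.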


			Various functional RNA molecules are stabilized, in their specific folded pattern, by both canonical and non-canonical base pairs. Most tRNA molecules, for example, are known to have four short double helical segments, giving rise to a cloverleaf-like two-dimensional structure. The three-dimensional structure of tRNA, however, takes an L-shape. This is mediated by several non-canonical base pairs and base triplets. The D-loop and TψC loop are held together by several such base pairs. Many varieties of non-canonical base pairs can be browsed through websites such as NDB,<ref name=":52">{{Cite web\|url=http://ndbserver.rutgers.edu/ndbmodule/services/BPCatalog/bpCatalog.html\|title=RNA Basepair Catalog\|website=ndbserver.rutgers.edu\|access-date=2019-12-17}}</ref> RNABPDB,<ref name=":62">{{Cite web\|url=http://hdrnas.saha.ac.in/rnabpdb/\|title=RNA Base Pair Database(RNABPDB)\|website=hdrnas.saha.ac.in\|access-date=2019-12-17}}</ref> and RNABP COGEST.<ref name=":72">{{Cite journal\|last=Bhattacharya\|first=Sohini\|last2=Mittal\|first2=Shriyaa\|last3=Panigrahi\|first3=Swati\|last4=Sharma\|first4=Purshotam\|last5=S. P.\|first5=Preethi\|last6=Paul\|first6=Rahul\|last7=Halder\|first7=Sukanya\|last8=Halder\|first8=Antarip\|last9=Bhattacharyya\|first9=Dhananjay\|date=2015-01-01\|title=RNABP COGEST: a resource for investigating functional RNAs\|url=https://academic.oup.com/database/article/doi/10.1093/database/bav011/2433143\|journal=Database\|language=en\|volume=2015\|doi=10.1093/database/bav011\|issn=1758-0463\|pmc=PMC4360618\|pmid=25776022}}</ref>

			It may be noted that the above scheme is valid for naturally occurring nucleotide bases. However, there are plenty of examples of post-transcriptional chemical modifications of the bases, many of which are seen in tRNAs or ribosomes. It may be important to understand their structural features also.<ref>{{Cite journal\|last=Chawla\|first=Mohit\|last2=Oliva\|first2=Romina\|last3=Bujnicki\|first3=Janusz M.\|last4=Cavallo\|first4=Luigi\|date=2015-06-27\|title=An atlas of RNA base pairs involving modified nucleobases with optimal geometries and accurate energies\|url=http://dx.doi.org/10.1093/nar/gkv606\|journal=Nucleic Acids Research\|volume=43\|issue=14\|pages=6714–6729\|doi=10.1093/nar/gkv606\|issn=0305-1048}}</ref><ref>{{cite journal \|last1=Seelam \|first1=Preethi P. \|last2=Sharma \|first2=Purshotam \|last3=Mitra \|first3=Abhijit \|title=Structural landscape of base pairs containing post-transcriptional modifications in RNA \|journal=RNA \|date=2017 \|volume=23 \|issue=6 \|pages=847–859 \|doi=10.1261/rna.060749.117\|issn=1355-8382}}</ref>

	== Structure ==		== Structure ==
	=== Base pairing ===		=== Base pairing ===
	An estimated 60% of bases in structured RNA participate in canonical Watson-Crick base pairs.<ref name="Leontis_2001" /> Base pairing occurs when two bases form hydrogen bonds with each other. These hydrogen bonds can be either polar or non-polar interactions. The polar hydrogen bonds are formed by N-H...O/N and/or O-H...O/N interactions. Non-polar hydrogen bonds are formed between C-H...O/N.<ref name="Halder_2013" />		An estimated 60% of bases in structured RNA participate in canonical Watson-Crick base pairs.<ref name="Leontis_2001" /> Base pairing occurs when two bases form hydrogen bonds with each other. These hydrogen bonds can be either polar or non-polar interactions. The polar hydrogen bonds are formed by N-H...O/N and/or O-H...O/N interactions. Non-polar hydrogen bonds are formed between C-H...O/N.<ref name="Halder_2013" />

	==== Edge interactions ====		==== Edge interactions ====
	Each base has three potential edges where it can interact with another base. The Purine bases have 3 edges which are able to hydrogen bond. Those are known as the Watson-Crick edge(WC), the Hoogsteen edge(H), and the Sugar edge(S). Pyrimidine bases also have three hydrogen-bonding edges.<ref name="Leontis_2001">{{cite journal \| vauthors = Leontis NB, Westhof E \| title = Geometric nomenclature and classification of RNA base pairs \| journal = RNA \| volume = 7 \| issue = 4 \| pages = 499–512 \| date = April 2001 \| pmid = 11345429 \| pmc = 1370104 \| doi = 10.1017/S1355838201002515 }}</ref> Like the purine, there is the Watson-Crick edge(WC) and the Sugar edge(S) but the third edge is referred to as the "C-H" edge(H) on the pyrimidine bases. This C-H edge is sometimes also referred to as the Hoogsteen edge for simplicity~~. The various edges for the purine and pyrimidine bases are shown in Figure 2~~.<ref name="Halder_2013">{{cite journal \| vauthors = Halder S, Bhattacharyya D \| title = RNA structure and dynamics: a base pairing perspective \| journal = Progress in Biophysics and Molecular Biology \| volume = 113 \| issue = 2 \| pages = 264–83 \| date = November 2013 \| pmid = 23891726 \| doi = 10.1016/j.pbiomolbio.2013.07.003 }}</ref>		Each base has three potential edges where it can interact with another base. The Purine bases have 3 edges which are able to hydrogen bond. Those are known as the Watson-Crick edge(WC), the Hoogsteen edge(H), and the Sugar edge(S). 
Pyrimidine bases also have three hydrogen-bonding edges.<ref name="Leontis_2001">{{cite journal \| vauthors = Leontis NB, Westhof E \| title = Geometric nomenclature and classification of RNA base pairs \| journal = RNA \| volume = 7 \| issue = 4 \| pages = 499–512 \| date = April 2001 \| pmid = 11345429 \| pmc = 1370104 \| doi = 10.1017/S1355838201002515 }}</ref> Like the purine, there is the Watson-Crick edge(WC) and the Sugar edge(S) but the third edge is referred to as the "C-H" edge(H) on the pyrimidine bases. This C-H edge is sometimes also referred to as the Hoogsteen edge for simplicity.<ref name="Halder_2013">{{cite journal \| vauthors = Halder S, Bhattacharyya D \| title = RNA structure and dynamics: a base pairing perspective \| journal = Progress in Biophysics and Molecular Biology \| volume = 113 \| issue = 2 \| pages = 264–83 \| date = November 2013 \| pmid = 23891726 \| doi = 10.1016/j.pbiomolbio.2013.07.003 }}</ref>
	Besides the three edges of interaction, base pairs can also vary in their cis/trans forms. The cis and trans structures depend on the orientation of the ribose sugar as opposed to the hydrogen bond interaction~~. These various orientations are shown in Figure 3~~. Therefore, with the cis/trans forms and the 3 hydrogen bond edges, there are 12 basic types of base pairing geometries which can be found in RNA structures. Those 12 types are WC:WC (cis/trans), WC:HC (cis/trans), WC:S (cis/trans), H:S (cis/trans), H:H (cis/trans), and S:S (cis/trans).		Besides the three edges of interaction, base pairs can also vary in their cis/trans forms. The cis and trans structures depend on the orientation of the ribose sugar as opposed to the hydrogen bond interaction. Therefore, with the cis/trans forms and the 3 hydrogen bond edges, there are 12 basic types of base pairing geometries which can be found in RNA structures. Those 12 types are WC:WC (cis/trans), WC:HC (cis/trans), WC:S (cis/trans), H:S (cis/trans), H:H (cis/trans), and S:S (cis/trans).

	==== Classification ====		==== Classification ====
	These 12 types can be further divided into subgroups that depend on the directionality of the glycosidic bonds and steric extensions.<ref>{{cite journal \| vauthors = Sponer JE, Leszczynski J, Sychrovský V, Sponer J \| title = Sugar edge/sugar edge base pairs in RNA: stabilities and structures from quantum chemical calculations \| journal = The Journal of Physical Chemistry B \| volume = 109 \| issue = 39 \| pages = 18680–9 \| date = October 2005 \| pmid = 16853403 \| doi = 10.1021/jp053379q }}</ref> In all, there are 169 theoretically possible base pair combinations. The actual number of ] combinations is lower because some combinations result in unfavorable interactions. This number of possible non-canonical base pairs is still being determined as it depends strongly on base pairing criteria.<ref>{{cite journal \| vauthors = Sharma P, Sponer JE, Sponer J, Sharma S, Bhattacharyya D, Mitra A \| title = On the role of the cis Hoogsteen:sugar-edge family of base pairs in platforms and triplets-quantum chemical insights into RNA structural biology \| journal = The Journal of Physical Chemistry B \| volume = 114 \| issue = 9 \| pages = 3307–20 \| date = March 2010 \| pmid = 20163171 \| doi = 10.1021/jp910226e }}</ref> Understanding base pair configuration is similarly difficult since the pairing depends on the bases' surroundings. 
These surroundings can consist of adjacent base pairs, adjacent loops, or third interactions (such as a base triple).<ref>{{cite journal \| vauthors = Heus HA, Hilbers CW \| title = Structures of non-canonical tandem base pairs in RNA helices: review \| journal = Nucleosides, Nucleotides & Nucleic Acids \| volume = 22 \| issue = 5–8 \| pages = 559–71 \| date = October 2003 \| pmid = 14565230 \| doi = 10.1081/NCN-120021955 }}</ref>		These 12 types can be further divided into subgroups that depend on the directionality of the glycosidic bonds and steric extensions.<ref>{{cite journal \| vauthors = Sponer JE, Leszczynski J, Sychrovský V, Sponer J \| title = Sugar edge/sugar edge base pairs in RNA: stabilities and structures from quantum chemical calculations \| journal = The Journal of Physical Chemistry B \| volume = 109 \| issue = 39 \| pages = 18680–9 \| date = October 2005 \| pmid = 16853403 \| doi = 10.1021/jp053379q }}</ref> In all, there are 169 theoretically possible base pair combinations. The actual number of ] combinations is lower because some combinations result in unfavorable interactions. This number of possible non-canonical base pairs is still being determined as it depends strongly on base pairing criteria.<ref>{{cite journal \| vauthors = Sharma P, Sponer JE, Sponer J, Sharma S, Bhattacharyya D, Mitra A \| title = On the role of the cis Hoogsteen:sugar-edge family of base pairs in platforms and triplets-quantum chemical insights into RNA structural biology \| journal = The Journal of Physical Chemistry B \| volume = 114 \| issue = 9 \| pages = 3307–20 \| date = March 2010 \| pmid = 20163171 \| doi = 10.1021/jp910226e }}</ref> Understanding base pair configuration is similarly difficult since the pairing depends on the bases' surroundings. 
These surroundings can consist of adjacent base pairs, adjacent loops, or third interactions (such as a base triple).<ref>{{cite journal \| vauthors = Heus HA, Hilbers CW \| title = Structures of non-canonical tandem base pairs in RNA helices: review \| journal = Nucleosides, Nucleotides & Nucleic Acids \| volume = 22 \| issue = 5–8 \| pages = 559–71 \| date = October 2003 \| pmid = 14565230 \| doi = 10.1081/NCN-120021955 }}</ref>
	The bonds between various bases are well defined because of their rigid and planar shape. The spatial interactions between the two bases can be classified in 6 rigid-body parameters or intra-base pair parameters (3 translational, 3 rotational) ~~as shown in Figure 4~~.<ref name="Olson_2019" /> These parameters describe the base pairs' three dimensional conformation.		The bonds between various bases are well defined because of their rigid and planar shape. The spatial interactions between the two bases can be classified in 6 rigid-body parameters or intra-base pair parameters (3 translational, 3 rotational).<ref name="Olson_2019" /> These parameters describe the base pairs' three dimensional conformation.

	The three translational arrangements are known as shear, stretch, and stagger. These three parameters are directly related to the proximity and direction of the hydrogen bonds. The rotational arrangements are buckle, propeller, and opening. Rotational arrangements relate to the non-planar conformation (as compared to the ideal coplanar geometry).<ref name="Halder_2013" /> Intra-base pair parameters are used to determine the structure and stabilities of non-canonical base pairs and were originally created for the base pairings in DNA, but were found to also fit the non-canonical base models.<ref name="Olson_2019" />		The three translational arrangements are known as shear, stretch, and stagger. These three parameters are directly related to the proximity and direction of the hydrogen bonds. The rotational arrangements are buckle, propeller, and opening. Rotational arrangements relate to the non-planar conformation (as compared to the ideal coplanar geometry).<ref name="Halder_2013" /> Intra-base pair parameters are used to determine the structure and stabilities of non-canonical base pairs and were originally created for the base pairings in DNA, but were found to also fit the non-canonical base models.<ref name="Olson_2019" />
Line 47:		Line 142:
	==== Hoogsteen base pairs ====		==== Hoogsteen base pairs ====
	]s occur between adenine (A) and thymine (T); and guanine (G) and cytosine (C); similarly to Watson-Crick base pairs. However, the ] (A and G) takes on an alternative conformation with respect to the ]. In the A-U Hoogsteen base pair, the adenine is rotated 180° about the ], resulting in an alternative hydrogen bonding scheme which has one hydrogen bond in common with the Watson-Crick base pair (adenine N6 and thymine O4), while the other hydrogen bond, instead of occurring between adenine N1 and thymine N3 as in the Watson-Crick base pair, occurs between adenine N7 and thymine N3.<ref name="Nikolova_2013" /> The A-U base pair is shown in ~~Figure 5~~. In the G-C Hoogsteen base pair, like the A-T Hoogsteen base pair, the purine (guanine) is rotated 180° about the glycosidic bond while the pyrimidine (cytosine) remains in place. One hydrogen bond from the Watson-Crick base pair is maintained (guanine O6 and cytosine N4) and the other occurs between guanine N7 and a protonated cytosine N3 (note that the Hoogsteen G-C base pair has two hydrogen bonds, while the Watson-Crick G-C base pair has three).<ref name="Nikolova_2013" />		]s occur between adenine (A) and thymine (T); and guanine (G) and cytosine (C); similarly to Watson-Crick base pairs. However, the ] (A and G) takes on an alternative conformation with respect to the ]. In the A-U Hoogsteen base pair, the adenine is rotated 180° about the ], resulting in an alternative hydrogen bonding scheme which has one hydrogen bond in common with the Watson-Crick base pair (adenine N6 and thymine O4), while the other hydrogen bond, instead of occurring between adenine N1 and thymine N3 as in the Watson-Crick base pair, occurs between adenine N7 and thymine N3.<ref name="Nikolova_2013" /> The A-U base pair is shown in the figure. In the G-C Hoogsteen base pair, like the A-T Hoogsteen base pair, the purine (guanine) is rotated 180° about the glycosidic bond while the pyrimidine (cytosine) remains in place. 
One hydrogen bond from the Watson-Crick base pair is maintained (guanine O6 and cytosine N4) and the other occurs between guanine N7 and a protonated cytosine N3 (note that the Hoogsteen G-C base pair has two hydrogen bonds, while the Watson-Crick G-C base pair has three).<ref name="Nikolova_2013" />

	==== Wobble base pairs ====		==== Wobble base pairs ====
	]ing occur between two nucleotides that are not Watson-Crick base pairs and was proposed by ] in 1966. The 4 main examples are guanine-uracil (G-U), ]-uracil (I-U), hypoxanthine-adenine (I-A), and hypoxanthine-cytosine (I-C). These wobble base pairs are very important in tRNA. Most organisms have less than 45 tRNA molecules even though 61 tRNA molecules would technically be necessary to canonically pair to the codon. Wobble base pairing allows for the 5' anticodon to bond to a non-standard base pair~~. Examples of wobble base pairs are given in Figure 6~~.		]ing occur between two nucleotides that are not Watson-Crick base pairs and was proposed by ] in 1966. The 4 main examples are guanine-uracil (G-U), ]-uracil (I-U), hypoxanthine-adenine (I-A), and hypoxanthine-cytosine (I-C). These wobble base pairs are very important in tRNA. Most organisms have less than 45 tRNA molecules even though 61 tRNA molecules would technically be necessary to canonically pair to the codon. Wobble base pairing allows for the 5' anticodon to bond to a non-standard base pair.
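
The figure of 61 quoted above is the number of sense codons. A minimal sketch of the arithmetic, assuming the standard genetic code with its three stop codons (UAA, UAG, UGA):

```python
from itertools import product

RNA_BASES = "ACGU"
# Stop codons of the standard genetic code (an assumption of this sketch;
# some organisms and organelles use variant codes)
STOP_CODONS = {"UAA", "UAG", "UGA"}

# All 4^3 possible triplets
all_codons = ["".join(triplet) for triplet in product(RNA_BASES, repeat=3)]

# Codons that specify an amino acid and therefore need a decoding tRNA
sense_codons = [codon for codon in all_codons if codon not in STOP_CODONS]

print(len(all_codons), len(sense_codons))  # 64 61
```

Strict one-to-one Watson-Crick decoding would therefore need 61 distinct anticodons; wobble pairing at the first anticodon position lets a single tRNA read more than one codon.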

	== 3-D Structure ==		== 3-D Structure ==

Revision as of 16:07, 13 June 2023

Non-canonical base pairs are planar, hydrogen-bonded pairs of nucleobases whose hydrogen bonding patterns differ from those observed in Watson-Crick base pairs, as in classic double helical DNA. The structures of polynucleotide strands of both DNA and RNA molecules can be understood in terms of sugar-phosphate backbones consisting of phosphodiester-linked D-2′-deoxyribofuranose (D-ribofuranose in RNA) sugar moieties, with purine or pyrimidine nucleobases covalently linked to them. Here, the N9 atoms of the purines, guanine and adenine, and the N1 atoms of the pyrimidines, cytosine and thymine (uracil in RNA), form glycosidic linkages with the C1′ atom of the sugars. These nucleobases can be schematically represented as triangles with one vertex linked to the sugar, and with the three sides representing the three edges through which they can form hydrogen bonds with other moieties, including other nucleobases. The side opposite to the sugar-linked vertex is traditionally called the Watson-Crick edge, since it is involved in forming the Watson-Crick base pairs which constitute the building blocks of double helical DNA. The two sides adjacent to the sugar-linked vertex are referred to, respectively, as the Sugar and Hoogsteen (C-H for pyrimidines) edges.

Each of the four nucleobases is characterized by a distinct edge-specific distribution of hydrogen bond donor and acceptor atoms, and the complementarity between these distributions, in turn, defines the hydrogen bonding patterns involved in base pairing. The double helical structures of DNA or RNA are generally known to have base pairs between complementary bases, Adenine:Thymine (Adenine:Uracil in RNA) or Guanine:Cytosine. They involve specific hydrogen bonding patterns corresponding to their respective Watson-Crick edges, and are considered Canonical Base Pairs. At the same time, the helically twisted backbones in double helical DNA form two grooves, major and minor, through which the hydrogen bond donor and acceptor atoms of the Hoogsteen and sugar edges, respectively, are accessible for additional molecular recognition events.

Experimental evidence reveals that the nucleotide bases are also capable of pairing in a wide variety of geometries, with hydrogen bonding patterns different from those observed in canonical base pairs. These base pairs, which are generally referred to as Non-Canonical Base Pairs, are held together by multiple hydrogen bonds, and are mostly planar and stable. Most of these play very important roles in shaping the structure and function of different functional RNA molecules. In addition to their occurrence in several double stranded stem regions, most of the loops and bulges that appear in single-stranded RNA secondary structures form recurrent 3D motifs, where non-canonical base pairs play a central role. Non-canonical base pairs also play crucial roles in mediating the tertiary contacts in RNA 3D structures.

History

Examples of a few frequently observed non-canonical base pairs: (a) adenine:guanine trans Hoogsteen/Sugar-edge, (b) adenine:uracil trans Hoogsteen/Watson-Crick, (c) guanine:guanine cis Watson-Crick/Hoogsteen, (d) protonated cytosine(+):cytosine trans Watson-Crick/Watson-Crick.

IUPAC-IUB recommended nomenclature of nucleotide base atoms of adenine, guanine, uracil and cytosine bases (created in MOLDEN).

Double helical structures of DNA as well as folded single stranded RNA are now known to be stabilized by Watson-Crick base pairing between the purines, adenine and guanine, and the pyrimidines, thymine (or uracil in RNA) and cytosine. In this scheme, the N1 atoms of the purine residues form hydrogen bonds with the N3 atoms of the pyrimidine residues in both A:T and G:C pairs. The second hydrogen bond in A:T base pairs involves the N6 amino group of adenine and the O4 atom of thymine (or uracil in RNA). Similarly, the second hydrogen bond in G:C base pairs involves the O6 atom of guanine and the N4 amino group of cytosine. The G:C base pairs also have a third hydrogen bond involving the N2 amino group of guanine and the O2 atom of cytosine. However, for about twenty years after this scheme was initially proposed by James D. Watson and Francis H.C. Crick, experimental evidence suggesting other forms of base-base interactions continued to draw the attention of researchers investigating the structure of DNA. The first high resolution structure of an adenine:thymine base pair, solved by Karst Hoogsteen by single crystal X-ray crystallography in 1959, revealed a geometry very different from the one proposed by Watson and Crick. It had two hydrogen bonds involving the N7 and N6 atoms of adenine and the N3 and O4 (or O2) atoms of thymine. It may be noted that, because the thymine base was used with a methyl group in place of the sugar, an approximate symmetry axis passes through the N3 and C6 atoms, and the O2 and O4 atoms appear equivalent. In order to distinguish this alternate base pairing scheme from the Watson-Crick scheme, base pairs in which a hydrogen bond involves the N7 atom of a purine residue have been referred to as Hoogsteen base pairs, and the purine base edge which includes the N7 atom was later named the Hoogsteen edge. The first high resolution structure of a guanine:cytosine pair, obtained by W. Guschlbauer, was also similar to the Hoogsteen base pair, although this structure required an unusual protonation of the N3 imino nitrogen of cytosine, which is possible only at significantly lower pH. Experimental evidence supporting Watson-Crick base pairing, including low resolution NMR studies as well as high resolution X-ray crystallographic studies, was obtained only as late as the early 1970s. Almost a decade later, with the advent of efficient DNA synthesis methods, Richard Dickerson, followed by several other groups, solved structures of the physiological double helical B-DNA with a complete helical turn, based on crystals of synthetic DNA oligomers. The pairing geometries of the A:T (A:U in RNA) and G:C pairs in these structures confirmed the common or canonical form of base pairing proposed by Watson and Crick, while those with all other geometries and compositions are now referred to as non-canonical base pairs.

It was noticed that even in double stranded DNA, where canonical Watson-Crick base pairs hold the two complementary anti-parallel strands together, Hoogsteen and other non-Watson-Crick base pairs occasionally occur. It was also proposed that within DNA double helices dominated by Watson-Crick base pairs, Hoogsteen base pair formation could be a transient phenomenon.

While canonical Watson-Crick base pairs are the most prevalent and are commonly observed in the majority of chromosomal DNA and in most functional RNAs, the presence of stable non-canonical base pairs is also extremely significant in DNA biology. An example of non-Watson-Crick, or non-canonical, base pairing can be found at the ends of chromosomal DNA. The 3'-ends of chromosomes contain single stranded overhangs with conserved sequence motifs (such as TTAGGG in most vertebrates). The single stranded region adopts definite three-dimensional structures, which have been solved by X-ray crystallography as well as by NMR spectroscopy. The single strands containing the above sequence motifs are found to form four stranded mini-helical structures stabilized by Hoogsteen base pairing between guanine residues. In these structures, four guanine residues form a near-planar base quartet (a G-quartet, the building block of a G-quadruplex), where each guanine pairs with its neighboring guanine through their Watson-Crick and Hoogsteen edges in a cyclic manner. The four central carbonyl groups are often stabilized by potassium ions (K+). From the full genomic sequences of different organisms, it has been observed that telomere-like sequences sometimes also interrupt double helical regions near the transcription start sites of some oncogenes, such as c-myc. It is possible that these sequence stretches form G-quadruplex-like structures, which can suppress the expression of the related genes. The complementary cytosine-rich sequences on the other strand may adopt another four stranded structure, the i-motif, stabilized by cytosine:cytosine non-canonical base pairs.

Structure of a representative G-quadruplex consisting of Hoogsteen base pairs between neighboring guanine residues (PDB: 1KF1).

Three G-quadruplexes stack to form a four stranded telomeric structure, with different topologies, for the d(GGGATTGGGATTGGGATTGGG) sequence.

While non-canonical base pairs are relatively rare in DNA, they are far more prevalent in RNA molecules, where generally a single polymeric strand folds onto itself to form various secondary and tertiary structures. As early as the 1970s, analysis of the crystal structure of yeast tRNA showed that RNA structures possess significant non-canonical variations in base pairing schemes. Subsequently, the structures of ribozymes, ribosomes, riboswitches, etc. have highlighted their abundance, and hence the need for a comprehensive characterization of Non-Canonical Base Pairs. These three-dimensional RNA structures generally possess several secondary structural motifs, such as double helical stems, stems with hairpin loops, symmetric and asymmetric internal loops, kissing loops between two hairpin motifs, pseudoknots, continuous stacks between two segments of helices, and multi-helix junctions, along with single stranded regions. These secondary structural motifs, except for the single stranded motifs, are stabilized by hydrogen bonded base pairs, and several of these are non-canonical base pairs, including G:U Wobble base pairs.

It is notable in this context that the Wobble hypothesis of Francis Crick predicted the possibility of a G:U base pair, in place of the canonical G:C or A:U base pairs, also mediating the recognition between mRNA codons and tRNA anticodons during protein synthesis. The G:U wobble base pair is the most frequently observed non-canonical base pair. Because of its geometric similarity to the canonical base pairs, it frequently occurs in the double helical stem regions of RNA structures, while its geometric differences continue to draw the attention of nucleic acid researchers, providing new insights into its structural significance. It may be noted that although the base pairs in folded RNA structures give rise to double helical stems, the two grooves of these stems, major and minor, differ in their dimensions from those in DNA double helices. Unlike those in DNA, the sequence-discriminating major grooves in RNA double helices are very narrow and deep. On the other hand, the minor groove regions, though wide and shallow, do not carry much sequence-specific information in terms of the hydrogen bond donor-acceptor positioning of the corresponding base pair edges. The G:U wobble base pairs, along with the various other non-canonical base pairs, introduce variations in the structures of RNA double helices, thus enhancing the accessibility of the discriminating major groove edges of associated base pairs. This has been seen to be very important for molecular recognition steps during tRNA aminoacylation as well as in ribosome functions.

Considering the immense importance of the non-canonical base pairs in RNA structure, folding and functions, researchers from multiple domains – biology, chemistry, physics, mathematics, computer science, etc., have joined in the effort to understand their structure, dynamics, function and their consequences. The complexities associated with experimental handling of RNA further underline the importance of diverse theoretical inputs towards addressing these issues.

Types of Non-canonical Base pairs

Two bases may approach each other in various ways, eventually leading to specific molecular recognition mediated by base pairing interactions, which are often non-canonical, in addition to strong stacking interactions. These are essential for the process of RNA single strands folding into three-dimensional structures. Early studies on such unusual base pairs by Jiří Šponer, Pavel Hobza and their group were somewhat disadvantaged by the lack of a suitable, unambiguous and systematic naming scheme. While some of the observed base pairs were assigned names following the Saenger nomenclature scheme, others were arbitrarily assigned names by different researchers. It may be mentioned that some attempts were also made by Michael Levitt and coworkers to classify base-base association in terms of adjacency of bases, through either pairing or stacking interactions. There was clearly a need for a classification scheme for different types of non-canonical base pairs, which could comprehensively and unambiguously handle newer variants coming up due to the rapid increase in the sampling space. Different approaches which have evolved in response to this need are discussed below.

Classification based on hydrogen bonding


Interacting edges	Glycosidic bond orientation	Nomenclature	Local strand direction
Watson-Crick/Watson-Crick	Cis	cWW or cis Watson-Crick/Watson-Crick	Antiparallel
Watson-Crick/Watson-Crick	Trans	tWW or trans Watson-Crick/Watson-Crick	Parallel
Watson-Crick/Hoogsteen	Cis	cWH or cis Watson-Crick/Hoogsteen	Parallel
Watson-Crick/Hoogsteen	Trans	tWH or trans Watson-Crick/Hoogsteen	Antiparallel
Watson-Crick/Sugar edge	Cis	cWS or cis Watson-Crick/Sugar edge	Antiparallel
Watson-Crick/Sugar edge	Trans	tWS or trans Watson-Crick/Sugar edge	Parallel
Hoogsteen/Hoogsteen	Cis	cHH or cis Hoogsteen/Hoogsteen	Antiparallel
Hoogsteen/Hoogsteen	Trans	tHH or trans Hoogsteen/Hoogsteen	Parallel
Hoogsteen/Sugar edge	Cis	cHS or cis Hoogsteen/Sugar edge	Parallel
Hoogsteen/Sugar edge	Trans	tHS or trans Hoogsteen/Sugar edge	Antiparallel
Sugar edge/Sugar edge	Cis	cSS or cis Sugar edge/Sugar edge	Antiparallel
Sugar edge/Sugar edge	Trans	tSS or trans Sugar edge/Sugar edge	Parallel

(a) The three hydrogen bonding edges of a nucleotide base (guanine shown), with the nomenclature of each edge, and (b) Cis and Trans orientations of the glycosidic bonds of the two nucleotide residues of a base pair with respect to the hydrogen bonding direction. The arrows in (b) indicate glycosidic bonds as vectors.

The nucleotide bases are nearly planar heterocyclic moieties, with a conjugated pi-electron cloud, and with several hydrogen bond donors and acceptors distributed around the edges. The edges are usually designated as W, H or S, based on whether the edge can be involved in forming a Watson-Crick base pair, can be involved in forming a Hoogsteen base pair, or is adjacent to the C2′-OH group of the ribose sugar, respectively. Eric Westhof and Neocles Leontis used these edge designations to propose a nomenclature scheme for base pairs which is currently widely accepted. The hydrogen bond donor and acceptor atoms can thus be classified in terms of their positioning along the three edges, namely the Watson-Crick or W edge, the Hoogsteen or H edge, and the Sugar or S edge. Since base pairs are mediated through hydrogen bonding interactions based on hydrogen bond donor-acceptor complementarity, this, in turn, provides a convenient bottom-up approach towards classifying base pair geometries in terms of the interacting edges of the participating bases. It may be noted that, unlike the Hoogsteen edge of purines, the corresponding edges of the pyrimidine bases do not have any polar hydrogen bond acceptor atom such as N7. However, these bases have C-H groups at their C6 and C5 atoms, which can act as weak hydrogen bond donors, as proposed by Gautam Desiraju. The Hoogsteen edge is hence also called the Hoogsteen/C-H edge in a unified scheme for designating equivalent positions of purines as well as pyrimidines. Thus, the total number of possible edge combinations involved in base pairing is 6, namely Watson-Crick/Watson-Crick (or W:W), Watson-Crick/Hoogsteen (or W:H), Watson-Crick/Sugar (or W:S), Hoogsteen/Hoogsteen (or H:H), Hoogsteen/Sugar (or H:S) and Sugar/Sugar (or S:S).

In the canonical Watson-Crick base pairs, the glycosidic bonds attaching the N9 (of purine) and N1 (of pyrimidine) of the paired bases with their respective sugar moieties, are on the same side of the mean hydrogen bonding axis, and are hence called Cis Watson-Crick base pairs. However, the relative orientations of the two sugars may also be Trans with respect to the mean hydrogen bonding direction giving rise to a distinct Trans Watson-Crick geometric class, consisting of species which were earlier referred to as reverse Watson-Crick base pairs according to Saenger nomenclature. The possibility of both Cis and Trans glycosidic bond orientation for each of the 6 possible edge combinations, gives rise to 12 geometric families of base pairs (see table).
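As an illustration of how the 6 edge combinations and 2 glycosidic bond orientations yield 12 geometric families, the following minimal Python sketch enumerates them. The names `EDGES`, `ORIENTATIONS` and `geometric_families` are hypothetical and chosen only for this example; the short labels (cWW, tWW, ...) follow the nomenclature column of the table.

```python
from itertools import combinations_with_replacement, product

# Edge codes: W = Watson-Crick, H = Hoogsteen (C-H for pyrimidines), S = Sugar.
EDGES = ("W", "H", "S")
# Glycosidic bond orientation codes: c = cis, t = trans.
ORIENTATIONS = ("c", "t")


def geometric_families():
    """Return the short labels of all edge-pair/orientation combinations."""
    # Unordered edge pairs, repeats allowed: WW, WH, WS, HH, HS, SS.
    edge_pairs = list(combinations_with_replacement(EDGES, 2))
    return [o + a + b for (a, b), o in product(edge_pairs, ORIENTATIONS)]


families = geometric_families()
print(len(families))
print(families)
```

The edge pairs are generated as unordered combinations because W:H and H:W describe the same family; only the assignment of edges to the two bases differs.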

According to the Leontis-Westhof scheme, any base pair can be systematically and unambiguously named using the syntax <Base_1: Base_2><Edge_1: Edge_2><Glycosidic Bond Orientation>, where Base_1 and Base_2 carry information on the respective base identities and their nucleotide numbers. This nomenclature scheme also allows us to enumerate the total number of distinct possible base pair types. For a given glycosidic bond orientation, say Cis, each of the four naturally occurring bases has three possible edges for forming base pairs, giving rise to 12 possible base pairing edge identities, each of which can in principle pair with any edge of another base, irrespective of complementarity. This gives rise to a 12x12 symmetric matrix displaying 144 pairwise permutations of base pairing edge identities, in which the 132 off-diagonal entries consist of 66 combinations that each appear twice. Thus, there are 78 (= 12 + 132/2) unique entries corresponding to the cis glycosidic bond orientation. Considering both cis and trans glycosidic bond orientations, the number of base pair types amounts to 156.
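The counting argument above can be checked with a short Python sketch. The variable names are hypothetical, and the `A.W`-style labels are only an illustrative way of writing a base pairing edge identity; they are not part of the nomenclature.

```python
from itertools import combinations_with_replacement

BASES = ("A", "G", "C", "U")
EDGES = ("W", "H", "S")

# 4 bases x 3 edges = 12 base pairing edge identities, e.g. "A.W" or "G.H".
edge_identities = [f"{base}.{edge}" for base in BASES for edge in EDGES]

# Unordered pairs with repetition: the diagonal plus one triangle
# of the symmetric 12x12 matrix.
unordered_pairs = list(combinations_with_replacement(edge_identities, 2))

per_orientation = len(unordered_pairs)  # 12 + 132/2
total_types = 2 * per_orientation       # cis and trans

print(len(edge_identities), per_orientation, total_types)
```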

Of course, this number, 156, is only an indicator. It includes base-edge combinations where base pairs cannot be formed due to the absence of hydrogen bond donor-acceptor complementarity. For example, potential pairing between two guanine residues utilizing their Watson-Crick edges in cis form (cWW) is not supported by hydrogen bond donor-acceptor complementarity, and is not observed with a consistent hydrogen bonding pattern. This method of enumerating the possible number of distinct base pair types also does not consider possibilities of multimodality or bifurcated base pairs, or even instances of base pairs involving modified bases, protonated bases and water or ion mediation in hydrogen bond formation. Two cytosine bases can form trans Watson-Crick/Watson-Crick (tWW) base pairs in their neutral as well as hemiprotonated forms, possibly both, giving rise to the i-motif. However, C(+):C tWW and C:C tWW are together counted as one type among the 156 possible types.

Classification based on isostericity

Although there are significant differences between the structures of non-canonical base pairs belonging to different geometric families, some base pairs within the same geometric family have been found to substitute for each other without disrupting the overall structure. These base pairs are called isosteric base pairs. Isosteric base pairs always belong to the same geometric family, but not all base pairs in a particular geometric family are isosteric. Two base pairs are called isosteric if they meet the following three criteria: (i) the C1′–C1′ distances should be similar; (ii) the paired bases should be related by similar rotations in 3D space; and (iii) hydrogen bond formation should occur between equivalent base positions. A detailed approach towards quantifying isostericity, in terms of an IsoDiscrepancy Index (IDI), which can facilitate reliable prediction of which base pair substitutions can potentially occur in conserved motifs, was formulated by Neocles Leontis, Craig Zirbel and Eric Westhof. Based on IDI values and available base pair structural data, the group maintains a curated online base pair catalogue and an updated set of Isostericity Matrices (IM) corresponding to each of the 12 geometric families. Using this resource, one can comprehensively classify different types of canonical and non-canonical base pairs in terms of their positions in the Isostericity Matrices. This approach, for example, indicates that the four base pair types A:U cWW, U:A cWW, G:C cWW and C:G cWW are isosteric to each other. Thus, as also confirmed by detailed sequence comparisons, double mutations altering A:U cWW to U:A cWW or even to G:C cWW may not disturb the structure and, unless stability issues are involved, the function of the related RNA. It was also found that the wobble G:U cWW base pair is not really isosteric to the U:G cWW base pair, indicating that such double mutations may significantly affect the functioning of the corresponding RNA. On the other hand, some of the base pairs which are stabilized through the Sugar edges of the bases are mutually isosteric.
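As a rough illustration of criterion (i) only, the sketch below compares the C1′–C1′ distances of two base pairs. It is not the IsoDiscrepancy Index, which also accounts for the relative rotation of the bases. The function names, the 0.5 Å tolerance and all coordinates are hypothetical values chosen for the example, not data from any structure.

```python
import math


def c1_c1_distance(c1_first, c1_second):
    """Distance (in angstroms) between the C1' atoms of two paired residues."""
    return math.dist(c1_first, c1_second)


def similar_c1_distance(pair_1, pair_2, tolerance=0.5):
    """Check criterion (i): the C1'-C1' distances differ by at most `tolerance`.

    Each pair is a tuple of two (x, y, z) C1' coordinates.
    The default tolerance is an assumption made for this sketch.
    """
    d1 = c1_c1_distance(*pair_1)
    d2 = c1_c1_distance(*pair_2)
    return abs(d1 - d2) <= tolerance


# Hypothetical C1' coordinates in angstroms.
pair_x = ((0.0, 0.0, 0.0), (10.5, 0.0, 0.0))
pair_y = ((1.0, 2.0, 0.0), (1.0, 12.4, 0.0))
pair_z = ((0.0, 0.0, 0.0), (8.6, 0.0, 0.0))

print(similar_c1_distance(pair_x, pair_y))
print(similar_c1_distance(pair_x, pair_z))
```

Passing this check is necessary but not sufficient for isostericity, since criteria (ii) and (iii) must also hold.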

Classification based on local strand direction

It may be noted here that because of the geometric relationship of the bases with the sugar-phosphate backbone, these 12 geometric families of base pairs are associated with two possible local strand orientations, namely parallel and antiparallel. For the 6 families with edge combinations involving only Watson-Crick and Sugar edges, W:W, W:S and S:S, the cis and trans families are respectively associated with antiparallel and parallel 5' to 3' local strand directions. Introduction of the Hoogsteen edge as one of the partners in the combination causes an inversion in the relationship. Thus, for W:H and H:S, cis and trans respectively correspond to parallel and antiparallel local strand orientations. As expected, when both the edges are H, a double inversion is observed, and H:H cis and trans correspond respectively to antiparallel and parallel local strand orientations. The annotation of local strand orientation in terms of parallel and antiparallel directions helps to understand which faces of the individual bases can be seen for a given base pair from the 5′ or the 3′ side. This annotation also helps in classifying the 12 geometries into two groups of 6 each, where the geometries can potentially interconvert within each group by in-plane relative rotation of the bases. However, one should note that the above rule is applicable only when the glycosidic torsion angles of both nucleotide residues are anti. Notably, crystallographic observations and energetic considerations indicate that syn glycosidic torsions are also quite possible. Hence the above classification of strand directions as parallel or antiparallel does not, by itself, always provide a complete understanding.
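The inversion rule described above can be written as a small Python function. This is a minimal sketch with a hypothetical function name; it takes a family label such as `cWH` and assumes, as the text requires, that both glycosidic torsions are anti.

```python
def local_strand_direction(family):
    """Return 'parallel' or 'antiparallel' for a label such as 'cWW' or 'tHS'.

    Assumes both nucleotides have anti glycosidic torsion angles.
    """
    if len(family) != 3:
        raise ValueError(f"expected a 3-character label, got {family!r}")
    orientation, edges = family[0], family[1:]
    if orientation not in ("c", "t") or any(e not in "WHS" for e in edges):
        raise ValueError(f"unrecognized family label {family!r}")

    # Without Hoogsteen edges: cis -> antiparallel, trans -> parallel.
    parallel = orientation == "t"
    # Each Hoogsteen edge inverts the relationship once.
    for edge in edges:
        if edge == "H":
            parallel = not parallel
    return "parallel" if parallel else "antiparallel"


for label in ("cWW", "tWW", "cWH", "tWH", "cHH", "tHH", "cHS", "tHS"):
    print(label, local_strand_direction(label))
```

The results agree with the "Local strand direction" column of the table for all 12 families.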

(a) Cloverleaf model of tRNA (picture created by VARNA for PDB: 1EHZ) and (b) A typical base triplet involving residues 9, 12 and 23 of the same tRNA

Various functional RNA molecules are stabilized, in their specific folded pattern, by both canonical and non-canonical base pairs. Most tRNA molecules, for example, are known to have four short double helical segments, giving rise to a cloverleaf-like two-dimensional structure. The three-dimensional structure of tRNA, however, takes an L-shape. This is mediated by several non-canonical base pairs and base triplets. The D-loop and TψC loop are held together by several such base pairs. The wide variety of non-canonical base pairs can be browsed through different websites such as NDB, RNABPDB, RNABP COGEST, etc., to get a better understanding.

It may be noted that the above scheme is valid for naturally occurring nucleotide bases. However, there are plenty of examples of post-transcriptional chemical modifications of the bases, many of which are seen in tRNAs or ribosomes. It may be important to understand their structural features also.

Structure

Base pairing

An estimated 60% of bases in structured RNA participate in canonical Watson-Crick base pairs. Base pairing occurs when two bases form hydrogen bonds with each other. These hydrogen bonds can be either polar or non-polar interactions. The polar hydrogen bonds are formed by N-H...O/N and/or O-H...O/N interactions. Non-polar hydrogen bonds are formed between C-H...O/N.

Edge interactions

Each base has three potential edges where it can interact with another base. Purine bases have three edges which are able to form hydrogen bonds. These are known as the Watson-Crick edge (WC), the Hoogsteen edge (H), and the Sugar edge (S). Pyrimidine bases also have three hydrogen-bonding edges. As in purines, there is a Watson-Crick edge (WC) and a Sugar edge (S), but the third edge is referred to as the "C-H" edge (H) on the pyrimidine bases. This C-H edge is sometimes also referred to as the Hoogsteen edge for simplicity.

Besides the three edges of interaction, base pairs can also vary in their cis/trans forms. The cis and trans structures depend on the orientation of the ribose sugars relative to the hydrogen bonding interaction. Therefore, with the cis/trans forms and the 3 hydrogen bonding edges, there are 12 basic types of base pairing geometries which can be found in RNA structures. Those 12 types are WC:WC (cis/trans), WC:H (cis/trans), WC:S (cis/trans), H:S (cis/trans), H:H (cis/trans), and S:S (cis/trans).

Classification

These 12 types can be further divided into subgroups which depend on the directionality of the glycosidic bonds and steric extensions. In total, there are 169 theoretically possible base pair combinations. The actual number of base pair combinations is lower because some combinations result in unfavorable interactions. The number of possible non-canonical base pairs is still being determined, as it depends strongly on the base pairing criteria used. Understanding base pair configuration is similarly difficult, since the pairing depends on the surroundings of the bases. These surroundings can consist of adjacent base pairs, adjacent loops, or third interactions (such as a base triple).

The bonds between bases are well defined because of the rigid and planar shape of the bases. The spatial relationship between the two bases can be described by 6 rigid-body parameters, or intra-base pair parameters (3 translational, 3 rotational). These parameters describe the three-dimensional conformation of the base pair.

The three translational arrangements are known as shear, stretch, and stagger. These three parameters are directly related to the proximity and direction of the hydrogen bonds. The rotational arrangements are buckle, propeller, and opening. Rotational arrangements relate to the non-planar conformation (as compared to the ideal coplanar geometry). Intra-base pair parameters are used to determine the structures and stabilities of non-canonical base pairs. They were originally created for the base pairings in DNA, but were found to also fit the non-canonical base pair models.
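One way to keep the six intra-base pair parameters together is a small record type, sketched below in Python. The class name, the planarity thresholds and the example values are all hypothetical; units of angstroms for translations and degrees for rotations are the usual convention. Computing the parameters from atomic coordinates requires base reference frames and is outside the scope of this sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IntraBasePairParameters:
    # Translational parameters (angstroms).
    shear: float
    stretch: float
    stagger: float
    # Rotational parameters (degrees).
    buckle: float
    propeller: float
    opening: float

    def translations(self):
        return (self.shear, self.stretch, self.stagger)

    def rotations(self):
        return (self.buckle, self.propeller, self.opening)

    def is_nearly_planar(self, max_angle=15.0, max_stagger=0.5):
        """Crude planarity check using the out-of-plane parameters.

        Buckle, propeller and stagger move the bases out of a common plane;
        the default thresholds are assumptions made for this example.
        """
        return (
            abs(self.buckle) <= max_angle
            and abs(self.propeller) <= max_angle
            and abs(self.stagger) <= max_stagger
        )


# Hypothetical values for illustration only.
flat_pair = IntraBasePairParameters(0.1, -0.2, 0.05, 3.0, -8.0, 1.5)
twisted_pair = IntraBasePairParameters(2.3, -0.6, 0.9, 25.0, -30.0, 10.0)

print(flat_pair.is_nearly_planar())
print(twisted_pair.is_nearly_planar())
```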

Types

The most common non-canonical base pairs are trans A:G Hoogsteen/sugar edge, A:U Hoogsteen/WC, and G:U Wobble pairs.

Hoogsteen base pairs

Hoogsteen base pairs occur between adenine (A) and thymine (T), and between guanine (G) and cytosine (C), similarly to Watson-Crick base pairs. However, the purine (A or G) takes on an alternative conformation with respect to the pyrimidine. In the A-T Hoogsteen base pair, the adenine is rotated 180° about the glycosidic bond, resulting in an alternative hydrogen bonding scheme which has one hydrogen bond in common with the Watson-Crick base pair (adenine N6 and thymine O4), while the other hydrogen bond, instead of occurring between adenine N1 and thymine N3 as in the Watson-Crick base pair, occurs between adenine N7 and thymine N3. The A-T Hoogsteen base pair is shown in the figure. In the G-C Hoogsteen base pair, as in the A-T Hoogsteen base pair, the purine (guanine) is rotated 180° about the glycosidic bond while the pyrimidine (cytosine) remains in place. One hydrogen bond from the Watson-Crick base pair is maintained (guanine O6 and cytosine N4) and the other occurs between guanine N7 and a protonated cytosine N3 (note that the Hoogsteen G-C base pair has two hydrogen bonds, while the Watson-Crick G-C base pair has three).

Wobble base pairs

Wobble base pairing occurs between two nucleotides that do not form a Watson-Crick base pair, and was proposed by Francis Crick in 1966. The 4 main examples are guanine-uracil (G-U), hypoxanthine-uracil (I-U), hypoxanthine-adenine (I-A), and hypoxanthine-cytosine (I-C). These wobble base pairs are very important in tRNA. Most organisms have fewer than 45 tRNA species, even though 61 would technically be necessary to pair canonically with every sense codon. Wobble base pairing allows the 5' base of the anticodon to form a non-standard base pair with the 3' base of the codon.
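The way wobble pairing lets one anticodon read several codons can be sketched in Python as follows. The set and function names are hypothetical; the wobble set contains only the four pairs named above, and treating G-U as allowed with either base in the anticodon is an assumption of this sketch. Both sequences are written 5' to 3', so the 5' base of the anticodon faces the 3' base of the codon.

```python
# Canonical Watson-Crick pairs in RNA, as (anticodon base, codon base).
WATSON_CRICK = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}

# Wobble pairs from the text, as (anticodon 5' base, codon 3' base).
# I stands for inosine (hypoxanthine base). Allowing G-U in both
# directions is an assumption made for this example.
WOBBLE = {("G", "U"), ("U", "G"), ("I", "U"), ("I", "A"), ("I", "C")}


def anticodon_reads_codon(anticodon, codon):
    """Return True if the anticodon can pair with the codon.

    Both arguments are 3-letter strings written 5' to 3'. Pairing is
    antiparallel, so anticodon[0] faces codon[2]; only that position
    may use a wobble pair.
    """
    if len(anticodon) != 3 or len(codon) != 3:
        raise ValueError("anticodon and codon must each have 3 bases")
    wobble_ok = (anticodon[0], codon[2]) in WATSON_CRICK | WOBBLE
    rest_ok = all(
        (anticodon[i], codon[2 - i]) in WATSON_CRICK for i in (1, 2)
    )
    return wobble_ok and rest_ok


codons = ("GCU", "GCC", "GCA", "GCG")
readable = [c for c in codons if anticodon_reads_codon("IGC", c)]
print(readable)
```

In this sketch a single anticodon with inosine at its 5' position covers three of the four codons tested, which illustrates why fewer tRNA species than codons can suffice.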

3-D Structure

The secondary and three-dimensional structures of RNA are formed and stabilized through non-canonical base pairs. Base pairs make up many secondary structural blocks which aid the folding of RNA complexes and three-dimensional structures. The overall folded RNA is stabilized by canonical base pairing between its secondary and tertiary structural elements. Because so many non-canonical base pairs are possible, a very large number of structures can form, which allows for the diverse functions of RNA. The arrangement of the non-canonical base pairs also allows long-range RNA interactions, recognition of proteins and other molecules, and structural stabilizing elements. Many of the common non-canonical base pairs can be added to a stacked RNA stem without disturbing its helical character.

Secondary

Figure 7: A hairpin structure found in pre-mRNA

Basic secondary structural elements of RNA include bulges, double helices, hairpin loops, and internal loops. An example of a hairpin loop of RNA is given in Figure 7. As shown in the figure, hairpin loops and internal loops require a sudden change in backbone direction. Non-canonical base pairing allows for the increased flexibility at junctions or turns required in the secondary structure.

Tertiary

Three-dimensional structures are formed through long-range intramolecular interactions between the secondary structures. This leads to the formation of pseudoknots, ribose zippers, kissing hairpin loops, or co-axial pseudocontinuous helices. The three-dimensional structures of RNA are primarily determined through molecular simulations or computationally guided measurements. An example of a pseudoknot is given in Figure 8.

Experimental Methods

Watson-Crick canonical base pairing is not the only edge-to-edge conformation possible for nucleotides, since non-canonical pairing can take place as well. The sugar-phosphate backbone has an ionic character, which makes the bases sensitive to their environment, leading to conformational changes such as non-canonical pairing. There are various methods for determining these conformations, such as NMR structure determination and X-ray crystallography.

Applications

RNA has a multitude of purposes throughout the cell including regulating many important steps in gene expression. Various conformations of the non-Watson-Crick base pairs allow for a multitude of biological functions such as mRNA splicing, siRNA, transport, protein recognition, protein binding, and translation.

One common example of a biological application of non-canonical base pairs is the kink-turn. Kink-turns are found in many functional RNA species. A kink-turn consists of a three-nucleotide bulge due to three Hoogsteen base pairs. It acts as a marker where various proteins, such as the human 15.5K protein or proteins in the L7Ae family, can bind. A similar scenario is described in the binding of the HIV-1 Rev-response element (RRE) RNA. RRE RNA has an extra wide deep groove that is caused by a cis Watson-Crick G:A pair followed by a trans Watson-Crick G:G pair. The HIV-1 Rev protein is then able to bind in this widened groove.

References

Watson, J. D.; Crick, F. H. C. (1953-04). "Molecular Structure of Nucleic Acids: A Structure for Deoxyribose Nucleic Acid". Nature. 171 (4356): 737–738. doi:10.1038/171737a0. ISSN 1476-4687.
Nikolova, Evgenia N.; Zhou, Huiqing; Gottardo, Federico L.; Alvey, Heidi S.; Kimsey, Isaac J.; Al-Hashimi, Hashim M. (2013-07). "A historical account of hoogsteen base-pairs in duplex DNA". Biopolymers. 99 (12): 955–968. doi:10.1002/bip.22334. ISSN 0006-3525.
Westhof, Eric; Fritsch, Valérie (2000-03). "RNA folding: beyond Watson–Crick pairs". Structure. 8 (3): R55–R65. doi:10.1016/s0969-2126(00)00112-x. ISSN 0969-2126.
Hoogsteen, K. (1959-10-10). "The structure of crystals containing a hydrogen-bonded complex of 1-methylthymine and 9-methyladenine". Acta Crystallographica. 12 (10): 822–823. doi:10.1107/s0365110x59002389. ISSN 0365-110X.
Courtois, Y.; Fromageot, P.; Guschlbauer, W. (1968-12). "Protonated Polynucleotide Structures. 3. An Optical Rotatory Dispersion Study of the Protonation of DNA". European Journal of Biochemistry. 6 (4): 493–501. doi:10.1111/j.1432-1033.1968.tb00472.x. ISSN 0014-2956.
Patel, Dinshaw J.; Tonelli, Alan E. (1974-10). "Assignment of the proton nmr chemical shifts of the T–N₃H and G–N₁H proton resonances in isolated AT and GC Watson-Crick base pairs in double-stranded deoxy oligonucleotides in aqueous solution". Biopolymers. 13 (10): 1943–1964. doi:10.1002/bip.1974.360131003. ISSN 0006-3525.
Seeman, Nadrian C.; Rosenberg, John M.; Suddath, F.L.; Kim, Jung Ja Park; Rich, Alexander (1976-06). "RNA double-helical fragments at atomic resolution: I. The crystal and molecular structure of sodium adenylyl-3′,5′-uridine hexahydrate". Journal of Molecular Biology. 104 (1): 109–144. doi:10.1016/0022-2836(76)90005-x. ISSN 0022-2836.
Drew, H.R.; Wing, R.M.; Takano, T.; Broka, C.; Tanaka, S.; Itakura, K.; Dickerson, R.E. (1981-05-21). "Structure of a B-DNA Dodecamer. Conformation and Dynamics". Worldwide Protein Data Bank. doi:10.2210/pdb1bna/pdb. Retrieved 2019-12-17.
Wang, A.H.-J.; Fujii, S.; Van Boom, J.H.; Van Der Marel, G.A.; Van Boeckel, S.A.A.; Rich, A. (1993-07-15). "Molecular Structure of R(GCG)D(TATACGC): A DNA-RNA Hybrid Helix Joined to Double Helical DNA". Worldwide Protein Data Bank. doi:10.2210/pdb1d96/pdb. Retrieved 2019-12-17.
Heinemann, Udo; Alings, Claudia (1989-11). "Crystallographic study of one turn of G/C-rich B-DNA". Journal of Molecular Biology. 210 (2): 369–381. doi:10.1016/0022-2836(89)90337-9. ISSN 0022-2836.
Dock-Bregeon, A.C.; Chevrier, B.; Podjarny, A.; Johnson, J.; de Bear, J.S.; Gough, G.R.; Gilham, P.T.; Moras, D. (1989-10). "Crystallographic structure of an RNA helix: [U(UA)₆A]₂". Journal of Molecular Biology. 209 (3): 459–474. doi:10.1016/0022-2836(89)90010-7. ISSN 0022-2836.
Patikoglou, G. A.; Kim, J. L.; Sun, L.; Yang, S.-H.; Kodadek, T.; Burley, S. K. (1999-12-15). "TATA element recognition by the TATA box-binding protein has been conserved throughout evolution". Genes & Development. 13 (24): 3217–3230. doi:10.1101/gad.13.24.3217. ISSN 0890-9369.
Aishima, J.; Gitti, R.K.; Noah, J.E.; Gan, H.H.; Schlick, T.; Wolberger, C. (2002-12-11). "MATalpha2 Homeodomain Bound to DNA". Worldwide Protein Data Bank. doi:10.2210/pdb1k61/pdb. Retrieved 2019-12-17.
Nair, Deepak T.; Johnson, Robert E.; Prakash, Satya; Prakash, Louise; Aggarwal, Aneel K. (2004-07). "Replication by human DNA polymerase-ι occurs by Hoogsteen base-pairing". Nature. 430 (6997): 377–380. doi:10.1038/nature02692. ISSN 0028-0836.
Kitayner, Malka; Rozenberg, Haim; Rohs, Remo; Suad, Oded; Rabinovich, Dov; Honig, Barry; Shakked, Zippora (2010-04). "Diversity in DNA recognition by p53 revealed by crystal structures with Hoogsteen base pairs". Nature Structural & Molecular Biology. 17 (4): 423–429. doi:10.1038/nsmb.1800. ISSN 1545-9993.
Ethayathulla, A.S.; Tse, P.W.; Nguyen, S.; Viadiu, H. (2012-04-18). "Structure of p73 DNA binding domain tetramer modulates p73 transactivation". Worldwide Protein Data Bank. doi:10.2210/pdb3vd2/pdb. Retrieved 2019-12-17.
Xu, Yu; McSally, James; Andricioaei, Ioan; Al-Hashimi, Hashim M. (2018-12). "Modulation of Hoogsteen dynamics on DNA recognition". Nature Communications. 9 (1): 1473. doi:10.1038/s41467-018-03516-1. ISSN 2041-1723. PMC 5902632. PMID 29662229.
Parkinson, Gary N.; Lee, Michael P. H.; Neidle, Stephen (2002-05-26). "Crystal structure of parallel quadruplexes from human telomeric DNA". Nature. 417 (6891): 876–880. doi:10.1038/nature755. ISSN 0028-0836.
Luu, Kim Ngoc; Phan, Anh Tuân; Kuryavyi, Vitaly; Lacroix, Laurent; Patel, Dinshaw J. (2006-08). "Structure of the Human Telomere in K⁺ Solution: An Intramolecular (3 + 1) G-Quadruplex Scaffold". Journal of the American Chemical Society. 128 (30): 9963–9970. doi:10.1021/ja062791w. ISSN 0002-7863.
Phan, Anh Tuân; Kuryavyi, Vitaly; Luu, Kim Ngoc; Patel, Dinshaw J. (2007-09-25). "Structure of two intramolecular G-quadruplexes formed by natural human telomere sequences in K⁺ solution". Nucleic Acids Research. 35 (19): 6517–6525. doi:10.1093/nar/gkm706. ISSN 1362-4962.
Hendrix, Donna K.; Brenner, Steven E.; Holbrook, Stephen R. (2005-08). "RNA structural motifs: building blocks of a modular biomolecule". Quarterly Reviews of Biophysics. 38 (3): 221–243. doi:10.1017/s0033583506004215. ISSN 0033-5835.
Laing, Christian; Jung, Segun; Iqbal, Abdul; Schlick, Tamar (2009-10). "Tertiary Motifs Revealed in Analyses of Higher-Order RNA Junctions". Journal of Molecular Biology. 393 (1): 67–82. doi:10.1016/j.jmb.2009.07.089. ISSN 0022-2836.
Halder, Sukanya; Bhattacharyya, Dhananjay (2013-11). "RNA structure and dynamics: A base pairing perspective". Progress in Biophysics and Molecular Biology. 113 (2): 264–283. doi:10.1016/j.pbiomolbio.2013.07.003. ISSN 0079-6107.
Ananth, P.; Goldsmith, G.; Yathindra, N. (2013-07-16). "An innate twist between Crick's wobble and Watson-Crick base pairs". RNA. 19 (8): 1038–1053. doi:10.1261/rna.036905.112. ISSN 1355-8382.
Šponer, Jiří; Leszczynski, Jerzy; Hobza, Pavel (1996-01). "Structures and Energies of Hydrogen-Bonded DNA Base Pairs. A Nonempirical Study with Inclusion of Electron Correlation". The Journal of Physical Chemistry. 100 (5): 1965–1974. doi:10.1021/jp952760f. ISSN 0022-3654.
Saenger, Wolfram (1984). Principles of Nucleic Acid Structure. New York, NY: Springer New York. pp. 1–8. doi:10.1007/978-1-4612-5190-3. ISBN 978-0-387-90761-1.
Sykes, Michael T.; Levitt, Michael (2005-08). "Describing RNA Structure by Libraries of Clustered Nucleotide Doublets". Journal of Molecular Biology. 351 (1): 26–38. doi:10.1016/j.jmb.2005.06.024. ISSN 0022-2836.
Leontis, Neocles B.; Westhof, Eric (2001-04). "Geometric nomenclature and classification of RNA base pairs". RNA. 7 (4): 499–512. doi:10.1017/S1355838201002515. ISSN 1355-8382. PMC 1370104. PMID 11345429.
Stombaugh, Jesse; Zirbel, Craig L.; Westhof, Eric; Leontis, Neocles B. (2009-04-01). "Frequency and isostericity of RNA base pairs". Nucleic Acids Research. 37 (7): 2294–2312. doi:10.1093/nar/gkp011. ISSN 0305-1048. PMC 2673412. PMID 19240142.
Leontis, N. B. (2002-08-15). "The non-Watson-Crick base pairs and their associated isostericity matrices". Nucleic Acids Research. 30 (16): 3497–3531. doi:10.1093/nar/gkf481. ISSN 1362-4962.
Nasalean, Lorena; Stombaugh, Jesse; Zirbel, Craig L.; Leontis, Neocles B. (2009). "RNA 3D Structural Motifs: Definition, Identification, Annotation, and Database Searching". In Walter, Nils G.; Woodson, Sarah A.; Batey, Robert T. (eds.). Non-Protein Coding RNAs. Berlin, Heidelberg: Springer Berlin Heidelberg. pp. 1–26. doi:10.1007/978-3-540-70840-7_1. ISBN 978-3-540-70833-9.
Stombaugh, Jesse; Zirbel, Craig L.; Westhof, Eric; Leontis, Neocles B. (2009-02-24). "Frequency and isostericity of RNA base pairs". Nucleic Acids Research. 37 (7): 2294–2312. doi:10.1093/nar/gkp011. ISSN 0305-1048.
Sokoloski, J. E.; Godfrey, S. A.; Dombrowski, S. E.; Bevilacqua, P. C. (2011-08-26). "Prevalence of syn nucleobases in the active sites of functional RNAs". RNA. 17 (10): 1775–1787. doi:10.1261/rna.2759911. ISSN 1355-8382.
Reichert, J. (2002-01-01). "The IMB Jena Image Library of Biological Macromolecules: 2002 update". Nucleic Acids Research. 30 (1): 253–254. doi:10.1093/nar/30.1.253. ISSN 1362-4962.
Darty, Kévin; Denise, Alain; Ponty, Yann (2009-08-01). "VARNA: Interactive drawing and editing of the RNA secondary structure". Bioinformatics. 25 (15): 1974–1975. doi:10.1093/bioinformatics/btp250. ISSN 1460-2059. PMC 2712331. PMID 19398448.
"RNA Basepair Catalog". ndbserver.rutgers.edu. Retrieved 2019-12-17.
"RNA Base Pair Database(RNABPDB)". hdrnas.saha.ac.in. Retrieved 2019-12-17.
Bhattacharya, Sohini; Mittal, Shriyaa; Panigrahi, Swati; Sharma, Purshotam; S. P., Preethi; Paul, Rahul; Halder, Sukanya; Halder, Antarip; Bhattacharyya, Dhananjay (2015-01-01). "RNABP COGEST: a resource for investigating functional RNAs". Database. 2015. doi:10.1093/database/bav011. ISSN 1758-0463. PMC 4360618. PMID 25776022.
Chawla, Mohit; Oliva, Romina; Bujnicki, Janusz M.; Cavallo, Luigi (2015-06-27). "An atlas of RNA base pairs involving modified nucleobases with optimal geometries and accurate energies". Nucleic Acids Research. 43 (14): 6714–6729. doi:10.1093/nar/gkv606. ISSN 0305-1048.
Seelam, Preethi P.; Sharma, Purshotam; Mitra, Abhijit (2017). "Structural landscape of base pairs containing post-transcriptional modifications in RNA". RNA. 23 (6): 847–859. doi:10.1261/rna.060749.117. ISSN 1355-8382.
Leontis NB, Westhof E (April 2001). "Geometric nomenclature and classification of RNA base pairs". RNA. 7 (4): 499–512. doi:10.1017/S1355838201002515. PMC 1370104. PMID 11345429.
Halder S, Bhattacharyya D (November 2013). "RNA structure and dynamics: a base pairing perspective". Progress in Biophysics and Molecular Biology. 113 (2): 264–83. doi:10.1016/j.pbiomolbio.2013.07.003. PMID 23891726.
Sponer JE, Leszczynski J, Sychrovský V, Sponer J (October 2005). "Sugar edge/sugar edge base pairs in RNA: stabilities and structures from quantum chemical calculations". The Journal of Physical Chemistry B. 109 (39): 18680–9. doi:10.1021/jp053379q. PMID 16853403.
Sharma P, Sponer JE, Sponer J, Sharma S, Bhattacharyya D, Mitra A (March 2010). "On the role of the cis Hoogsteen:sugar-edge family of base pairs in platforms and triplets-quantum chemical insights into RNA structural biology". The Journal of Physical Chemistry B. 114 (9): 3307–20. doi:10.1021/jp910226e. PMID 20163171.
Heus HA, Hilbers CW (October 2003). "Structures of non-canonical tandem base pairs in RNA helices: review". Nucleosides, Nucleotides & Nucleic Acids. 22 (5–8): 559–71. doi:10.1081/NCN-120021955. PMID 14565230.
Olson WK, Li S, Kaukonen T, Colasanti AV, Xin Y, Lu XJ (May 2019). "Effects of Noncanonical Base Pairing on RNA Folding: Structural Context and Spatial Arrangements of G·A Pairs". Biochemistry. 58 (20): 2474–2487. doi:10.1021/acs.biochem.9b00122. PMC 6729125. PMID 31008589.
Roy A, Panigrahi S, Bhattacharyya M, Bhattacharyya D (March 2008). "Structure, stability, and dynamics of canonical and noncanonical base pairs: quantum chemical studies". The Journal of Physical Chemistry B. 112 (12): 3786–96. doi:10.1021/jp076921e. PMID 18318519.
Lu XJ, Olson WK (September 2003). "3DNA: a software package for the analysis, rebuilding and visualization of three-dimensional nucleic acid structures". Nucleic Acids Research. 31 (17): 5108–21. doi:10.1093/nar/gkg680. PMC 212791. PMID 12930962.
Fernandes CL, Escouto GB, Verli H (2013-06-28). "Structural glycobiology of heparinase II from Pedobacter heparinus". Journal of Biomolecular Structure & Dynamics. 32 (7): 1092–102. doi:10.1080/07391102.2013.809604. PMID 23808670.
Storz G, Altuvia S, Wassarman KM (2005-06-01). "An abundance of RNA regulators". Annual Review of Biochemistry. 74 (1): 199–217. doi:10.1146/annurev.biochem.74.082803.133136. PMID 15952886.
Huang L, Lilley DM (January 2018). "The kink-turn in the structural biology of RNA". Quarterly Reviews of Biophysics. 51: e5. doi:10.1017/S0033583518000033. PMID 30912490.
