Glen Report 33-15: Technical Note - DNA and RNA Nucleoside Numbering System

DNA and RNA nucleotide building blocks contain three components: a heterocyclic base, a pentose sugar, and a phosphate. For the first two components, a precise numbering and nomenclature system is needed so that each chemical structure and name can be communicated unambiguously. This article explains the atom numbering system and associated nomenclature in DNA and RNA nucleotide building blocks, both of which are governed by the International Union of Pure and Applied Chemistry (IUPAC) rules and guidelines.

The numbering system for the heterocyclic base is arguably the most interesting, as it covers the two types of heterocyclic bases: purine bases (A and G) and pyrimidine bases (C, T, and U). In this numbering system, all ring nitrogen atoms have odd numbers: 1, 3, 7, and 9 (the last two apply only to purine bases). It should be noted that all structures in our catalog and website depict nucleosides in their syn-conformation for convenience, as opposed to the anti-conformation found in Watson-Crick base pairs. The syn- and anti-conformations differ only in the rotation of the N-glycosidic bond (Figure 1).


Figure 1. The Numbering System for the Purine and Pyrimidine rings

While the numbering may appear arbitrary, IUPAC developed a defined numbering system to reduce ambiguity. The key question is where to start: for a ring system, numbering starts on the largest ring, at the first substitution of carbon. Numbering then continues in the direction that gives substitutions the lowest possible numbers.

Pyrimidine ring numbering starts by assigning number “1” to the nitrogen atom bonded to the pentose sugar (the N-glycosidic bond) and then proceeds clockwise around the ring (Figure 1-A). A purine consists of two fused rings: a six-membered ring (pyrimidine) and a five-membered ring (imidazole). In adenine, numbering of the pyrimidine ring runs counterclockwise, from nitrogen atom “1” to carbon atom “6,” which is bonded to the exocyclic amine. Numbering of the imidazole ring runs clockwise, starting with nitrogen atom “7” and ending with nitrogen atom “9” at the N-glycosidic bond position (Figure 1-B).
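As an illustration of the ring numbering described above, the positions can be written down as a small lookup table. This is only a sketch: the names `PYRIMIDINE_RING`, `PURINE_RING`, `nitrogen_positions`, and `atom_label` are hypothetical and chosen for this example. It encodes the standard ring positions and confirms that the ring nitrogens fall on odd numbers.

```python
# Ring atoms keyed by position number; values are element symbols.
# All names here are illustrative, not part of any library.
PYRIMIDINE_RING = {1: "N", 2: "C", 3: "N", 4: "C", 5: "C", 6: "C"}
PURINE_RING = {
    1: "N", 2: "C", 3: "N", 4: "C", 5: "C", 6: "C",  # six-membered ring
    7: "N", 8: "C", 9: "N",                          # rest of the imidazole ring
}


def nitrogen_positions(ring):
    """Return the sorted position numbers occupied by nitrogen."""
    return [pos for pos, element in sorted(ring.items()) if element == "N"]


def atom_label(ring, position):
    """Build a label such as 'N7' or 'C8' for a ring position."""
    return f"{ring[position]}{position}"


if __name__ == "__main__":
    print(nitrogen_positions(PYRIMIDINE_RING))
    print(nitrogen_positions(PURINE_RING))
    print(atom_label(PURINE_RING, 8))
```

Labels built this way match the position prefixes used in modification names, for example the 8-position of a purine or the 5-position of a pyrimidine.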

The numbering of the pentose sugar is more straightforward than that of the nucleobases. IUPAC assigns number “1” to the carbon atom in the carbon-nitrogen linkage (the glycosidic bond). Carbon “1” of the pentose sugar bonds to nitrogen “1” in the pyrimidine ring or nitrogen “9” in the purine ring. From carbon “1,” the numbering follows the carbon chain sequentially in a clockwise direction to carbon “5” (Figure 1-C).
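The glycosidic linkage rule can be sketched the same way. The example below is a minimal illustration with hypothetical names (`SUGAR_CARBONS`, `BASE_CLASS`, `glycosidic_bond`). It assumes the common convention of writing sugar atoms with a prime (1' to 5') to keep them distinct from base atoms, as in the 2’-OMe and 2’-F designations used in this article.

```python
# Sugar carbons, primed to distinguish them from base ring positions.
SUGAR_CARBONS = ["C1'", "C2'", "C3'", "C4'", "C5'"]

# Base class for the five standard bases named in this article.
BASE_CLASS = {
    "A": "purine", "G": "purine",
    "C": "pyrimidine", "T": "pyrimidine", "U": "pyrimidine",
}

# Ring nitrogen that forms the N-glycosidic bond in each base class.
GLYCOSIDIC_NITROGEN = {"purine": "N9", "pyrimidine": "N1"}


def glycosidic_bond(base):
    """Return (base atom, sugar atom) joined by the N-glycosidic bond."""
    base_class = BASE_CLASS[base.upper()]
    return (GLYCOSIDIC_NITROGEN[base_class], SUGAR_CARBONS[0])


if __name__ == "__main__":
    for base in "AGCTU":
        print(base, glycosidic_bond(base))
```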

Glen Research offers a large catalog of phosphoramidites, and a significant portion of these have modified nucleobases. More than 70 modified nucleobases are available. This number is for DNA, and would be notably higher if RNA, 2’-OMe, and 2’-F versions were included. The 5-position in pyrimidines is by far the most popular attachment point for modifications. This is because 1) many natural modifications are attached at this location, 2) it is convenient to modify at a position that does not interfere with hybridization, and 3) modifications at this position are generally tolerated by many nucleic acid-modifying enzymes. There are 5-halide modifications for crosslinking (Figure 2-A), alkynes for click chemistry, amino-modifiers for sequence modification, and oxidized modifications for epigenetic study. In addition to the 5-position, modifications can also be found at the 2-position and 4-position. For instance, 2-thio-dT is useful in examining protein-DNA interaction by acting as a photosensitizing probe, and 4-thio-dT is useful for photo cross-linking and photo affinity labeling experiments (Figure 2-B). Finally, several modifications involve multiple positions, including tricyclic cytosines (several of which are fluorescent) via the 4- and 5-positions, and 5,6-dihydropyrimidines, naturally occurring modifications that are formed as damage products when DNA is exposed to ionizing radiation.


Figure 2. The Numbering System for the Pyrimidine Modifications

For the purines, the modifications are more spread out, partly because there are more positions available. At the 8-position, there are 8-Oxo-purines that allow for investigation of the structure and activity of oligonucleotides containing an 8-Oxo mutation, which is formed naturally when DNA is subjected to oxidative conditions or ionizing radiation. In addition, there are 8-Bromo-purines that can be used in crystallography studies of oligonucleotide structure and cross-linking studies of protein-DNA complex structure. Finally, there are 8-Amino-purines that can be very useful in triplex formation. It should be noted that, like 8-Oxo-dG, 8-Amino-dG can also be a mutagenic lesion. Another set of modifications, used to modulate the base pairing properties of the purine, involves the 2- and 6-positions. Some of these will base pair with both dC and dT (dI, dK), while others form an additional interaction with dT (2-Amino-dA) or an artificial base pair with isodC (isodG).

In addition to direct substitutions on the purines and pyrimidines, there are two other categories of modifications. The first comprises modifications to the ring atoms themselves: a carbon in place of a nitrogen (deaza), or vice versa (aza). For instance, 7-deaza-G has a carbon at position 7, while 5-aza-dC contains a nitrogen at position 5 (Figure 2-C and D). These substitutions affect the hydrogen bonding interactions of the resulting nucleobases. The second category comprises modifications on amines or hydroxyls attached to the purine or pyrimidine rings. In these cases, the modifications are designated “NX” or “OX,” where “X” is the number of the ring atom that carries the amine or hydroxyl. For example, N6-methyl-dA has a methylated exocyclic amine at the 6-position (Figure 2-E).
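The naming conventions in this section (a position number, an optional “N” or “O” for an exocyclic atom, and the deaza/aza terms) are regular enough to read mechanically. The following is a rough sketch rather than a full nomenclature parser: the function `parse_modification` and its category labels are hypothetical, and it only inspects the leading prefix of a name.

```python
import re

# Optional exocyclic-atom letter (N or O), one or more comma-separated
# ring positions, a hyphen, then the rest of the name. Illustrative only.
_PREFIX = re.compile(r"^(?P<hetero>[NO])?(?P<positions>\d+(?:,\d+)*)-(?P<rest>.+)$")


def parse_modification(name):
    """Return position info for names like 'N6-methyl-dA', else None."""
    match = _PREFIX.match(name)
    if match is None:
        return None  # no leading base position, e.g. 'dI' or "2'-OMe"
    positions = [int(p) for p in match.group("positions").split(",")]
    rest = match.group("rest")
    if match.group("hetero"):
        kind = "exocyclic " + match.group("hetero")  # group on an N or O
    elif rest.lower().startswith("deaza"):
        kind = "deaza"  # ring nitrogen replaced by carbon
    elif rest.lower().startswith("aza"):
        kind = "aza"  # ring carbon replaced by nitrogen
    else:
        kind = "substituent"  # direct substitution at the ring position
    return {"positions": positions, "kind": kind, "remainder": rest}


if __name__ == "__main__":
    for example in ["N6-methyl-dA", "7-deaza-G", "5-aza-dC", "8-Oxo-dG", "5,6-Dihydro-dT"]:
        print(example, parse_modification(example))
```

Names that begin with a primed sugar position, such as 2'-OMe, deliberately do not match, since the prime marks a sugar atom rather than a base position.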

The list below includes all the pyrimidine and purine modifications discussed and many more. These are the DNA versions only. The list is not exhaustive, and very closely related products, such as those that differ only by a protecting group, are not shown. Some products suit multiple applications but are listed only once.

 

Technical Note - DNA and RNA Nucleoside Numbering System: Pyrimidine and Purine Modifications

| Applications | Pyrimidine | Catalog No. | Purine | Catalog No. |
|---|---|---|---|---|
| Conjugation/Click Chemistry | TIPS-5-Ethynyl-dU-CE Phosphoramidite | 10-1555 | | |
| | C8-Alkyne-dT-CE Phosphoramidite | 10-1540 | | |
| | C8-Alkyne-dC-CE Phosphoramidite | 10-1543 | | |
| Cross-Linking/Halogenated Nucleosides | 5-Br-dC-CE Phosphoramidite | 10-1080 | 8-Br-dA-CE Phosphoramidite | 10-1007 |
| | 5-Br-dU-CE Phosphoramidite | 10-1090 | 8-Br-dG-CE Phosphoramidite | 10-1027 |
| | 5-I-dC-CE Phosphoramidite | 10-1081 | | |
| | 5-I-dU-CE Phosphoramidite | 10-1091 | | |
| DNA Damage/Repair | 5-OH-dC-CE Phosphoramidite | 10-1063 | 1-Me-dA-CE Phosphoramidite | 10-1501 |
| | 5-Hydroxymethyl-dU-CE Phosphoramidite | 10-1093 | 8-Oxo-dA-CE Phosphoramidite | 10-1008 |
| | 5-OH-dU-CE Phosphoramidite | 10-1053 | 8-Oxo-dG-CE Phosphoramidite | 10-1028 |
| | 5,6-Dihydro-dT-CE Phosphoramidite | 10-1530 | 8-Amino-dG-CE Phosphoramidite | 10-1079 |
| | 5,6-Dihydro-dU-CE Phosphoramidite | 10-1550 | | |
| | Cis-syn Thymine Dimer Phosphoramidite | 11-1330 | | |
| | Thymidine Glycol CE Phosphoramidite | 10-1096 | | |
| Duplex Stability | 5-Me-dC-CE Phosphoramidite | 10-1060 | 2-Amino-dA-CE Phosphoramidite | 10-1085 |
| | AP-dC-CE Phosphoramidite | 10-1097 | N6-Me-dA-CE Phosphoramidite | 10-1003 |
| | dW-CE Phosphoramidite | 10-1527 | N6-Ac-N6-Me-dA-CE Phosphoramidite | 10-1503 |
| | N4-Et-dC-CE Phosphoramidite | 10-1068 | Pac-2-Amino-dA-CE Phosphoramidite | 10-1585 |
| | pdC-CE Phosphoramidite | 10-1014 | | |
| | pdU-CE Phosphoramidite | 10-1054 | | |
| Epigenetics/DNA Methylation | 5-Carboxy-dC-CE Phosphoramidite | 10-1066 | 1-Me-dA-CE Phosphoramidite | 10-1501 |
| | 5-Formyl dC III CE Phosphoramidite | 10-1564 | O6-Me-dG-CE Phosphoramidite | 10-1070 |
| | 5-Hydroxymethyl-dC-CE Phosphoramidite | 10-1062 | | |
| PCR Sequencing/Duplex Effects | dmf-5-Me-isodC-CE Phosphoramidite | 10-1065 | dmf-isodG-CE Phosphoramidite | 10-1078 |
| | dP-CE Phosphoramidite | 10-1047 | dK-CE Phosphoramidite | 10-1048 |
| | | | dI-CE Phosphoramidite | 10-1040 |
| Sequence Modification/Amino-Modifiers | Amino-Modifier C2 dT | 10-1037 | Amino-Modifier C6 dA | 10-1089 |
| | Amino-Modifier C6 dT | 10-1039 | N2-Amino-Modifier C6 dG | 10-1529 |
| | Amino-Modifier C6 dC | 10-1019 | 8-Amino-dA-CE Phosphoramidite | 10-1086 |
| Structural Studies/Activity Relationship | 2’-deoxypseudoU-CE Phosphoramidite | 10-1055 | 7-Deaza-dA-CE Phosphoramidite | 10-1001 |
| | 2-Thio-dT-CE Phosphoramidite | 10-1036 | 7-Deaza-dG-CE Phosphoramidite | 10-1021 |
| | 4-Thio-dT-CE Phosphoramidite | 10-1034 | 3-Deaza-dA-CE Phosphoramidite | 10-1088 |
| | 4-Thio-dU-CE Phosphoramidite | 10-1052 | 5-aza-5,6-dihydro-dC-CE Phosphoramidite | 10-1511 |
| Structural Studies/Fluorescent Nucleosides | tC-CE Phosphoramidite | 10-1516 | Etheno-dA-CE Phosphoramidite | 10-1006 |
| | tC°-CE Phosphoramidite | 10-1517 | | |