The Martin Lab has a long history of studying fundamental mechanisms in transcription. Using powerful tools of biophysical chemistry and enzymology, we have focused on the (relatively) simple, single subunit RNA polymerase from bacteriophage T7. Our work is all with purified enzyme and largely with synthetic DNA templates, affording us exquisite control over the system. The story below represents insights from our lab, but of course, also from a wide variety of excellent labs, in both the T7 and multi-subunit RNA polymerase families.
While this is written from the perspective of the T7 family of single subunit enzymes, most of it also holds true for the multi-subunit RNA polymerases. Although the multi-subunit bacterial and eukaryotic RNA polymerases clearly have a common evolutionary ancestor, the single subunit polymerases have almost certainly evolved independently. That many features of transcription are the same or similar across these families reflects the common requirements that the process has imposed on evolutionary selection.
Some words about transcription
Unlike a textbook enzyme where substrate binds, catalysis occurs, and product is released, a polymerase must move along the DNA, with repeated cycles of catalysis/movement (the primary job of a DNA polymerase). An RNA polymerase maintains a short (about 8 base pair) RNA-DNA hybrid, within a melted bubble in the DNA, but then releases RNA beyond (upstream, away from the active site) that point into solution. The DNA to which the RNA, and the substrate NTPs, bind is called the template strand – it provides the encoding information. The DNA displaced in the bubble is the nontemplate strand and is actually not needed in vitro (except at the promoter).
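The encoding role of the template strand can be sketched in a few lines of Python. This is a minimal illustration, assuming standard Watson-Crick pairing and a template written 3′ to 5′; the function names and the example sequence are hypothetical and chosen only for illustration.

```python
# Pairing between a template DNA base and the RNA base it encodes.
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}
# Pairing between the two DNA strands.
DNA_TO_DNA = {"A": "T", "T": "A", "G": "C", "C": "G"}


def transcribe(template_3_to_5: str) -> str:
    """Return the RNA (written 5'->3') encoded by a template written 3'->5'."""
    return "".join(DNA_TO_RNA[base] for base in template_3_to_5.upper())


def nontemplate(template_3_to_5: str) -> str:
    """Return the displaced nontemplate strand (written 5'->3')."""
    return "".join(DNA_TO_DNA[base] for base in template_3_to_5.upper())


template = "CCCTATA"  # hypothetical template strand, written 3'->5'
print(transcribe(template))   # GGGAUAU
print(nontemplate(template))  # GGGATAT
```

The RNA has the same sequence as the nontemplate strand, with U in place of T. The nontemplate strand therefore carries no information that the template strand lacks.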
de novo Initiation. Unlike a DNA polymerase, an RNA polymerase must initiate transcription de novo – that is, without a pre-formed primer. And it must do so at specific sequences (promoters) in the DNA. Thus, T7 RNA polymerase binds a promoter sequence that extends upstream about 17 bases from the transcription start site. It recognizes features in the duplex region from positions -17 to -5, and melts a “bubble” extending from position -4 to about +3 (transcription starts at position +1, and there is no position 0!). Some of the binding energy from the duplex DNA contacts is used to melt open and maintain this initial bubble. The primary initiation event involves the first two substrate NTPs, sitting down at positions +1 and +2, followed by a phosphoryl transfer reaction to create the first phosphodiester bond, releasing pyrophosphate from the +2 NTP. The triphosphate from the +1 NTP is retained passively – we call this the 5′ end of the transcript. At this point, the polymerase must move forward (translocate) along the DNA (or the DNA move backward within the polymerase), to allow positioning of the +3 NTP in the active site, encoded by the +3 base in the template strand. Another phosphoryl transfer reaction occurs to generate a 3 base transcript (a 3mer RNA).
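Because the numbering skips zero, counting positions is an easy place to slip. The sketch below lists the positions in each promoter region described above. The helper names are hypothetical, and the downstream bubble edge of +3 is approximate, as stated in the text.

```python
def positions(first: int, last: int) -> list:
    """List promoter-relative positions from first to last, skipping 0."""
    if first == 0 or last == 0:
        raise ValueError("there is no position 0")
    return [p for p in range(first, last + 1) if p != 0]


def index_from_start(position: int) -> int:
    """Convert a promoter-relative position to a zero-based offset from +1."""
    if position == 0:
        raise ValueError("there is no position 0")
    return position - 1 if position > 0 else position


recognition = positions(-17, -5)  # duplex region recognized by the polymerase
bubble = positions(-4, 3)         # initially melted region (edge at +3 is approximate)

print(len(recognition))  # 13
print(bubble)            # [-4, -3, -2, -1, 1, 2, 3]
print(index_from_start(1), index_from_start(-1))  # 0 -1
```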
Initial transcription. To translocate again, on duplex DNA, the downstream end of the bubble must melt to expose the base at +4 in the template DNA. Since RNA polymerase retains its duplex promoter interactions, the upstream end of the DNA bubble remains fixed and so the bubble is now expanding (forward). Repeated rounds of translocation/phosphoryl transfer expand the bubble (if this region artificially contains no nontemplate strand, everything still occurs normally, in the absence of a bubble). Interestingly, another thing is happening as the RNA grows during initial transcription: the RNA-DNA hybrid duplex grows from 2, to 3, to 4, etc. base pairs. In T7 RNA polymerase, the growth of that hybrid, as it translocates backwards within the polymerase active site, very quickly causes it to “bump up” against a domain in the polymerase, pushing on it and causing it to move (translocate) and rotate. Addition of RNA bases at the active site lengthens the hybrid, which serves as a growing piston to drive this motion. This is critical, as energy from phosphoryl transfer is being converted into mechanical “strain” in the form of this piston motion.
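The bookkeeping of initial transcription can be sketched as a toy model. It assumes a fixed upstream bubble edge at -4, an initial downstream edge at about +3, a bubble that must extend far enough to expose the next template base, and a hybrid whose length equals the RNA length. These are simplifications for illustration, not measured values, and the function name is hypothetical.

```python
UPSTREAM_EDGE = -4           # fixed while promoter contacts are retained
INITIAL_DOWNSTREAM_EDGE = 3  # approximate downstream edge of the initial bubble
MAX_MODELED_LENGTH = 8       # the sketch stops at the ~8 bp elongation hybrid


def initial_transcription(max_rna_length: int) -> list:
    """Return (rna_length, downstream_edge, bubble_size, hybrid_length) tuples.

    Each tuple describes the complex poised to add the next base.
    """
    if not 2 <= max_rna_length <= MAX_MODELED_LENGTH:
        raise ValueError("this toy model covers RNA lengths 2 through 8 only")
    states = []
    for rna_length in range(2, max_rna_length + 1):
        next_position = rna_length + 1  # template base that must be exposed
        downstream_edge = max(INITIAL_DOWNSTREAM_EDGE, next_position)
        # There is no position 0, so -4..-1 contributes 4 bases and
        # +1..edge contributes `edge` bases.
        bubble_size = downstream_edge - UPSTREAM_EDGE
        hybrid_length = rna_length  # assumed: the whole short RNA is paired
        states.append((rna_length, downstream_edge, bubble_size, hybrid_length))
    return states


for state in initial_transcription(5):
    print(state)
# (2, 3, 7, 2)
# (3, 4, 8, 3)
# (4, 5, 9, 4)
# (5, 6, 10, 5)
```

In this sketch the bubble and the hybrid grow together, one base per cycle, while the upstream edge stays put.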
Transition to elongation. Why did nature evolve this piston/strain process? To consider this, we need to back up briefly. During the process just described, we have 2, 3, 4, … base pair RNA-DNA duplexes (as the hybrid is growing). Without an enzyme present, the bubble wouldn’t form in the first place, but even if it did, it would immediately collapse back down, displacing the newly synthesized 2, 3, 4 base RNA. Remember that the energy of promoter binding was used to melt the bubble and continues to be used to maintain the upstream edge of the bubble. Therefore, during initial transcription, when the RNA-DNA hybrid is short, nature must maintain those strong promoter contacts, to keep the bubble open. Eventually the hybrid becomes long enough (see below) to resist collapse of the bubble, and at this point, the promoter contacts can be lost. Indeed, at this point, it is important to release the duplex promoter contacts, so that the enzyme can move 10, 100, 1000 bases downstream. The promoter contacts are strong – we need an input of energy to drive release of those contacts, but the only favorable energy input in the system derives from the phosphoryl transfer reaction. The growing RNA-DNA hybrid piston is the mechanical/energetic coupling nature needs. Its growth drives a structural change that ultimately leads to weakening of promoter contacts – and promoter release. The use of the growing hybrid as a piston has evolved independently (at least) twice. Although the structural details are very different, the multi-subunit RNA polymerases in bacteria and eukaryotes also use the hybrid as a mechanistic piston to drive promoter release.
Elongation. In both T7 RNA polymerase and the multi-subunit RNA polymerases, the enzyme maintains an ≈8 base pair RNA-DNA hybrid. As the complex steps forward during a normal elongation cycle, a base pair is melted downstream, while a base pair re-anneals upstream, for what is (mostly) a smooth progression. Why 8 base pairs? It has been proposed that 8 base pairs is sufficient for the RNA to be topologically locked (helically) around the DNA template strand. It is that topological threading that confers the extreme stability of an elongating complex.
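A small sketch shows what a constant hybrid length means for the growing transcript. It assumes a hybrid of exactly 8 base pairs (the text gives this as approximate) and uses a hypothetical function name and transcript sequence.

```python
HYBRID_LENGTH = 8  # approximate; the text gives about 8 base pairs


def split_transcript(rna: str, hybrid_length: int = HYBRID_LENGTH) -> tuple:
    """Split a 5'->3' transcript into (released 5' part, 3' part in the hybrid)."""
    if hybrid_length < 0:
        raise ValueError("hybrid_length must be non-negative")
    boundary = max(0, len(rna) - hybrid_length)
    return rna[:boundary], rna[boundary:]


rna = ""
for base in "GGGAGACCACAAC":  # hypothetical transcript, added one base at a time
    rna += base
    released, hybrid = split_transcript(rna)

print(released, hybrid)         # GGGAG ACCACAAC
print(split_transcript("GGG"))  # ('', 'GGG')
```

Once the transcript is longer than the hybrid, each added base at the 3′ end is matched by one base released at the 5′ end of the hybrid, which mirrors the melt-one, re-anneal-one stepping described above.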
Dissociation/termination. If the topological lock is a major stabilizing factor in elongation, then one must “un-thread” that lock to achieve dissociation of the complex. It is proposed that this happens in hairpin-dependent terminators and in Rho-dependent termination in the bacterial RNA polymerase. In the former case, formation of a hairpin in the exiting RNA “pushes” the polymerase forward (hyper-forward translocation), unwinding the topological lock. In the latter case, an ATP-driven motor protein achieves the same end. In runoff transcription (perhaps rare in biology, but very common in laboratory applications), when the polymerase gets to the end of the DNA, there is no downstream DNA duplex and therefore that barrier to hyper-forward translocation is gone. The polymerase slides forward and unthreads the lock, to achieve dissociation.
Pausing. Just as an elongation complex can hyper-forward translocate, it can also reverse translocate (backtracking). In either case, the RNA 3′ hydroxyl is moved out of the active site and elongation necessarily halts. Transcription resumes only when the complex returns to the productive cycle position, with the 3′ hydroxyl at the active site. Backtracking has often been observed in multi-subunit RNA polymerases (but has not yet been directly demonstrated in the single subunit polymerases).