Lecture 11: Molecular Biology 2

Topics covered: Molecular Biology 2

Instructors: Prof. Eric Lander

OK, so what I'd like to do today is pick up where we left off last time, with respect to how this genetic material actually functions.

We discussed last time the experiments that identified DNA as the fundamental genetic material, the transforming principle. We discussed the eventual work by Crick and Watson on the structure of DNA as a double helix.

We mentioned why that was so tremendously important, because it contained within it in principle the secret of replication, namely two strands, each of which contained the full information, and therefore each of which could in principle serve as a template for making the other strand. And that is, after all, the big issue about life: how do you, in fact, copy life? And then, I mentioned briefly these experiments by these post-docs, Matt Meselson and Frank Stahl, about 50 years ago to demonstrate that the semi-conservative model of DNA replication was right by virtue of actually labeling DNA during the course of its replication in one generation, and demonstrating that DNA actually changed in its density when you added in an isotope of nitrogen. And, it changed in its density in such a way as to be intermediate between what you'd expect from heavy-heavy and light-light. You have the intermediate.
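The bookkeeping behind that intermediate density can be sketched in a few lines of Python. This is only an illustration, assuming a starting duplex with two heavy strands that then replicates in light medium; the labels `H` and `L` and the function names are made up for the example.

```python
from collections import Counter

def replicate(duplexes, new_label="L"):
    """Semi-conservative replication: each old strand templates one new strand."""
    next_generation = []
    for strand_a, strand_b in duplexes:
        next_generation.append((strand_a, new_label))
        next_generation.append((strand_b, new_label))
    return next_generation

def density_classes(duplexes):
    """Count duplexes by density class: 'HH', 'HL' or 'LL'."""
    return Counter("".join(sorted(duplex)) for duplex in duplexes)

# Assumed starting point: one fully heavy duplex, then growth in light medium.
population = [("H", "H")]
for generation in range(1, 3):
    population = replicate(population)
    print(generation, dict(density_classes(population)))
```

After one round every duplex is a heavy-light hybrid, which is the intermediate band; a second round gives half hybrid and half fully light.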

So, that was all good experimental confirmation that this model was probably right. But now, how does it really work?

After all the excitement calms down for a moment you say, OK, that's great. We now know in principle it's there, but what actually goes on? How is DNA really replicated?

How is it really read out into information? How does it really, as Archibald Garrod noted, and as Beadle and Tatum noted, how does it really make protein as well?

How does it encode the instructions for that? Well, that was what was on people's minds in the late '50s.

And, it was Francis Crick who was the real intellectual thinker about this. And, the eventual synthesis that you guys all know, because, again, all this stuff gets taught in elementary school these days, was encapsulated in the central dogma of molecular biology, which I will summarize here diagrammatically.

The DNA is replicated to make copies of DNA.

It's read out into the intermediate RNA, and then it is translated into protein. This process: translation. This process is called transcription. And this process: replication.

And what I'd like to do is go into some detail today about how each of these processes work. Now, at the beginning, when people were trying to patch this together, it wasn't as obvious as it is to you today, that DNA goes to RNA, goes to protein. And, in fact, it was a real struggle to figure out what this RNA stuff was doing in the middle, how it could possibly give rise to protein. I want to talk about some of that.

Let me briefly mention, though, Francis Crick's term, the central dogma, because it sometimes gets criticized, the word dogma there, as being like religious belief, as if molecular biologists treated it in this way.

I've read a couple of social scientists who sort of say, dogma. In fact, Francis Crick deliberately named this the central dogma because he said there was no proof for it at the time it was put forward. He put it forward with that word precisely to emphasize that this was a working guess. But, it was merely a matter of belief that this is sort of how they were putting together the pieces.

And it was really a question of demonstrating how all these pieces work. We still call it the central dogma, but it's now, of course, extraordinarily well established.

Let's look at this first piece. DNA is replicated. All right, so Meselson and Stahl tell us that, yeah, the DNA might look like the new strand, the old strand, all that. How would you really demonstrate DNA replication? If you wanted to show me that DNA replication really happens, that DNA goes to DNA, that somehow you take a double strand of DNA and it gives rise to... It's one thing to show this in a bacterium by adding the nitrogen and all that. The way to really prove this would be in a test tube. In vitro, reconstitute for me DNA replication. Show me that in a cell-free system, you can take DNA, and you can copy it as you would expect according to the Crick-Watson model here.

Well, that is what Arthur Kornberg set out to do. Arthur Kornberg was a biochemist, and so his interest was to crack open the cell and purify an enzyme that was able to copy DNA. Now, how do you do that? What cells should you pick?

Sorry? Why E. coli? Why a bacterium? It's simple, exactly. Good answer. You can grow up a lot of it, and presumably, if this DNA replication thing is right, it will apply to any organism. So, we'll go with E. coli. So, what do you do? You just crack open a cell and purify components, and throw them in a test tube, and look for DNA synthesis? Well, you've got to put something in the test tube. What should we put in the test tube? Sorry? Nucleotides, because we think that this is going to be made out of nucleotides. So, we'd better add some nucleotides to our test tube.

So, actually, deoxynucleotides: we'll add some dATP, dCTP, dGTP, and dTTP, the deoxynucleotide triphosphates, altogether known as the dNTPs. OK, that's good.

So, we're going to take different fractions of the cell.

We'll add it here. We'll add some nucleotides, and what else should we add? Well, if we were going to copy DNA, maybe we ought to put in a DNA strand. Let's put in a DNA template. So, let's put in a template strand of DNA that we'll copy, here we go, and we've got our nucleotides floating around here.

And, here's our template strand, a single strand of DNA, and now we add enzymes, and we hope that it's going to somehow copy the DNA. Now, it turns out that that's a little bit optimistic because in order to copy the DNA, and I think Kornberg had this insight, it's helpful to give it a start. So, instead of just adding a single template strand, he also added a short complementary primer strand with the hope that he would be able to purify an enzyme, which even if it couldn't manage to start the synthesis of DNA, would be able to extend the synthesis of DNA.

That's a reasonable thing. Let's not ask for it all at once.

Maybe it won't be a single fraction. Maybe multiple enzymes would be needed to get going. So, he needed a primer strand, a template strand, and some nucleotides. And then he added fractions, and he looked to see whether he could get incorporation of DNA. So now, let's look at this a little more closely. The primer strand goes like this. Five prime, ah, this direction is going to matter a lot, I told you.

Phosphate T, phosphate A, phosphate C, phosphate G, phosphate T, phosphate A, stop there. Template strand, the complement to that, will start in the opposite direction.

These are anti-parallel. What matches the T: A. Keep going: T, G, C, A, T, and phosphate, phosphate, phosphate, phosphate, phosphate; I'll stop writing the phosphates in a while. Let's say T, A, G, G, C, etc. This is the five prime end. That is the three prime end, OK? And, this one will go on further, let's say.

All right, so what is the enzyme that Kornberg hopes to find going to do? What's it going to add to the strand? It's going to add an A.

All right, it wants to put in an A here. So, it's going to take a triphosphate, and it's going to catalyze the addition of a triphosphate to the growing end of this DNA chain, and which is its growing end? The three prime end of the chain there, right? It's adding it to the three prime carbon there. And, when it does that, where is it going to get the energy for catalysis here for this chemical reaction here? It's going to get it from the dehydration synthesis and the breaking of this triphosphate bond, which is a high-energy bond. You'll take off your inorganic pyrophosphate and you'll add in an A. That's it.

Then, it will go off and it'll look for, what, a T, a triphosphate with T, dTTP, and then dCTP, etc.

And it adds them in. This enzyme, this hypothetical enzyme, that can polymerize DNA like that is called polymerase. It's all very simple stuff. This is DNA polymerase. OK, and the nomenclatures here make tremendous sense. This is called DNA polymerase.
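The extension step on the board can be written out as a small Python sketch. It assumes the two strands are typed left to right exactly as drawn (primer 5' to 3', template 3' to 5'), so that facing bases share an index; `extend_primer` is a hypothetical helper for illustration, not a model of the enzyme's chemistry.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

def extend_primer(primer_5to3, template_3to5):
    """Extend the primer's 3' end one base at a time along the template.

    Both strings are written left to right as aligned on the board,
    so position i of the primer faces position i of the template.
    """
    # The primer must already be base-paired with the start of the template.
    for primer_base, template_base in zip(primer_5to3, template_3to5):
        if COMPLEMENT[template_base] != primer_base:
            raise ValueError("primer does not match template")
    product = primer_5to3
    for template_base in template_3to5[len(primer_5to3):]:
        # Each step stands for one incoming dNTP added at the 3' end.
        product += COMPLEMENT[template_base]
    return product

primer = "TACGTA"         # 5' -> 3', as on the board
template = "ATGCATTAGGC"  # 3' -> 5', as on the board
print(extend_primer(primer, template))  # TACGTAATCCG
```

The first two bases added are A and then T, matching the order described above; the chain only ever grows at its 3' end.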

Anyway, Kornberg isolated by a lot of work DNA polymerase, and was able to demonstrate that it could in fact catalyze this reaction. This was incredibly exciting.

He got a Nobel Prize for this amongst other things, but he really demonstrated that there were proteins that could copy DNA according to this double helical model for replication.

I call your attention to the fact that the replication goes five prime to three prime always, ever, all the time. This is universal. No one has ever found a DNA polymerization system in nature where it goes the other way. And, why would that be? This is just a digression. But tell me why that would be?

Let's take our strand here, T, G, C, A, T, T, A, G, C, G, T, why not go this way?

Why not go, let's say, A, G, C, G. Let's see, what base should we put in? We'll take our triphosphate T, right?

We'll put that in. Let's see, where are we going to get the triphosphate bond; where are we going to get the energy?

The triphosphate's on the wrong end. Oh, that's not a problem because when we put this G in, it must be that its triphosphate was still there, right? So, now we'll take the next one, a triphosphate T, and now why don't we just carry out the polymerization using the triphosphate bond, the energy from the triphosphate bond, on a growing chain going in that direction? That would work, right? Just stick this guy here. It'll supply a new triphosphate at the end, and that triphosphate can be used to catalyze the next monomer.

So, what's the problem? You could put the triphosphates on the growing chain. If we went this way, the triphosphate bond would be on the growing chain, rather than in this way the triphosphate is on the monomer.

But who cares? Who might care?

If you were designing it, which way would you prefer to do it?

The one with the energy first, well, why do you care whether the triphosphate is on this big, long chain that you've made, or whether it's on this monomer because either way you've got a triphosphate bond that could be on the monomers floating around, or it could be in that last position with the growing chain. Yeah?

Could be, could be. What kind of mistake might I make?

Yep. And, you know, what other kind of mistakes can happen?

What about these high-energy triphosphate bonds: unstable?

What if they should just spontaneously hydrolyze?

Oops: big trouble, right? You've lost your triphosphate bond. But what if this one spontaneously hydrolyzes? Aren't you in trouble? No, get another monomer, right? Clearly, it's no big deal if one of the monomers spontaneously hydrolyzes from a triphosphate to a monophosphate, but it's a big deal if you've invested all of this energy going in the other direction, and it should spontaneously hydrolyze.

So, it makes a great deal more sense to leave that high-energy bond on the monomer for the growing polymer rather than on the polymer itself.

And, in fact, of course, nature hasn't told me why it chose to do this. This is my reason why I think nature chose to do this, but I think it's very reasonable, and I think it's right. So, this is not the way it's done. This is the way it's done, and it's always done that way. No one has ever found a case where it's not. OK, so now let's look a little more closely at DNA replication. Suppose I take not just this teeny little piece that Kornberg gives, but suppose I now look at what's going on in an organism. An organism might have a big, long chromosome. DNA replication is occurring along this chromosome.

We've got to go five prime to three prime, five prime to three prime.

Let's suppose there's a primer here. Wait a second, where's the primer going to come from? If Kornberg's not there to add the primer, what does the organism do? It's got to kind of make one itself, and it's going to need some enzyme to make it. So, what enzyme's going to make it?

Primase: it turns out, remarkably, coincidentally, that it's primase that makes the primer. It's funny how that works out. And so, primase makes the primer, and then what happens? Then, DNA polymerase comes along and catalyzes the addition, and works beautifully.

What about on the other strand? So, it's got a what?

Why does it have to play catch-up? Let's see, what kind of primer here?

It's got to go the other way.

OK, so let's get a primer here. So, but wait a second, now it breathes and opens up a little more. We've got to get a primer here.

And then, when it's going to open up even more we've got to get a primer there. See, this guy's going the wrong way. So, in fact, this is what happens. When the DNA opens like this, one primer here is sufficient to keep going, but here as you begin to open this up, the other strand needs the continual addition of new primers, and then what happens when this DNA sequence here, growing, meets this DNA sequence there? They've got to be ligated together. They've got to be joined together. So, this is actually getting kind of complicated. We have little DNA fragments that have to be ligated together on this strand.

Now, how are you going to ligate them together?

Chemically, you've got to catalyze a covalent bond between this little growing DNA chain and the previous growing DNA chain that was there. How are you going to ligate them?

Ligase: yes! Coincidentally, it turns out that ligase does that.

It's just wonderful the way this worked out, that ligase should do the ligation, and primase should do the primer, and all that.

All right, so this goes on and on. Now, this model, which is what would be compelled by what we're thinking about is experimentally proven. There was a scientist who demonstrated that on this strand, this one goes slower, right, because it's got to, just as you said, catch up. Playing catch up, this is what's called the lagging strand. This guy is called the leading strand. The lagging strand plays catch-up to the leading strand.

And, these little fragments can actually be really, truly identified biochemically. They were identified, in fact, by somebody called Okazaki. And, do you know what they're called, those fragments? Okazaki fragments, exactly. That's what they're called. So, that's how it goes, and it goes with this continuous replication, and then this discontinuous replication there. Now, here's another problem. This upset people a lot. Try to take a long chromosome.
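To make the continuous versus discontinuous picture concrete, here is a toy Python sketch. It assumes the fork opens in equal steps and that each opening gets exactly one new lagging-strand primer; the lengths and the names `replicate_fork` and `ligate` are hypothetical, chosen only for illustration.

```python
def replicate_fork(total_length, opening_step):
    """Return (leading, lagging) as lists of (start, end) spans.

    Positions count from where the fork starts, in the direction it moves.
    """
    # Leading strand: one primer, extended continuously as the fork opens.
    leading = [(0, total_length)]
    # Lagging strand: every newly opened stretch needs its own primer,
    # giving one separate fragment per opening.
    lagging = []
    opened = 0
    while opened < total_length:
        newly_opened = min(opened + opening_step, total_length)
        lagging.append((opened, newly_opened))
        opened = newly_opened
    return leading, lagging

def ligate(fragments):
    """Join adjacent fragments into one continuous span (the job of ligase)."""
    return (fragments[0][0], fragments[-1][1])

leading, lagging = replicate_fork(total_length=1000, opening_step=200)
print(len(leading), "leading piece;", len(lagging), "lagging fragments")
print("after ligation:", ligate(lagging))
```

With these made-up numbers the leading strand is one piece while the lagging strand is five fragments that only become continuous after ligation.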

In fact, let's even imagine that it's a circular chromosome like bacteria have, a big DNA circle. Imagine trying to replicate this.

All right, we're going to pull this apart some. We'll start replicating as we'll continue to pull this apart, etc., etc., but the problem is that we're going to end up with this DNA helix and this DNA helix wrapped around each other so that we're going to have double helices, or we're going to have interlaced double helices.

It's really very messy. Topologically, if I take a double helix and I copy the two strands, and the double helices went around each other 800 times before they got to the end and joined up, I've now got two circles of DNA that are inextricably linked together with what's mathematically called the linking number of 800.

That's not very good when I try to now divide my cell and send one chromosome to one cell and one chromosome to the other cell, because I've got these two long, continuous ropes that are just so totally knotted with each other. This bothered people tremendously.

You can prove, mathematically, some of you take the topology courses that there is no way without cutting to pull apart two strings that are so intertwined with each other. So, how in the world is life going to do that? It's mathematically impossible to do that without actually cutting. So, it cuts it because it's got no choice, right? There's a theorem that says you have to cut it. So, it cuts it.

You would actually need, it turns out, that if you're going to separate out these two different double helices that are all wound up around each other, you're going to need to somehow cut the DNA, separate it, and pass it through the other side.

And, you're going to need to do that to un-knot this thing.

Now, does it change it chemically when you cut it and bring it around to the other side of the string? It's still the same molecule, right? It's the same DNA, but topologically it's different.

The two circles are now not linked 800 times; they're linked 799 times, and I keep doing that. So they are, you could call them, topoisomers, because they differ only in their topology; they're topoisomers. So, you would need an enzyme that actually cuts the DNA, and is clever enough to pass it to the other side and then seal it back up, and cut the DNA, and pass it through the side and seal it back up.

What enzyme does that? Topoisomerase does that, that's right. And, there are topoisomerase enzymes that cut and paste the DNA to resolve this terrible linking number problem.

So, life has worked all this stuff out, and there's just fascinating work that goes on to understand, woops, all of the steps there of DNA replication. Now, I mentioned that these are actually pretty important things because processes like this are very important to rapidly growing cells. It turns out that some very good anti-cancer drugs are inhibitors of topoisomerase because rapidly growing cancer cells are highly sensitive to the need to continue to topologically untangle your DNA. And so, topoisomerase inhibitors turn out to be pretty good, well, they're not great, but they turned out to be acceptable cancer drugs.

Here's another issue: fidelity. The fidelity of DNA replication.

If I'm copying the DNA, I'm going to put in my next base.

It's a T. I want to put in an A, a G, I want to put in a C; how do I get it right? I have my DNA polymerase enzyme here.

How do I manage to get this right? Why don't I put in a G next to the T instead of an A? Well, it's energetically less favored, right? Energetically, there's some cost.

There's a delta G, an energetic difference between the right base and the wrong base. Now, if I know delta G, I from biochemistry know the equilibrium constant.

I should be able to calculate, based on the energetic difference between putting in the right base and the wrong base how often DNA polymerase makes a mistake, and it turns out you can do that.

It turns out that the equilibrium constant is about 10^3.

That means that DNA polymerase, remarkably, gets it right 99.9% of the time, it puts it in the right base. Isn't that impressive?

No, it's terrible. Why is that terrible? Yeah, 99.9% this is no Six Sigma performance or anything. This is pretty unimpressive stuff. I mean, a typical gene is more than 1,000 letters. That means we're going to actually make a mistake on average in every gene. This won't do.
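The arithmetic here is easy to check. The Python sketch below assumes independent errors at each position, a gene of exactly 1,000 bases, and a temperature of 310 K for converting the 10^3 ratio into a free-energy difference; the function names are made up for the example.

```python
import math

R_GAS = 8.314  # gas constant in J/(mol*K)

def delta_g_kj_per_mol(discrimination, temperature_k=310.0):
    """Size of the free-energy gap implied by a right/wrong ratio: RT ln K."""
    return R_GAS * temperature_k * math.log(discrimination) / 1000.0

def expected_errors(error_rate, length):
    """Average number of wrong bases in a copy of the given length."""
    return error_rate * length

def prob_error_free(error_rate, length):
    """Chance that every base in the copy is right, assuming independent errors."""
    return (1.0 - error_rate) ** length

error_rate = 1e-3   # one mistake per 10^3 bases, i.e. 99.9% correct
gene_length = 1000  # assumed round number for a typical gene

print(f"free-energy gap: about {delta_g_kj_per_mol(1e3):.1f} kJ/mol")
print(f"expected errors per gene: {expected_errors(error_rate, gene_length):.1f}")
print(f"chance of an error-free gene: {prob_error_free(error_rate, gene_length):.3f}")
```

Under these assumptions, one mistake per gene on average means that only about a third of gene copies would come out error free.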

So, what happens? Sorry? Well, clearly the energetics say that the delta G is only enough to get us a factor of 10^3.

We're going to need an additional mechanism, and the additional mechanism's a proofreading. It's absolutely right.

We need to proofread this because we know that initially we're going to get it wrong at an unacceptably high rate. And so, it turns out that there are two kinds of DNA proofreading that go on.

First off, DNA polymerase itself has a proofreading activity.

Whenever DNA polymerase adds a base, it kind of also has an activity that will remove a base. So, it doesn't just add bases going forward. It also has what's called an exonuclease activity that removes bases going backwards.

Now, that may seem silly, right, because it's adding and subtracting, and adding and subtracting, but it adds more than it subtracts. And, the trick is that if there's a mismatched base, it's much more likely to subtract than to add, or much more likely to subtract than if there's not a mismatched base. So, the presence of a mismatch induces the enzyme to do its removal more than if there was a match.

In that fashion, DNA polymerase is able to substantially increase its accuracy to about one error in 10^5 or 10^6, much better than one in 10^3.

Then, it turns out that there are mismatch detection and repair enzymes. They come along after DNA polymerase has done its job, and they feel along the DNA for any mismatches. Mismatches are going to create funny structures. They're going to bulge in some way.

And, mismatch repair enzymes are able to detect that something's funny, and they chop out some sequence, and it gets copied back in. Now, with the proofreading that comes from these mismatch repair enzymes, you can get down to the neighborhood of one mistake in about 10^8 bases. In the course of the human, yes? Oh, what a great question! Because, when it has a mistake, how does it know who to correct? In bacteria, I can tell you the answer. Wouldn't it be cool if you could leave a mark on the old strand?

If the old strand could be temporarily marked in some way so that the enzyme, when it sees a mismatch, would also know which strand to cut out and re-synthesize?

It turns out that bacteria do that. Methylation enzymes actually mark the old strand. And, it takes a while before those methylation enzymes come along to mark the new strand, and it leaves a temporary mark as to who's the old strand.

I wasn't going to mention that today, but it's a great question.

So, it leaves breadcrumbs for a while that tells it who's the old strand. So, all of this gets worked out.
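To see what each layer of error correction buys, the three accuracy figures can be compared on a whole genome. In this Python sketch the genome size of three billion bases (roughly a haploid human genome) is an assumption not stated in the lecture, 10^6 is taken as the upper end of the proofreading range, and the names are hypothetical.

```python
GENOME_SIZE = 3_000_000_000  # assumed: roughly a haploid human genome

# Accuracy expressed as "one error in this many bases".
STAGES = {
    "base pairing alone": 10**3,
    "plus polymerase proofreading": 10**6,
    "plus mismatch repair": 10**8,
}

def errors_per_copy(genome_size, bases_per_error):
    """Expected number of mistakes in one full copy of the genome."""
    return genome_size / bases_per_error

for stage, bases_per_error in STAGES.items():
    errors = errors_per_copy(GENOME_SIZE, bases_per_error)
    print(f"{stage}: about {errors:,.0f} errors per genome copy")
```

Under these assumptions, mismatch repair cuts the expected number of new errors per copy by a factor of one hundred compared with proofreading alone.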

Yes? So, the exonucleases go backwards. They go three prime to five prime because, that's right, they only work in that direction. There are other exos that go in the other direction, but this exo on the polymerase goes backwards, three prime to five prime.

Now, this is not just theoretical stuff. It turns out that about one person in 400, that is, probably at least one person in this class, is heterozygous for a mutation in one of the mismatch repair enzyme genes, one of the genes like MSH2 or MLH1 that encode the mismatch repair enzymes.

What do you think happens if you are missing one of your two copies of these mismatch repair enzymes? Nothing much. The other copy's enough. But, what do you think would happen if by chance a single cell in your body were to lose the one remaining working copy of that enzyme, the gene-encoded remaining working copy?

Then it would have no copies. What do you think the response of the cell would be? High mutation rates, and cancer. It turns out that hereditary nonpolyposis colon cancer, a familial form of colon cancer, is caused by, in many cases, mutations in the gene or genes, actually, encoding the mismatch repair enzymes.

So, our theoretical understanding of the central dogma here has incredibly practical consequences for disease, because getting DNA replication right is important. And, that provides a very good proof that the difference between 10^5 or 10^6 here and 10^8 accuracy matters a great deal, that without that mismatch repair enzyme present in the cells, one is in fact going to create new mutations at an unacceptably high rate and lead to cancer. I don't know, a few other random nice facts about DNA polymerases.

They're very fast. The speed of a DNA polymerase is about 2,000 nucleotides per second: very impressive.

And then, one last point I can't help but mention: Arthur Kornberg discovers this enzyme, shows in a test tube it works, people work out how it works in detail, leading strands, lagging strands, topoisomerases, work out fidelity, all these kinds of things, great. But Kornberg's enzyme, the enzyme he purifies that copies DNA, is it actually the right enzyme? Is it the enzyme that the bacterial cells he used actually use to copy their DNA? Well, a biochemist would say, I cracked open the cell. I purified a component. It's able to carry out this function. There you go. But, what would the geneticist say? Sorry? Take out the component, and demonstrate now what? That the cell can't replicate its DNA.

Until you've shown that, you haven't got the other half of the proof. So, of course, some geneticists decided to put this to the test. They took many mutant bacteria.

One at a time, they grew them up, and they did Kornberg's purification to purify DNA polymerase.

This is unbelievably tedious stuff, guys. You've got to take each one.

You've got to purify it; get DNA polymerase. OK, it's there. Next one, next one, next one, next one.

But, suppose you found a mutant which couldn't make Kornberg's DNA polymerase but still grew and replicated its DNA.

That would prove that Kornberg's enzyme was not essential.

They did. It turns out that Kornberg's enzyme, DNA polymerase I, although it can replicate DNA in the test tube, is not the enzyme that cells actually use for their major DNA replication. It turns out to be a relatively more minor repair enzyme used to fill in gaps. The actual enzyme is DNA polymerase III, not that it matters to you a great deal, but this duality between the biochemistry and the genetics is very important, because just the biochemical side of the story, without showing that it was essential to the function in the organism, misses a very important point there. So, the combination of genetics and biochemistry: biochemistry pointed us to a class of enzymes. The genetics, then, identifies which ones are used for which purposes in vivo, which is not that easy to do in the test tube. Anyway, I mention that because, obviously, being a geneticist, I like tweaking the biochemists about things like that.

All right, onward. So, in our picture of DNA replication, in our picture of the central dogma, we've got DNA goes to DNA, and what about the step of transcription, DNA goes to RNA? Well, we've got to copy out our DNA into an intermediate molecule called RNA, which is going to then be used as a template for protein synthesis. Where do we start?

Somewhere in here, there's some information.

We want to make a copy of that information. How do we know where to start? Well, there's something.

There's some information that says start here, right?

There's a little sign that says, start here. Such a thing is called a promoter. And, the promoter, which we'll come and talk about more in a while, probably in a lecture or two, the promoter says here's the place to start copying the DNA into RNA, and it gets copied into the RNA by an enzyme that starts here, let's say, I don't know, T, A, T, G, G, T, A, T. On the other strand I guess it's going to be A, T, A, C, C, A, T, A. It's going to start copying here, and it's going to put in an A. Then opposite the A, it's going to put in a U, because RNA has U: A, C, C, A, U, A, etc., except this time it's building it not out of DNA but out of RNA.

How does RNA differ from DNA? So, first off, the sugar: instead of ribose, this is deoxyribose.

In fact, it's two prime deoxyribose. This is just plain old ribose.

Remember down there on the two prime carbon, DNA had just a hydrogen, whereas RNA has a hydroxyl.

All right, that's one difference, and it turns out that that hydroxyl is important because it would interfere in making long double helices of RNA. RNA doesn't make good, long double helices. That's entirely due to that oxygen.

And, the other major difference between DNA and RNA?

The only other difference between DNA and RNA is that this has U where this has T, and what's the difference between T and U?

A single methyl group. That's the only difference between T and U. In this six-membered ring over here, there is a methyl group.

And here in the six-membered ring, there's no methyl group.

That's it. Why does RNA use U, and DNA use T? Anybody know?

It's not a big difference. That would be interesting, although I don't think it's true. I actually have no idea. I think this is fascinating. I've never had a good accounting of why it uses U and T. You need to know this, and it's true, but whereas I have a good explanation for this, I don't have a good explanation for that, although maybe some of my Origin of Life colleagues have an explanation. But I've always been a little puzzled. Why does it use U instead of T? Anyway, I do know why it doesn't have the hydroxyl. Well, it has the hydroxyl there. That really does affect the base stacking, and all sorts of things like that.

All right, so you, don't go away, come back. So, the DNA is used as a template to copy here a strand of RNA. Some important names: the strand that is being copied, that is being transcribed, is called the transcribed strand. This is called the non-transcribed strand; that makes good sense. This is also called the coding strand. And, you will find it in your books as the coding strand.

Why is it called the non-coding strand?

This is called the coding strand. Why is the top strand called the coding strand? Because the RNA that I copy out will have the same sequence as the coding strand, except for T's and U's. So, the RNA copy that is made from the transcribed strand matches the sequence of the non-transcribed strand, or the coding strand.
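A short Python sketch makes the naming concrete, using the eight-base example from the board. The orientation is an assumption made for the example: the transcribed strand is typed 3' to 5' and the coding strand 5' to 3', so that facing bases line up. `transcribe` is a hypothetical helper.

```python
RNA_COMPLEMENT = {"A": "U", "T": "A", "G": "C", "C": "G"}

def transcribe(template_3to5):
    """Build the RNA 5'->3' by pairing against the transcribed strand read 3'->5'."""
    return "".join(RNA_COMPLEMENT[base] for base in template_3to5)

template_strand = "TATGGTAT"  # transcribed strand, written 3' -> 5' (assumed)
coding_strand = "ATACCATA"    # non-transcribed strand, written 5' -> 3' (assumed)

rna = transcribe(template_strand)
print(rna)  # AUACCAUA

# The RNA reads like the coding strand, with U wherever the DNA has T.
assert rna == coding_strand.replace("T", "U")
```

The strand that is physically copied is the transcribed one, yet the message that comes out reads like the other strand, which is why that other strand gets the name coding strand.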

So, you will find this confusing, but you will probably find it on tests and some things like that to know which strand you're looking at.

The coding strand is the strand which has the code that ends up in the RNA, but in fact it's the complement to the coding strand, the non-coding strand, the transcribed strand, that is copied. Anyway, I've said that now. So, how does it know where to stop?

Sorry? Stop codons. Stop codons are actually about translation into protein, right, because we're going to come to stop codons in a second. There was some start signal there called a promoter, which is a start of transcription.

It turns out there was also a stop signal that says stop of transcription. And, you guys haven't probably met that before. But, there's a start signal, a stop signal, and all over the genome there are these things.

So, here's some genome. Here's some gene that's got to be read out.

And, it's read out this way, let's say. This is the coding strand.

This is what, I'll make two strands here. Now, in the next gene over here, does it go in the same direction?

It might. Or, it might not. It turns out that the orientation of genes along the chromosome, which way you read, is not a fixed thing across the entire length of the chromosome.

So, when I refer to the transcribed strand or the non-transcribed strand, that's just a local definition that says, with respect to that gene, this strand is coding, and this strand is non-coding.

But with respect to the next gene over, it could be the other way.

Now, this is not a very orderly way to do things, right?

If a good engineer did this, they'd probably get all the pieces going in line and all that. But life did this, and it turns out that evolvable systems, you know, couldn't possibly maintain that order. Things are happening all the time, and genes can come in any order. In addition, how does RNA polymerase know when to turn on the gene? Oh, sorry, what's the enzyme that polymerizes RNA? RNA polymerase, yes.

How does it know when to turn on the gene? How does it turn on the right genes in the right tissues? We'll come to that. That's gene regulation. That's a big non-trivial thing.

We'll save that one. All right, so we have all of this transcription. Let's now look at the last important part of our picture here, which is translation.

So, RNA goes to protein. So, if RNA goes to protein, we take our messenger, our RNA over there. This is an RNA.

What's the direction it's been copied? Five prime to three prime. It's a single strand of RNA that we've copied here, a single-stranded molecule, and let's give it a sequence, A, U, A, C, G, A, U, G, A, A, G, C, C, C, etc. Eventually we'll get to U, A, G. How is this RNA interpreted?

Well, in an abstract sense, the way this RNA is interpreted is by a triplet code. The cell could come along and start reading three letter codons. But, does it just start anywhere?

No, it always starts at the same codon, and that codon is A, U, G. This is an initiator codon.

And it encodes a methionine. Then, the next codons down encode lysine, proline, etc. The interesting challenge is how in the world you get from a sequence of nucleotides to a sequence of amino acids. So, we have to now get this funny translation step between nucleotides and amino acids.
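As a rough illustration of reading a message in triplets, here is a small Python sketch. It makes two simplifying assumptions that are not from the lecture: reading begins at the first A-U-G found in the message, and the stretch the lecture skips over with "etc." is simply left out, so the stop codon follows directly. The codon table is only the fragment of the standard genetic code needed for this example, and the function name is hypothetical.

```python
from typing import List

# Fragment of the standard genetic code: only the codons used in this example.
CODON_TABLE = {
    "AUG": "Met",   # initiator codon, methionine
    "AAG": "Lys",
    "CCC": "Pro",
    "UAA": "Stop",
    "UAG": "Stop",
    "UGA": "Stop",
}


def translate(mrna: str) -> List[str]:
    """Read an mRNA (5'->3') in triplets, starting at the first AUG."""
    start = mrna.find("AUG")
    if start == -1:
        return []  # no initiator codon, nothing to read
    peptide = []
    # Step three letters at a time; stop before a trailing partial codon.
    for i in range(start, len(mrna) - 2, 3):
        codon = mrna[i:i + 3]
        residue = CODON_TABLE.get(codon, "???")  # '???' = not in this fragment
        if residue == "Stop":
            break
        peptide.append(residue)
    return peptide


# The lecture's example letters, with the elided middle omitted (assumption).
message = "AUACGAUGAAGCCCUAG"
print(translate(message))
```

Note that the first five letters are skipped entirely: the reading frame is set by where the A-U-G sits, not by where the message begins.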

This concerned people greatly because the earlier steps were pretty easy. In replication, each nucleotide would match a nucleotide on the DNA sequence. Then, in transcription, RNA polymerization, each nucleotide of RNA would match. But how are we going to get amino acids to match specific RNA sequences? Now, this bothered people a great deal.

And, you know what some of the ideas were? Well, a proteinase, right. Some enzyme. Well, actually the first ideas were very physical ideas.

It was that the RNA message there would fold up into some kind of a funny shape that would just happen to match a lysine, and then the next little bit would fold up to match, I don't know, a histidine, a methionine, a serine, and so on. Because people were thinking about the complementarity of DNA bases, they figured it would all work by physical matching, that the amino acids would be directly read off the RNA message.

But, it was kind of crazy to imagine that because the amino acids all have such wildly different physical properties: positive charges, negative charges, hydrophilic, hydrophobic, different sizes. It just didn't make sense, but it bothered people a great deal.

But, I would say that a lot of biochemists thought that that was sort of how it was going to have to work. The guy who really figured out what was going on did it with no experimental data whatsoever.

He did it by just sitting down and saying, that doesn't make any sense.

There's got to be another solution. And, that was Francis Crick.

Francis Crick just had an incredible mind.

He, Mendel, and a few other people had this incredible insight into things. He said, look, this just makes no sense that the physical properties are going to do it. He said, what's got to be going on is that when I want to put a certain amino acid into a growing protein chain, I'm going to take my amino acid here. I'm going to take my codon here, and I'm going to build me some kind of an adapter.

And, this adapter molecule will, in fact, solve the problem. Now, Francis Crick, in addition to being brilliant, really didn't do any experiments.

He didn't do any experiments both because he wasn't that fond of doing experiments, and because he was legendarily not very good at the bench. But, what Francis did was he exhorted all of his colleagues to go find the adapter. He had what he called the adapter hypothesis. And sure enough, Crick was dead on, just right.

The adapter hypothesis turned out to be right: there was an adapter molecule, itself made out of RNA, called transfer RNA. And, transfer RNA matched up by base pairing to each codon, you see, and had amino acids attached to it. And so the problem of how you mediate between a three-letter code of nucleotides, DNA or RNA, and amino acids was solved by a clever intermediate.
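The adapter idea can be sketched in a few lines of Python. This is a toy model under stated assumptions: pairing is strict Watson-Crick pairing between the codon and an anticodon on the transfer RNA, each adapter already carries its amino acid (how it got loaded is not modeled), and the class name, function names, and three-member pool are invented for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional

RNA_COMPLEMENT = str.maketrans("ACGU", "UGCA")


def anticodon_for(codon: str) -> str:
    """Anticodon (5'->3') that base-pairs perfectly with a codon (5'->3')."""
    # The paired strands run antiparallel, so complement and then reverse.
    return codon.translate(RNA_COMPLEMENT)[::-1]


@dataclass(frozen=True)
class TransferRNA:
    """Toy adapter: one end reads the codon, the other carries an amino acid."""
    anticodon: str   # written 5'->3'
    amino_acid: str


# Hypothetical pool of adapters, already loaded with their amino acids.
POOL = [
    TransferRNA("CAU", "Met"),
    TransferRNA("CUU", "Lys"),
    TransferRNA("GGG", "Pro"),
]


def decode(codon: str, pool: List[TransferRNA]) -> Optional[str]:
    """Return the amino acid carried by the adapter that pairs with codon."""
    wanted = anticodon_for(codon)
    for trna in pool:
        if trna.anticodon == wanted:
            return trna.amino_acid
    return None  # no adapter in this toy pool pairs with the codon


print([decode(c, POOL) for c in ("AUG", "AAG", "CCC")])
```

The design point matches Crick's argument: nothing in `decode` looks at the physical properties of the amino acid. The codon is matched only to another piece of RNA, and the amino acid comes along because it is attached to that adapter.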

It turned out that when they looked, they found the molecule. So, it's just one of these great examples of somebody having thought up an idea, sent people off to look for it, and it was there. And then, of course, you've got to ask, how did the amino acids get stuck onto the right transfer RNAs? And the answer is there's a bunch of specific enzymes that do precisely that job, that look at the transfer RNA, attach the amino acid, and handle that whole problem. Next time I will briefly finish with the ribosome, and how those transfer RNAs work together to build the protein chain, and then what I want to do is turn to how this common picture of DNA, RNA, and protein varies amongst organisms. Until next time.