Adding to our Genetic Alphabet
Sometimes we take for granted how great it is to have a 26-letter alphabet. We even take for granted the fact that words can be as long as we need them to be. There's no character limit on language... except on Twitter, of course.
But imagine for a second if our alphabet contained only 4 letters.
It would be a bit harder to communicate, right?
Now imagine if all your words had to be exactly 3 letters long. There aren't many words you could say anymore: 4 × 4 × 4 = 64, to be exact.
But that's exactly how the language of our genes works, the language that controls all life on this planet. That's pretty impressive, right? And it manages without even using all 64 words. It only really needs 21 (20 amino acids plus a "stop" signal), so most of these 3-letter words are just synonyms.
For example, the words GCU, GCC, GCA, and GCG are all ways to say alanine.
A codon table: the key to translating our 3-letter genetic words (codons). Image credit: Scott Henry Maxwell (Wikimedia Commons)
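If you'd like to check the counting for yourself, here is a minimal Python sketch. It simply lists every 3-letter word you can build from the four RNA letters and picks out the alanine synonyms mentioned above. The variable names are my own, chosen just for illustration.

```python
from itertools import product

RNA_LETTERS = "ACGU"  # the four letters used in RNA codons

# Every possible 3-letter word (codon) from a 4-letter alphabet.
codons = ["".join(triple) for triple in product(RNA_LETTERS, repeat=3)]

# The alanine synonyms named in the text all begin with "GC".
alanine_codons = sorted(c for c in codons if c.startswith("GC"))

print(len(codons))       # 4 * 4 * 4 possible words
print(alanine_codons)    # the four ways to say alanine
```

Because the first two letters are fixed at "GC" and the third can be any of the four, alanine ends up with exactly four spellings.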
Our current alphabet
So before I can explain the new alphabet, it would probably be useful to understand our current genetic alphabet. See, our DNA is composed of special units known as nucleotides. These units come in four varieties: Adenine (A), which pairs with Thymine (T), and Cytosine (C), which pairs with Guanine (G). If you need help remembering that, I always remember it as apples (A) go on trees (T) and cars (C) go into garages (G).
The structure of our DNA and RNA, as well as the different bases they are composed of. Image credit: Sponk (Wikimedia Commons)
This is great and all, but our DNA is stuck in this thing called the nucleus, so we send out little messengers known as RNA. This is where the pairing becomes important. A special molecule called RNA polymerase (now that I've said it, you're free to forget it straight away) runs along parts of DNA called genes. Everywhere it sees a C it will place a G onto a growing strand of RNA, where it sees a G it will place down a C, and where it sees a T it will place down an A.
Now guess what it places down when it sees an A?
Trick question! RNA has a weird new nucleotide called Uracil (U), which takes the place of T, so wherever the polymerase sees an A it places down a U. It's kind of like when you transcribe a recipe you've found but decide to alter one ingredient. DNA is the original recipe, and RNA is the version you've created. You're still cooking the same product, but you're just making a slight adjustment.
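The copying rules above are simple enough to write as a lookup. Here's a rough Python sketch, assuming we only care about which letter gets placed opposite which. The function name `transcribe` and the example sequence are made up for illustration, and the sketch ignores details such as the direction in which the strands are read.

```python
# Pairing rules from the text: C -> G, G -> C, T -> A, and A -> U (not T).
DNA_TO_RNA = str.maketrans("ACGT", "UGCA")

def transcribe(dna_template: str) -> str:
    """Return the RNA letters placed opposite each letter of the DNA template."""
    return dna_template.upper().translate(DNA_TO_RNA)

rna = transcribe("TACCGA")
print(rna)  # AUGGCU
```

Notice that the result contains no T at all: every A in the DNA recipe has been answered with a U.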
This new recipe is taken out of the nucleus and used to bake our product (spoiler alert: it's a protein). Three-letter sections of the RNA recipe, called codons, are translated into amino acids (the building blocks of proteins).
Creating proteins is just like baking from a recipe you've copied. Image credit: Peng (Wikimedia Commons)
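Reading the recipe three letters at a time can be sketched in a few lines of Python as well. This is only an illustration: `CODON_TABLE` below is a small excerpt of the standard codon table (the real one has all 64 entries), and the function name and example RNA string are my own inventions.

```python
# A small excerpt of the standard codon table; the full table has 64 entries.
CODON_TABLE = {
    "AUG": "Met",  # also the usual "start" signal
    "GCU": "Ala", "GCC": "Ala", "GCA": "Ala", "GCG": "Ala",
    "UUU": "Phe", "UUC": "Phe",
    "GGU": "Gly", "GGC": "Gly", "GGA": "Gly", "GGG": "Gly",
    "UAA": "Stop", "UAG": "Stop", "UGA": "Stop",
}

def translate(rna: str) -> list:
    """Read the RNA three letters at a time until a stop codon appears."""
    amino_acids = []
    for i in range(0, len(rna) - 2, 3):
        word = CODON_TABLE[rna[i:i + 3]]  # KeyError for codons not in the excerpt
        if word == "Stop":
            break
        amino_acids.append(word)
    return amino_acids

protein = translate("AUGGCUGCCUUUGGAUAA")
print(protein)
```

The two different alanine codons in the example (GCU and GCC) come out as the same word, which is the synonym idea in action.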
If nucleotides are the letters of our genetic language, then amino acids are the words. We use these words to construct sentences. So, whilst this language only has 4 letters and 20 words, these words are used to make some pretty damn long sentences. For humans, these sentences are usually around 375 words long.
So whilst it would be hard for us to communicate with only 4 letters and 20 words, it seems that there's a lot you can do with 375 words in each of your sentences, because this language makes up essentially every aspect of everything alive on our planet.
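To get a feel for just how much you can say, here's a quick back-of-the-envelope calculation in Python. It assumes every one of the 375 positions in a sentence can freely be any of the 20 words, which is a simplification, since real proteins are far pickier than that.

```python
WORDS = 20             # amino acids available for each position
SENTENCE_LENGTH = 375  # typical sentence length quoted above

# Number of different 375-word sentences if every position is a free choice.
possible_sentences = WORDS ** SENTENCE_LENGTH

digits = len(str(possible_sentences))
print(digits)  # how many digits that number has
```

The answer is a number 488 digits long, which is vastly more sentences than life could ever get around to using.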
The new language
So now that you understand our current genetic language, why do we care about having 2 new letters? Firstly, I should tell you what they're called. The 2 new nucleotides are called d5SICSTP (X) and dNaMTP (Y). Have fun trying to say that five times fast!
The next question you probably have is "how on earth will this discovery affect me?"
Well, hopefully quite a lot, to be honest. The new nucleotides mean that scientists can create a multitude of new biomolecules that would never occur naturally. This has vast implications for drug and biofuel research, as well as almost anything else we could care to research.
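To see why two extra letters open up so much room, we can redo the earlier counting with six letters instead of four. This sketch assumes that words stay 3 letters long and that X and Y can appear anywhere in a word. Those are my assumptions for illustration, not something the research itself claims.

```python
from itertools import product

NATURAL = "ACGU"
EXPANDED = NATURAL + "XY"  # X and Y are this article's labels for the new letters

def count_codons(letters: str, length: int = 3) -> int:
    """Count every word of the given length that the letters can spell."""
    return sum(1 for _ in product(letters, repeat=length))

natural = count_codons(NATURAL)
expanded = count_codons(EXPANDED)
print(natural, expanded, expanded - natural)
```

Under those assumptions the vocabulary grows from 64 possible words to 216, leaving 152 brand new words that nature has never used.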
With the rise of super-bugs and climate change, this new research could help answer these problems with new antibiotics and more environmentally friendly fuels. It could even give us new building materials and cancer-fighting drugs. Who knows, to be honest? The possibilities are endless.
So far these new nucleotides have only been put into E. coli, but scientists should be able to use them for a lot more. Image credit: NIAID (Flickr)
You can read more about this amazing discovery on the Smithsonian or even in the journal article itself.