'The genome' is one of those phrases that gets thrown around a lot. All species have a genome, but today I'm going to talk specifically about the human genome. You may have read in the news that the first draft from the Human Genome Project was announced in 2000, and then read all about it again in 2003 when a further refined version was published. It's still not really 'complete'.
So what do we mean when we talk about 'the human genome'?
In its most basic form, it's all the DNA that is found in a human. Except we're all slightly different from each other (I have blue eyes, you may have brown eyes). Also, within you, your cells don't all have the same DNA - there will be slight mutations. Lots of these won't do anything, but they're also the changes that can lead to cancer. So when we talk about the genome, we really mean a sort of averaged version of the DNA found in a human - it's often referred to as a reference genome.
Except we usually also leave out the DNA found in mitochondria, because it doesn't quite count...
Okay, so when we talk about 'the human genome', we mean all the DNA in a human (averaged, with some left out), right?
Sort of. That's the basic level of information, but the DNA code on its own isn't much use to anyone. So we also include the annotation (other information that we can attach to it), such as:
Where are the genes?
What do those genes do?
What about DNA that tells the genes when to be on or off (promoters, transcription factor binding sites)?
What about other weird bits of DNA like transposons (I'll talk about these soon as it's what I'm really interested in)?
A fairly recent project (which you may have also heard about in the news) called the 1000 Genomes Project aims to reduce some of the averaging that goes on with the human genome. This project will allow us to look at differences between the genomes of different people and differences across different populations. But it still won't be a complete map of all the differences in the human genome.
The $1000 genome project aims to provide people with their own genome for $1000 (does what it says on the tin!). It's not in a great state at the moment, but as genome sequencing gets cheaper and easier, we get further along the journey towards really understanding the human genome.
A project to explain topics in genetics to people who don't have any formal training in it.
Monday, January 27, 2014
Monday, January 20, 2014
Intro to Molecular Genetics
When we talk about genetics, what are we actually talking about? Sometimes it seems a bit abstract: traits like eye colour get passed from parents to offspring, and depending on which type is dominant or recessive, they show up (or don't show up) in the offspring. Is this bringing back memories of drawing crosses in GCSE biology? Is it all a bit hazy? Never mind if it is, we'll get to that. Today, we're not going to worry about what genetics does, we're just going to think about what it is, on a molecular level.
The molecule we're most interested in is DNA. This stands for deoxyribonucleic acid. The only reason you'll ever need to remember its full name is for a pub quiz, so I wouldn't worry too much about it. It looks like this:
Well, if we're being honest, it doesn't really. But this is a nice representation for now.
The overall shape is called a double helix. The red and orange bits running up the side are referred to as the backbone, and the blue and white bits through the middle (the bits that look like rungs on a ladder) are called nucleotides. It's the nucleotides that we're really interested in. At some later date I might talk about the discovery of this structure; it's a good story, but it's a story for another time.
Nucleotides are the code that tells our body how to function. They're the bit that makes us human rather than chimpanzees or bananas. Nucleotides come in 4 types: Adenine (A), Cytosine (C), Guanine (G) and Thymine (T). If you look closely at the diagram above, you can see that there are actually two nucleotides, one from each backbone, for each rung on the DNA ladder. These nucleotides always pair in exactly the same way: Adenine pairs with Thymine, and Cytosine pairs with Guanine. This means that if we know one side of the DNA molecule we can work out the other, so we only need to worry about one side.
As we only need to worry about one side we can represent the nucleotides in the DNA something like this:
ATGCTGTCACCAAACTTGGAAAAAAAGTCACACGTATAA
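Because of the pairing rule (A with T, C with G), the other side of the ladder can always be worked out from the side we've written down. Here is a minimal Python sketch of that idea. The names `PAIRS` and `complement` are just made up for this example, and it ignores the detail that the two sides are conventionally read in opposite directions:

```python
# Pairing rule from the text: A pairs with T, C pairs with G.
PAIRS = {"A": "T", "T": "A", "C": "G", "G": "C"}


def complement(strand):
    """Return the nucleotides that sit opposite each letter of `strand`."""
    return "".join(PAIRS[base] for base in strand)


one_side = "ATGCTGTCACCAAACTTGGAAAAAAAGTCACACGTATAA"
other_side = complement(one_side)
print(one_side)
print(other_side)
```

Running `complement` on the result gives you back the side you started with, which is why one side is enough.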
Okay, so now we know (a bit) about DNA. But what does it do? How does it make us human?
This is the central dogma of molecular biology (there are exceptions, but we won't worry about those):
DNA -> RNA (an intermediate step that we'll worry about another time) -> Protein
Proteins are the things that actually do things in the body. One protein will make your eyes blue, a slight change in that protein will make your eyes green. So how do we get from DNA to protein?
Each set of three letters of DNA (called a codon) codes for a specific amino acid - amino acids are the building blocks that make up a protein. You can see here the different codes that give you different amino acids.
So the string of nucleotides that I wrote further up would give us:
Met-Leu-Ser-Pro-Asn-Leu-Glu-Lys-Lys-Ser-His-Val-Stop
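If you'd like to try this yourself, reading a DNA string three letters at a time and looking each codon up in the standard genetic code can be sketched in a few lines of Python. This is only an illustration: the names `CODON_TABLE`, `THREE_LETTER` and `translate` are my own, and the sketch skips the RNA step entirely and simply stops at the first stop codon.

```python
from itertools import product

# The standard genetic code, written compactly: codons are listed in
# T, C, A, G order (TTT, TTC, TTA, TTG, TCT, ...) and each position in
# AMINO_ACIDS is the 1-letter code for that codon ('*' means stop).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# 1-letter code -> 3-letter code.
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
    "*": "Stop",
}


def translate(dna):
    """Read `dna` one codon at a time, stopping at the first stop codon."""
    names = []
    usable_length = len(dna) - len(dna) % 3  # ignore a trailing partial codon
    for i in range(0, usable_length, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        names.append(THREE_LETTER[amino_acid])
        if amino_acid == "*":
            break
    return "-".join(names)


sequence = "ATGCTGTCACCAAACTTGGAAAAAAAGTCACACGTATAA"
print(translate(sequence))
```

Try changing a single letter in `sequence` and running it again. Sometimes the amino acid changes and sometimes it doesn't, because several codons can code for the same amino acid.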
We commonly use a 3-letter code to name each of the amino acids, but they also have full names (and 1-letter codes that you can use instead). You can find out more about that here. Different amino acids have different properties, so if you change one in a protein it can alter what the protein does. You can find out more here (it's fairly complicated, I might write an easier one at some point!).
The part of the DNA that codes for a protein always starts with the codon for Met (ATG) and always ends on a stop codon. That's how the body knows where the DNA stops being part of this protein and starts being part of something else.
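The start and stop rule can also be sketched in Python: look for the first ATG, then read three letters at a time until you hit a stop codon. The three stop codons in the standard genetic code are TAA, TAG and TGA. The function name `find_coding_region` and the padded example sequence are made up for illustration, and real cells are a lot fussier than this about where a protein starts.

```python
# The three stop codons in the standard genetic code.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_coding_region(dna):
    """Return the stretch from the first ATG up to and including the
    first stop codon read in the same three-letter steps, or None if
    there is no ATG or no stop codon after it."""
    start = dna.find("ATG")
    if start == -1:
        return None
    for i in range(start, len(dna) - 2, 3):
        if dna[i:i + 3] in STOP_CODONS:
            return dna[start:i + 3]
    return None


# A made-up example: a tiny coding region with extra letters on either side.
padded = "CCG" + "ATGAAATAA" + "GGC"
print(find_coding_region(padded))
```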
So now you know a bit about DNA and molecular genetics.
Genetics for Non-Geneticists
As a rather late New Year's Resolution, and as an effort to improve my popular science writing, I've decided to start writing about genetics for non-geneticists. And the resolution bit of it is that I'm going to try to put up a new blog post every Monday.
I know from talking to friends and family that there's a large body of people out there who think genetics is really interesting, but have no (or very little) formal training in it. This blog is for those people. Sometimes I'll write primers for a specific genetics topic and sometimes I'll write about genetics in the news or pop culture. But the aim throughout is to make it accessible to non-geneticists. So if you read a post and it goes over your head, call me out on it and I'll try to do better.
Wish me luck.