Thursday, April 16, 2015

Blinded by Science blog #4: How much of the human genome is the same from person to person and how much makes up what is unique in us?


Hello and welcome to another wonderful edition of Blinded by Science™.  Hmmm.  I wonder.  Does just adding the “™” symbol actually DO anything?  Probably not.  I feel like there should be paperwork involved.  I mean, there’s always paperwork involved.  Ah well, I could look into it, but it’s probably nothing I have to worry about right now.  Anyway, where was I?  Oh yes . . . welcome to another wonderful edition of Blinded by Science.  Today’s contestant comes to us all the way from sunny and warm . . . no cloudy . . . no snowy . . . no sunny again but colder this time . . . no rainy . . . ARGH never mind!  Today’s contestant comes from Ohio, a place where all four seasons can occur in a single day, unless you are talking about roadwork season, because that lasts forever (and you’ll still get a flat tire from the potholes).  Chris G asks, “How much of the human genome is the same from person to person and how much makes up what is unique in us?”

Well, since this is the first real biology-related question that I will try to answer, I am going to go all out with the answer.

As I hope you are already aware, DNA is made up of four different bases: adenine, guanine, cytosine, and thymine (commonly referred to as A, G, C and T).  The order of these bases makes up the genetic code.  It doesn’t seem like there would be enough information held in only those four “letters” to code for any sort of life, let alone life that is as varied as exists on Earth, but there is because of two important facts.  One is that genomes are big.  In the case of humans, our genome contains approximately 3 billion base pairs, so the number of different combinations is 4^3,000,000,000 (that is, 4 raised to the power of 3 billion, a REALLY BIG number), though the number of biologically feasible combinations is smaller.  What I mean by this has to do with the second important fact, the way in which DNA codes for proteins.
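For the curious, here is a rough sketch of just how big that number of combinations is.  Writing the number out in full is impractical, so this little Python snippet only counts how many decimal digits it has, using logarithms.  The function name `digits_in_power` and the constant names are made up for illustration, and the 3 billion figure is the approximate genome size mentioned above.

```python
import math

GENOME_LENGTH = 3_000_000_000  # approximate number of base pairs, from the text
BASES = 4                      # A, G, C and T


def digits_in_power(base, exponent):
    """Count the decimal digits of base**exponent without computing the number itself."""
    # A positive integer n has floor(log10(n)) + 1 decimal digits,
    # and log10(base**exponent) equals exponent * log10(base).
    return math.floor(exponent * math.log10(base)) + 1


print(digits_in_power(BASES, GENOME_LENGTH))
```

Run as written, it should report a digit count of roughly 1.8 billion, so the number itself is far too long to ever print.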

For life to exist, the information stored in DNA must be expressed in a form that does work (the pattern on a key doesn’t do anything by itself, but put it into its corresponding lock and now you can open a door).  In this case, genes are converted into proteins (via an RNA middleman since DNA does not leave the nucleus)* which then do basically everything necessary for life (need a specific molecule broken down, there’s a protein for that; need some ions transported across a membrane, there’s a protein for that; need to rebuild your muscle fibers after that really intense workout, proteins do that too).  This translation from RNA to protein occurs at a site called the ribosome.  The ribosome “reads” the RNA strand (which does not contain the thymine base but instead has uracil) until it comes to a specific sequence of three RNA bases (AUG) that it recognizes as the place to start synthesizing the protein.  Called the “start codon”, AUG also codes for the amino acid methionine, which means all newly made proteins start with methionine.  The ribosome then moves on to the next three bases (the next codon) and, depending on the arrangement, adds the corresponding amino acid (AUGCCCCAC becomes methionine-proline-histidine).  Each amino acid has its own specific physical properties which affect the overall activity and function of the protein.  This is how proteins are made and why I said there are fewer biologically feasible combinations of bases (having a genome that only consists of adenine means you would never have ANY start codon, and even if you did, all the proteins would have to be made up of the same amino acid).  When the ribosome reaches one of three specific codons known as the “stop codons” (UAA, UAG, or UGA), it terminates the synthesis and the newly formed protein is released to go do its job.
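The reading process described above can be sketched in a few lines of Python.  This is a toy model, not how anyone would do real bioinformatics: the codon table below is deliberately partial and only holds codons that come up in this post (plus AAA for lysine), the function name `translate` is made up for illustration, and any codon missing from the table is shown as "???".

```python
# Partial codon table: only a handful of the 61 amino-acid codons (an assumption for brevity).
CODON_TABLE = {
    "AUG": "Met",  # methionine, also the start codon
    "CCC": "Pro",  # proline
    "CAC": "His",  # histidine
    "AGA": "Arg",  # arginine
    "AGG": "Arg",  # arginine
    "ACA": "Thr",  # threonine
    "AAA": "Lys",  # lysine
}
STOP_CODONS = {"UAA", "UAG", "UGA"}


def translate(rna):
    """Return the amino acids read from the first AUG up to the first stop codon."""
    start = rna.find("AUG")
    if start == -1:
        return []  # no start codon, so nothing is made
    protein = []
    # Step through the strand three bases at a time, starting at the start codon.
    for i in range(start, len(rna) - 2, 3):
        codon = rna[i:i + 3]
        if codon in STOP_CODONS:
            break
        protein.append(CODON_TABLE.get(codon, "???"))
    return protein


print("-".join(translate("GGAUGCCCCACUAAGG")))  # Met-Pro-His
print(translate("AAAAAAAAA"))                   # [] (all adenine, no start codon)
```

The second call shows the all-adenine case from the text: with no AUG anywhere, the sketch never starts building a protein at all.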

There are 64 different codons (well, 61 that code for amino acids, since the stop codons don’t), which correspond to 20 different amino acids, and depending on where the stop codon occurs a protein could be only a few amino acids in length or hundreds of amino acids long.  This is how only 4 bases are able to code for such an enormous variety of proteins.
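If you want to check the codon counts yourself, here is a minimal sketch that lists every possible three-letter combination of the four RNA bases and then sets the three stop codons aside.  The variable names are just ones I picked for illustration.

```python
from itertools import product

RNA_BASES = "ACGU"
STOP_CODONS = {"UAA", "UAG", "UGA"}

# Every ordered triple of bases: 4 * 4 * 4 possibilities.
all_codons = ["".join(triple) for triple in product(RNA_BASES, repeat=3)]

# Codons that actually stand for an amino acid.
coding_codons = [codon for codon in all_codons if codon not in STOP_CODONS]

print(len(all_codons))     # 64
print(len(coding_codons))  # 61
```

With 61 codons covering only 20 amino acids, several codons have to share the same amino acid.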

Now what was the point of all that?  Honestly, I don’t remember.  Whoops.


In all seriousness, there was a point to that little biology lesson.  We’ve all heard of mutations, where something causes a change in the genetic code.  Sometimes those changes are just a swap of a single base (AGA becomes ACA).  This can cause problems when the change causes a different amino acid to be inserted (AGA codes for arginine while ACA codes for threonine), which might end up changing the properties of the protein.  Other times, the change could be “silent” (both AGA and AGG code for arginine).  Basically, as long as mutations are not selected against, they can be present in the genetic code of some members of a species.
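To make the difference between the two kinds of change concrete, here is a small sketch that compares a codon before and after a mutation.  It assumes a tiny lookup table holding only the three codons used in the examples above, and the function name `classify_substitution` is hypothetical; a codon outside the table will simply raise a `KeyError`.

```python
# Only the codons from the examples in the text (an assumption to keep this short).
CODON_TO_AMINO_ACID = {
    "AGA": "arginine",
    "AGG": "arginine",
    "ACA": "threonine",
}


def classify_substitution(before, after):
    """Label a codon change as 'silent' or 'changes amino acid'."""
    old_amino_acid = CODON_TO_AMINO_ACID[before]
    new_amino_acid = CODON_TO_AMINO_ACID[after]
    if old_amino_acid == new_amino_acid:
        return "silent"
    return "changes amino acid"


print(classify_substitution("AGA", "ACA"))  # changes amino acid
print(classify_substitution("AGA", "AGG"))  # silent
```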

Mutations can also occur in regions of the genome that do not code for proteins (there are many other parts to the genome than just the protein-coding genes, but that would likely be better served as a topic for another day).  Again, as long as these mutations are not selected against, they can persist in some members of a species.

And this brings me back to the question of today’s blog (finally).  Mutations are just one way that members of a species can differ genetically (other ways such as epigenetic changes or copy number variations also exist).  So taking this all into account (well, as much as we can with our current technology, as certain parts of the genome are still hard to read), it is estimated that humans are 99.5% similar to other humans (as a point of comparison, humans are estimated to be between 94% and 96% similar to chimpanzees based on genetic analysis)**.  So we are really similar to one another, but not identical (not even identical twins are 100% similar genetically), which I think is a good thing.  Wouldn’t life be so much more boring if we were all the same?
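To put those percentages in perspective, here is a back-of-the-envelope sketch that turns a similarity figure into a rough count of differing positions in a genome of about 3 billion base pairs.  It assumes, as a big simplification, that every difference is a single-base difference (which ignores things like copy number variations), and the function name `differing_positions` is made up for illustration.

```python
GENOME_LENGTH = 3_000_000_000  # approximate number of base pairs in the human genome


def differing_positions(length, similar_per_thousand):
    """Rough count of positions that differ, given similarity in parts per thousand."""
    # Parts per thousand keep the arithmetic in whole numbers (99.5% is 995).
    return length * (1000 - similar_per_thousand) // 1000


print(differing_positions(GENOME_LENGTH, 995))  # human vs. human at 99.5%: 15000000
print(differing_positions(GENOME_LENGTH, 960))  # human vs. chimpanzee at 96%: 120000000
print(differing_positions(GENOME_LENGTH, 940))  # human vs. chimpanzee at 94%: 180000000
```

So even at 99.5% similar, two people could differ at something like 15 million positions under this rough model.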

*This is an extremely simplified version of events.  I mean, what would life be like if there weren’t lots of exceptions to the rule?  There are some cases of some RNAs having biological activity on their own without being translated into protein first.  We don’t need to get into that today though.

**I really wanted to add a mention of the genetic similarity of viruses within the same species because it is so much less than human similarity.  I think I recall having been taught either 60% or 70% similar (though it could still be much lower), but I could not find a citation that mentioned it.  Suffice it to say, based on the amount of genetic similarity between some viruses of the same species, humans and chimpanzees (as well as other types of apes) would be members of the same species.  Think about that.
