Project #4: Genetic drift

Today's project is a biological simulation illustrating the phenomenon of genetic drift -- a change in the frequency of the forms of a gene within a population, caused by differences in the rates at which organisms having different genotypes survive to adulthood.

In our simulation, at least initially, an organism has only one characteristic that is determined by its genetic constitution. Any individual organism has two genes for this characteristic, one inherited from each of its parent organisms. Each gene is of one of two forms, or alleles, which we'll represent by the symbols 'dominant and 'recessive. These terms refer not to the frequency of the different alleles, but to their influence on the external and easily observable traits of the organism (its phenotype): An organism having one or two dominant alleles generally differs in appearance from an organism having none.

An organism's two genes constitute its genotype. There are, therefore, three possible genotypes: dominant/dominant, dominant/recessive, and recessive/recessive. (Since the gene inherited from one parent is no different in nature from the gene inherited from the other, recessive/dominant is the same genotype as dominant/recessive.)
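One way to capture this in Scheme is to represent a genotype as a two-element list of allele symbols and to classify it by counting its dominant alleles, so that the order of the two genes makes no difference. This is only a sketch: the representation and the names count-dominant and genotype-class are invented for illustration and are not taken from genetic-drift.ss.

```scheme
;; Hypothetical representation: a genotype is a list of two allele
;; symbols, each either 'dominant or 'recessive.

;; Count the dominant alleles in a genotype (0, 1, or 2).
(define count-dominant
  (lambda (genotype)
    (+ (if (eq? (car genotype) 'dominant) 1 0)
       (if (eq? (cadr genotype) 'dominant) 1 0))))

;; Classify a genotype without regard to the order of its two genes.
(define genotype-class
  (lambda (genotype)
    (let ((n (count-dominant genotype)))
      (cond ((= n 2) 'dominant/dominant)
            ((= n 1) 'dominant/recessive)
            (else 'recessive/recessive)))))
```

Under this sketch, (genotype-class '(dominant recessive)) and (genotype-class '(recessive dominant)) both yield the symbol dominant/recessive.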

We now imagine a population or community of such organisms in a stable environment over time. Each of our organisms, if it survives to adulthood, has the opportunity to reproduce, contributing one of its genes to each of its offspring. In reality, survival is a complicated interaction between an organism's genetic constitution and its environment. However, since in our simulation we have posited a stable environment, we can model survival by specifying just three probabilities, a survival rate for each of the three possible genotypes. For example, we might specify that organisms with the dominant/dominant genotype have a 48% chance of surviving to adulthood, while those of the dominant/recessive genotype have a 76% chance and those of the recessive/recessive genotype a 37% chance.
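The survival model just described might be sketched as follows. The procedure names and the choice to index the rates by the number of dominant alleles are assumptions made for this illustration and may not match genetic-drift.ss. The random draw is passed in as an argument so that the procedure can be tested with known values.

```scheme
;; Hypothetical table of survival rates, indexed by the number of
;; dominant alleles in the genotype (0, 1, or 2).  The figures are the
;; example rates from the text.
(define survival-rate
  (lambda (dominant-count)
    (cond ((= dominant-count 2) 0.48)
          ((= dominant-count 1) 0.76)
          (else 0.37))))

;; Decide whether an organism survives, given a draw in the range
;; from 0 (inclusive) to 1 (exclusive).  The organism survives when
;; the draw falls below the survival rate for its genotype.
(define survives?
  (lambda (dominant-count draw)
    (< draw (survival-rate dominant-count))))
```

With a draw of 0.5, for example, a dominant/recessive organism survives and the other two kinds do not. In a simulation the draw would come from a random-number generator, perhaps as (/ (random 1000) 1000.0) in an implementation whose random procedure accepts an integer bound.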

When these survival rates differ considerably, the overall frequencies of the alleles tend to shift from one generation to the next. When the survival rate of the dominant/recessive genotype lies between the other two survival rates, the outcome is not particularly interesting: the frequency of the dominant allele heads towards 100% (if the survival rate for dominant/dominant is the highest) or 0% (if the survival rate for recessive/recessive is the highest). When the survival rate of the dominant/recessive genotype is the highest of the three, however, the frequencies shift towards an equilibrium point somewhere between these extremes, optimizing the chances of survival for the average organism. When it is the lowest of the three, an equilibrium point also exists between the extremes, but it is unstable: a population that starts on either side of it tends to move away from it, towards 0% or 100%.
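The expected shift can be estimated without running a simulation. The sketch below assumes random mating and a population large enough that chance fluctuations can be ignored; neither assumption is stated in this handout, and the procedure names are invented for illustration. The second procedure uses the standard formula for the frequency at which both alleles have the same average survival rate.

```scheme
;; Expected frequency of the dominant allele in the next generation,
;; assuming random mating and a very large population (both are
;; assumptions of this sketch).  p is the current frequency; the other
;; arguments are the survival rates of the three genotypes.
(define next-frequency
  (lambda (p dd-rate dr-rate rr-rate)
    (let* ((q (- 1 p))
           (dd (* p p dd-rate))      ; surviving dominant/dominant share
           (dr (* 2 p q dr-rate))    ; surviving dominant/recessive share
           (rr (* q q rr-rate)))     ; surviving recessive/recessive share
      ;; Each dominant/recessive survivor carries one dominant gene
      ;; out of two, so it contributes half its share.
      (/ (+ dd (/ dr 2)) (+ dd dr rr)))))

;; Equilibrium frequency between 0 and 1, meaningful when the
;; dominant/recessive rate is the highest or the lowest of the three.
(define equilibrium-frequency
  (lambda (dd-rate dr-rate rr-rate)
    (/ (- dr-rate rr-rate)
       (- (* 2 dr-rate) dd-rate rr-rate))))
```

With the example rates, (equilibrium-frequency 0.48 0.76 0.37) is about 0.58, and (next-frequency 0.32 0.48 0.76 0.37) is about 0.39. A simulated population will wander around such figures rather than match them exactly, since breeding and culling involve chance.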

The program at /home/stone/courses/scheme/html/genetic-drift.ss, when completed, will trace changes in the frequency of the dominant allele in a population over many generations, simulating the breeding and culling of the offspring of each generation to produce the next. Here's a sample interaction that shows how the program is supposed to work:

Initial population size: 500
Frequency of the dominant allele: 0.32
Survival rate for organisms having the dominant/dominant genotype: 0.48
Survival rate for organisms having the dominant/recessive genotype: 0.76
Survival rate for organisms having the recessive/recessive genotype: 0.37
Number of generations: 20

Dominant alleles: 396/1000 (40%)  Genotypes: 32 dominant/dominant, 332 dominant/recessive, 136 recessive/recessive.
Dominant alleles: 446/1000 (45%)  Genotypes: 52 dominant/dominant, 342 dominant/recessive, 106 recessive/recessive.
Dominant alleles: 471/1000 (47%)  Genotypes: 71 dominant/dominant, 329 dominant/recessive, 100 recessive/recessive.
Dominant alleles: 487/1000 (49%)  Genotypes: 85 dominant/dominant, 317 dominant/recessive, 98 recessive/recessive.
Dominant alleles: 514/1000 (51%)  Genotypes: 95 dominant/dominant, 324 dominant/recessive, 81 recessive/recessive.
Dominant alleles: 522/1000 (52%)  Genotypes: 99 dominant/dominant, 324 dominant/recessive, 77 recessive/recessive.
Dominant alleles: 514/1000 (51%)  Genotypes: 96 dominant/dominant, 322 dominant/recessive, 82 recessive/recessive.
Dominant alleles: 543/1000 (54%)  Genotypes: 117 dominant/dominant, 309 dominant/recessive, 74 recessive/recessive.
Dominant alleles: 561/1000 (56%)  Genotypes: 128 dominant/dominant, 305 dominant/recessive, 67 recessive/recessive.
Dominant alleles: 568/1000 (57%)  Genotypes: 129 dominant/dominant, 310 dominant/recessive, 61 recessive/recessive.
Dominant alleles: 584/1000 (58%)  Genotypes: 139 dominant/dominant, 306 dominant/recessive, 55 recessive/recessive.
Dominant alleles: 593/1000 (59%)  Genotypes: 143 dominant/dominant, 307 dominant/recessive, 50 recessive/recessive.
Dominant alleles: 609/1000 (61%)  Genotypes: 153 dominant/dominant, 303 dominant/recessive, 44 recessive/recessive.
Dominant alleles: 579/1000 (58%)  Genotypes: 135 dominant/dominant, 309 dominant/recessive, 56 recessive/recessive.
Dominant alleles: 555/1000 (56%)  Genotypes: 117 dominant/dominant, 321 dominant/recessive, 62 recessive/recessive.
Dominant alleles: 571/1000 (57%)  Genotypes: 128 dominant/dominant, 315 dominant/recessive, 57 recessive/recessive.
Dominant alleles: 563/1000 (56%)  Genotypes: 118 dominant/dominant, 327 dominant/recessive, 55 recessive/recessive.
Dominant alleles: 574/1000 (57%)  Genotypes: 134 dominant/dominant, 306 dominant/recessive, 60 recessive/recessive.
Dominant alleles: 563/1000 (56%)  Genotypes: 123 dominant/dominant, 317 dominant/recessive, 60 recessive/recessive.
Dominant alleles: 578/1000 (58%)  Genotypes: 127 dominant/dominant, 324 dominant/recessive, 49 recessive/recessive.

Each line of output shows the frequency of the dominant allele and the number of individuals of each genotype in one of the successive generations of the population.
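The figures on each line are related by simple arithmetic: every organism carries two genes, a dominant/dominant organism contributes two dominant alleles, and a dominant/recessive organism contributes one. A minimal sketch of that arithmetic follows; the procedure names are hypothetical and are not taken from genetic-drift.ss.

```scheme
;; Number of dominant alleles in a population, given the counts of the
;; three genotypes: two per dominant/dominant organism and one per
;; dominant/recessive organism.  The recessive/recessive count is
;; accepted only so that all three procedures take the same arguments.
(define dominant-alleles
  (lambda (dd-count dr-count rr-count)
    (+ (* 2 dd-count) dr-count)))

;; Total number of genes in the population: two per organism.
(define total-genes
  (lambda (dd-count dr-count rr-count)
    (* 2 (+ dd-count dr-count rr-count))))

;; Percentage of dominant alleles, rounded to a whole number.
(define dominant-percentage
  (lambda (dd-count dr-count rr-count)
    (round (/ (* 100 (dominant-alleles dd-count dr-count rr-count))
              (total-genes dd-count dr-count rr-count)))))
```

For the first line of the sample output, (dominant-alleles 32 332 136) is 396, (total-genes 32 332 136) is 1000, and (dominant-percentage 32 332 136) is 40.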

Part 1

Unfortunately, as you'll discover, the program does not in fact work this way. The first step in the project is to find and correct the errors that currently impede its operation. There are several of these. (The good news is that you can find most of them by testing individual procedures separately.)

Begin, as usual, by copying the erroneous version of the program into your home directory:

bourbaki% cp /home/stone/courses/scheme/html/genetic-drift.ss genetic-drift.ss

If you like, you can then run it as a free-standing program:

bourbaki% mzscheme -r genetic-drift.ss

However, the results are not too enlightening, and you might do better to work your way systematically through the program, testing each new procedure that you read. The repeat and fold-list procedures are correct (as you can confirm by comparing them to the original listings in the first lab on folding).

Part 2

Once the program is working, consider what changes one would need to make in it to demonstrate genetic drift in a setting where there are three alleles and six genotypes. (Blood type in human beings is an instance of this arrangement: The alleles are called A, O, and B, and the A and B alleles are both dominant over O, while neither is dominant over the other. The genotypes are AA, AO, AB, BO, BB, and OO. People of genotype AA or AO have type A blood; those of genotype BO or BB have type B blood; those of genotype AB have type AB blood; those of genotype OO have type O blood.)
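As a starting point for thinking about the change, the genotypes for any list of alleles can be enumerated as the unordered pairs of alleles, including pairs of identical ones. The sketch below assumes a genotype is represented as a two-element list of allele symbols; that representation and the procedure names are illustrative choices, not the ones used in genetic-drift.ss.

```scheme
;; Pair a single allele with every allele in a list.
(define pair-with-each
  (lambda (allele others)
    (if (null? others)
        '()
        (cons (list allele (car others))
              (pair-with-each allele (cdr others))))))

;; List every unordered pair of alleles, including pairs of identical
;; ones.  Each allele is paired with itself and with the alleles that
;; follow it, so no pair appears twice in different orders.
(define all-genotypes
  (lambda (alleles)
    (if (null? alleles)
        '()
        (append (pair-with-each (car alleles) alleles)
                (all-genotypes (cdr alleles))))))
```

Applied to '(A B O), all-genotypes yields six genotypes: AA, AB, AO, BB, BO, and OO. Applied to '(dominant recessive), it yields the three genotypes of the original simulation.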


This document is available on the World Wide Web as

http://www.cs.grinnell.edu/~stone/courses/scheme/genetic-drift-project.xhtml

created March 27, 2000
last revised April 3, 2000

John David Stone (stone@cs.grinnell.edu)