The first genetic markers were morphological markers, now frequently called 'naked eye polymorphisms' (neps). Over 200 of these were identified, classified and mapped in barley (in a very rough sense) before the advent of molecular marker techniques. Several of these segregate in normal populations of barley, and remain useful. Generally useful neps include markers which appear to have no selective value, such as rough/smooth awns or long/short rachilla hairs. Many of the other neps result in a reduction in plant vigor. Our first QTL analysis (Good, bad and untested ideas in plant QTL analysis, 1991, Blake, Lybeck and Hayes) utilized a parent with multiple (11) recessive neps; we used these as chromosomal anchor markers. Unfortunately, their phenotypic effects were so dramatic that they skewed plant performance and were responsible for most of the detectable QTL effects in the population.
The informativeness of a single locus marker can be evaluated as a function of the number of alleles present in your population and their relative abundance:
Polymorphism Information Content (PIC):
$$\mathrm{PIC} = 1 - \sum_{i=1}^{n} f_i^2$$
where $f_i^2$ is the squared frequency of the ith allele and n is the number of alleles.
If you have two alleles, each with a frequency of 0.5, then your PIC value is 1 - 2 x 0.5^2 = 0.5.
If you have two alleles, one with a frequency of 0.1 and one with a frequency of 0.9, then your PIC value is 1 - 0.01 - 0.81 = 0.18.
If you have four alleles, each with a frequency of 0.25, the PIC value is 1 - 4 x 0.25^2 = 0.75.
The higher the PIC value, the more generally useful the marker.
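The PIC arithmetic above is easy to script. Here is a minimal Python sketch, assuming you already have allele frequencies (or raw allele counts) for one locus. The function names `pic` and `pic_from_counts` are ours for illustration, not part of any package.

```python
def pic(frequencies):
    """PIC as defined in these notes: 1 minus the sum of squared allele frequencies."""
    total = sum(frequencies)
    if abs(total - 1.0) > 1e-6:
        raise ValueError("allele frequencies should sum to 1")
    return 1.0 - sum(f * f for f in frequencies)


def pic_from_counts(counts):
    """Convert raw allele counts at one locus to frequencies, then compute PIC."""
    n = sum(counts)
    return pic([c / n for c in counts])


# The three worked examples from the text.
print(round(pic([0.5, 0.5]), 2))   # 0.5
print(round(pic([0.1, 0.9]), 2))   # 0.18
print(round(pic([0.25] * 4), 2))   # 0.75
```

Note that a marker fixed for one allele, `pic([1.0])`, gives 0, which is the situation described later for the D hordeins in a narrow germplasm base.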
Nei provided a straightforward method of estimating genetic distance
among lines. This distance estimate is:
Storage proteins (and seed proteins in general) are excellent markers. The major seed proteins in cereals are products of gene families which readily accumulate variation. The prolamin seed proteins of the grasses are the best varietal identification tools currently available: highly informative, easy to isolate and stable. Great markers. In barley three gene clusters encode the major prolamin gene families, the B, C and D hordeins on barley chromosome 5 (1H). A combination of factors (multiple copies, little selective value, internal repeat sequences) contributes to the accumulation of variation in these gene families. The B hordeins have a high PIC (near 0.9), the C hordeins are generally (depending on the population) around 0.6, and the D hordeins are at best 0.4. Within a narrow germplasm base, the D's are often 0.
Restriction Fragment Length Polymorphisms (RFLPs) are not remarkably
polymorphic in general terms, but there's an infinite number of them. We
select the clones which span or are adjacent to variable regions of the
genome, and use them to effect. An interesting derivative of RFLP analysis
is Restriction Landmark Genome Scanning (RLGS). Take good quality DNA and
fill in the sheared ends with Klenow fragment. Digest with an 8-base cutter,
and label the cut ends with Klenow. Then digest these with a 6-base cutter
and electrophorese the fragments in a long agarose tube gel. Pull the tube
out, degrade the fragments with a third nuclease, and resolve the fragments
on a slab gel. Conceptually, this isn't a bad idea. Practically it's a
recipe for disaster. One sample per final gel, three electrophoresis steps,
all radioactive. This means that you have to run 300 high quality gels
to assay 100 individuals. However, the density of available markers might
make the process worthwhile.
STSs (sequence-tagged sites): The general idea was: if you knew that an
RFLP identified a polymorphism, maybe PCR could do the same thing. The
problem with the idea: STSs only look within a primed sequence, and
primers often anneal at multiple locations. Sometimes these are useful,
sometimes not. This topic has recently become an important one: STSs are
the source of SNPs (single nucleotide polymorphisms). I recommend reading
our two publications on the conversion of STSs to fluor-tagged SNPs.
MS #1 was part of Deven See's Master's thesis, while MS #2 was a general
lab effort. Our laboratory will look at variation at a few loci.
Microsatellites: Are some sequences more prone to accumulate variation
than others? Forensic analysis is built around Jeffreys' observation that
small direct repeat sequences accumulate variation with remarkable speed.
Several scientists utilized an approach developed by Ostrander et al.,
1995, to identify sequences carrying short direct repeat sequences (e.g.
CACACACACACA). Jeffreys et al (1985) estimated a 2% mutation rate for minisatellite
sequences. Although I believe this estimate to be a poor one, it's better
than any available for microsatellites in crops. We should generate a good
mutation rate estimate for microsatellites in cultivated plants.
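If you already have sequence in hand, short direct repeats like CACACACACACA can be spotted with a simple pattern scan. The sketch below is only an in-silico illustration and is not the library-based approach of Ostrander et al. The function name `find_dinucleotide_repeats` and the default threshold of five repeat units are our assumptions.

```python
import re


def find_dinucleotide_repeats(seq, min_units=5):
    """Return (start, motif, units) for each dinucleotide run of at least min_units copies."""
    # One two-base motif, followed by at least (min_units - 1) further copies of it.
    pattern = re.compile(r"([ACGT]{2})\1{%d,}" % (min_units - 1))
    hits = []
    for match in pattern.finditer(seq.upper()):
        motif = match.group(1)
        if motif[0] == motif[1]:
            continue  # skip homopolymer runs such as AAAAAAAAAA
        hits.append((match.start(), motif, len(match.group(0)) // 2))
    return hits


# Toy sequence (made up for illustration) carrying a (CA)6 repeat.
print(find_dinucleotide_repeats("GGTTCACACACACACAGGATCC"))  # [(4, 'CA', 6)]
```

Start positions are 0-based. Extending the pattern to tri- or tetranucleotide motifs is a matter of changing the `{2}` and the division by 2.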
RAPDs: The guys from DuPont did this to us. Take either one or two random
12-base primers (4^12 available), and use these against genomic
DNA in a PCR reaction. Sometimes something gets amplified in one genotype
which isn't in another. PIC values? Meaningless. This technique did more
to slow the characterization of the genetics underlying useful variation
in crops than any idea since 'Rain follows the plow'. Miserable technique,
lousy reproducibility, leading nowhere. If you rely on this, you'd best
get a job driving a truck.
AFLPs: Amplified Fragment Length Polymorphisms are the thinking man's
version of RAPDs. Digest genomic DNA with a 6-cutter and a 4-cutter, and
ligate on linker sequences. Amplify the whole gamish with primers against
the linkers. To simplify the analysis, use a secondary amplification in
which the primers have arbitrary two or three base overhangs. Use a label
on the 6-cutter primer, and detect product size polymorphisms on a sequencing
gel. We'll be doing this in a few weeks in the lab. This couples the robustness
of RLGS with the inherent simplicity of RAPDs. As demonstrated by the folks
who cloned ml-o, this can be a remarkably useful technique. We developed
a pretty useful software package,
Genographer,
to deal with the data gathered through AFLP analysis.
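To see why the two or three base overhangs simplify the analysis, here is a toy Python sketch of the selective amplification step. It assumes every fragment has a 6-cutter end and a 4-cutter end, represents each fragment only by the insert sequence between the two site remnants, and assumes all four bases are equally likely. The function names and the example inserts are made up for illustration.

```python
def reverse_complement(seq):
    """Reverse complement of a DNA sequence (A, C, G, T only)."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(comp[base] for base in reversed(seq.upper()))


def selectively_amplified(inserts, ext_six, ext_four):
    """Keep inserts whose ends match the selective bases on both primers.

    inserts: insert sequences read 5'->3' starting at the 6-cutter end.
    ext_six / ext_four: selective bases on the 6-cutter and 4-cutter primers.
    """
    kept = []
    for insert in inserts:
        insert = insert.upper()
        # The 6-cutter primer reads into the insert from its 5' end; the
        # 4-cutter primer reads in from the other end on the opposite strand.
        if insert.startswith(ext_six) and reverse_complement(insert).startswith(ext_four):
            kept.append(insert)
    return kept


def expected_fraction(n_selective_bases):
    """Expected fraction of fragments amplified, assuming equal base frequencies."""
    return 0.25 ** n_selective_bases


toy_inserts = ["ACGGTTCAGT", "ACTTTTCAGT", "GGGGTTCAGT", "ACGGTTCCCC"]
print(selectively_amplified(toy_inserts, "ACG", "AC"))  # ['ACGGTTCAGT']
print(expected_fraction(5))                              # 0.0009765625, i.e. 1 in 1024
```

Under these assumptions each added selective base cuts the number of amplified fragments roughly fourfold, which is what brings the pattern down to something you can read on a sequencing gel.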
Single Nucleotide Polymorphisms
Until recently, nobody in his right mind considered sequencing allelic
variants. Larson et al (1996) did, in order to find STSs in locations which
were useful. This is now a growth area in our field. Our group
pioneered this labor-intensive effort in the small grains, producing several
theses and many manuscripts from the attempt to utilize the most minute
of mutations, the single base change, as markers. Initially we utilized
only those SNPs which were assayable using restriction endonucleases.
Recently, See et al (accepted with revision) developed a general approach toward
the use of these mutations. In human genetics, this has turned into an
industry.
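Whether a given SNP can be assayed with a restriction endonuclease comes down to whether the base change creates or destroys a recognition site. A minimal Python sketch of that check follows. The two allele sequences are invented, the function names are ours, and only one strand is scanned, which is sufficient for palindromic sites such as EcoRI (GAATTC) or BamHI (GGATCC).

```python
def site_positions(seq, site):
    """0-based start positions of a recognition site in a sequence."""
    seq, site = seq.upper(), site.upper()
    return [i for i in range(len(seq) - len(site) + 1) if seq[i:i + len(site)] == site]


def snp_alters_site(allele_a, allele_b, site):
    """True if the two alleles differ in where (or whether) the enzyme cuts."""
    return site_positions(allele_a, site) != site_positions(allele_b, site)


# Two hypothetical alleles differing by a single A/G change.
allele_a = "ACGTGAATTCTTGA"
allele_b = "ACGTGAGTTCTTGA"

print(snp_alters_site(allele_a, allele_b, "GAATTC"))  # True: EcoRI site lost in allele_b
print(snp_alters_site(allele_a, allele_b, "GGATCC"))  # False: no BamHI site in either allele
```

SNPs that fail this test for every available enzyme are the ones that need a more general assay.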
Crop Diversity, an overview
Wheat, barley, soybean and many others of our most important crops are
generally grown as inbred varieties. Rice and maize are
generally grown as simple hybrids, although Xiao demonstrated that at least
with rice this is due to plant breeders' ineffectiveness in bringing together
unlinked desirable gene combinations, not to classic heterosis. It's
easy to argue that our current germplasm pools are genetically narrow.
Where will our genetic improvements come from in the future?
The USDA, the Vavilov Institute (Russia), and several European organizations
have collected, catalogued and maintained the genetic diversity of the
world for decades. I strongly recommend going to the library and
reading any of Nikolai Vavilov's books on the biogeography of crops.
We will obtain the genes for the future from our collections of the world's
diversity.
They're really pretty amazing.