A genome is all the DNA in an organism. DNA is built from nucleotides, each carrying one of four bases: adenine (A), thymine (T), cytosine (C), or guanine (G). The exact order of nucleotides in a genome is called its DNA sequence. Efforts have been made to develop genome sequence resources for tall fescue.
Expressed sequence tags (ESTs) are short stretches of DNA sequence (usually <500 nucleotides) derived from a gene expressed in the cell or tissue being studied. Transcribed regions give rise to messenger RNA (mRNA), which can be used to make complementary DNA (cDNA). At any given time, a cell or tissue expresses a set of genes as mRNAs at very different levels. Both the gene identity and the expression levels provide valuable clues to the biology of a cell or tissue. Thus, ESTs are key molecular tools for genomic research. In addition, ESTs provide a valuable source of potential molecular markers; these allow for the construction of comparative maps of expressed genes in related species (Cato et al., 2001; Kantety et al., 2002). An EST is only a tag of the full cDNA (which may be 2000 or more nucleotides long), but it is often sufficient to identify a specific mRNA and its corresponding gene.
Several EST databases have grown exponentially because of their potential use in plant and animal genetic improvement programs (Messing and Llaca, 1998). Each of the four major plant species studied [Arabidopsis thaliana (L.) Heynh., rice, maize, and wheat] has more than one million ESTs in dbEST, the EST database of the National Center for Biotechnology Information (NCBI). A total of 44,512 Festuca ESTs have been published in dbEST (http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?CMD=search&DB=nucleotide; verified 10 Jan. 2010). Most of the sequences were derived from tall fescue (now considered Lolium) and a few from F. mairei St.-Yves and F. rubra L. Most of these ESTs were contributed by the research programs of the Noble Foundation, where high-quality ESTs were generated from nine cDNA libraries of tall fescue representing tissues from different plant organs, developmental stages, and abiotic stress conditions (Table 21-1). About 5000 ESTs were developed from each of eight libraries. A ninth cDNA library, constructed from field-stressed plants, was sequenced and yielded 1211 ESTs. Heat-responsive gene transcripts were cloned by using differential expression between a heat-tolerant and a heat-sensitive fescue genotype. Of the 2495 sequences generated, 656 were singlets, and the remaining 1839 ESTs were grouped into 434 clusters (Zhang et al., 2005). All these EST sequences are available in dbEST.
A total of 5320 genomic sequences developed from a (GA/CT)n-enriched tall fescue genomic library are available at the NCBI. Genomic DNA was isolated from a pool of 31 ‘Kentucky 31’ (KY-31) plants and used for the library construction. A biotinylated (GA)n oligonucleotide was used to capture and enrich for DNA fragments containing (GA/CT)n repeats. The library was constructed following the protocol of Hamilton et al. (1999), with minor modifications.
Publicly available ESTs have become a cost-effective, time-efficient, and unique source of simple sequence repeat (SSR) markers, also called microsatellites (Scott et al., 2000; Cordeiro et al., 2001; Eujayl et al., 2002; Thiel et al., 2003; Eujayl et al., 2004; Saha et al., 2004). Vast EST databases are available, and a large number of SSR markers have been developed for species such as rice, wheat, barley (Hordeum vulgare L.), and maize. Because EST-SSRs are derived from transcribed regions of DNA, they are more conserved and show less marker polymorphism than genomic SSRs (Cho et al., 2000; Thiel et al., 2003). Moreover, EST-SSRs are associated with expressed genes and usually are concentrated in the gene-rich regions of the genome. Thus, many microsatellite markers are needed to get good genome-wide coverage.
In five cereal species, 1.5 to 4.7% of the ESTs were found to contain SSRs suitable for marker development. The percentage of tall fescue ESTs containing SSRs (1.3%) is lower than those for barley (3.4%), wheat (3.2%), rice (4.7%), and sorghum [Sorghum bicolor (L.) Moench] (3.6%) (Kantety et al., 2002; Saha et al., 2004). It is similar to the 1.5% found in maize (Kantety et al., 2002). Analysis of 20,000 tall fescue ESTs resulted in the development of 157 primer pairs (PPs), approximately 0.8% of the total sequences (Saha et al., 2004). Subsequently, 43,000 tall fescue ESTs developed at the Noble Foundation were searched for SSRs, and a total of 780 PPs were developed. The first group of 157 PPs was released to the public domain (Saha et al., 2004). An additional 348 primers have been developed and are being assessed for their transferability across 16 different grass species.
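The kind of search described above can be illustrated with a minimal Python sketch that scans sequences for perfect di- to pentanucleotide repeats and reports the percentage of sequences containing at least one. The minimum repeat counts in `MIN_REPEATS`, and the function names `find_ssrs` and `percent_with_ssr`, are hypothetical choices made for illustration; the source does not state the search criteria or software used for the tall fescue ESTs.

```python
import re

# Assumed minimum repeat counts per motif length (hypothetical thresholds;
# the criteria actually used for the tall fescue ESTs are not given here).
MIN_REPEATS = {2: 6, 3: 5, 4: 4, 5: 4}


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return (start, motif, repeat_count) for each perfect SSR in seq."""
    seq = seq.upper()
    hits = []
    for unit, n in sorted(min_repeats.items()):
        # A motif of `unit` bases followed by at least n - 1 more copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit, n - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs that are repeats of a shorter unit ("AA", "GAGA").
            if any(motif == motif[:k] * (unit // k)
                   for k in range(1, unit) if unit % k == 0):
                continue
            hits.append((m.start(), motif, len(m.group(0)) // unit))
    return hits


def percent_with_ssr(seqs):
    """Percentage of sequences that contain at least one SSR."""
    positive = sum(1 for s in seqs if find_ssrs(s))
    return 100.0 * positive / len(seqs)


# Primer-pair yield quoted in the text: 157 PPs from 20,000 ESTs.
pp_yield = round(100 * 157 / 20000, 1)  # about 0.8%
```

Changing the thresholds changes the fraction of SSR-positive sequences, which is one reason percentages reported by different studies are not always directly comparable.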
The tall fescue genome appears rich in GC rather than AT nucleotides, as was reported for some other plant genomes, for example, soybean [Glycine max (L.) Merr.], and other legumes (Brown-Guedira et al., 2000). The distribution of di-, tri-, tetra-, and pentanucleotide repeats in tall fescue was similar to that reported by earlier investigators (Eujayl et al., 2004; Thiel et al., 2003; Kantety et al., 2002). Usually, dinucleotide repeats are the most abundant in genomic SSRs, whereas trinucleotide motifs are the most abundant in EST-SSRs (La Rota et al., 2005). Among 20,000 tall fescue ESTs, trinucleotide motifs were the most abundant type of SSRs (70%), followed by di- (20%), tetra- (5%), and pentanucleotide (5%) motifs. The CCG/GGC motif was the most abundant trinucleotide repeat, while GA/CT was the most abundant dinucleotide repeat (Saha et al., 2004). Detailed information on 145 tall fescue EST-SSR primers, including primer sequences, marker name, EST ID, expected band size, annealing temperature, SSR repeat motif and number, observed band size range, amplification results on 11 genotypes of seven species, and gene functional annotation, is available as electronic supplementary material (Saha et al., 2004).
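Motif names such as CCG/GGC and GA/CT group together repeats that are the same sequence read from either DNA strand or from a different starting base. As a minimal sketch, assuming motifs have already been extracted, the Python code below reduces each motif to one canonical form and tallies the share of each motif-length class. The function names `canonical_motif` and `motif_class_shares` are hypothetical and are not taken from the cited studies.

```python
from collections import Counter

# Translation table for complementing DNA bases.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def canonical_motif(motif):
    """Return the alphabetically smallest form of a motif over all
    rotations of the motif and of its reverse complement."""
    motif = motif.upper()
    revcomp = motif.translate(_COMPLEMENT)[::-1]
    forms = [m[i:] + m[:i] for m in (motif, revcomp)
             for i in range(len(motif))]
    return min(forms)


def motif_class_shares(motifs):
    """Percentage of di-, tri-, tetra-, and pentanucleotide motifs."""
    names = {2: "di", 3: "tri", 4: "tetra", 5: "penta"}
    counts = Counter(names[len(m)] for m in motifs)
    total = sum(counts.values())
    return {name: 100.0 * c / total for name, c in counts.items()}


# CCG and GGC fall in the same class, as do GA and CT.
same_tri = canonical_motif("CCG") == canonical_motif("GGC")
same_di = canonical_motif("GA") == canonical_motif("CT")
```

Counting motifs by canonical form avoids splitting one repeat type across several labels when the most abundant motifs are tabulated.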
In species where EST databases are not well established, genomic libraries are considered an important source for SSR marker development. In tall fescue, EST sequences are few. Furthermore, EST-SSR loci have a tendency to be clustered in gene-rich regions of the genome, and this clustering limits the potential for genome-wide coverage (Warnke et al., 2004; Yu et al., 2004; La Rota et al., 2005; Saha et al., 2005). Genomic SSRs are highly polymorphic. They tend to be distributed widely throughout the genome, giving better map coverage than EST-SSRs (Taramino et al., 1997; Warnke et al., 2004; La Rota et al., 2005; Saha et al., 2005). Construction and screening of partial genomic libraries, followed by sequencing of SSR-positive clones, are considered effective methods for microsatellite development (Rafalski et al., 1996). Microsatellite markers can be developed from either nonenriched or highly enriched genomic libraries (i.e., normal genomic libraries vs. genomic libraries enriched for a particular repeat motif) (Edwards et al., 1996). Highly enriched genomic libraries significantly reduced the cost and effort of microsatellite development compared with nonenriched libraries (Kijas et al., 1994; Edwards et al., 1996).
A (GA/CT)n-enriched genomic library of tall fescue was developed. A total of 5712 sequences were characterized, and after questionable and short sequences were deleted, 5320 high-quality sequences were screened for SSR motifs. About 20% of the sequences were found to be SSR positive. Nine hundred six of these were candidate SSR sequences that had repeat motifs >18 bases and at least 25 bases of flanking DNA sequence at both ends (Saha et al., 2006). Five hundred eleven PPs were developed. Four hundred twenty-five of these PPs were singleton SSRs. Thus, genomic SSR primers were developed from 56% (511 of 906) of the candidate SSR-containing sequences. The sequences of the 511 PPs, along with the marker name, sequence ID, primer sequences, primer melting temperature, and expected sizes, are available in Saha et al. (2006).
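The candidate criteria stated above (a repeat tract longer than 18 bases with at least 25 bases of flanking sequence on each side) can be written as a simple filter. This is a sketch under stated assumptions: the function name `is_candidate` and its parameters are hypothetical, positions are assumed to be 0-based, and the code does not reproduce the pipeline used by Saha et al. (2006).

```python
def is_candidate(seq_len, ssr_start, ssr_len, min_tract=18, min_flank=25):
    """Check the candidate criteria quoted in the text.

    seq_len   -- length of the whole sequence, in bases
    ssr_start -- 0-based position of the first base of the repeat tract
    ssr_len   -- length of the repeat tract, in bases
    """
    left_flank = ssr_start
    right_flank = seq_len - (ssr_start + ssr_len)
    return (ssr_len > min_tract
            and left_flank >= min_flank
            and right_flank >= min_flank)


# Primer-pair yield quoted in the text: 511 PPs from 906 candidates.
pp_yield_percent = round(100 * 511 / 906)  # about 56%
```

The flanking requirement exists because primers must be placed in unique sequence on both sides of the repeat; a tract too close to either end of a read leaves no room for one of the two primers.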