Summary of Allele Discovery of Economic Pine Traits 2
Our long-term goal is to genetically dissect complex traits in forest trees and to understand how naturally occurring genetic variation relates to phenotypic variation. In this project we will identify such relationships in loblolly pine (Pinus taeda).
Candidate Gene Selection
- Our original proposal was to choose 5,000 candidate genes based on: 1) expression in wood-forming and/or pathogen-challenged tissues in pine, 2) knowledge of the gene's role in xylem formation, disease resistance, or regulation of gene expression, and 3) maximal length of contigs or singleton sequence to facilitate primer design.
- After reclustering, primer design and sequence validation, the 40,000 original EST contigs dropped to 7,900 validated primer pairs (Figure 1).
- We chose to resequence all 7,900 amplicons.
Figure 1. An illustration of how the 40,000 EST contigs were processed to arrive at the final number of resequenced EST unigenes.
PCR and Sequencing Primer Design Pipeline
- Our original proposal included construction of a primer design bioinformatics pipeline.
- After reviewing the primer design software implemented at Agencourt, we concluded that developing this pipeline was unnecessary.
- As described previously, 7,900 amplicons were selected for resequencing, of which 7,424, representing 6,924 unique candidate genes, yielded analyzable data. Of those, 6,178 amplicons yielded high-quality data that could be processed reliably with bioinformatic pipelines. The remaining 1,246 amplicons are being processed manually.
- Resequencing was performed in a range-wide sample of 18 loblolly pine trees. The sample was reduced from 24 trees to allow primer tests in the following five species: Monterey pine (Pinus radiata), sugar pine (Pinus lambertiana), Norway spruce (Picea abies), Douglas-fir (Pseudotsuga menziesii) and coast redwood (Sequoia sempervirens). These tests established the feasibility of the Comparative Resequencing Across the Pinaceae project (CRSP; DBI-0638502).
- Data for the 6,178 amplicons have been submitted to Trace Archive (accession nos.: 2072163883–2072419182) and PopSet (accession nos.: FJ043059–FJ147084). Polymorphisms within these genes are also being deposited in dbSNP.
Table 1. The number of primer sets yielding data for all species included in ADEPT2. Primer sequences are available through our online database, DiversiTree.
|Species|Number Successful|Percent Total|
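The amplicon counts reported in this section can be checked with a few lines of arithmetic. The Python sketch below only restates the numbers given above. The variable names and the `pct` helper are illustrative and are not part of any project software.

```python
# Counts copied from the resequencing summary in this section.
selected = 7_900      # amplicons selected for resequencing
analyzable = 7_424    # amplicons that yielded analyzable data
automated = 6_178     # amplicons processed by the bioinformatic pipelines
unique_genes = 6_924  # unique candidate genes among the analyzable amplicons

# Derived counts.
manual = analyzable - automated  # amplicons being processed manually
failed = selected - analyzable   # amplicons without analyzable data


def pct(part, whole):
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100.0 * part / whole, 1)


print(f"Manual processing: {manual} amplicons ({pct(manual, analyzable)}% of analyzable)")
print(f"No analyzable data: {failed} amplicons ({pct(failed, selected)}% of selected)")
print(f"Analyzable: {pct(analyzable, selected)}% of selected amplicons")
print(f"Pipeline-ready: {pct(automated, analyzable)}% of analyzable amplicons")
```

The derived value for manual processing matches the 1,246 amplicons stated in the text.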
Sequence Analysis and SNP Identification
- We developed and used custom software, PineSAP, to base-call, assemble, and align the 363,400 chromatograms obtained from Agencourt.
- Polymorphisms identified through PineSAP are also being validated through genotyping.
- We also developed and used a custom analysis pipeline to estimate summary statistics for each resequenced amplicon. This software, DnaSAM, fully interfaces with PineSAP.
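To give a rough idea of what per-amplicon summary statistics can look like, the Python sketch below computes three standard population-genetic quantities from a small alignment: the number of segregating sites, Watterson's theta, and the average number of pairwise differences. It is not DnaSAM code and does not reflect how DnaSAM is implemented. The function names and the toy alignment are hypothetical, and the sketch assumes equal-length, gap-free aligned sequences.

```python
from itertools import combinations

VALID_BASES = "ACGT"


def segregating_sites(seqs):
    """Count alignment columns that contain more than one unambiguous base."""
    count = 0
    for column in zip(*seqs):
        bases = {b for b in column if b in VALID_BASES}
        if len(bases) > 1:
            count += 1
    return count


def watterson_theta(seqs):
    """Watterson's estimator per amplicon: S / a_n, with a_n = sum_{i=1}^{n-1} 1/i."""
    n = len(seqs)
    a_n = sum(1.0 / i for i in range(1, n))
    return segregating_sites(seqs) / a_n


def mean_pairwise_differences(seqs):
    """Average number of differing sites over all pairs of sequences (per amplicon)."""
    pairs = list(combinations(seqs, 2))
    total = 0
    for s1, s2 in pairs:
        total += sum(
            1
            for a, b in zip(s1, s2)
            if a != b and a in VALID_BASES and b in VALID_BASES
        )
    return total / len(pairs)


# Hypothetical toy alignment of four sequences from one amplicon.
toy_alignment = [
    "ACGTACGT",
    "ACGTACGA",
    "ACGAACGT",
    "ACGTACGT",
]

print("Segregating sites:", segregating_sites(toy_alignment))
print("Watterson's theta:", watterson_theta(toy_alignment))
print("Mean pairwise differences:", mean_pairwise_differences(toy_alignment))
```

Both estimators are reported per amplicon here. Dividing by the alignment length would give per-site values.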
News & Updates
- FTP site now live to obtain annotated and raw sequence datasets.
- 241,796 reads submitted to the Trace Archive
- 6,178 amplicons submitted to the PopSet repository via the tbl2asn program
- Illumina Webinar Series: GoldenGate and Infinium SNP Genotyping for Association Studies in Trees