ADEPT2 – Neale Lab

Summary of Allele Discovery of Economic Pine Traits 2

Our long-term goal is to genetically dissect complex traits and understand the relationship between naturally occurring genetic and phenotypic variation in forest trees. In this project we will identify these relationships in loblolly pine (Pinus taeda).

Candidate Gene Selection

Our original proposal was to choose 5,000 candidate genes based on: 1) expression in wood-forming and/or pathogen-challenged tissues in pine, 2) knowledge of the gene's role in xylem formation, disease resistance, or regulation of gene expression, and 3) maximal length of contig or singleton sequence to facilitate primer design.
After reclustering, primer design, and sequence validation, the 40,000 original EST contigs yielded 7,900 validated primer pairs (Figure 1).
Rather than narrowing these to 5,000, we chose to resequence all 7,900 amplicons.

Figure 1. An illustration of how the 40,000 EST contigs were processed to arrive at the final number of resequenced EST unigenes.

PCR and Sequencing Primer Design Pipeline

Our original proposal included construction of a primer design bioinformatics pipeline.
After reviewing the primer design software implemented at Agencourt, we decided that developing this pipeline was unnecessary.

Resequencing

As described previously, 7,900 amplicons were selected for resequencing, of which 7,424, representing 6,924 unique candidate genes, yielded analyzable data. Of those, 6,178 amplicons yielded high-quality data that could be processed reliably with bioinformatic pipelines. The remaining 1,246 amplicons are being processed manually.
Resequencing was performed in a range-wide sample of 18 loblolly pine trees. The sample was reduced from 24 trees to allow primer tests in the following five species: Monterey pine (Pinus radiata), sugar pine (Pinus lambertiana), Norway spruce (Picea abies), Douglas-fir (Pseudotsuga menziesii) and coast redwood (Sequoia sempervirens). These tests established the feasibility of the Comparative Resequencing Across the Pinaceae project (CRSP; DBI-0638502).
Data for the 6,178 amplicons have been submitted to Trace Archive (accession nos.: 2072163883–2072419182) and PopSet (accession nos.: FJ043059–FJ147084). Polymorphisms within these genes are also being deposited in dbSNP.
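
The amplicon counts reported in this section can be checked with a few lines of arithmetic. The sketch below uses only the figures quoted above; the variable names and the percent helper are illustrative and are not part of any project software.

```python
# Amplicon counts quoted in the Resequencing section.
selected = 7900      # amplicons selected for resequencing
analyzable = 7424    # amplicons that yielded analyzable data
high_quality = 6178  # amplicons handled by the automated pipelines

# Amplicons left over for manual processing.
manual = analyzable - high_quality


def percent(part, whole):
    """Return part as a percentage of whole, rounded to one decimal place."""
    return round(100.0 * part / whole, 1)


print("Manual processing:", manual)
print("Analyzable, % of selected:", percent(analyzable, selected))
print("High quality, % of analyzable:", percent(high_quality, analyzable))
```

The difference reproduces the 1,246 amplicons reported as being processed manually.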

Table 1. The number of primer sets yielding data for all species included in ADEPT2. Primer sequences are available through our online database, DiversiTree.

Species	Number Successful	Percent Total
Pinus taeda	7424	100
Pinus radiata	6429	84.2
Pinus lambertiana	2234	30.1
Picea abies	1024	13.8
Pseudotsuga menziesii	750	10.1
Sequoia sempervirens	40	0.53

Sequence Analysis and SNP Identification

We developed and used custom software, PineSAP, to call bases in, assemble, and align the 363,400 chromatograms obtained from Agencourt.
Polymorphisms identified through PineSAP are also being validated through genotyping.
We also developed and used a custom analysis pipeline to estimate summary statistics for each resequenced amplicon. This software, DnaSAM, fully interfaces with PineSAP.
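
To illustrate what per-amplicon summary statistics can look like, the sketch below computes three standard quantities from a small alignment: the number of segregating sites (S), nucleotide diversity (pi), and Watterson's theta. This is not DnaSAM's code, and the text above does not say which statistics DnaSAM reports; the function names and the toy alignment are assumptions made for illustration only.

```python
from itertools import combinations


def clean_columns(seqs):
    """Keep alignment columns where every sequence has an unambiguous base."""
    return [col for col in zip(*seqs) if all(base in "ACGT" for base in col)]


def summary_stats(seqs):
    """Return S, per-site pi, and per-site Watterson's theta for one amplicon.

    Assumes at least two aligned sequences of equal length and at least one
    column free of gaps and ambiguity codes.
    """
    cols = clean_columns(seqs)
    n = len(seqs)
    sites = len(cols)

    # Segregating sites: columns with more than one distinct base.
    s = sum(1 for col in cols if len(set(col)) > 1)

    # Nucleotide diversity: mean pairwise differences, divided by site count.
    pairs = list(combinations(range(n), 2))
    total_diff = sum(
        sum(1 for col in cols if col[i] != col[j]) for i, j in pairs
    )
    pi = total_diff / len(pairs) / sites

    # Watterson's theta: S scaled by the harmonic number a1 = sum_{k<n} 1/k.
    a1 = sum(1.0 / k for k in range(1, n))
    theta_w = s / a1 / sites

    return {"S": s, "pi": pi, "theta_w": theta_w, "sites": sites}


# Hypothetical four-sequence alignment, not project data.
toy_alignment = ["ACGTACGT", "ACGTACGA", "ACGAACGT", "ACGTACGT"]
print(summary_stats(toy_alignment))
```

For the toy alignment, S is 2 and pi is 0.125. Columns containing gaps or ambiguous bases are dropped before any statistic is computed, which is one common convention; other tools handle missing data differently.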

News & Updates

FTP site now live for downloading annotated and raw sequence datasets.
241,796 reads submitted to the Trace Archive.
6,178 amplicons submitted to the PopSet repository via the tbl2asn program.
Illumina Webinar Series: GoldenGate and Infinium SNP Genotyping for Association Studies in Trees
