Since 2010, Illumina technology has been the next-generation sequencing platform of choice for most sequencing projects in the HMGC Sequencing Core. We currently offer whole genome sequencing (WGS), high-throughput whole exome sequencing (WES), mRNASeq, smallRNASeq, and reduced representation bisulfite sequencing (RRBS) on our three HiSeq 2500 and our HiSeq 2000 instruments. The current output of our systems is approximately 1 Tb per 6-day run (HS2500) and 500 Gb per 10-day run (HS2000). The HiSeq 2500 generates approximately 500M reads per lane (PE 2x125), which is adequate for 40-50x average depth of coverage for 2-3 whole human genomes per flowcell, 100-125x average depth of coverage for 24 whole exomes per flowcell, or 10 RNAseq samples per lane at 50M reads per sample.
The sequencing medium of this system is a microscope slide-sized flowcell with 8 physically separated channels (lanes). DNA libraries are prepared by shearing dsDNA on a Covaris E210 to a fragment length determined by the type of sequencing being performed; adapters are then ligated to each end of the fragments. Prepared libraries are loaded onto the flowcell using the cBot (Illumina). A lawn of oligos bound to the inner surface of the flowcell is complementary to the adapters ligated during library preparation. The library fragments are PCR-amplified in the flowcell directly on the cBot, a process that takes approximately 2-4 hours. After amplification on the cBot, the flowcell contains approximately 1-4 billion clusters, each consisting of between 1000 and 1500 clonally amplified fragments. The flowcell is then loaded onto the HiSeq 2000/2500 sequencer.
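The throughput figures quoted above can be sanity-checked with a short back-of-the-envelope calculation. The sketch below is illustrative only: the 3.1 Gb human genome size and the two-flowcells-per-run figure are assumptions not stated in the text, and the function names are hypothetical. It computes raw yield, so usable depth after filtering and duplicate removal will be lower.

```python
# Rough yield and coverage arithmetic for a HiSeq 2500 run (raw, pre-filtering).
HUMAN_GENOME_BP = 3.1e9      # assumed haploid human genome size
LANES_PER_FLOWCELL = 8       # from the text: 8 physically separated channels
FLOWCELLS_PER_RUN = 2        # assumption: the instrument runs two flowcells at once


def yield_bases(reads: float, read_length: int) -> float:
    """Total sequenced bases for a given number of reads of fixed length."""
    return reads * read_length


def mean_depth(total_bases: float, target_bp: float, samples: int = 1) -> float:
    """Average depth of coverage when total_bases is split evenly across samples."""
    return total_bases / (target_bp * samples)


lane_bases = yield_bases(500e6, 125)                 # 500M reads x 125 bp per lane
flowcell_bases = lane_bases * LANES_PER_FLOWCELL     # bases per flowcell
run_bases = flowcell_bases * FLOWCELLS_PER_RUN       # bases per run

genome_depth = mean_depth(flowcell_bases, HUMAN_GENOME_BP, samples=3)
reads_per_rnaseq_sample = 500e6 / 10                 # 10 RNAseq samples per lane

print(f"Per lane:     {lane_bases / 1e9:.1f} Gb")
print(f"Per flowcell: {flowcell_bases / 1e9:.0f} Gb")
print(f"Per run:      {run_bases / 1e12:.1f} Tb")
print(f"3 genomes per flowcell: {genome_depth:.0f}x raw depth")
print(f"RNAseq reads per sample: {reads_per_rnaseq_sample / 1e6:.0f}M")
```

Under these assumptions a flowcell yields about 500 Gb and a run about 1 Tb, three human genomes on one flowcell come out at roughly 54x raw depth, and ten RNAseq samples on a lane receive 50M reads each, which is consistent with the figures quoted.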
On the sequencer, fluorescently labeled nucleotides carrying reversible terminators are washed into the lanes of the flowcell and incorporated into the growing strands of each cluster, directly adjacent to the sequencing primer. Because the terminator blocks further extension, only one base is added per cycle. Following incorporation, fluorescence is excited with a series of lasers and imaged using one of the on-board CCD cameras. After imaging, the fluorescent label and the terminator are cleaved off and washed away, making way for the next set of labeled nucleotides to be washed into the flowcell. This cycle is repeated a minimum of 100 times per read direction.
When samples are to be pooled, indexing adapters are used. An extra 6- or 8-base index sequence is incorporated into one or both library adapters, which allows reads from different samples to be distinguished bioinformatically in the downstream analysis steps.
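As a rough illustration of how index sequences let pooled reads be separated, the sketch below assigns each read to a sample by comparing its index to a sample sheet. It is a minimal toy, not the Core's demultiplexing software: the sample names, index sequences, the one-mismatch tolerance, and the function names are all hypothetical.

```python
from collections import defaultdict


def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x != y for x, y in zip(a, b))


def demultiplex(reads, sample_indexes, max_mismatches=1):
    """Group reads by sample using their index sequence.

    reads: iterable of (index_sequence, read_sequence) pairs.
    sample_indexes: mapping of sample name -> expected index sequence.
    Reads matching no sample, or more than one, go to 'undetermined'.
    """
    bins = defaultdict(list)
    for index_seq, read_seq in reads:
        matches = [
            sample
            for sample, expected in sample_indexes.items()
            if len(expected) == len(index_seq)
            and hamming(index_seq, expected) <= max_mismatches
        ]
        sample = matches[0] if len(matches) == 1 else "undetermined"
        bins[sample].append(read_seq)
    return dict(bins)


# Hypothetical 6-base indexes and reads, for illustration only.
sample_indexes = {"sampleA": "ATCACG", "sampleB": "CGATGT"}
reads = [
    ("ATCACG", "ACGTACGT"),  # exact match to sampleA
    ("CGATGA", "TTGACCGA"),  # one mismatch from sampleB's index
    ("GGGGGG", "GGCATTAC"),  # matches neither index
]
result = demultiplex(reads, sample_indexes)
print(result)
```

Tolerating a single mismatch only works when every pair of indexes in the pool differs at several positions, which is why the ambiguous case is sent to the undetermined bin rather than guessed.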
After data generation is complete, the reads are fed into the HMGC Sequencing Core’s secondary analysis pipeline. The WGS and WES pipeline uses BWA/GATK to de-multiplex, align the reads to a reference genome, and call variants based on the alignment. FASTQ files, BAM files, and VCF files are then released to the client. For researchers requiring further analysis, custom tertiary analysis can be completed after a consultation with the HMGC Sequencing Core's bioinformatician.
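For readers unfamiliar with this kind of workflow, the alignment and variant-calling steps might be strung together roughly as follows. This is a sketch under assumptions, not the Core's actual pipeline: the text names only BWA and GATK, so the specific subcommands (`bwa mem`, `gatk HaplotypeCaller`), the use of samtools for sorting and indexing, the file names, and the `build_pipeline` helper are all illustrative choices. The function only builds the command lines and does not execute anything.

```python
import shlex


def build_pipeline(sample, reference, fastq_r1, fastq_r2, threads=4):
    """Return the command lines for a simple align-sort-index-call workflow.

    Assumes de-multiplexed, paired-end FASTQ files for one sample and a
    reference FASTA that has already been indexed for each tool.
    """
    bam = f"{sample}.sorted.bam"
    vcf = f"{sample}.vcf.gz"
    # Read group header; the literal "\t" is expanded to a tab by bwa itself.
    read_group = f"@RG\\tID:{sample}\\tSM:{sample}\\tPL:ILLUMINA"

    align = ["bwa", "mem", "-t", str(threads), "-R", read_group,
             reference, fastq_r1, fastq_r2]
    # In practice the aligner's output would be piped into the sort step.
    sort = ["samtools", "sort", "-o", bam, "-"]
    index = ["samtools", "index", bam]
    call = ["gatk", "HaplotypeCaller", "-R", reference, "-I", bam, "-O", vcf]
    return [align, sort, index, call]


# Hypothetical sample and file names.
commands = build_pipeline("sample1", "ref.fa", "sample1_R1.fastq.gz",
                          "sample1_R2.fastq.gz")
for cmd in commands:
    print(shlex.join(cmd))
```

The three deliverables mentioned above map onto this sketch as the input FASTQ files, the sorted BAM file, and the VCF written by the variant caller.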
Applications for Illumina HiSeq 2000/2500: whole genome sequencing, whole exome sequencing, custom capture sequencing, mRNA sequencing for gene expression analysis and transcript identification, small RNA sequencing, and RRBS for establishing methylation patterns.
The HMGC Sequencing Core operates one Illumina MiSeq, which is used primarily for quality control of all libraries before they are loaded onto one of the HiSeq instruments. Illumina sequencing is highly dependent on the cluster density in each lane of the flowcell. Failure to reach optimal cluster density will result in fewer quality reads during HiSeq sequencing. The MiSeq is used to ensure all lanes of the HiSeq instruments run at optimal cluster density to produce the highest quality data possible. The instrument is also used for projects such as whole genome sequencing of small genomes, for example those of bacteria and viruses. The chemistry of the Illumina MiSeq is very similar to that of the HiSeq instruments; however, the MiSeq uses a single-lane flowcell that supports longer paired-end read lengths, up to 2x250 bp.
The HMGC Sequencing Core offers Sanger DNA sequencing using the ABI 3730xl DNA Analyzer. By utilizing M13-tagged PCR primers, the HMGC Sequencing Core can rapidly sequence small regions of interest in the genome or confirm single nucleotide variants found using next-generation sequencing technology.
Region-specific PCR primers are designed to amplify approximately 800-1000 bp. Each primer is tagged with an M13 sequence on the 5’ end to facilitate high-throughput sequencing. DNA is amplified using standard touchdown PCR conditions. The PCR product is then QC’d on an agarose gel and diluted and/or purified for the sequencing reaction. The sequencing reaction is performed using Applied Biosystems Big Dye Terminator (BDT) v3.1 chemistry and M13 sequencing primers. BDT v3.1 contains a mixture of standard deoxynucleotide triphosphates (dNTPs), which are used to elongate the DNA strand, and fluorescently labeled dideoxynucleotide triphosphates (ddNTPs), which act as chain terminators during DNA synthesis. The result of this reaction is a set of DNA fragments of various sizes, which are separated by electrophoresis on the ABI 3730xl and later assembled into contigs bioinformatically. After the sequencing reaction is complete, the samples are purified using the Millipore Montage SEQ Sequencing Reaction Clean Up Kit to remove salts and unincorporated ddNTPs. The samples are then sequenced on the ABI 3730xl in either 96- or 384-well format. Raw data files are either analyzed by the HMGC Sequencing Core using the Phred/Phrap/Consed software package or sent to the requesting researchers for analysis with the software of their choosing.
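The principle behind chain-termination sequencing described above can be illustrated with a small, idealized simulation. It is a toy model only: it assumes exactly one terminated product of every possible length, ignores the primer itself, and uses hypothetical function names. A real reaction is stochastic, and base calling works from fluorescence traces rather than strings.

```python
import random

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def terminated_fragments(template: str) -> list:
    """All extension products of the strand synthesized against the template.

    The template is given 5'->3'; the new strand is its reverse complement.
    Each fragment ends at the position where a labeled ddNTP was incorporated.
    """
    new_strand = "".join(COMPLEMENT[base] for base in reversed(template))
    return [new_strand[:n] for n in range(1, len(new_strand) + 1)]


def read_by_size(fragments: list) -> str:
    """Mimic electrophoresis: order fragments by length, read each terminal base."""
    ordered = sorted(fragments, key=len)
    return "".join(fragment[-1] for fragment in ordered)


template = "GATTACA"                      # hypothetical template sequence
fragments = terminated_fragments(template)
random.Random(0).shuffle(fragments)       # the reaction tube holds an unordered mixture
sequence = read_by_size(fragments)
print(sequence)                           # reverse complement of the template
```

Because each fragment's final base carries the label of the ddNTP that terminated it, sorting the mixture by size and reading the labels in order recovers the sequence of the newly synthesized strand.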