Date of Award
Spring 5-2024
Degree Type
Thesis
Degree Name
MS Microbiology
Department
Biology
Advisor
Tinchun Chu, PhD
Committee Member
Jane L Ko, PhD
Committee Member
Jessica A Cottrell, PhD
Keywords
16S rRNA, next generation sequencing, freshwater microorganisms, biodiversity, R programming, bioinformatics
Abstract
The 16S rRNA gene encodes the small subunit rRNA of prokaryotic ribosomes and contains nine variable regions (V1 to V9) separated by highly conserved regions. Metagenomic sequencing of the variable regions, which differ considerably among bacterial species, is widely used to classify microbial populations at the genus or species level. While numerous sequence analysis tools exist for taxonomic classification, many require annual subscription fees, charge per analysis, or offer limited reference database options. This study aimed to develop a bioinformatic pipeline for 16S rRNA amplicon sequencing analysis using open-source tools in RStudio. Ten freshwater samples were collected, the V3-V4 region was sequenced on an Illumina MiSeq platform with paired-end 250 bp reads, and the resulting data were analyzed using the pipeline. The analysis workflow used Bioconductor packages in R, which provide access to open-source bioinformatic software. In a sample collected from Lake Hopatcong, NJ, the dominant phyla were Proteobacteria, Bacteroidota, and Firmicutes, with relative abundances of 50.8%, 19.9%, and 4.5%, respectively. The top genera classified were Flavobacterium and Limnohabitans, with relative abundances of 8.8% and 4.2%, respectively. These taxonomic classifications were consistent with those from a commercial amplicon analysis service. Across all samples, the most abundant phylum was either Proteobacteria or Firmicutes. Additionally, two DNA isolation methods were compared for their impact on microbial diversity outcomes. Isolating DNA directly from microorganisms on filter membranes yielded greater diversity than plating onto growth medium, with the latter method resulting in a 92.7% loss of observed alpha diversity. This study establishes a bioinformatic pipeline for 16S rRNA amplicon sequencing analysis that provides abundance charts, diversity results, and principal coordinate analysis using solely open-source tools. Furthermore, it highlights that viable but nonculturable bacteria constitute the majority of bacteria in a freshwater environment.
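The relative abundances and the loss of observed alpha diversity reported in the abstract are simple ratios of read counts and taxon counts. The following is a minimal sketch of that arithmetic only, not the thesis pipeline, which was built in R with Bioconductor packages. It is written in Python for illustration, and every function name, taxon label, and read count in it is a hypothetical assumption rather than data from the study.

```python
from typing import Dict


def relative_abundance(counts: Dict[str, int]) -> Dict[str, float]:
    """Convert raw read counts per taxon into percentages of the sample total."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sample contains no reads")
    return {taxon: 100.0 * n / total for taxon, n in counts.items()}


def observed_richness(counts: Dict[str, int]) -> int:
    """Observed alpha diversity: the number of taxa with at least one read."""
    return sum(1 for n in counts.values() if n > 0)


def percent_loss(reference: int, other: int) -> float:
    """Percentage of the reference richness that is missing from the other sample."""
    if reference == 0:
        raise ValueError("reference richness must be positive")
    return 100.0 * (reference - other) / reference


# Hypothetical read counts for one water sample processed two ways
# (made-up numbers, chosen only to keep the arithmetic easy to follow).
filter_counts = {"taxon_A": 500, "taxon_B": 300, "taxon_C": 150, "taxon_D": 50}
plate_counts = {"taxon_A": 0, "taxon_B": 0, "taxon_C": 1000, "taxon_D": 0}

for taxon, pct in relative_abundance(filter_counts).items():
    print(f"{taxon}: {pct:.1f}%")

loss = percent_loss(observed_richness(filter_counts), observed_richness(plate_counts))
print(f"Loss of observed richness after plating: {loss:.1f}%")
```

In practice, observed alpha diversity is usually counted over amplicon sequence variants or operational taxonomic units rather than the four placeholder taxa used here, and samples are often rarefied or otherwise normalized before comparison.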
Recommended Citation
Guo, Vanessa, "Optimizing Biodiversity Analysis: Building A Bioinformatic Pipeline with R for 16S rRNA Amplicon Sequencing" (2024). Seton Hall University Dissertations and Theses (ETDs). 3168.
https://scholarship.shu.edu/dissertations/3168
Included in
Bacteriology Commons, Bioinformatics Commons, Environmental Microbiology and Microbial Ecology Commons