Marine Science Faculty Publications

Document Type


Publication Date



Test Dataset, Query Sequence, Representative Sequence, Percentage Sequence Identity, Single Base Insertion

Digital Object Identifier (DOI)


Background: High-throughput sequencing makes it possible to rapidly obtain thousands of 16S rDNA sequences from environmental samples. Bioinformatic tools for the analyses of large 16S rDNA sequence databases are needed to comprehensively describe and compare these datasets.

Results: FastGroupII is a web-based bioinformatics platform to dereplicate large 16S rDNA libraries. FastGroupII provides users with the option of four different dereplication methods, performs rarefaction analysis, and automatically calculates the Shannon-Wiener Index and Chao1. FastGroupII was tested on a set of 16S rDNA sequences from coral-associated Bacteria. The different grouping algorithms produced similar, but not identical, results. This suggests that 16S rDNA datasets need to be analyzed in multiple ways when being used for community ecology studies.

Conclusion: FastGroupII is an effective bioinformatics tool for the trimming and dereplication of 16S rDNA sequences. Several standard diversity indices are calculated, and the raw sequences are prepared for downstream analyses.

Was this content written or created while at USF?


Citation / Publisher Attribution

BMC Bioinformatics, v. 7, art. 57

© 2006 Yu et al; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Included in

Life Sciences Commons