Using IBM Content Manager for Genomic Data Annotation and Quality Assurance Tasks

Document Type


Publication Date



Quality assurance, Bioinformatics, XML, Data models, Content management, Database systems, Genomics

Digital Object Identifier (DOI)



As the amount of heterogeneous genomic data and related annotations continues to grow, a flexible and easily accessible data management solution is required to integrate such data with diverse annotation tasks. This preliminary report describes the benefits of using IBM DB2® Content Manager software for task-oriented grape genome annotation, with data quality-assurance checks performed throughout the annotation process. To demonstrate the usability of this application, we describe the implementation of two real-life, content-based genome annotation scenarios: 1) expressed sequence tag (EST) annotation; and 2) sequence annotation related to simple sequence repeat (SSR) markers. IBM DB2 Content Manager lets users construct content-based genomic information applications as customized content documents with attributes; these documents can be built rapidly and adapted readily within an easy-to-use interface. Users can check annotation quality while they annotate by using a built-in, standardized data quality-assurance procedure referred to as annotation “routing.” The system also provides search features and cross-links across different annotation contents and data formats. The data quality workflow and procedure within the system yielded accurate and consistent data throughout the annotation and curation lifecycle.
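The idea of a content document with attributes that passes through quality checks while it is being annotated can be illustrated with a minimal sketch. This is not the IBM DB2 Content Manager API: the names `AnnotationDocument`, `require_attributes`, `valid_nucleotides`, `route`, and `STEPS`, the attribute names, and the two checks are all hypothetical and chosen only for illustration. The sketch assumes that routing sends a document through an ordered list of checks and returns it to the annotator at the first step that reports a problem.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class AnnotationDocument:
    """Hypothetical content document: an identifier plus free-form attributes."""
    item_id: str
    attributes: Dict[str, str]
    status: str = "draft"
    issues: List[str] = field(default_factory=list)


# A routing step inspects a document and returns a list of problems (empty if none).
Check = Callable[[AnnotationDocument], List[str]]


def require_attributes(*names: str) -> Check:
    """Build a check that reports every listed attribute that is missing or empty."""
    def check(doc: AnnotationDocument) -> List[str]:
        return [f"missing attribute: {n}" for n in names if not doc.attributes.get(n)]
    return check


def valid_nucleotides(doc: AnnotationDocument) -> List[str]:
    """Report characters in the 'sequence' attribute other than A, C, G, T, or N."""
    seq = doc.attributes.get("sequence", "")
    bad = sorted(set(seq.upper()) - set("ACGTN"))
    return [f"invalid characters in sequence: {''.join(bad)}"] if bad else []


def route(doc: AnnotationDocument, steps: List[Check]) -> AnnotationDocument:
    """Run the checks in order; stop and return the document at the first failure."""
    for step in steps:
        problems = step(doc)
        if problems:
            doc.issues.extend(problems)
            doc.status = "returned"
            return doc
    doc.status = "approved"
    return doc


# Assumed routing order: completeness first, then sequence validity.
STEPS: List[Check] = [require_attributes("sequence", "organism"), valid_nucleotides]

if __name__ == "__main__":
    est = AnnotationDocument("EST-1", {"sequence": "ACGTTGCA", "organism": "Vitis vinifera"})
    print(route(est, STEPS).status)
```

Keeping each check as a small function that returns a list of problems makes it easy to add or reorder steps without touching the routing loop, which is the property the abstract attributes to a standardized quality-assurance procedure.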



Citation / Publisher Attribution

IBM Journal of Research and Development, v. 55, issue 6, p. 1-8