Domain Knowledge and Data Quality Perceptions in Genome Curation Work

Keywords

Behaviour, Assessment, Curation, Domain knowledge, Genome

Purpose: The purpose of this paper is to understand how genomics scientists’ perceptions of data quality assurance vary with their domain knowledge.

Design/methodology/approach: The study used a survey to collect responses from 149 genomics scientists, who were grouped by domain knowledge. Respondents ranked their top five data quality criteria for hypothetical curation scenarios. The groups' results were compared using a χ² test.
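To make the comparison step concrete, the following is a minimal sketch of a χ² test of independence on a contingency table. The abstract does not say how the tables were built, so the layout (domain group by whether a criterion appeared in a respondent's top five), the function name `chi_square_statistic`, and all counts are assumptions for illustration only, not data or code from the study.

```python
# Hypothetical counts, NOT data from the study. Rows are domain-knowledge
# groups; columns are [criterion in top five, criterion not in top five].
observed = [
    [18, 12],  # biology (illustrative)
    [14, 16],  # bioinformatics (illustrative)
    [9, 21],   # computational science (illustrative)
]


def chi_square_statistic(table):
    """Return (Pearson chi-square statistic, degrees of freedom) for a table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(table):
        for j, count in enumerate(row):
            # Expected count if group and ranking were independent.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (count - expected) ** 2 / expected

    dof = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, dof


statistic, dof = chi_square_statistic(observed)
print(f"chi-square = {statistic:.3f}, degrees of freedom = {dof}")
```

With three groups and two columns there are 2 degrees of freedom, so under these assumptions a statistic above roughly 5.99 would indicate a difference between groups at the 0.05 level.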

Findings: Scientists with domain knowledge of biology, bioinformatics, and computational science did not reach a consensus in ranking data quality criteria. Biologists cared more about curated data being concise and traceable, and were concerned about the skills needed to deal with information overload. Computational scientists, on the other hand, valued making curation understandable and paid more attention to the specific skills needed for data wrangling.

Originality/value: This study takes a new approach to comparing the data quality perceptions of scientists across different domains of knowledge. Few studies have been able to synthesize models to interpret data quality perception across domains. The findings may help in developing data quality assurance policies and training seminars, and in maximizing the efficiency of genome data management.

Citation / Publisher Attribution

Journal of Documentation, v. 71, issue 1, p. 116-142