Graduation Year


Document Type




Degree Granting Department

Mathematics and Statistics

Major Professor

Nataša Jonoska, Ph.D.

Committee Member

Gregory McColm, Ph.D.

Committee Member

Masahiko Saito, Ph.D.

Committee Member

Richard Stark, Ph.D.

Committee Member

Stephen Suen, Ph.D.


codes, Watson-Crick Involution, DNA codes


The set of all sequences generated by a bio-molecular protocol forms a language over the four-letter alphabet Delta = {A, G, C, T}. This alphabet is associated with a natural involution mapping Theta, under which A maps to T and G maps to C, extended to an antimorphism of Delta*. To avoid undesirable Watson-Crick bonds between words, the language has to satisfy certain coding properties. Hence, for an involution Theta, we consider involution codes (Theta-infix, Theta-comma-free, Theta-k-codes and Theta-subword-k-codes) that avoid certain undesirable hybridizations. We investigate the closure properties of these codes and the conditions under which both X and X+ are the same type of involution code. We provide properties of a splicing system such that the language generated by the system preserves the desired properties of the code words. Algebraic characterizations of these involutions through their syntactic monoids are also discussed. Methods of constructing involution codes that are strictly locally testable are given. General methods for generating such involution codes are also given, and the information capacity of these codes is shown to be optimal in most cases. A specific set of these codes was chosen for experimental testing, and the results of these experiments are presented.
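As a minimal illustration of the mapping Theta described in the abstract, the Python sketch below treats Theta as the usual reverse complement on strings over {A, G, C, T}. The names `theta` and `theta_image_occurs` are hypothetical helpers introduced here. The substring check is only a simplified stand-in for the kind of unwanted hybridization the involution codes are meant to exclude, not the formal definition of Theta-infix or Theta-comma-free codes used in the thesis.

```python
# Watson-Crick complement on single letters: A <-> T, G <-> C.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def theta(word: str) -> str:
    """Apply the complement letter by letter and reverse the word.

    Reversal is what makes the map an antimorphism:
    theta(u + v) == theta(v) + theta(u).
    """
    return "".join(COMPLEMENT[base] for base in reversed(word))


def theta_image_occurs(words):
    """Return all pairs (u, v) such that theta(u) is a substring of v.

    Illustrative check only (an assumption of this sketch): a non-empty
    result flags words whose Watson-Crick image appears inside another
    word of the set, which suggests a possible unwanted bond.
    """
    return [(u, v) for u in words for v in words if theta(u) in v]


if __name__ == "__main__":
    print(theta("AAG"))                          # CTT
    print(theta(theta("AAG")) == "AAG")          # True: theta is an involution
    print(theta_image_occurs(["AAG", "CTTA"]))   # [('AAG', 'CTTA')]
```

In this example, the word CTTA contains CTT, the image of AAG under `theta`, so the pair is reported. A set of code words for which the function returns an empty list passes this simplified check.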